Machine Learning Reveals Key Glycoprotein Mutations and Rapidly Assigns Lassa Virus Lineages



Abstract

Lassa fever, caused by the Lassa virus (LASV), has led to numerous fatalities in West Africa and to cases exported to other continents since its discovery in 1969. No vaccine is currently approved, and recent research has focused on immunotherapy. LASV is grouped into lineages that circulate in specific geographical areas, elicit varying immune responses, and display distinct pathophysiological effects. Investigating the genetic differences between these lineages is therefore crucial. Here, we analyzed the LASV glycoprotein, the only surface protein, using statistics, machine learning, and phylogenetics to identify key differences between Nigerian lineages and those endemic to other West African countries. We found that amino acid positions near the stable signal peptide cleavage site and sites affecting immune recognition, such as those between positions 59 and 76, were highly variable among the lineages. Additionally, we discovered that Lineage II and Lineage III sequences are one codon shorter than Lineage IV sequences because of a codon insertion at nucleotide positions 178-180, corresponding to amino acid position 60. This may explain the structural and phenotypic differences between the lineages. To quickly identify which lineages cause emerging outbreaks or exported infections, we also developed a highly accurate lineage classification tool based on machine learning.
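
The correspondence between nucleotide positions 178-180 and amino acid position 60 follows from 1-based codon arithmetic. The following is a minimal sketch in Python, assuming positions are counted from the first nucleotide of the glycoprotein coding sequence with no offset. The function names are hypothetical and are not taken from the article.

```python
def codon_index(nt_pos: int) -> int:
    """Return the 1-based codon (amino acid) index containing a 1-based nucleotide position."""
    if nt_pos < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_pos - 1) // 3 + 1


def codon_span(aa_pos: int) -> tuple[int, int]:
    """Return the 1-based (first, last) nucleotide positions covered by a 1-based codon index."""
    if aa_pos < 1:
        raise ValueError("amino acid positions are 1-based")
    return (3 * aa_pos - 2, 3 * aa_pos)


# Assumption: numbering starts at the first nucleotide of the coding sequence.
print([codon_index(p) for p in (178, 179, 180)])  # [60, 60, 60]
print(codon_span(60))  # (178, 180)
```

Under this numbering, all three nucleotides of the inserted codon fall within codon 60, which lies inside the variable region between positions 59 and 76 noted above.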
