Genetic Variability, N-Glycosylation Sites, and Recombination Events in the ORF5 (GP5) Gene of Lineage 1A (NADC34) Betaarterivirus americense Strains from Lima, Peru

Molecular Variability of the GP5 Glycoprotein in Betaarterivirus americense

Abstract

Lima is a major swine production hub in Peru; therefore, these data provide an initial overview of GP5 variability in a high-risk epidemiological setting. This study aimed to characterize the genetic variability, N-glycosylation sites, and recombination events in the ORF5 (GP5) gene of lineage 1A (NADC34) Betaarterivirus americense strains circulating in Lima, Peru. Bioinformatics servers and software were employed, including Nextclade v3, NetNGlyc 1.0, RDP v4.1, DnaSP v6, and MEGA6. Nextclade v3 supported the classification of 24 strains within lineage 1, sublineage 1A (NADC34), which exhibited high divergence based on GP5 phylogenetic analysis. Significant amino acid substitutions were identified in the hypervariable regions HVR1 and HVR2, which are associated with the induction of neutralizing and non-neutralizing antibodies: 16/24 at N32 (S/G/R/E), 3/24 at S34 (N/T), 6/24 at S35 (N/I), 7/24 at L39 (F), 6/24 at Q40 (L/R), 3/24 at L41 (Y/V), 15/24 at K58 (E/V/R), and 8/24 at S59 (H/R/N), located within GP5 epitopes A, B, and C. Nine N-glycosylation patterns (A-I) were identified across the 24 strains, comprising nine putative sites at N30, N32, N33, N34, N35, N44, N50, N51, and N57. Patterns A, B, E, and G exhibited five to six glycosylation sites in 12/24 strains. A statistically robust recombination event was detected in strain 42_Montana2019 (lineage 1A), with a putative major parent MH719138_1 (lineage 1A, Peru-2016/5 variant; 95.3% sequence similarity) and an unknown minor parent, although strain 36_Montana2019 showed the closest phylogenetic affinity. Genetic diversity analysis revealed 530 polymorphic sites, and Tajima's D test yielded a value of -0.78746, indicating high genetic variability within the Lima cohort. Overall, this study provides the first comprehensive molecular characterization of ORF5 (GP5) genetic variability in sublineage 1A (NADC34) Betaarterivirus americense strains circulating in swine farms in Lima. Broader and temporally structured sampling is warranted to assess nationwide evolutionary patterns.

Keywords: genetic diversity; recombination; ORF5 (GP5); N-glycosylation; sublineage 1A (NADC34); RDP v4.1.

Article activity feed

  1. This Zenodo record is a permanently preserved version of a PREreview. You can view the complete PREreview at https://prereview.org/reviews/18738489.

    General Assessment

    The study provides valuable molecular epidemiological data on the NADC34 variant of PRRSV-2 in Peru, focusing on the GP5 protein, a key target for neutralizing antibodies. The identification of a potentially new N-glycosylation site (N57) and the detailed mapping of mutations within known epitopes are the primary strengths of the work. The manuscript is generally well-structured, but it could be strengthened by clarifying certain statistical interpretations and improving the visual presentation of the data.

    Major Comments

    1. Statistical Interpretation (Tajima's D)

    • Observation: In lines 201–204, the authors report a negative Tajima's D value (-0.78746) and suggest this indicates "high genetic variability" or "population subdivision."

    • Recommendation: A negative Tajima's D reflects an excess of rare alleles, which is typically interpreted as a sign of recent population expansion (for example, after a bottleneck) or purifying selection. While it does reflect diversity, the authors should be more precise in their evolutionary interpretation. The text also reports "no significant deviation (P > 0.10)." If the value is not statistically significant, the authors should be cautious about using it to draw strong conclusions about population dynamics.
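
    To make the point about rare alleles concrete, the statistic can be sketched as below. This is a minimal illustration of the standard Tajima's D formula applied to an ungapped alignment, not the DnaSP v6 implementation the authors used. The function name `tajimas_d` and the toy sequences are hypothetical, gaps and ambiguous bases are not handled, and no significance test is included.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Return (pi, S, D) for equal-length aligned sequences (no gap handling)."""
    n = len(seqs)
    # With n < 4 the variance term of the statistic collapses to zero.
    if n < 4 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least 4 aligned sequences of equal length")

    # S: number of segregating (polymorphic) alignment columns
    S = sum(1 for column in zip(*seqs) if len(set(column)) > 1)

    # pi: mean number of pairwise differences between sequences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    if S == 0:
        return pi, S, float("nan")  # D is undefined without polymorphism

    # Constants that depend only on the sample size n
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    d = (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))
    return pi, S, d


# Hypothetical toy alignment: every variant is a singleton (a "rare allele").
toy = ["AAAA", "AAAT", "AATA", "ATAA"]
pi, S, d = tajimas_d(toy)
print(f"pi={pi:.3f} S={S} D={d:.3f}")
```

    In the toy alignment every polymorphism is carried by a single sequence, so the pairwise diversity falls below the value expected from the number of segregating sites and D comes out negative. The sign alone says nothing about significance, which is the distinction the authors are asked to make.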

    2. Functional Implications vs. Speculation

    • Observation: The manuscript frequently links the observed mutations and glycosylation patterns to "immune evasion" and "vaccine failure" (e.g., lines 238, 265, 272).

    • Recommendation: While these links are well-supported by general PRRSV literature, they remain speculative for these specific 24 strains since no in vitro neutralization assays were performed. The authors should explicitly state in the "Limitations" section (or Conclusion) that functional assays are required to confirm if these specific NADC34 variants indeed escape current commercial vaccines used in Peru.

    3. The "Unreported" N57 Site

    • Observation: The authors highlight N57 as a "previously unreported site" (line 194).

    • Recommendation: This is a significant finding. The authors should expand on this in the Discussion. Does this site appear in any international databases (GenBank) for NADC34 strains outside of Peru? A quick comparison with recent NADC34 sequences from China or the US would help determine if this is a unique "Peruvian signature" or a broader emerging trend in sublineage 1.5.

    4. Table 1 Presentation

    • Observation: Table 1 (page 15) lists the N-glycosylation sites.

    • Recommendation: It would be much more informative to add a column for "Total Number of Sites" per strain. This would allow the reader to quickly see that strains 40 and 41 are "hyper-glycosylated" (6 sites) compared to strains 36–39 (2 sites), which is a point emphasized in the text.
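
    As an illustration of how such a column could be derived, the sketch below counts canonical N-X-S/T sequons (X being any residue except proline) per sequence. It is a simplified motif scan, not the NetNGlyc 1.0 neural-network prediction the authors used, so its counts may differ from theirs. The function name `sequon_positions` and the toy fragments are hypothetical, and positions are relative to the fragment supplied, not to GP5 numbering.

```python
import re

# N-X-S/T sequon, where X is any residue except proline; the lookahead
# lets overlapping sequons (e.g. "NNTS") both be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(protein):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical toy fragments, NOT the GP5 sequences from the manuscript.
strains = {
    "toy_strain_1": "MNGSANPTNAT",
    "toy_strain_2": "MLGKNNTSAAL",
}

# One row per strain: name, sequon positions, total number of sites.
for name, seq in strains.items():
    sites = sequon_positions(seq)
    print(f"{name}\t{','.join(map(str, sites))}\t{len(sites)}")
```

    The last field of each printed row is the per-strain total that the suggested column would hold.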

    Minor Comments

    1. Figure 3 (Phylogenetic Tree)

    • Observation: The circular tree is visually appealing, but the labels in the "turquoise" branch are very small and difficult to read in the current format.

    • Recommendation: Consider providing a "zoomed-in" rectangular version of the Lineage 1.5 cluster as a panel B, or increase the font size of the Peruvian study strains to make them stand out from the reference sequences.

    2. Figure 4 (Alignment)

    • Observation: The alignment is comprehensive but spans two pages and is quite dense.

    • Recommendation: Highlight the "N57" site specifically in the figure or with an arrow, as it is one of the key findings. Also, ensure the color-coded legend for epitopes (A, B, C, etc.) is clearly defined in the figure caption to avoid the reader having to hunt for it in the text.

    3. Terminology and Consistency

    • Line 36: You mention a negative Tajima's D. Ensure the minus sign is a proper mathematical symbol (−) rather than a hyphen (-) for consistency with scientific formatting.

    • Lines 153-154: You mention "nine isolates previously reported... in 2019 and 2025." Given that the preprint is dated 2026, please ensure the timeline of "previous reports" is clear to the reader (especially regarding the 2025 citation).

    4. Materials and Methods

    • Line 103: Mentioning "Chromas Lite®" is good for transparency. Did the authors also use a specific tool (e.g., ClustalW, MUSCLE, or MAFFT) for the multiple sequence alignment (MSA) before tree construction? If so, this should be explicitly stated.

    Conclusion

    This is a solid molecular characterization paper. By addressing the statistical nuances of the evolutionary analysis and clearly distinguishing between observed genetic traits and predicted functional outcomes, the authors will improve the impact and accuracy of the manuscript.

    Competing interests

    The author declares that they have no competing interests.

    Use of Artificial Intelligence (AI)

    The author declares that they did not use generative AI to come up with new ideas for their review.