The first near-complete genome assembly of pig: enabling more accurate genetic research
Abstract
Pigs are crucial sources of meat and protein, valuable animal models, and potential donors for xenotransplantation. However, the existing reference genome for pigs is incomplete, with thousands of segments and missing centromeres and telomeres, which limits our understanding of the important traits in these genomic regions. To address this issue, we present a near-complete genome assembly for the Jinhua pig (JH-T2T), constructed using PacBio HiFi and ONT long reads. This assembly includes all 18 autosomes and the X and Y sex chromosomes, with only six gaps. It features annotations of 46.90% repetitive sequences, 35 telomeres, 17 centromeres, and 23,924 high-confidence genes. Compared with Sscrofa11.1, JH-T2T closes nearly all gaps, extends sequences by 177 Mb, predicts more intact telomeres and centromeres, gains 799 genes, and loses 114 genes. Moreover, it enhances the mapping rate for both Western and Chinese local pigs, outperforming Sscrofa11.1 as a reference genome. Additionally, this comprehensive genome assembly will facilitate large-scale variant detection and enable the exploration of genes associated with pig domestication, such as GPAM, CYP2C18, LY9, ITLN2, and CHIA. Our findings represent a significant advancement in pig genomics, providing a robust resource that enhances genetic research, breeding programs, and biomedical applications.
A version of this preprint has been published in the Open Access journal *GigaScience* (see paper https://doi.org/10.1093/gigascience/giaf048), where the paper and peer reviews are published openly under a CC-BY 4.0 license.
Revision 2 version
Reviewer 2: Benjamin D Rosen
The first near-complete genome assembly of pig: enabling more accurate genetic research.
General comments:
The authors have clarified how their HiC manual curation efforts were able to remove gaps from the assembly. This was my only remaining major issue. I only have a few minor comments remaining.
Minor comments:
Line 1 - Title: "A Near Telomere-to-Telomere Genome Assembly of the Jinhua Pig"
Line 369 - replace "only 6 gaps left in our final JH assembly" with "only 6 gaps remain in our final JH assembly"
Line 370 - Figure S5 needs a more detailed legend
Line 405 - I just noticed this, but are the authors proposing that chr9 has 2 centromeres? Given the known pig karyotype (metacentric chr9), it seems more likely that they have identified some other form of tandem repeat at the beginning of chr9.
Revision 1 version
Reviewer 1: Martien Groenen
In their revised version of the manuscript, the authors have addressed all my major concerns raised in my earlier review and have made the many editorial edits as suggested. I only have a few (mostly editorial) comments for the revised version. The most important one is the title of the manuscript. I realize I did not mention this in my earlier review, but I think the title is not very appropriate and could be more informative. I suggest something like "A telomere-to-telomere genome assembly of the Jinhua pig"
Minor editorial comments:
Line 40: Replace "provides" by "provide"; "genome" to "genomes" and "JH" to "Jinhua"
Lines 50-51: "This study produced a gapless and near-gapless assembly of the pig genome, and provides a set of diploid JH reference genome." should be changed to something like "This study produced a near-gapless assembly of the pig genome and provides a set of haploid Jinhua reference genomes."
Line 177: Change "with with" to "with"
Line 194: Replace "population" by "populations"
Lines 232-233: Referring to human as a "closely related species" is rather awkward and not correct. I suggest replacing this with "eleven other mammals"
Lines 299, 301 and 303: Insert "of" after "consisting"
Line 317: Insert "and" before "2.33 Gb"
Line 319: Insert "and" before "2.17 Gb"
Lines 320-321: Change to "The more continuous contigs of the two assemblies were selected to construct the final haploid assemblies".
Line 323: Replace "assembly" by "assembler"
Line 354: Delete "ranging"
Lines 358-359: Change "The average properly mapped rate" to "The average rate of properly mapped reads"
Line 379: Insert "respectively" after "60.07"
Line 380: "suggested" (remove space)
Line 385: Change "indicate a gapless and near-gapless" to "indicate a near-gapless"
Line 455: Change "were overlapped with" to "were overlapping with"
Lines 557-559: The sentence "The insertion found in the SLA-DOB gene, which serves to enhance the immune system's response and is relevant to transplant rejection" seems incomplete and sounds awkward. Perhaps you mean something like "The insertion found in SLA-DOB, a gene involved in enhancing the immune system's response to infection, might be relevant in relation to transplant rejection"
Reviewer 2: Benjamin D Rosen
The first near-complete genome assembly of pig: enabling more accurate genetic research
General comments: I thank the authors for addressing most of my points and providing more details on the parameters they have used. Unfortunately, I still have some unanswered questions regarding the methodology. My current understanding from the authors' responses to my previous comments leads me to believe that the assembly has been scaffolded incorrectly. If the authors did indeed use HiC data to place 8 contigs into gaps and then joined those contigs without placing gaps at the joins or doing any further gap filling, that calls into question the validity of the assembly. Finally, the language needs further improvement for readability.
Specific comments:
Line 85 - *will contribute to.
Lines 187-191 - HiC interaction maps do not provide information for gap filling. Either this has been explained insufficiently, or it has been done incorrectly. Placing assembled sequences in the correct order does not mean that it is okay to join them without a gap. It is necessary to return to the gap filling procedure now that the contigs are in the correct order and attempt to fill them as done previously.
Line 191 - Figure S3 - These HiC contact maps are not very informative; they need to be labeled and have a scale bar. Additionally, contact maps can have a lack of signal due to a gap in the sequence or due to multimapping reads in repetitive regions being filtered, so it's not clear what they are trying to show in A-C. The authors' reply to my previous concern regarding the labeling of this figure does not help; furthermore, the figure legend in the supplemental materials is still insufficient. I think I understand that panels D and E are chr3 before and after misassembly correction; it would be helpful if the two panels were at the same scale. I still don't know why panel F is shown or how it is related to panel C, and I don't see any red ellipses indicated by the legend.
Line 275 - "ensemble from Duroc pigs" is incorrect. It is an "assembly of a Duroc pig".
Lines 299, 301, 303 - "containing" not "consisting"
Lines 306-308 - Again, HiC data orders and orients contigs, but it does not fill gaps. Please clarify how the assembly was reduced from 14 gaps to 6 gaps with HiC data. Was an additional round of gap filling performed?
Lines 313-314 - How is the contig N50 larger than the scaffold N50 above?
Lines 335-336 - Does this refer to the Merqury analysis? I don't think "using mapped K-mers" is correct here, please reword.
Lines 367-368 - what does it mean that "8 out of 63 gaps were corrected"? Is this from the HiC ordering of contigs?
Line 369 - what does the mapping between Sscrofa11.1 and JH-T2T shown in figure S6 have to do with the JH-T2T gap filling being described here?
Line 369 - I previously asked about this supplemental table only containing 55 entries. The authors' response "The other filled 8 gaps were resolved through adjustments made to the Hi-C map to correct misassembles. As a result, these gaps cannot be precisely located within the existing order of the assembly." indicates that contigs must have been incorrectly joined solely based on the HiC signal between contigs. The authors must know what contigs were added or joined to form the final assembly. It would be trivial to align the two assembly versions and identify the positions of the old contigs in the new assembly. I believe that these incorrectly joined contigs should be broken and put through the same gap filling procedure as performed earlier.
Lines 375-378 - Dramatic coverage changes in read mappings as found in these figures are usually indicative of assembly errors. I do not agree that "These findings confirmed the accuracy and reliability" of the assembly. I suggest replacing the last sentence with something more measured such as "Although supported by some read data, the inconsistency of coverage across these gap filled regions suggests that caution should be used when interpreting findings in these regions, cross-referencing results with the gap positions (Supplementary Table S9) is advised."
Line 375 - "evidenced by fully coverage": remove "fully"; it isn't proper usage of the word, and I wouldn't interpret the low coverage in many of these regions as "full coverage".
Line 385 - should read "Overall, our assembly quality metrics indicate a near-gapless assembly of the pig genome"
Line 390 - should read "a gapless T2T sequence for 16 out of 20"
Line 396 - Supplemental table 10 not 9.
Lines 398-399 - according to supplemental table S4 and figure 3A, chromosome 2 also has a single telomere.
Line 402 - the centromeres are not marked in Figure 3A.
Line 402 - Figure S8 - please rename chr19 and chr20, chrX and chrY.
Line 406 - "at early research": unclear what is meant by this. Please reword.
Line 423 - as indicated on line 397, 33 telomeres were identified, not 35.
Line 426 - "The JH-T2T assembly IDENTIFIED 17 centromeres"
Line 450 - "are located in"
Line 453 - "these SVs are located in"
Line 455 - "Moreover, 12,129 genes overlap these SVs"
Line 502 - "which contained 544 gaps"
Line 841 - Figure 2 legend description is still incorrect. Only A is mapping rates, B and C are PM rates and base error rates.
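The questions above about gap positions and about contig versus scaffold N50 can be made concrete with a small sketch. The following Python is illustrative only: the sequences are toy strings, not JH-T2T data, and the names `n50`, `split_at_gaps`, and `scaffolds` are hypothetical. It assumes gaps are represented as runs of `N`, splits each scaffold into contigs at those runs, and computes both N50 values. When contigs are derived from the same scaffolds and gaps are short, one would normally expect the contig N50 not to exceed the scaffold N50, which is why the reverse invites a question.

```python
import re

def n50(lengths):
    """Return the length L such that pieces of length >= L hold at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0

def split_at_gaps(sequence):
    """Split a scaffold at runs of N; return (contigs, gap_spans) with half-open spans."""
    gap_pattern = re.compile(r"[Nn]+")
    gaps = [(m.start(), m.end()) for m in gap_pattern.finditer(sequence)]
    contigs = [piece for piece in gap_pattern.split(sequence) if piece]
    return contigs, gaps

# Toy scaffolds (hypothetical): chrA has one 10-bp gap, chrB is gapless.
scaffolds = {
    "chrA": "ACGT" * 5 + "N" * 10 + "ACGT" * 3,
    "chrB": "ACGT" * 4,
}

contig_lengths = []
gap_table = {}
for name, seq in scaffolds.items():
    contigs, gaps = split_at_gaps(seq)
    contig_lengths.extend(len(c) for c in contigs)
    gap_table[name] = gaps

scaffold_n50 = n50([len(seq) for seq in scaffolds.values()])
contig_n50 = n50(contig_lengths)

print("gap positions:", gap_table)
print("scaffold N50:", scaffold_n50, "contig N50:", contig_n50)
```

Recording gap spans per chromosome in this way, before and after each curation step, is one simple route to the kind of table the reviewer asks for.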
Original version
Reviewer 1: Martien Groenen
The manuscript describes the T2T genome assembly for the Chinese pig breed Jinhua, which presents a vast improvement compared to the current reference genome of the Duroc pig TJTabasco (build11.1). The results and methodology used for the assembly are described clearly and the authors show the improvement of this assembly by a detailed comparison with the current reference 11.1. While clearly of interest to be published, several aspects of the manuscript should be improved. Most of these changes are minor modifications or inaccuracies in the presentation of the results.
However, there are two major aspects that need further attention:
The T2T assembly presented represents a combination of the two haplotypes of the pig sequenced. I am surprised that the authors did not also develop two haplotype resolved assemblies of this genome. Haplotype resolved assemblies will be the assemblies of choice for future developments of a reference pan-genome for pigs. The authors describe that they have sequenced the two parents of the sequenced F1 individual, so why did they not use the trio-binning approach to also develop haplotype resolved assemblies? I think adding these to the manuscript would be a vast improvement for this important resource.
The results described for the identification of selective sweep regions are not very convincing. This analysis shows differences in the genomes of two breeds: Duroc and Jinhua. However, these breeds have a very different origin of domestication of wild boars that diverged 1 million years ago, followed by the development of a wide range of different breeds selected for different traits. Therefore, the comparison made by the authors cannot distinguish between differences in evolution of Chinese and European Wild Boar, more recent selection after breed formation and even drift. To be able to do so, these analyses would need the inclusion of additional breeds and wild boars from China and Europe. Alternatively, the authors can decide to tone down this part of the manuscript or even delete it altogether, as it does not add to the major message of the manuscript.
Minor comments
Line 34: Change the sentence to: "with thousands of segments and centromeres and telomeres missing"
Line 37: Insert "and Hi-C" after "long reads"
Line 46: Delete "such as GPAM, CYP2C18, LY9, ITLN2, and CHIA"
Line 54: Insert "potential" before "xenotransplantation"
Line 82: Delete "in response to the gap of a T2T-level pig genome" as this does not add anything and the use of "gap" in this context is confusing.
Line 93: Change "The fresh blood" to "Fresh blood"
Line 100: The authors need to provide a reference for the SDS method.
Lines 152-153, line 444, and table S6: This is confusing. The authors mention genotypes from 939 individuals, but the table shows that they have used WGS data. You need to describe how the WGS data was used to call the genotypes for these individuals. Furthermore, in line 444 you mention 289 JH pigs and 616 DU pigs, which together is 905. What about the other 34 individuals shown in table S6?
Line 244: Replace "were" by "was" and delete "the" before "fastp"
Lines 287-292: Here you use several times "length of xx Gb and yy contigs". This is not correct as the value for the contigs refers to a number and not a length. Rephrase e.g. like "length of xx Gb and consisting of yy contigs"
Line 294: The use of "bone" seems strange. Either use "backbone" or "core"
Line 306: Replace "chromosome" by "genome"
Lines 308-309: For the comment "Second, 16 of the 20 chromosomes were each represented by a single contig" you refer to figure 1D; however, from this figure it cannot be seen if the different chromosomes consist of a single or multiple contigs.
Line 346: Do you mean build 11.1 with "historical genome version"? If so, please use that instead.
Line 349: "post-gap filled"
Line 353: The largest gap is 35 kb not 36 kb. Figures 2F-I should be better explained in the legends and the main text (lines 353-358).
Line 378: For the 23,924 genes you refer to supp table S13. However, that table shows a list of SV enriched QTL, not these genes. Furthermore, I checked all tables but a table with all the protein coding genes is missing.
Line 380: For the 799 newly anchored genes, refer to table S10. Now you refer to table S17, which shows genes enriched KEGG pathways.
Lines 383-386: For the higher gene density in GC rich regions, you refer to figure 1D, but it is impossible to see this correlation from figure 1D. For the density of genes and telomeres, you refer to figure 1G. However, that figure does not show gene densities, only repeat densities.
Lines 406-407: This should be table S11.
Lines 409-412: For this result you refer to table S11. However, that table only shows data for the gained genes, not the lost genes.
Lines 419-420: You refer to table S12 and figure 3B, but the information is only shown in figure 3B and not in table S12.
Line 420: Replace "were" by "is"
Line 422: Better to use "repeats" instead of "they"
Line 425: "Moreover, 12,129 genes located in these SVs". It is unclear what "these" refers to, and I assume that you mean genes that (partially) overlap with SVs? Also, this is an incomplete sentence (verb missing). Likewise, this number is not very meaningful as many of these SVs are within introns. It is much more informative to mention for how many genes SVs affect the CDS.
Line 433 and table S14: This validation is not clear at all. What exactly are these numbers that are shown? You also mention "greater than 1.00" but the table does not contain any number that is greater than 1.00.
Line 435: "Table" not "Tables"
Line 436: Change to "SVs with a length larger than 500 bp"
The term "invalidate" in figure 3D is rather awkward. Better to use "not-validated" and "validated" in this figure.
Line 449: This should be Table S16.
Line 452: There is no Table S18.
Lines 484-486: Change to "Similarly, in human, the use of the T2T-CHM13 genome assembly yields a more comprehensive view of SVs genome-wide, with a greatly improved balance of insertions and deletions [61]."
Lines 500-501: Change to "For example, in human, the T2T-CHM13 assembly was shown to improve the analysis of global"
Lines 517-528: This paragraph should be deleted as these genes have already been annotated and described in previous genome builds including 11.1. Why discuss these genes here? Following that line of thinking, almost every gene of the 20,000 can be discussed.
Line 532: "%" instead of "%%" and insert "which" after "SVs"
Lines 537-542: These sentences should be deleted. It is common knowledge that second generation sequencing is not very sensitive to identify SVs. The authors also do not provide any results about dPCR.
Line 544: "affect" rather than "harbor"
Lines 544-547: This is repetitive and has been stated multiple times so better to delete.
Line 561: "which is serve to immune system's response and relevant to transplant rejection" This is an incorrect sentence and should be rephrased.
Lines 562-568: I don't agree with this statement and suggest removing it from the discussion.
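The distinction drawn above between genes that merely overlap an SV and genes whose CDS is affected can be sketched in a few lines. This Python example is a hypothetical illustration, not the authors' pipeline: the coordinates, gene names, and the helper `genes_with_cds_hit` are all made up, and intervals are assumed to be half-open `[start, end)` on named chromosomes.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """Half-open intervals overlap when each one starts before the other ends."""
    return a_start < b_end and b_start < a_end

def genes_with_cds_hit(svs, cds_by_gene):
    """Return sorted names of genes with at least one CDS segment overlapped by an SV."""
    hit = set()
    for gene, segments in cds_by_gene.items():
        for chrom, start, end in segments:
            if any(chrom == sv_chrom and overlaps(start, end, sv_start, sv_end)
                   for sv_chrom, sv_start, sv_end in svs):
                hit.add(gene)
                break  # one affected CDS segment is enough for this gene
    return sorted(hit)

# Toy SVs as (chromosome, start, end); all values are hypothetical.
svs = [("chr1", 150, 160), ("chr1", 500, 900), ("chr2", 10, 20)]

# Toy CDS segments per gene; geneB only has an SV inside its intron (400-950).
cds_by_gene = {
    "geneA": [("chr1", 100, 200), ("chr1", 1000, 1100)],
    "geneB": [("chr1", 300, 400), ("chr1", 950, 1000)],
    "geneC": [("chr2", 20, 50)],
}

affected = genes_with_cds_hit(svs, cds_by_gene)
print(affected)
```

In this toy case geneB spans an SV but is not reported, because the SV falls entirely between its CDS segments. For genome-scale data an interval tree or a sorted sweep would be preferable to this quadratic scan.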
Reviewer 2: Benjamin D Rosen
The first near-complete genome assembly of pig: enabling more accurate genetic research. The authors describe the telomere-to-telomere assembly of a Jinhua breed pig. They sequenced genomic DNA from whole blood with PacBio HiFi and Oxford Nanopore (ONT) long-read technologies as well as Illumina for short reads. They generated HiC data for scaffolding from blood and extracted RNA from 19 tissues for short read RNAseq for gene annotation. A hifiasm assembly was generated with the HiFi data and scaffolded with HiC to chromosome level with 63 gaps. The scaffolded assembly was gap filled with contigs from a NextDenovo assembly of the ONT data bringing the gaps down to 14. Finally, the assembly was manually curated with juicebox somehow closing a further 8 gaps. This needs to be clarified. Standard assembly assessments were performed as well as genome annotation. The authors compared their assembly to the current reference, Sscrofa11.1, and called SVs between the assemblies. The SVs were validated with additional Jinhua and Duroc animals. They then identified signatures of selection present in some of the largest SVs.
General comments: The manuscript is mostly easy to read but would benefit from further editing for language throughout. The described assembly appears to be high quality and quite contiguous. Although the authors do mention obtaining parental samples and claim the assembly is fully phased, there is no mention of how this was done. There are many additional places where the methods could be described more fully including the addition of parameters used.
Specific comments:
Line 39 - Figure 1 only displays 34 telomeres, not 35. Additionally, I was only able to detect 33 telomeres using seqtk telo. Seqtk only reports telomeres at the beginning and end of sequences; digging further, the telomere on chr2 is ~59kb from the end of the chromosome, perhaps indicating a misassembly.
Lines 79-81 - there are not hundreds of species with gap free genome assemblies and reference 19 does not claim that there are.
Line 82 - the assembly is not gap-free, replace with "nearly gap-free"
Line 95 - were these parental tissue samples ever used?
Lines 151-156 - this section would be better located below the assembly methods. Please number supplementary tables in order of their appearance in the text.
Line 171 - please provide parameters used here and for all analyses.
Lines 187-188 - how did rearranging contigs decrease the gaps? Was the same gap filling procedure used after HiC manual adjustments?
Line 188 - Figure S3 - I don't understand the relationship between the panels nor what the authors are attempting to show. If panels A-C display chromosomes 2, 8, and 13, why does D display chr3? Both panels C and E are labeled chr13 but they look nothing alike. Are D-E whole chromosomes or zoomed in views? Missing description of panel F.
Lines 222-224 - why weren't pig proteins used? Ensembl rapid release has annotated protein datasets for 9 pig assemblies.
Line 264 - although most will know this, make it clear that Sscrofa11.1 is an assembly of a Duroc pig.
Line 292 - how was polishing performed? This is missing from the methods.
Line 294 - should this read "selected it for the backbone of the genome assembly."?
Lines 298-299 - methods?
Line 314 - what is meant by "using mapped K-mers from trio Illumina PCR-free reads data"?
Line 331 - accession numbers for assemblies would be useful.
Line 333 - what is "properly mapped rate"? Do you mean properly paired mapping rate?
Line 346 - what is the historical genome version?
Line 349 - Supplemental Table S8 only has 55 entries including the 6 remaining gaps. Where are the other filled 8 gaps located?
Lines 350-358 - read depth displays wouldn't show the presence of clipped reads which would indicate an improperly closed gap. It would be more convincing to display IGV windows containing these alignments showing that there are no clipped reads.
Line 354 - Figure S5 needs a better legend. What is ref and what is own?
Line 359 - the assembly is near-gapless.
Line 359 - where is the data regarding assembly phasing? How was this determined to be fully phased?
Line 363 - 16 of 20 chromosomes are gapless.
Line 370 - only 33 telomeres were found at the expected location (end of the chromosome); if you count the telomere on chr2 59kb from the end, then 34 telomeres were identified.
Line 372 - chr13 also only has a single telomere. It does not have a telomere at the beginning.
Line 372 - chr19 is chrX, correct?
Line 374 - Figure 1G - It would be nice to have the centromeres marked on this plot (or in Figure 3A). Are the long blocks of telomeric repeats internal to the chromosomes expected?
Line 423 - Figure 3A - there is no telomeric repeat at the beginning of chr4 or chrX.
Line 431 - why were only 5 pigs of each breed used to validate SVs when hundreds of WGS datasets from the two breeds had been aligned? How were these 5 selected?
Line 481 - Sscrofa11.1 only has 544 gaps.
Line 492 - ONT data was used to fill more than 6 gaps. Gaps in the assembly were reduced from 63 to 14 using ONT contigs.
Lines 588-589 - please make your code publicly available through zenodo, github, figshare, or something similar.
Lines 815-824 - Figure 2 - legend description needs to be improved. Only A is mapping rates, B and C are PM rates and base error rates. The color switch from A-C having European pigs in blue to D having JH-T2T in blue might confuse readers.
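The telomere counts discussed above depend on where a telomeric array sits relative to the sequence ends. The following Python sketch shows one way to separate terminal from internal arrays. It is a simplified illustration and not a reimplementation of seqtk telo: it assumes the vertebrate repeat unit TTAGGG (and its reverse complement CCCTAA), matches only perfect tandem copies, and uses hypothetical thresholds (`min_copies`, `window`) and a toy sequence.

```python
import re

# Vertebrate telomeric repeat unit and its reverse complement.
TELOMERE_UNITS = ("TTAGGG", "CCCTAA")

def telomere_runs(seq, min_copies=10):
    """Return sorted (start, end, unit) for perfect tandem runs of >= min_copies units."""
    upper = seq.upper()
    runs = []
    for unit in TELOMERE_UNITS:
        pattern = re.compile("(?:%s){%d,}" % (unit, min_copies))
        runs.extend((m.start(), m.end(), unit) for m in pattern.finditer(upper))
    return sorted(runs)

def classify_runs(seq, window=1000, min_copies=10):
    """Label each run 'start', 'end', or 'internal' by its distance from the sequence ends."""
    labelled = []
    for start, end, unit in telomere_runs(seq, min_copies):
        if start <= window:
            where = "start"
        elif len(seq) - end <= window:
            where = "end"
        else:
            where = "internal"
        labelled.append((where, start, end, unit))
    return labelled

# Toy chromosome (hypothetical): a telomere at the start and an array 2 kb from the end.
toy = "CCCTAA" * 12 + "ACGT" * 1000 + "TTAGGG" * 12 + "ACGT" * 500
result = classify_runs(toy)
for row in result:
    print(row)
```

An array reported as internal but close to an end, as in the toy case, is the kind of signal that would prompt a check for a misassembly or unplaced terminal sequence. Real telomeric arrays contain variant units, so a production tool would tolerate mismatches.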
