Distinguish virulent and temperate phage-derived sequences in metavirome data with a deep learning approach
Listed in: Evaluated articles (GigaScience)
Abstract
Background
Prokaryotic viruses, referred to as phages, can be divided into virulent and temperate phages. Distinguishing virulent from temperate phage-derived sequences in metavirome data is important for understanding their roles in interactions with bacterial hosts and in the regulation of microbial communities. However, no experimental or computational approach can effectively classify sequences of these two types in culture-independent metavirome data. We present a new computational method, DeePhage, which can directly and rapidly judge each read or contig as a virulent or temperate phage-derived fragment.
Findings
DeePhage uses a “one-hot” encoding to represent DNA sequences both overall and in detail. Sequence signatures are detected with a deep learning algorithm, namely a convolutional neural network, which extracts valuable local features. DeePhage performs better than the most closely related method, PHACTS. The accuracy of DeePhage in five-fold cross-validation reaches as high as 88%, nearly 30% higher than that of PHACTS. Evaluation on real metavirome data shows that DeePhage annotated 54.4% of reliable contigs, while PHACTS annotated 44.5%. Running on the same machine, DeePhage is 810 times faster than PHACTS. In addition, we propose a new strategy to explore phage transformations in the microbial community by directly detecting temperate viral fragments in metagenome and metavirome data. The detectable transformation of temperate phages offers new insight into potential treatments for human disease.
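The one-hot representation and the idea of a convolutional filter responding to local sequence features can be illustrated with a minimal sketch. This is not DeePhage's implementation: the column order `ACGT`, the all-zero row for ambiguous bases, and the helper names `one_hot_encode` and `scan_filter` are assumptions made for illustration, and the hand-written filter stands in for weights that a real network would learn.

```python
BASES = "ACGT"  # column order is an assumption of this sketch


def one_hot_encode(seq):
    """Encode a DNA string as a list of 4-element rows, one row per base.

    Each row holds a single 1.0 in the column of its base. Bases outside
    A/C/G/T (for example N) become an all-zero row; this handling is an
    assumption, not something stated in the abstract.
    """
    index = {base: i for i, base in enumerate(BASES)}
    rows = []
    for base in seq.upper():
        row = [0.0] * len(BASES)
        col = index.get(base)
        if col is not None:
            row[col] = 1.0
        rows.append(row)
    return rows


def scan_filter(encoded, kernel):
    """Slide one filter (k rows x 4 columns) along the encoded sequence.

    Returns one score per window (stride 1, no padding), which is the
    basic operation a 1D convolutional layer applies for each filter.
    """
    k = len(kernel)
    scores = []
    for start in range(len(encoded) - k + 1):
        score = sum(
            encoded[start + i][j] * kernel[i][j]
            for i in range(k)
            for j in range(len(BASES))
        )
        scores.append(score)
    return scores


if __name__ == "__main__":
    read = one_hot_encode("ACGTNACG")
    motif = one_hot_encode("ACG")  # hypothetical filter, not a learned one
    print(scan_filter(read, motif))
```

In this toy case the filter scores highest on windows that match `ACG` exactly, which is the sense in which a convolutional layer picks out local features from the encoded sequence.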
Conclusions
DeePhage is the first tool that can rapidly and efficiently identify the two kinds of phage fragments, especially for metagenomics analysis, with satisfactory performance. DeePhage is freely available via http://cqb.pku.edu.cn/ZhuLab/DeePhage or https://github.com/shufangwu/DeePhage .
Now published in GigaScience doi: 10.1093/gigascience/giab056
Authors: Shufang Wu (1, 2), Zhencheng Fang (1, 2), Jie Tan (1, 2), Mo Li (3), Chunhui Wang (3), Qian Guo (1, 2, 4), Congmin Xu (1, 2, 4), Xiaoqing Jiang (1, 2), Huaiqiu Zhu (1, 2, 4, 5)
Affiliations:
1 State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, College of Engineering, Peking University, Beijing 100871, China
2 Center for Quantitative Biology, Peking University, Beijing 100871, China
3 Peking University-Tsinghua University - National Institute of Biological Sciences (PTN) joint PhD program, School of Life Sciences, Peking University, Beijing 100871, China
4 Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Georgia 30332, USA
5 Institute of Medical Technology, Peking University Health Science Center, Beijing 100191, China
For correspondence: Huaiqiu Zhu, hqzhu@pku.edu.cn
A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giab056 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.
These peer reviews were as follows:
Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102812
Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102813
