Using Text Mining to Track Outbreak Trends in Global Surveillance of Emerging Diseases: ProMED-mail
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
ProMED-mail (Program for Monitoring Emerging Diseases) is an international disease outbreak monitoring and early warning system. Every year, users contribute thousands of reports that refer to infectious diseases and toxins. However, because reports are unevenly distributed across diseases, traditional statistics-based text mining techniques, typified by term-frequency-based algorithms, are not suitable. We therefore conducted a study in three steps: (i) report filtering, (ii) keyword extraction from reports, and (iii) word co-occurrence network analysis, to bridge the gap between ProMED and its utilization. Keyword extraction was performed with the TextRank algorithm; keyword co-occurrence networks were then produced from the top keywords of each document, and multiple network centrality measures were computed to analyse these networks. We used two major recent outbreaks, Ebola in 2014 and Zika in 2015, as cases to illustrate and validate the process. We found that the extracted information structures are consistent with the World Health Organization's description of the timeline and phases of the epidemics. Our research presents a pipeline that can extract and organize information to characterize the evolution of epidemic outbreaks. It also highlights the potential for ProMED to be used in monitoring, evaluating and improving responses to outbreaks.
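The three steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' code, which this page does not show. It assumes a simplified TextRank-style ranking (unweighted PageRank over a sliding-window word graph), a tiny hypothetical stopword list, and degree centrality as a stand-in for the "multiple network centrality measures". All function names, parameters and the toy reports are invented for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical, deliberately tiny stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "and", "for", "with", "that", "has", "have", "been", "are", "was", "were", "from"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens longer than 2 letters, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if len(w) > 2 and w not in STOPWORDS]


def filter_reports(reports, disease):
    """Step (i): keep only reports that mention the disease term."""
    return [r for r in reports if disease.lower() in r.lower()]


def textrank_keywords(text, top_k=3, window=2, damping=0.85, iterations=50):
    """Step (ii): rank words with a simplified, unweighted TextRank."""
    words = tokenize(text)
    neighbours = defaultdict(set)
    # Link each word to the words that follow it within the window.
    for i, w in enumerate(words):
        for other in words[i + 1:i + 1 + window]:
            if other != w:
                neighbours[w].add(other)
                neighbours[other].add(w)
    scores = {w: 1.0 for w in neighbours}
    # Power iteration of the PageRank update on the undirected word graph.
    for _ in range(iterations):
        scores = {
            w: (1 - damping) + damping * sum(scores[n] / len(neighbours[n])
                                             for n in neighbours[w])
            for w in neighbours
        }
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


def cooccurrence_network(keyword_lists):
    """Step (iii): edge weight = number of reports whose top keywords include both words."""
    weights = defaultdict(int)
    for keywords in keyword_lists:
        for a, b in combinations(sorted(set(keywords)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def degree_centrality(weights):
    """Fraction of the other nodes each keyword is directly connected to."""
    degree = defaultdict(int)
    for a, b in weights:
        degree[a] += 1
        degree[b] += 1
    denom = max(len(degree) - 1, 1)
    return {node: d / denom for node, d in degree.items()}


if __name__ == "__main__":
    # Toy reports, invented for illustration only.
    reports = [
        "Ebola virus cases reported in Guinea; Ebola outbreak confirmed in Guinea",
        "Health workers respond to the Ebola outbreak with new treatment centres",
        "Seasonal influenza activity remains low",
    ]
    selected = filter_reports(reports, "ebola")
    keyword_lists = [textrank_keywords(r) for r in selected]
    network = cooccurrence_network(keyword_lists)
    print(keyword_lists)
    print(degree_centrality(network))
```

Ranking words by graph position rather than by raw counts is one reason a TextRank-style method can be preferable when report volumes differ widely between diseases. In practice, a graph library would normally replace the hand-written centrality function.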
Article activity feed
SciScore for 10.1101/2020.01.10.20017145:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms
Sentences | Resources
METHODS The pipeline shown in Figure 1 is built in Python. | Python, suggested: (IPython, SCR_001658)
Evaluation of ProMED-mail as an electronic early warning system for emerging animal diseases: 1996 to 2004. | ProMED-mail, suggested: (ProMed-Mail, SCR_010260)
Data from additional tools added to each annotation on a weekly basis.
About SciScore
SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore is not a substitute for expert review. SciScore checks for the presence and correctness of RRIDs (research resource identifiers) in the manuscript, and detects sentences that appear to be missing RRIDs. SciScore also checks that rigor criteria are addressed by authors. It does this by detecting sentences that discuss criteria such as blinding or power analysis. SciScore does not guarantee that the rigor criteria it detects are appropriate for the particular study. Instead, it assists authors, editors, and reviewers by drawing attention to sections of the manuscript that contain or should contain various rigor criteria and key resources.