Assessment of PHRED Score Characteristics in Illumina MiSeq Amplicon Sequencing
Abstract
Background: PHRED scores are confidence values associated with each basecall generated by sequencers. Each score encodes the estimated probability that the basecall is incorrect. The calibration of PHRED scores has been examined in other studies by evaluating errors made in reading known sequences.

Methods: We investigated the calibration of the Illumina MiSeq instrument PHRED model using a large data set. We also derived calibration methods to adjust the PHRED scores to reflect characteristics of data sets similar to those produced by the Global Hepatitis Outbreak and Surveillance Technology (GHOST) error correction pipeline. The GHOST protocol intentionally uses a short amplicon, resulting in a region where many positions have two base calls, one coming from each of the paired reads. A maximum likelihood model of redundant base calls that match each other was used to estimate corrected probabilities of the PHRED scores.

Results: The PHRED scores showed only small absolute deviations from their target values. These deviations from perfect calibration were nonetheless statistically significant (p < 0.0001). The accuracy of the scores varied significantly among MiSeq instrument runs. Recalibration procedures produced quality scores that improved Brier scores for paired base matching by an average relative improvement of 2.83%.

Conclusions: Methods developed to create calibration curves for PHRED scores will be useful in improving error correction pipelines based on redundant deep sequencing of amplicon data. However, quality scores are relatively uninformative of substitution errors. The MiSeq instrument fails to distinguish substitution errors from adjacent correct base calls. The quality scores assigned are determined more by the global error rate of the sequencing run in the current machine cycle than by the characteristics of the specific base call.
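To make the quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It uses the standard PHRED relation Q = -10 log10(p) and the standard definition of the Brier score. The function `match_probability` is a hypothetical simplification: it assumes the two paired reads err independently and that an erroneous call is equally likely to be any of the three wrong bases. The abstract does not specify the form of the maximum likelihood model, so these assumptions and all function names are illustrative only.

```python
import math


def phred_to_error_prob(q):
    """Convert a PHRED score Q to the error probability p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert an error probability p to a PHRED score Q = -10 * log10(p)."""
    return -10 * math.log10(p)


def match_probability(q1, q2):
    """Probability that two redundant base calls agree.

    ASSUMPTION (not from the source): errors in the two reads are
    independent, and a wrong call is uniform over the three other bases.
    The calls agree if both are correct, or if both are wrong and happen
    to land on the same wrong base (probability 1/3).
    """
    p1 = phred_to_error_prob(q1)
    p2 = phred_to_error_prob(q2)
    return (1 - p1) * (1 - p2) + p1 * p2 / 3


def brier_score(predicted, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(predicted) != len(outcomes):
        raise ValueError("predicted and outcomes must have the same length")
    return sum((p - o) ** 2 for p, o in zip(predicted, outcomes)) / len(predicted)


if __name__ == "__main__":
    # Hypothetical paired calls: (Q of read 1, Q of read 2, 1 if the bases matched).
    pairs = [(30, 30, 1), (20, 35, 1), (12, 15, 0), (38, 38, 1)]
    predicted = [match_probability(q1, q2) for q1, q2, _ in pairs]
    observed = [matched for _, _, matched in pairs]
    print("Brier score:", brier_score(predicted, observed))
```

A lower Brier score indicates that the predicted match probabilities track the observed matches more closely, so a recalibrated score-to-probability mapping can be compared against the nominal one by computing this value for each on the same set of paired calls.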