2FAST2Q: a general-purpose sequence search and counting program for FASTQ files

Afonso M. Bravo
Athanasios Typas
Jan-Willem Veening

Abstract

The increasingly widespread use of next-generation sequencing protocols has created a need for user-friendly tools for processing raw data. Here, we explore 2FAST2Q, a versatile and intuitive standalone program capable of extracting and counting feature occurrences in FASTQ files. Although 2FAST2Q was previously described as part of a CRISPRi-seq analysis pipeline, here we elaborate on the program’s functionality and its broader applicability.

Methods

2FAST2Q is built in Python, with standalone executables published for MS Windows, macOS, and Linux. It has a familiar user interface and uses an advanced custom sequence-searching algorithm.

Results

Using published CRISPRi datasets that tested Escherichia coli and Mycobacterium tuberculosis gene essentiality, as well as host-cell sensitivity to SARS-CoV-2 infectivity, we demonstrate that 2FAST2Q efficiently recapitulates the published read counts per provided feature. We further show that 2FAST2Q can be used in any experimental setup that requires feature extraction from raw reads: within a single program, it quickly handles Hamming-distance-based mismatch alignments, nucleotide-wise Phred score filtering, custom read trimming, and sequence searching. Moreover, we exemplify how different FASTQ read filtering parameters impact downstream analysis, and suggest a default usage protocol. 2FAST2Q is easier to use and faster than currently available tools, efficiently processing CRISPRi-seq and random-barcode sequencing datasets on any up-to-date laptop, and also handling the advanced extraction of de novo features from FASTQ files. We expect that 2FAST2Q will be useful not only for researchers in microbiology but also in other fields in which amplicon sequencing data is generated. 2FAST2Q is available as an executable file for all current operating systems, requiring no installation, and as a Python3 module on the PyPI repository (available at https://veeninglab.com/2fast2q ).
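To make the filtering steps named above concrete, the following is a minimal Python sketch of how read trimming, nucleotide-wise Phred filtering, and Hamming-distance matching can be combined to count features. It is an illustration only and is not 2FAST2Q's own algorithm or interface: the function names (`hamming`, `read_fastq`, `count_features`), the parameter names, and the example reads are all hypothetical. It also assumes Phred+33 quality encoding, four-line FASTQ records, and a feature that sits at a fixed position in every read.

```python
from collections import Counter
from io import StringIO


def hamming(a, b):
    """Return the number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def read_fastq(handle):
    """Yield (sequence, quality) pairs from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().strip()
        yield seq, qual


def count_features(handle, features, start, length, min_phred=30, max_mismatches=1):
    """Count reads per feature (hypothetical helper, not the 2FAST2Q API).

    features maps a feature name to its sequence; start and length define
    the trimmed window of each read that is compared against the features.
    """
    by_sequence = {seq: name for name, seq in features.items()}
    counts = Counter()
    for seq, qual in read_fastq(handle):
        # Custom trimming: keep only the window expected to hold the feature.
        window = seq[start:start + length]
        window_qual = qual[start:start + length]
        if len(window) < length:
            continue
        # Nucleotide-wise Phred filter (assumes Phred+33 encoding).
        if any(ord(ch) - 33 < min_phred for ch in window_qual):
            continue
        # An exact match is cheapest, so try it first.
        name = by_sequence.get(window)
        if name is None and max_mismatches > 0:
            # Fall back to Hamming matching; skip reads that fit several features.
            close = [n for n, f in features.items()
                     if len(f) == length and hamming(f, window) <= max_mismatches]
            name = close[0] if len(close) == 1 else None
        if name is not None:
            counts[name] += 1
    return counts


# Made-up example data: four reads, one of them with two low-quality bases ('#').
example_fastq = (
    "@r1\nGGACGTACGTAA\n+\nIIIIIIIIIIII\n"
    "@r2\nGGACGTACGAAA\n+\nIIIIIIIIIIII\n"
    "@r3\nGGTTTTCCCCAA\n+\nII##IIIIIIII\n"
    "@r4\nGGTTTTCCCCAA\n+\nIIIIIIIIIIII\n"
)
example_features = {"geneA": "ACGTACGT", "geneB": "TTTTCCCC"}

print(dict(count_features(StringIO(example_fastq), example_features, start=2, length=8)))
# prints {'geneA': 2, 'geneB': 1}
```

In this toy input, read r2 differs from geneA at one position and is still counted because one mismatch is allowed, while r3 is discarded because two bases in its window fall below the Phred threshold. Setting `max_mismatches=0` would drop r2 as well, which shows how the choice of filtering parameters changes the resulting counts.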

PeerJ
Oct 25, 2022

Version published to 10.7717/peerj.14041
Oct 25, 2022
Version published to 10.1101/2021.12.17.473121 on bioRxiv
Dec 18, 2021

