NJGPT: A Large Language Model-Driven, User-Friendly Solution for Phylogenetic Tree Construction

Abstract

Motivation

Phylogenetic reconstruction is integral to much of the research in evolutionary biology. Currently, a newly minted phylogeneticist must choose among a relatively large array of phylogenetic software packages, each often with its own analytical routines, inputs, and outputs. Our overarching aim is to construct a user-friendly pipeline that understands queries written in natural language and builds a phylogenetic tree from sequence data, using ChatGPT-4, the most recent generative AI tool released by OpenAI at the time of writing. In doing so, we demonstrate, as a proof of concept, how generative AI may be used in phylogenetics. We also describe the steps presently needed to ensure that ChatGPT builds phylogenetic trees accurately.

Results

We present NJGPT, a phylogenetic tool built using ChatGPT, a Large Language Model (LLM) Generative Pre-trained Transformer (GPT), which employs the Neighbor-Joining method to construct phylogenetic trees. NJGPT simplifies phylogenetic tree construction by allowing users to generate and visualize trees using natural language queries. It supports multiple sequence file formats, matrix calculation models, and gap-deletion methods. To evaluate the performance of NJGPT, we compared its output and runtimes with those of the widely used phylogenetic software MEGA. Our results show that NJGPT produces trees identical to those from MEGA over a range of sequence lengths and simple models of evolution. However, NJGPT faces visualization issues with datasets of more than 50 taxa, and operational failures with larger datasets due to token limits. NJGPT runtimes were also substantially slower than those of MEGA; however, NJGPT's user-friendly interface makes it well suited to beginners.
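To make the terms above concrete, the following is a minimal sketch of a distance-matrix calculation with two gap-deletion modes, followed by textbook Neighbor-Joining. It is not taken from the NJGPT source code: the function names `p_distance_matrix` and `neighbor_joining`, the use of the simple p-distance model, the `-` gap character, and the toy alignment are all assumptions made for illustration.

```python
from itertools import combinations


def p_distance_matrix(seqs, deletion="pairwise"):
    """Proportion of differing sites between aligned sequences (hypothetical helper).

    deletion="complete" drops every column containing a gap in any sequence;
    deletion="pairwise" skips gapped sites only for the pair being compared.
    """
    names = list(seqs)
    if deletion == "complete":
        length = len(seqs[names[0]])
        keep = [i for i in range(length) if all(seqs[n][i] != "-" for n in names)]
        seqs = {n: "".join(seqs[n][i] for i in keep) for n in names}
    dist = {n: {} for n in names}
    for a, b in combinations(names, 2):
        sites = [(x, y) for x, y in zip(seqs[a], seqs[b]) if x != "-" and y != "-"]
        # Assumption: every pair shares at least one ungapped site.
        dist[a][b] = dist[b][a] = sum(x != y for x, y in sites) / len(sites)
    return dist


def neighbor_joining(dist):
    """Return a Newick string built by textbook Neighbor-Joining (needs >= 2 taxa)."""
    d = {a: dict(row) for a, row in dist.items()}
    nodes = list(d)
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all other active nodes.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Join the pair minimising Q(i, j) = (n - 2) d(i, j) - r(i) - r(j).
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]])
        li = 0.5 * d[i][j] + (r[i] - r[j]) / (2 * (n - 2))
        lj = d[i][j] - li
        u = f"({i}:{li:.4f},{j}:{lj:.4f})"  # new internal node, labelled by its subtree
        rest = [k for k in nodes if k not in (i, j)]
        d[u] = {}
        for k in rest:
            d[u][k] = d[k][u] = 0.5 * (d[i][k] + d[j][k] - d[i][j])
        nodes = rest + [u]
    a, b = nodes
    # The last two nodes are connected by a single branch.
    return f"({a},{b}:{d[a][b]:.4f});"


# Toy alignment (made up for illustration only).
alignment = {"s1": "ACGT-A", "s2": "ACGTTA", "s3": "AC-TTC"}
print(neighbor_joining(p_distance_matrix(alignment, deletion="pairwise")))
```

Switching `deletion` to `"complete"` changes the distances whenever gaps fall in different columns for different sequences, which is one reason the choice of gap-deletion method can affect the resulting tree. Neighbor-Joining can also yield negative branch lengths on very small or noisy matrices; this sketch does not correct for that.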

Availability

NJGPT is freely available at https://chatgpt.com/g/g-1OzP3Qviw-njgpt . The source code, implemented in Python, is available on GitHub ( https://github.com/ZWan622/NJGPT1.0.git ).

Contact

zwan622@aucklanduni.ac.nz

Supplementary information

Supplementary data are available at Bioinformatics online.
