Automated Labeling of Scientific Names and Etymological Trend Analysis in Phytophagous Arthropods Using Large Language Model


Abstract

Scientific names, especially epithets, are derived from a variety of sources: not only species characteristics but also cultural factors, such as the names of people. They reflect how species were perceived at the time of naming. However, several ethical issues have been raised, such as naming species after criminals and gender imbalance in eponyms (epithets named after people). Previous research has relied on thorough literature reviews with random sampling, which require significant time and effort. In this study, the accuracy of automated labeling using a Large Language Model (LLM) was assessed, and the temporal etymological trends of 2,705 species of phytophagous arthropods were investigated. LLM-based classification achieved F1 scores above 75% and accuracy above 90% in the Morphology, Host, Geography, and People categories. However, the Ecology & Behavior and Other categories exhibited accuracy issues. Analyses using a Generalized Additive Model (GAM) revealed shifting naming trends, with a decrease in Morphology and an increase in Geography and People, consistent with previous research on spiders. This study demonstrates the effectiveness of LLM-based classification for epithets and provides a new perspective on the social and scientific debates surrounding scientific names based on etymological trends.
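As an illustration of how per-category F1 and accuracy figures like those above can be computed, the following is a minimal sketch that scores each category one-vs-rest against manually assigned labels. It assumes each epithet may carry one or more category labels; the function name `per_category_scores` and the data layout are hypothetical and not taken from the article, and the authors' actual evaluation procedure may differ.

```python
from typing import Dict, List, Set

# Category names as listed in the abstract.
CATEGORIES = ["Morphology", "Host", "Geography", "People",
              "Ecology & Behavior", "Other"]


def per_category_scores(
    gold: List[Set[str]], pred: List[Set[str]]
) -> Dict[str, Dict[str, float]]:
    """Score each category one-vs-rest (hypothetical helper).

    gold[i] is the set of manually assigned labels for epithet i,
    pred[i] is the set of labels returned by the LLM for epithet i.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    scores: Dict[str, Dict[str, float]] = {}
    for cat in CATEGORIES:
        tp = fp = fn = tn = 0
        for g, p in zip(gold, pred):
            in_gold, in_pred = cat in g, cat in p
            if in_gold and in_pred:
                tp += 1
            elif in_pred:
                fp += 1
            elif in_gold:
                fn += 1
            else:
                tn += 1

        # Guard against empty denominators for categories never seen.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        accuracy = (tp + tn) / len(gold) if gold else 0.0

        scores[cat] = {"precision": precision, "recall": recall,
                       "f1": f1, "accuracy": accuracy}
    return scores
```

Scoring one-vs-rest means that a rare category can reach high accuracy from true negatives alone while its F1 stays low, so the two metrics are best read together.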
