Indigenous Plant Knowledge in the De la Cruz-Badiano Codex (1552): A Text Mining and Morphometric Study
Abstract
Summary

Rationale and aims
Nahuatl plant names such as ahuacatl and chīlli remain embedded in global languages, reflecting enduring Indigenous botanical knowledge. The 1552 De la Cruz-Badiano Codex preserves this knowledge through Nahuatl names, Latin text, and botanical illustrations. We asked whether these distinct representations encode a shared underlying structure of Nahua plant classification.

Methods
We compiled a multimodal dataset linking three data types from the Codex: leaf shape morphometrics derived from traced illustrations, text embeddings from English and Spanish translations, and graph embeddings based on co-occurrence patterns among Nahuatl plant names. A contrastive multimodal learning framework was used to integrate these representations into a shared embedding space.

Key results
The model successfully aligned two of the three modalities. Linguistic and name-structure data clustered closely together, while leaf shape occupied a more distinct region of the embedding space, indicating differing roles for morphology and language in classification.

Conclusions
These results suggest that Nahua botanical knowledge in the codex emphasizes relational and linguistic structure alongside, but not reducible to, plant form.
Our study demonstrates how computational methods can reveal patterns in historical Indigenous knowledge systems and support efforts to re-examine colonial-era sources in plant science.

Societal Impact Statement

Nahuatl
In huehca nahuatlahcuilolli in, in tlaneltiliztli in xihuitl, in cuahuitl, ihuan in tlacah amo zan tlahcuilolpan mopiya, zan mochipa yoltilana. Inin tlamachtiliztli ticnextia quenin in tlahtolli, in ixiptlahcuiloltin, ihuan in xihuitl ce tlamantli tlamatiliztli mochipa quichihua. Ticpohuah ihuan tiquixmati in xihuitl amo zan ipampa patli, zan ipampa nemiliztli ihuan tlalli. Inin tequitl quipalehuia in tlamatiliztli nahua, in tlahtolli, ihuan in tlaneltiliztli, ma yoltilana, ma tlapalehuia in tlacah ihuan in tlalli axcan ihuan moztla.

English
Translated from Nahuatl: In ancient Nahua writings, knowledge of plants, trees, and people was not fixed on the page, but carried forward in living ways. This study shows how language, images, and plants work together as one enduring system of knowledge. We examine plants not only as medicines, but as part of human life and the land itself. By reading these manuscripts through Indigenous structures of understanding, our work supports the vitality of Nahua knowledge, language, and plant relationships. This matters today because sustaining Indigenous ways of knowing strengthens care for people, plants, and the land, now and for the future. To enhance the reach of this work, a Spanish language version of the paper is available in the Supporting Information (see Translation_ES).

Keywords
Botany; De la Cruz-Badiano Herbal; Indigenous knowledge; Libellus de Medicinalibus Indorum Herbis; Multimodal learning; Nahuatl; Plant classification
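
The Methods summary mentions a contrastive multimodal learning framework but does not specify the objective. As a rough illustration only, the sketch below shows one common choice, a symmetric InfoNCE loss averaged over the three modality pairs (leaf shape, text, name graph). The loss, the temperature value, the function names, and the synthetic data are all assumptions for illustration, not the authors' implementation. It also assumes each modality has already been projected to vectors of the same dimension.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def info_nce(a, b, temperature=0.1):
    """Symmetric InfoNCE loss (assumed objective).

    Row i of `a` and row i of `b` are taken to describe the same plant;
    every other row in the batch serves as a negative example.
    """
    a, b = l2_normalize(a), l2_normalize(b)
    logits = a @ b.T / temperature           # pairwise similarity matrix
    idx = np.arange(len(a))                  # matching pairs sit on the diagonal
    loss_ab = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_ba = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_ab + loss_ba)

def multimodal_loss(shape, text, graph, temperature=0.1):
    """Average the pairwise losses over the three modality pairs."""
    return (info_nce(shape, text, temperature)
            + info_nce(shape, graph, temperature)
            + info_nce(text, graph, temperature)) / 3.0

# Synthetic stand-ins for projected embeddings: 8 hypothetical plants, 16 dimensions.
rng = np.random.default_rng(0)
base = rng.normal(size=(8, 16))
shape_emb = base + 0.05 * rng.normal(size=base.shape)
text_emb = base + 0.05 * rng.normal(size=base.shape)
graph_emb = base + 0.05 * rng.normal(size=base.shape)

aligned_loss = multimodal_loss(shape_emb, text_emb, graph_emb)
# Shifting the text rows by one breaks every plant-to-plant correspondence.
shuffled_loss = multimodal_loss(shape_emb, np.roll(text_emb, 1, axis=0), graph_emb)
print(f"aligned: {aligned_loss:.4f}  shuffled: {shuffled_loss:.4f}")
```

In this toy setup the loss is lower when rows across modalities refer to the same plant than when the correspondence is broken, which is the sense in which such an objective pulls matched representations together in a shared space.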