Challenges for multilingual computational text analysis researchers: evidence from a survey of social scientists

Abstract

This paper explores the impact of the English-centric development of computational text analysis methods (CTAM) on political science and other social science scholars, comparing those working with English texts to those using multilingual or non-English texts. Surveying 433 international scholars who published text-based research in top social science journals from 2016 to 2020, we assess concerns about CTAM validity, validation strategies, and challenges with multilingual corpora. Our analysis shows that multilingual scholars have more concerns about CTAM validity but do not use stricter validation methods. Additionally, non-native English speakers are more likely to analyze English texts with CTAM. These findings highlight an English-language bias in computational tools, suggesting a need for a more inclusive, multilingual approach in computational social science.
