The Representation of SDG-Related Research in Bibliometric Databases: Persisting Imbalances and Varying Perspectives

Abstract

Large bibliometric databases, such as Web of Science, Scopus, and OpenAlex, play a crucial role for decision-makers in science and science policy, who use them to inform decisions at both national and international levels, in the public and private sectors. Although these databases facilitate bibliometric analyses, they are also performative: they affect the visibility of scientific outputs and how participating entities are measured. Recently, they have incorporated the UN’s Sustainable Development Goals (SDGs) into their respective classifications, which have been criticized for diverging from one another. At the same time, their information-processing infrastructure is susceptible to emerging technologies. AI-supported and AI-powered tools have recently entered research practice and society at large, and Large Language Models (LLMs), the branch of generative AI specifically focused on text, underlie their operation. By leveraging the features of LLMs (in particular, their tendency to mirror, under certain conditions, what is thoroughly embedded in their training data), we use them as data magnifiers on SDG-classified publications to detect the data biases that affect bibliometric databases. More broadly, our general setup serves as a conceptual exercise that characterizes the macro-level effects to be expected on the representation of SDG-related research in bibliometric databases when a generic LLM-based tool is introduced. Our analysis shows that deploying LLMs in the information processing of bibliometric databases reveals that the data (i.e., scientific publications classified by SDGs) systematically overlook the most disadvantaged categories of individuals, the poorest countries, and the underrepresented topics that SDG targets explicitly focus on. Conversely, we identify an unsolicited hegemonic role played by economic superpowers and the Global North.
