ValuesML: A new Multilingual Dataset for Values Detection in News and Political Manifestos

Abstract

Values are important building blocks of ideologies and are regularly referenced in political debates all over the world. They are also abstract, requiring some process of translation to connect them with behaviours or policy preferences. However, it remains unclear how this translation takes form or how it differs across contexts, such as different countries and cultures. In this paper, we introduce a large-scale, high-quality dataset of news and political manifestos in multiple languages that was annotated and curated by values scholars according to the theory of human values of Schwartz (1992) and Schwartz et al. (2012). Moreover, each expression of a value annotated in the texts carries a second layer of annotation expressing the degree of fulfilment of the value in the text, specifically whether the value was (partially) attained or (partially) constrained. The final dataset comprises 2,648 annotated texts in nine languages, totalling 74,231 sentences. The dataset can be used to investigate the expression of values in the annotated news articles and political manifestos, and as training material for developing automated values detection methods based on natural language processing.
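
The two annotation layers described above (a value label plus its degree of fulfilment) could be represented roughly as follows. This is a minimal sketch and not the dataset's actual schema: the class names, field names, the reading of "(partially) attained or (partially) constrained" as four categories, and the example sentences are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Attainment(Enum):
    """Second annotation layer: degree of fulfilment of the value.

    Assumption: the abstract's "(partially) attained or (partially)
    constrained" is read here as four distinct categories.
    """
    ATTAINED = "attained"
    PARTIALLY_ATTAINED = "partially attained"
    PARTIALLY_CONSTRAINED = "partially constrained"
    CONSTRAINED = "constrained"


@dataclass(frozen=True)
class ValueAnnotation:
    """One annotated expression of a value (hypothetical record layout)."""
    sentence: str           # the annotated sentence
    language: str           # language code, e.g. "en"
    value: str              # first layer: a value label from the Schwartz theory
    attainment: Attainment  # second layer: degree of fulfilment


def count_by_value_and_attainment(annotations):
    """Count how often each (value, attainment) pair occurs."""
    return Counter((a.value, a.attainment) for a in annotations)


# Invented example records, not taken from the dataset.
examples = [
    ValueAnnotation("The new law protects citizens.", "en",
                    "security", Attainment.ATTAINED),
    ValueAnnotation("Budget cuts threaten public safety.", "en",
                    "security", Attainment.CONSTRAINED),
]

for (value, attainment), n in count_by_value_and_attainment(examples).items():
    print(value, attainment.value, n)
```

Keeping the fulfilment layer as a separate field, rather than folding it into the value label, makes it possible to aggregate either by value alone or by value and fulfilment together.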
