Multi-Scale Computational Analysis of Wikipedia’s Telling of Global History
Abstract
Computational approaches to global history have the potential to advance multi-scale analyses of globality, globalization, interaction, integration, connectivity, and interrelatedness. We present a multi-scale analysis of temporal markers in 3.25M Wikipedia articles, comparing their distributions across 357 natural and constructed languages by means of language-normalized year-frequency vectors for three broad domains: ideologies, sports, and objects. Using distance-based metrics and heatmap visualizations, we report findings at three different scales. A closer look at temporal ranges in the twentieth century shows varying levels of consensus across topics. The highest level of consensus is found for sports in the 1966–2022 range. This novel approach not only suggests opportunities for the historiography of global history; it may also contribute to research on mitigating recency and popularity bias in a broad range of AI technologies, including generative artificial intelligence and recommendation tools. It also highlights the opportunity to develop simple, dynamic multi-scale visualizations to support user awareness and navigation of consensus about phenomena from the past.
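To make the notion of a language-normalized year-frequency vector concrete, the following is a minimal sketch and not the authors' pipeline. The abstract does not specify how temporal markers are extracted or which distance metric is used, so the four-digit-year regular expression, the helper names `year_vector` and `cosine_distance`, and the choice of cosine distance are all assumptions made for illustration.

```python
import re
from collections import Counter
from math import sqrt

# Assumption: a temporal marker is any standalone four-digit year 1000-2029.
YEAR_RE = re.compile(r"\b(1[0-9]{3}|20[0-2][0-9])\b")


def year_vector(texts, start, end):
    """Count year mentions in one language's texts and normalize to sum to 1.

    Normalizing by the total number of mentions keeps large and small
    language editions comparable (hypothetical normalization scheme).
    """
    counts = Counter()
    for text in texts:
        for match in YEAR_RE.findall(text):
            year = int(match)
            if start <= year <= end:
                counts[year] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * (end - start + 1)
    return [counts[y] / total for y in range(start, end + 1)]


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


# Toy corpora standing in for two language editions (invented text).
lang_a = ["The final was held in 1966.", "Rematches followed in 1970 and 2022."]
lang_b = ["A tournament in 1966 and another in 2022."]

vec_a = year_vector(lang_a, 1966, 2022)
vec_b = year_vector(lang_b, 1966, 2022)
distance = cosine_distance(vec_a, vec_b)
```

Under these assumptions, a smaller distance between two languages' vectors would indicate closer agreement about which years matter for a topic; computing it for every language pair would yield the kind of matrix a heatmap can display.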