Applying Benford’s law to COVID-19 data: the case of the European Union

Abstract

Background

Previous studies have used Benford’s distribution to assess the accuracy of COVID-19 data. Inaccurate data misinform the media, undermine the global response and hinder the preventive measures taken by authorities.

Methods

Daily new cases and deaths from all the countries of the European Union were analyzed and their conformance to Benford’s distribution was assessed. Two statistical tests and two measures of deviation were calculated to determine whether the reported statistics comply with the expected distribution. Four country-level development indexes were included: GDP per capita, health expenditures, the Universal Health Coverage (UHC) Index and the full vaccination rate. Regression analysis was used to examine whether the deviation from Benford’s distribution is associated with these indexes.
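To make the conformance check concrete: Benford’s law gives the expected share of leading digit d as log10(1 + 1/d). The abstract does not name the two tests or the two deviation measures, so the sketch below assumes a Pearson chi-square statistic and the mean absolute deviation (MAD) purely as common choices. The function names are hypothetical, and the input is a toy series of powers of two, not COVID-19 data.

```python
import math
from collections import Counter

# Benford's expected probability for leading digit d: log10(1 + 1/d)
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(n):
    """Return the first significant digit of a non-zero integer."""
    return int(str(abs(int(n)))[0])


def first_digit_counts(values):
    """Count leading digits 1..9, skipping zeros (days with nothing reported)."""
    counts = Counter(leading_digit(v) for v in values if int(v) != 0)
    return [counts.get(d, 0) for d in range(1, 10)]


def chi_square_stat(observed):
    """Pearson chi-square statistic against Benford (8 degrees of freedom)."""
    n = sum(observed)
    return sum(
        (o - n * BENFORD[d]) ** 2 / (n * BENFORD[d])
        for d, o in zip(range(1, 10), observed)
    )


def mean_absolute_deviation(observed):
    """Mean absolute gap between observed and expected digit proportions."""
    n = sum(observed)
    return sum(
        abs(o / n - BENFORD[d]) for d, o in zip(range(1, 10), observed)
    ) / 9


# Toy input (assumption, not real data): powers of two follow Benford closely.
daily_counts = [2 ** k for k in range(1, 41)]
observed = first_digit_counts(daily_counts)

print("digit counts:", observed)
print("chi-square:", round(chi_square_stat(observed), 3))
print("MAD:", round(mean_absolute_deviation(observed), 4))
```

With 8 degrees of freedom, a chi-square statistic above roughly 15.51 would reject conformance at the 5% level. Because the chi-square statistic grows with sample size, a sample-size-independent measure such as MAD is often reported alongside it.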

Results

The findings indicate that Bulgaria, Croatia, Lithuania and Romania were in line with Benford’s distribution. Regarding daily cases, Denmark, Ireland and Greece showed the greatest deviation from Benford’s distribution. Furthermore, the vaccination rate was found to be positively associated with deviation from Benford’s distribution.

Conclusions

The findings suggest that, overall, official data provided by authorities do not conform to Benford’s law, yet this approach can serve as a preliminary tool for data verification. More extensive studies should be conducted, with a more thorough investigation of the countries that showed the greatest deviation.

Article activity feed

  1. SciScore for 10.1101/2021.12.24.21268373:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentence: Apart from COVID-19 data, we included the gross domestic product per capita (GDPc), the healthcare expenditures of countries as percentage of GDP (HGDP), and the Universal Health Coverage Index (UHC) from the World Bank (https://data.worldbank.org/).
    Resource: https://data.worldbank.org/
    Suggested: Data World Bank, RRID:SCR_012767

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.