Big Data Analysis of Income Inequality and Distribution in the United States Using IRS and Census Microdata

Abstract

This study examines income inequality and distribution across U.S. states and demographic groups using Internal Revenue Service (IRS) administrative records linked to Census microdata. By integrating large-scale administrative and survey-based datasets, the analysis overcomes longstanding limitations in traditional inequality measurement, particularly top-income undercoverage and reporting bias. The methodological framework combines big data integration, geospatial analysis, and micro–macro reconciliation to keep individual-level income records consistent with aggregate national accounts. Disaggregation by state and demographic group enables a more precise assessment of spatial and population-based inequality patterns that conventional survey estimates often obscure. The findings reveal substantial discrepancies between inequality measures derived from survey data and those based on administrative tax records: surveys systematically underestimate income concentration at the upper tail of the distribution. High-income earners are significantly underrepresented in traditional datasets, which biases estimates of top income shares and inequality trends downward. Geospatial analysis further highlights pronounced heterogeneity in income concentration across states, as well as persistent disparities across demographic groups. The study concludes that administrative big data sources provide a more granular, accurate, and policy-relevant understanding of income inequality in the United States. Linked administrative and census records improve empirical precision and offer a stronger foundation for evaluating distributional outcomes, fiscal policy, and regional economic inequality.
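The direction of the bias described in the abstract can be illustrated with a toy calculation. The sketch below is not the study's method or data: the income values, the function names `gini` and `top_share`, and the choice to model undercoverage by simply dropping the single highest earner are all assumptions made for illustration. It computes a Gini coefficient and a top 1% income share for a hypothetical complete ("administrative") sample and for the same sample with the top earner missing ("survey").

```python
from typing import Sequence


def gini(incomes: Sequence[float]) -> float:
    """Gini coefficient of a non-negative income sample (0 means full equality)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need a non-empty sample with positive total income")
    # Rank-based formula: G = 2 * sum(rank_i * x_i) / (n * total) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def top_share(incomes: Sequence[float], fraction: float) -> float:
    """Share of total income held by the top `fraction` of units."""
    xs = sorted(incomes, reverse=True)
    k = max(1, round(len(xs) * fraction))  # always include at least one unit
    return sum(xs[:k]) / sum(xs)


# Hypothetical toy population (illustrative values only, not from the study).
administrative = [50_000.0] * 90 + [200_000.0] * 9 + [5_000_000.0]

# Assumed undercoverage: the "survey" fails to capture the top earner.
survey = sorted(administrative)[:-1]

gini_admin, gini_survey = gini(administrative), gini(survey)
share_admin, share_survey = top_share(administrative, 0.01), top_share(survey, 0.01)

print(f"Gini:         administrative={gini_admin:.3f}  survey={gini_survey:.3f}")
print(f"Top 1% share: administrative={share_admin:.3f}  survey={share_survey:.3f}")
```

In this toy setup both measures come out lower for the truncated sample, which is the qualitative pattern the abstract attributes to survey data. Real survey undercoverage is more complex than a single missing unit (it can include top-coding, nonresponse, and misreporting), so the example shows only the direction of the effect, not its magnitude.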
