Developing Computer Vision and Machine Learning Strategies to Unlock Government-created Records

Abstract

This work explores the development of AI and ML computer vision techniques to unlock digitized handwritten records from the 1950 US Census. The collection comprises over 6.5 million images and was made available to the public only on April 1, 2022, following a 72-year access restriction period. The 1950 Census offers a unique window "into one of the most transformative periods in modern American history, revealing a country of roughly 151 million people who had just recently emerged from the hardships and uncertainties of World War II and the Great Depression" (census.gov). This computer vision and machine learning work is part of a larger case study based in Sacramento, California, which aims to open a previously unavailable window onto the fate of the city's Japanese American community. Sacramento once housed the fourth largest Japantown community on the West Coast and saw that community forced out twice in little more than a decade: in 1942, during the WWII Japanese American Incarceration (the largest single forced relocation in US history), and in 1954, during urban renewal, when the Japantown residential and business districts were leveled. Our project uses AI-based computational treatments to help recover and memorialize the history of the erased Sacramento Japantown. We also contrast these findings with results from processing the 1940 Census (released in 2012), producing a novel "before-and-after" representation. We demonstrate a workflow for extracting demographic information using image segmentation, computer vision techniques, and deep learning for handwritten character recognition. These techniques generalize to other cities, states, and communities, and demonstrate AI-assisted strategies for unlocking vital demographic information. The approach highlights the potential benefits of computational techniques for social justice issues. The workflow acts as an AI-assisted filtering process for Census records, with a user interface for computationally driven page review. The goal is to automate the culling of pages, selecting a smaller subset that can then be further targeted or crowdsourced.
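
The page-culling step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a handwriting recognizer has already produced, for each segmented cell on a page, a predicted label and a confidence score. The names `CellPrediction`, `Page`, `page_score`, and `cull_pages`, and the default threshold of 0.5, are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CellPrediction:
    """One recognized handwritten cell (hypothetical recognizer output)."""
    row: int
    label: str         # predicted text for the cell
    confidence: float  # recognizer score, assumed to lie in [0, 1]


@dataclass
class Page:
    """One census page image and its recognized cells (hypothetical)."""
    page_id: str
    cells: List[CellPrediction]


def page_score(page: Page, target: str) -> float:
    """Return the highest confidence among cells predicted as `target`.

    A page with no matching cell scores 0.0.
    """
    scores = [c.confidence for c in page.cells if c.label == target]
    return max(scores, default=0.0)


def cull_pages(pages: List[Page], target: str,
               threshold: float = 0.5) -> List[Page]:
    """Keep pages whose score meets the threshold, highest score first.

    The threshold is an assumed tuning knob: lowering it keeps more pages
    for human review, raising it shrinks the review queue.
    """
    kept = [p for p in pages if page_score(p, target) >= threshold]
    return sorted(kept, key=lambda p: page_score(p, target), reverse=True)
```

Under these assumptions, calling `cull_pages(pages, target="X")` returns the subset of pages worth sending to reviewers or a crowdsourcing queue, with the most confident matches first. Scoring a page by its single best cell favors recall: one confident match is enough to keep the page, which suits a filtering stage where people make the final call.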
