Automating Loan Eligibility from Retail Invoices with AI-Based OCR and Classification

Abstract

In consumer lending, validating whether a customer purchase qualifies for a loan often involves manual entry of invoice and tax information through an online form. This process is time-consuming, error-prone, and difficult to scale due to the lack of standardized invoice formats and nuanced loan eligibility rules. This paper presents an AI-powered system integrating Optical Character Recognition (OCR) with transformer-based language models to automate the extraction, interpretation, and classification of invoice data. We describe a hybrid architecture combining intelligent document processing techniques with configurable business rule engines. To illustrate feasibility, we evaluated the system on a synthetic dataset of 2,500 retail invoices generated to reflect heterogeneous layouts, merchant categories, and bilingual descriptions. On this illustrative dataset, the system demonstrated an indicative line-item classification F1 score of 88 percent (precision 90, recall 87) and reduced simulated manual review requirements by about 68 percent. We discuss architectural design, evaluation metrics, related research, fairness considerations, and regulatory implications, highlighting that the results are illustrative and intended to inform future validation on real-world U.S. financial data.
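
As a reading aid, the reported line-item metrics can be checked against the standard definition of F1 as the harmonic mean of precision and recall. The sketch below is not taken from the paper. The function name `f1_score` and the variable names are illustrative, and it assumes the reported precision and recall are aggregate percentages over the same set of line items.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, expressed as fractions (assumed aggregate values).
reported_precision = 0.90
reported_recall = 0.87

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")
```

Under these assumptions the result is about 0.885, which rounds to the 88 percent stated in the abstract.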
