LLM-Based Measurement of Latent Attributes in Trade Data

Matthew DiGiuseppe
Xuelong Fu
Michael E Flynn


Abstract

Trade data are available at a high level of disaggregation, allowing scholars to examine flows of highly specific goods. Yet the sheer number of goods classifications (5,000+) makes it difficult to analyze trade flows and tariff policy at a mid-level of aggregation beyond a few existing categorizations. Here, we outline a method that can scale, not merely classify, traded goods on researcher-defined dimensions that are orthogonal to existing classification schemes. We propose that the embedded knowledge in large language models (LLMs) can be used to conduct pairwise comparisons (PWCs) of Harmonized System (HS) product descriptions by determining their relative proximity to a specific concept. A Bayesian Bradley-Terry model then uses these PWCs to place individual items on a latent scale of interest. These estimates and their associated uncertainty can then be used for downstream descriptive or causal analysis.
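To make the scaling step concrete, here is a minimal sketch of how pairwise comparisons can be turned into latent scores with a Bradley-Terry likelihood. It is an illustration under stated assumptions, not the authors' implementation: the function name `fit_bradley_terry_map`, the Normal(0, prior_sd^2) prior, the gradient-ascent settings, and the three product labels with their hand-written comparison outcomes are all hypothetical stand-ins for LLM judgments. It also returns only a point estimate (the posterior mode), whereas a full Bayesian fit of the kind the abstract describes would sample the posterior to obtain uncertainty.

```python
import math


def fit_bradley_terry_map(n_items, comparisons, prior_sd=1.0, lr=0.1, n_steps=2000):
    """Posterior-mode (MAP) scores for a Bradley-Terry model.

    comparisons is a list of (winner, loser) index pairs. Each item gets an
    independent Normal(0, prior_sd^2) prior (an assumption of this sketch).
    """
    theta = [0.0] * n_items
    for _ in range(n_steps):
        # Gradient of the log prior pulls every score toward zero.
        grad = [-t / prior_sd ** 2 for t in theta]
        for winner, loser in comparisons:
            # Bradley-Terry: P(winner beats loser) = sigmoid(theta_w - theta_l).
            p_win = 1.0 / (1.0 + math.exp(-(theta[winner] - theta[loser])))
            grad[winner] += 1.0 - p_win
            grad[loser] -= 1.0 - p_win
        # Plain gradient ascent on the (concave) log posterior.
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta


# Hypothetical items and outcomes: each pair records which description was
# judged closer to the concept of interest in one comparison.
items = ["product A", "product B", "product C"]
comparisons = [(0, 1), (0, 1), (1, 0), (0, 2), (0, 2), (1, 2), (1, 2)]

scores = fit_bradley_terry_map(len(items), comparisons)
ranking = sorted(zip(items, scores), key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The prior keeps scores finite even when one item wins or loses every comparison, which is one practical reason to prefer a Bayesian formulation over an unregularized maximum-likelihood fit.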

Version published to 10.31235/osf.io/t8wdg_v1 on OSF Preprints
Mar 27, 2026

Related articles

Improving Small-Area Estimates of Public Opinion by Calibrating to Known Population Quantities

This article has 2 authors:
1. William Marble
2. Joshua D. Clinton
This article has no evaluations. Latest version: Mar 18, 2026

Classifying strategy use in multi-attribute subjective choice: Application to conjoint experiments in political science

This article has 2 authors:
1. Nidhi V Banavar
2. Kirk Bansak
This article has no evaluations. Latest version: Mar 21, 2026

From Text to Sectors: Classifying 140 Years of Swiss Firm Registrations

This article has 5 authors:
1. Danyl Denysenko
2. Filippo Pasquali
3. Jesper Findahl
4. Andrea Mocci
5. Gianmarco Torchetti
This article has no evaluations. Latest version: Apr 17, 2026
