TeLL-Me What You Can’t See: A Vision-Language Framework for Forensic Mugshot Augmentation

Abstract

During criminal investigations, the availability of images depicting persons of interest directly influences the success of identification procedures. However, law enforcement agencies often face challenges related to the scarcity or obsolescence of high-quality images, which can affect the accuracy of people-search processes. This paper introduces a novel forensic mugshot augmentation framework aimed at addressing these limitations. To assist law enforcement in identification procedures, our approach enhances visual evidence by creating synthetic, high-quality pictures through customizable data augmentation techniques. These techniques combine generative AI models and are structured to preserve biometric identity and visual coherence with respect to the original data. Experimental results demonstrate that our method consistently enriches multimedia data quality for forensic identification and provides robust enhancements across multiple investigative scenarios. This effectiveness has been validated by means of both vision-based and evidence-based metrics, supporting its potential as a tool for law enforcement applications. Attribute extraction reached 84.1% (+2.3 percentage points over the original mugshots), and re-identification indicated strong identity preservation: similarity of ~0.89 for same-subject pairs vs. ~0.19 for different-subject pairs. These results suggest that the framework reliably extracts and leverages the required target characteristics, with no notable hallucinations observed.
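The re-identification scores reported above are typically computed as a similarity between face-embedding vectors of two images, with same-subject pairs expected to score high and different-subject pairs low. The abstract does not specify the embedding model or metric, so the sketch below assumes cosine similarity and a hypothetical decision threshold; the function names and the threshold value are illustrative, not from the paper.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def same_identity(emb_original, emb_augmented, threshold=0.5):
    """Decide whether two embeddings depict the same subject.

    `threshold` is a hypothetical operating point chosen between the
    reported same-subject (~0.89) and different-subject (~0.19) scores;
    the paper does not state the threshold it uses.
    """
    return cosine_similarity(emb_original, emb_augmented) >= threshold
```

In this setup, an augmented mugshot that preserves biometric identity would yield an embedding whose similarity to the original subject's embedding stays near the same-subject regime (~0.89), well above the different-subject regime (~0.19).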
