Detecting Fraud-Associated Characteristics in the Medical AI Literature: A Multi-Signal NLP Framework Reveals Distinct Paper Mill Subtypes

Hayden Farquhar

Abstract

Paper mills increasingly compromise the integrity of the medical artificial intelligence (AI) literature. We developed a pre-registered, multi-signal natural language processing pipeline combining seven feature categories -- tortured phrases, structural formulaicity, AI-generated text markers, citation anomalies, cross-document similarity, co-authorship networks, and geographic metadata -- and applied it to 2,478 medical AI papers (2018-2025) labelled using Retraction Watch data via the Crossref Labs API. A Random Forest-XGBoost ensemble classifier achieved average precision 0.858 and AUC-ROC 0.916 on 5-fold cross-validation, but the most important features were writing quality indicators (vocabulary diversity, reference count) rather than fraud-specific signals, reflecting the dominance of the 2021-2022 Hindawi mass retractions. Retraction subtype analysis revealed distinct fingerprints across fraud types: AI-generated content papers had twice the boilerplate density of other subtypes, while fake peer review papers had the highest co-authorship network density. Unsupervised clustering identified an "author pool" cluster (n=133, 47% retracted) with extreme co-author reuse (0.75 vs 0.11 corpus mean). Among unlabelled papers, 9.1% fell in high or very high risk tiers. Prevalence was broadly uniform across WHO regions (15-20%) and robust to corpus definition. Four pre-registered sensitivity analyses confirmed robustness. The heterogeneity of paper mill operations -- synonym-substitution mills, AI content generators, and peer-review manipulation rings -- demands subtype-aware detection strategies. Code: https://doi.org/10.5281/zenodo.19488868. Pre-registration: https://doi.org/10.17605/OSF.IO/JB4T6.
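The two headline metrics in the abstract, average precision and AUC-ROC, can be illustrated with a minimal standard-library sketch. This is not the authors' pipeline (their code is at the Zenodo DOI above). The function names `average_precision` and `auc_roc`, and the toy labels and scores, are hypothetical and chosen only for illustration. The sketch assumes binary labels (1 = retracted, 0 = not retracted) and no tied scores in the average-precision calculation.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values measured at the rank of each positive.

    Assumes scores are untied; tied scores would need threshold grouping.
    """
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank papers from most to least suspicious.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = retracted, 0 = not retracted.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

if __name__ == "__main__":
    print("average precision:", average_precision(toy_labels, toy_scores))
    print("AUC-ROC:", auc_roc(toy_labels, toy_scores))
```

Average precision rewards placing retracted papers near the top of the ranking, which makes it sensitive to class imbalance. AUC-ROC measures pairwise ranking quality regardless of prevalence, which is one reason to report both.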

Version published to 10.31222/osf.io/3bvzc_v1 on OSF Preprints
Apr 10, 2026
