The Cognitive Fingerprint Problem: Transformer-LSTM Perplexity Geometry for Fair and Adversarially Robust AI Text Detection

Abstract

As AI text detection systems become increasingly embedded in academic integrity enforcement, recent empirical audits have exposed a severe and systematic bias against English as a Second Language (ESL) writers. Standard detectors, which rely predominantly on sub-word perplexity measured against models trained on native English, frequently conflate the natural linguistic simplicity of human language learners with the statistical predictability of Large Language Models (LLMs). Furthermore, naive retraining strategies that incorporate ESL data to mitigate this bias introduce a critical new adversarial vulnerability: LLMs can be explicitly prompted to simulate ESL writing patterns---including grammatical errors and constrained vocabulary---allowing them to evade even ESL-aware detectors. The result is a scenario in which innocent ESL students are falsely accused of academic misconduct while a deliberately adversarial AI goes undetected. To resolve this dual challenge, we propose a novel feature-extraction architecture termed Geometric Fairness. Rather than measuring how "fluent" a text is on a single scale, this system maps every essay into a two-dimensional feature space defined by two orthogonal perplexity signals: (1) a Native-Expert, a pre-trained DistilGPT-2 Transformer computing sub-word perplexity against standard American English; and (2) an ESL-Expert, a custom Character-Level Long Short-Term Memory (LSTM) network trained exclusively on the ELLIPSE learner corpus, computing next-character perplexity. By contrasting sub-word probability under standard English against the sequence-level character variance of authentic learner interlanguage, our architecture geometrically separates the cognitive stochasticity of human error from the probabilistic smoothing of AI generation.
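The dual-stream mapping described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the per-token negative log-likelihoods are synthetic stand-ins for what the DistilGPT-2 and character-LSTM experts would emit, and all names are hypothetical. It shows only the core transformation, perplexity as the exponential of the mean negative log-likelihood, applied once per expert to yield a 2D point per essay.

```python
import math

def perplexity(neg_log_likelihoods):
    # Perplexity = exp(mean NLL); lower values mean the expert model
    # found the text more predictable.
    return math.exp(sum(neg_log_likelihoods) / len(neg_log_likelihoods))

# Synthetic per-unit NLLs standing in for the two experts' outputs
# on one essay (values are illustrative, not from the paper):
native_nlls = [4.1, 3.8, 5.0, 4.4]  # sub-word NLLs from the Native-Expert
esl_nlls    = [1.9, 2.2, 2.0, 2.1]  # next-character NLLs from the ESL-Expert

# Each essay becomes a point (native_ppl, esl_ppl) in the 2D feature
# space; a downstream classifier separates the two populations there.
features = (perplexity(native_nlls), perplexity(esl_nlls))
```

In this geometry, authentic learner writing tends to look improbable to the native expert but predictable to the ESL expert, while AI text mimicking ESL style is smooth under both, which is what makes the two signals separable.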
On the most adversarial classification task considered---distinguishing authentic ESL student writing from AI-generated text specifically prompted to mimic ESL style---our dual-stream logistic regression classifier achieved an accuracy of 93.3%, outperforming a single-dimensional perplexity baseline of 75.0% by 18.3 percentage points. The dual-stream method achieved a Receiver Operating Characteristic Area Under the Curve (ROC-AUC) of 0.96, compared to 0.73 for the 1D ESL-only baseline. Critically, the false-positive rate---the proportion of genuine human ESL students who would be wrongly accused---was reduced from 29.0% to 9.7%, a relative reduction of 66.7%.
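As a check on the fairness arithmetic above, the sketch below recomputes the false-positive-rate reduction. The confusion counts are hypothetical, chosen only because they reproduce the reported rates (29.0% and 9.7%) and the stated 66.7% relative reduction; the paper's actual evaluation counts are not given here.

```python
def false_positive_rate(fp, tn):
    # FPR = FP / (FP + TN): the fraction of genuine human ESL essays
    # that the detector wrongly flags as AI-generated.
    return fp / (fp + tn)

# Hypothetical counts over 31 human ESL essays, consistent with the
# reported rates:
baseline_fpr = false_positive_rate(fp=9, tn=22)  # ~0.290 (1D baseline)
dual_fpr = false_positive_rate(fp=3, tn=28)      # ~0.097 (dual-stream)

# Relative reduction in wrongful accusations: (0.290 - 0.097) / 0.290
relative_reduction = (baseline_fpr - dual_fpr) / baseline_fpr  # ~2/3
```

With these counts the relative reduction is exactly 6/9 = 66.7%, matching the abstract's claim that roughly two-thirds of wrongful accusations are eliminated.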