Variable Naming Impact on AI Code Completion: An Empirical Study
Abstract
As AI code completion tools become central to software development, a fundamental question emerges: do the variable naming conventions that aid human comprehension also improve AI model performance? We investigate this question with a controlled experimental design: 500 Python code examples generated by mistralai/Magistral-Small-2506 (24B parameters, quantized to 8-bit) are transformed into 7 naming schemes (descriptive, minimal, obfuscated, original, PascalCase, SCREAMING_SNAKE_CASE, snake_case) and tested across 8 models (0.5B-8B parameters) spanning two architectures (Llama and Qwen). The same 24B model performs the renaming transformations and serves as a semantic judge for evaluating completion outputs. Our evaluation combines exact token matching, Levenshtein similarity, and semantic similarity; the strong correlation between syntactic and semantic metrics (r=0.945) validates this approach. Despite requiring more tokens, descriptive variable names consistently achieved the highest semantic similarity (0.874), while obfuscated names performed worst (0.802), consistent with findings from human cognition research. Performance also scales clearly with model size: semantic similarity for the Llama models ranges from 0.743 (1B) to 0.898 (7B). These findings provide initial quantitative evidence that code style matters for AI code completion, with practical implications for developers using AI coding assistants.
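To make the naming schemes concrete, the sketch below shows one way a single identifier could be rendered under each variant. This is illustrative only: in the study itself, renaming (including the descriptive, minimal, and obfuscated variants) is performed by the Magistral model rather than by string rules, and the helper names here (`rename`, `to_words`) are hypothetical.

```python
import hashlib

def to_words(identifier: str) -> list[str]:
    """Split a snake_case identifier into its word parts."""
    return identifier.lower().split("_")

def rename(identifier: str, scheme: str) -> str:
    """Render an identifier under one of the study's naming schemes.

    Rule-based approximation for illustration; the paper delegates
    all renaming to the LLM itself.
    """
    words = to_words(identifier)
    if scheme == "snake_case":
        return "_".join(words)
    if scheme == "SCREAMING_SNAKE_CASE":
        return "_".join(w.upper() for w in words)
    if scheme == "PascalCase":
        return "".join(w.capitalize() for w in words)
    if scheme == "minimal":
        return words[0][0]  # e.g. 'user_count' -> 'u'
    if scheme == "obfuscated":
        # deterministic but meaningless name, e.g. 'var_3f2a'
        return "var_" + hashlib.md5(identifier.encode()).hexdigest()[:4]
    return identifier  # 'original' and 'descriptive' left to the model

print(rename("user_count", "PascalCase"))            # UserCount
print(rename("user_count", "SCREAMING_SNAKE_CASE"))  # USER_COUNT
```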
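The two syntactic metrics are standard and easy to sketch. The abstract does not specify the paper's exact normalization, so the formulations below (character-level edit distance normalized by the longer string, and a position-wise token match rate) are assumptions; the semantic metric comes from the LLM judge and is not reproduced here.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def levenshtein_similarity(a: str, b: str) -> float:
    """Map edit distance to a [0, 1] similarity score."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))

def exact_token_match(prediction: str, reference: str) -> float:
    """Fraction of reference tokens matched position-by-position."""
    pred, ref = prediction.split(), reference.split()
    hits = sum(p == r for p, r in zip(pred, ref))
    return hits / max(len(ref), 1)

print(levenshtein_similarity("total_count += 1", "total_count += n"))  # 0.9375
print(exact_token_match("total_count += 1", "total_count += n"))       # 0.666...
```

The reported r=0.945 correlation between these syntactic scores and the judge's semantic scores is what licenses using the cheaper string metrics as a sanity check on the LLM-based evaluation.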