Automating Software Size Measurement from Code Using Language Models

Samet Tenekeci
Hüseyin Ünlü
Bedir Arda Gül
Damla Keleş
Murat Küçük
Onur Demirörs

Abstract

Software size is a key input for project planning, effort estimation, and productivity analysis. While pre-trained language models have shown promise in deriving functional size from natural-language requirements, predicting size directly from source code remains under-explored. Yet, code-based size measurement is critical in modern workflows where requirement documents are often incomplete or unavailable, especially in Agile development environments. This study investigates the use of CodeBERT, a pre-trained bimodal transformer model, for predicting software size from code according to two measurement methods: COSMIC Function Points and MicroM. We construct two curated datasets from the Python subset of the CodeSearchNet corpus, and manually annotate each function with its corresponding size. Our experimental results show that CodeBERT can successfully predict COSMIC data movements with up to 91.4% accuracy and generalize to the functional, architectural, and algorithmic event types defined in MicroM, reaching up to 81.5% accuracy. These findings highlight the potential of code-based language models for automated functional size measurement when requirement artifacts are absent or unreliable.
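To make the prediction target concrete, the following minimal sketch shows how a COSMIC size follows from data movements: each Entry, Exit, Read, or Write counts as one COSMIC Function Point (CFP). It is not the authors' pipeline and involves no language model. The function names `cosmic_size` and `exact_match_accuracy`, the sample labels, and the exact-match accuracy definition are assumptions made for illustration only.

```python
from collections import Counter

# The four COSMIC data movement types; each movement contributes 1 CFP.
MOVEMENT_TYPES = ("Entry", "Exit", "Read", "Write")


def cosmic_size(movements):
    """Return (total CFP, per-type counts) for a list of data movement labels.

    `movements` is a hypothetical list of labels for one function, e.g. as
    produced by manual annotation or by a model's predictions.
    """
    unknown = set(movements) - set(MOVEMENT_TYPES)
    if unknown:
        raise ValueError(f"unknown data movement types: {sorted(unknown)}")
    counts = Counter(movements)
    per_type = {t: counts.get(t, 0) for t in MOVEMENT_TYPES}
    return sum(per_type.values()), per_type


def exact_match_accuracy(predicted_sizes, annotated_sizes):
    """Fraction of functions whose predicted size equals the annotated size.

    This is one plausible reading of "accuracy"; the study may define it
    differently.
    """
    if len(predicted_sizes) != len(annotated_sizes):
        raise ValueError("both lists must have the same length")
    if not annotated_sizes:
        return 0.0
    hits = sum(p == a for p, a in zip(predicted_sizes, annotated_sizes))
    return hits / len(annotated_sizes)


# Hypothetical function that receives input, reads two records,
# persists one result, and returns a response.
total, per_type = cosmic_size(["Entry", "Read", "Read", "Write", "Exit"])
print(total, per_type)
```

A learned model would replace the hand-written label list with predicted counts per movement type; the size arithmetic itself stays the same.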

Version published to 10.21203/rs.3.rs-7153894/v1 on Research Square
Jul 25, 2025

Related articles

Refactoring in Software Maintenance and Development: Application with Case Study

This article has 3 authors:
1. Rahmon Ariyo Badru
2. Akorede Mojeed Shittu
3. Idowu Olugbenga Adewumi
This article has no evaluations. Latest version: Sep 9, 2025

Actionable Insights from Developer Behavior: A Practical Approach to Software Defect Prediction

This article has 2 authors:
1. Carlos Andres Ramirez Catano
2. Makoto Itoh
This article has no evaluations. Latest version: Sep 8, 2025

Overview of Bad Code Smells in Software Development and Researches

This article has 3 authors:
1. Rahmon Ariyo Badru
2. Abdurrazaq Olusola Ogunlade
3. Idowu Olugbenga Adewumi
This article has no evaluations. Latest version: Aug 19, 2025
