Automating Best-Practice Refactoring in Java via Multi-Agent Planning and Verification

Jian Yang
Jing Li
Yuanyuan Gao
Jiao Jiao

Abstract

Best-practice rules capture widely accepted coding conventions that improve software maintainability, safety, and extensibility. Modern static analysis tools, such as PMD and SonarQube, can accurately detect best-practice violations, yet repairing these issues remains largely manual, time-consuming, and error-prone. Although large language models show promise in code generation, directly applying them to refactoring often yields unsafe or incomplete transformations, especially for semantically subtle best-practice rules. In this paper, we propose BestRefactor, the first automated approach that explicitly targets Java best-practice violations at scale. BestRefactor adopts a multi-agent, recipe-guided framework that decomposes refactoring into three coordinated stages: (i) a Planning Agent that determines applicability and safety, (ii) a Refactoring Agent that applies rule-specific transformation recipes, and (iii) a Verification Agent that validates correctness and rule compliance. This design enables reliable, behavior-preserving refactoring beyond naïve single-step LLM rewriting. We implement BestRefactor as a practical tool integrated with PMD and evaluate it on 10 real-world Java libraries spanning five application domains, comprising 844 detected best-practice violations. Experimental results show that BestRefactor successfully produces verified repairs for 72.6% of all detected violations end to end, outperforming a direct LLM baseline by over 11 percentage points in verified repair rate. An ablation study confirms that planning, recipe guidance, and verification each play a critical role in achieving reliable refactoring. Finally, our performance analysis demonstrates that BestRefactor is practical in terms of runtime and cost.
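
The three-stage flow described in the abstract (plan, refactor, verify) can be pictured with a minimal sketch. This is not the BestRefactor implementation: the abstract describes LLM-based agents integrated with PMD, whereas the sketch below substitutes a single regular expression for all three stages, purely to show the control flow. The recipe shown (rewriting `x.size() == 0` to `x.isEmpty()`, in the spirit of PMD's UseCollectionIsEmpty rule) and every name in the code (`plan`, `refactor`, `verify`, `run_pipeline`, `Result`) are assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass

# Toy stand-in for one rule-specific recipe: rewrite `x.size() == 0`
# to `x.isEmpty()`. The pattern and all names here are hypothetical.
SIZE_EQ_ZERO = re.compile(r"(\w+)\.size\(\)\s*==\s*0")


@dataclass
class Result:
    code: str       # refactored code, or the original if nothing was applied
    verified: bool  # True only when the verification stage accepted the change
    reason: str


def plan(code: str) -> bool:
    """Planning stage (toy): the recipe applies only if the pattern occurs."""
    return SIZE_EQ_ZERO.search(code) is not None


def refactor(code: str) -> str:
    """Refactoring stage (toy): apply the recipe as a textual rewrite."""
    return SIZE_EQ_ZERO.sub(r"\1.isEmpty()", code)


def verify(before: str, after: str) -> bool:
    """Verification stage (toy): the code changed and the pattern is gone.

    A real verifier would need much stronger checks; this only mirrors
    the "rule compliance" part in the simplest possible way.
    """
    return after != before and SIZE_EQ_ZERO.search(after) is None


def run_pipeline(code: str) -> Result:
    """Run plan -> refactor -> verify, keeping the original code on failure."""
    if not plan(code):
        return Result(code, False, "skipped: recipe not applicable")
    candidate = refactor(code)
    if not verify(code, candidate):
        return Result(code, False, "rejected: verification failed")
    return Result(candidate, True, "verified")


if __name__ == "__main__":
    print(run_pipeline("if (items.size() == 0) { return; }"))
```

The point of the staged design is that a candidate rewrite is only accepted after an explicit check, and the original code is returned unchanged whenever planning or verification fails.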

Version published to 10.21203/rs.3.rs-9145617/v1 on Research Square
Apr 13, 2026

Related articles

SAMF: SAWANT (Structured Agentic Workflow for Alignment, Validation, and Negotiated Testing) for Reliable, Safe, and Verifiable LLM Prompting

This article has 1 author:
1. Prashant Sawant
This article has no evaluations
Latest version Apr 15, 2026

ReATest: enhancing policy-as-code workflows through automated test case generation from Rego policies

This article has 2 authors:
1. Thanh-Binh Trinh
2. Ngoc-Minh Le
This article has no evaluations
Latest version Apr 10, 2026

Deterministic Compliance Failures in Large Language Models

This article has 1 author:
1. Toyoji Kanagawa
This article has no evaluations
Latest version Mar 30, 2026
