Enhancing search-based testing with LLMs for finding bugs in system simulators
Abstract
Despite the wide availability of automated testing techniques such as fuzzing, little attention has been devoted to testing computer architecture simulators. We propose a fully automated approach for this task. Our approach uses large language models (LLMs) to generate input programs, including information about their parameters and types, as test cases for the simulators. The LLM's output becomes the initial seed for an existing fuzzer, which we have enhanced with three mutation operators targeting both the input binary program and its parameters. We implement our approach in a tool and use it to test the system simulator. The tool discovered 21 new bugs in the simulator: 14 where the simulator's software prediction differs from the behaviour observed on real hardware, and 7 where the simulator crashed. New defects were uncovered with each of the 6 LLMs used.
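The abstract does not give implementation details, but the overall flow can be pictured as a small harness: an LLM-generated program, together with a machine-readable description of its parameters and their types, serves as the initial seed, and mutation operators then perturb either the parameters or the compiled input binary before each simulator run. The sketch below is a minimal illustration under assumed conventions; the file names, JSON layout, and the two example mutation operators are not taken from the paper.

```python
"""Illustrative sketch of the seed-and-mutate pipeline described in the abstract.

All names (files, JSON layout, mutation operators) are assumptions for
illustration only; they do not reflect the paper's actual design.
"""
import json
import random
from pathlib import Path


def load_llm_seed(seed_dir: Path):
    """Load a hypothetical LLM-generated program and its parameter/type spec."""
    program = (seed_dir / "program.c").read_text()
    params = json.loads((seed_dir / "params.json").read_text())
    # Example params: {"array_size": {"type": "int", "value": 64}}
    return program, params


def mutate_parameters(params: dict) -> dict:
    """One possible parameter-level operator: perturb a typed parameter value."""
    mutated = {name: dict(entry) for name, entry in params.items()}
    name = random.choice(list(mutated))
    entry = mutated[name]
    if entry.get("type") == "int":
        entry["value"] += random.choice([-1, 1, random.randint(2, 1024)])
    return mutated


def mutate_binary(binary: bytes) -> bytes:
    """One possible binary-level operator: flip a single bit in the input binary."""
    data = bytearray(binary)
    i = random.randrange(len(data))
    data[i] ^= 1 << random.randrange(8)
    return bytes(data)
```

In the approach as summarised, the fuzzer drives such mutations and flags a bug whenever the simulator crashes or its prediction diverges from the behaviour observed on real hardware; the sketch only illustrates how the LLM-provided parameter and type information could enable parameter-level mutations alongside byte-level ones.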