Advanced Feature Engineering and Machine Learning Techniques for High Accurate Price Prediction of Heterogeneous Pre-own Cars

Imran Fayyaz
G. G. Md. Nawaz Ali
Samantha Syeda Khairunnesa

Abstract

The rapid growth of the automobile industry has intensified the demand for accurate price prediction models in the used car market. Buyers often struggle to determine fair market value because price depends on many interacting factors, such as mileage, brand, model, transmission type, accident history, and overall condition. This study presents a comparative analysis of machine learning models for used car price prediction, with a strong emphasis on the impact of feature engineering. We begin by evaluating multiple models on raw, unprocessed data: Linear Regression, Decision Trees, Random Forest, Support Vector Regression (SVR), XGBoost, a Stacking Regressor, and Keras-based neural networks. We then apply a comprehensive feature engineering pipeline that includes categorical encoding, outlier removal, data standardization, and extraction of hidden features (e.g., vehicle age, horsepower). Results demonstrate that advanced preprocessing significantly improves predictive performance across all models. For instance, the Stacking Regressor's R² score increased from 0.14 to 0.8899 after feature engineering. Ensemble methods such as CatBoost and XGBoost also showed strong gains. This research not only benchmarks models for this task but also serves as a practical tutorial for fellow researchers, illustrating how engineered features enhance performance in structured ML pipelines. The proposed workflow offers a reproducible template for building high-accuracy pricing tools in the automotive domain, fostering transparency and informed decision-making.
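The four preprocessing steps named in the abstract (hidden-feature extraction, outlier removal, standardization, categorical encoding) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: the field names, the engine-string format, the reference year, and the choice of an IQR rule for outliers are all assumptions made for illustration, and the listings are invented toy data.

```python
import re
import statistics

# Hypothetical raw listings. Field names, the engine-string format, and all
# values are assumptions for illustration, not the dataset used in the study.
RAW = [
    {"model_year": 2015, "engine": "300.0HP 3.7L V6", "transmission": "Automatic", "mileage": 60000, "price": 18000},
    {"model_year": 2018, "engine": "180.0HP 2.0L I4", "transmission": "Manual", "mileage": 40000, "price": 16000},
    {"model_year": 2012, "engine": "250.0HP 3.5L V6", "transmission": "Automatic", "mileage": 110000, "price": 9000},
    {"model_year": 2020, "engine": "200.0HP 2.5L I4", "transmission": "Automatic", "mileage": 25000, "price": 24000},
    {"model_year": 2010, "engine": "150.0HP 1.8L I4", "transmission": "Manual", "mileage": 150000, "price": 5000},
    {"model_year": 2019, "engine": "650.0HP 6.2L V8", "transmission": "Automatic", "mileage": 12000, "price": 950000},
    {"model_year": 2016, "engine": "3.0L V6 Turbo", "transmission": "Automatic", "mileage": 70000, "price": 20000},
    {"model_year": 2017, "engine": "190 hp 2.4L I4", "transmission": "Manual", "mileage": 55000, "price": 15000},
]

REFERENCE_YEAR = 2025  # assumed reference year for computing vehicle age


def extract_horsepower(engine_text):
    """Return the horsepower figure found in free text (e.g. '300.0HP'), else None."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*HP", engine_text, flags=re.IGNORECASE)
    return float(match.group(1)) if match else None


def add_hidden_features(row):
    """Derive vehicle age and horsepower from columns that already exist."""
    out = dict(row)
    out["age"] = REFERENCE_YEAR - row["model_year"]
    out["horsepower"] = extract_horsepower(row["engine"])
    return out


def remove_outliers_iqr(rows, key, k=1.5):
    """Keep rows whose `key` lies within [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    q1, _, q3 = statistics.quantiles([r[key] for r in rows], n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r for r in rows if low <= r[key] <= high]


def standardize(rows, key):
    """Rescale `key` to zero mean and unit (population) standard deviation."""
    values = [r[key] for r in rows]
    mean, std = statistics.fmean(values), statistics.pstdev(values)
    return [{**r, key: (r[key] - mean) / std} for r in rows]


def one_hot(rows, key):
    """Replace a categorical column with one 0/1 column per observed category."""
    categories = sorted({r[key] for r in rows})
    encoded = []
    for r in rows:
        new = {k: v for k, v in r.items() if k != key}
        for category in categories:
            new[f"{key}_{category}"] = 1 if r[key] == category else 0
        encoded.append(new)
    return encoded


def engineer(rows):
    """Run the four steps in sequence; price is the target and stays unscaled."""
    rows = [add_hidden_features(r) for r in rows]
    rows = [r for r in rows if r["horsepower"] is not None]  # drop unparseable engines
    rows = remove_outliers_iqr(rows, "price")
    for column in ("age", "horsepower", "mileage"):
        rows = standardize(rows, column)
    rows = one_hot(rows, "transmission")
    for r in rows:  # raw source columns are no longer needed
        del r["engine"], r["model_year"]
    return rows


FEATURES = engineer(RAW)
```

In practice a library such as pandas or scikit-learn would usually handle these steps. In particular, the scaling statistics and outlier bounds should be fitted on the training split only, so that information from held-out data does not leak into the features.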

Version published to 10.20944/preprints202507.1150.v1
Jul 15, 2025

