A Comparative Study of Explainable Machine Learning and Multivariate Regression for Predicting Fuel Consumption and CO2 Emissions in Multi-Brand Passenger Vehicles

Jingwei Sun
Xiaoqin Mo
Shifeng Wang

Abstract

This paper develops and benchmarks a unified, explainable modeling framework for predicting passenger-vehicle fuel consumption and CO$_2$ emissions, and for interpreting the drivers of these outcomes. Using a multi-brand dataset spanning model years 2019–2023, we formulate the prediction tasks as supervised learning problems and compare two regression baselines (multiple linear regression and ridge regression) with two nonlinear ensemble learners (random forest and gradient boosting). Across both targets, ensemble models consistently deliver higher out-of-sample accuracy than linear methods, indicating the presence of nonlinearity and feature interactions that are not well captured by purely additive specifications. To ensure interpretability alongside performance, we employ SHAP-based attribution to decompose predictions into feature-level contributions and to provide both global and instance-wise explanations. The explanation results robustly rank engine size, vehicle class, and fuel type among the most influential predictors, and the inferred effects are directionally consistent with engineering intuition. Overall, the study demonstrates that explainable machine learning can simultaneously improve predictive fidelity and provide transparent, decision-relevant insights, supporting applications in vehicle design optimization and evidence-based environmental policy.
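The SHAP-based attribution mentioned in the abstract rests on Shapley values, which split the gap between one prediction and a reference prediction across the input features. The sketch below is only an illustration of that idea, not the authors' pipeline: `toy_model`, its coefficients, the feature names, and the single reference vehicle are all hypothetical, and the brute-force enumeration stands in for the efficient estimators that SHAP tooling uses on real random forest or gradient boosting models.

```python
from itertools import combinations
from math import factorial


def toy_model(features):
    """Hypothetical fuel-consumption model (L/100 km) with one interaction term.

    The coefficients are invented for illustration and are not from the paper.
    """
    engine = features["engine_size"]
    suv = features["is_suv"]
    diesel = features["is_diesel"]
    return 4.0 + 1.5 * engine + 1.0 * suv - 0.8 * diesel + 0.5 * engine * suv


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `instance` relative to a single `baseline` row.

    Features outside a coalition take their baseline value. The cost grows
    exponentially with the number of features, so this suits toy cases only.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        # Use the instance's value for coalition members, the baseline otherwise.
        mixed = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return model(mixed)

    contributions = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {name}) - value(coalition))
        contributions[name] = total
    return contributions


# Hypothetical reference vehicle and the vehicle being explained.
baseline = {"engine_size": 2.0, "is_suv": 0, "is_diesel": 0}
instance = {"engine_size": 3.5, "is_suv": 1, "is_diesel": 0}

phi = shapley_values(toy_model, instance, baseline)
gap = toy_model(instance) - toy_model(baseline)

for feature, contribution in phi.items():
    print(f"{feature}: {contribution:+.3f}")
print(f"sum of contributions: {sum(phi.values()):.3f}, prediction gap: {gap:.3f}")
```

In this toy case the contributions add up to the gap between the two predictions, and the unchanged `is_diesel` feature receives zero. The interaction term is shared between `engine_size` and `is_suv`, which is the kind of effect a purely additive linear model cannot represent.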

Version published to 10.21203/rs.3.rs-8762253/v1 on Research Square
Feb 16, 2026

Related articles

Beyond Linear Models: Evaluating Tree-Based, Instance-Based, and Deep Learning Methods for Carbon Market Forecasting

This article has 1 author:
1. Ozan Nadirgil
This article has no evaluations
Latest version Mar 28, 2026

Dynamic Ensemble Learning with Explainability for Photovoltaic Power Prediction

This article has 6 authors:
1. Fethi Achouri
2. Fouzi Harrou
3. Mehdi Damou
4. Benamar Bouyeddou
5. Abdelhakim Dorbane
6. Ying Sun
This article has no evaluations
Latest version Feb 19, 2026

Explainable multi-output ensemble learning for early-stage prediction of building heating and cooling loads

This article has 1 author:
1. Saleem Ahmed Al-Azazi
This article has no evaluations
Latest version Mar 12, 2026
