Supervised Learning for Predicting Unknown Modifying Variables in Pliable Lasso: Applications to High-Dimensional Datasets

Zainab Subhi Mahmood Hawrami
Mehmet Ali Cengiz
Emre Dünder

Abstract

Accurate outcome prediction often requires modeling complex interactions between input features and context-specific modifiers. The pliable lasso is a flexible regression framework that integrates such modifiers into the prediction process. In many real-world applications, however, these modifiers are unobserved at test time and must be estimated. This study investigates the performance of eight supervised machine learning algorithms for estimating the modifier matrix Z in a pliable lasso model under a known-to-unknown scenario, in which Z is observed during training but not at prediction time. The analysis considers both classification accuracy for modifier estimation and regression accuracy for the final response prediction, using simulated data and two real-world datasets: the Superconductivity dataset and the Mice Protein Expression dataset. Results indicate that tree-based models (e.g., XGBoost, Random Forest, Decision Tree) deliver superior modifier classification (AUC > 0.99), while regularized models such as Lasso and Elastic Net achieve the best regression performance. The findings support a hybrid modeling approach in which tree-based classifiers estimate the modifying variables, followed by regularized regression for accurate and interpretable predictions. This strategy holds promise for data-driven modeling in high-dimensional engineering systems where only partial contextual information is available.
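
The two-stage idea in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the synthetic data, the single binary modifier z, and the helper with_interactions are all hypothetical, and a plain scikit-learn Lasso fitted on the stacked features [X, z, X * z] stands in for the pliable lasso, which additionally imposes a hierarchy constraint on the interaction terms.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Lasso
from sklearn.metrics import accuracy_score, r2_score

rng = np.random.default_rng(0)

# Synthetic data: an assumption for illustration, not the paper's simulation design.
n, p = 600, 5
X = rng.normal(size=(n, p))
z = (X[:, 0] > 0).astype(int)  # one binary modifier that is recoverable from X
y = 2.0 * X[:, 1] + 1.5 * z + 3.0 * z * X[:, 2] + rng.normal(scale=0.1, size=n)

# Known-to-unknown split: z is observed for training rows only.
X_train, X_test = X[:400], X[400:]
z_train, z_test = z[:400], z[400:]
y_train, y_test = y[:400], y[400:]


def with_interactions(X, z):
    """Stack [X, z, X * z] so a plain lasso can fit modifier-dependent effects."""
    z = np.asarray(z).reshape(-1, 1)
    return np.hstack([X, z, X * z])


# Stage 1: a tree-based classifier learns to predict the modifier from X.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, z_train)
z_hat_test = clf.predict(X_test)
z_acc = accuracy_score(z_test, z_hat_test)  # z_test is used for evaluation only

# Stage 2: regularized regression, trained with the true z and
# applied at test time with the estimated z.
reg = Lasso(alpha=0.01)
reg.fit(with_interactions(X_train, z_train), y_train)
y_pred = reg.predict(with_interactions(X_test, z_hat_test))
r2 = r2_score(y_test, y_pred)

print(f"modifier accuracy: {z_acc:.3f}, response R^2: {r2:.3f}")
```

In this sketch the regression never sees the true test-time modifier, so any misclassification in the first stage carries through to the response prediction. That is why the abstract reports classification and regression accuracy separately.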

Version published to 10.21203/rs.3.rs-7495915/v1 on Research Square
Sep 12, 2025
