QGeoSEP: A Novel Multitask Learning Framework Integrating Semantic Features for Collaborative Prediction of Multiple Properties of EMs
Abstract
The accelerated development of high-value energetic molecules (EMs) requires reliable and precise property assessment. The scarcity of EM datasets limits the applicability of single-task models based on traditional machine learning. Building on the Q-GEM algorithm, this study proposes QGeoSEP, a multitask property prediction framework that integrates the quantum–geometric and descriptor features of molecules with the multihead attention mechanism of the transformer. After validating the prediction performance of Q-GEM against single-task models, including three 2D-GNNs and XGBoost, we developed QGeoSEP and compared it with a multitask learning (MTL) baseline. In predicting four properties (density, melting point, heat of combustion, and decomposition temperature), QGeoSEP outperformed MTL, with R² values of 0.949, 0.877, 0.880, and 0.647, respectively. On an independent external test set of EM density and decomposition temperature, QGeoSEP outperformed all the other models, with mean absolute errors of 0.090 g·cm⁻³ and 21.886 K, respectively. Therefore, QGeoSEP can address prediction problems with sparse data and is generalizable.