Incorporating Hierarchical Information into Multiple Instance Learning for Patient Phenotype Prediction with scRNA-seq Data
Abstract
Multiple Instance Learning (MIL) provides a structured approach to patient phenotype prediction with single-cell RNA-sequencing (scRNA-seq) data. However, existing MIL methods tend to overlook the hierarchical structure inherent in scRNA-seq data, especially the biological groupings of cells, or cell types. This limitation may lead to suboptimal performance and poor interpretability at higher levels of cellular division. To address this gap, we present a novel approach to incorporate hierarchical information into the attention-based MIL framework. Specifically, our model applies the attention-based aggregation mechanism over both cells and cell types, thus enforcing a hierarchical structure on the flow of information throughout the model. Across extensive experiments, our proposed approach consistently outperforms existing models and demonstrates robustness in data-constrained scenarios. Moreover, ablation test results show that simply applying the attention mechanism on cell types instead of cells leads to improved performance, underscoring the benefits of incorporating the hierarchical groupings. By identifying the critical cell types that are most relevant for prediction, we show that our model is capable of capturing biologically meaningful associations, thus facilitating biological discoveries.
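The two-level aggregation described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the standard attention-based MIL pooling form (softmax over `w^T tanh(V^T h)`), uses random untrained parameters, and all names (`attention_pool`, `hierarchical_pool`, the toy cell-type labels) are hypothetical. Cells are first pooled within each cell type, and the resulting cell-type embeddings are then pooled into one patient-level embedding.

```python
import numpy as np


def attention_pool(H, V, w):
    """Attention-based MIL pooling over the rows of H (assumed standard form).

    H: (n, d) instance embeddings; V: (d, k) and w: (k,) attention parameters.
    Returns the attention-weighted sum of rows, shape (d,), and the weights, shape (n,).
    """
    scores = np.tanh(H @ V) @ w            # one unnormalized score per instance
    scores = scores - scores.max()         # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ H, weights


def hierarchical_pool(cells, cell_types, params_cell, params_type):
    """Pool cells within each cell type, then pool the cell-type embeddings.

    params_cell and params_type are (V, w) tuples (hypothetical layout).
    Returns the patient embedding and a dict mapping cell type -> attention weight.
    """
    types = np.unique(cell_types)
    type_embeddings = np.stack(
        [attention_pool(cells[cell_types == t], *params_cell)[0] for t in types]
    )
    patient_embedding, type_weights = attention_pool(type_embeddings, *params_type)
    return patient_embedding, dict(zip(types.tolist(), type_weights.tolist()))


# Toy patient: 12 cells with 5-dimensional embeddings and illustrative labels.
rng = np.random.default_rng(0)
cells = rng.normal(size=(12, 5))
cell_types = np.array(["T"] * 5 + ["B"] * 4 + ["NK"] * 3)
params_cell = (rng.normal(size=(5, 8)), rng.normal(size=8))
params_type = (rng.normal(size=(5, 8)), rng.normal(size=8))

patient_embedding, type_weights = hierarchical_pool(
    cells, cell_types, params_cell, params_type
)
print(patient_embedding.shape, type_weights)
```

In this sketch the cell-type attention weights sum to one, so they can be read as a relative importance score per cell type, which is the kind of cell-type-level interpretability the abstract refers to. A trained model would learn the attention parameters jointly with a classifier on the patient embedding.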