Discovering potential key features of genome wide profiling data using Decision Variable Analysis

Jie Xie
Feng Xie
Cheng Li
Weike Lu
Zhen Yang

Abstract

Identifying key features related to a phenotype of interest (POI) in high-dimensional data is an important problem in omics studies, for example of transcriptome or DNA methylome data. However, these data are commonly contaminated by unwanted variation arising from platforms, batches, or other biological factors. The data can therefore be viewed as a combination of variation derived from the POI and variation derived from confounding factors. Failing to account for these factors can produce spurious associations and cause important signals to be missed. Based on this idea, we propose a novel feature selection method called Decision Variable Analysis (DVA) to extract the important features related to the POI from data containing potential confounding factors. Applying the method to both simulated and real data, we found that DVA identified confounding factors better than other methods, including linear regression and surrogate variable analysis. In particular, our method is more efficient for data in which the number of features far exceeds the number of samples. We show that DVA improves on these methods across high-dimensional datasets from different platforms in which the sample size is smaller than the number of features. The results indicate that DVA is an effective method for dissecting sources of variation in omics data with potential confounding factors. DVA is freely available for use at [https://github.com/xvon1/DVA](https://github.com/xvon1/DVA).
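The confounding problem described in the abstract can be illustrated with a small simulation. The sketch below is not DVA and does not reproduce the authors' code; it only shows the linear regression baseline the abstract mentions, under stated assumptions: a binary POI, a single known batch factor that is unbalanced across POI groups, and Gaussian noise. The function names `simulate` and `poi_t_stats` and all parameter values are hypothetical.

```python
import numpy as np


def simulate(n_samples=40, n_features=500, n_each=20, effect=1.5, seed=0):
    """Toy data: features 0..n_each-1 respond to the POI, the next n_each to batch."""
    rng = np.random.default_rng(seed)
    half = n_samples // 2
    poi = np.repeat([0.0, 1.0], half)
    # Assumed confounder: batch membership is unbalanced across POI groups
    # (about 20% of controls and 80% of cases fall in batch 1).
    in_batch = np.concatenate([np.arange(half) < half // 5,
                               np.arange(half) >= half // 5])
    batch = in_batch.astype(float)
    beta = np.zeros(n_features)
    beta[:n_each] = effect                 # true POI effects
    gamma = np.zeros(n_features)
    gamma[n_each:2 * n_each] = effect      # batch-only effects
    noise = rng.normal(size=(2 * half, n_features))
    X = np.outer(poi, beta) + np.outer(batch, gamma) + noise
    return X, poi, batch


def poi_t_stats(X, poi, covariates=None):
    """Per-feature OLS t-statistic for the POI term, optionally adjusting for covariates."""
    n = X.shape[0]
    cols = [np.ones(n), poi]
    if covariates is not None:
        cols.append(covariates)
    D = np.column_stack(cols)              # design matrix: intercept, POI, covariates
    coef, _, _, _ = np.linalg.lstsq(D, X, rcond=None)
    resid = X - D @ coef
    dof = n - D.shape[1]
    sigma2 = (resid ** 2).sum(axis=0) / dof
    var_poi = np.linalg.inv(D.T @ D)[1, 1]
    return coef[1] / np.sqrt(sigma2 * var_poi)


if __name__ == "__main__":
    X, poi, batch = simulate()
    t_naive = poi_t_stats(X, poi)
    t_adj = poi_t_stats(X, poi, covariates=batch)
    # Features 20..39 carry only a batch effect, so any POI signal there is spurious.
    print("mean |t| on batch-only features, unadjusted:", np.abs(t_naive[20:40]).mean())
    print("mean |t| on batch-only features, adjusted:  ", np.abs(t_adj[20:40]).mean())
```

In this toy setting the batch factor is known and can simply be added as a covariate. The setting the abstract targets is harder: the confounders are potential and possibly unobserved, which is why methods such as surrogate variable analysis, and DVA as proposed here, are needed.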

Version published to 10.22541/au.171023413.39007505/v1
Mar 12, 2024

Related articles

Multiomics and Machine Learning Identify Prognostic Immune Related Gene Signatures in Ovarian Cancer

This article has 4 authors:
1. Xiulan Wang
2. Xuewang Guo
3. Yanying Xu
4. Shaofang Hua
This article has no evaluations
Latest version Dec 18, 2025

An Integrative Variant Scoring Function for Finding Novel Genes Associated with Ovarian and Thyroid Cancer

This article has 5 authors:
1. Amanda Bataycan
2. Omodolapo Nurudeen
3. Jonathon E. Mohl
4. Khodeza Begum Mitchell
5. Ming-Ying Leung
This article has no evaluations
Latest version Jan 7, 2026

Pathway-Centric Global Expression Profiling Reveals Key Molecular Drivers in Hepatocellular Carcinoma

This article has 2 authors:
1. Raghavendra Krishnappa
2. Kanthesh M B
This article has no evaluations
Latest version Jan 8, 2026
