A multi-model relationship detection method to assist with naïve exploration of high-dimensional data


Abstract

This paper presents a method for classifying predictors Xp, based on their relationship to an outcome variable Y, as having main effects, interactions, collinearity, or no effects. The method combines complementary information from a multivariate model and a series of bivariate models. We demonstrate how the method works using simulated data. In addition, we experimentally vary the effect sizes in our data generation process to see whether the method can detect relationships between predictors Xp and outcome Y at varied strengths. We also vary the sample size (n) and observe the impact on relationship classification. We find that the method functions as desired within the constraints of this study. We propose future simulation designs for continued testing of the method. We conclude by providing broad instructions for applying it. Our goal is to use this method to develop initial analytical profiles of high-dimensional data in naïve data exploration contexts. This work stems from trying to find an efficient alternative to scatterplot matrices when exploring data that contain thousands of variables.
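
The general idea of comparing bivariate and multivariate evidence can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the models or the decision rule, so the sketch assumes Pearson correlations as the bivariate models, a standardized ordinary least squares fit as the multivariate model, and a hypothetical cutoff of 0.1 on absolute effect size. The function names (`classify`, `solve`, `standardize`, `corr`) and the simulated variables are invented for illustration. Interactions are deliberately left out, because a purely linear multivariate model cannot detect them.

```python
import random
from statistics import pstdev


def standardize(col):
    """Center a column and scale it to unit (population) standard deviation."""
    m = sum(col) / len(col)
    s = pstdev(col)
    return [(v - m) / s for v in col]


def corr(a, b):
    """Pearson correlation of two already-standardized columns."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def classify(X, y, threshold=0.1):
    """Label each predictor from bivariate vs. multivariate effect sizes.

    Assumed (hypothetical) rule, not taken from the paper:
      strong in both models          -> "main effect"
      strong bivariate only          -> "collinear"
      weak in both models            -> "no effect"
      strong multivariate only       -> "unclear"
    """
    names = list(X)
    Z = {k: standardize(X[k]) for k in names}
    zy = standardize(y)
    # Bivariate models: one correlation per predictor.
    bivar = {k: corr(Z[k], zy) for k in names}
    # Multivariate model: standardized OLS coefficients solve R * beta = r.
    R = [[corr(Z[a], Z[b]) for b in names] for a in names]
    multi = dict(zip(names, solve(R, [bivar[k] for k in names])))
    labels = {}
    for k in names:
        strong_b = abs(bivar[k]) > threshold
        strong_m = abs(multi[k]) > threshold
        if strong_b and strong_m:
            labels[k] = "main effect"
        elif strong_b:
            labels[k] = "collinear"
        elif not strong_m:
            labels[k] = "no effect"
        else:
            labels[k] = "unclear"
    return labels


# Simulated data (illustrative only): x1 drives y, x2 merely tracks x1, x3 is noise.
rng = random.Random(0)
n = 5000
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [v + 0.5 * rng.gauss(0, 1) for v in x1]
x3 = [rng.gauss(0, 1) for _ in range(n)]
y = [2 * v + rng.gauss(0, 1) for v in x1]

labels = classify({"x1": x1, "x2": x2, "x3": x3}, y)
print(labels)
```

In this simulation x2 correlates strongly with Y on its own, but its coefficient shrinks toward zero once x1 is in the same model. That disagreement between the two kinds of model is the sort of complementary information the abstract refers to. A pairwise scatterplot of x2 against Y would not reveal it.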
