Feature Selection by Mutual Information
Abstract
Mutual information (MI), a crucial quantity in statistical inference and an essential tool for data analysis, has been largely overlooked in the statistical literature for seven decades. The working MI formulas that emerged from the analysis of data in the biological, engineering, and physical sciences involve asymmetric expressions of terms appearing in both MI and Shannon entropy, which weakens their effectiveness for statistical inference. The observation that the three principles of maximum entropy, maximum likelihood, and minimum MI are equivalent offered new insight into the geometry of data likelihood and established a new framework for statistical inference (Cheng et al., 2008, 2010). In contrast to existing methods, the resulting approach to data analysis is built on MI identities and the fundamental Pythagorean law of conditional MI. This article presents the new methodology by elaborating its application to feature selection in genetics for predicting patients with depressive disorders.
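To make the feature-selection setting concrete, the sketch below ranks discrete candidate features by their empirical mutual information with a class label. This is a minimal illustration of MI-based feature scoring in general, not the MI-identity methodology of the article; the variable names and toy data are invented for the example.

```python
# Minimal sketch (not the article's method): rank discrete features
# by empirical mutual information with a binary class label.
import numpy as np

def mutual_information(x, y):
    """Empirical MI (in nats) between two discrete 1-D arrays,
    computed from the plug-in joint and marginal frequencies."""
    x = np.asarray(x)
    y = np.asarray(y)
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            p_xy = np.mean((x == xv) & (y == yv))  # joint frequency
            p_x = np.mean(x == xv)                 # marginal of x
            p_y = np.mean(y == yv)                 # marginal of y
            if p_xy > 0:
                mi += p_xy * np.log(p_xy / (p_x * p_y))
    return mi

# Toy data: feature A determines the label; feature B is independent noise.
label  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
feat_a = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # perfectly informative
feat_b = np.array([0, 1, 0, 1, 0, 1, 0, 1])  # carries no label information

scores = {"A": mutual_information(feat_a, label),
          "B": mutual_information(feat_b, label)}
best = max(scores, key=scores.get)
print(best, round(scores["A"], 4), round(scores["B"], 4))
# → A 0.6931 0.0  (MI of A equals log 2; MI of B is zero)
```

A genetics application would apply the same scoring to (discretized) marker values against a diagnosis label and retain the top-ranked features; the article's contribution concerns the MI identities and the conditional-MI Pythagorean law that make such conditioning steps rigorous.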