Feature Selection by Mutual Information

Abstract

Mutual information (MI), a crucial component of statistical inference and an essential tool for data analysis, has been largely overlooked in the statistical literature for seven decades. Although MI emerged from the analysis of data in the biological, engineering, and physical sciences, the working MI formulas in common use involve asymmetric expressions of terms shared by MI and Shannon entropy, which reduces their effectiveness for statistical inference. The innovative observation by Cheng et al. (2008, 2010) of the equivalence of three principles (maximum entropy, maximum likelihood, and minimum MI) offered new insights into the geometry of data likelihood and established a new framework for statistical inference. In contrast to existing methods, the data analysis developed here is built on MI identities and the fundamental Pythagorean law of conditional MI. This article presents the new methodology and elaborates its application to feature selection in genetics for predicting depressive disorders in patients.
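
The abstract invokes MI identities and a Pythagorean law of conditional MI without stating them. For orientation, the textbook identity and the chain-rule decomposition of MI, on which an additive ("Pythagorean") law of conditional MI can be built, are

$$ I(X;Y) = H(X) + H(Y) - H(X,Y), \qquad I(X;Y,Z) = I(X;Y) + I(X;Z \mid Y), $$

though the exact formulation used by Cheng et al. may differ. Below is a minimal sketch of feature selection by mutual information in Python. It illustrates the general technique named in the title, not the article's specific methodology; the synthetic data, the scikit-learn estimator, and the choice of k = 5 are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

# Synthetic stand-in for a genetics dataset: 200 samples, 50 candidate
# features, only 5 of which carry information about a binary label
# (e.g., presence of a depressive disorder).
X, y = make_classification(n_samples=200, n_features=50, n_informative=5,
                           n_redundant=0, random_state=0)

# Estimate I(feature; label) for each feature with scikit-learn's
# nonparametric (k-nearest-neighbor) MI estimator.
mi = mutual_info_classif(X, y, random_state=0)

# Keep the k features with the highest estimated MI.
k = 5
top_k = np.argsort(mi)[::-1][:k]
X_selected = X[:, top_k]

print("Selected feature indices:", sorted(top_k.tolist()))
print("Estimated MI scores:     ", np.round(mi[top_k], 3))
```

Note that ranking features by marginal MI ignores redundancy and interactions among features; accounting for those requires conditional MI, which is exactly where a decomposition law such as the one the article develops becomes relevant.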
