Unraveling HIV protease drug resistance and genetic diversity with kernel methods


Abstract

A definitive cure for HIV/AIDS does not yet exist, so patients rely on antiretroviral therapy for life. In this scenario, the emergence of drug resistance is an important concern. Automatic prediction of resistance from HIV sequences gives physicians a fast tool for choosing the best possible treatment. This paper proposes three kernel functions for this kind of data: one focused on single-residue mutations, another on k-mers (close-range information in the sequence), and a third on pairwise interactions between amino acids (close- and long-range information). All three kernels can handle the categorical nature of HIV data and the presence of allelic mixtures. Experiments on the PI dataset from the Stanford Genotype-Phenotype database show that they yield prediction models with very good performance while remaining simple, open and interpretable. Most of the mutations and patterns they identify as relevant agree with previous literature. The paper also compares the different but complementary views that two kernel methods (SVM and kernel PCA) give of HIV data, showing that the former focuses on optimizing prediction while the latter summarizes the main patterns of genetic diversity, which in the Stanford Genotype-Phenotype database are related to drug resistance and HIV subtype.
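
The abstract does not give the kernel formulas, so the following is only a minimal illustrative sketch of the general idea, not the paper's definitions. It assumes each sequence position can be represented as a non-empty set of residues (a mixture is a set with more than one element), uses the Jaccard index as a stand-in per-position similarity, and uses a standard spectrum (k-mer counting) kernel for close-range information. The function names `position_kernel` and `spectrum_kernel` are hypothetical.

```python
from collections import Counter


def position_kernel(x, y):
    """Average per-position similarity between two aligned sequences.

    Assumption: each position is a non-empty set of residues, so an
    allelic mixture such as I/V is written {"I", "V"}. Positions are
    compared with the Jaccard index (an illustrative choice only).
    """
    if len(x) != len(y):
        raise ValueError("sequences must be aligned to the same length")
    total = 0.0
    for a, b in zip(x, y):
        total += len(a & b) / len(a | b)
    return total / len(x)


def spectrum_kernel(s, t, k=3):
    """Standard spectrum kernel: dot product of k-mer count vectors."""
    counts_s = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    counts_t = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    shared = counts_s.keys() & counts_t.keys()
    return sum(counts_s[m] * counts_t[m] for m in shared)


# Toy three-residue fragments; the third position of `a` is a mixture.
a = [{"P"}, {"Q"}, {"I", "V"}]
b = [{"P"}, {"Q"}, {"V"}]
similarity = position_kernel(a, b)

# Toy strings without mixtures for the k-mer view.
shared_2mers = spectrum_kernel("PQITL", "PQVTL", k=2)
```

A Gram matrix built from such a function could in principle be passed to any kernel method that accepts precomputed kernels, which is what lets the same similarity serve both a supervised model (SVM) and an unsupervised one (kernel PCA).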
