The Multiple Approaches for Drug-Drug Interaction Extraction using Machine learning and transformer based Model

Abstract

This paper investigates machine learning based approaches to Drug-Drug Interaction (DDI) extraction, with the aim of determining the side effects of multiple drugs prescribed simultaneously. Our proposed model uses the TAC 2017 dataset, which contains Adverse Drug Reaction (ADR) data, for the classification of drug-drug interactions. The TAC 2017 dataset holds several types of information about drugs and their interactions. Our method uses Term Frequency-Inverse Document Frequency (TF-IDF) to transform the textual descriptions of side effects into numerical feature vectors, followed by Random Forest, Gradient Boosting, BioBERT, and Support Vector Machine (SVM) classifiers to predict potential interactions between drug pairs. One key strength of the Random Forest approach is its ability to provide feature importance scores, which allow us to interpret which side effects are most influential in predicting drug interactions. The key advantage of Gradient Boosting is its high predictive performance combined with interpretability. It handles complex, structured data efficiently, and its decisions are more transparent, which is necessary in the biomedical domain. The advantage of SVM is its ability to handle high-dimensional data, capture complex non-linear interactions using kernel functions, and generalize well across datasets, which makes it robust to overfitting. BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining) is advantageous for DDI prediction because of its biomedical domain knowledge and its contextual understanding of complex drug-related texts. These methods capture the relevance and importance of each side effect of co-prescribed drugs and also generate drug pairs from the dataset. Our model demonstrates competitive performance in DDI prediction, which highlights the utility of text-based feature extraction combined with an interpretable ensemble learning model.
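To make the feature-extraction step concrete, the following is a minimal sketch of TF-IDF weighting and drug-pair generation using only the Python standard library. It is not the authors' implementation: the toy side-effect texts, the drug names, the helper names `tfidf_vectors` and `pair_features`, the unsmoothed weighting tf * log(N / df), and the choice to concatenate the two drugs' vectors into one pair feature vector are all assumptions made for illustration. The abstract does not specify any of these details.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy data (not from TAC 2017): drug name -> side-effect text.
side_effects = {
    "drug_a": "nausea headache dizziness",
    "drug_b": "nausea bleeding",
    "drug_c": "headache rash",
}


def tfidf_vectors(docs):
    """Return (vocabulary, {name: vector}) weighted as tf * log(N / df).

    Assumes every document contains at least one token.
    """
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    vocab = sorted({tok for toks in tokenized.values() for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(tok for toks in tokenized.values() for tok in set(toks))
    vectors = {}
    for name, toks in tokenized.items():
        counts = Counter(toks)
        vectors[name] = [
            (counts[term] / len(toks)) * math.log(n_docs / df[term])
            for term in vocab
        ]
    return vocab, vectors


def pair_features(vectors):
    """Build one feature vector per unordered drug pair by concatenation.

    Concatenation is an assumed design choice for this sketch.
    """
    return {
        (a, b): vectors[a] + vectors[b]
        for a, b in combinations(sorted(vectors), 2)
    }


vocab, vectors = tfidf_vectors(side_effects)
pairs = pair_features(vectors)
```

Each entry of `pairs` could then be passed, together with an interaction label, to any of the classifiers named in the abstract. Terms that occur in fewer documents receive a larger weight, so a side effect unique to one drug contributes more to its vector than one shared by several drugs.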
