Predicting New York Heart Association (NYHA) Heart Failure Classification from medical student notes following simulated patient encounters
Abstract
Random forest models have demonstrated utility in determining New York Heart Association (NYHA) Heart Failure Classifications. This study aims to determine how accurately a random forest model can derive the NYHA Classification from medical students’ free-text history of present illness (HPI). The NYHA Classifications establish terminology for delineating heart failure presentations; this terminology was converted into keywords conveyed by standardized patients. A total of 649 typed HPIs were de-identified, tokenized, cleaned, and assessed for the number of correct keywords, the number of incorrect keywords, and keyword usage. Models were trained on bootstrapped training data and assessed on test data. In testing, the model demonstrated error rates of 0.775% for NYHA II, 26.3% for NYHA III, and 6.90% for NYHA IV. Overall, the model had a 0.420% estimated error rate on the bootstrap training sample and an 8.20% misclassification rate on the testing set. In future applications, a method of instantaneous feedback centered on keywords and their importance measures, specifically as determined by the variable importance plot (VIP), may help students determine NYHA Classifications and improve their lexical density.
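The keyword assessment described in the abstract can be illustrated with a minimal sketch. This is not the study's code: the keyword lists, the function names, and the definition of "keyword usage" as the share of tokens that are keywords are all assumptions made for illustration, since the abstract does not specify them.

```python
import re
from collections import Counter

# Hypothetical keyword lists; the study's actual keywords are not given in the abstract.
KEYWORDS = {
    "II": {"stairs", "ordinary", "slight"},
    "III": {"dressing", "marked", "minimal"},
    "IV": {"rest", "bedbound", "orthopnea"},
}


def tokenize(text):
    """Lowercase a note and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_features(note, true_class):
    """Count correct and incorrect keywords in one HPI note.

    'Correct' keywords belong to the encounter's true NYHA class;
    'incorrect' keywords belong to any other class (an assumed definition).
    """
    tokens = tokenize(note)
    counts = Counter(tokens)
    correct_terms = KEYWORDS[true_class]
    incorrect_terms = set().union(
        *(terms for cls, terms in KEYWORDS.items() if cls != true_class)
    )
    correct = sum(counts[t] for t in correct_terms)
    incorrect = sum(counts[t] for t in incorrect_terms)
    n_tokens = len(tokens)
    # Assumed definition of keyword usage: keywords as a share of all tokens.
    usage = (correct + incorrect) / n_tokens if n_tokens else 0.0
    return {
        "n_tokens": n_tokens,
        "correct": correct,
        "incorrect": incorrect,
        "keyword_usage": usage,
    }


note = "Patient reports dyspnea at rest and orthopnea; denies trouble with stairs."
features = keyword_features(note, "IV")
print(features)
```

Feature rows of this kind, one per note, are the sort of input a random forest classifier could be trained on. The sketch does not handle negation (for example, "denies trouble with stairs" still counts "stairs"), which a real pipeline would likely need to address.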