ATCodeR: a dictionary-based R-tool to standardize medication free-text
Abstract
Over the past decades, oncology treatment paradigms have evolved considerably. Yet substance-related documentation in medical records is often unstructured, which makes analyzing treatment patterns and outcomes time-consuming. To advance oncological research, clinical data science must offer solutions that facilitate research and analysis with real-world data (RWD). The present contribution introduces a user-friendly R-tool that maps free-text medication entries to codes of the Anatomical Therapeutic Chemical (ATC) Classification System using a dictionary-based approach. The output is a structured data frame with columns for antineoplastic medication, other medications, and supplementary information. To validate accuracy, 561 data entries from an evaluation data set, comprising 935 tokens, were reviewed. Of these tokens, 88.5% were successfully transformed into their respective ATC codes. Additional information was extracted from 129 data entries (23%), while 23 entries (4.1%) contained no usable information. All tokens were reviewed manually; the transformation failed for 84 tokens (8.9%). This approach improves the standardization and analysis of systemic anti-cancer treatment data in German-speaking regions by increasing efficiency while maintaining adequate accuracy.
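To make the dictionary-based idea concrete, the following is a minimal sketch of how free-text tokens might be looked up in an ATC dictionary and sorted into the three output columns. It is not the ATCodeR implementation: it is written in Python rather than R purely for illustration, and the tiny dictionary, the separator rules, and the names `ATC_DICTIONARY`, `tokenize`, and `standardize` are all assumptions. The only domain facts it relies on are the listed ATC codes and that ATC group L01 covers antineoplastic agents.

```python
import re

# Hypothetical mini-dictionary: lowercase substance or brand name -> ATC code.
# A real dictionary would be far larger and cover spelling variants.
ATC_DICTIONARY = {
    "cisplatin": "L01XA01",
    "carboplatin": "L01XA02",
    "paclitaxel": "L01CD01",
    "taxol": "L01CD01",          # brand name of paclitaxel
    "dexamethason": "H02AB02",   # German spelling
    "dexamethasone": "H02AB02",
    "ondansetron": "A04AA01",
}


def tokenize(entry):
    """Split a free-text entry on assumed separators: '+', ',' and ';'."""
    parts = re.split(r"[+,;]", entry.lower())
    return [part.strip() for part in parts if part.strip()]


def standardize(entry):
    """Sort the tokens of one entry into three illustrative output columns."""
    result = {"antineoplastic": [], "other_medication": [], "additional_info": []}
    for token in tokenize(entry):
        code = ATC_DICTIONARY.get(token)
        if code is None:
            # No dictionary match: keep the raw token as supplementary information.
            result["additional_info"].append(token)
        elif code.startswith("L01"):
            # ATC group L01 contains the antineoplastic agents.
            result["antineoplastic"].append(code)
        else:
            result["other_medication"].append(code)
    return result


row = standardize("Carboplatin + Taxol, Dexamethason; 6 Zyklen")
print(row)
```

An exact-match lookup like this keeps every decision traceable to a dictionary entry, but it misses misspellings and abbreviations, so unmatched tokens are retained rather than discarded.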