Prior Knowledge Shapes Fine-Tuning Success for Biomedical Term Normalization
Abstract
Large language models (LLMs) often fail to correctly link biomedical terms to their standardized ontology identifiers, posing challenges for downstream applications that depend on accurate, machine-readable codes. These linking failures can compromise the integrity of data used in precision medicine, clinical decision support, and population health. Fine-tuning can partially remedy these issues, but the degree of improvement varies across terms and terminologies. Focusing on the Human Phenotype Ontology (HPO), we show that a model’s prior knowledge of term–identifier pairs, acquired during pre-training, strongly predicts whether fine-tuning will enhance its linking accuracy. We evaluate prior knowledge in three complementary ways: (1) latent probabilistic knowledge, revealed through stochastic prompting, which captures hidden associations not evident in deterministic output; (2) partial subtoken knowledge, reflected in incomplete but non-random generation of identifier components; and (3) term familiarity, inferred from annotation frequencies in the biomedical literature, which serve as a proxy for training exposure. We then assess how these forms of prior knowledge influence deterministic accuracy in identifier linking. Fine-tuning performance varies most for terms in what we call the reactive middle zone of the ontology: terms with intermediate levels of prior knowledge that are neither absent nor fully consolidated. These terms exhibit the largest gains or losses in accuracy during fine-tuning, suggesting that the success of knowledge injection depends critically on the model’s initial familiarity with the term–identifier pair.
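To make the measures concrete, the following is a minimal sketch (not the authors' evaluation code) of how one might score a single term–identifier pair, assuming the model outputs have already been collected: one greedy (temperature 0) generation for deterministic accuracy and several temperature-sampled generations for latent probabilistic knowledge. The positional digit-overlap proxy for partial subtoken knowledge, and the example term and identifier, are illustrative assumptions rather than the paper's exact metrics.

```python
import re
from typing import Iterable, Optional

# HPO identifiers have the form HP: followed by seven digits, e.g. HP:0001250.
HPO_ID_PATTERN = re.compile(r"HP:\d{7}")

def extract_hpo_id(text: str) -> Optional[str]:
    """Pull the first HPO-style identifier out of a model response, if any."""
    match = HPO_ID_PATTERN.search(text)
    return match.group(0) if match else None

def deterministic_accuracy(greedy_response: str, gold_id: str) -> bool:
    """Exact match between the single greedy generation and the gold identifier."""
    return extract_hpo_id(greedy_response) == gold_id

def latent_probabilistic_knowledge(samples: Iterable[str], gold_id: str) -> float:
    """Fraction of stochastic (temperature > 0) samples containing the gold identifier."""
    responses = list(samples)
    if not responses:
        return 0.0
    hits = sum(extract_hpo_id(r) == gold_id for r in responses)
    return hits / len(responses)

def partial_subtoken_knowledge(greedy_response: str, gold_id: str) -> float:
    """Illustrative proxy: positional digit overlap between the generated and gold
    identifier codes (1.0 = identical digits, 0.0 = no identifier generated)."""
    predicted = extract_hpo_id(greedy_response)
    if predicted is None:
        return 0.0
    pred_digits, gold_digits = predicted.split(":")[1], gold_id.split(":")[1]
    matches = sum(p == g for p, g in zip(pred_digits, gold_digits))
    return matches / len(gold_digits)

if __name__ == "__main__":
    # Hypothetical model outputs for the phenotype term "Seizure"; the gold
    # identifier is assumed here to be HP:0001250. In practice these strings
    # would come from LLM API calls, which are omitted from this sketch.
    gold = "HP:0001250"
    greedy = "The HPO identifier for Seizure is HP:0001230."
    samples = [
        "Seizure maps to HP:0001250.",
        "I believe the code is HP:0001250.",
        "Seizure is annotated as HP:0002060.",
    ]
    print("deterministic accuracy:", deterministic_accuracy(greedy, gold))                   # False
    print("latent probabilistic knowledge:", latent_probabilistic_knowledge(samples, gold))  # 2/3
    print("partial subtoken knowledge:", partial_subtoken_knowledge(greedy, gold))           # 6/7 digits match
```

In this toy case the greedy answer is wrong, yet the stochastic samples recover the correct identifier two times out of three and the greedy output differs from the gold code by a single digit: the kind of intermediate prior knowledge the abstract associates with the reactive middle zone, where fine-tuning outcomes vary most.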