Semantic Embeddings and the Peripheral Transcriptome in Ischemic Stroke: Connecting Molecular Signatures to NANDA-I Diagnoses
Abstract
Objective
To construct and evaluate, in an exploratory manner, a pathophysiologic rationale linking biological pathways derived from the peripheral transcriptome in ischemic stroke (IS) to nursing diagnoses in the NANDA-I 2024-2026 taxonomy, while emphasizing that this association is not direct, deterministic, or automatically inferable from textual similarity with large language models (LLMs).
Methods
A computational study was conducted using public secondary data from the Gene Expression Omnibus series GSE16561, which includes 63 peripheral blood samples: 39 from individuals with IS and 24 from healthy controls. The pipeline integrated three stages: transcriptomic analysis with functional enrichment, semantic mapping through ClinicalBERT embeddings, and mechanistic and clinical-conceptual judgment using Claude Sonnet 4.6 as a judge. The judgment stage was treated as the central interpretive layer, designed to mediate between the transcriptome, pathophysiology, functional manifestation, and NANDA-I diagnosis.
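The semantic mapping stage scores pathway-diagnosis pairs by cosine similarity between embedding vectors. The following is a minimal sketch of that scoring step only, not the authors' implementation. The function names, the labels `pathway_A` and `diagnosis_X`, and the 3-dimensional vectors are hypothetical stand-ins for real ClinicalBERT embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def rank_pairs(pathway_vecs, diagnosis_vecs):
    """Score every pathway-diagnosis pair and sort by similarity, highest first."""
    pairs = [
        (pathway, diagnosis, cosine_similarity(p_vec, d_vec))
        for pathway, p_vec in pathway_vecs.items()
        for diagnosis, d_vec in diagnosis_vecs.items()
    ]
    return sorted(pairs, key=lambda t: t[2], reverse=True)


# Toy vectors invented for illustration; real embeddings have hundreds of dimensions.
pathways = {"pathway_A": [1.0, 0.0, 0.0], "pathway_B": [0.0, 1.0, 1.0]}
diagnoses = {"diagnosis_X": [1.0, 1.0, 0.0], "diagnosis_Y": [0.0, 0.0, 1.0]}

ranked = rank_pairs(pathways, diagnoses)
for pathway, diagnosis, score in ranked:
    print(f"{pathway} -> {diagnosis}: {score:.3f}")
```

A ranking like this reflects textual proximity only, which is why the pipeline passes the pairs to a separate judgment stage.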
Results
The analysis identified a bimodal transcriptomic pattern, with activation of pathways related to innate immunity and suppression of pathways related to adaptive immunity. Semantic mapping generated 158 pathway-diagnosis pairs. The Spearman correlation between cosine similarity and the mechanistic score was negative and statistically significant (rho = -0.243; p = 2.09e-03), but weak in magnitude. This effect size (rho squared of approximately 0.059) indicates that semantic similarity explained less than 6% of the variance in mechanistic plausibility, reinforcing the insufficiency of embeddings as a stand-alone criterion. Of the 158 pairs, 14 were classified as high concordance, 8 as moderate, and 136 as divergent.
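The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers reported above. The variable names are illustrative, and reading the squared Spearman coefficient as variance explained is an approximation that strictly applies to the ranks of the two scores.

```python
# Values taken from the reported results.
rho = -0.243
n_pairs = 158
counts = {"high": 14, "moderate": 8, "divergent": 136}

# Squared correlation: approximate share of rank variance accounted for.
rho_squared = rho ** 2

# The three concordance classes should partition all pairs.
total = sum(counts.values())

# Proportion of pairs in each class.
shares = {label: n / n_pairs for label, n in counts.items()}

print(f"rho^2 = {rho_squared:.4f}")
print(f"total pairs = {total}")
for label, share in shares.items():
    print(f"{label}: {share:.1%}")
```

On these numbers the squared coefficient is below 0.06, and the divergent class makes up about 86% of the pairs.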
Conclusion
The main value of this study lies in demonstrating that translating biological pathways into nursing diagnoses requires pathophysiologic, functional, and clinical-conceptual mediation. The prioritized pairs represent mechanistically plausible hypotheses for future research, without implying causality, direct clinical confirmation, or immediate care recommendations.