Prompting large language models to extract chemical‒disease relation precisely and comprehensively at the document level

Mei Chen
Tingting Zhang
Shibin Wang

Abstract

The intricate relationships between chemicals and diseases are increasingly being revealed and documented in academic studies. However, because high-quality annotated data are scarce, document-level relation extraction techniques based on deep learning suffer from ambiguity and one-sidedness, which severely impedes further automatic in-depth analysis of these relationships. In this study, we harness the advanced reading comprehension capabilities and extensive world knowledge of large language models (LLMs) to construct precise and comprehensive zero-shot prompt workflows for extracting chemical‒disease relationships. The workflows are built on the patterns of LLM-based relation extraction, the attributes of chemical‒disease relationships, and the linguistic features of biomedical literature. Evaluations on the enhanced chemical‒disease relation (CDR) datasets show that the workflows reach F1 scores of up to 87% for precise extraction and up to 73% for comprehensive extraction. Additionally, we investigate the pivotal factors that influence LLMs' ability to extract complex relations, offering insights for relation extraction across diverse fields.
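
The abstract reports F1 scores but does not say how they are computed. As a minimal sketch, assuming micro-averaged F1 over document-level (chemical, disease) pairs, the metric could be calculated as below. The function name `micro_f1`, the identifier format, and the dictionary layout are hypothetical and are not taken from the preprint.

```python
from typing import Dict, Set, Tuple

# Hypothetical representation: one (chemical_id, disease_id) pair per relation.
Pair = Tuple[str, str]


def micro_f1(
    gold: Dict[str, Set[Pair]],
    pred: Dict[str, Set[Pair]],
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1), micro-averaged over all documents.

    Assumption: a predicted pair counts as correct only if the same pair
    is annotated in the same document.
    """
    tp = fp = fn = 0
    for doc_id in gold.keys() | pred.keys():
        g = gold.get(doc_id, set())
        p = pred.get(doc_id, set())
        tp += len(g & p)   # pairs both predicted and annotated
        fp += len(p - g)   # predicted but not annotated
        fn += len(g - p)   # annotated but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

Under this reading, a workflow tuned for precise extraction would be expected to keep false positives low, while one tuned for comprehensive extraction would aim to keep false negatives low; both effects show up in the same F1 figure.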

Version published to 10.21203/rs.3.rs-4712263/v1 on Research Square
Aug 9, 2024

Related articles

Prompt-Orchestrated Large Language Models for Clinical Information Extraction

This article has 13 authors:
1. Livia Lilli
2. Andrea Rosati
3. Giovanni Paolo Tobia
4. Massimo Criscione
5. Federica Tomassini
6. Chiara Dachena
7. Alice Luraschi
8. Chiara Cantarini
9. Carolina De Maria
10. Luigi Congedo
11. Massimo Bernaschi
12. Stefano Patarnello
13. Anna Fagotti
This article has no evaluations. Latest version: Jan 16, 2026

Emergence of Biological Structural Discovery in General-Purpose Language Models

This article has 1 author:
1. Liang Wang
This article has no evaluations. Latest version: Jan 8, 2026

DiLLaB: Discussion Labeling with LLMs for Building Datasets

This article has 6 authors:
1. Ludimila Gonçalves
2. Márcia Lima
3. André Carvalho
4. Walter Nakamura
5. Igor Steinmacher
6. Tayana Conte
This article has no evaluations. Latest version: Jan 28, 2026
