Extracting the Research Goal from Biomedical Abstracts

Abstract

Extracting the research goal of a publication can support researchers when they search the biomedical literature. Systems can use this information for various tasks, e.g., query processing or matching and ranking the candidates. However, this is a highly subjective and complex task, which might involve various semantic types and may vary depending on the research area. Previous work has ventured into this area, but as far as we know, no specific dataset is yet available. We reused seven reviews from the European Commission with annotations about the research goal for more than 2.8k articles. From these we compiled the RG4C dataset, which we then used to fine-tune a model for the automatic extraction of four criteria: “Field of Application”, “Disease Area”, “Disease Feature”, and “Biological Endpoint”. We obtained an overall F-score of 0.59, with results for each criterion ranging from 0.35 to 0.83. The RG4C dataset is available at: https://www.kaggle.com/datasets/marianaln/eu-qa-complete. Our source code is available at: https://www.kaggle.com/code/marianaln/eu-qa-features/notebook
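
To make the relationship between the per-criterion scores and the overall score concrete, here is a minimal sketch of how F-scores can be computed from true positive, false positive, and false negative counts. The abstract does not state how the overall score was aggregated, so both micro- and macro-averaging are shown as possibilities. The `Counts` class, the function names, and the toy counts are hypothetical and are not taken from the RG4C dataset or the authors' notebook.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical container for the evaluation counts of one criterion."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives


def f_score(c: Counts) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when nothing was predicted or expected."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


def micro_f_score(per_criterion: dict[str, Counts]) -> float:
    """Pool the counts over all criteria, then compute a single F1."""
    pooled = Counts(
        tp=sum(c.tp for c in per_criterion.values()),
        fp=sum(c.fp for c in per_criterion.values()),
        fn=sum(c.fn for c in per_criterion.values()),
    )
    return f_score(pooled)


def macro_f_score(per_criterion: dict[str, Counts]) -> float:
    """Average the per-criterion F1 values with equal weight."""
    return sum(f_score(c) for c in per_criterion.values()) / len(per_criterion)


# Toy counts for illustration only (assumption: NOT the paper's results).
toy_counts = {
    "Field of Application": Counts(tp=9, fp=1, fn=1),
    "Disease Area": Counts(tp=4, fp=2, fn=2),
    "Disease Feature": Counts(tp=5, fp=5, fn=5),
    "Biological Endpoint": Counts(tp=1, fp=3, fn=3),
}

if __name__ == "__main__":
    for name, counts in toy_counts.items():
        print(f"{name}: {f_score(counts):.2f}")
    print(f"micro: {micro_f_score(toy_counts):.2f}")
    print(f"macro: {macro_f_score(toy_counts):.2f}")
```

With uneven criteria, the two averages can differ noticeably: micro-averaging is dominated by the criteria with the most annotations, while macro-averaging weights all four criteria equally.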
