Knowledge representation of a multi-centre adolescent and young adult (AYA) cancer infrastructure; development of the STRONG AYA Knowledge Graph

Abstract

Purpose

Rare diseases are difficult to capture fully and regularly call for large, geographically dispersed initiatives. Such initiatives often face data harmonisation challenges, which render data incompatible and impede the initiatives' success. The STRONG AYA project is such an initiative, focusing specifically on adolescents and young adults (AYAs) with cancer. STRONG AYA is setting up a federated data infrastructure containing data in varying formats. Here, we describe how we used healthcare-agnostic Semantic Web technologies to overcome these challenges.

Methodology

We structured the STRONG AYA case-mix and core outcome measure concepts and their properties as knowledge graphs. Having identified the corresponding standard terminologies, we developed a semantic map based on the knowledge graphs and on the annotation helper plugin for Flyover, which we introduce here. Flyover is a tool that converts structured data into Resource Description Framework (RDF) triples and enables semantic interoperability. As a demonstration, we mapped data that are to be included in the STRONG AYA infrastructure.
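The two steps described above (converting structured data into triples, then annotating local data elements with standard concepts) can be sketched in a few lines. This is a minimal illustration, not Flyover's actual output format: the `example.org` namespaces, the `has_column` and `has_value` predicates, the column names `gesl` and `lft`, the concept names, and the use of `owl:equivalentClass` as the linking predicate are all assumptions made for the example. Triples are kept as plain Python tuples so that no RDF library is required.

```python
# Hypothetical namespaces; none of these IRIs come from the STRONG AYA project.
DATA = "http://example.org/data/"
CONCEPT = "http://example.org/concept/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EQUIVALENT_CLASS = "http://www.w3.org/2002/07/owl#equivalentClass"


def triplify(table_name, rows):
    """Turn every cell of a table into triples that keep the local column name."""
    triples = []
    for index, row in enumerate(rows):
        row_iri = f"{DATA}{table_name}/row/{index}"
        for column, value in row.items():
            column_class = f"{DATA}{table_name}.{column}"
            cell_iri = f"{row_iri}/{column}"
            triples.append((row_iri, f"{DATA}has_column", cell_iri))
            triples.append((cell_iri, RDF_TYPE, column_class))
            triples.append((cell_iri, f"{DATA}has_value", str(value)))
    return triples


def annotate(table_name, semantic_map):
    """Link each local column class to a standard concept; the data stay untouched."""
    return [
        (f"{DATA}{table_name}.{column}", EQUIVALENT_CLASS, f"{CONCEPT}{concept}")
        for column, concept in semantic_map.items()
    ]


# Illustrative source data with opaque local column names (assumed, not real data).
rows = [{"gesl": 1, "lft": 23}, {"gesl": 2, "lft": 31}]
# Illustrative semantic map: local column name -> placeholder concept name.
semantic_map = {"gesl": "BiologicalSex", "lft": "AgeAtDiagnosis"}

graph = triplify("centre_a", rows) + annotate("centre_a", semantic_map)
```

Keeping the annotation triples separate from the data triples means the semantic map can be revised or extended without regenerating the converted data.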

Results

The knowledge graphs provided a comprehensive overview of the large number of STRONG AYA concepts. The semantic terminology mapping and annotation helper allowed us to query data that use otherwise incomprehensible local terminologies without changing the data themselves. Both the knowledge graphs and the semantic map were made available on a Hugo webpage for increased transparency and understanding.
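To illustrate what querying by concept rather than by local name can look like, the sketch below resolves a standard concept to whichever local data elements are annotated with it and then collects their values. It is a toy stand-in for a SPARQL query over an RDF store, not the project's implementation: the prefixes, the two centres `a` and `b`, their column names, and the concept names are hypothetical.

```python
# Hypothetical predicates, written as prefixed names for brevity.
TYPE = "rdf:type"
EQUIVALENT = "owl:equivalentClass"
HAS_VALUE = "ex:has_value"

# Two assumed centres store the same concept under different, opaque column names.
graph = [
    ("a:cell0", TYPE, "a:gesl"), ("a:cell0", HAS_VALUE, "1"),
    ("a:cell1", TYPE, "a:lft"), ("a:cell1", HAS_VALUE, "23"),
    ("b:cell0", TYPE, "b:sex_cd"), ("b:cell0", HAS_VALUE, "F"),
    # Annotation triples: local column class -> shared placeholder concept.
    ("a:gesl", EQUIVALENT, "ex:BiologicalSex"),
    ("b:sex_cd", EQUIVALENT, "ex:BiologicalSex"),
    ("a:lft", EQUIVALENT, "ex:AgeAtDiagnosis"),
]


def values_for_concept(triples, concept):
    """Return the sorted values of all cells whose column is mapped to `concept`."""
    local_classes = {s for s, p, o in triples if p == EQUIVALENT and o == concept}
    cells = {s for s, p, o in triples if p == TYPE and o in local_classes}
    return sorted(o for s, p, o in triples if p == HAS_VALUE and s in cells)


sex_values = values_for_concept(graph, "ex:BiologicalSex")
```

The caller only needs to know the shared concept; the local names `gesl` and `sex_cd` never appear in the query and the stored values are returned exactly as recorded.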

Discussion

The use of Semantic Web technologies such as RDF and knowledge graphs is a viable solution to overcome challenges regarding data interoperability and reusability in a federated AYA cancer data infrastructure, without being bound to rigid standardised schemas. Linking semantically meaningful concepts to otherwise incomprehensible data elements demonstrates how these domain-agnostic technologies made non-standardised healthcare data interoperable.
