BH25DE report: On the path to machine-actionable training materials

Abstract

The fragmentation of training materials across research infrastructures often results in unsustainable resource duplication and significant barriers to upskilling. This work aims to enable developers to build systems that effectively discover relevant materials by promoting a federated, FAIR-compliant strategy for open training. The project operated across three interrelated streams: metadata interoperability, material analysis, and the definition and representation of learning paths in a machine-readable manner.

We demonstrated content federation via the mTeSS-X platform, enabling cross-instance exchange and preparing for future integration with the EOSC federation. To enhance interoperability, we indexed relevant ontologies and curated semantic crosswalks between established metadata models, specifically MoDALIA and Schema.org/Bioschemas. These mappings were implemented within the open-source OERbservatory Python package, providing a facility for exchanging data between platforms such as DALIA and TeSS. For material analysis, we used Large Language Models (LLMs) and explored vectorisation techniques to calculate similarity, allowing for the identification of related materials and the potential for future deduplication of records across registries.

To address the lack of machine-actionable trajectories across related or sequential materials, we proposed new Bioschemas profiles specifically for learning paths. By extending Schema.org types, including Course and Syllabus, we developed a schema that supports modular and linear orderings of training materials. This model was validated using SPARQL queries on knowledge graphs derived from real-world examples such as the Galaxy Training Network. Such advancements provide a foundation for automated path generation and improved discoverability within training catalogues, and serve as a use case and strategy with broader applicability beyond those materials.
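
The idea of vectorising materials and comparing them by similarity can be illustrated with a minimal sketch. It is not the method used in the project, which relied on LLMs and vectorisation techniques not detailed here; a plain term-frequency vector and cosine similarity stand in for them. The function names, the example records, and the 0.5 threshold are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def vectorise(text):
    """Turn a text into a sparse term-frequency vector (word -> count)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related_pairs(materials, threshold=0.5):
    """Return (id, id, score) for every pair scoring at or above threshold.

    `materials` maps a record identifier to its descriptive text.
    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    vectors = {key: vectorise(text) for key, text in materials.items()}
    ids = sorted(vectors)
    pairs = []
    for index, first in enumerate(ids):
        for second in ids[index + 1:]:
            score = cosine_similarity(vectors[first], vectors[second])
            if score >= threshold:
                pairs.append((first, second, score))
    # Highest-scoring pairs first, as likely duplicate candidates.
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Hypothetical records, as they might appear in two different registries.
materials = {
    "registry-a/1": "Introduction to RNA-seq data analysis with Galaxy",
    "registry-b/7": "RNA-seq data analysis with Galaxy: an introduction",
    "registry-b/9": "Basics of research data management",
}

for first, second, score in related_pairs(materials):
    print(f"{first} ~ {second}: {score:.3f}")
```

In this toy data only the two RNA-seq records pass the threshold, which is the kind of signal a registry could use to flag related or duplicated entries for review. A real pipeline would likely swap `vectorise` for dense embeddings while keeping the comparison step the same.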
