Box Embeddings for Extending Ontologies: A Data-Driven and Interpretable Approach
Abstract
Deriving symbolic knowledge from trained deep learning models is challenging because these models lack transparency. A promising way to address this issue is to couple a semantic structure with the predictions, making the outcomes interpretable. In prediction tasks such as multi-label classification, labels tend to form hierarchical relationships. We therefore propose enforcing a taxonomic structure on the model's outputs throughout the training phase. In vector space, a taxonomy can be represented by axis-aligned hyperrectangles, or boxes, which may overlap or nest within one another; the boundaries of a box determine the extent of a particular category. We thus use box-shaped embeddings of ontology classes to learn, and transparently represent, logical relations that are only implicit in multi-label datasets. We assess our model by measuring how well it approximates the deductive closure of subsumption relations in the ChEBI ontology, a widely used knowledge base in chemistry. We demonstrate that our model captures implicit hierarchical relationships among labels, ensuring consistency with the underlying ontological conceptualization, while also achieving state-of-the-art performance in multi-label classification. Notably, this is accomplished without requiring an explicit taxonomy during training.

Scientific contribution: Our approach advances chemical classification by enabling interpretable outputs through a structured and geometrically expressive representation of molecules and their classes.
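To make the geometric idea concrete, the sketch below illustrates how a class can be represented as an axis-aligned box and how nesting between two boxes can be scored. It is a minimal illustration under assumed design choices (a PyTorch module, softplus-parameterized box sizes, a volume-based containment score), not the authors' implementation; names such as BoxEmbedding and containment_score are hypothetical.

```python
import torch

class BoxEmbedding(torch.nn.Module):
    """One axis-aligned box per class: a lower corner plus a non-negative extent."""
    def __init__(self, num_classes: int, dim: int):
        super().__init__()
        self.lower = torch.nn.Parameter(torch.rand(num_classes, dim))
        # Softplus keeps box extents positive throughout training.
        self.size_raw = torch.nn.Parameter(torch.zeros(num_classes, dim))

    def corners(self):
        lower = self.lower
        upper = lower + torch.nn.functional.softplus(self.size_raw)
        return lower, upper

def containment_score(child_lo, child_hi, parent_lo, parent_hi):
    """Fraction of the child box's volume lying inside the parent box.
    A value near 1 indicates the subsumption child <= parent."""
    inter_lo = torch.maximum(child_lo, parent_lo)
    inter_hi = torch.minimum(child_hi, parent_hi)
    inter_vol = torch.clamp(inter_hi - inter_lo, min=0).prod(dim=-1)
    child_vol = torch.clamp(child_hi - child_lo, min=1e-9).prod(dim=-1)
    return inter_vol / child_vol

# Illustrative usage: does the box for class 0 (e.g. a specific chemical class)
# nest inside the box for class 1 (e.g. its putative superclass)?
boxes = BoxEmbedding(num_classes=2, dim=8)
lo, hi = boxes.corners()
score = containment_score(lo[0], hi[0], lo[1], hi[1])
print(float(score))  # approaches 1.0 once training nests box 0 inside box 1
```

A score of this kind can be thresholded after training to read off subsumption relations directly from the learned geometry, which is what makes the box representation interpretable compared with unconstrained point embeddings.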