Research on a Denoising Model for Entity-Relation Extraction Using Hierarchical Contrastive Learning with Distant Supervision

Abstract

Distant supervision automatically labels text samples by aligning them with a knowledge base, which makes it possible to create labeled data at large scale. In practice, however, imperfect alignment between the text and the knowledge base produces noisy labels that degrade model performance. In relation extraction, this noise not only lowers extraction accuracy but can also bias the model toward common relations at the expense of long-tail relations. To address these issues, this paper proposes a hierarchical contrastive learning framework and applies it to the Uyghur language, using the pre-trained XLM and CINO models for minority-language modeling. The framework integrates global structural information with local fine-grained interactions to reduce noise within sentences. Specifically, a three-layer learning architecture incorporates interactions at different levels and uses a multi-head self-attention mechanism to generate denoised, context-aware representations, a process referred to as multi-granular re-contextualization. In addition, a data augmentation strategy based on dynamic gradient adversarial perturbation supplies pseudo-positive samples for contrastive learning, further improving the model's ability to recognize rare relations. Experimental results show that the proposed framework significantly improves accuracy and robustness on Uyghur relation extraction. The work offers a new perspective and methodology for relation extraction under distant supervision.
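
To make the idea of gradient-based pseudo-positive samples concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes an FGM-style perturbation (a fixed-size step along the normalized loss gradient) and a standard InfoNCE contrastive loss; the abstract does not specify either choice, and the "dynamic" scheduling of the perturbation is not modeled. A linear softmax head stands in for the actual XLM/CINO encoder, and all names (`adversarial_positive`, `info_nce`, `epsilon`, `temperature`) are hypothetical.

```python
import numpy as np

def log_softmax(z):
    # Numerically stable log-softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def adversarial_positive(x, y_onehot, W, epsilon=0.1):
    """Assumed FGM-style augmentation: move each embedding a distance
    epsilon along the gradient of the cross-entropy loss w.r.t. x."""
    probs = np.exp(log_softmax(x @ W))      # (batch, n_relations)
    grad = (probs - y_onehot) @ W.T         # d(loss)/dx for a linear softmax head
    norm = np.linalg.norm(grad, axis=-1, keepdims=True) + 1e-12
    return x + epsilon * grad / norm

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE: each anchor should match its own positive (the diagonal)
    against the other positives in the batch."""
    a = anchors / np.linalg.norm(anchors, axis=-1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=-1, keepdims=True)
    logits = a @ p.T / temperature          # (batch, batch) cosine similarities
    return -np.mean(np.diag(log_softmax(logits)))

# Toy data: 4 sentence embeddings of size 8, 3 relation classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
W = rng.normal(size=(8, 3))
y = np.eye(3)[[0, 1, 2, 0]]

x_adv = adversarial_positive(x, y, W)       # pseudo-positive samples
loss = info_nce(x, x_adv)                   # contrastive loss on (anchor, pseudo-positive) pairs
```

In this sketch the perturbed embedding stays within distance `epsilon` of its anchor, so it can serve as a positive pair while still being a harder example for the classifier than the anchor itself.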
