TMSFE: A Transformer-Based Multi-Label Semantic Feature Extraction Method


Abstract

Multi-label text classification is a critical task in natural language processing in which each document may belong to multiple categories. The setting is challenging because it involves complex label dependencies and requires extracting fine-grained semantic features for each label. We propose a novel Transformer-based algorithm, TMSFE (Transformer-based Multi-label Semantic Feature Extraction), which integrates label-specific query embeddings with a multi-head attention mechanism to extract discriminative features for each candidate label, and which leverages a latent semantic space to improve the efficiency of feature extraction. Unlike conventional single-label classifiers or flat multi-label methods, the proposed model uses a DeBERTaV3-based Transformer encoder to jointly model document and label semantics. In addition, a SimCSE-based module projects text and label representations into a shared latent semantic space, further improving feature extraction efficiency, and a sigmoid-based multi-label classification head scores the extracted features. Experimental results show that TMSFE consistently outperforms baseline models, achieving lower Hamming loss and higher feature extraction accuracy.
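The core mechanism described above, per-label query embeddings attending over encoder outputs, followed by a sigmoid head, can be sketched as below. This is a minimal illustration assumed from the abstract alone, not the authors' implementation: the class name `LabelAttentionHead` and all hyperparameters are hypothetical, and a random tensor stands in for the DeBERTaV3 token representations.

```python
import torch
import torch.nn as nn

class LabelAttentionHead(nn.Module):
    """Sketch of label-specific feature extraction (assumed from the abstract):
    one learnable query vector per candidate label attends over the token
    states produced by a Transformer encoder (DeBERTaV3 in the paper), and a
    shared sigmoid head scores each extracted label feature independently."""

    def __init__(self, num_labels: int, hidden_dim: int, num_heads: int = 4):
        super().__init__()
        # One learnable query embedding per candidate label.
        self.label_queries = nn.Parameter(torch.randn(num_labels, hidden_dim))
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, 1)

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, hidden) encoder output.
        batch = token_states.size(0)
        queries = self.label_queries.unsqueeze(0).expand(batch, -1, -1)
        # Each label query pools its own discriminative feature from the tokens.
        label_feats, _ = self.attn(queries, token_states, token_states)
        logits = self.classifier(label_feats).squeeze(-1)  # (batch, num_labels)
        return torch.sigmoid(logits)  # independent per-label probabilities

# Usage with dummy encoder output standing in for DeBERTaV3 hidden states.
head = LabelAttentionHead(num_labels=5, hidden_dim=32)
probs = head(torch.randn(2, 16, 32))
print(probs.shape)  # torch.Size([2, 5])
```

Because each label has its own query, the attention output is a per-label feature matrix rather than a single pooled vector, which is what distinguishes this design from a flat multi-label classifier over a shared document embedding.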
