Exploring Semanticity-Based Clustering of Text Using Transformer Models: Advancing AI Applications in Education and Beyond

Abstract

This study explores semantic clustering of text using transformer models to overcome the limitations of traditional text clustering approaches. While conventional methods rely on word frequency, this research leverages the contextual understanding capabilities of BERT and SciBERT for more nuanced text organization. The methodology combines transformer-based semantic embeddings with various pooling strategies and clustering algorithms, comparing their performance against TF-IDF baselines. Experiments spanned five diverse domains: news, research papers, e-commerce products, movies, and job postings. Transformer-based embeddings with CLS pooling consistently outperformed traditional methods, producing more coherent clusters across all domains, and SciBERT proved particularly useful for scientific text. These findings suggest applications in personalized learning systems, content organization, and recommender systems where semantic interpretation is critical. The research provides a framework for developing text clustering solutions better suited to capturing contextual relationships and semantic intricacies in complex document collections.
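As a rough illustration of the pipeline the abstract describes, the sketch below embeds a few texts with CLS-pooled BERT vectors, builds a TF-IDF baseline, and clusters both with k-means. It is not the authors' implementation: the model name, cluster count, sample texts, and the use of silhouette score as the cohesion metric are all assumptions made for this example.

```python
# Minimal sketch (assumed details, not the paper's code): compare
# BERT CLS-pooled embeddings against a TF-IDF baseline for clustering.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import silhouette_score

# Toy corpus, invented for illustration only.
texts = [
    "The central bank raised interest rates again this quarter.",
    "Stock markets rallied after the latest inflation report.",
    "A new transformer architecture improves language understanding.",
    "Researchers fine-tune BERT for scientific document retrieval.",
]

# CLS pooling: use the hidden state of the [CLS] token as a
# fixed-size contextual representation of each text.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
with torch.no_grad():
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    cls_embeddings = model(**batch).last_hidden_state[:, 0, :].numpy()

# TF-IDF baseline: sparse word-frequency vectors with no context.
tfidf_vectors = TfidfVectorizer().fit_transform(texts).toarray()

# Cluster both representations and report cluster cohesion.
n_clusters = 2  # assumed for this toy corpus
for name, vectors in [("BERT-CLS", cls_embeddings), ("TF-IDF", tfidf_vectors)]:
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(vectors)
    print(name, labels, silhouette_score(vectors, labels))
```

Swapping `bert-base-uncased` for a SciBERT checkpoint (e.g. one published on the Hugging Face hub) would follow the paper's suggestion for scientific text; the clustering step is unchanged.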
