A Method for Detecting Unknown Insights from Text Using Semantic Similarity: A Deep Learning Approach

JingHao MA


Abstract

This study explores an efficient method for extracting novel insights from free-text data using Semantic Textual Similarity (STS). With the growing use of social media platforms such as X (formerly Twitter), large-scale free-text data has become a valuable resource for real-time analysis in many fields. Manual classification struggles with such large and diverse datasets, which calls for computational approaches based on Natural Language Processing (NLP). The proposed method uses STS to prioritize texts for analysis by measuring the semantic similarity between existing insights and new data, streamlining the discovery of previously unobserved insights. Evaluations on two datasets, one on public opinion about a film company and another on microaggressions experienced by Chinese students in Japan, demonstrated the model's effectiveness. The results show that the STS-based approach significantly outperforms random sampling in detecting novel insights efficiently, even on multilingual and small datasets.
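The prioritization idea in the abstract can be illustrated with a minimal sketch. It assumes that a new text is more likely to carry a novel insight when its highest similarity to any existing insight is low, so such texts are reviewed first; the abstract does not spell out the exact ranking rule. The similarity function here is a toy bag-of-words cosine, used only so the example runs without external dependencies. It is a stand-in for a learned STS model, not the model used in the study. The names `embed`, `cosine`, and `rank_by_novelty` and the sample sentences are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence encoder: lowercase word counts.
    (Assumption: a real pipeline would use a learned STS model here.)"""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_novelty(known_insights, new_texts):
    """Return (text, max_similarity) pairs, least similar to known insights first."""
    known_vecs = [embed(k) for k in known_insights]
    scored = []
    for text in new_texts:
        vec = embed(text)
        # A text's score is its similarity to the closest known insight.
        max_sim = max((cosine(vec, k) for k in known_vecs), default=0.0)
        scored.append((text, max_sim))
    # Ascending order: texts unlike every known insight are reviewed first.
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical example data, not taken from the study's datasets.
known = ["the ticket prices are too high", "the new film has great visuals"]
incoming = [
    "ticket prices are far too high",
    "the visuals in the new film are great",
    "their streaming app keeps crashing",
]
ranking = rank_by_novelty(known, incoming)
for text, score in ranking:
    print(f"{score:.2f}  {text}")
```

In this toy run the sentence about the streaming app shares no words with either known insight, so it is ranked first for review. A reviewer working down such a list would, under the stated assumption, meet unfamiliar content earlier than with random sampling.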

Version published to 10.31234/osf.io/hq8sd_v1 on OSF Preprints
Aug 21, 2025

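Related articles

The Empirical Connections among the Origin of the Bagua, Ancient Astronomy, and the Basuo Text, as Seen from the "Basuo-Patterned Pottery Plate" of the Liulinxi Site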

This article has 1 author:
1. songping zeng
This article has no evaluations
Latest version Jan 6, 2026

Cognitive Discourse Analysis can be up-scaled using Sentiment Analysis

This article has 3 authors:
1. Leena Sarah Farhat
2. Simon Willcock
3. William John Teahan
This article has no evaluations
Latest version Jan 12, 2026

Multi-Scale Computational Analysis of Wikipedia’s Telling of Global History

This article has 7 authors:
1. Steph Buongiorno
2. Jo Guldi
3. Marnie Hughes-Warrington
4. Nan Jiang
5. Rosie Larson
6. Sohan Bellam
7. Gregory J. Palermo
This article has no evaluations
Latest version Jan 19, 2026
