PruneBERT: Context-Aware Sentence Classification through Statistical Relevance Pruning
Abstract
Traditional grading mechanisms are time-consuming, prone to error, and sometimes even biased. To enable grading that is fast, largely error-free, and unbiased, AI-based Automated Grading Systems (AGSs) are the go-to technology, and they can be expected to streamline the workload of instructors in higher education. A predominant source of delay and error in both manual grading and AGSs is the presence of irrelevant information within an assignment/answer sheet. This points to a newly identified, major limitation of existing AGSs: their inability to recognize unimportant sentences when grading student responses. In this paper, we leverage contextual embeddings from Sentence-BERT (all-mpnet-base-v2) together with semantic representations from BERT to introduce PruneBERT, a novel dual-embedding framework. Our method captures both inter-sentence coherence and fine-grained semantic distinctions to classify irrelevant sentences in domain-specific texts, enhancing the precision of AGSs. Central to PruneBERT is an adaptive thresholding mechanism that dynamically adjusts similarity cutoffs based on statistical properties of the cosine similarity distribution, enabling robust irrelevance filtering across diverse textual inputs. Evaluated on a curated corpus of approximately 3,000 sentences from computer science domains, PruneBERT achieves a relative improvement of 40% in F1 score over conventional threshold-based and single-embedding baselines. The approach offers a lightweight, interpretable, and computationally efficient alternative to large language model inference, making it well-suited for scalable applications in automated grading, summarization, and domain-aware content filtering.
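As a rough illustration of the kind of statistical relevance pruning the abstract describes, the sketch below scores each sentence of a response by its mean cosine similarity to the other sentences and prunes those falling below a distribution-derived cutoff. The specific cutoff rule (mean minus k standard deviations), the prune_irrelevant helper, and the single-embedding setup are assumptions made here for illustration; the paper's actual dual-embedding procedure is not specified in the abstract.

```python
# Hypothetical sketch of adaptive, statistics-based relevance pruning.
# Assumption: the cutoff is derived from the mean and standard deviation
# of the per-sentence cosine-similarity scores; the paper may use a
# different statistic.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-mpnet-base-v2")

def prune_irrelevant(sentences, k=1.0):
    """Keep sentences whose mean similarity to the rest of the response
    clears an adaptive, distribution-derived threshold (assumed rule)."""
    emb = model.encode(sentences, normalize_embeddings=True)
    sim = emb @ emb.T                    # pairwise cosine similarities
    np.fill_diagonal(sim, np.nan)        # ignore self-similarity
    scores = np.nanmean(sim, axis=1)     # per-sentence relevance score
    threshold = scores.mean() - k * scores.std()  # adaptive cutoff
    return [s for s, sc in zip(sentences, scores) if sc >= threshold]

answer = [
    "A binary search tree keeps keys ordered for efficient lookup.",
    "Each node's left subtree holds smaller keys, the right larger ones.",
    "By the way, I really enjoyed this course.",
]
print(prune_irrelevant(answer))  # the off-topic sentence should be dropped
```

The parameter k controls how aggressively low-similarity sentences are pruned; tuning it per corpus is one plausible way such a mechanism could adapt its cutoff across diverse textual inputs.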