A Method for Identifying Predatory Journals Driven by Large Language Models
Abstract
This study investigates whether fine-tuned language models can identify predatory journals and seeks the most effective fine-tuning strategy that remains feasible under comparable practical constraints. We use Low-Rank Adaptation (LoRA) to perform supervised instruction fine-tuning of open-source distilled models (such as DeepSeek-R1-Distill-Qwen-1.5B) under several strategies. For comparison, three machine learning classifiers and an API-based general-purpose large language model were evaluated alongside the fine-tuned models, contrasting fine-tuned models, traditional classifiers, and non-fine-tuned large models. The results show that a 1.5B model fine-tuned on 398 structured samples outperformed the non-fine-tuned general-purpose models on this task, reaching an accuracy of 76%; a 7B model fine-tuned with the same strategy reached 92%. The comparison indicates that fine-tuning improves the domain-specific performance of distilled models, and that increasing the parameter scale of the base model substantially improves the performance of its fine-tuned version on the task.
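As an illustration of the setup the abstract describes, the sketch below shows how LoRA adapters might be attached to the named 1.5B distilled model using the Hugging Face transformers and peft libraries. The adapter rank, scaling factor, target modules, and the instruction template are assumptions for illustration only; the paper's actual hyperparameters and its 398 structured samples are not given in this abstract.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Base model named in the abstract; all hyperparameters below are
# illustrative assumptions, not the study's reported configuration.
model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Attach low-rank adapters to the attention projections; only these
# small adapter matrices are trained while the base weights stay frozen.
lora_config = LoraConfig(
    r=8,                    # assumed adapter rank
    lora_alpha=16,          # assumed scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of base parameters

# Hypothetical instruction-style training sample; the format of the
# paper's structured samples is an assumption.
sample = {
    "instruction": "Decide whether the following journal is predatory. "
                   "Answer 'predatory' or 'legitimate'.",
    "input": "Journal name: ...; APC policy: ...; indexing claims: ...; peer review: ...",
    "output": "predatory",
}
text = f"{sample['instruction']}\n{sample['input']}\n{sample['output']}"
batch = tokenizer(text, return_tensors="pt")
loss = model(**batch, labels=batch["input_ids"]).loss  # standard causal-LM loss
```

In practice the single forward pass at the end would be replaced by a full training loop (for example, a Trainer or SFTTrainer run) over all structured samples, after which only the small adapter weights need to be saved.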