Comparative Analysis of Automatic Literature Review Using Mistral Large Language Model and Human Reviewers

Abstract

This study evaluates the effectiveness of the Mistral Large Language Model (LLM), enhanced with Retrieval-Augmented Generation (RAG), in automating literature reviews, comparing its performance with traditional human-led review processes. Through a systematic analysis of 50 scientific papers from the OpenReview platform, the study assesses the model's efficiency, scalability, and review quality, including coherence, relevance, and analytical depth. The findings indicate that while the Mistral LLM substantially surpasses human reviewers in efficiency and scalability, it occasionally lacks the analytical depth and attention to detail that characterize human reviews. Despite these limitations, the model shows considerable potential for standardizing preliminary literature reviews, suggesting a hybrid approach in which Mistral's capabilities are combined with human expertise. The study underscores the need for further advances in AI technology to achieve deeper analytical insight and highlights the importance of addressing ethical concerns and biases in AI-assisted research. Integrating LLMs such as Mistral offers a promising avenue for redefining academic research methodologies, pointing towards a future in which AI and human intelligence collaborate to advance scholarly discourse.
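To make the RAG-enhanced review setup concrete, the sketch below shows one minimal way such a pipeline could be wired together. It is an illustration under stated assumptions, not the paper's actual implementation: the `generate` callable stands in for a call to Mistral's chat-completion API, the retrieval query string is hypothetical, and TF-IDF similarity is used as a simple stand-in for the dense retriever a production RAG system would typically employ.

```python
# Minimal sketch of a RAG-style review pipeline (illustrative, not the
# authors' code). `generate` is a hypothetical wrapper around an LLM call.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def retrieve(query: str, passages: list[str], k: int = 3) -> list[str]:
    """Return the k passages most similar to the query.

    TF-IDF cosine similarity is a lightweight stand-in for the dense
    embedding retriever a real RAG system would use.
    """
    vec = TfidfVectorizer().fit(passages + [query])
    scores = cosine_similarity(vec.transform([query]), vec.transform(passages))[0]
    top = scores.argsort()[::-1][:k]
    return [passages[i] for i in top]


def review(paper_passages: list[str], generate) -> str:
    """Draft a preliminary review from retrieved context.

    The query string and prompt wording are assumptions for illustration.
    """
    context = "\n".join(retrieve("methodology, results, limitations", paper_passages))
    prompt = (
        "You are a scientific reviewer. Using only the excerpts below, "
        "assess coherence, relevance, and analytical depth.\n\n" + context
    )
    return generate(prompt)  # e.g., a chat-completion call to Mistral
```

In this arrangement, retrieval grounds the model's output in the paper's own text, which is one plausible way to obtain the standardization benefit the abstract describes while leaving deeper judgment to a human reviewer.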
