Can AI Conduct Research: A Pragmatic Experiment

Abstract

This study evaluated the capability of three prominent Large Language Model (LLM) AI tools—ChatGPT, Copilot, and Claude—to independently conduct a complete research process. The investigation involved generating hypothetical data (“silicon samples”), performing inductive thematic analysis, and composing research reports. Findings reveal that while AI can produce outputs resembling research products, the quality varies significantly, with outputs often lacking depth, accuracy, and synthesis. Notably, AI-generated data and analyses tend to be predictable, and issues such as hallucinated references and misquoted data underscore the necessity of human oversight. The study highlights both the potential and the current limitations of AI in autonomous research, emphasizing that human researchers remain essential for producing high-quality, impactful scholarly work.