How are doctors across specialties using commercial large language models? Insights from the Anthropic Economic Index

Izabella Mancewicz
Yufei Xu
Jeff R. Ma
Khoa N. Cao

Abstract

Background: Commercial large language models (LLMs) have demonstrated potential across a range of medical applications, yet empirical data on their real-world clinical use remain limited. Understanding the extent and nature of LLM adoption by doctors across medical specialties could reveal practical insights into the implementation of AI in healthcare.

Methods: Using data from the Anthropic Economic Index (March 2025 release), we analyzed approximately one million anonymized conversations with Claude 3.7 from February 2025, categorizing interactions by medical specialty, task intent, and type of AI interaction (augmentation vs. automation). Medical specialties and tasks were aligned with the Occupational Information Network (O*NET) database and adjusted for workforce size.

Results: Out of one million conversations, 5,998 (0.60%) were attributed to medical doctors, covering 17 specialties and 65 distinct clinical tasks. Most conversations involved diagnostics/interpretation (41.1%) or communication/consultation (27.6%), in contrast to existing literature that emphasizes administrative uses. Radiology, allergology/immunology, and pathology showed the highest absolute usage of Claude, while allergists/immunologists, pathologists, and nuclear medicine physicians had the highest utilization when adjusted for workforce size. Physicians predominantly used Claude to learn, validate, or perform iterative tasks rather than to automate tasks, with significant variation by specialty.

Conclusion: Our findings indicate that a commercial LLM is used predominantly for clinical tasks, particularly diagnostics and patient communication, rather than for administrative tasks alone. This study highlights differential adoption rates among specialties and patterns of augmentation compared to automation, marking critical areas for future AI integration and development.
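The arithmetic behind two of the reported quantities can be sketched briefly. The share calculation uses the figures stated in the abstract; the workforce adjustment is only an illustration, since the abstract does not give the exact normalization formula. The function names, the per-1,000 scaling, and the specialty counts below are all hypothetical assumptions, not values from the study.

```python
def share_percent(part: int, total: int) -> float:
    """Return part/total expressed as a percentage."""
    return 100.0 * part / total


def per_capita_rate(conversations: int, workforce: int, per: int = 1000) -> float:
    """Conversations per `per` workers (an assumed form of workforce adjustment)."""
    return per * conversations / workforce


# Figures reported in the abstract: 5,998 of about one million conversations.
physician_share = share_percent(5_998, 1_000_000)
print(f"{physician_share:.2f}%")

# Hypothetical counts, chosen only to show that a ranking by absolute usage
# can differ from a ranking by workforce-adjusted usage.
example = {
    "specialty_a": {"conversations": 900, "workforce": 30_000},
    "specialty_b": {"conversations": 300, "workforce": 5_000},
}

# Rank specialties by adjusted usage, highest first.
ranked = sorted(
    example,
    key=lambda name: per_capita_rate(**example[name]),
    reverse=True,
)
print(ranked)
```

In this made-up example, `specialty_a` has more conversations in absolute terms, but `specialty_b` ranks first once usage is divided by workforce size, which mirrors the kind of reordering the Results describe.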

Version published to 10.21203/rs.3.rs-7384730/v1 on Research Square
Mar 24, 2026

Related articles

Language-dependent variability in large language model performance on pharmaceutical knowledge tasks

This article has 7 authors:
1. Hiroto Asano
2. Yu-Shi Tian
3. Asuka Hatabu
4. Minako Ohishi
5. Kaori Fukuzawa
6. Daisuke Takaya
7. Kenji Ikeda
This article has no evaluations. Latest version: Mar 27, 2026

The Inefficacy of Artificial Intelligence Large Language Models in Healthcare: A Clinical and Statistical Perspective

This article has 4 authors:
1. Michael Williams
2. Raeed Kabir
3. Cody Taylor
4. Tariq Nakhooda
This article has no evaluations. Latest version: Apr 27, 2026
