How are doctors across specialties using commercial large language models? Insights from the Anthropic Economic Index
Abstract
Background: Commercial large language models (LLMs) have demonstrated potential across a range of medical applications, yet empirical data on their real-world clinical use remain limited. Understanding the extent and nature of LLM adoption by doctors across medical specialties could yield practical insights into the implementation of AI in healthcare.

Methods: Using data from the Anthropic Economic Index (March 2025 release), we analyzed approximately one million anonymized conversations with Claude 3.7 from February 2025, categorizing interactions by medical specialty, task intent, and type of AI interaction (augmentation vs. automation). Medical specialties and tasks were aligned with the Occupational Information Network (O*NET) database, and usage was adjusted for workforce size.

Results: Of approximately one million conversations, 5,998 (0.60%) were attributed to medical doctors, covering 17 specialties and 65 distinct clinical tasks. The majority of conversations involved diagnostics/interpretation (41.1%) and communication/consultation (27.6%), in contrast to existing literature emphasizing administrative uses. Radiology, allergology/immunology, and pathology showed the highest absolute usage of Claude, while allergists/immunologists, pathologists, and nuclear medicine physicians had the highest utilization when adjusted for workforce size. Physicians predominantly used Claude to learn, validate, or perform iterative tasks rather than to automate tasks, with significant variation by specialty.

Conclusion: Our findings indicate that a commercial LLM is used predominantly for clinical tasks, particularly diagnostics and patient communication, rather than for administrative tasks alone. This study highlights differential adoption rates among specialties and patterns of augmentation versus automation, identifying critical areas for future AI integration and development.
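The two quantities reported in the Results, the share of conversations attributed to doctors and usage adjusted for workforce size, can be illustrated with a minimal sketch. The abstract does not state the exact adjustment formula, so the per-10,000-workers rate below is an assumption, and the specialty names, conversation counts, and workforce sizes in the ranking example are hypothetical placeholders rather than figures from the study. Only the 5,998 and one million values come from the abstract.

```python
def share_percent(count, total):
    """Percentage of `total` conversations accounted for by `count`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


def per_capita_rate(conversations, workforce, per=10_000):
    """Conversations per `per` workers (assumed form of the workforce adjustment)."""
    if workforce <= 0:
        raise ValueError("workforce must be positive")
    return per * conversations / workforce


def rank_by_adjusted_usage(counts, workforce):
    """Specialty names ordered from highest to lowest workforce-adjusted rate."""
    rates = {name: per_capita_rate(counts[name], workforce[name]) for name in counts}
    return sorted(rates, key=rates.get, reverse=True)


# Figures reported in the abstract: 5,998 of about one million conversations.
physician_share = round(share_percent(5_998, 1_000_000), 2)

# Hypothetical placeholder data, NOT taken from the study.
example_counts = {"specialty_a": 300, "specialty_b": 120}
example_workforce = {"specialty_a": 30_000, "specialty_b": 4_000}

absolute_order = sorted(example_counts, key=example_counts.get, reverse=True)
adjusted_order = rank_by_adjusted_usage(example_counts, example_workforce)
```

In the placeholder data, the specialty with more conversations in absolute terms ranks lower once its larger workforce is taken into account, which is the same kind of reordering the Results describe between absolute and workforce-adjusted usage.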