Large Language Models Augment or Substitute Human Experts in Idea Screening
Abstract
Firms that use crowdsourcing to gather advertising and product ideas often rely on internal experts to manually screen thousands of submissions, a costly and time-consuming process: experts rate the ideas to identify a small set of promising ones that are then forwarded for additional review. We evaluate how large language models (LLMs), combined with a machine learning model trained on historical expert ratings and final client selections, can make this screening more efficient. Using data from a platform that engaged experts to evaluate 74,436 ideas across 153 contests for major advertisers, we show that evaluation effort can be reduced by 28.4% relative to the status quo. Of this reduction, 3.8% is directly attributable to the LLM output, while the remainder comes from better weighting expert scores to align with sponsor preferences. Notably, incorporating LLMs could make 5 out of 10 experts redundant, compared with 3 under machine learning alone. Importantly, the experts whose judgments the LLM can most readily replicate are not necessarily the poorest performers. These findings offer a practical framework for integrating LLMs into idea-screening pipelines and underscore their potential to streamline expert evaluation while maintaining alignment with client goals.
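To make the pipeline concrete, below is a minimal sketch of the kind of screening model the abstract describes: expert ratings and an LLM score per idea are combined in a classifier trained on historical client selections, and new ideas are ranked by predicted selection probability so that only a shortlist is forwarded for human review. All variable names, the synthetic data, and the choice of logistic regression are illustrative assumptions, not the paper's actual specification.

```python
# Sketch only: combines expert and LLM scores via a classifier fit to
# historical client selections, then ranks new ideas for screening.
# Data, feature names, and model choice are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical historical data: 1,000 ideas, 10 expert ratings each,
# one LLM quality score, and a binary label for client selection.
n_ideas, n_experts = 1_000, 10
expert_scores = rng.normal(size=(n_ideas, n_experts))  # standardized ratings
llm_scores = rng.normal(size=(n_ideas, 1))             # standardized LLM score
selected = (expert_scores.mean(axis=1) + llm_scores[:, 0]
            + rng.normal(size=n_ideas) > 1.5).astype(int)

# Learn weights on expert and LLM scores that align with what
# clients ultimately selected (the "better weighting" in the abstract).
X = np.hstack([expert_scores, llm_scores])
model = LogisticRegression().fit(X, selected)

# Screening new ideas: rank by predicted selection probability and
# forward only the top slice, reducing total evaluation effort.
new_X = rng.normal(size=(200, n_experts + 1))
p_selected = model.predict_proba(new_X)[:, 1]
shortlist = np.argsort(p_selected)[::-1][:20]  # top 10% forwarded
print("ideas forwarded for further review:", shortlist[:5], "...")
```

Under this framing, the learned coefficients also indicate which experts' scores contribute little once the LLM score is present, which is one plausible way redundancy of individual experts could be assessed.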