Large Language Models Augment or Substitute Human Experts in Idea Screening
Abstract
Firms that use crowdsourcing to gather advertising and product ideas often rely on internal experts to manually screen thousands of submissions, a costly and time-consuming process: experts rate the ideas to identify a small set of promising ones that are then forwarded for additional review. We evaluate how large language models (LLMs), combined with a machine learning model trained on historical expert ratings and final client selections, can make this screening more efficient. Using data from a platform that engaged experts to evaluate 74,436 ideas across 153 contests for major advertisers, we show that evaluation effort can be reduced by 28.4% relative to the status quo. Of this reduction, 3.8% is directly attributable to the LLM output, while the remainder comes from better weighting expert scores to align with sponsor preferences. Notably, incorporating LLMs could make 5 out of 10 experts redundant, compared with 3 under machine learning alone. Importantly, the experts whose judgments the LLM can most readily replicate are not necessarily the poorest performers. These findings offer a practical framework for integrating LLMs into idea-screening pipelines and underscore their potential to streamline expert evaluation while maintaining alignment with client goals.
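To make the pipeline concrete, below is a minimal sketch of the kind of screening model the abstract describes: expert ratings and an LLM score per idea are combined in a classifier trained on historical client selections, and new ideas are ranked by predicted selection probability so that only a shortlist is forwarded for human review. All variable names, the synthetic data, and the choice of logistic regression are illustrative assumptions, not the paper's actual specification.

```python
# Sketch only: combines expert and LLM scores via a classifier fit to
# historical client selections, then ranks new ideas for screening.
# Data, feature names, and model choice are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical historical data: 1,000 ideas, 10 expert ratings each,
# one LLM quality score, and a binary label for client selection.
n_ideas, n_experts = 1_000, 10
expert_scores = rng.normal(size=(n_ideas, n_experts))  # standardized ratings
llm_scores = rng.normal(size=(n_ideas, 1))             # standardized LLM score
selected = (expert_scores.mean(axis=1) + llm_scores[:, 0]
            + rng.normal(size=n_ideas) > 1.5).astype(int)

# Learn weights on expert and LLM scores that align with what
# clients ultimately selected (the "better weighting" in the abstract).
X = np.hstack([expert_scores, llm_scores])
model = LogisticRegression().fit(X, selected)

# Screening new ideas: rank by predicted selection probability and
# forward only the top slice, reducing total evaluation effort.
new_X = rng.normal(size=(200, n_experts + 1))
p_selected = model.predict_proba(new_X)[:, 1]
shortlist = np.argsort(p_selected)[::-1][:20]  # top 10% forwarded
print("ideas forwarded for further review:", shortlist[:5], "...")
```

Under this framing, the learned coefficients also indicate which experts' scores contribute little once the LLM score is present, which is one plausible way redundancy of individual experts could be assessed.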