Improving the Sample Efficiency of In-Context Learning in Large Language Models Through Meta-Level Optimization
Abstract
In-Context Learning (ICL) has emerged as a compelling paradigm for applying Large Language Models (LLMs) to downstream tasks without explicit fine-tuning. However, ICL performance is often sensitive to the selection of demonstrations, and standard pre-training objectives do not explicitly optimize for this learning paradigm. To address these limitations, we propose a novel training strategy called Meta-In-Context Learning (Meta-ICL) Training. Our approach pre-trains LLMs on a diverse set of tasks, where each training instance is formulated as an in-context learning scenario consisting of a task description, a few demonstrations, and a query. Through extensive experiments on seven Natural Language Understanding benchmarks, we demonstrate that LLMs trained with our Meta-ICL strategy significantly outperform baseline methods, including sophisticated demonstration selection techniques. Our findings highlight the benefits of explicitly training LLMs to learn from context, paving the way for more robust in-context learning capabilities.
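The abstract describes each Meta-ICL training instance as a task description, a few demonstrations, and a query, but does not give the exact prompt template. The sketch below is a minimal, assumed illustration of how such an instance might be assembled; the function name `build_icl_instance`, the "Input:/Output:" markers, and the example data are all hypothetical, not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ICLInstance:
    """One Meta-ICL training instance: a prompt built from a task
    description, a few demonstrations, and a query, plus the target."""
    prompt: str
    target: str


def build_icl_instance(task_description: str,
                       demonstrations: List[Tuple[str, str]],
                       query: str,
                       target: str) -> ICLInstance:
    # Concatenate the task description, the demonstrations, and the query
    # into a single context, as described in the abstract. The separators
    # and "Input:"/"Output:" markers are assumptions for illustration.
    parts = [task_description.strip()]
    for demo_input, demo_output in demonstrations:
        parts.append(f"Input: {demo_input}\nOutput: {demo_output}")
    parts.append(f"Input: {query}\nOutput:")
    return ICLInstance(prompt="\n\n".join(parts), target=f" {target}")


# Illustrative usage on a sentiment-classification task (made-up data):
instance = build_icl_instance(
    task_description="Classify the sentiment of each movie review as positive or negative.",
    demonstrations=[("A delightful, heartfelt film.", "positive"),
                    ("Two hours I will never get back.", "negative")],
    query="The plot was thin but the acting was superb.",
    target="positive",
)
print(instance.prompt)
# During Meta-ICL training, the language-modeling loss would presumably be
# computed on `instance.target` given `instance.prompt`, across a diverse
# mixture of such tasks.
```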