Task-Conditioned Representation Adaptation for Many-Shot In-Context Learning via Continued Pretraining
Abstract
While continued pretraining has been shown to improve many-shot in-context learning (ICL) in large language models, the representation dynamics that support this task generalization remain insufficiently explored. This paper introduces a task-conditioned continued pretraining strategy that enhances many-shot ICL by explicitly disentangling task-specific and task-invariant representations during pretraining. The method augments the standard language modeling objective with lightweight task-conditioning signals derived from latent task clusters, which are inferred via contrastive embedding similarity. The model is pretrained on a 150-billion-token mixed-domain corpus spanning over 3,200 instruction-defined tasks, with each training sequence incorporating up to 128 demonstrations. Empirical evaluations across arithmetic reasoning, code generation, and information extraction tasks indicate that the proposed approach yields consistent gains in many-shot ICL performance, with accuracy improvements of up to 9.3% over baseline continued pretraining. Representation probing further shows a 17% increase in task-separability scores while preserving general linguistic coherence. These findings suggest that task-conditioned representation adaptation during continued pretraining offers a scalable and data-efficient pathway to improving many-shot ICL across heterogeneous task distributions.
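The abstract does not specify the functional form of the task-conditioning signal. As a minimal sketch, assuming an InfoNCE-style contrastive term over sequence embeddings and inferred cluster centroids (the symbols z_i, c_k, K, lambda, and tau below are illustrative and not taken from the paper), the combined objective could be written as:

% Hypothetical formulation; the InfoNCE form of the conditioning term is an
% assumption, not the paper's stated objective.
\mathcal{L} = \mathcal{L}_{\mathrm{LM}} + \lambda \, \mathcal{L}_{\mathrm{task}},
\qquad
\mathcal{L}_{\mathrm{task}} = -\log \frac{\exp\!\left(\mathrm{sim}(z_i, c_{k(i)}) / \tau\right)}{\sum_{k'=1}^{K} \exp\!\left(\mathrm{sim}(z_i, c_{k'}) / \tau\right)}

where \mathcal{L}_{\mathrm{LM}} is the standard next-token prediction loss, z_i is the embedding of training sequence i, c_{k(i)} is the centroid of its inferred task cluster among K clusters, \mathrm{sim} denotes cosine similarity, \tau is a temperature, and \lambda weights the conditioning term. Under this reading, the contrastive term pulls each sequence toward its own task cluster (sharpening task-specific structure, consistent with the reported gain in task-separability scores), while the shared language modeling loss preserves task-invariant linguistic competence.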