Unified Instruction Encoding and Gradient Coordination for Multi-Task Language Models

Abstract

This paper presents an in-depth study of the theoretical foundations and optimization mechanisms of instruction tuning for multi-task generalization. We propose a unified, parameter-efficient tuning framework that integrates instruction embedding modeling, task similarity regularization, and gradient alignment, with the goal of enhancing the generalization and robustness of large language models under complex task combinations. Current instruction tuning methods often suffer from representation shift, objective conflict, and gradient interference when handling heterogeneous tasks. To address these issues, we develop a systematic solution spanning both structural design and optimization. Methodologically, we introduce semantically aligned instruction encodings to improve representation consistency across tasks. During optimization, we apply gradient projection to reduce conflicts between inter-task updates and adopt a dynamic weighting strategy based on gradient variation to improve training stability and coordination. On the theoretical side, we derive an upper bound on the generalization error based on Rademacher complexity and the KL divergence of the distribution shift, providing a formal characterization of the performance limits of multi-task instruction tuning. We conduct a series of experiments on the Super-NaturalInstructions dataset, covering different instruction formulations, generalization to unseen tasks, and robustness under task combinations. Results show that the proposed method outperforms baselines on key metrics, confirming its effectiveness in improving generalization under high task heterogeneity and in reducing the risk of conflict during cross-task learning.
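The abstract does not give the exact projection rule or weighting formula. As a concrete reference point, the sketch below shows a minimal gradient-projection step in the spirit of PCGrad (Yu et al., 2020), where conflicting components of per-task gradients are removed before averaging, together with a hypothetical variance-based weighting rule; both are illustrative assumptions, not the authors' implementation.

```python
import torch

def pcgrad_combine(task_grads):
    """Combine per-task gradients with a PCGrad-style projection:
    when two task gradients conflict (negative inner product),
    the conflicting component is projected out before averaging.
    `task_grads` is a list of flattened 1-D gradient tensors."""
    projected = [g.clone() for g in task_grads]
    for i, g_i in enumerate(projected):
        for j, g_j in enumerate(task_grads):
            if i == j:
                continue
            dot = torch.dot(g_i, g_j)
            if dot < 0:  # directions conflict
                g_i.sub_(dot / (g_j.norm() ** 2 + 1e-12) * g_j)
    return torch.stack(projected).mean(dim=0)

def variance_based_weights(grad_norm_history, eps=1e-8):
    """Hypothetical dynamic weighting rule (assumption): tasks whose
    gradient norms fluctuate more over recent steps receive smaller
    weights. `grad_norm_history` holds, per task, a list of recent
    gradient-norm values."""
    variances = torch.tensor(
        [float(torch.tensor(h).var()) for h in grad_norm_history]
    )
    inv = 1.0 / (variances + eps)
    return inv / inv.sum()

# Example: three synthetic task gradients over a 10-dim parameter vector.
grads = [torch.randn(10) for _ in range(3)]
update = pcgrad_combine(grads)
weights = variance_based_weights([[1.0, 1.2, 0.9], [2.0, 2.1, 2.0], [0.5, 1.5, 1.0]])
```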
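The abstract likewise does not state the generalization bound itself. One standard shape that combines the two quantities it names, shown purely as an illustration for a loss bounded in $[0,1]$ (the KL term enters via Pinsker's inequality; the paper's actual theorem may differ in constants and definitions), is:

```latex
\mathcal{E}_{Q}(h) \;\le\; \hat{\mathcal{E}}_{P}(h)
  \;+\; 2\,\mathfrak{R}_n(\mathcal{H})
  \;+\; \sqrt{\tfrac{1}{2}\,\mathrm{KL}(Q \,\|\, P)}
  \;+\; 3\sqrt{\frac{\log(2/\delta)}{2n}}
```

Here $P$ is the training task distribution, $Q$ the shifted target distribution, $\hat{\mathcal{E}}_{P}(h)$ the empirical risk over $n$ samples, $\mathfrak{R}_n(\mathcal{H})$ the empirical Rademacher complexity of the hypothesis class, and $\delta$ the confidence parameter.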
