Adaptive Gradient-Norm Weighting for Improved Domain Adversarial Training
Abstract
Unsupervised domain adaptation (UDA) transfers knowledge from labeled source domains to unlabeled target domains by jointly optimizing classification and domain alignment objectives. The trade-off parameter λ controlling this balance is critical, yet most existing methods rely on fixed values or manually designed schedules that do not adapt to evolving optimization dynamics. Motivated by the theoretical coupling between source error and domain divergence in the Ben-David bound, and inspired by gradient-balancing principles from multi-task learning, we propose Adaptive Gradient-Norm Weighting (AWD), a lightweight scalarization mechanism that dynamically adjusts λ at each training iteration based on the ratio of classification and alignment gradient norms. Unlike architectural modifications, AWD operates purely at the training level and introduces no additional learnable parameters. It can be integrated into existing UDA pipelines as a drop-in replacement for static or scheduled λ settings. We evaluate AWD within two foundational UDA frameworks—Deep Adaptation Network (DAN) and Domain-Adversarial Neural Network (DANN)—to isolate the effect of adaptive scalarization. Across Office-31, Office-Home, DomainNet, and digit adaptation benchmarks, AWD consistently improves performance, yielding gains of up to +4.5% on Office-Home for DANN, while incurring only approximately 3% additional training time. These results suggest that gradient-aware adaptive weighting provides a practical and interpretable approach to balancing classification and alignment objectives, and may serve as a general training-level mechanism applicable across a broad range of UDA architectures.
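The abstract does not give the exact update rule, but the described mechanism — rescaling the alignment objective so its gradient magnitude tracks that of the classification objective — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the ratio formula, the `eps` stabilizer, and the toy quadratic gradients are all assumptions for demonstration.

```python
import numpy as np

def adaptive_lambda(grad_cls, grad_align, eps=1e-8):
    """Hypothetical AWD-style weight: ratio of the classification
    gradient norm to the alignment gradient norm, so the weighted
    alignment term contributes a gradient of comparable magnitude.
    `eps` guards against division by zero early in training."""
    return float(np.linalg.norm(grad_cls) / (np.linalg.norm(grad_align) + eps))

# Toy iteration: analytic gradients of two quadratic objectives
# with respect to shared parameters w (stand-ins for the feature
# extractor's classification and domain-alignment gradients).
w = np.array([1.0, -2.0])
grad_cls = 2.0 * w          # gradient of ||w||^2
grad_align = 0.5 * w        # gradient of 0.25 * ||w||^2
lam = adaptive_lambda(grad_cls, grad_align)   # here: 4.0
total_grad = grad_cls + lam * grad_align      # combined update direction
```

Because λ is recomputed from the two gradient norms at every iteration, it adds no learnable parameters and costs only one extra backward-pass bookkeeping step, consistent with the roughly 3% training-time overhead reported above.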