Improving Multi-task Learning via Seeking Task-based Flat Regions

Phan, Hoang; Tran, Lam; Tran, Quyen; Tran, Ngoc N.; Truong, Tuan; Ho, Nhat; Phung, Dinh; Le, Trung


arXiv:2211.13723 (cs)

[Submitted on 24 Nov 2022 (v1), last revised 19 Nov 2024 (this version, v3)]



Abstract: Multi-Task Learning (MTL) is a widely used and powerful paradigm for training deep neural networks that allows a single backbone to learn more than one objective. Compared to training tasks separately, MTL significantly reduces computational costs, improves data efficiency, and can enhance model performance by leveraging knowledge across tasks. Hence, it has been adopted in a variety of applications, ranging from computer vision to natural language processing and speech recognition. Within MTL, an emerging line of work focuses on manipulating the task gradients to derive a single gradient descent direction that benefits all tasks. Despite achieving impressive results on many benchmarks, directly applying these approaches without appropriate regularization might lead to suboptimal solutions on real-world problems. In particular, standard training that minimizes the empirical loss on the training data can easily overfit to low-resource tasks or be spoiled by tasks with noisy labels, which can cause negative transfer between tasks and an overall performance drop. To alleviate such problems, we propose to leverage a recently introduced training method, Sharpness-aware Minimization, which can enhance model generalization in single-task learning. Accordingly, we present a novel MTL training methodology that encourages the model to find task-based flat minima, coherently improving its generalization capability on all tasks. Finally, we conduct comprehensive experiments on a variety of applications to demonstrate the merit our proposed approach brings to existing gradient-based MTL methods, as suggested by our developed theory.
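To make the idea of seeking flat regions per task more concrete, here is a minimal sketch in plain Python. It is not the authors' algorithm. It assumes the generic Sharpness-aware Minimization step (evaluate each task's gradient at a point perturbed by a radius `rho` along the normalized gradient) and aggregates the task gradients by plain averaging, whereas the abstract refers to more elaborate gradient-manipulation methods. The function names (`sam_gradient`, `mtl_sam_step`, `make_quadratic_grad`), the toy quadratic task losses, and the values of `rho` and `lr` are all hypothetical choices made for illustration.

```python
import math


def sam_gradient(grad_fn, theta, rho=0.05):
    """Generic SAM-style gradient: the gradient evaluated at theta + eps,
    where eps has norm rho and points along the current gradient."""
    g = grad_fn(theta)
    norm = math.sqrt(sum(x * x for x in g))
    if norm == 0.0:
        # No ascent direction is defined at a stationary point.
        return g
    eps = [rho * x / norm for x in g]
    perturbed = [t + e for t, e in zip(theta, eps)]
    return grad_fn(perturbed)


def make_quadratic_grad(center, scale):
    """Gradient of the toy task loss 0.5 * scale * ||theta - center||^2."""
    def grad(theta):
        return [scale * (t - c) for t, c in zip(theta, center)]
    return grad


def mtl_sam_step(task_grad_fns, theta, lr=0.1, rho=0.05):
    """One update of the shared parameters: compute a SAM-style gradient
    per task, then average them (an assumed, simplest-possible aggregation)."""
    task_grads = [sam_gradient(fn, theta, rho) for fn in task_grad_fns]
    n_tasks = len(task_grads)
    avg = [sum(g[i] for g in task_grads) / n_tasks for i in range(len(theta))]
    return [t - lr * a for t, a in zip(theta, avg)]


# Two toy tasks sharing a 2-D parameter vector, with different curvature.
tasks = [
    make_quadratic_grad([1.0, 0.0], 1.0),
    make_quadratic_grad([0.0, 1.0], 4.0),
]
theta = [2.0, 2.0]
for _ in range(200):
    theta = mtl_sam_step(tasks, theta)
print(theta)
```

In this toy setting the perturbation is computed separately for each task, so a task with a sharper loss (larger `scale`) contributes a larger correction. Swapping the averaging step for a different aggregation rule would only change the body of `mtl_sam_step`.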

Comments:	35 pages, 17 figures, 7 tables
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2211.13723 [cs.LG]
	(or arXiv:2211.13723v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2211.13723

Submission history

From: Hoang Phan
[v1] Thu, 24 Nov 2022 17:19:30 UTC (4,316 KB)
[v2] Fri, 29 Sep 2023 19:22:28 UTC (4,662 KB)
[v3] Tue, 19 Nov 2024 16:17:58 UTC (4,662 KB)
