Communication-Efficient Soft Actor-Critic Policy Collaboration via Regulated Segment Mixture

Yu, Xiaoxue; Li, Rongpeng; Liang, Chengchao; Zhao, Zhifeng

arXiv:2312.10123 (cs)

[Submitted on 15 Dec 2023 (v1), last revised 13 Oct 2024 (this version, v3)]

Abstract: Multi-Agent Reinforcement Learning (MARL) has emerged as a foundational approach for addressing diverse, intelligent control tasks in various scenarios like the Internet of Vehicles, Internet of Things, and Unmanned Aerial Vehicles. However, the widely assumed existence of a central node for centralized, federated learning-assisted MARL might be impractical in highly dynamic environments. This can lead to excessive communication overhead, potentially overwhelming the system. To address these challenges, we design a novel communication-efficient, fully distributed algorithm for collaborative MARL under the frameworks of Soft Actor-Critic (SAC) and Decentralized Federated Learning (DFL), named RSM-MASAC. In particular, RSM-MASAC enhances multi-agent collaboration and prioritizes higher communication efficiency in dynamic systems by incorporating the concept of segmented aggregation in DFL and augmenting multiple model replicas from received neighboring policy segments, which are subsequently employed as reconstructed referential policies for mixing. Diverging from traditional RL approaches, RSM-MASAC introduces new bounds under the framework of Maximum Entropy Reinforcement Learning (MERL). Correspondingly, it adopts a theory-guided mixture metric to regulate the selection of contributive referential policies, thus guaranteeing soft policy improvement during the communication-assisted mixing phase. Finally, extensive simulations in mixed-autonomy traffic control scenarios verify the effectiveness and superiority of our algorithm.
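
The segment-mixing idea described in the abstract can be pictured with a rough sketch. This is not the authors' implementation: the function names (`split_bounds`, `reconstruct_replica`, `regulated_mix`), the flat-list parameter layout, the random choice of donor neighbor, the mixing weight `beta`, and above all the `metric` callback are assumptions made for illustration. The paper's theory-guided mixture metric is not given in the abstract, so it appears here only as a placeholder that the caller must supply.

```python
import random
from typing import Callable, List, Sequence

Params = List[float]  # assumed layout: policy parameters flattened into one list


def split_bounds(n: int, num_segments: int) -> List[range]:
    """Split indices 0..n-1 into contiguous, near-equal segments."""
    base, extra = divmod(n, num_segments)
    bounds, start = [], 0
    for s in range(num_segments):
        size = base + (1 if s < extra else 0)
        bounds.append(range(start, start + size))
        start += size
    return bounds


def reconstruct_replica(neighbors: Sequence[Params], num_segments: int,
                        rng: random.Random) -> Params:
    """Assemble one referential policy, taking each segment from one neighbor.

    Assumption: the donor for each segment is picked uniformly at random.
    """
    n = len(neighbors[0])
    replica = [0.0] * n
    for segment in split_bounds(n, num_segments):
        donor = rng.choice(neighbors)
        for i in segment:
            replica[i] = donor[i]
    return replica


def regulated_mix(local: Params, replica: Params, beta: float,
                  metric: Callable[[Params, Params], float]) -> Params:
    """Mix the replica into the local policy only if the metric is positive.

    `metric` is a hypothetical stand-in for the paper's mixture metric.
    """
    if metric(local, replica) <= 0.0:
        return list(local)  # replica judged non-contributive: keep local policy
    return [(1.0 - beta) * l + beta * r for l, r in zip(local, replica)]


# Toy usage with two neighbors and a metric that always accepts the replica.
rng = random.Random(0)
local = [0.0, 0.0, 0.0, 0.0]
neighbors = [[1.0] * 4, [2.0] * 4]
replica = reconstruct_replica(neighbors, num_segments=2, rng=rng)
mixed = regulated_mix(local, replica, beta=0.5, metric=lambda a, b: 1.0)
```

Because each agent only needs one segment per neighbor to rebuild a full replica, the per-link payload in this sketch is roughly 1/`num_segments` of a full model, which is the kind of communication saving the abstract alludes to.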

Subjects:	Multiagent Systems (cs.MA)
Cite as:	arXiv:2312.10123 [cs.MA]
	(or arXiv:2312.10123v3 [cs.MA] for this version)
	https://doi.org/10.48550/arXiv.2312.10123

Submission history

From: Xiaoxue Yu [view email]
[v1] Fri, 15 Dec 2023 13:39:55 UTC (2,819 KB)
[v2] Wed, 10 Jul 2024 06:55:20 UTC (2,876 KB)
[v3] Sun, 13 Oct 2024 11:05:04 UTC (2,864 KB)