SMR: State Memory Replay for Long Sequence Modeling

Qi, Biqing; Gao, Junqi; Zhang, Kaiyan; Li, Dong; Liu, Jianxing; Wu, Ligang; Zhou, Bowen

arXiv:2405.17534 (cs)

[Submitted on 27 May 2024 (v1), last revised 8 Jun 2024 (this version, v2)]

Abstract: Despite the promising performance of state space models (SSMs) in long sequence modeling, limitations still exist. Although advanced SSMs like S5 and S6 (Mamba) address non-uniform sampling, their recursive structures impede efficient SSM computation via convolution. To overcome this incompatibility with parallel convolutional computation, this paper proposes a novel non-recursive strategy for processing non-uniform samples. Theoretical analysis of SSMs through the lens of Event-Triggered Control (ETC) theory reveals the Non-Stable State (NSS) problem, where deviations from sampling point requirements lead to error transmission and accumulation, causing the SSM's hidden state to diverge. Our analysis further reveals that adjusting input sequences with early memories can mitigate the NSS problem, achieving Sampling Step Adaptation (SSA). Building on this insight, we introduce a simple yet effective plug-and-play mechanism, State Memory Replay (SMR), which uses learnable memories to adjust the current state with multi-step information, so that the model generalizes to sampling points different from those in the training data. This enables SSMs to stably model varying sampling points. Experiments on long-range modeling tasks in autoregressive language modeling and Long Range Arena demonstrate the general effectiveness of the SMR mechanism for a series of SSM models.
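
The convolutional computation that the abstract refers to can be illustrated with a minimal sketch. It is not the paper's method and does not implement SMR. It assumes a scalar-state, time-invariant discrete SSM, x_k = a*x_{k-1} + b*u_k and y_k = c*x_k, with a zero initial state. The function names and the parameter values are hypothetical, chosen only for illustration. Under these assumptions the step-by-step recurrence and a causal convolution with the unrolled kernel K_j = c * a^j * b give the same outputs.

```python
def ssm_recurrent(a, b, c, u):
    """Run x_k = a*x_{k-1} + b*u_k, y_k = c*x_k step by step (initial state 0)."""
    x = 0.0
    ys = []
    for u_k in u:
        x = a * x + b * u_k
        ys.append(c * x)
    return ys


def ssm_kernel(a, b, c, length):
    """Unrolled kernel K_j = c * a**j * b; valid only if a, b, c are fixed over time."""
    return [c * (a ** j) * b for j in range(length)]


def ssm_convolutional(a, b, c, u):
    """Compute the same outputs as a causal convolution y_k = sum_j K_j * u_{k-j}."""
    kernel = ssm_kernel(a, b, c, len(u))
    return [
        sum(kernel[j] * u[k - j] for j in range(k + 1))
        for k in range(len(u))
    ]


# Illustrative values only (assumptions, not taken from the paper).
a, b, c = 0.9, 0.5, 2.0
u = [1.0, 0.5, -2.0, 0.0, 3.0]

y_rec = ssm_recurrent(a, b, c, u)
y_conv = ssm_convolutional(a, b, c, u)
max_gap = max(abs(p - q) for p, q in zip(y_rec, y_conv))
print("largest difference between the two computations:", max_gap)
```

The kernel exists here only because a, b and c stay fixed across steps. If the transition changed at every step, for example through an input-dependent step size, no single kernel would describe the whole sequence, which is one way to read the compatibility limitation the abstract describes.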

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2405.17534 [cs.LG]
	(or arXiv:2405.17534v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2405.17534
Journal reference:	Findings of the Association for Computational Linguistics, 2024

Submission history

From: Biqing Qi
[v1] Mon, 27 May 2024 17:53:32 UTC (4,131 KB)
[v2] Sat, 8 Jun 2024 16:49:00 UTC (4,122 KB)
