PRISM: Exposing and Resolving Spurious Isolation in Federated Multimodal Continual Learning

Authors: Beining Wu, Zihao Ding, Jun Huang. Published: 2026-05-01

Abstract

While current federated multimodal continual learning over mixture-of-experts low-rank adaptation (MoE-LoRA) is built on the unverified assumption that routing isolates task-specific knowledge into disjoint experts, we argue that routing operates per-sample, while forgetting accumulates across the task sequence, and gradient conflict persists within each expert even when routing is maximally polarized. Moreover, activation-subspace protection can also fail because, under parameter-efficient fine-tuning, it entangles tasks due to a dimension-counting bound, and federated averaging (FedAvg) disrupts client-side orthogonality. To address this, we propose PRISM (Per-expert Routing-projection Interference-informed Subspace Method), which maintains a per-expert gradient subspace basis whose orthogonality is preserved under FedAvg and reinterprets MoE routing as a capacity allocator.
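The per-expert subspace idea can be illustrated with a small sketch. This is not the authors' implementation: the expert names, the toy 3-dimensional gradients, and the helper `project_out` are all hypothetical, and the only assumption is that each expert keeps an orthonormal basis of directions to protect. The sketch shows a new-task gradient that conflicts with an old-task direction inside one expert, and how projecting onto the orthogonal complement of that expert's basis removes the conflict.

```python
from typing import Dict, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_out(grad: Vector, basis: List[Vector]) -> Vector:
    """Remove from grad its component along each vector of an orthonormal basis."""
    out = list(grad)
    for u in basis:
        coeff = dot(grad, u)  # valid because the basis is assumed orthonormal
        out = [o - coeff * ui for o, ui in zip(out, u)]
    return out


# Hypothetical protected directions, one orthonormal basis per expert.
expert_bases: Dict[str, List[Vector]] = {
    "expert_0": [[1.0, 0.0, 0.0]],
    "expert_1": [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
}

# Toy gradients that both land on expert_0, as if routing were fully polarized.
old_task_grad = [1.0, 0.0, 0.0]
new_task_grad = [-2.0, 1.0, 0.5]

# A negative dot product means the new update pushes against the old task.
conflict_before = dot(new_task_grad, old_task_grad)

# Project the new gradient away from expert_0's protected subspace.
safe_grad = project_out(new_task_grad, expert_bases["expert_0"])
conflict_after = dot(safe_grad, old_task_grad)

print(conflict_before, safe_grad, conflict_after)
```

In this toy setting the conflict exists even though every sample is routed to the same expert, which is the kind of within-expert interference the abstract describes. How PRISM actually builds and updates each basis is not specified in this summary.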

Key Contributions

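Exposing the problem: The first systematic demonstration of "spurious isolation" in the MoE-LoRA routing assumption: routing operates per sample, whereas forgetting accumulates across the task sequence, and gradient conflict persists inside each expert.
FedAvg breaks orthogonality: Shows that federated averaging destroys client-side orthogonality, and that activation-subspace protection fails under parameter-efficient fine-tuning because of a dimension-counting bound.
PRISM method: Maintains a per-expert gradient subspace basis whose orthogonality is preserved under FedAvg, and reinterprets MoE routing as a capacity allocator.
SOTA performance: On LLaVA-1.5-7B/13B and Qwen2.5-VL-7B, outperforms 16 state-of-the-art baselines on CoIN-6 and CoIN-Long-10.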
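The claim that averaging can break client-side orthogonality can be made concrete with a toy example. The sketch below is an illustration under stated assumptions, not the paper's proof or algorithm: the two clients, their protected directions, and the helpers `fedavg` and `project_out` are hypothetical. Each client's update is orthogonal to its own protected direction, yet the averaged update is orthogonal to neither. The second half shows one possible remedy, which relies only on the linearity of projection: if every client projects against the same shared basis, the average stays orthogonal to it. Whether PRISM uses exactly this construction is not stated in this summary.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fedavg(updates: List[Vector]) -> Vector:
    """Unweighted coordinate-wise mean of client updates."""
    n = len(updates)
    return [sum(col) / n for col in zip(*updates)]


def project_out(vec: Vector, basis: List[Vector]) -> Vector:
    """Remove from vec its component along each vector of an orthonormal basis."""
    out = list(vec)
    for u in basis:
        coeff = dot(vec, u)
        out = [o - coeff * ui for o, ui in zip(out, u)]
    return out


# Hypothetical clients: each protects a different direction.
protected_a = [1.0, 0.0, 0.0]
protected_b = [0.0, 1.0, 0.0]

# Each local update is orthogonal to that client's own protected direction.
update_a = [0.0, 1.0, 1.0]
update_b = [1.0, 0.0, 1.0]
local_ok = dot(update_a, protected_a) == 0.0 and dot(update_b, protected_b) == 0.0

# After averaging, the update leaks into both protected directions.
averaged = fedavg([update_a, update_b])
leak_a = dot(averaged, protected_a)
leak_b = dot(averaged, protected_b)

# Possible remedy (an assumption): all clients project against one shared basis.
shared_basis = [protected_a, protected_b]
averaged_shared = fedavg([
    project_out(update_a, shared_basis),
    project_out(update_b, shared_basis),
])
leak_shared = [dot(averaged_shared, u) for u in shared_basis]

print(local_ok, averaged, leak_a, leak_b, averaged_shared, leak_shared)
```

The design point is that orthogonality to a client-specific subspace is not closed under averaging across clients, while orthogonality to a common subspace is.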

Why It Matters

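This is the first work to analyze in depth the spurious routing isolation problem in federated multimodal continual learning. It finds that the "routing isolation assumption" that existing methods rely on does not hold in practice, so task knowledge is not actually stored independently in separate experts. This finding offers important guidance for designing more reliable federated continual learning systems.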

Relevance to On-Device/Mobile

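Federated learning is itself a distributed learning paradigm designed to protect user privacy, which ties it closely to on-device deployment. PRISM's reinterpretation of MoE routing (capacity allocation rather than task isolation) and its preservation of gradient-subspace orthogonality are directly valuable for running federated multimodal continual learning systems on mobile devices.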