Byzantine robustness has received significant attention recently, given its importance for distributed and federated learning. Despite this, existing algorithms exhibit severe flaws even when the data across participants is identically distributed. To address these issues, we present two surprisingly simple strategies: a new robust iterative clipping procedure, and the incorporation of worker momentum to overcome time-coupled attacks. Together, these yield the first provably robust method for the standard stochastic optimization setting.
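As a rough illustration of the two strategies named above, here is a minimal sketch of an iterative clipping aggregator and worker-side momentum. The function names, the hyperparameters (`tau`, `iters`, `beta`), and the fixed iteration count are illustrative assumptions, not the paper's specification.

```python
import numpy as np

def iterative_clipping(updates, tau=1.0, iters=3, center=None):
    """Aggregate worker updates by repeatedly clipping their deviations
    around a running center estimate (a sketch of iterative clipping;
    tau and iters are illustrative choices).

    updates: array of shape (n_workers, dim)
    """
    updates = np.asarray(updates, dtype=float)
    if center is None:
        center = np.zeros(updates.shape[1])
    for _ in range(iters):
        diffs = updates - center                 # deviations from current center
        norms = np.linalg.norm(diffs, axis=1)
        # Shrink any deviation whose norm exceeds tau down to length tau.
        scale = np.minimum(1.0, tau / np.maximum(norms, 1e-12))
        # Move the center toward the mean of the clipped deviations.
        center = center + np.mean(scale[:, None] * diffs, axis=0)
    return center

def worker_momentum(grad, state, beta=0.9):
    """Each worker sends a momentum-averaged gradient rather than the raw
    stochastic gradient; averaging over time damps time-coupled attacks
    (beta is an illustrative choice)."""
    return beta * state + (1 - beta) * grad

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    honest = rng.normal(loc=1.0, scale=0.1, size=(8, 5))  # honest workers near the true gradient
    byzantine = np.full((2, 5), 100.0)                    # Byzantine workers send large outliers
    agg = iterative_clipping(np.vstack([honest, byzantine]), tau=2.0)
    print(agg)  # stays close to the honest mean despite the outliers
```

The intuition behind the clipping step is that bounding each worker's deviation from the current center limits how far any single Byzantine worker can drag the aggregate, while momentum averages each worker's gradients over time so that an attack must stay consistent across rounds to have an effect.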