Acceleration
Convergence (economics)
Function (biology)
Algorithm
Rate of convergence
Dual (grammatical number)
Computer science
Stochastic gradient descent
Distributed algorithm
Mathematics
Mathematical optimization
Discrete mathematics
Artificial intelligence
Parallel computing
Key (lock)
Artificial neural network
Distributed computing
Literature
Art
Biology
Economics
Evolutionary biology
Computer security
Economic growth
Authors
Xinlei Yi, Shengjun Zhang, Tao Yang, Tianyou Chai, Karl Henrik Johansson
Identifiers
DOI: 10.1109/jas.2022.105554
Abstract
The distributed nonconvex optimization problem of minimizing a global cost function formed by a sum of $n$ local cost functions by using local information exchange is considered. This problem is an important component of many machine learning techniques with data parallelism, such as deep learning and federated learning. We propose a distributed primal-dual stochastic gradient descent (SGD) algorithm, suitable for arbitrarily connected communication networks and any smooth (possibly nonconvex) cost functions. We show that the proposed algorithm achieves the linear speedup convergence rate $\mathcal{O}(1/\sqrt{nT})$ for general nonconvex cost functions and the linear speedup convergence rate $\mathcal{O}(1/(nT))$ when the global cost function satisfies the Polyak-Łojasiewicz (P-Ł) condition, where $T$ is the total number of iterations. We also show that the output of the proposed algorithm with constant parameters linearly converges to a neighborhood of a global optimum. We demonstrate through numerical experiments the efficiency of our algorithm in comparison with the baseline centralized SGD and recently proposed distributed SGD algorithms.
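To make the setting concrete, the following is a minimal sketch of a generic distributed primal-dual SGD step for consensus optimization over a communication graph. It is an illustrative assumption, not the paper's exact update rule or experimental setup: the specific update form (primal variable corrected by a graph-Laplacian consensus term and a dual variable), the ring topology, the toy quadratic local costs, the helper `stochastic_grad`, and all parameter values are chosen here only for demonstration.

```python
import numpy as np

# Sketch of distributed primal-dual SGD: n agents minimize (1/n) * sum_i f_i(x)
# while exchanging information only with neighbors. Each agent i keeps a primal
# variable x_i and a dual variable v_i that accumulates consensus error through
# the graph Laplacian L. (Hypothetical illustration of the general technique.)

rng = np.random.default_rng(0)

n, d = 5, 3                          # number of agents, dimension of the decision variable
T = 2000                             # total number of iterations
eta, alpha, beta = 0.05, 1.0, 1.0    # step size and mixing parameters (assumed values)

# Ring communication graph and its Laplacian.
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

# Toy local costs f_i(x) = 0.5 * ||x - b_i||^2; the global optimum is mean(b_i).
b = rng.normal(size=(n, d))

def stochastic_grad(i, x):
    """Noisy gradient of f_i at x; additive Gaussian noise models data sampling."""
    return (x - b[i]) + 0.1 * rng.normal(size=d)

x = rng.normal(size=(n, d))          # primal variables, one row per agent
v = np.zeros((n, d))                 # dual variables

for _ in range(T):
    lap_x = L @ x                                    # neighbor disagreement (local exchange only)
    grads = np.stack([stochastic_grad(i, x[i]) for i in range(n)])
    x, v = (x - eta * (alpha * lap_x + beta * v + grads),
            v + eta * beta * lap_x)

print("consensus value:", x.mean(axis=0))
print("global optimum :", b.mean(axis=0))
```

With constant step size, the iterates settle into a neighborhood of the optimum whose size depends on the gradient noise, which mirrors the abstract's statement about linear convergence to a neighborhood of a global optimum under constant parameters.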