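The Five Elements of Machine Learning: Optimization

Author: lambdaji

Every machine learning problem ultimately reduces to optimizing an objective function. The figure below, borrowed from reference [8], traces how optimization algorithms have developed.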

[Figure: Image.png, the development of optimization algorithms, from reference [8]]

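This article covers only SGD and the variants commonly used in industry, and how they differ from and relate to one another: GD, SGD, Momentum, Nesterov, SVRG, Adagrad, Adadelta, RMSprop, Adam, Downpour SGD, Hogwild!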

[Figure: Image [1].png]

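SGD variants develop along three main directions:

#1 Tune learn_rate → adaptive learning rates

--Annealing: one globally shared learn_rate; all parameters are updated with the same step size

* Step decay

* Exponential decay

* 1/t decay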
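The three annealing schedules listed above can be sketched as plain functions of the step counter. This is a minimal illustration, not code from the original post: the function names and the constants `drop`, `every`, and `k` are assumptions chosen for the example.

```python
import math

def step_decay(lr0, t, drop=0.5, every=10):
    """Step decay: cut the rate by a factor `drop` once every `every` steps."""
    return lr0 * drop ** (t // every)

def exponential_decay(lr0, t, k=0.1):
    """Exponential decay: lr_t = lr0 * exp(-k * t)."""
    return lr0 * math.exp(-k * t)

def inverse_time_decay(lr0, t, k=0.1):
    """1/t decay: lr_t = lr0 / (1 + k * t)."""
    return lr0 / (1.0 + k * t)

# Print the three schedules side by side for a few step counts.
for t in (0, 10, 20, 50):
    print(t, step_decay(0.1, t), exponential_decay(0.1, t), inverse_time_decay(0.1, t))
```

In all three cases the returned rate is a single scalar applied to every parameter, which is what "globally shared" means here.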

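--Adagrad: a separate learn_rate per parameter; the size of each update depends on the parameter itself (on its own accumulated gradient history)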
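To make the per-parameter idea concrete, here is a small Adagrad sketch in plain Python. The toy objective `toy_grad`, the starting point, and the hyperparameters `lr`, `eps`, and `steps` are assumptions made for this example only.

```python
import math

def adagrad(grad_fn, w, lr=0.5, eps=1e-8, steps=100):
    """Adagrad: each coordinate divides the shared rate by the square root
    of its own running sum of squared gradients."""
    w = list(w)
    g2_sum = [0.0] * len(w)
    for _ in range(steps):
        g = grad_fn(w)
        for i, gi in enumerate(g):
            g2_sum[i] += gi * gi
            w[i] -= lr * gi / (math.sqrt(g2_sum[i]) + eps)
    return w

def toy_grad(w):
    """Gradient of the hypothetical objective f(w) = 0.5 * (100 * w0^2 + w1^2)."""
    return [100.0 * w[0], w[1]]

w_final = adagrad(toy_grad, [1.0, 1.0])
print(w_final)
```

On this toy problem the two gradients differ by a factor of 100, yet both coordinates shrink at almost the same pace, because the normalization cancels the gradient scale of each coordinate.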

#2 Improve the gradient direction → a better sense of direction, less oscillation

--Momentum, Nesterov
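The difference between classical momentum and Nesterov momentum comes down to where the gradient is evaluated. The sketch below assumes a hypothetical quadratic objective and illustrative values for `lr` and `mu`; it is one common way of writing the two updates, not the only one.

```python
def sgd_momentum(grad_fn, w, lr=0.01, mu=0.9, steps=300, nesterov=False):
    """Gradient descent with a velocity term.
    Classical momentum evaluates the gradient at w;
    Nesterov evaluates it at the look-ahead point w + mu * v."""
    w = list(w)
    v = [0.0] * len(w)
    for _ in range(steps):
        if nesterov:
            point = [wi + mu * vi for wi, vi in zip(w, v)]
        else:
            point = w
        g = grad_fn(point)
        v = [mu * vi - lr * gi for vi, gi in zip(v, g)]  # update the velocity
        w = [wi + vi for wi, vi in zip(w, v)]            # move along the velocity
    return w

def quad_grad(w):
    """Gradient of the hypothetical objective f(w) = 0.5 * (10 * w0^2 + w1^2)."""
    return [10.0 * w[0], w[1]]

w_classic = sgd_momentum(quad_grad, [1.0, 1.0])
w_nesterov = sgd_momentum(quad_grad, [1.0, 1.0], nesterov=True)
print(w_classic, w_nesterov)
```

The velocity averages recent gradients, so components that keep flipping sign partly cancel while consistent components accumulate, which is the "less oscillation" effect named above.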

--SAG/SVRG/...
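As a rough illustration of the variance-reduction idea behind SVRG, the sketch below fits a one-dimensional least-squares model. The synthetic data, the step size `lr`, and the inner-loop length of `2 * n` are assumptions made for this example.

```python
import random

def svrg_least_squares(xs, ys, lr=0.05, epochs=20, seed=0):
    """SVRG for the 1-D losses loss_i(w) = 0.5 * (w * x_i - y_i)^2."""
    rng = random.Random(seed)
    n = len(xs)
    w = 0.0

    def grad_i(w_val, i):
        return (w_val * xs[i] - ys[i]) * xs[i]

    for _ in range(epochs):
        # Take a snapshot and compute the full gradient there once per epoch.
        w_snap = w
        full_grad = sum(grad_i(w_snap, i) for i in range(n)) / n
        for _ in range(2 * n):
            i = rng.randrange(n)
            # Stochastic gradient, corrected by the same sample's gradient at the snapshot.
            g = grad_i(w, i) - grad_i(w_snap, i) + full_grad
            w -= lr * g
    return w

data_rng = random.Random(1)
xs = [data_rng.uniform(-1.0, 1.0) for _ in range(50)]
ys = [3.0 * x + data_rng.gauss(0.0, 0.1) for x in xs]
w_star = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)  # closed-form minimizer
w_svrg = svrg_least_squares(xs, ys)
print(w_svrg, w_star)
```

The corrected gradient is still unbiased, but its variance shrinks as `w` and the snapshot approach the minimizer, so a constant step size can be kept instead of annealing it.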

#3 Parallelization → scalable

[Figure: Image [2].png]

--Hogwild!
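Hogwild! lets several workers run SGD on shared parameters without any locking, relying on sparse updates so that collisions are rare. The sketch below shows only the update pattern, using Python threads. The sparse synthetic data, the helper names, and the hyperparameters are assumptions. In standard CPython the global interpreter lock keeps the threads from running in parallel, so this shows the pattern rather than any speedup.

```python
import random
import threading

def make_sparse_data(n, dim, nnz, w_true, seed=0):
    """Noiseless linear data; each sample touches only `nnz` of the `dim` features."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = {j: rng.uniform(-1.0, 1.0) for j in rng.sample(range(dim), nnz)}
        y = sum(w_true[j] * v for j, v in x.items())
        data.append((x, y))
    return data

def mean_squared_error(w, data):
    return sum((sum(w[j] * v for j, v in x.items()) - y) ** 2 for x, y in data) / len(data)

def hogwild_sgd(data, dim, n_threads=4, steps_per_thread=3000, lr=0.1):
    w = [0.0] * dim  # shared by all threads, deliberately without a lock

    def worker(seed):
        rng = random.Random(seed)
        for _ in range(steps_per_thread):
            x, y = data[rng.randrange(len(data))]
            err = sum(w[j] * v for j, v in x.items()) - y
            for j, v in x.items():
                w[j] -= lr * err * v  # unsynchronized read-modify-write

    threads = [threading.Thread(target=worker, args=(s,)) for s in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return w

w_true = [float(j + 1) for j in range(10)]
data = make_sparse_data(500, 10, 2, w_true)
w_hat = hogwild_sgd(data, dim=10)
print(mean_squared_error([0.0] * 10, data), mean_squared_error(w_hat, data))
```

Because each sample touches only two of the ten coordinates, two workers rarely write to the same entry at the same moment, which is the condition under which skipping the lock costs little.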

--Downpour SGD
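Downpour SGD is usually described as model replicas that train on their own data shards while asynchronously fetching parameters from, and pushing accumulated gradients to, a parameter server. The following is a simplified single-process simulation of that pattern, not a distributed implementation. The class names, the intervals `n_fetch` and `n_push`, the plain-SGD server update, and the random scheduler standing in for asynchrony are all assumptions of this sketch.

```python
import random

class ParameterServer:
    """Holds the global parameters and applies pushed gradients with plain SGD."""
    def __init__(self, dim, lr):
        self.w = [0.0] * dim
        self.lr = lr

    def fetch(self):
        return list(self.w)

    def push(self, grad):
        for j, g in enumerate(grad):
            self.w[j] -= self.lr * g

class Replica:
    """A model replica that trains on its own data shard."""
    def __init__(self, server, shard, lr, n_fetch, n_push, seed):
        self.server, self.shard, self.lr = server, shard, lr
        self.n_fetch, self.n_push = n_fetch, n_push
        self.rng = random.Random(seed)
        self.w = server.fetch()
        self.acc = [0.0] * len(self.w)
        self.t = 0

    def step(self):
        if self.t % self.n_fetch == 0:
            self.w = self.server.fetch()  # refresh the local copy
        x, y = self.rng.choice(self.shard)
        err = sum(wj * xj for wj, xj in zip(self.w, x)) - y
        grad = [err * xj for xj in x]
        self.acc = [a + g for a, g in zip(self.acc, grad)]           # accumulate for the server
        self.w = [wj - self.lr * g for wj, g in zip(self.w, grad)]   # local update
        self.t += 1
        if self.t % self.n_push == 0:
            self.server.push(self.acc)
            self.acc = [0.0] * len(self.w)

def downpour_sgd(shards, dim, lr=0.02, n_fetch=5, n_push=5, total_steps=4000, seed=0):
    server = ParameterServer(dim, lr)
    replicas = [Replica(server, s, lr, n_fetch, n_push, seed + k + 1)
                for k, s in enumerate(shards)]
    scheduler = random.Random(seed)
    for _ in range(total_steps):
        scheduler.choice(replicas).step()  # random interleaving stands in for asynchrony
    return server.w

rng = random.Random(42)
w_true = [2.0, -1.0, 0.5]
samples = []
for _ in range(200):
    x = [rng.uniform(-1.0, 1.0) for _ in w_true]
    samples.append((x, sum(wj * xj for wj, xj in zip(w_true, x))))
shards = [samples[k::4] for k in range(4)]
w_server = downpour_sgd(shards, dim=3)
print(w_server)
```

Between two fetches a replica works on parameters that other replicas may already have changed on the server, so every pushed gradient is somewhat stale. Larger `n_fetch` and `n_push` reduce communication at the cost of more staleness.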

References:

[1] http://sebastianruder.com/optimizing-gradient-descent/?url_type=39&object_type=webpage&pos=1

[2] http://www.cnblogs.com/neopenx/p/4768388.html

[3] https://www.reddit.com/r/MachineLearning/comments/2gopfa/visualizing_gradient_optimization_techniques/

[4] https://www.quora.com/What-are-differences-between-update-rules-like-AdaDelta-RMSProp-AdaGrad-and-AdaM

[5] https://zhuanlan.zhihu.com/p/21798784

[6] https://zhuanlan.zhihu.com/p/21539419

[7] http://www.datakit.cn/blog/2016/07/04/sgd_01.html

[8] 王太峰 (Wang Taifeng), 浅谈分布式机器学习算法和工具 ("A Brief Discussion of Distributed Machine Learning Algorithms and Tools"), PDF

[9] A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets

[10] Accelerating Stochastic Gradient Descent using Predictive Variance Reduction

[11] A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives

[12] Less than a Single Pass Stochastically Controlled Stochastic Gradient Method
