Wilfred Hughes
May 4, 2019 at 12:11
An excellent overview of the different gradient descent optimization algorithms, and a nice example of content that is available as both a responsive website and a PDF on arXiv:
https://ruder.io/optimizing-gradient-descent/
An overview of gradient descent optimization algorithms
Gradient descent is the preferred way to optimize neural networks and many other machine learning algorithms but is often used as a black box. This post explores how many of the most popular gradient-based optimization algorithms such as Momentum, Adagrad, and Adam actually work.
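To give a flavour of what the post covers, here's a minimal NumPy sketch of the Adam update rule (one of the algorithms it discusses). This is my own illustration, not code from the linked post; the function name and hyperparameter defaults are just the commonly cited ones:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update step (Kingma & Ba, 2015).

    m and v are running estimates of the first and second moments of the
    gradient; t is the 1-indexed step count used for bias correction.
    """
    m = beta1 * m + (1 - beta1) * grad        # update first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # update second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias-correct the estimates,
    v_hat = v / (1 - beta2 ** t)              # which start at zero
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimise f(theta) = theta^2, whose gradient is 2 * theta.
theta = np.array([5.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 2001):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.1)
print(theta)  # converges towards 0
```

Plain gradient descent would use `grad` directly; Adam's per-parameter moment estimates are what give it the adaptive step sizes the post explains.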