mlx.optimizers.SGD
- class mlx.optimizers.SGD(learning_rate: float, momentum: float = 0.0)
Stochastic gradient descent optimizer.
Updates a parameter \(w\) with a gradient \(g\) as follows:

\[\begin{split}v_{t+1} &= \mu v_t + (1 - \mu) g_t \\ w_{t+1} &= w_t - \lambda v_{t+1}\end{split}\]

- Parameters:
  - learning_rate (float) – The learning rate \(\lambda\).
  - momentum (float, optional) – The momentum strength \(\mu\). Default: 0
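The two equations above can be traced by hand. The following is a minimal sketch of the update rule written with plain mlx.core operations; it illustrates the math only and is not the optimizer's internal implementation (the values of lr, mu, w, and g are placeholders):

```python
import mlx.core as mx

# Sketch of the update rule above:
#   v_{t+1} = mu * v_t + (1 - mu) * g_t
#   w_{t+1} = w_t - lr * v_{t+1}
lr, mu = 0.1, 0.9

w = mx.array([1.0, -2.0])   # parameter
v = mx.zeros_like(w)        # momentum state, starts at zero
g = mx.array([0.5, 0.5])    # gradient for this step

v = mu * v + (1 - mu) * g   # v_1 = [0.05, 0.05]
w = w - lr * v              # w_1 = [0.995, -2.005]
print(w)
```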
Methods
- __init__(learning_rate[, momentum])
- apply_single(gradient, parameter, state): Performs the SGD parameter update and stores \(v\) in the optimizer state.
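In practice apply_single is invoked for you through the standard optimizer workflow rather than called directly. Below is a minimal end-to-end sketch of that workflow using mlx.nn; the model, loss, data shapes, and hyperparameters are placeholder assumptions chosen for illustration:

```python
import mlx.core as mx
import mlx.nn as nn
import mlx.optimizers as optim

# Toy model and optimizer (hyperparameters are illustrative).
model = nn.Linear(10, 1)
optimizer = optim.SGD(learning_rate=0.1, momentum=0.9)

def loss_fn(model, x, y):
    return nn.losses.mse_loss(model(x), y)

loss_and_grad_fn = nn.value_and_grad(model, loss_fn)

# Placeholder batch of data.
x = mx.random.normal((32, 10))
y = mx.random.normal((32, 1))

loss, grads = loss_and_grad_fn(model, x, y)
optimizer.update(model, grads)                  # applies the SGD update to every parameter
mx.eval(model.parameters(), optimizer.state)    # force evaluation of the lazy updates
```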