Common Optimizers
SGD | The stochastic gradient descent optimizer.
RMSprop | The RMSprop optimizer [1].
Adagrad | The Adagrad optimizer [1].
Adafactor | The Adafactor optimizer.
AdaDelta | The AdaDelta optimizer with a learning rate [1].
Adam | The Adam optimizer [1].
AdamW | The AdamW optimizer [1].
Adamax | The Adamax optimizer, a variant of Adam based on the infinity norm [1].
Lion | The Lion optimizer [1].
MultiOptimizer | Wraps a list of optimizers with corresponding weight predicates/filters to make it easy to use different optimizers for different weights.
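
These entries match the optimizer classes in Apple's MLX library (mlx.optimizers), so the sketch below assumes that API; treat it as a minimal sketch, not a definitive reference. It shows the usage pattern the classes share (construct with a learning_rate, then update a model with gradients) and how MultiOptimizer routes weights through predicates. The model, the data, and the ndim-based predicate are illustrative assumptions, not part of this table.

# Minimal sketch assuming the MLX API (mlx.core, mlx.nn, mlx.optimizers).
# The model, data, and predicate are illustrative, not from the table above.
import mlx.core as mx
import mlx.nn as nn
import mlx.optimizers as optim

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))

def loss_fn(model, x, y):
    return nn.losses.mse_loss(model(x), y)

# Every optimizer in the table is constructed with a learning rate
# (plus class-specific hyperparameters) and applied the same way.
optimizer = optim.AdamW(learning_rate=1e-3)

# MultiOptimizer routes each weight to the first optimizer whose
# predicate(path, weight) returns True; the last optimizer, listed
# without a predicate, is the fallback. Here matrices get Adam and
# everything else (e.g. biases) gets plain SGD.
routed = optim.MultiOptimizer(
    [optim.Adam(learning_rate=1e-3), optim.SGD(learning_rate=1e-1)],
    [lambda path, weight: weight.ndim > 1],
)

x = mx.random.normal((32, 8))
y = mx.random.normal((32, 1))

loss_and_grad = nn.value_and_grad(model, loss_fn)
loss, grads = loss_and_grad(model, x, y)
routed.update(model, grads)                 # one optimization step
mx.eval(model.parameters(), routed.state)   # force the lazy updates

Swapping in any other optimizer from the table only changes the constructor line; the update step is identical across all of them.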