mlx/docs/build/html/python/optimizers.html
Awni Hannun fbd10a48d4 docs
2025-06-04 01:01:47 +00:00

189 lines
13 KiB
HTML
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

<!DOCTYPE html>
<html class="writer-html5" lang="en" >
<head>
<meta charset="utf-8" /><meta name="generator" content="Docutils 0.18.1: http://docutils.sourceforge.net/" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Optimizers &mdash; MLX 0.0.0 documentation</title>
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
<link rel="stylesheet" href="../_static/css/theme.css" type="text/css" />
<!--[if lt IE 9]>
<script src="../_static/js/html5shiv.min.js"></script>
<![endif]-->
<script data-url_root="../" id="documentation_options" src="../_static/documentation_options.js"></script>
<script src="../_static/jquery.js"></script>
<script src="../_static/underscore.js"></script>
<script src="../_static/_sphinx_javascript_frameworks_compat.js"></script>
<script src="../_static/doctools.js"></script>
<script src="../_static/js/theme.js"></script>
<link rel="index" title="Index" href="../genindex.html" />
<link rel="search" title="Search" href="../search.html" />
<link rel="next" title="mlx.optimizers.OptimizerState" href="_autosummary/mlx.optimizers.OptimizerState.html" />
<link rel="prev" title="mlx.nn.silu" href="_autosummary_functions/mlx.nn.silu.html" />
</head>
<body class="wy-body-for-nav">
<div class="wy-grid-for-nav">
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
<div class="wy-side-scroll">
<div class="wy-side-nav-search" >
<a href="../index.html" class="icon icon-home">
MLX
</a>
<div class="version">
0.0.0
</div>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="../search.html" method="get">
<input type="text" name="q" placeholder="Search docs" aria-label="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
<p class="caption" role="heading"><span class="caption-text">Install</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../install.html">Build and Install</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">Usage</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../quick_start.html">Quick Start Guide</a></li>
<li class="toctree-l1"><a class="reference internal" href="../using_streams.html">Using Streams</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">Examples</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../examples/linear_regression.html">Linear Regression</a></li>
<li class="toctree-l1"><a class="reference internal" href="../examples/mlp.html">Multi-Layer Perceptron</a></li>
<li class="toctree-l1"><a class="reference internal" href="../examples/llama-inference.html">LLM inference</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">Further Reading</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../dev/extensions.html">Developer Documentation</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">Python API Reference</span></p>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="array.html">Array</a></li>
<li class="toctree-l1"><a class="reference internal" href="devices_and_streams.html">Devices and Streams</a></li>
<li class="toctree-l1"><a class="reference internal" href="ops.html">Operations</a></li>
<li class="toctree-l1"><a class="reference internal" href="random.html">Random</a></li>
<li class="toctree-l1"><a class="reference internal" href="transforms.html">Transforms</a></li>
<li class="toctree-l1"><a class="reference internal" href="fft.html">FFT</a></li>
<li class="toctree-l1"><a class="reference internal" href="nn.html">Neural Networks</a></li>
<li class="toctree-l1 current"><a class="current reference internal" href="#">Optimizers</a><ul>
<li class="toctree-l2"><a class="reference internal" href="_autosummary/mlx.optimizers.OptimizerState.html">mlx.optimizers.OptimizerState</a></li>
<li class="toctree-l2"><a class="reference internal" href="_autosummary/mlx.optimizers.Optimizer.html">mlx.optimizers.Optimizer</a></li>
<li class="toctree-l2"><a class="reference internal" href="_autosummary/mlx.optimizers.SGD.html">mlx.optimizers.SGD</a></li>
<li class="toctree-l2"><a class="reference internal" href="_autosummary/mlx.optimizers.Adam.html">mlx.optimizers.Adam</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="tree_utils.html">Tree Utils</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">C++ API Reference</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="../cpp/ops.html">Operations</a></li>
</ul>
</div>
</div>
</nav>
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" >
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
<a href="../index.html">MLX</a>
</nav>
<div class="wy-nav-content">
<div class="rst-content">
<div role="navigation" aria-label="Page navigation">
<ul class="wy-breadcrumbs">
<li><a href="../index.html" class="icon icon-home" aria-label="Home"></a></li>
<li class="breadcrumb-item active">Optimizers</li>
<li class="wy-breadcrumbs-aside">
<a href="../_sources/python/optimizers.rst.txt" rel="nofollow"> View page source</a>
</li>
</ul>
<hr/>
</div>
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<section id="optimizers">
<span id="id1"></span><h1>Optimizers<a class="headerlink" href="#optimizers" title="Permalink to this heading"></a></h1>
<p>The optimizers in MLX can be used both with <code class="xref py py-mod docutils literal notranslate"><span class="pre">mlx.nn</span></code> but also with pure
<code class="xref py py-mod docutils literal notranslate"><span class="pre">mlx.core</span></code> functions. A typical example involves calling
<code class="xref py py-meth docutils literal notranslate"><span class="pre">Optimizer.update()</span></code> to update a models parameters based on the loss
gradients and subsequently calling <a class="reference internal" href="_autosummary/mlx.core.eval.html#mlx.core.eval" title="mlx.core.eval"><code class="xref py py-func docutils literal notranslate"><span class="pre">mlx.core.eval()</span></code></a> to evaluate both the
models parameters and the <strong>optimizer state</strong>.</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="c1"># Create a model</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">MLP</span><span class="p">(</span><span class="n">num_layers</span><span class="p">,</span> <span class="n">train_images</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">],</span> <span class="n">hidden_dim</span><span class="p">,</span> <span class="n">num_classes</span><span class="p">)</span>
<span class="n">mx</span><span class="o">.</span><span class="n">eval</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">parameters</span><span class="p">())</span>
<span class="c1"># Create the gradient function and the optimizer</span>
<span class="n">loss_and_grad_fn</span> <span class="o">=</span> <span class="n">nn</span><span class="o">.</span><span class="n">value_and_grad</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">loss_fn</span><span class="p">)</span>
<span class="n">optimizer</span> <span class="o">=</span> <span class="n">optim</span><span class="o">.</span><span class="n">SGD</span><span class="p">(</span><span class="n">learning_rate</span><span class="o">=</span><span class="n">learning_rate</span><span class="p">)</span>
<span class="k">for</span> <span class="n">e</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">num_epochs</span><span class="p">):</span>
<span class="k">for</span> <span class="n">X</span><span class="p">,</span> <span class="n">y</span> <span class="ow">in</span> <span class="n">batch_iterate</span><span class="p">(</span><span class="n">batch_size</span><span class="p">,</span> <span class="n">train_images</span><span class="p">,</span> <span class="n">train_labels</span><span class="p">):</span>
<span class="n">loss</span><span class="p">,</span> <span class="n">grads</span> <span class="o">=</span> <span class="n">loss_and_grad_fn</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">X</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
<span class="c1"># Update the model with the gradients. So far no computation has happened.</span>
<span class="n">optimizer</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">grads</span><span class="p">)</span>
<span class="c1"># Compute the new parameters but also the optimizer state.</span>
<span class="n">mx</span><span class="o">.</span><span class="n">eval</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">parameters</span><span class="p">(),</span> <span class="n">optimizer</span><span class="o">.</span><span class="n">state</span><span class="p">)</span>
</pre></div>
</div>
<table class="autosummary longtable docutils align-default">
<tbody>
<tr class="row-odd"><td><p><a class="reference internal" href="_autosummary/mlx.optimizers.OptimizerState.html#mlx.optimizers.OptimizerState" title="mlx.optimizers.OptimizerState"><code class="xref py py-obj docutils literal notranslate"><span class="pre">OptimizerState</span></code></a></p></td>
<td><p>The optimizer state implements a recursively defined <a class="reference external" href="https://docs.python.org/3/library/collections.html#collections.defaultdict" title="(in Python v3.12)"><code class="xref py py-class docutils literal notranslate"><span class="pre">collections.defaultdict</span></code></a>, namely a missing key in an optimizer state is an <a class="reference internal" href="_autosummary/mlx.optimizers.OptimizerState.html#mlx.optimizers.OptimizerState" title="mlx.optimizers.OptimizerState"><code class="xref py py-class docutils literal notranslate"><span class="pre">OptimizerState</span></code></a>.</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal" href="_autosummary/mlx.optimizers.Optimizer.html#mlx.optimizers.Optimizer" title="mlx.optimizers.Optimizer"><code class="xref py py-obj docutils literal notranslate"><span class="pre">Optimizer</span></code></a>()</p></td>
<td><p>The base class for all optimizers.</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal" href="_autosummary/mlx.optimizers.SGD.html#mlx.optimizers.SGD" title="mlx.optimizers.SGD"><code class="xref py py-obj docutils literal notranslate"><span class="pre">SGD</span></code></a>(learning_rate[, momentum])</p></td>
<td><p>Stochastic gradient descent optimizer.</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal" href="_autosummary/mlx.optimizers.Adam.html#mlx.optimizers.Adam" title="mlx.optimizers.Adam"><code class="xref py py-obj docutils literal notranslate"><span class="pre">Adam</span></code></a>(learning_rate[, betas, eps])</p></td>
<td><p>Implementation of the Adam optimizer [1].</p></td>
</tr>
</tbody>
</table>
</section>
</div>
</div>
<footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
<a href="_autosummary_functions/mlx.nn.silu.html" class="btn btn-neutral float-left" title="mlx.nn.silu" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="_autosummary/mlx.optimizers.OptimizerState.html" class="btn btn-neutral float-right" title="mlx.optimizers.OptimizerState" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
</div>
<hr/>
<div role="contentinfo">
<p>&#169; Copyright 2023, MLX Contributors.</p>
</div>
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script>
jQuery(function () {
SphinxRtdTheme.Navigation.enable(true);
});
</script>
</body>
</html>