.. _mlp:

Multi-Layer Perceptron
----------------------

In this example we'll learn to use ``mlx.nn`` by implementing a simple
multi-layer perceptron to classify MNIST.
As a first step import the MLX packages we need:

.. code-block:: python

  import mlx.core as mx
  import mlx.nn as nn
  import mlx.optimizers as optim

  import numpy as np
The model is defined as the ``MLP`` class which inherits from
:class:`mlx.nn.Module`. We follow the standard idiom to make a new module:

1. Define an ``__init__`` where the parameters and/or submodules are set up.
   See the :ref:`Module class docs <module_class>` for more information on how
   :class:`mlx.nn.Module` registers parameters.
2. Define a ``__call__`` where the computation is implemented.
.. code-block:: python

  class MLP(nn.Module):
      def __init__(
          self, num_layers: int, input_dim: int, hidden_dim: int, output_dim: int
      ):
          super().__init__()
          layer_sizes = [input_dim] + [hidden_dim] * num_layers + [output_dim]
          self.layers = [
              nn.Linear(idim, odim)
              for idim, odim in zip(layer_sizes[:-1], layer_sizes[1:])
          ]

      def __call__(self, x):
          for l in self.layers[:-1]:
              x = mx.maximum(l(x), 0.0)
          return self.layers[-1](x)
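As a quick sanity check we can instantiate the model and push a random batch
through it. This sketch is not part of the original example; the sizes below
are illustrative assumptions:

.. code-block:: python

  # A minimal sketch (not part of the example): illustrative sizes only.
  toy_model = MLP(num_layers=2, input_dim=784, hidden_dim=32, output_dim=10)
  x = mx.random.normal((4, 784))  # a hypothetical batch of 4 flattened images
  print(toy_model(x).shape)       # (4, 10): one logit per class per example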
We define the loss function which takes the mean of the per-example cross
entropy loss. The ``mlx.nn.losses`` sub-package has implementations of some
commonly used loss functions.

.. code-block:: python

  def loss_fn(model, X, y):
      return mx.mean(nn.losses.cross_entropy(model(X), y))
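Note that ``nn.losses.cross_entropy`` returns one loss value per example,
which is why ``loss_fn`` reduces with ``mx.mean``. A small sketch with
made-up logits illustrates this:

.. code-block:: python

  # Hedged sketch with made-up values: two examples, two classes.
  logits = mx.array([[2.0, 0.5], [0.1, 1.5]])
  targets = mx.array([0, 1])
  per_example = nn.losses.cross_entropy(logits, targets)
  print(per_example.shape)            # (2,): one loss per example
  print(mx.mean(per_example).item())  # the scalar used for training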
We also need a function to compute the accuracy of the model on the validation
set:

.. code-block:: python

  def eval_fn(model, X, y):
      return mx.mean(mx.argmax(model(X), axis=1) == y)
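The comparison yields a boolean array, and its mean is the fraction of correct
predictions. For instance, with hypothetical logits for three examples:

.. code-block:: python

  # Hypothetical values: argmax predicts [1, 0, 1] against labels [1, 0, 0].
  logits = mx.array([[0.2, 1.0], [2.0, 0.1], [0.3, 0.9]])
  labels = mx.array([1, 0, 0])
  print(mx.mean(mx.argmax(logits, axis=1) == labels).item())  # ~0.667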
Next, set up the problem parameters and load the data. To load the data, you
need our `mnist data loader
<https://github.com/ml-explore/mlx-examples/blob/main/mnist/mnist.py>`_, which
we will import as `mnist`.
.. code-block:: python

  num_layers = 2
  hidden_dim = 32
  num_classes = 10
  batch_size = 256
  num_epochs = 10
  learning_rate = 1e-1

  # Load the data
  import mnist
  train_images, train_labels, test_images, test_labels = map(
      mx.array, mnist.mnist()
  )
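The training code below uses ``train_images.shape[-1]`` as the model's input
dimension, so the loader is expected to return flattened images. A quick
check (the exact shapes are an assumption about the loader, not part of this
example):

.. code-block:: python

  # Assumed shapes for the standard MNIST splits; verify against your loader.
  print(train_images.shape)  # e.g. (60000, 784)
  print(train_labels.shape)  # e.g. (60000,)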
Since we're using SGD, we need an iterator which shuffles and constructs
minibatches of examples in the training set:

.. code-block:: python

  def batch_iterate(batch_size, X, y):
      perm = mx.array(np.random.permutation(y.size))
      for s in range(0, y.size, batch_size):
          ids = perm[s : s + batch_size]
          yield X[ids], y[ids]
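A hedged usage sketch with toy data; note the final minibatch can be smaller
than ``batch_size`` when the dataset size is not divisible by it:

.. code-block:: python

  # Toy data (not from the tutorial): five examples, two features each.
  X_toy = mx.arange(10.0).reshape(5, 2)
  y_toy = mx.array([0, 1, 0, 1, 0])
  for bx, by in batch_iterate(2, X_toy, y_toy):
      print(bx.shape, by.shape)  # (2, 2) (2,) twice, then (1, 2) (1,)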
Finally, we put it all together by instantiating the model, the
:class:`mlx.optimizers.SGD` optimizer, and running the training loop:
.. code-block:: python

  # Load the model
  model = MLP(num_layers, train_images.shape[-1], hidden_dim, num_classes)
  mx.eval(model.parameters())

  # Get a function which gives the loss and gradient of the
  # loss with respect to the model's trainable parameters
  loss_and_grad_fn = nn.value_and_grad(model, loss_fn)

  # Instantiate the optimizer
  optimizer = optim.SGD(learning_rate=learning_rate)

  for e in range(num_epochs):
      for X, y in batch_iterate(batch_size, train_images, train_labels):
          loss, grads = loss_and_grad_fn(model, X, y)

          # Update the optimizer state and model parameters
          # in a single call
          optimizer.update(model, grads)

          # Force a graph evaluation
          mx.eval(model.parameters(), optimizer.state)

      accuracy = eval_fn(model, test_images, test_labels)
      print(f"Epoch {e}: Test accuracy {accuracy.item():.3f}")
print(f"Epoch {e}: Test accuracy {accuracy.item():.3f}")<29><><EFBFBD><EFBFBD><EFBFBD>}<7D>h j<>sbah}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D>h}h~h<68>h<EFBFBD><68>python<6F>h<EFBFBD>}<7D>uhhmh"h#hKch h&h!hubh <09>note<74><65><EFBFBD>)<29><>}<7D>(h<05><>The :func:`mlx.nn.value_and_grad` function is a convenience function to get
the gradient of a loss with respect to the trainable parameters of a model.
This should not be confused with :func:`mlx.core.value_and_grad`.<2E>h]<5D>h<)<29><>}<7D>(h<05><>The :func:`mlx.nn.value_and_grad` function is a convenience function to get
the gradient of a loss with respect to the trainable parameters of a model.
This should not be confused with :func:`mlx.core.value_and_grad`.<2E>h]<5D>(h0<68>The <20><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h j<>h!hh"NhNubh<62>)<29><>}<7D>(h<05>:func:`mlx.nn.value_and_grad`<60>h]<5D>hF)<29><>}<7D>(hj<>h]<5D>h0<68>mlx.nn.value_and_grad()<29><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h j<>h!hh"NhNubah}<7D>(h]<5D>h]<5D>(h<><68>py<70><79>py-func<6E>eh]<5D>h]<5D>h]<5D>uhhEh j<>ubah}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D><>refdoc<6F>h<EFBFBD><68> refdomain<69>j<EFBFBD><00>reftype<70><65>func<6E><63> refexplicit<69><74><EFBFBD>refwarn<72><6E>h<EFBFBD>Nh<4E>NhÌmlx.nn.value_and_grad<61>uhh<>h"h#hK<>h j<>ubh0<68><30> function is a convenience function to get
the gradient of a loss with respect to the trainable parameters of a model.
This should not be confused with <20><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h j<>h!hh"NhNubh<62>)<29><>}<7D>(h<05>:func:`mlx.core.value_and_grad`<60>h]<5D>hF)<29><>}<7D>(hj<>h]<5D>h0<68>mlx.core.value_and_grad()<29><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h j<>h!hh"NhNubah}<7D>(h]<5D>h]<5D>(h<><68>py<70><79>py-func<6E>eh]<5D>h]<5D>h]<5D>uhhEh j<>ubah}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D><>refdoc<6F>h<EFBFBD><68> refdomain<69>j<EFBFBD><00>reftype<70><65>func<6E><63> refexplicit<69><74><EFBFBD>refwarn<72><6E>h<EFBFBD>Nh<4E>NhÌmlx.core.value_and_grad<61>uhh<>h"h#hK<>h j<>ubh0<68>.<2E><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h j<>h!hh"NhNubeh}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D>uhh;h"h#hK<>h j<>ubah}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D>uhj<>h h&h!hh"h#hNubh<)<29><>}<7D>(h<05><>The model should train to a decent accuracy (about 95%) after just a few passes
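To make the distinction concrete: ``mlx.core.value_and_grad`` differentiates
a function with respect to its array arguments, while ``mlx.nn.value_and_grad``
(used above) differentiates with respect to the model's trainable parameters.
A minimal sketch of the former:

.. code-block:: python

  # Minimal sketch: gradient of f with respect to its array argument w.
  def f(w):
      return mx.sum(w * w)

  value, grad = mx.value_and_grad(f)(mx.array([1.0, 2.0]))
  print(value.item())  # 5.0
  print(grad)          # the gradient 2 * w, i.e. [2.0, 4.0]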
The model should train to a decent accuracy (about 95%) after just a few passes
over the training set. The `full example
<https://github.com/ml-explore/mlx-examples/tree/main/mnist>`_
is available in the MLX GitHub repo.