.. _mlp:

Multi-Layer Perceptron
----------------------

In this example we'll learn to use ``mlx.nn`` by implementing a simple
multi-layer perceptron to classify MNIST.
As a first step import the MLX packages we need:

.. code-block:: python

  import mlx.core as mx
  import mlx.nn as nn
  import mlx.optimizers as optim

  import numpy as np
The model is defined as the ``MLP`` class which inherits from
:class:`mlx.nn.Module`. We follow the standard idiom to make a new module:

1. Define an ``__init__`` where the parameters and/or submodules are set up.
   See the :ref:`Module class docs <module_class>` for more information on how
   :class:`mlx.nn.Module` registers parameters.
2. Define a ``__call__`` where the computation is implemented.
.. code-block:: python

  class MLP(nn.Module):
      def __init__(
          self, num_layers: int, input_dim: int, hidden_dim: int, output_dim: int
      ):
          super().__init__()
          layer_sizes = [input_dim] + [hidden_dim] * num_layers + [output_dim]
          self.layers = [
              nn.Linear(idim, odim)
              for idim, odim in zip(layer_sizes[:-1], layer_sizes[1:])
          ]

      def __call__(self, x):
          for l in self.layers[:-1]:
              x = mx.maximum(l(x), 0.0)
          return self.layers[-1](x)
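As a quick sanity check we can instantiate the model and push a random batch
through it. This sketch is not part of the original example; the sizes below
are illustrative assumptions:

.. code-block:: python

  # A minimal sketch (not part of the example): illustrative sizes only.
  toy_model = MLP(num_layers=2, input_dim=784, hidden_dim=32, output_dim=10)
  x = mx.random.normal((4, 784))  # a hypothetical batch of 4 flattened images
  print(toy_model(x).shape)       # (4, 10): one logit per class per example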
We define the loss function which takes the mean of the per-example cross
entropy loss. The ``mlx.nn.losses`` sub-package has implementations of some
commonly used loss functions.

.. code-block:: python

  def loss_fn(model, X, y):
      return mx.mean(nn.losses.cross_entropy(model(X), y))
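Note that ``nn.losses.cross_entropy`` returns one loss value per example,
which is why ``loss_fn`` reduces with ``mx.mean``. A small sketch with
made-up logits illustrates this:

.. code-block:: python

  # Hedged sketch with made-up values: two examples, two classes.
  logits = mx.array([[2.0, 0.5], [0.1, 1.5]])
  targets = mx.array([0, 1])
  per_example = nn.losses.cross_entropy(logits, targets)
  print(per_example.shape)            # (2,): one loss per example
  print(mx.mean(per_example).item())  # the scalar used for training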
We also need a function to compute the accuracy of the model on the validation
set:

.. code-block:: python

  def eval_fn(model, X, y):
      return mx.mean(mx.argmax(model(X), axis=1) == y)
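The comparison yields a boolean array, and its mean is the fraction of correct
predictions. For instance, with hypothetical logits for three examples:

.. code-block:: python

  # Hypothetical values: argmax predicts [1, 0, 1] against labels [1, 0, 0].
  logits = mx.array([[0.2, 1.0], [2.0, 0.1], [0.3, 0.9]])
  labels = mx.array([1, 0, 0])
  print(mx.mean(mx.argmax(logits, axis=1) == labels).item())  # ~0.667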
Next, set up the problem parameters and load the data. To load the data, you
need our `mnist data loader
<https://github.com/ml-explore/mlx-examples/blob/main/mnist/mnist.py>`_, which
we will import as `mnist`.
.. code-block:: python

  num_layers = 2
  hidden_dim = 32
  num_classes = 10
  batch_size = 256
  num_epochs = 10
  learning_rate = 1e-1

  # Load the data
  import mnist
  train_images, train_labels, test_images, test_labels = map(
      mx.array, mnist.mnist()
  )
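The training code below uses ``train_images.shape[-1]`` as the model's input
dimension, so the loader is expected to return flattened images. A quick
check (the exact shapes are an assumption about the loader, not part of this
example):

.. code-block:: python

  # Assumed shapes for the standard MNIST splits; verify against your loader.
  print(train_images.shape)  # e.g. (60000, 784)
  print(train_labels.shape)  # e.g. (60000,)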
Since we're using SGD, we need an iterator which shuffles and constructs
minibatches of examples in the training set:

.. code-block:: python

  def batch_iterate(batch_size, X, y):
      perm = mx.array(np.random.permutation(y.size))
      for s in range(0, y.size, batch_size):
          ids = perm[s : s + batch_size]
          yield X[ids], y[ids]
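A hedged usage sketch with toy data; note the final minibatch can be smaller
than ``batch_size`` when the dataset size is not divisible by it:

.. code-block:: python

  # Toy data (not from the tutorial): five examples, two features each.
  X_toy = mx.arange(10.0).reshape(5, 2)
  y_toy = mx.array([0, 1, 0, 1, 0])
  for bx, by in batch_iterate(2, X_toy, y_toy):
      print(bx.shape, by.shape)  # (2, 2) (2,) twice, then (1, 2) (1,)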
Finally, we put it all together by instantiating the model, the
:class:`mlx.optimizers.SGD` optimizer, and running the training loop:
.. code-block:: python

  # Load the model
  model = MLP(num_layers, train_images.shape[-1], hidden_dim, num_classes)
  mx.eval(model.parameters())

  # Get a function which gives the loss and gradient of the
  # loss with respect to the model's trainable parameters
  loss_and_grad_fn = nn.value_and_grad(model, loss_fn)

  # Instantiate the optimizer
  optimizer = optim.SGD(learning_rate=learning_rate)

  for e in range(num_epochs):
      for X, y in batch_iterate(batch_size, train_images, train_labels):
          loss, grads = loss_and_grad_fn(model, X, y)

          # Update the optimizer state and model parameters
          # in a single call
          optimizer.update(model, grads)

          # Force a graph evaluation
          mx.eval(model.parameters(), optimizer.state)

      accuracy = eval_fn(model, test_images, test_labels)
      print(f"Epoch {e}: Test accuracy {accuracy.item():.3f}")
print(f"Epoch {e}: Test accuracy {accuracy.item():.3f}")<29><><EFBFBD><EFBFBD><EFBFBD>}<7D>h j<>sbah}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D>h}h~h<68>h<EFBFBD><68>python<6F>h<EFBFBD>}<7D>uhhmh"h#hKch h&h!hubh <09>note<74><65><EFBFBD>)<29><>}<7D>(h<05><>The :func:`mlx.nn.value_and_grad` function is a convenience function to get
the gradient of a loss with respect to the trainable parameters of a model.
This should not be confused with :func:`mlx.core.value_and_grad`.<2E>h]<5D>h<)<29><>}<7D>(h<05><>The :func:`mlx.nn.value_and_grad` function is a convenience function to get
the gradient of a loss with respect to the trainable parameters of a model.
This should not be confused with :func:`mlx.core.value_and_grad`.<2E>h]<5D>(h0<68>The <20><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h j<>h!hh"NhNubh<62>)<29><>}<7D>(h<05>:func:`mlx.nn.value_and_grad`<60>h]<5D>hF)<29><>}<7D>(hj<>h]<5D>h0<68>mlx.nn.value_and_grad()<29><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h j<>h!hh"NhNubah}<7D>(h]<5D>h]<5D>(h<><68>py<70><79>py-func<6E>eh]<5D>h]<5D>h]<5D>uhhEh j<>ubah}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D><>refdoc<6F>h<EFBFBD><68> refdomain<69>j<EFBFBD><00>reftype<70><65>func<6E><63> refexplicit<69><74><EFBFBD>refwarn<72><6E>h<EFBFBD>Nh<4E>NhÌmlx.nn.value_and_grad<61>uhh<>h"h#hK<>h j<>ubh0<68><30> function is a convenience function to get
the gradient of a loss with respect to the trainable parameters of a model.
This should not be confused with <20><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h j<>h!hh"NhNubh<62>)<29><>}<7D>(h<05>:func:`mlx.core.value_and_grad`<60>h]<5D>hF)<29><>}<7D>(hj<>h]<5D>h0<68>mlx.core.value_and_grad()<29><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h j<>h!hh"NhNubah}<7D>(h]<5D>h]<5D>(h<><68>py<70><79>py-func<6E>eh]<5D>h]<5D>h]<5D>uhhEh j<>ubah}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D><>refdoc<6F>h<EFBFBD><68> refdomain<69>j<EFBFBD><00>reftype<70><65>func<6E><63> refexplicit<69><74><EFBFBD>refwarn<72><6E>h<EFBFBD>Nh<4E>NhÌmlx.core.value_and_grad<61>uhh<>h"h#hK<>h j<>ubh0<68>.<2E><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h j<>h!hh"NhNubeh}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D>uhh;h"h#hK<>h j<>ubah}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D>uhj<>h h&h!hh"h#hNubh<)<29><>}<7D>(h<05><>The model should train to a decent accuracy (about 95%) after just a few passes
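To make the distinction concrete: ``mlx.core.value_and_grad`` differentiates
a function with respect to its array arguments, while ``mlx.nn.value_and_grad``
(used above) differentiates with respect to the model's trainable parameters.
A minimal sketch of the former:

.. code-block:: python

  # Minimal sketch: gradient of f with respect to its array argument w.
  def f(w):
      return mx.sum(w * w)

  value, grad = mx.value_and_grad(f)(mx.array([1.0, 2.0]))
  print(value.item())  # 5.0
  print(grad)          # the gradient 2 * w, i.e. [2.0, 4.0]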
The model should train to a decent accuracy (about 95%) after just a few passes
over the training set. The `full example
<https://github.com/ml-explore/mlx-examples/tree/main/mnist>`_
is available in the MLX GitHub repo.