<EFBFBD><05><>e<00>sphinx.addnodes<65><73>document<6E><74><EFBFBD>)<29><>}<7D>(<28> rawsource<63><65><00><>children<65>]<5D>(<28>docutils.nodes<65><73>target<65><74><EFBFBD>)<29><>}<7D>(h<05>.. _function_transforms:<3A>h]<5D><>
attributes<EFBFBD>}<7D>(<28>ids<64>]<5D><>classes<65>]<5D><>names<65>]<5D><>dupnames<65>]<5D><>backrefs<66>]<5D><>refid<69><64>function-transforms<6D>u<EFBFBD>tagname<6D>h
<EFBFBD>line<6E>K<01>parent<6E>h<03> _document<6E>h<03>source<63><65>B/Users/awnihannun/repos/mlx/docs/src/usage/function_transforms.rst<73>ubh <09>section<6F><6E><EFBFBD>)<29><>}<7D>(hhh]<5D>(h <09>title<6C><65><EFBFBD>)<29><>}<7D>(h<05>Function Transforms<6D>h]<5D>h <09>Text<78><74><EFBFBD><EFBFBD>Function Transforms<6D><73><EFBFBD><EFBFBD><EFBFBD>}<7D>(h h+h!hh"NhNubah}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D>uhh)h h&h!hh"h#hKubh <09> paragraph<70><68><EFBFBD>)<29><>}<7D>(h<05><>MLX uses composable function transformations for automatic differentiation and
vectorization. The key idea behind composable function transformations is that
every transformation returns a function which can be further transformed.<2E>h]<5D>h0<68><30>MLX uses composable function transformations for automatic differentiation and
vectorization. The key idea behind composable function transformations is that
every transformation returns a function which can be further transformed.<2E><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h h=h!hh"NhNubah}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D>uhh;h"h#hKh h&h!hubh<)<29><>}<7D>(h<05>Here is a simple example:<3A>h]<5D>h0<68>Here is a simple example:<3A><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h hKh!hh"NhNubah}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D>uhh;h"h#hK h h&h!hubh <09> literal_block<63><6B><EFBFBD>)<29><>}<7D>(h<05><>>>> dfdx = mx.grad(mx.sin)
>>> dfdx(mx.array(mx.pi))
array(-1, dtype=float32)
>>> mx.cos(mx.array(mx.pi))
array(-1, dtype=float32)<29>h]<5D>h0<68><30>>>> dfdx = mx.grad(mx.sin)
>>> dfdx(mx.array(mx.pi))
array(-1, dtype=float32)
>>> mx.cos(mx.array(mx.pi))
array(-1, dtype=float32)<29><><EFBFBD><EFBFBD><EFBFBD>}<7D>h h[sbah}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D><> xml:space<63><65>preserve<76><65>force<63><65><EFBFBD>language<67><65>shell<6C><6C>highlight_args<67>}<7D>uhhYh"h#hKh h&h!hubh<)<29><>}<7D>(h<05><>The output of :func:`grad` on :func:`sin` is simply another function. In this
case it is the gradient of the sine function which is exactly the cosine
function. To get the second derivative you can do:<3A>h]<5D>(h0<68>The output of <20><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h hph!hh"NhNubh<00> pending_xref<65><66><EFBFBD>)<29><>}<7D>(h<05> :func:`grad`<60>h]<5D>h <09>literal<61><6C><EFBFBD>)<29><>}<7D>(hh|h]<5D>h0<68>grad()<29><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h h<>h!hh"NhNubah}<7D>(h]<5D>h]<5D>(<28>xref<65><66>py<70><79>py-func<6E>eh]<5D>h]<5D>h]<5D>uhh~h hzubah}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D><>refdoc<6F><63>usage/function_transforms<6D><73> refdomain<69>h<EFBFBD><68>reftype<70><65>func<6E><63> refexplicit<69><74><EFBFBD>refwarn<72><6E><EFBFBD> py:module<6C><65>mlx.core<72><65>py:class<73>N<EFBFBD> reftarget<65><74>grad<61>uhhxh"h#hKh hpubh0<68> on <20><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h hph!hh"NhNubhy)<29><>}<7D>(h<05> :func:`sin`<60>h]<5D>h)<29><>}<7D>(hh<>h]<5D>h0<68>sin()<29><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h h<>h!hh"NhNubah}<7D>(h]<5D>h]<5D>(h<><68>py<70><79>py-func<6E>eh]<5D>h]<5D>h]<5D>uhh~h h<>ubah}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D><>refdoc<6F>h<EFBFBD><68> refdomain<69>h<EFBFBD><68>reftype<70><65>func<6E><63> refexplicit<69><74><EFBFBD>refwarn<72><6E>h<EFBFBD>h<EFBFBD>h<EFBFBD>Nh<4E><68>sin<69>uhhxh"h#hKh hpubh0<68><30> is simply another function. In this
case it is the gradient of the sine function which is exactly the cosine
function. To get the second derivative you can do:<3A><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h hph!hh"NhNubeh}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D>uhh;h"h#hKh h&h!hubhZ)<29><>}<7D>(h<05><>>>> d2fdx2 = mx.grad(mx.grad(mx.sin))
>>> d2fdx2(mx.array(mx.pi / 2))
array(-1, dtype=float32)
>>> mx.sin(mx.array(mx.pi / 2))
array(1, dtype=float32)<29>h]<5D>h0<68><30>>>> d2fdx2 = mx.grad(mx.grad(mx.sin))
>>> d2fdx2(mx.array(mx.pi / 2))
array(-1, dtype=float32)
>>> mx.sin(mx.array(mx.pi / 2))
array(1, dtype=float32)<29><><EFBFBD><EFBFBD><EFBFBD>}<7D>h h<>sbah}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D>hihjhk<68>hl<68>shell<6C>hn}<7D>uhhYh"h#hKh h&h!hubh<)<29><>}<7D>(h<05>iUsing :func:`grad` on the output of :func:`grad` is always ok. You keep
getting higher order derivatives.<2E>h]<5D>(h0<68>Using <20><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h h<>h!hh"NhNubhy)<29><>}<7D>(h<05> :func:`grad`<60>h]<5D>h)<29><>}<7D>(hh<>h]<5D>h0<68>grad()<29><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h h<>h!hh"NhNubah}<7D>(h]<5D>h]<5D>(h<><68>py<70><79>py-func<6E>eh]<5D>h]<5D>h]<5D>uhh~h h<>ubah}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D><>refdoc<6F>h<EFBFBD><68> refdomain<69>h<EFBFBD><68>reftype<70><65>func<6E><63> refexplicit<69><74><EFBFBD>refwarn<72><6E>h<EFBFBD>h<EFBFBD>h<EFBFBD>Nh<4E><68>grad<61>uhhxh"h#hK#h h<>ubh0<68> on the output of <20><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h h<>h!hh"NhNubhy)<29><>}<7D>(h<05> :func:`grad`<60>h]<5D>h)<29><>}<7D>(hjh]<5D>h0<68>grad()<29><><EFBFBD><EFBFBD><EFBFBD>}<7D>(h jh!hh"NhNubah}<7D>(h]<5D>h]<5D>(h<><68>py<70><79>py-func<6E>eh]<5D>h]<5D>h]<5D>uhh~h j ubah}<7D>(h]<5D>h]<5D>h]<5D>h]<5D>h]<5D><>refdoc<6F>h<EFBFBD><68> refdomain<69>j<00>reftype<70><65>func<6E><63> refexplicit<69><74><EFBFBD>refwarn<72><6E>h<EFBFBD>h<EFBFBD>h<EFBFBD>Nh<4E><68>grad<61>uhhxh"h#hK#h h<>ubh0<68>9 is always ok. You keep
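For example, composing :func:`grad` three times yields the third derivative of
sine, which is the negative cosine. A quick sketch continuing the session above
(the output shown is what the math predicts):

.. code-block:: shell

   >>> d3fdx3 = mx.grad(mx.grad(mx.grad(mx.sin)))
   >>> d3fdx3(mx.array(0.0))
   array(-1, dtype=float32)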
Any of the MLX function transformations can be composed in any order to any
depth. To see the complete list of function transformations check out the
:ref:`API documentation <transforms>`. See the following sections for more
information on :ref:`automatic differentiation <auto diff>` and
:ref:`automatic vectorization <vmap>`.
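Composition also works across different kinds of transformations. As a small
sketch (not part of the original example set, and assuming ``import mlx.core
as mx`` as elsewhere in these docs), :func:`vmap` can map the derivative of
:func:`sin` over a batch of points:

.. code-block:: python

   # grad(sin) maps a scalar to a scalar; vmap applies it across
   # the leading axis of a batch of inputs.
   vgrad_sin = mx.vmap(mx.grad(mx.sin))
   points = mx.array([0.0, mx.pi / 2, mx.pi])
   # Expect the cosine at each point, roughly [1, 0, -1]
   print(vgrad_sin(points))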
Automatic Differentiation
-------------------------

.. _auto diff:

Automatic differentiation in MLX works on functions rather than on implicit
graphs.
.. note::

   If you are coming to MLX from PyTorch, you no longer need functions like
   ``backward``, ``zero_grad``, and ``detach``, or properties like
   ``requires_grad``.
The most basic example is taking the gradient of a scalar-valued function as we
saw above. You can use the :func:`grad` and :func:`value_and_grad` functions to
compute gradients of more complex functions. By default these functions compute
the gradient with respect to the first argument:
.. code-block:: python

   def loss_fn(w, x, y):
       return mx.mean(mx.square(w * x - y))

   w = mx.array(1.0)
   x = mx.array([0.5, -0.5])
   y = mx.array([1.5, -1.5])

   # Computes the gradient of loss_fn with respect to w:
   grad_fn = mx.grad(loss_fn)
   dloss_dw = grad_fn(w, x, y)
   # Prints array(-1, dtype=float32)
   print(dloss_dw)

   # To get the gradient with respect to x we can do:
   grad_fn = mx.grad(loss_fn, argnums=1)
   dloss_dx = grad_fn(w, x, y)
   # Prints array([-1, 1], dtype=float32)
   print(dloss_dx)

One way to get the loss and gradient is to call ``loss_fn`` followed by
``grad_fn``, but this can result in a lot of redundant work. Instead, you
should use :func:`value_and_grad`. Continuing the above example:
.. code-block:: python

   # Computes the value of loss_fn and its gradient with respect to w
   # in a single pass:
   loss_and_grad_fn = mx.value_and_grad(loss_fn)
   loss, dloss_dw = loss_and_grad_fn(w, x, y)
   # Prints array(1, dtype=float32)
   print(loss)
   # Prints array(-1, dtype=float32)
   print(dloss_dw)
You can also take the gradient with respect to arbitrarily nested Python
containers of arrays (specifically any of :obj:`list`, :obj:`tuple`, or
:obj:`dict`).

Suppose we wanted a weight and a bias parameter in the above example. A nice
way to do that is the following:
.. code-block:: python

   def loss_fn(params, x, y):
       w, b = params["weight"], params["bias"]
       h = w * x + b
       return mx.mean(mx.square(h - y))

   params = {"weight": mx.array(1.0), "bias": mx.array(0.0)}
   x = mx.array([0.5, -0.5])
   y = mx.array([1.5, -1.5])

   # Computes the gradient of loss_fn with respect to both the
   # weight and bias:
   grad_fn = mx.grad(loss_fn)
   grads = grad_fn(params, x, y)
   # Prints
   # {'weight': array(-1, dtype=float32), 'bias': array(0, dtype=float32)}
   print(grads)
Notice the tree structure of the parameters is preserved in the gradients.

In some cases you may want to stop gradients from propagating through a part
of the function. You can use :func:`stop_gradient` for that.
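For example, here is a small sketch (not from the original page) that treats
one term of a function as a constant during differentiation:

.. code-block:: python

   def fn(x):
       # No gradient flows through the squared term below
       return x + mx.stop_gradient(x ** 2)

   # d/dx of (x + constant) is 1, so the squared term contributes nothing.
   # Expect array(1, dtype=float32) instead of 1 + 2x = 7.
   print(mx.grad(fn)(mx.array(3.0)))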
Automatic Vectorization
-----------------------

.. _vmap:

Use :func:`vmap` to automate vectorizing complex functions. Here we'll go
through a basic and contrived example for the sake of clarity, but :func:`vmap`
can be quite powerful for more complex functions which are difficult to
optimize by hand.
.. warning::

   Some operations are not yet supported with :func:`vmap`. If you encounter an
   error like ``ValueError: Primitive's vmap not implemented.``, file an `issue
   <https://github.com/ml-explore/mlx/issues>`_ and include your function.
   We will prioritize including it.
A naive way to add the elements from two sets of vectors is with a loop:

.. code-block:: python

   xs = mx.random.uniform(shape=(4096, 100))
   ys = mx.random.uniform(shape=(100, 4096))

   def naive_add(xs, ys):
       # Add column i of xs (shape (4096,)) to row i of ys (shape (4096,))
       return [xs[:, i] + ys[i] for i in range(xs.shape[1])]

Instead you can use :func:`vmap` to automatically vectorize the addition:
.. code-block:: python

   # Vectorize over the second dimension of x and the
   # first dimension of y
   vmap_add = mx.vmap(lambda x, y: x + y, in_axes=(1, 0))
The ``in_axes`` parameter can be used to specify which dimensions of the
corresponding input to vectorize over. Similarly, use ``out_axes`` to specify
where the vectorized axes should be in the outputs.
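For example, here is a sketch (reusing ``xs`` and ``ys`` from above;
``vmap_add_t`` is just an illustrative name) that places the mapped axis last
in the output with ``out_axes``:

.. code-block:: python

   # Same mapped addition, but put the vectorized axis last in the output.
   vmap_add_t = mx.vmap(lambda x, y: x + y, in_axes=(1, 0), out_axes=1)
   # With the default out_axes=0 the result has shape (100, 4096);
   # out_axes=1 gives (4096, 100).
   print(vmap_add_t(xs, ys).shape)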
Let's time these two different versions:

.. code-block:: python

   import timeit

   print(timeit.timeit(lambda: mx.eval(naive_add(xs, ys)), number=100))
   print(timeit.timeit(lambda: mx.eval(vmap_add(xs, ys)), number=100))
On an M1 Max the naive version takes in total ``0.390`` seconds whereas the
vectorized version takes only ``0.025`` seconds, more than ten times faster.

Of course, this operation is quite contrived. A better approach is to simply do
``xs + ys.T``, but for more complex functions :func:`vmap` can be quite handy.
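As one more sketch (not from the original page), :func:`vmap` lets you write a
function for a single example and batch it after the fact:

.. code-block:: python

   # A function written for one example: project a single 100-vector.
   W = mx.random.uniform(shape=(10, 100))

   def project(v):
       return W @ v

   # Batch over the leading axis of a (32, 100) input without rewriting project.
   batched_project = mx.vmap(project)
   vs = mx.random.uniform(shape=(32, 100))
   # Expect shape (32, 10)
   print(batched_project(vs).shape)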