docs

2025-12-16 01:49:05 +08:00 · 2023-12-17 13:23:03 -08:00
parent f75712551d
commit 0250e203f6
319 changed files with 39952 additions and 21579 deletions
--- a/docs/build/html/_sources/python/nn.rst
+++ b/docs/build/html/_sources/python/nn.rst
@@ -64,7 +64,6 @@ Quick Start with Neural Networks
    # gradient with respect to `mlp.trainable_parameters()`
    loss_and_grad = nn.value_and_grad(mlp, l2_loss)

-
 .. _module_class:

 The Module Class
@@ -86,20 +85,58 @@ name should not start with ``_``). It can be arbitrarily nested in other
 :meth:`Module.parameters` can be used to extract a nested dictionary with all
 the parameters of a module and its submodules.

-A :class:`Module` can also keep track of "frozen" parameters.
-:meth:`Module.trainable_parameters` returns only the subset of
-:meth:`Module.parameters` that is not frozen. When using
-:meth:`mlx.nn.value_and_grad` the gradients returned will be with respect to these
-trainable parameters.
+A :class:`Module` can also keep track of "frozen" parameters. See the
+:meth:`Module.freeze` method for more details. :meth:`mlx.nn.value_and_grad`
+the gradients returned will be with respect to these trainable parameters.

-Updating the parameters
+
+Updating the Parameters
 ^^^^^^^^^^^^^^^^^^^^^^^

 MLX modules allow accessing and updating individual parameters. However, most
 times we need to update large subsets of a module's parameters. This action is
 performed by :meth:`Module.update`.

-Value and grad
+
+Inspecting Modules
+^^^^^^^^^^^^^^^^^^
+
+The simplest way to see the model architecture is to print it. Following along with
+the above example, you can print the ``MLP`` with:
+
+.. code-block:: python
+
+  print(mlp)
+
+This will display:
+
+.. code-block:: shell
+
+  MLP(
+    (layers.0): Linear(input_dims=2, output_dims=128, bias=True)
+    (layers.1): Linear(input_dims=128, output_dims=128, bias=True)
+    (layers.2): Linear(input_dims=128, output_dims=10, bias=True)
+  )
+
+To get more detailed information on the arrays in a :class:`Module` you can use
+:func:`mlx.utils.tree_map` on the parameters. For example, to see the shapes of
+all the parameters in a :class:`Module` do:
+
+.. code-block:: python
+    
+   from mlx.utils import tree_map
+   shapes = tree_map(lambda p: p.shape, mlp.parameters())
+
+As another example, you can count the number of parameters in a :class:`Module`
+with:
+
+.. code-block:: python
+    
+   from mlx.utils import tree_flatten
+   num_params = sum(v.size for _, v in tree_flatten(mlp.parameters()))
+
+
+Value and Grad
 --------------

 Using a :class:`Module` does not preclude using MLX's high order function
@@ -133,62 +170,14 @@ In detail:
  :meth:`mlx.core.value_and_grad`

 .. autosummary::
+   :recursive:
   :toctree: _autosummary

   value_and_grad
+   Module

-Neural Network Layers
---------------------
+.. toctree::

-.. autosummary::
-   :toctree: _autosummary
-   :template: nn-module-template.rst
-
-   Embedding
-   ReLU
-   PReLU
-   GELU
-   SiLU
-   Step
-   SELU
-   Mish
-   Linear
-   Conv1d
-   Conv2d
-   LayerNorm
-   RMSNorm
-   GroupNorm
-   RoPE
-   MultiHeadAttention
-   Sequential
-
-Layers without parameters (e.g. activation functions) are also provided as
-simple functions.
-
-.. autosummary::
-   :toctree: _autosummary_functions
-   :template: nn-module-template.rst
-
-   gelu
-   gelu_approx
-   gelu_fast_approx
-   relu
-   prelu
-   silu
-   step
-   selu
-   mish
-
-Loss Functions
--------------
-
-.. autosummary::
-   :toctree: _autosummary_functions
-   :template: nn-module-template.rst
-
-   losses.cross_entropy
-   losses.binary_cross_entropy
-   losses.l1_loss
-   losses.mse_loss
-   losses.nll_loss
-   losses.kl_div_loss
+   nn/layers
+   nn/functions
+   nn/losses