Indexing Arrays#

For the most part, indexing an MLX array works the same as indexing a NumPy numpy.ndarray. See the NumPy documentation for more details on how that works.
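Since indexing follows NumPy semantics, a brief hedged sketch (mx.arange and the reshape method are used here only for illustration):

import mlx.core as mx

a = mx.arange(10)
a[3]        # integer indexing
a[-2]       # negative indices count from the end
a[2:8:2]    # slicing with a step
a[None]     # add a size-1 axis, as in NumPy

b = mx.arange(8).reshape(2, 2, 2)
b[:, :, 0]  # slice along the last axis
b[..., 0]   # Ellipsis behaves like NumPy's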
Create an identity matrix or a general diagonal matrix.
Create a square identity matrix.
Construct an array of ones.
Sample from the standard Gumbel distribution.
The values are sampled from a standard Gumbel distribution whose CDF is exp(-exp(-x)).
Generate normally distributed random numbers.
Generate random integers from the given interval.
The values are sampled with equal probability from the integers in the half-open interval [low, high). The lower and upper bounds can be scalars or arrays and must be broadcastable to the output shape.
Generate values from a truncated normal distribution.
The values are sampled from the truncated normal distribution on the domain (lower, upper). The bounds lower and upper can be scalars or arrays and must be broadcastable to the output shape.
Generate uniformly distributed random numbers.
The values are sampled uniformly in the half-open interval [low, high). The lower and upper bounds can be scalars or arrays and must be broadcastable to the output shape.
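A hedged sketch exercising the sampling routines summarized above (the function names under mx.random are my reading of these summaries; shapes and bounds are arbitrary):

import mlx.core as mx

mx.random.seed(0)                                       # reproducible draws
g = mx.random.gumbel(shape=(3,))                        # standard Gumbel samples
n = mx.random.normal(shape=(2, 2))                      # standard normal samples
i = mx.random.randint(0, 10, shape=(3,))                # integers in [0, 10)
t = mx.random.truncated_normal(-1.0, 1.0, shape=(3,))   # normal restricted to (-1, 1)
u = mx.random.uniform(low=0.0, high=1.0, shape=(3,))    # uniform in [0, 1)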
Construct an array of zeros.
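The construction helpers summarized above (identity and diagonal matrices, ones, zeros) correspond, in my reading of these summaries, to mx.eye, mx.identity, mx.ones, and mx.zeros; a hedged sketch:

import mlx.core as mx

mx.eye(3)         # 3x3 identity; a k offset gives a general diagonal matrix
mx.eye(3, k=1)    # ones on the first superdiagonal
mx.identity(3)    # square identity matrix
mx.ones((2, 3))   # 2x3 array of ones
mx.zeros((2, 3))  # 2x3 array of zeros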
Base class for building neural networks with MLX.

All the layers provided in mlx.nn.layers subclass this class and your models should do the same.

A Module can contain other Module instances or mlx.core.array instances in arbitrary nesting of Python lists or dicts. The Module then allows recursively extracting all the mlx.core.array instances using mlx.nn.Module.parameters().

In addition, the Module has the concept of trainable and non-trainable parameters (called "frozen"). When using mlx.nn.value_and_grad() the gradients are returned only with respect to the trainable parameters. All arrays in a module are trainable unless they are added to the "frozen" set by calling freeze().

import mlx.core as mx
import mlx.nn as nn

class MyMLP(nn.Module):
    def __init__(self, in_dims: int, out_dims: int, hidden_dims: int = 16):
        super().__init__()

        self.in_proj = nn.Linear(in_dims, hidden_dims)
        self.out_proj = nn.Linear(hidden_dims, out_dims)

    def __call__(self, x):
        x = self.in_proj(x)
        x = mx.maximum(x, 0)
        return self.out_proj(x)

model = MyMLP(2, 1)

# All the model parameters are created but since MLX is lazy by
# default, they are not evaluated yet. Calling `mx.eval` actually
# allocates memory and initializes the parameters.
mx.eval(model.parameters())

# Setting a parameter to a new value is as simple as accessing that
# parameter and assigning a new array to it.
model.in_proj.weight = model.in_proj.weight * 2
mx.eval(model.parameters())
Methods

Should be called by the subclasses of Module.
Map all the parameters using the provided mapping function.
Apply a function to all the modules in this instance (including this instance).
Return the direct descendants of this Module instance.
Recursively filter the contents of the module using the provided filter function.
Freeze the Module's parameters or some of them.
Create a new dictionary with keys from iterable and values set to value.
Return the value for key if key is in the dictionary, else default.
Return the submodules that do not contain other modules.
Load and update the model's weights from a .npz file.
Return a list with all the modules in this instance.
Return a list with all the modules in this instance and their name with dot notation.
Recursively return all the mlx.core.array instances in this Module.
If key is not found, default is returned if given, otherwise KeyError is raised.
Remove and return a (key, value) pair as a 2-tuple.
Save the model's weights to a .npz file.
Insert key with a value of default if key is not in the dictionary.
Recursively return all the non-frozen mlx.core.array instances in this Module.
Unfreeze the Module's parameters or some of them.
Replace the parameters of this Module with the provided ones in the dict of dicts and lists.
Replace the child modules of this Module instance with the provided ones.
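As a hedged sketch of the freeze/unfreeze and trainable-parameter behaviour described above (reusing the MyMLP model from the earlier example; the loss function and data here are made up for illustration):

import mlx.core as mx
import mlx.nn as nn

model = MyMLP(2, 1)

def loss_fn(model, x, y):
    return nn.losses.mse_loss(model(x), y)

x = mx.random.normal((4, 2))
y = mx.random.normal((4, 1))

# mlx.nn.value_and_grad differentiates only with respect to the trainable parameters.
loss_and_grad = nn.value_and_grad(model, loss_fn)
loss, grads = loss_and_grad(model, x, y)

# Freezing removes parameters from the trainable set; unfreeze restores them.
model.freeze()
model.unfreeze()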
Computes the smooth L1 loss.

The smooth L1 loss is a variant of the L1 loss which replaces the absolute difference with a squared difference when the absolute difference is less than beta.

The formula for the smooth L1 loss, in the standard formulation this description implies, is:

l(x, y) = 0.5 * (x - y)^2       if |x - y| < beta
          |x - y| - 0.5 * beta  otherwise

Parameters:
predictions (array) – Predicted values.
targets (array) – Ground truth values.
beta (float, optional) – The threshold after which the loss changes from the squared to the absolute difference. Default: 1.0.
reduction (str, optional) – Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. Default: 'mean'.

Returns: The computed smooth L1 loss.
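A short hedged usage sketch (assuming the function is exposed as mlx.nn.losses.smooth_l1_loss with the parameters listed above; values are illustrative):

import mlx.core as mx
import mlx.nn as nn

predictions = mx.array([1.5, 0.0, -1.0])
targets = mx.array([1.0, 0.0, 1.0])

# Mean-reduced smooth L1 loss with the default beta of 1.0.
loss = nn.losses.smooth_l1_loss(predictions, targets, beta=1.0, reduction="mean")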
Quick Start Guide#

Import mlx.core and make an array:

>>> import mlx.core as mx
>>> a = mx.array([1, 2, 3, 4])
>>> a.shape
[4]
>>> a.dtype
int32
>>> b = mx.array([1.0, 2.0, 3.0, 4.0])
>>> b.dtype
float32
Operations in MLX are lazy. The outputs of MLX operations are not computed until they are needed. To force an array to be evaluated use eval(). Arrays will automatically be evaluated in a few cases. For example, inspecting a scalar with array.item(), printing an array, or converting an array from array to numpy.ndarray all automatically evaluate the array.

>>> c = a + b  # c not yet evaluated
>>> mx.eval(c)  # evaluates c
>>> c = a + b
>>> print(c)  # Also evaluates c
array([2, 4, 6, 8], dtype=float32)
>>> c = a + b
>>> import numpy as np
>>> np.array(c)  # Also evaluates c
array([2., 4., 6., 8.], dtype=float32)
MLX has standard function transformations like grad() and vmap(). Transformations can be composed arbitrarily. For example grad(vmap(grad(fn))) (or any other composition) is allowed.

>>> x = mx.array(0.0)
>>> mx.sin(x)
array(0, dtype=float32)
>>> mx.grad(mx.sin)(x)
array(1, dtype=float32)
>>> mx.grad(mx.grad(mx.sin))(x)
array(-0, dtype=float32)
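vmap() is only named above; here is a minimal hedged sketch of vectorizing a function over the leading axis (mx.square is used purely for illustration):

>>> f = mx.vmap(mx.square)  # vectorize mx.square over axis 0
>>> f(mx.array([1.0, 2.0, 3.0]))
array([1, 4, 9], dtype=float32)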
Other gradient transformations include vjp() for vector-Jacobian products and jvp() for Jacobian-vector products.

Use value_and_grad() to efficiently compute both a function's output and gradient with respect to the function's input.
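For instance, a minimal hedged sketch (the loss function here is made up for illustration):

>>> def loss_fn(x):
...     return (mx.sin(x) ** 2).sum()
...
>>> x = mx.array([0.5, 1.0])
>>> value, grad = mx.value_and_grad(loss_fn)(x)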
Unified Memory#

Apple silicon has a unified memory architecture. The CPU and GPU have direct access to the same memory pool. MLX is designed to take advantage of that.

Concretely, when you make an array in MLX you don't have to specify its location:

a = mx.random.normal((100,))
b = mx.random.normal((100,))

Both a and b live in unified memory.

In MLX, rather than moving arrays to devices, you specify the device when you run the operation. Any device can perform any operation on a and b without needing to move them from one memory location to another. For example:

mx.add(a, b, stream=mx.cpu)
mx.add(a, b, stream=mx.gpu)

In the above, both the CPU and the GPU will perform the same add operation. The operations can (and likely will) be run in parallel since there are no dependencies between them. See Using Streams for more information on the semantics of streams in MLX.

In the above add example, there are no dependencies between operations, so there is no possibility for race conditions. If there are dependencies, the MLX scheduler will automatically manage them. For example:

c = mx.add(a, b, stream=mx.cpu)
d = mx.add(a, c, stream=mx.gpu)

In the above case, the second add runs on the GPU but it depends on the output of the first add which is running on the CPU. MLX will automatically insert a dependency between the two streams so that the second add only starts executing after the first is complete and c is available.

Here is a more interesting (albeit slightly contrived) example of how unified memory can be helpful. Suppose we have the following computation:

def fun(a, b, d1, d2):
    x = mx.matmul(a, b, stream=d1)
    for _ in range(500):
        b = mx.exp(b, stream=d2)
    return x, b

which we want to run with the following arguments:

a = mx.random.uniform(shape=(4096, 512))
b = mx.random.uniform(shape=(512, 4))

The first matmul operation is a good fit for the GPU since it's more compute dense. The second sequence of operations is a better fit for the CPU, since they are very small and would probably be overhead bound on the GPU.

If we time the computation fully on the GPU, we get 2.8 milliseconds. But if we run the computation with d1=mx.gpu and d2=mx.cpu, then the time is only about 1.4 milliseconds, about twice as fast. These times were measured on an M1 Max.
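A hedged sketch of how one might reproduce that comparison (assuming the fun, a, and b defined above; exact timings will of course vary by machine):

import time

for d1, d2 in [(mx.gpu, mx.gpu), (mx.gpu, mx.cpu)]:
    tic = time.perf_counter()
    x, b_out = fun(a, b, d1, d2)
    mx.eval(x, b_out)  # force the lazy computation to actually run
    print(d1, d2, 1e3 * (time.perf_counter() - tic), "ms")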
Conversion to NumPy and Other Frameworks#

MLX array implements the Python Buffer Protocol. Let's convert an array to NumPy and back.

import mlx.core as mx
import numpy as np

a = mx.arange(3)
b = np.array(a)  # copy of a
c = mx.array(b)  # copy of b

Note

Since NumPy does not support bfloat16 arrays, you will need to convert to float16 or float32 first: np.array(a.astype(mx.float32)). Otherwise, you will receive an error like: Item size 2 for PEP 3118 buffer format string does not match the dtype V item size 0.

By default, NumPy copies data to a new array. This can be prevented by creating an array view:

a = mx.arange(3)
a_view = np.array(a, copy=False)
print(a_view.flags.owndata)  # False
a_view[0] = 1
print(a[0].item())  # 1

A NumPy array view is a normal NumPy array, except that it does not own its memory. This means writing to the view is reflected in the original array.

While this is quite powerful for preventing copies of arrays, it should be noted that external changes to the memory of arrays cannot be reflected in gradients.

Let's demonstrate this in an example:

def f(x):
    x_view = np.array(x, copy=False)
    x_view[:] *= x_view  # modify memory without telling mx
    return x.sum()

x = mx.array([3.0])
y, df = mx.value_and_grad(f)(x)
print("f(x) = x² =", y.item())    # 9.0
print("f'(x) = 2x !=", df.item())  # 1.0

The function f indirectly modifies the array x through a memory view. However, this modification is not reflected in the gradient, as seen in the last line outputting 1.0, representing the gradient of the sum operation alone. The squaring of x occurs externally to MLX, meaning that no gradient is incorporated.

It's important to note that a similar issue arises during array conversion and copying. For instance, a function defined as mx.array(np.array(x)**2).sum() would also result in an incorrect gradient, even though no in-place operations on MLX memory are executed.

PyTorch supports the buffer protocol, but it requires an explicit memoryview.

import mlx.core as mx
import torch

a = mx.arange(3)
b = torch.tensor(memoryview(a))
c = mx.array(b.numpy())

Conversion from PyTorch tensors back to MLX arrays must be done via intermediate NumPy arrays with numpy().

JAX fully supports the buffer protocol.

import mlx.core as mx
import jax.numpy as jnp

a = mx.arange(3)
b = jnp.array(a)
c = mx.array(b)

TensorFlow supports the buffer protocol, but it requires an explicit memoryview.

import mlx.core as mx
import tensorflow as tf

a = mx.arange(3)
b = tf.constant(memoryview(a))
c = mx.array(b)
Specifying the Stream#

All operations (including random number generation) take an optional keyword argument stream. The stream kwarg specifies which Stream the operation should run on. If the stream is unspecified then the operation is run on the default stream of the default device: mx.default_stream(mx.default_device()). The stream kwarg can also be a Device (e.g. stream=my_device) in which case the operation is run on the default stream of the provided device: mx.default_stream(my_device).