Address more comments

2025-12-16 01:49:05 +08:00 · 2025-08-20 17:19:36 -07:00
parent d6b204b528
commit 6f608857db
3 changed files with 11 additions and 1 deletion
--- a/docs/src/python/cuda.rst
+++ b/docs/src/python/cuda.rst
@@ -0,0 +1,9 @@
+CUDA
+=====
+
+.. currentmodule:: mlx.core.cuda
+
+.. autosummary::
+  :toctree: _autosummary
+
+  is_available
--- a/docs/src/python/fast.rst
+++ b/docs/src/python/fast.rst
@@ -13,3 +13,4 @@ Fast
  rope
  scaled_dot_product_attention
  metal_kernel
+  cuda_kernel
--- a/python/src/fast.cpp
+++ b/python/src/fast.cpp
@@ -438,7 +438,7 @@ void init_fast(nb::module_& parent_module) {
              output_shapes (List[Sequence[int]]): The list of shapes for each output in ``output_names``.
              output_dtypes (List[Dtype]): The list of data types for each output in ``output_names``.
              grid (tuple[int, int, int]): 3-tuple specifying the grid to launch the kernel with.
-                For compatibility with :func:`metal_kernel` the grid is in threads and not in threadblocks.
+                For compatibility with :func:`metal_kernel` the grid is in threads and not in threadgroups.
              threadgroup (tuple[int, int, int]): 3-tuple specifying the threadgroup size to use.
              template (List[Tuple[str, Union[bool, int, Dtype]]], optional): Template arguments.
                  These will be added as template arguments to the kernel definition. Default: ``None``.
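
Note on the ``grid`` wording above: since the docstring says the grid is given in threads and not in threadgroups, the number of threadgroups per dimension follows from ``grid`` and ``threadgroup``. The sketch below is plain Python and calls no MLX API. The helper name ``threadgroups_per_grid`` is hypothetical, and rounding up when the grid is not a multiple of the threadgroup size is an assumption, not something this diff confirms.

```python
def threadgroups_per_grid(grid, threadgroup):
    """Threadgroups needed to cover `grid` threads in each dimension.

    Hypothetical helper for illustration only. Assumes a partially
    filled trailing threadgroup is rounded up (ceiling division).
    """
    return tuple((g + t - 1) // t for g, t in zip(grid, threadgroup))


# 1024 threads in x with 256 threads per threadgroup
print(threadgroups_per_grid((1024, 1, 1), (256, 1, 1)))  # (4, 1, 1)

# 1000 threads in x: the last threadgroup is only partially used
print(threadgroups_per_grid((1000, 1, 1), (256, 1, 1)))  # (4, 1, 1)
```

Under that assumption, a kernel launched with a grid that is not a multiple of the threadgroup size may want a bounds check on the thread index.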