mlx/mlx/backend/metal/kernels
2025-03-28 14:31:53 -07:00
fft Feature complete Metal FFT (#1102) 2024-06-06 12:57:25 -07:00
jit Dispatch bf16 at run time when using the JIT (#1584) 2024-11-15 16:54:36 -08:00
metal_3_0 Dispatch bf16 at run time when using the JIT (#1584) 2024-11-15 16:54:36 -08:00
metal_3_1 Dispatch bf16 at run time when using the JIT (#1584) 2024-11-15 16:54:36 -08:00
reduction Use int64 stride everywhere (#1671) 2024-12-09 11:09:02 -08:00
steel Fix looping limit in causal attention (#1999) 2025-03-24 12:28:00 -07:00
arange.h More jitting (#1132) 2024-05-23 16:23:44 -07:00
arange.metal Dispatch bf16 at run time when using the JIT (#1584) 2024-11-15 16:54:36 -08:00
arg_reduce.metal Use int64 stride everywhere (#1671) 2024-12-09 11:09:02 -08:00
atomic.h Refactor reductions and fix scatter atomics for large sizes (#1300) 2024-08-22 16:03:31 -07:00
bf16_math.h Dispatch bf16 at run time when using the JIT (#1584) 2024-11-15 16:54:36 -08:00
binary_ops.h Fix complex power on Metal (#1460) 2024-10-06 19:58:30 -07:00
binary_two.h Use int64 stride everywhere (#1671) 2024-12-09 11:09:02 -08:00
binary_two.metal Allow no copy negative strides in as_strided and slice (#1688) 2024-12-12 08:59:45 -08:00
binary.h Use int64 stride everywhere (#1671) 2024-12-09 11:09:02 -08:00
binary.metal Allow no copy negative strides in as_strided and slice (#1688) 2024-12-12 08:59:45 -08:00
CMakeLists.txt use minimum deployment target (#2016) 2025-03-28 14:31:53 -07:00
complex.h Refactor reductions and fix scatter atomics for large sizes (#1300) 2024-08-22 16:03:31 -07:00
conv.metal Fix convs by reverting #1803 (#1882) 2025-02-18 14:36:34 -08:00
copy.h Dynamic slicing (#1741) 2025-01-07 14:02:16 -08:00
copy.metal Dynamic slicing (#1741) 2025-01-07 14:02:16 -08:00
defines.h Refactor reductions and fix scatter atomics for large sizes (#1300) 2024-08-22 16:03:31 -07:00
erf.h JIT compile option for binary minimization (#1091) 2024-05-22 12:57:13 -07:00
expm1f.h Fix overflow / underflow handling for expm1f (#1278) 2024-07-23 07:29:06 -07:00
fence.metal Faster synchronization Fence primitive (#1773) 2025-01-17 18:42:19 -08:00
fft.h Feature complete Metal FFT (#1102) 2024-06-06 12:57:25 -07:00
fft.metal Add Quantized Ops to the JIT (#1204) 2024-06-12 09:47:12 -07:00
gather_axis.h scatter axis + gather axis primitives (#1813) 2025-01-31 20:48:08 -08:00
gather.h Use int64 stride everywhere (#1671) 2024-12-09 11:09:02 -08:00
gemv_masked.h Use same accumulation precision in gemv as gemm (#1962) 2025-03-16 07:13:24 -07:00
gemv_masked.metal Use int64 stride everywhere (#1671) 2024-12-09 11:09:02 -08:00
gemv.metal enable complex gemm (#2017) 2025-03-28 10:45:13 -07:00
hadamard.h Fix bfloat16 Hadamard (#1283) 2024-07-23 14:54:43 -07:00
indexing.h Use int64 stride everywhere (#1671) 2024-12-09 11:09:02 -08:00
layer_norm.metal RMS norm without scaling (#1915) 2025-02-28 20:26:57 -08:00
quantized.h Affine quant always in fp32 (#1925) 2025-03-04 17:50:19 -08:00
quantized.metal 3 and 6 bit quantization (#1613) 2024-11-22 10:22:13 -08:00
random.metal Use int64 stride everywhere (#1671) 2024-12-09 11:09:02 -08:00
reduce_utils.h More jitting (#1132) 2024-05-23 16:23:44 -07:00
reduce.h Fix JIT reductions (#1373) 2024-08-28 16:39:11 -07:00
reduce.metal Allow no copy negative strides in as_strided and slice (#1688) 2024-12-12 08:59:45 -08:00
rms_norm.metal RMS norm without scaling (#1915) 2025-02-28 20:26:57 -08:00
rope.metal Dispatch bf16 at run time when using the JIT (#1584) 2024-11-15 16:54:36 -08:00
scaled_dot_product_attention.metal sdpa specialization for head dim 256 (#2007) 2025-03-27 19:31:25 -07:00
scan.h Working 64-bit scans (#1506) 2024-10-24 11:05:46 -07:00
scan.metal Working 64-bit scans (#1506) 2024-10-24 11:05:46 -07:00
scatter_axis.h scatter axis + gather axis primitives (#1813) 2025-01-31 20:48:08 -08:00
scatter.h Use int64 stride everywhere (#1671) 2024-12-09 11:09:02 -08:00
sdpa_vector.h causal vector sdpa (#2018) 2025-03-28 12:36:13 -07:00
softmax.h consistently handle all -inf in softmax (#1470) 2024-10-08 09:54:02 -07:00
softmax.metal Dispatch bf16 at run time when using the JIT (#1584) 2024-11-15 16:54:36 -08:00
sort.h faster sort (#1831) 2025-02-05 06:10:22 -08:00
sort.metal faster sort (#1831) 2025-02-05 06:10:22 -08:00
ternary_ops.h JIT compile option for binary minimization (#1091) 2024-05-22 12:57:13 -07:00
ternary.h Use int64 stride everywhere (#1671) 2024-12-09 11:09:02 -08:00
ternary.metal Fix nd ternary on GPU (#1746) 2025-01-03 11:52:17 -08:00
unary_ops.h Bitwise Inverse (#1862) 2025-02-13 08:44:14 -08:00
unary.h Use int64 stride everywhere (#1671) 2024-12-09 11:09:02 -08:00
unary.metal Bitwise Inverse (#1862) 2025-02-13 08:44:14 -08:00
utils.h Allow no copy negative strides in as_strided and slice (#1688) 2024-12-12 08:59:45 -08:00