mlx/mlx/backend/metal
2025-03-28 14:31:53 -07:00
jit scatter axis + gather axis primitives (#1813) 2025-01-31 20:48:08 -08:00
kernels use minimum deployment target (#2016) 2025-03-28 14:31:53 -07:00
allocator.cpp wire cache (#2006) 2025-03-25 18:54:01 -07:00
allocator.h wire cache (#2006) 2025-03-25 18:54:01 -07:00
binary.cpp redesign for faster cpu/gpu synch (#1869) 2025-03-06 19:23:38 -08:00
binary.h Fixes for large arrays with a few ops (#1299) 2024-07-30 17:18:39 -07:00
CMakeLists.txt MinGW support (#1806) 2025-02-01 12:40:06 -08:00
compiled.cpp redesign for faster cpu/gpu synch (#1869) 2025-03-06 19:23:38 -08:00
conv.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
copy.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
copy.h Dynamic slicing (#1741) 2025-01-07 14:02:16 -08:00
custom_kernel.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
device.cpp Fix multistream GPU deadlock (#1969) 2025-03-20 07:19:47 -07:00
device.h redesign for faster cpu/gpu synch (#1869) 2025-03-06 19:23:38 -08:00
distributed.cpp redesign for faster cpu/gpu synch (#1869) 2025-03-06 19:23:38 -08:00
event.cpp redesign for faster cpu/gpu synch (#1869) 2025-03-06 19:23:38 -08:00
fence.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
fft.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
hadamard.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
indexing.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
jit_kernels.cpp fix copy for large arrays (#1953) 2025-03-10 15:04:25 -07:00
kernels.h Add missing #pragma once (#1838) 2025-02-06 11:11:22 -08:00
make_compiled_preamble.sh Dispatch bf16 at run time when using the JIT (#1584) 2024-11-15 16:54:36 -08:00
matmul.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
matmul.h Use int64 stride everywhere (#1671) 2024-12-09 11:09:02 -08:00
metal_impl.h redesign for faster cpu/gpu synch (#1869) 2025-03-06 19:23:38 -08:00
metal.cpp Fix multistream GPU deadlock (#1969) 2025-03-20 07:19:47 -07:00
metal.h move memory APIs into top level mlx.core (#1982) 2025-03-21 07:25:12 -07:00
nojit_kernels.cpp Dynamic slicing (#1741) 2025-01-07 14:02:16 -08:00
normalization.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
primitives.cpp Distributed layers (#1270) 2025-03-21 13:52:17 -07:00
quantized.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
reduce.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
reduce.h Reductions update (#1351) 2024-11-04 22:25:16 -08:00
resident.cpp Fix some leaks and races (#1629) 2024-11-27 20:01:20 -08:00
resident.h Wired (#1510) 2024-10-25 09:35:33 -07:00
rope.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
scaled_dot_product_attention.cpp causal vector sdpa (#2018) 2025-03-28 12:36:13 -07:00
scan.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
slicing.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
slicing.h More shape type (#1705) 2024-12-19 08:08:20 -08:00
softmax.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
sort.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
ternary.cpp redesign for faster cpu/gpu synch (#1869) 2025-03-06 19:23:38 -08:00
ternary.h Add some internal GPU apis (#1177) 2024-06-04 09:24:26 -07:00
unary.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
unary.h Add some internal GPU apis (#1177) 2024-06-04 09:24:26 -07:00
utils.cpp Fp64 on the CPU (#1843) 2025-02-07 15:52:22 -08:00
utils.h redesign for faster cpu/gpu synch (#1869) 2025-03-06 19:23:38 -08:00