mlx/mlx/backend/metal
2025-05-13 20:19:54 -07:00
jit Gather mm new kernel and small refactoring (#2040) 2025-04-14 16:37:36 -07:00
kernels Fix typo in row_reduce_small (#2179) 2025-05-13 20:19:54 -07:00
allocator.cpp Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
allocator.h wire cache (#2006) 2025-03-25 18:54:01 -07:00
binary.cpp fix bw for elementwise ops (#2151) 2025-05-05 06:15:04 -07:00
binary.h Fixes for large arrays with a few ops (#1299) 2024-07-30 17:18:39 -07:00
CMakeLists.txt Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
compiled.cpp fix bw for elementwise ops (#2151) 2025-05-05 06:15:04 -07:00
conv.cpp fix: conv_general differences between gpu, cpu (#2070) 2025-05-09 10:26:52 -07:00
copy.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
custom_kernel.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
device.cpp Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
device.h Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
distributed.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
eval.cpp Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
event.cpp Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
fence.cpp fix input coherent kernel launch (#2153) 2025-05-05 17:30:50 -07:00
fft.cpp Fix fft for integer overflow (#2161) 2025-05-09 14:25:12 -07:00
hadamard.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
indexing.cpp Add remove_index utility (#2173) 2025-05-13 17:09:56 -07:00
jit_kernels.cpp Gather qmm batched kernel and refactoring of quantized (#2078) 2025-04-17 13:53:11 -07:00
kernels.h Gather qmm batched kernel and refactoring of quantized (#2078) 2025-04-17 13:53:11 -07:00
logsumexp.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
make_compiled_preamble.sh Dispatch bf16 at run time when using the JIT (#1584) 2024-11-15 16:54:36 -08:00
matmul.cpp Close a couple edge case bugs: hadamard and addmm on empty inputs (#2177) 2025-05-12 10:48:57 -07:00
matmul.h Use int64 stride everywhere (#1671) 2024-12-09 11:09:02 -08:00
metal.cpp Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
metal.h Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
no_metal.cpp Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
nojit_kernels.cpp Gather qmm batched kernel and refactoring of quantized (#2078) 2025-04-17 13:53:11 -07:00
normalization.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
primitives.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
quantized.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
reduce.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
reduce.h Reductions update (#1351) 2024-11-04 22:25:16 -08:00
resident.cpp Generalize gpu backend (#2138) 2025-04-30 09:08:17 -07:00
resident.h Wired (#1510) 2024-10-25 09:35:33 -07:00
rope.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
scaled_dot_product_attention.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
scan.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
slicing.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
softmax.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
sort.cpp Move common gpu primitives to backend/gpu (#2145) 2025-05-05 13:45:29 -07:00
ternary.cpp fix bw for elementwise ops (#2151) 2025-05-05 06:15:04 -07:00
ternary.h Add some internal GPU apis (#1177) 2024-06-04 09:24:26 -07:00
unary.cpp fix bw for elementwise ops (#2151) 2025-05-05 06:15:04 -07:00
unary.h Add some internal GPU apis (#1177) 2024-06-04 09:24:26 -07:00
utils.cpp Fp64 on the CPU (#1843) 2025-02-07 15:52:22 -08:00
utils.h fix bw for elementwise ops (#2151) 2025-05-05 06:15:04 -07:00