mlx/mlx/backend/metal
2025-04-10 19:41:27 -07:00
jit Custom logsumexp (#2028) 2025-03-31 07:36:55 -07:00
kernels fix fft bug (#2062) 2025-04-10 19:41:27 -07:00
allocator.cpp only add to residency set once (#2049) 2025-04-06 17:38:25 -07:00
allocator.h wire cache (#2006) 2025-03-25 18:54:01 -07:00
binary.cpp redesign for faster cpu/gpu synch (#1869) 2025-03-06 19:23:38 -08:00
binary.h Fixes for large arrays with a few ops (#1299) 2024-07-30 17:18:39 -07:00
CMakeLists.txt Custom logsumexp (#2028) 2025-03-31 07:36:55 -07:00
compiled.cpp redesign for faster cpu/gpu synch (#1869) 2025-03-06 19:23:38 -08:00
conv.cpp Depthwise Conv2D optimization (#2036) 2025-04-03 09:42:04 -07:00
copy.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
copy.h Dynamic slicing (#1741) 2025-01-07 14:02:16 -08:00
custom_kernel.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
device.cpp Do not load the default lib if another is requested (#2055) 2025-04-09 13:31:38 -07:00
device.h Do not load the default lib if another is requested (#2055) 2025-04-09 13:31:38 -07:00
distributed.cpp redesign for faster cpu/gpu synch (#1869) 2025-03-06 19:23:38 -08:00
event.cpp Remove Event::Signal() (#2052) 2025-04-08 06:20:27 -07:00
fence.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
fft.cpp fix fft bug (#2062) 2025-04-10 19:41:27 -07:00
hadamard.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
indexing.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
jit_kernels.cpp Custom logsumexp (#2028) 2025-03-31 07:36:55 -07:00
kernels.h Custom logsumexp (#2028) 2025-03-31 07:36:55 -07:00
logsumexp.cpp Custom logsumexp (#2028) 2025-03-31 07:36:55 -07:00
make_compiled_preamble.sh Dispatch bf16 at run time when using the JIT (#1584) 2024-11-15 16:54:36 -08:00
matmul.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
matmul.h Use int64 stride everywhere (#1671) 2024-12-09 11:09:02 -08:00
metal_impl.h redesign for faster cpu/gpu synch (#1869) 2025-03-06 19:23:38 -08:00
metal.cpp Fix multistream GPU deadlock (#1969) 2025-03-20 07:19:47 -07:00
metal.h move memory APIs into top level mlx.core (#1982) 2025-03-21 07:25:12 -07:00
nojit_kernels.cpp Custom logsumexp (#2028) 2025-03-31 07:36:55 -07:00
normalization.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
primitives.cpp Distributed layers (#1270) 2025-03-21 13:52:17 -07:00
quantized.cpp tune quant dispatch (#2031) 2025-04-02 20:05:54 -07:00
reduce.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
reduce.h Reductions update (#1351) 2024-11-04 22:25:16 -08:00
resident.cpp Only request residency once (#2051) 2025-04-07 10:47:51 -07:00
resident.h Wired (#1510) 2024-10-25 09:35:33 -07:00
rope.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
scaled_dot_product_attention.cpp Fix causal mask sdpa vec (#2053) 2025-04-08 09:11:23 -07:00
scan.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
slicing.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
slicing.h More shape type (#1705) 2024-12-19 08:08:20 -08:00
softmax.cpp Custom logsumexp (#2028) 2025-03-31 07:36:55 -07:00
sort.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
ternary.cpp redesign for faster cpu/gpu synch (#1869) 2025-03-06 19:23:38 -08:00
ternary.h Add some internal GPU apis (#1177) 2024-06-04 09:24:26 -07:00
unary.cpp fix malloc or wait deadlock (#1976) 2025-03-20 16:48:43 -07:00
unary.h Add some internal GPU apis (#1177) 2024-06-04 09:24:26 -07:00
utils.cpp Fp64 on the CPU (#1843) 2025-02-07 15:52:22 -08:00
utils.h redesign for faster cpu/gpu synch (#1869) 2025-03-06 19:23:38 -08:00