mlx/mlx/backend/metal
Jagrit Digani 358e1fd6ab
Fused GEMM (#1123)
* Basic gemm working

* Update addmm

* Clear out steel_gemm and steel_addmm kernels

* Fuse and clear out gather gemm

* Update objc releases
2024-05-15 10:30:41 -07:00

| Name | Last commit | Date |
| --- | --- | --- |
| kernels | Fused GEMM (#1123) | 2024-05-15 10:30:41 -07:00 |
| mps | copyright + ack | 2023-11-30 11:12:53 -08:00 |
| allocator.cpp | Reset peak memory (#1074) | 2024-05-03 17:12:51 -07:00 |
| allocator.h | Reset peak memory (#1074) | 2024-05-03 17:12:51 -07:00 |
| CMakeLists.txt | Shared events for synchronization + async eval (#998) | 2024-04-17 06:16:02 -07:00 |
| compiled_preamble.h | Kernel generation (#614) | 2024-02-07 13:15:59 -08:00 |
| compiled.cpp | Split encoders in non-concurrent context with a max ops per encoder (#1085) | 2024-05-09 16:21:02 -07:00 |
| conv.cpp | Conv3d (#993) | 2024-05-11 06:15:02 -07:00 |
| copy.cpp | Split encoders in non-concurrent context with a max ops per encoder (#1085) | 2024-05-09 16:21:02 -07:00 |
| copy.h | Add a SliceUpdate op and primitive (#850) | 2024-03-20 10:39:25 -07:00 |
| device.cpp | Fused GEMM (#1123) | 2024-05-15 10:30:41 -07:00 |
| device.h | Split encoders in non-concurrent context with a max ops per encoder (#1085) | 2024-05-09 16:21:02 -07:00 |
| event.cpp | Shared events for synchronization + async eval (#998) | 2024-04-17 06:16:02 -07:00 |
| fft.cpp | Split encoders in non-concurrent context with a max ops per encoder (#1085) | 2024-05-09 16:21:02 -07:00 |
| indexing.cpp | Split encoders in non-concurrent context with a max ops per encoder (#1085) | 2024-05-09 16:21:02 -07:00 |
| make_compiled_preamble.sh | quote file name (#670) | 2024-02-11 10:33:30 -08:00 |
| matmul.cpp | Fused GEMM (#1123) | 2024-05-15 10:30:41 -07:00 |
| matmul.h | Add groups to Conv1d (#948) | 2024-04-27 06:24:57 -07:00 |
| metal_impl.h | Add synchronize function (#1006) | 2024-04-22 08:25:46 -07:00 |
| metal.cpp | Split encoders in non-concurrent context with a max ops per encoder (#1085) | 2024-05-09 16:21:02 -07:00 |
| metal.h | Reset peak memory (#1074) | 2024-05-03 17:12:51 -07:00 |
| normalization.cpp | Split encoders in non-concurrent context with a max ops per encoder (#1085) | 2024-05-09 16:21:02 -07:00 |
| primitives.cpp | Add conjugate operator (#1100) | 2024-05-10 07:22:20 -07:00 |
| quantized.cpp | Split encoders in non-concurrent context with a max ops per encoder (#1085) | 2024-05-09 16:21:02 -07:00 |
| reduce.cpp | Split encoders in non-concurrent context with a max ops per encoder (#1085) | 2024-05-09 16:21:02 -07:00 |
| reduce.h | Explicit barriers with concurrent dispatch (#977) | 2024-04-10 21:45:31 -07:00 |
| rope.cpp | Split encoders in non-concurrent context with a max ops per encoder (#1085) | 2024-05-09 16:21:02 -07:00 |
| scaled_dot_product_attention.cpp | Split encoders in non-concurrent context with a max ops per encoder (#1085) | 2024-05-09 16:21:02 -07:00 |
| scan.cpp | Split encoders in non-concurrent context with a max ops per encoder (#1085) | 2024-05-09 16:21:02 -07:00 |
| softmax.cpp | Split encoders in non-concurrent context with a max ops per encoder (#1085) | 2024-05-09 16:21:02 -07:00 |
| sort.cpp | Split encoders in non-concurrent context with a max ops per encoder (#1085) | 2024-05-09 16:21:02 -07:00 |
| utils.h | Metal FFT for powers of 2 up to 2048 (#915) | 2024-04-11 21:40:06 -07:00 |