mlx/metal at 2b8ace6a039be663cf4aede1c0b7a713e2dd18cc - mlx

mirror of https://github.com/ml-explore/mlx.git synced 2025-07-18 23:21:16 +08:00

Awni Hannun 881615b072 Faster metal compiled kernels + some fixes (#1486) * bump mac tests to use py39 * work per thread for compiled kernels * fix for large arrays * fix	2024-10-14 12:45:38 -07:00
jit	More fixes for arrays with large sizes (#1405)	2024-09-17 12:46:31 -07:00
kernels	Faster metal compiled kernels + some fixes (#1486)	2024-10-14 12:45:38 -07:00
allocator.cpp	Allow querying the allocator for the buffer size (#1404)	2024-09-11 21:02:16 -07:00
allocator.h	Allow querying the allocator for the buffer size (#1404)	2024-09-11 21:02:16 -07:00
binary.cpp	Faster metal compiled kernels + some fixes (#1486)	2024-10-14 12:45:38 -07:00
binary.h	Fixes for large arrays with a few ops (#1299)	2024-07-30 17:18:39 -07:00
CMakeLists.txt	Chore: add pre-commit hook for cmake (#1362)	2024-09-16 12:53:01 -07:00
compiled.cpp	Faster metal compiled kernels + some fixes (#1486)	2024-10-14 12:45:38 -07:00
conv.cpp	Conv grad with groups + bugfix (#1449)	2024-10-06 07:08:53 -07:00
copy.cpp	Faster metal compiled kernels + some fixes (#1486)	2024-10-14 12:45:38 -07:00
copy.h	Fix copying scalars by adding fill_gpu (#1402)	2024-09-09 15:54:08 -07:00
custom_kernel.cpp	Make the GPU device more thread safe (#1478)	2024-10-12 17:49:15 -07:00
device.cpp	Make the GPU device more thread safe (#1478)	2024-10-12 17:49:15 -07:00
device.h	Make the GPU device more thread safe (#1478)	2024-10-12 17:49:15 -07:00
distributed.cpp	Adds send/recv ops in distributed (#1366)	2024-08-26 23:01:37 -07:00
event.cpp	Fix array is_available race cases (#1468)	2024-10-07 19:13:50 -07:00
fft.cpp	Fix normalization check_input (#1452)	2024-10-03 13:26:56 -07:00
hadamard.cpp	Make the GPU device more thread safe (#1478)	2024-10-12 17:49:15 -07:00
indexing.cpp	Make the GPU device more thread safe (#1478)	2024-10-12 17:49:15 -07:00
jit_kernels.cpp	Faster metal compiled kernels + some fixes (#1486)	2024-10-14 12:45:38 -07:00
kernels.h	fix jit reduce (#1395)	2024-09-04 14:03:10 -07:00
make_compiled_preamble.sh	fix compiling with space in paths (#1332)	2024-08-15 16:39:24 -07:00
matmul.cpp	Conv grad with groups + bugfix (#1449)	2024-10-06 07:08:53 -07:00
matmul.h	Conv grad with groups + bugfix (#1449)	2024-10-06 07:08:53 -07:00
metal_impl.h	Add synchronize function (#1006)	2024-04-22 08:25:46 -07:00
metal.cpp	Fix array is_available race cases (#1468)	2024-10-07 19:13:50 -07:00
metal.h	Reset peak memory (#1074)	2024-05-03 17:12:51 -07:00
nojit_kernels.cpp	fix jit reduce (#1395)	2024-09-04 14:03:10 -07:00
normalization.cpp	Fix normalization check_input (#1452)	2024-10-03 13:26:56 -07:00
primitives.cpp	Avoid io timeout for large arrays (#1442)	2024-09-27 13:32:14 -07:00
quantized.cpp	Fix normalization check_input (#1452)	2024-10-03 13:26:56 -07:00
reduce.cpp	Fix normalization check_input (#1452)	2024-10-03 13:26:56 -07:00
reduce.h	Further reduction tuning (#1349)	2024-08-23 10:35:25 -07:00
rope.cpp	Xcode 160 (#1384)	2024-09-10 15:15:17 -07:00
scaled_dot_product_attention.cpp	Metal shaders for memory efficient self attention on large sequences (#964)	2024-06-03 09:16:19 -07:00
scan.cpp	Fix normalization check_input (#1452)	2024-10-03 13:26:56 -07:00
slicing.cpp	Fix copying scalars by adding fill_gpu (#1402)	2024-09-09 15:54:08 -07:00
slicing.h	Fix slice data size (#1394)	2024-09-04 19:10:43 -07:00
softmax.cpp	Fix normalization check_input (#1452)	2024-10-03 13:26:56 -07:00
sort.cpp	Fix normalization check_input (#1452)	2024-10-03 13:26:56 -07:00
ternary.cpp	Faster metal compiled kernels + some fixes (#1486)	2024-10-14 12:45:38 -07:00
ternary.h	Add some internal GPU apis (#1177)	2024-06-04 09:24:26 -07:00
unary.cpp	Faster metal compiled kernels + some fixes (#1486)	2024-10-14 12:45:38 -07:00
unary.h	Add some internal GPU apis (#1177)	2024-06-04 09:24:26 -07:00
utils.cpp	Add gemv masked to JIT plus some fixes (#1310)	2024-08-07 13:38:07 -07:00
utils.h	Add gemv masked to JIT plus some fixes (#1310)	2024-08-07 13:38:07 -07:00