mlx/mlx/backend/metal
Awni Hannun 3d405fb3b1
Add synchronize function (#1006)
* add synchronize function

* fix linux

* fix linux

* fix and fix docs

* fix test

* try synchronize in stream destroy

* synchronize works for both cpu and gpu
2024-04-22 08:25:46 -07:00
kernels Fix mask broadcasting bug and add relevant test (#1003) 2024-04-17 17:33:48 -07:00
mps copyright + ack 2023-11-30 11:12:53 -08:00
allocator.cpp Improve profiling with gpu tracing (#969) 2024-04-07 21:47:43 -07:00
allocator.h Some fixes in cache / thread safety (#777) 2024-03-05 13:30:50 -08:00
CMakeLists.txt Shared events for synchronization + async eval (#998) 2024-04-17 06:16:02 -07:00
compiled_preamble.h Kernel generation (#614) 2024-02-07 13:15:59 -08:00
compiled.cpp Explicit barriers with concurrent dispatch (#977) 2024-04-10 21:45:31 -07:00
conv.cpp Explicit barriers with concurrent dispatch (#977) 2024-04-10 21:45:31 -07:00
copy.cpp Explicit barriers with concurrent dispatch (#977) 2024-04-10 21:45:31 -07:00
copy.h Add a SliceUpdate op and primitive (#850) 2024-03-20 10:39:25 -07:00
device.cpp Shared events for synchronization + async eval (#998) 2024-04-17 06:16:02 -07:00
device.h No copy command encoder (#986) 2024-04-11 21:15:36 -07:00
event.cpp Shared events for synchronization + async eval (#998) 2024-04-17 06:16:02 -07:00
fft.cpp Metal FFT for powers of 2 up to 2048 (#915) 2024-04-11 21:40:06 -07:00
indexing.cpp Explicit barriers with concurrent dispatch (#977) 2024-04-10 21:45:31 -07:00
make_compiled_preamble.sh quote file name (#670) 2024-02-11 10:33:30 -08:00
matmul.cpp Fix mask broadcasting bug and add relevant test (#1003) 2024-04-17 17:33:48 -07:00
matmul.h No copy gems (#801) 2024-03-12 13:13:41 -07:00
metal_impl.h Add synchronize function (#1006) 2024-04-22 08:25:46 -07:00
metal.cpp Add synchronize function (#1006) 2024-04-22 08:25:46 -07:00
metal.h Add synchronize function (#1006) 2024-04-22 08:25:46 -07:00
normalization.cpp Explicit barriers with concurrent dispatch (#977) 2024-04-10 21:45:31 -07:00
primitives.cpp Explicit barriers with concurrent dispatch (#977) 2024-04-10 21:45:31 -07:00
quantized.cpp Explicit barriers with concurrent dispatch (#977) 2024-04-10 21:45:31 -07:00
reduce.cpp Explicit barriers with concurrent dispatch (#977) 2024-04-10 21:45:31 -07:00
reduce.h Explicit barriers with concurrent dispatch (#977) 2024-04-10 21:45:31 -07:00
rope.cpp Explicit barriers with concurrent dispatch (#977) 2024-04-10 21:45:31 -07:00
scaled_dot_product_attention.cpp Explicit barriers with concurrent dispatch (#977) 2024-04-10 21:45:31 -07:00
scan.cpp Explicit barriers with concurrent dispatch (#977) 2024-04-10 21:45:31 -07:00
softmax.cpp Explicit barriers with concurrent dispatch (#977) 2024-04-10 21:45:31 -07:00
sort.cpp Explicit barriers with concurrent dispatch (#977) 2024-04-10 21:45:31 -07:00
utils.h Metal FFT for powers of 2 up to 2048 (#915) 2024-04-11 21:40:06 -07:00