mlx/mlx/backend/metal
Awni Hannun 61d787726a
Fix view scalar bug segfault (#1603)
* fix view scalar bug

* one more fix
2024-11-19 10:54:05 -08:00
jit Faster indexing math in a few kernels (#1589) 2024-11-18 19:52:00 -08:00
kernels Faster indexing math in a few kernels (#1589) 2024-11-18 19:52:00 -08:00
allocator.cpp Fully wrap the command encoder (#1572) 2024-11-08 11:50:21 -08:00
allocator.h Wired (#1510) 2024-10-25 09:35:33 -07:00
binary.cpp Faster indexing math in a few kernels (#1589) 2024-11-18 19:52:00 -08:00
binary.h Fixes for large arrays with a few ops (#1299) 2024-07-30 17:18:39 -07:00
CMakeLists.txt Use osx deployment target to pick Metal version (#1595) 2024-11-18 19:16:49 -08:00
compiled.cpp Faster indexing math in a few kernels (#1589) 2024-11-18 19:52:00 -08:00
conv.cpp fix dispatch threads for a few kernels (#1594) 2024-11-18 08:35:25 -08:00
copy.cpp Faster indexing math in a few kernels (#1589) 2024-11-18 19:52:00 -08:00
copy.h Fix copying scalars by adding fill_gpu (#1402) 2024-09-09 15:54:08 -07:00
custom_kernel.cpp fix dispatch threads for a few kernels (#1594) 2024-11-18 08:35:25 -08:00
device.cpp Dispatch bf16 at run time when using the JIT (#1584) 2024-11-15 16:54:36 -08:00
device.h Fully wrap the command encoder (#1572) 2024-11-08 11:50:21 -08:00
distributed.cpp Adds send/recv ops in distributed (#1366) 2024-08-26 23:01:37 -07:00
event.cpp Fix array is_available race cases (#1468) 2024-10-07 19:13:50 -07:00
fft.cpp Fully wrap the command encoder (#1572) 2024-11-08 11:50:21 -08:00
hadamard.cpp Fully wrap the command encoder (#1572) 2024-11-08 11:50:21 -08:00
indexing.cpp Faster indexing math in a few kernels (#1589) 2024-11-18 19:52:00 -08:00
jit_kernels.cpp Faster indexing math in a few kernels (#1589) 2024-11-18 19:52:00 -08:00
kernels.h Reductions update (#1351) 2024-11-04 22:25:16 -08:00
make_compiled_preamble.sh Dispatch bf16 at run time when using the JIT (#1584) 2024-11-15 16:54:36 -08:00
matmul.cpp fix dispatch threads for a few kernels (#1594) 2024-11-18 08:35:25 -08:00
matmul.h Wired (#1510) 2024-10-25 09:35:33 -07:00
metal_impl.h Add synchronize function (#1006) 2024-04-22 08:25:46 -07:00
metal.cpp Bfs width limit (#1568) 2024-11-08 15:00:46 -08:00
metal.h Wired (#1510) 2024-10-25 09:35:33 -07:00
nojit_kernels.cpp Reductions update (#1351) 2024-11-04 22:25:16 -08:00
normalization.cpp Fully wrap the command encoder (#1572) 2024-11-08 11:50:21 -08:00
primitives.cpp Fix view scalar bug segfault (#1603) 2024-11-19 10:54:05 -08:00
quantized.cpp Fully wrap the command encoder (#1572) 2024-11-08 11:50:21 -08:00
reduce.cpp Fully wrap the command encoder (#1572) 2024-11-08 11:50:21 -08:00
reduce.h Reductions update (#1351) 2024-11-04 22:25:16 -08:00
resident.cpp Skip using Residency sets in VMs (#1537) 2024-10-29 19:37:23 -07:00
resident.h Wired (#1510) 2024-10-25 09:35:33 -07:00
rope.cpp Fully wrap the command encoder (#1572) 2024-11-08 11:50:21 -08:00
scaled_dot_product_attention.cpp 2-Pass Sdpa Inference Kernel (#1597) 2024-11-18 17:31:53 -08:00
scan.cpp Fully wrap the command encoder (#1572) 2024-11-08 11:50:21 -08:00
slicing.cpp Fix copying scalars by adding fill_gpu (#1402) 2024-09-09 15:54:08 -07:00
slicing.h Fix slice data size (#1394) 2024-09-04 19:10:43 -07:00
softmax.cpp Fully wrap the command encoder (#1572) 2024-11-08 11:50:21 -08:00
sort.cpp Fully wrap the command encoder (#1572) 2024-11-08 11:50:21 -08:00
ternary.cpp Faster indexing math in a few kernels (#1589) 2024-11-18 19:52:00 -08:00
ternary.h Add some internal GPU apis (#1177) 2024-06-04 09:24:26 -07:00
unary.cpp Faster indexing math in a few kernels (#1589) 2024-11-18 19:52:00 -08:00
unary.h Add some internal GPU apis (#1177) 2024-06-04 09:24:26 -07:00
utils.cpp Fix thread group for large arrays (#1543) 2024-10-30 16:25:12 -07:00
utils.h Faster indexing math in a few kernels (#1589) 2024-11-18 19:52:00 -08:00