mlx/metal at 61d787726af2125cd22a92acf7ee84efdc1f7714 - mlx

mirror of https://github.com/ml-explore/mlx.git synced 2025-10-22 02:58:16 +08:00

Awni Hannun 61d787726a Fix view scalar bug segfault (#1603)

* fix view scalar bug

* fix view scalar bug

* one more fix

2024-11-19 10:54:05 -08:00

jit

Faster indexing math in a few kernels (#1589)

2024-11-18 19:52:00 -08:00

kernels

Faster indexing math in a few kernels (#1589)

2024-11-18 19:52:00 -08:00

allocator.cpp

Fully wrap the command encoder (#1572)

2024-11-08 11:50:21 -08:00

allocator.h

Wired (#1510)

2024-10-25 09:35:33 -07:00

binary.cpp

Faster indexing math in a few kernels (#1589)

2024-11-18 19:52:00 -08:00

binary.h

Fixes for large arrays with a few ops (#1299)

2024-07-30 17:18:39 -07:00

CMakeLists.txt

Use osx deployment target to pick Metal version (#1595)

2024-11-18 19:16:49 -08:00

compiled.cpp

Faster indexing math in a few kernels (#1589)

2024-11-18 19:52:00 -08:00

conv.cpp

fix dispatch threads for a few kernels (#1594)

2024-11-18 08:35:25 -08:00

copy.cpp

Faster indexing math in a few kernels (#1589)

2024-11-18 19:52:00 -08:00

copy.h

Fix copying scalars by adding fill_gpu (#1402)

2024-09-09 15:54:08 -07:00

custom_kernel.cpp

fix dispatch threads for a few kernels (#1594)

2024-11-18 08:35:25 -08:00

device.cpp

Dispatch bf16 at run time when using the JIT (#1584)

2024-11-15 16:54:36 -08:00

device.h

Fully wrap the command encoder (#1572)

2024-11-08 11:50:21 -08:00

distributed.cpp

Adds send/recv ops in distributed (#1366)

2024-08-26 23:01:37 -07:00

event.cpp

Fix array is_available race cases (#1468)

2024-10-07 19:13:50 -07:00

fft.cpp

Fully wrap the command encoder (#1572)

2024-11-08 11:50:21 -08:00

hadamard.cpp

Fully wrap the command encoder (#1572)

2024-11-08 11:50:21 -08:00

indexing.cpp

Faster indexing math in a few kernels (#1589)

2024-11-18 19:52:00 -08:00

jit_kernels.cpp

Faster indexing math in a few kernels (#1589)

2024-11-18 19:52:00 -08:00

kernels.h

Reductions update (#1351)

2024-11-04 22:25:16 -08:00

make_compiled_preamble.sh

Dispatch bf16 at run time when using the JIT (#1584)

2024-11-15 16:54:36 -08:00

matmul.cpp

fix dispatch threads for a few kernels (#1594)

2024-11-18 08:35:25 -08:00

matmul.h

Wired (#1510)

2024-10-25 09:35:33 -07:00

metal_impl.h

Add synchronize function (#1006)

2024-04-22 08:25:46 -07:00

metal.cpp

Bfs width limit (#1568)

2024-11-08 15:00:46 -08:00

metal.h

Wired (#1510)

2024-10-25 09:35:33 -07:00

nojit_kernels.cpp

Reductions update (#1351)

2024-11-04 22:25:16 -08:00

normalization.cpp

Fully wrap the command encoder (#1572)

2024-11-08 11:50:21 -08:00

primitives.cpp

Fix view scalar bug segfault (#1603)

2024-11-19 10:54:05 -08:00

quantized.cpp

Fully wrap the command encoder (#1572)

2024-11-08 11:50:21 -08:00

reduce.cpp

Fully wrap the command encoder (#1572)

2024-11-08 11:50:21 -08:00

reduce.h

Reductions update (#1351)

2024-11-04 22:25:16 -08:00

resident.cpp

Skip using Residency sets in VMs (#1537)

2024-10-29 19:37:23 -07:00

resident.h

Wired (#1510)

2024-10-25 09:35:33 -07:00

rope.cpp

Fully wrap the command encoder (#1572)

2024-11-08 11:50:21 -08:00

scaled_dot_product_attention.cpp

2-Pass Sdpa Inference Kernel (#1597)

2024-11-18 17:31:53 -08:00

scan.cpp

Fully wrap the command encoder (#1572)

2024-11-08 11:50:21 -08:00

slicing.cpp

Fix copying scalars by adding fill_gpu (#1402)

2024-09-09 15:54:08 -07:00

slicing.h

Fix slice data size (#1394)

2024-09-04 19:10:43 -07:00

softmax.cpp

Fully wrap the command encoder (#1572)

2024-11-08 11:50:21 -08:00

sort.cpp

Fully wrap the command encoder (#1572)

2024-11-08 11:50:21 -08:00

ternary.cpp

Faster indexing math in a few kernels (#1589)

2024-11-18 19:52:00 -08:00

ternary.h

Add some internal GPU apis (#1177)

2024-06-04 09:24:26 -07:00

unary.cpp

Faster indexing math in a few kernels (#1589)

2024-11-18 19:52:00 -08:00

unary.h

Add some internal GPU apis (#1177)

2024-06-04 09:24:26 -07:00

utils.cpp

Fix thread group for large arrays (#1543)

2024-10-30 16:25:12 -07:00

utils.h

Faster indexing math in a few kernels (#1589)

2024-11-18 19:52:00 -08:00