mlx/metal at c7b0300af5c3c203aa3f6578086e3a9f7879f694 - mlx

| Name | Last commit | Date |
| --- | --- | --- |
| jit | Use int64 stride everywhere (#1671) | 2024-12-09 11:09:02 -08:00 |
| kernels | Fix batched qmv bug (#1758) | 2025-01-09 11:45:57 -08:00 |
| allocator.cpp | track resource limit and throw if exceeded (#1718) | 2024-12-18 18:45:58 -08:00 |
| allocator.h | track resource limit and throw if exceeded (#1718) | 2024-12-18 18:45:58 -08:00 |
| binary.cpp | Allow no copy negative strides in as_strided and slice (#1688) | 2024-12-12 08:59:45 -08:00 |
| binary.h | Fixes for large arrays with a few ops (#1299) | 2024-07-30 17:18:39 -07:00 |
| CMakeLists.txt | Use osx deployment target to pick Metal version (#1595) | 2024-11-18 19:16:49 -08:00 |
| compiled.cpp | Dynamic broadcasting for shapeless compile/export (#1722) | 2025-01-09 11:04:24 -08:00 |
| conv.cpp | More shape type (#1705) | 2024-12-19 08:08:20 -08:00 |
| copy.cpp | Dynamic slicing (#1741) | 2025-01-07 14:02:16 -08:00 |
| copy.h | Dynamic slicing (#1741) | 2025-01-07 14:02:16 -08:00 |
| custom_kernel.cpp | fix dispatch threads for a few kernels (#1594) | 2024-11-18 08:35:25 -08:00 |
| device.cpp | track resource limit and throw if exceeded (#1718) | 2024-12-18 18:45:58 -08:00 |
| device.h | Fully wrap the command encoder (#1572) | 2024-11-08 11:50:21 -08:00 |
| distributed.cpp | mpi send use input as output (#1750) | 2025-01-06 06:08:43 -08:00 |
| event.cpp | Fix array is_available race cases (#1468) | 2024-10-07 19:13:50 -07:00 |
| fft.cpp | Use int64 stride everywhere (#1671) | 2024-12-09 11:09:02 -08:00 |
| hadamard.cpp | Fully wrap the command encoder (#1572) | 2024-11-08 11:50:21 -08:00 |
| indexing.cpp | Allow no copy negative strides in as_strided and slice (#1688) | 2024-12-12 08:59:45 -08:00 |
| jit_kernels.cpp | Dynamic slicing (#1741) | 2025-01-07 14:02:16 -08:00 |
| kernels.h | Dynamic slicing (#1741) | 2025-01-07 14:02:16 -08:00 |
| make_compiled_preamble.sh | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00 |
| matmul.cpp | Use int64 stride everywhere (#1671) | 2024-12-09 11:09:02 -08:00 |
| matmul.h | Use int64 stride everywhere (#1671) | 2024-12-09 11:09:02 -08:00 |
| metal_impl.h | Add synchronize function (#1006) | 2024-04-22 08:25:46 -07:00 |
| metal.cpp | Print exceptions in eval_cpu/eval_gpu and abort (#1754) | 2025-01-08 06:31:09 -08:00 |
| metal.h | Added missing unordered_map includes (#1635) | 2024-12-02 07:03:03 -08:00 |
| nojit_kernels.cpp | Dynamic slicing (#1741) | 2025-01-07 14:02:16 -08:00 |
| normalization.cpp | Fully wrap the command encoder (#1572) | 2024-11-08 11:50:21 -08:00 |
| primitives.cpp | Dynamic broadcasting for shapeless compile/export (#1722) | 2025-01-09 11:04:24 -08:00 |
| quantized.cpp | 3 and 6 bit quantization (#1613) | 2024-11-22 10:22:13 -08:00 |
| reduce.cpp | More shape type (#1705) | 2024-12-19 08:08:20 -08:00 |
| reduce.h | Reductions update (#1351) | 2024-11-04 22:25:16 -08:00 |
| resident.cpp | Fix some leaks and races (#1629) | 2024-11-27 20:01:20 -08:00 |
| resident.h | Wired (#1510) | 2024-10-25 09:35:33 -07:00 |
| rope.cpp | Allow offset to be an mx.array for mx.fast.rope (#1724) | 2024-12-19 15:51:44 -08:00 |
| scaled_dot_product_attention.cpp | Add boolean mask support in vector SDPA (#1757) | 2025-01-07 20:24:53 -08:00 |
| scan.cpp | Fully wrap the command encoder (#1572) | 2024-11-08 11:50:21 -08:00 |
| slicing.cpp | shapeless slice update and broadcast when possible (#1727) | 2024-12-23 11:25:15 -08:00 |
| slicing.h | More shape type (#1705) | 2024-12-19 08:08:20 -08:00 |
| softmax.cpp | Fully wrap the command encoder (#1572) | 2024-11-08 11:50:21 -08:00 |
| sort.cpp | Fix small sort with metal validation (#1695) | 2024-12-12 09:21:45 -08:00 |
| ternary.cpp | Allow no copy negative strides in as_strided and slice (#1688) | 2024-12-12 08:59:45 -08:00 |
| ternary.h | Add some internal GPU apis (#1177) | 2024-06-04 09:24:26 -07:00 |
| unary.cpp | Allow no copy negative strides in as_strided and slice (#1688) | 2024-12-12 08:59:45 -08:00 |
| unary.h | Add some internal GPU apis (#1177) | 2024-06-04 09:24:26 -07:00 |
| utils.cpp | Use int64 stride everywhere (#1671) | 2024-12-09 11:09:02 -08:00 |
| utils.h | Use int64 stride everywhere (#1671) | 2024-12-09 11:09:02 -08:00 |