jit | Gather mm new kernel and small refactoring (#2040) | 2025-04-14 16:37:36 -07:00
kernels | Gather qmm batched kernel and refactoring of quantized (#2078) | 2025-04-17 13:53:11 -07:00
allocator.cpp | only add to residency set once (#2049) | 2025-04-06 17:38:25 -07:00
allocator.h | wire cache (#2006) | 2025-03-25 18:54:01 -07:00
binary.cpp | redesign for faster cpu/gpu synch (#1869) | 2025-03-06 19:23:38 -08:00
binary.h | Fixes for large arrays with a few ops (#1299) | 2024-07-30 17:18:39 -07:00
CMakeLists.txt | Gather mm new kernel and small refactoring (#2040) | 2025-04-14 16:37:36 -07:00
compiled.cpp | redesign for faster cpu/gpu synch (#1869) | 2025-03-06 19:23:38 -08:00
conv.cpp | Depthwise Conv2D optimization (#2036) | 2025-04-03 09:42:04 -07:00
copy.cpp | fix malloc or wait deadlock (#1976) | 2025-03-20 16:48:43 -07:00
copy.h | Dynamic slicing (#1741) | 2025-01-07 14:02:16 -08:00
custom_kernel.cpp | fix malloc or wait deadlock (#1976) | 2025-03-20 16:48:43 -07:00
device.cpp | Do not load the default lib if another is requested (#2055) | 2025-04-09 13:31:38 -07:00
device.h | Do not load the default lib if another is requested (#2055) | 2025-04-09 13:31:38 -07:00
distributed.cpp | redesign for faster cpu/gpu synch (#1869) | 2025-03-06 19:23:38 -08:00
event.cpp | Remove Event::Signal() (#2052) | 2025-04-08 06:20:27 -07:00
fence.cpp | fix malloc or wait deadlock (#1976) | 2025-03-20 16:48:43 -07:00
fft.cpp | fix fft bug (#2062) | 2025-04-10 19:41:27 -07:00
hadamard.cpp | fix malloc or wait deadlock (#1976) | 2025-03-20 16:48:43 -07:00
indexing.cpp | fix malloc or wait deadlock (#1976) | 2025-03-20 16:48:43 -07:00
jit_kernels.cpp | Gather qmm batched kernel and refactoring of quantized (#2078) | 2025-04-17 13:53:11 -07:00
kernels.h | Gather qmm batched kernel and refactoring of quantized (#2078) | 2025-04-17 13:53:11 -07:00
logsumexp.cpp | Custom logsumexp (#2028) | 2025-03-31 07:36:55 -07:00
make_compiled_preamble.sh | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00
matmul.cpp | Gather qmm batched kernel and refactoring of quantized (#2078) | 2025-04-17 13:53:11 -07:00
matmul.h | Use int64 stride everywhere (#1671) | 2024-12-09 11:09:02 -08:00
metal_impl.h | redesign for faster cpu/gpu synch (#1869) | 2025-03-06 19:23:38 -08:00
metal.cpp | Fix multistream GPU deadlock (#1969) | 2025-03-20 07:19:47 -07:00
metal.h | move memory APIs into top level mlx.core (#1982) | 2025-03-21 07:25:12 -07:00
nojit_kernels.cpp | Gather qmm batched kernel and refactoring of quantized (#2078) | 2025-04-17 13:53:11 -07:00
normalization.cpp | fix malloc or wait deadlock (#1976) | 2025-03-20 16:48:43 -07:00
primitives.cpp | Distributed layers (#1270) | 2025-03-21 13:52:17 -07:00
quantized.cpp | Route to gather qmm only for many tokens per expert (#2082) | 2025-04-17 14:53:08 -07:00
reduce.cpp | fix malloc or wait deadlock (#1976) | 2025-03-20 16:48:43 -07:00
reduce.h | Reductions update (#1351) | 2024-11-04 22:25:16 -08:00
resident.cpp | Only request residency once (#2051) | 2025-04-07 10:47:51 -07:00
resident.h | Wired (#1510) | 2024-10-25 09:35:33 -07:00
rope.cpp | fix malloc or wait deadlock (#1976) | 2025-03-20 16:48:43 -07:00
scaled_dot_product_attention.cpp | Add float mask to sdpa vector (#2068) | 2025-04-11 17:29:40 -07:00
scan.cpp | LogCumSumExp (#2069) | 2025-04-13 01:27:29 -07:00
slicing.cpp | fix malloc or wait deadlock (#1976) | 2025-03-20 16:48:43 -07:00
slicing.h | More shape type (#1705) | 2024-12-19 08:08:20 -08:00
softmax.cpp | Custom logsumexp (#2028) | 2025-03-31 07:36:55 -07:00
sort.cpp | fix malloc or wait deadlock (#1976) | 2025-03-20 16:48:43 -07:00
ternary.cpp | redesign for faster cpu/gpu synch (#1869) | 2025-03-06 19:23:38 -08:00
ternary.h | Add some internal GPU apis (#1177) | 2024-06-04 09:24:26 -07:00
unary.cpp | fix malloc or wait deadlock (#1976) | 2025-03-20 16:48:43 -07:00
unary.h | Add some internal GPU apis (#1177) | 2024-06-04 09:24:26 -07:00
utils.cpp | Fp64 on the CPU (#1843) | 2025-02-07 15:52:22 -08:00
utils.h | Gather mm new kernel and small refactoring (#2040) | 2025-04-14 16:37:36 -07:00