| Name | Last commit | Date |
|---|---|---|
| fft | Feature complete Metal FFT (#1102) | 2024-06-06 12:57:25 -07:00 |
| jit | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00 |
| metal_3_0 | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00 |
| metal_3_1 | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00 |
| reduction | Reductions update (#1351) | 2024-11-04 22:25:16 -08:00 |
| steel | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00 |
| arange.h | More jitting (#1132) | 2024-05-23 16:23:44 -07:00 |
| arange.metal | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00 |
| arg_reduce.metal | More fixes for arrays with large sizes (#1405) | 2024-09-17 12:46:31 -07:00 |
| atomic.h | Refactor reductions and fix scatter atomics for large sizes (#1300) | 2024-08-22 16:03:31 -07:00 |
| bf16_math.h | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00 |
| binary_ops.h | Fix complex power on Metal (#1460) | 2024-10-06 19:58:30 -07:00 |
| binary_two.h | Faster indexing math in a few kernels (#1589) | 2024-11-18 19:52:00 -08:00 |
| binary_two.metal | Faster indexing math in a few kernels (#1589) | 2024-11-18 19:52:00 -08:00 |
| binary.h | Faster indexing math in a few kernels (#1589) | 2024-11-18 19:52:00 -08:00 |
| binary.metal | Faster indexing math in a few kernels (#1589) | 2024-11-18 19:52:00 -08:00 |
| CMakeLists.txt | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00 |
| complex.h | Refactor reductions and fix scatter atomics for large sizes (#1300) | 2024-08-22 16:03:31 -07:00 |
| conv.metal | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00 |
| copy.h | Faster indexing math in a few kernels (#1589) | 2024-11-18 19:52:00 -08:00 |
| copy.metal | Faster indexing math in a few kernels (#1589) | 2024-11-18 19:52:00 -08:00 |
| defines.h | Refactor reductions and fix scatter atomics for large sizes (#1300) | 2024-08-22 16:03:31 -07:00 |
| erf.h | JIT compile option for binary minimization (#1091) | 2024-05-22 12:57:13 -07:00 |
| expm1f.h | Fix overflow / underflow handling for expm1f (#1278) | 2024-07-23 07:29:06 -07:00 |
| fft.h | Feature complete Metal FFT (#1102) | 2024-06-06 12:57:25 -07:00 |
| fft.metal | Add Quantized Ops to the JIT (#1204) | 2024-06-12 09:47:12 -07:00 |
| gather.h | Faster indexing math in a few kernels (#1589) | 2024-11-18 19:52:00 -08:00 |
| gemv_masked.h | Add gemv masked to JIT plus some fixes (#1310) | 2024-08-07 13:38:07 -07:00 |
| gemv_masked.metal | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00 |
| gemv.metal | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00 |
| hadamard.h | Fix bfloat16 Hadamard (#1283) | 2024-07-23 14:54:43 -07:00 |
| indexing.h | Faster indexing math in a few kernels (#1589) | 2024-11-18 19:52:00 -08:00 |
| layer_norm.metal | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00 |
| quantized.h | OOB QMV fix (#1579) | 2024-11-08 17:59:45 -08:00 |
| quantized.metal | Add split_k qvm for long context (#1564) | 2024-11-05 11:25:19 -08:00 |
| random.metal | Faster bits and bernoulli (#1535) | 2024-10-28 11:11:00 -07:00 |
| reduce_utils.h | More jitting (#1132) | 2024-05-23 16:23:44 -07:00 |
| reduce.h | Fix JIT reductions (#1373) | 2024-08-28 16:39:11 -07:00 |
| reduce.metal | Reductions update (#1351) | 2024-11-04 22:25:16 -08:00 |
| rms_norm.metal | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00 |
| rope.metal | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00 |
| scaled_dot_product_attention_params.h | Metal shaders for memory efficient self attention on large sequences (#964) | 2024-06-03 09:16:19 -07:00 |
| scaled_dot_product_attention.metal | 2-Pass Sdpa Inference Kernel (#1597) | 2024-11-18 17:31:53 -08:00 |
| scan.h | Working 64-bit scans (#1506) | 2024-10-24 11:05:46 -07:00 |
| scan.metal | Working 64-bit scans (#1506) | 2024-10-24 11:05:46 -07:00 |
| scatter.h | Faster indexing math in a few kernels (#1589) | 2024-11-18 19:52:00 -08:00 |
| sdpa_vector.h | 2-Pass Sdpa Inference Kernel (#1597) | 2024-11-18 17:31:53 -08:00 |
| softmax.h | consistently handle all -inf in softmax (#1470) | 2024-10-08 09:54:02 -07:00 |
| softmax.metal | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00 |
| sort.h | More fixes for arrays with large sizes (#1405) | 2024-09-17 12:46:31 -07:00 |
| sort.metal | Dispatch bf16 at run time when using the JIT (#1584) | 2024-11-15 16:54:36 -08:00 |
| ternary_ops.h | JIT compile option for binary minimization (#1091) | 2024-05-22 12:57:13 -07:00 |
| ternary.h | Faster indexing math in a few kernels (#1589) | 2024-11-18 19:52:00 -08:00 |
| ternary.metal | Faster indexing math in a few kernels (#1589) | 2024-11-18 19:52:00 -08:00 |
| unary_ops.h | Real and Imag (#1490) | 2024-10-15 16:23:15 -07:00 |
| unary.h | Faster indexing math in a few kernels (#1589) | 2024-11-18 19:52:00 -08:00 |
| unary.metal | Faster indexing math in a few kernels (#1589) | 2024-11-18 19:52:00 -08:00 |
| utils.h | Faster indexing math in a few kernels (#1589) | 2024-11-18 19:52:00 -08:00 |