..
jit | 2d gather specialization (#1339) | 2024-08-22 10:48:24 -07:00
kernels | Chore: update pre-commit hooks (#1353) | 2024-08-24 06:46:36 -07:00
allocator.cpp | Do not release buffers on exit (#1142) | 2024-07-15 15:12:24 -07:00
allocator.h | Reset peak memory (#1074) | 2024-05-03 17:12:51 -07:00
binary.cpp | Fixes for large arrays with a few ops (#1299) | 2024-07-30 17:18:39 -07:00
binary.h | Fixes for large arrays with a few ops (#1299) | 2024-07-30 17:18:39 -07:00
CMakeLists.txt | Custom Metal Kernels from Python (#1325) | 2024-08-22 13:46:29 -07:00
compiled.cpp | Fix a couple bugs (#1161) | 2024-05-28 15:18:18 -07:00
conv.cpp | Option to JIT steel gemm / conv (#1139) | 2024-05-23 18:07:34 -07:00
copy.cpp | Fixes for large arrays with a few ops (#1299) | 2024-07-30 17:18:39 -07:00
copy.h | Add a SliceUpdate op and primitive (#850) | 2024-03-20 10:39:25 -07:00
custom_kernel.cpp | Add grid_sample example to metal_kernel docs (#1352) | 2024-08-23 18:24:16 -07:00
device.cpp | fix extension metal library finding (#1361) | 2024-08-26 09:18:50 -07:00
device.h | fix extension metal library finding (#1361) | 2024-08-26 09:18:50 -07:00
event.cpp | Shared events for synchronization + async eval (#998) | 2024-04-17 06:16:02 -07:00
fft.cpp | Fix contiguity check (#1336) | 2024-08-19 16:05:06 -07:00
hadamard.cpp | Fast Hadamard Transform (#1249) | 2024-07-09 20:39:01 -07:00
indexing.cpp | 2d gather specialization (#1339) | 2024-08-22 10:48:24 -07:00
jit_kernels.cpp | Add gemv masked to JIT plus some fixes (#1310) | 2024-08-07 13:38:07 -07:00
kernels.h | Add gemv masked to JIT plus some fixes (#1310) | 2024-08-07 13:38:07 -07:00
make_compiled_preamble.sh | fix compiling with space in paths (#1332) | 2024-08-15 16:39:24 -07:00
matmul.cpp | Add gemv masked to JIT plus some fixes (#1310) | 2024-08-07 13:38:07 -07:00
matmul.h | Add gemv masked to JIT plus some fixes (#1310) | 2024-08-07 13:38:07 -07:00
metal_impl.h | Add synchronize function (#1006) | 2024-04-22 08:25:46 -07:00
metal.cpp | Fix leak with multi-output primitives (#1274) | 2024-07-23 06:34:18 -07:00
metal.h | Reset peak memory (#1074) | 2024-05-03 17:12:51 -07:00
nojit_kernels.cpp | Add gemv masked to JIT plus some fixes (#1310) | 2024-08-07 13:38:07 -07:00
normalization.cpp | Refactor reductions and fix scatter atomics for large sizes (#1300) | 2024-08-22 16:03:31 -07:00
primitives.cpp | Custom transforms (#1246) | 2024-07-10 18:00:01 -07:00
quantized.cpp | Fused Affine Quantize/Dequantize ops (#1282) | 2024-07-29 15:11:38 -07:00
reduce.cpp | Fix boolean all reduce bug (#1355) | 2024-08-24 10:09:32 -07:00
reduce.h | Further reduction tuning (#1349) | 2024-08-23 10:35:25 -07:00
rope.cpp | Fix rope (#1340) | 2024-08-20 17:37:52 -07:00
scaled_dot_product_attention.cpp | Metal shaders for memory efficient self attention on large sequences (#964) | 2024-06-03 09:16:19 -07:00
scan.cpp | fix jit scan when output doesn't have primitive (#1190) | 2024-06-06 07:24:58 -07:00
slicing.cpp | Add some internal GPU apis (#1177) | 2024-06-04 09:24:26 -07:00
slicing.h | Add some internal GPU apis (#1177) | 2024-06-04 09:24:26 -07:00
softmax.cpp | Fixes for large arrays with a few ops (#1299) | 2024-07-30 17:18:39 -07:00
sort.cpp | Fix GPU sort for large arrays (#1285) | 2024-07-24 14:37:10 -07:00
ternary.cpp | Fix ternary for large arrays (#1359) | 2024-08-26 11:22:27 -07:00
ternary.h | Add some internal GPU apis (#1177) | 2024-06-04 09:24:26 -07:00
unary.cpp | Fixes for large arrays with a few ops (#1299) | 2024-07-30 17:18:39 -07:00
unary.h | Add some internal GPU apis (#1177) | 2024-06-04 09:24:26 -07:00
utils.cpp | Add gemv masked to JIT plus some fixes (#1310) | 2024-08-07 13:38:07 -07:00
utils.h | Add gemv masked to JIT plus some fixes (#1310) | 2024-08-07 13:38:07 -07:00