| Name | Last commit message | Last commit date |
|---|---|---|
| blas | Update GEMM (#424) | 2024-01-17 12:42:39 -08:00 |
| comparative | Reductions update (#1351) | 2024-11-04 22:25:16 -08:00 |
| batch_matmul_bench.py | Add isort pre-commit and run (#68) | 2023-12-08 11:31:47 -08:00 |
| compile_bench.py | Add softmin, hardshrink, hardtanh (#1180) | 2024-06-04 15:48:18 -07:00 |
| conv1d_bench.py | Add groups to Conv1d (#948) | 2024-04-27 06:24:57 -07:00 |
| conv2d_bench_cpu.py | Conv cpu improvements (#1410) | 2024-09-15 18:45:10 -07:00 |
| conv2d_train_bench_cpu.py | Conv cpu improvements (#1410) | 2024-09-15 18:45:10 -07:00 |
| conv2d_transpose_bench_cpu.py | Conv cpu improvements (#1410) | 2024-09-15 18:45:10 -07:00 |
| conv3d_bench_cpu.py | Conv cpu improvements (#1410) | 2024-09-15 18:45:10 -07:00 |
| conv3d_train_bench_cpu.py | Conv cpu improvements (#1410) | 2024-09-15 18:45:10 -07:00 |
| conv3d_transpose_bench_cpu.py | Conv cpu improvements (#1410) | 2024-09-15 18:45:10 -07:00 |
| conv_bench.py | Add softmin, hardshrink, hardtanh (#1180) | 2024-06-04 15:48:18 -07:00 |
| conv_transpose_bench.py | Transposed Convolution (#1245) | 2024-09-06 19:52:38 -07:00 |
| distributed_bench.py | MPI ops in GPU stream for faster comms (#1356) | 2024-08-26 15:12:50 -07:00 |
| einsum_bench.py | Einsum (#1269) | 2024-07-25 09:36:44 -07:00 |
| fft_bench.py | Feature complete Metal FFT (#1102) | 2024-06-06 12:57:25 -07:00 |
| gather_bench.py | Scatter optimization : Eliminate 64b integer divide. (#662) | 2024-02-10 08:49:51 -08:00 |
| hadamard_bench.py | Fast Hadamard Transform (#1249) | 2024-07-09 20:39:01 -07:00 |
| layer_norm_bench.py | Implement vjps for some primitives in the fast namespace (#883) | 2024-03-26 16:35:34 -07:00 |
| rms_norm_bench.py | Implement vjps for some primitives in the fast namespace (#883) | 2024-03-26 16:35:34 -07:00 |
| rope_bench.py | Fix copy donation and add partial rope (#881) | 2024-03-22 17:28:26 -07:00 |
| scatter_bench.py | improvements to scatter / gather (#1541) | 2024-10-30 19:30:54 -07:00 |
| sdpa_bench.py | Add softmin, hardshrink, hardtanh (#1180) | 2024-06-04 15:48:18 -07:00 |
| sdpa_vector_bench.py | 2-Pass Sdpa Inference Kernel (#1597) | 2024-11-18 17:31:53 -08:00 |
| single_ops.py | Propagate nans in binary ops (#579) | 2024-01-29 11:19:38 -08:00 |
| time_utils.py | Shapeless compilation for some graphs (#687) | 2024-02-19 21:43:54 -08:00 |