mlx/benchmarks
Jagrit Digani 02bec0bb6d
Matrix Attention kernel (#1610)
* Rough INIT

* [WIP]: Loading and Matmuls added

* [WIP]: Reductions and a minimal working aligned kernel at headdim = 64

* [WIP] Added headdim 80 for testing

* [WIP] Update dispatch params for testing

* [WIP] Add support for unaligned sequence lengths (still looks messy)

* Update sdpa_benchmarks

* Enable GQA support

* Update benchmark and switch off 128 headdim

* Update headdim 128 tuning

* Remove older fast attention code. Write out O strided

* Disable hd=128 until further optimizations

* Enable bf16

* Fix data size bug

* Enable attn build outside of jit
2024-11-22 10:34:05 -08:00
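For context, a minimal sketch of exercising the paths this PR names (GQA and bf16) through MLX's fused SDPA op, `mx.fast.scaled_dot_product_attention`. The op and its `scale` keyword are MLX's Python API; the batch size, head counts, and sequence length below are illustrative only, not taken from the benchmarks themselves:

```python
import mlx.core as mx

# Illustrative shapes: 32 query heads sharing 8 KV heads (GQA),
# headdim = 64, which the log notes as the first aligned kernel path.
B, L, n_q_heads, n_kv_heads, head_dim = 1, 1024, 32, 8, 64

q = mx.random.normal((B, n_q_heads, L, head_dim)).astype(mx.bfloat16)
k = mx.random.normal((B, n_kv_heads, L, head_dim)).astype(mx.bfloat16)
v = mx.random.normal((B, n_kv_heads, L, head_dim)).astype(mx.bfloat16)

# Fused scaled dot-product attention; scale is the usual 1/sqrt(head_dim).
o = mx.fast.scaled_dot_product_attention(q, k, v, scale=head_dim ** -0.5)
mx.eval(o)  # MLX is lazy; force evaluation so the kernel actually runs
```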
cpp Update pre-commit hooks (#984) 2024-04-11 07:27:53 -07:00
numpy Add isort pre-commit and run (#68) 2023-12-08 11:31:47 -08:00
python Matrix Attention kernel (#1610) 2024-11-22 10:34:05 -08:00