mlx/benchmarks
Jagrit Digani 02bec0bb6d
Matrix Attention kernel (#1610)
* Rough INIT

* [WIP]: Loading and Matmuls added

* [WIP]: Reductions and a minimal working aligned kernel at headdim = 64

* [WIP] Added headdim 80 for testing

* [WIP] Update dispatch params for testing

* [WIP] Add support for unaligned sequence lengths (still looks messy)

* Update sdpa_benchmarks

* Enable GQA support

* Update benchmark and switch off 128 headdim

* Update headdim 128 tuning

* Remove older fast attention code. Write out O strided

* Disable hd=128 until further optimizations

* Enable bf16

* Fix data size bug

* Enable attn build outside of jit
2024-11-22 10:34:05 -08:00
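For context, a minimal sketch of exercising the paths this PR names (GQA and bf16) through MLX's fused SDPA op, `mx.fast.scaled_dot_product_attention`. The op and its `scale` keyword are MLX's Python API; the batch size, head counts, and sequence length below are illustrative only, not taken from the benchmarks themselves:

```python
import mlx.core as mx

# Illustrative shapes: 32 query heads sharing 8 KV heads (GQA),
# headdim = 64, which the log notes as the first aligned kernel path.
B, L, n_q_heads, n_kv_heads, head_dim = 1, 1024, 32, 8, 64

q = mx.random.normal((B, n_q_heads, L, head_dim)).astype(mx.bfloat16)
k = mx.random.normal((B, n_kv_heads, L, head_dim)).astype(mx.bfloat16)
v = mx.random.normal((B, n_kv_heads, L, head_dim)).astype(mx.bfloat16)

# Fused scaled dot-product attention; scale is the usual 1/sqrt(head_dim).
o = mx.fast.scaled_dot_product_attention(q, k, v, scale=head_dim ** -0.5)
mx.eval(o)  # MLX is lazy; force evaluation so the kernel actually runs
```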
cpp Update pre-commit hooks (#984) 2024-04-11 07:27:53 -07:00
numpy Add isort pre-commit and run (#68) 2023-12-08 11:31:47 -08:00
python Matrix Attention kernel (#1610) 2024-11-22 10:34:05 -08:00