mlx/benchmarks/python at 1865299a30d2c47fb4497e3dd01800965b21b088 - mlx

mirror of https://github.com/ml-explore/mlx.git synced 2025-12-16 01:49:05 +08:00

Brian Keene 1865299a30 Metal shaders for memory efficient self attention on large sequences (#964)

* Metal shaders for efficient self attention on large sequences

Updated fast attention: GEMM-ified with Steel primitives
Uses flash attention 1 for scale correction

* more compiler silencing

* Address rebase issues

* Templatize kernel instantiation, revise cpu bindings

* Safer writes to output

* Permit batch size > 1

* Numerical fixes for sdpa self attention

* Re-enable test, remove unused variable

* add benchmarking script

* Disable sdpa prior to perf tuning, and simplify tests for per-patch CI

2024-06-03 09:16:19 -07:00

blas

Update GEMM (#424)

2024-01-17 12:42:39 -08:00

comparative

Reduce update (#783)

2024-03-04 19:09:51 -08:00

batch_matmul_bench.py

Add isort pre-commit and run (#68)

2023-12-08 11:31:47 -08:00

compile_bench.py

Shapeless compilation for some graphs (#687)