mlx/python
Brian Keene 19fb69e2ed
Add memory_efficient_threshold kwarg to sdpa kernel (#1319)
Allows opt-in to the memory-efficient GPU shader at a prescribed sequence
length. Otherwise, uses aggregate MLX primitives for best latency.
2024-08-12 12:57:09 -07:00
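The commit message describes a threshold-based dispatch: at or above a chosen sequence length the memory-efficient path is taken, and below it the lower-latency path built from ordinary primitives is used. The following is a minimal pure-Python sketch of that idea, not the MLX implementation. The function names, the use of the key sequence length, and the `>=` comparison are all assumptions made for illustration; the only name taken from the commit is `memory_efficient_threshold`.

```python
import math


def sdpa_aggregate(q, k, v, scale):
    """Reference path: build the full score row per query, then softmax."""
    out = []
    for qi in q:
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * vj[d] for w, vj in zip(weights, v)) / total
            for d in range(len(v[0]))
        ])
    return out


def sdpa_streaming(q, k, v, scale):
    """Memory-efficient path: online softmax, no score row is stored."""
    out = []
    for qi in q:
        running_max = -math.inf
        total = 0.0
        acc = [0.0] * len(v[0])
        for kj, vj in zip(k, v):
            s = scale * sum(a * b for a, b in zip(qi, kj))
            new_max = max(running_max, s)
            correction = math.exp(running_max - new_max)  # rescale old sums
            p = math.exp(s - new_max)
            total = total * correction + p
            acc = [a * correction + p * x for a, x in zip(acc, vj)]
            running_max = new_max
        out.append([a / total for a in acc])
    return out


def sdpa(q, k, v, scale, memory_efficient_threshold=None):
    """Hypothetical dispatcher: opt in to streaming at long sequences.

    Assumption: the threshold is compared against the key sequence length
    with >=; the real kernel may use a different rule.
    """
    use_streaming = (
        memory_efficient_threshold is not None
        and len(k) >= memory_efficient_threshold
    )
    if use_streaming:
        return sdpa_streaming(q, k, v, scale), "streaming"
    return sdpa_aggregate(q, k, v, scale), "aggregate"
```

Both paths compute the same attention output; the streaming one trades some extra arithmetic for not holding a full row of scores at once. Leaving `memory_efficient_threshold` as `None` keeps the default path, which matches the opt-in behaviour described in the commit.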
..
mlx Add "edge" mode to mx.pad (#1309) 2024-08-06 11:23:10 -07:00
src Add memory_efficient_threshold kwarg to sdpa kernel (#1319) 2024-08-12 12:57:09 -07:00
tests Add memory_efficient_threshold kwarg to sdpa kernel (#1319) 2024-08-12 12:57:09 -07:00