mlx/python/src
Brian Keene 19fb69e2ed Add memory_efficient_threshold kwarg to sdpa kernel (#1319)
Allows opt-in to the memory-efficient GPU shader at a prescribed sequence
length. Otherwise, uses aggregate MLX primitives for best latency.
2024-08-12 12:57:09 -07:00
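The dispatch the commit message describes can be illustrated with a minimal pure-Python sketch. It is not MLX's kernel or its actual selection logic: `sdpa_sketch`, `_composed_attention`, and `_streaming_attention` are hypothetical names, and comparing the query sequence length against the threshold with `>=` is an assumption. Only the kwarg name `memory_efficient_threshold` comes from the commit title.

```python
import math


def _composed_attention(q, k, v, scale):
    """Composed path: build each full score row, softmax it, then weight v."""
    out = []
    for qi in q:
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * vj[d] for w, vj in zip(weights, v)) / total
            for d in range(len(v[0]))
        ])
    return out


def _streaming_attention(q, k, v, scale):
    """Memory-lean path: online softmax that never stores a full score row."""
    out = []
    for qi in q:
        running_max = -math.inf
        denom = 0.0
        acc = [0.0] * len(v[0])
        for kj, vj in zip(k, v):
            s = scale * sum(a * b for a, b in zip(qi, kj))
            new_max = max(running_max, s)
            # Rescale what was accumulated so far to the new running maximum.
            correction = math.exp(running_max - new_max)
            w = math.exp(s - new_max)
            denom = denom * correction + w
            acc = [a * correction + w * x for a, x in zip(acc, vj)]
            running_max = new_max
        out.append([a / denom for a in acc])
    return out


def sdpa_sketch(q, k, v, scale, memory_efficient_threshold=None):
    """Pick a path from the threshold; returns (output, name of path taken).

    Assumption: the memory-lean path is opt-in and is chosen only when a
    threshold is given and the query sequence length reaches it.
    """
    use_streaming = (
        memory_efficient_threshold is not None
        and len(q) >= memory_efficient_threshold
    )
    if use_streaming:
        return _streaming_attention(q, k, v, scale), "streaming"
    return _composed_attention(q, k, v, scale), "composed"
```

Called without a threshold, the sketch always takes the composed path. Passing, for example, `memory_efficient_threshold=2` with three query rows switches to the streaming path. Both paths compute the same attention output up to floating-point rounding, so the choice trades memory use against latency, not results.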