mlx/python/src
Awni Hannun ec72b44417 Add quantize/dequantize for mxfp8 and nvfp4 (#2688)
* Add quantize/dequantize slow path for mxfp8 and nvfp4

* fast cuda kernel for mx/nv quantization

* fallback for cuda < 12.8 (#2697)

* format (#2700)

* fix (#2701)

* metal kernels

* docs

* fix jit

* add default bits and group sizes

* improve quant docs

* fix output type of mxfp4 matmuls
2025-10-28 16:23:12 -07:00