mlx/python at fc00f16e306f865dd749f12f6bbeda21c95e2acd - mlx

mirror of https://github.com/ml-explore/mlx.git synced 2025-12-16 01:49:05 +08:00

Awni Hannun ec72b44417 Add quantize/dequantize for mxfp8 and nvfp4 (#2688)

* Add quantize/dequantize slow path for mxfp8 and nvfp4

* fast cuda kernel for mx/nv quantization

* fallback for cuda < 12.8 (#2697)

* format (#2700)

* fix (#2701)

* metal kernels

* docs

* fix jit

* add default bits and group sizes

* improve quant docs

* fix output type of mxfp4 matmuls

2025-10-28 16:23:12 -07:00

mlx

fix cross entropy axis param (#2641)

2025-10-01 16:49:55 -07:00

scripts

nccl dep + default for cuda (#2526)

2025-08-21 17:57:49 -07:00

src

Add quantize/dequantize for mxfp8 and nvfp4 (#2688)

2025-10-28 16:23:12 -07:00

tests

Add quantize/dequantize for mxfp8 and nvfp4 (#2688)

2025-10-28 16:23:12 -07:00