Mirror of https://github.com/ml-explore/mlx-examples.git (synced 2025-09-01 12:49:50 +08:00)
Quantize embedding / Update quantize API (#680)
* more async eval
* quantize embedding / update quantize api
* more updates for quantize
* update for quantize embeddings
* update sd quant API
* update sdxl quants
* error for datasets < batch_size
* async
* fix config loading
* fix quant
* fix tests
* fix req
* remove lm head if tie weights is true
* fix test
@@ -183,7 +183,7 @@ def load_model(folder: str):
     weights = tree_unflatten(list(weights.items()))
     model = Mistral(model_args)
     if quantization is not None:
-        nn.QuantizedLinear.quantize_module(model, **quantization)
+        nn.quantize(model, **quantization)
     model.update(weights)
     mx.eval(model.parameters())
     return model, tokenizer
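For readers unfamiliar with what the `**quantization` arguments control, here is a rough, library-free sketch of group-wise affine quantization. It assumes the arguments are a group size and a bit width (the diff does not show them) and that each group is stored as integer codes plus a scale and a bias, which is how the MLX documentation describes its scheme as far as we know. The function names `quantize_group`, `dequantize_group`, and `quantize_row` are hypothetical and are not part of MLX.

```python
# Illustrative sketch only: plain-Python group-wise affine quantization.
# Assumption: each group of weights is stored as integer codes in
# [0, 2**bits - 1] plus one scale and one bias, so that w ~= scale * q + bias.
# These helper names are hypothetical and do not exist in MLX.

def quantize_group(values, bits):
    """Quantize one group of floats to integer codes with a scale and bias."""
    levels = (1 << bits) - 1          # largest representable code
    lo, hi = min(values), max(values)
    # Guard against a constant group, where hi == lo would give a zero scale.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate floats from codes, scale, and bias."""
    return [scale * c + bias for c in codes]


def quantize_row(row, group_size=4, bits=4):
    """Split a row into fixed-size groups and quantize each independently."""
    if len(row) % group_size != 0:
        raise ValueError("row length must be a multiple of group_size")
    return [
        quantize_group(row[i:i + group_size], bits)
        for i in range(0, len(row), group_size)
    ]


if __name__ == "__main__":
    row = [0.0, 0.5, 1.0, 1.5, -2.0, -1.0, 0.0, 1.0]
    for codes, scale, bias in quantize_row(row, group_size=4, bits=4):
        print(codes, scale, bias, dequantize_group(codes, scale, bias))
```

A smaller group size gives each scale and bias fewer values to cover, which tends to reduce reconstruction error at the cost of storing more scales and biases. In the hunk above, the quantize call comes before `model.update(weights)`, presumably so that the model already has quantized layers when the saved quantized weights are loaded into it.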