Mirror of https://github.com/ml-explore/mlx-examples.git (synced 2025-09-01 04:14:38 +08:00)
Quantize embedding / Update quantize API (#680)
* more async eval
* quantize embedding / update quantize api
* more updates for quantize
* update for quantize embeddings
* update sd quant API
* update sdxl quants
* error for datasets < batch_size
* async
* fix config loading
* fix quant
* fix tests
* fix req
* remove lm head if tie weights is true
* fix test
@@ -169,7 +169,7 @@ class Model(nn.Module):
         cache=None,
     ):
         out, cache = self.model(inputs, cache)
-        out = out @ self.model.embed_tokens.weight.T
+        out = self.model.embed_tokens.as_linear(out)
         return out, cache
 
     @property
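The hunk above stops multiplying by `embed_tokens.weight.T` directly and asks the embedding layer to do the output projection through `as_linear`. Given the commit title, one plausible reading is that a quantized embedding no longer stores a plain float matrix, so only the layer itself knows how to turn its stored weights into logits. The sketch below illustrates that idea in plain Python. `ToyEmbedding` and `ToyQuantizedEmbedding` are hypothetical names, and the per-row int8 scheme is a simplification chosen for illustration, not MLX's actual quantization format.

```python
# Toy stand-ins (hypothetical, not MLX classes) for a tied embedding / output head.


def matmul_transposed(x, rows):
    """Compute x @ rows.T for lists of lists: one output per row of `rows`."""
    return [[sum(a * b for a, b in zip(h, row)) for row in rows] for h in x]


class ToyEmbedding:
    """Float embedding table; `weight` has shape (vocab, dims)."""

    def __init__(self, weight):
        self.weight = weight

    def __call__(self, token_ids):
        # Lookup: one row per token id.
        return [self.weight[t] for t in token_ids]

    def as_linear(self, x):
        # For a float table this is exactly x @ weight.T.
        return matmul_transposed(x, self.weight)


class ToyQuantizedEmbedding:
    """Stores integer codes plus one float scale per row (assumed scheme)."""

    def __init__(self, weight):
        # Scale each row so its largest magnitude maps to 127.
        self.scales = [(max(abs(v) for v in row) / 127) or 1.0 for row in weight]
        # `weight` now holds integer codes, not usable as floats on their own.
        self.weight = [
            [round(v / s) for v in row] for row, s in zip(weight, self.scales)
        ]

    def _dequantized_rows(self):
        return [[q * s for q in row] for row, s in zip(self.weight, self.scales)]

    def __call__(self, token_ids):
        rows = self._dequantized_rows()
        return [rows[t] for t in token_ids]

    def as_linear(self, x):
        # The layer applies its own scales before projecting.
        return matmul_transposed(x, self._dequantized_rows())


weight = [[0.5, -1.0], [1.0, 0.25], [-0.75, 0.5]]  # vocab=3, dims=2
hidden = [[1.0, 2.0]]                               # one hidden state

dense = ToyEmbedding(weight)
quant = ToyQuantizedEmbedding(weight)

dense_logits = dense.as_linear(hidden)
quant_logits = quant.as_linear(hidden)
# What the old `out @ weight.T` pattern would compute on the raw codes:
naive_logits = matmul_transposed(hidden, quant.weight)
```

In this toy setup `quant_logits` stays close to `dense_logits`, while `naive_logits` is off by orders of magnitude because the scales are ignored. Calling `as_linear` keeps the model code identical for both layer types.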