Mirror of https://github.com/ml-explore/mlx-examples.git (synced 2025-09-01 12:49:50 +08:00)
Quantize embedding / Update quantize API (#680)
* more async eval
* quantize embedding / update quantize api
* more updates for quantize
* update for quantize embeddings
* update sd quant API
* update sdxl quants
* error for datasets < batch_size
* async
* fix config loading
* fix quant
* fix tests
* fix req
* remove lm head if tie weights is true
* fix test
@@ -142,7 +142,7 @@ class Transformer(nn.Module):
         h = self.norm(h)

         if self.weight_tying:
-            return h @ self.wte.weight.T, cache
+            return self.wte.as_linear(h), cache

         return self.ff_out(h), cache
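The hunk above swaps a manual `h @ self.wte.weight.T` for `self.wte.as_linear(h)` in the weight-tied output path. A plausible reading, given the commit title, is that once the embedding can be quantized, its `weight` no longer holds a dense matrix, so the projection has to go through the layer itself. The sketch below illustrates that idea in plain Python. `ToyQuantizedEmbedding`, its per-row scale scheme, and `naive_tied_logits` are hypothetical names invented for illustration; this is not MLX's implementation.

```python
# Toy illustration only: ToyQuantizedEmbedding is a hypothetical class,
# not MLX's quantized embedding, and uses a simple per-row scale scheme.


class ToyQuantizedEmbedding:
    def __init__(self, dense_rows, levels=127):
        self.scales = []
        self.weight = []  # integer codes, NOT the dense embedding matrix
        for row in dense_rows:
            # One scale per row; fall back to 1.0 for an all-zero row.
            scale = max(abs(v) for v in row) / levels or 1.0
            self.scales.append(scale)
            self.weight.append([round(v / scale) for v in row])

    def _dequantized_row(self, i):
        return [code * self.scales[i] for code in self.weight[i]]

    def __call__(self, token_id):
        # Embedding lookup: dequantize the selected row.
        return self._dequantized_row(token_id)

    def as_linear(self, h):
        # Tied output projection: one logit per vocabulary row,
        # computed against the dequantized weights.
        return [
            sum(x * w for x, w in zip(h, self._dequantized_row(i)))
            for i in range(len(self.weight))
        ]


def naive_tied_logits(h, weight):
    """The old pattern, h @ weight.T, which treats `weight` as dense."""
    return [sum(x * w for x, w in zip(h, row)) for row in weight]


dense = [[0.5, -1.0], [0.25, 0.75], [-0.5, 0.0]]
wte = ToyQuantizedEmbedding(dense)
h = [1.0, 2.0]

reference = naive_tied_logits(h, dense)    # logits from the dense matrix
tied = wte.as_linear(h)                    # close to `reference`
broken = naive_tied_logits(h, wte.weight)  # multiplies raw integer codes
```

In this toy setup `tied` matches `reference` up to quantization error, while `broken` is off by orders of magnitude because it multiplies against the stored codes. Routing the projection through `as_linear` keeps the model code the same whether the embedding is dense or quantized.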