Neil Mehta
956da0ddc7
Create sampler chain
2025-03-08 14:10:33 -05:00
Neil Mehta
932b7c0510
top_k and min_p refactor
2025-03-08 10:12:28 -05:00
Neil Mehta
58e912966a
top_p refactor
2025-03-08 08:55:49 -05:00
Awni Hannun
f44a52e2dc
batched min p and fix spec gen sampling ( #1222 )
2025-01-27 15:40:31 -08:00
Awni Hannun
db109184b7
Fix no template prompt + top_k sampling ( #1166 )
* fix no template prompt
* add top_k sampling
* fix chinese
2024-12-18 18:46:50 -08:00
Awni Hannun
0f135396ae
Generation refactor: part 2 ( #1099 )
* unify with stream_generate
* fixes
* nit
* some cleanup, warnings, tests
* fix test + faster min p + test
* version
2024-11-23 11:47:06 -08:00
Awni Hannun
9b83004631
Faster sampling with mx.compile ( #937 )
* faster sampling with compile
* fix test
2024-08-15 11:29:09 -07:00
Anchen
297a908e3d
fix(mlx-lm): type hints in gguf.py ( #621 )
2024-03-26 07:56:01 -07:00
Anchen
0ab01b4626
fix(mlx-lm): sorted probs in top_p implementation. ( #610 )
* fix(mlx-lm): the top p imp
* chore: address comment
2024-03-25 15:07:55 -07:00
Anchen
fbed720d6f
chore(mlx-lm): fix the top_p implementation. ( #602 )
* chore(mlx-lm): clean up the top p imp
* chore: clean up
* chore: add test
* chore: address comments
* chore: clean up docs string
* chore: clean up test
2024-03-21 12:18:23 -07:00