mlx-examples/llms
Benjamin Anderson 09566c7257
add speculative decoding example for llama (#149)
* speculative decoding

* add sample 0

* spec decode gives same results as regular decode

* rebase

* use accept reject criteria

* switch to t5

* update readme

* readme nit

* nits

* nits

* nits

---------

Co-authored-by: Benjamin Anderson <benjamin@Benjamins-MBP.lan>
Co-authored-by: Awni Hannun <awni@apple.com>
2023-12-28 15:20:43 -08:00
Name                  Last commit                                                        Date
llama                 Fixed the return type for the __call__ method in Attention (#190)  2023-12-26 09:32:43 -08:00
mistral               Fix conversion + inference errors. - Mistral (#176)                2023-12-22 14:10:25 -08:00
mixtral               Fix generate example in README (#197)                              2023-12-27 13:11:10 -08:00
phi2                  Quantize example (#162)                                            2023-12-21 12:59:37 -08:00
qwen                  QWEN: Fix unsupported ScalarType BFloat16 (#187)                   2023-12-25 06:10:01 -08:00
speculative_decoding  add speculative decoding example for llama (#149)                  2023-12-28 15:20:43 -08:00