mlx-examples/llms/gguf_llm/generate.py at 09ed837896dc2ca2393adfc1e6b68beaac049288

mirror of https://github.com/ml-explore/mlx-examples.git synced 2025-12-16 02:08:55 +08:00

Juarez Bochi f5b80c95fb Example reading directly from gguf file (#222)

* Draft of tiny llama from gguf

* Transpose all

* No transposition with new layout

* Read config from gguf

* Create tokenizer from gguf

* move gguf and update to be similar to hf_llm

* change model to HF style + updates to README

* nits in README

* nit readme

* only use mlx for metadata

* fix eos/bos tokenizer

* fix tokenization

* quantization runs

* 8-bit works

* tokenizer fix

* bump mlx version

---------

Co-authored-by: Juarez Bochi <juarez.bochi@grammarly.com>
Co-authored-by: Awni Hannun <awni@apple.com>

2024-01-23 15:41:54 -08:00

2.2 KiB
