mlx-examples/llms/mlx_lm
vishal-14069 21e19b5b5a
Add Repetitive penalty to LLM inference - mlx-lm (#399)
* feat: add repetition penalty

* fix: generate function argument fix

* typo fixes

* update repetitive penalty

* update generate_step and generate

* resolve conflicts in generate

* merge latest pull origin master

* update generate

* update generate and generate_step

* update repetition list - rename variable

* refactor token count

* update generate step and generate

* move repetition_context in generate_step

* update generate step

* update generate_step
2024-02-16 21:58:17 -08:00
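The commits above add a repetition penalty to the sampling loop. A minimal sketch of that idea in plain Python is below; the function name and signature are assumptions for illustration, not the mlx-lm implementation. The standard scheme scales the logits of tokens already present in the context: positive logits are divided by the penalty and negative ones multiplied, so repeated tokens become less likely either way.

```python
def apply_repetition_penalty(logits, repetition_context, penalty=1.3):
    """Return a copy of `logits` with the penalty applied to tokens
    that already appear in `repetition_context` (a list of token ids)."""
    penalized = list(logits)
    for token_id in set(repetition_context):
        score = penalized[token_id]
        # Divide positive scores, multiply negative ones, so the
        # penalized token's probability always decreases.
        penalized[token_id] = score / penalty if score > 0 else score * penalty
    return penalized
```

A penalty of 1.0 leaves the logits unchanged; values around 1.1-1.3 are common starting points.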
models Change gqa to use repeat instead of concatenate (#443) 2024-02-14 17:40:11 -08:00
tuner LoRA: add training callbacks (#414) 2024-02-16 06:04:57 -08:00
__init__.py Mlx llm package (#301) 2024-01-12 10:25:56 -08:00
convert.py feat: move lora into mlx-lm (#337) 2024-01-23 08:44:37 -08:00
fuse.py feat(mlx-lm): add de-quant for fuse.py (#365) 2024-01-25 18:59:32 -08:00
generate.py fix the chinese character generation as same as PR #321 (#342) 2024-01-23 12:44:23 -08:00
LORA.md feat: move lora into mlx-lm (#337) 2024-01-23 08:44:37 -08:00
lora.py LoRA: add training callbacks (#414) 2024-02-16 06:04:57 -08:00
py.typed Add py.typed to support PEP-561 (type-hinting) (#389) 2024-01-30 21:17:38 -08:00
README.md feat: move lora into mlx-lm (#337) 2024-01-23 08:44:37 -08:00
requirements.txt Update a few examples to use compile (#420) 2024-02-08 13:00:41 -08:00
UPLOAD.md Mlx llm package (#301) 2024-01-12 10:25:56 -08:00
utils.py Add Repetitive penalty to LLM inference - mlx-lm (#399) 2024-02-16 21:58:17 -08:00

Generate Text with MLX and 🤗 Hugging Face

This is an example of large language model text generation that can pull models from the Hugging Face Hub.

For more information on this example, see the README in the parent directory.

This package also supports fine-tuning with LoRA or QLoRA. For more information, see the LoRA documentation.
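To illustrate the core idea behind LoRA, here is a self-contained sketch in plain Python (the function name and shapes are assumptions for illustration, not mlx-lm's code). Instead of updating a full weight matrix W of shape d x k, LoRA learns two small matrices B (d x r) and A (r x k) and uses W + B @ A as the effective weight, so only r * (d + k) parameters are trained when r is much smaller than d and k.

```python
def lora_forward(W, A, B, x):
    """Compute (W + B @ A) @ x without materializing the full d x k update.
    W: d x k base weights, A: r x k, B: d x r low-rank factors, x: length-k input."""
    # Base path: W @ x
    base = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    # Low-rank path: first A @ x (length r), then B @ (A @ x) (length d)
    inner = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    delta = [sum(b * di for b, di in zip(row, inner)) for row in B]
    return [b + d for b, d in zip(base, delta)]
```

Computing the update as B @ (A @ x) rather than forming B @ A keeps both memory and compute proportional to the rank r.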