mlx-examples/llms/mlx_lm/tuner
fc93c55723
feat(mlx_lm): Nemotron (#949)
* feat: Nemotron

https://huggingface.co/nvidia/Minitron-4B-Base

This is basically Llama with partial RoPE and LayerNorm instead of
RMSNorm. They also add 1 to the LayerNorm weight for some reason.
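For readers unfamiliar with these two quirks, here is a minimal sketch in MLX. It is an illustration, not the code added in this PR: the class name `LayerNorm1P`, the head dimension, and the rotary fraction are assumed values for the example.

```python
import mlx.core as mx
import mlx.nn as nn


class LayerNorm1P(nn.Module):
    """Illustrative LayerNorm whose effective scale is (weight + 1),
    mirroring the '+1 to the LayerNorm weight' quirk described above."""

    def __init__(self, dims: int, eps: float = 1e-5):
        super().__init__()
        self.weight = mx.zeros((dims,))  # stored around 0, applied as weight + 1
        self.bias = mx.zeros((dims,))
        self.eps = eps

    def __call__(self, x: mx.array) -> mx.array:
        mean = x.mean(axis=-1, keepdims=True)
        var = x.var(axis=-1, keepdims=True)
        x_hat = (x - mean) * mx.rsqrt(var + self.eps)
        return x_hat * (self.weight + 1.0) + self.bias


# Partial RoPE: rotate only a fraction of each attention head's dimensions.
# mlx.nn.RoPE leaves any dimensions beyond `dims` unrotated.
head_dim, rotary_pct = 128, 0.5  # illustrative values, not Nemotron's config
partial_rope = nn.RoPE(int(head_dim * rotary_pct))
```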

* fixup! feat: Nemotron

* nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-08-29 21:08:57 -07:00
__init__.py LoRA: Extract small function (#614) 2024-06-02 06:38:42 -07:00
datasets.py Configuration-based use of HF hub-hosted datasets for training (#701) 2024-06-26 10:20:50 -07:00
dora.py Allow the entire model to be targeted for LoRA and DoRA fine-tuning: LoRA and DoRA embeddings with small DoRALinear bug fix (#914) 2024-08-16 07:38:36 -07:00
lora.py Allow the entire model to be targeted for LoRA and DoRA fine-tuning: LoRA and DoRA embeddings with small DoRALinear bug fix (#914) 2024-08-16 07:38:36 -07:00
trainer.py Add eos token to lora fine-tunes (#818) 2024-06-12 07:44:21 -07:00
utils.py feat(mlx_lm): Nemotron (#949) 2024-08-29 21:08:57 -07:00