Commit Graph

546 Commits

Author SHA1 Message Date
Angelos Katharopoulos
fc88e3b0d0 Fix gradient accumulation averaging 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
3c587ed618 Fixes and default args adjustment 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
807bd66b80 Enable generation with a trained adapter 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
ecd8828e33 Further refactoring 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
f538394eec Cleanup the dreambooth 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
446d8b6439 Revert SD dreambooth 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
b54218ea08 General updates 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
19dc28f08a Fix time schedule 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
27aaff8f31 Finetune all layers 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
d9c5fd5ba4 Add the raw option to txt2image 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
bb8436a441 Update dataset 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
f2ccad52f4 Add lr schedule 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
e7751e4c29 Add gradient accumulation and data parallelism 2024-10-10 03:28:44 -07:00
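For context on this commit and the averaging fix further up the log, a minimal sketch of accumulate-then-average data-parallel training in MLX; all names and structure are illustrative assumptions, not the commits' code:

```python
# Hedged sketch: accumulate gradients over k micro-batches, average them
# (dividing by micro-batches * world size), then take one optimizer step.
# All names here are illustrative assumptions, not the commits' code.
import mlx.core as mx
import mlx.nn as nn
from mlx.utils import tree_map

def train_step(model, optimizer, loss_fn, micro_batches):
    grad_fn = nn.value_and_grad(model, loss_fn)
    acc = None
    for xb, yb in micro_batches:
        loss, grads = grad_fn(model, xb, yb)
        acc = grads if acc is None else tree_map(mx.add, acc, grads)
    # Sum across ranks, then divide so the result is a mean, not a sum
    n = len(micro_batches) * mx.distributed.init().size()
    acc = tree_map(lambda g: mx.distributed.all_sum(g) / n, acc)
    optimizer.update(model, acc)
    mx.eval(model.parameters(), optimizer.state)
```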
Angelos Katharopoulos
7cffcdcaff Flux lora training 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
9eef46e645 Refactor the pipeline 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
aefe60e79d Avoid upcasting and fix batch size > 1 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
070c58ed92 Bugfix in t5 rpos and initial generation example 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
88603f0330 Add the tokenizers 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
63932d777c Working clip, t5 and flux model 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
ed17f815f5 Flux implementation in examples 2024-10-10 03:28:44 -07:00
Angelos Katharopoulos
f61f4b5cf1 Start a stable diffusion dreambooth example 2024-10-10 03:28:44 -07:00
Awni Hannun
4360e7ccec clear cache during prompt processing (#1027) 2024-10-09 16:48:32 -07:00
Awni Hannun
b7373cb44f fix long prompt generations (#1023) 2024-10-09 11:09:36 -07:00
Awni Hannun
fca087be49 More cache improvements (#1015)
* fix rotating kv cache for chat use case

* reorg + fixes to caching; unify prompt caching across cache types and use cases, e.g. caching during a chat

* nit in chat

* fix tests

* fix tests

* fix tests

* docs

* chat command

* comments + docs

* Define meta_state on all Cache implementations

* fixes + trim_prompt_cache api

* fix default model

---------

Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
2024-10-07 20:45:51 -07:00
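For reference, the unified prompt-cache workflow this PR describes might be used roughly as follows; the import paths and signatures are assumptions based on the bullet list above:

```python
# Hedged sketch of the unified prompt-cache workflow described above.
# Import paths and signatures are assumptions from the commit summary.
from mlx_lm import load, generate
from mlx_lm.models.cache import (
    make_prompt_cache, save_prompt_cache, load_prompt_cache, trim_prompt_cache,
)

model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")
cache = make_prompt_cache(model)

# Reuse the same cache across turns of a chat
generate(model, tokenizer, prompt="Hello!", prompt_cache=cache)
generate(model, tokenizer, prompt="And a follow-up.", prompt_cache=cache)

# Persist, restore, or drop the last N tokens of the cache
save_prompt_cache("chat.safetensors", cache)
cache = load_prompt_cache("chat.safetensors")
trim_prompt_cache(cache, 10)
```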
Awni Hannun
9bc53fc210 convert (#1006) 2024-10-02 13:13:33 -07:00
madroid
36c1d8e8dc Server: support function calling (#1003) 2024-10-02 12:36:07 -07:00
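For context, an OpenAI-style function-calling request against mlx_lm.server might look like the following; the model name and port are assumptions:

```python
# Hedged example of an OpenAI-style tools request to mlx_lm.server.
# The endpoint shape follows the OpenAI chat API; model and port are assumptions.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "mlx-community/Meta-Llama-3.1-8B-Instruct-4bit",
        "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    },
)
print(resp.json())
```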
nathan
0866e23a67 repetition_penalty and logits_bias just using logits_processors (#1004)
* refactor of repetition_penalty and logits_bias to use logits_processor

* nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-09-30 08:49:03 -07:00
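In this refactor a logits processor is just a callable mapping (generated tokens, logits) to new logits; a hedged sketch of repetition penalty in that shape (the exact signature is an assumption):

```python
# Hedged sketch: repetition penalty expressed as a logits processor,
# i.e. a callable (tokens, logits) -> logits. The exact signature the
# refactor uses is an assumption.
import mlx.core as mx

def make_repetition_penalty(penalty: float = 1.1):
    def processor(tokens: mx.array, logits: mx.array) -> mx.array:
        if len(tokens) == 0:
            return logits
        selected = logits[:, tokens]
        # Push already-seen tokens down: shrink positive logits, grow negative ones
        selected = mx.where(selected < 0, selected * penalty, selected / penalty)
        logits[:, tokens] = selected
        return logits
    return processor
```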
Zai Thottakath
418d9a5511 Feature: QDoRA (#891)
* feat: QDoRA with tests and a small bug fix for recalculation of self.m

* some simplifications and fixes

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-09-30 08:01:11 -07:00
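DoRA decomposes the adapted weight into a learned magnitude m (the self.m recalculated above) and a direction; a hedged sketch of that recomputation, not the PR's code:

```python
# Hedged sketch of the DoRA weight recomputation: the LoRA-adapted weight is
# normalized per output row and rescaled by the learned magnitude vector m
# (the self.m mentioned above). Illustrative only, not the PR's implementation.
import mlx.core as mx

def dora_weight(W, A, B, m, scale):
    adapted = W + scale * (B @ A)          # base (possibly dequantized) + LoRA update
    norm = mx.linalg.norm(adapted, axis=1, keepdims=True)
    return m * adapted / norm              # keep the direction, rescale by magnitude
```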
madroid
aa1c8abdc6 LoRA: Support HuggingFace dataset via data parameter (#996)
* LoRA: support huggingface dataset via `data` argument

* LoRA: Extract the load_custom_hf_dataset function

* LoRA: split small functions

* fix spelling errors

* handle load hf dataset error

* fix pre-commit lint

* update data argument help

* nits and doc

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-09-30 07:36:21 -07:00
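A hedged example of the configuration this enables; the hf_dataset key names follow the LoRA docs, while the dataset and features are illustrative:

```yaml
# Hedged example: loading a Hugging Face dataset for LoRA fine-tuning.
# Key names follow the LoRA docs; the dataset and features are illustrative.
hf_dataset:
  name: "billsum"
  train_split: "train[:1000]"
  valid_split: "train[-100:]"
  prompt_feature: "text"
  completion_feature: "summary"
```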
Gökdeniz Gülmez
50e5ca81a8 Adding full finetuning (#903)
* Adding full model weights finetuning

* Updating the LORA.md and ACKNOWLEDGMENTS.md files.

* removing --use-dora and --full-training and adding --fine-tune-type

* some clean up

* reformatting and fixing dora training

* updated CONFIG_DEFAULTS

* update config example

* update in the config example file

* Update LORA.md

* merge and commit

* adding argument for dora linear layer

* clean up

* clean up in the example yaml file

* fix

* final fix before sending

* small addition to the README file

* fix for loading the fully trained model by saving all the files and configs correctly

* clean up

* removing the unnecessary files

* changing lora layers back to 16

* removed max file size

* nits

* resolve merge

* some consistency changes

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-09-29 17:12:47 -07:00
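A hedged usage sketch of the resulting interface; the --fine-tune-type flag comes from the PR description, while the model and data paths are placeholders:

```shell
# Hedged usage sketch: flag name per the PR description; model and data
# paths are placeholders.
mlx_lm.lora \
    --model mistralai/Mistral-7B-v0.3 \
    --train \
    --fine-tune-type full \
    --data ./my_data
```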
madroid
7ec2021bb9 LoRA: support tools(function calling) format datasets (#995)
* LoRA: support fine-tuning tools datasets

* LoRA: Split small function

* LoRA: add tools format to lora docs

* LoRA: pre-commit fix

* Revert "LoRA: pre-commit fix"

This reverts commit b94b7e0fe7.

* Revert "LoRA: Split small function"

This reverts commit 3f6a5f19fd.

* LoRA: remove ToolsDataset

In a JSONL file, not every record is required to include the tools value.

* nit in readme

* nit in readme

* nit in readme

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-09-28 10:41:36 -07:00
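In the tools format, each JSONL record pairs a messages list with an optional tools list; a hedged single-record example (all field values illustrative):

```json
{"messages": [{"role": "user", "content": "What is the weather in Paris?"}, {"role": "assistant", "tool_calls": [{"id": "call_1", "type": "function", "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"}}]}], "tools": [{"type": "function", "function": {"name": "get_weather", "description": "Get the current weather for a city", "parameters": {"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]}}}]}
```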
nathan
ace2bb5890 Add logits_processor option to generate_step function (#983)
* Add a logits_processor option for generation, as in the Hugging Face transformers library

* concatenation correction

* Rename the tokens variable for clarity

* remove the logit_bias argument from generate_step method

* fix the variable name

* nits + test

* test

* add back logit bias + test

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-09-28 10:08:49 -07:00
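A hedged sketch of hooking a processor into generate_step; the list-valued parameter name follows the later logits_processors refactor further up this log, and the call shape is an assumption:

```python
# Hedged sketch: a logit-bias-style processor passed to generate_step.
# Parameter name (logits_processors) and call shape are assumptions.
import mlx.core as mx
from mlx_lm import load
from mlx_lm.utils import generate_step

model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

def bias_eos(tokens: mx.array, logits: mx.array) -> mx.array:
    # Nudge the model toward emitting EOS
    logits[:, tokenizer.eos_token_id] = logits[:, tokenizer.eos_token_id] + 5.0
    return logits

prompt = mx.array(tokenizer.encode("Hello"))
out = []
for (token, _), _ in zip(generate_step(prompt, model, logits_processors=[bias_eos]), range(20)):
    out.append(token.item())
print(tokenizer.decode(out))
```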
jamesm131
d812516d3d Add /v1/models endpoint to mlx_lm.server (#984)
* Add 'models' endpoint to server

* Add test for new 'models' server endpoint

* Check hf_cache for mlx models

* update tests to check hf_cache for models

* simplify test

* doc

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-09-28 07:21:11 -07:00
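A hedged example of querying the new endpoint; the port is an assumption:

```python
# Hedged example: list the models the server can serve (port is an assumption).
import requests

print(requests.get("http://localhost:8080/v1/models").json())
```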
Gökdeniz Gülmez
76710f61af Adding support for mamba (#940)
* initial commit

* initial commit

* Adding first lines

* adding x and dt projection layers

* adding the clamping mechanism

* First successful inference

* last commit for today: added a custom generate function and it works as expected; will try training and then loading a model from the hub

* clean up

* save up

* almost

* update

* update

* fixed cache handling

* fixed loading

* added a separate generate_step method in the model and in the utils, to automatically use the model's generate_step method

* quick update

* still not working

* save

* still not working

* initial commit

* utils.py: logits = logits[:, -1, :] raised TypeError: tuple indices must be integers or slices, not tuple

* update

* update

* Fixing the batching, depthwise convolution, and multi-token input

* fixing generate and logits outputs

* Done!

* Fixing the cache handling; generation works, now trying training

* update ACKNOWLEDGMENTS

* removing the model_type conditionals from the _step loop in generate_step, adding MambaCache in base.py to make generation during training easier, and removing mamba from tuner/utils

* quick clean up

* update trainer/utils for correct initialization of the LoRA layers, but not working yet

* clean up

* Further update to trainer/utils for correct layer selection. Successful training

* removing extra mamba-infer.py file

* clean up; reformatting will come later

* reformat and big clean up, final commit

* some speedups and cleanups

* fix test

* nits

* nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-09-28 07:02:53 -07:00
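A hedged usage sketch once mamba support is in; the checkpoint id is an assumption:

```shell
# Hedged usage sketch; the checkpoint id is an assumption.
mlx_lm.generate --model state-spaces/mamba-130m-hf --prompt "The meaning of life is"
```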
Cheng
e776c970f7 Fix llava model when using text-only prompt (#998) 2024-09-25 07:19:41 -07:00
Awni Hannun
9bb2dd62f3 Encodec (#991)
* initial encodec

* works

* nits

* use fast group norm

* fix for rnn layer

* fix mlx version

* use custom LSTM kernel

* audio encodec

* fix example, support batched inference

* nits
2024-09-23 11:39:25 -07:00
Angelos Katharopoulos
796d5e40e4 Fix export to gguf (#993) 2024-09-20 13:33:45 -07:00
Awni Hannun
f530f56df2 don't use internal exception (#990) 2024-09-17 16:22:48 -07:00
Awni Hannun
6c2369e4b9 Fix bug in upload + docs nit (#981)
* fix bug in upload + docs nit

* nit
2024-09-07 14:46:57 -07:00
Awni Hannun
c3e3411756 Update LLM generation docs to use chat template (#973)
* fix docs

* add template to model cards as well

* revert

* version
2024-09-07 06:06:15 -07:00
Angelos Katharopoulos
324184d670 Fix the cache_prompt (#979) 2024-09-06 20:19:27 -07:00
madroid
bd29aec299 Support HuggingFace model tree (#957)
* Hub: Update quantization configuration fields

* Hub: add base_model metadata

* Hub: add quantization_config for model tree Quantized type

* Hub: update quantization_config value

* Hub: remove config print
2024-09-04 06:19:32 -07:00
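The Hub model tree is driven by model-card metadata; a hedged sketch of the base_model field these commits add (value illustrative):

```yaml
# Hedged sketch of model-card metadata for the Hub model tree; value illustrative.
base_model: mistralai/Mistral-7B-v0.1
```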
Chime Ogbuji
83a209e200 Add prompt piping (#962)
* Initial commit of --prompt-only and prompt from STDIN feature

* Switch to using --verbose instead of --prompt-only

* Fix capitalization typo

* Fix reference to changed option name

* Update exception text
2024-09-03 13:29:10 -07:00
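A hedged sketch of the piping workflow; "-" as the stdin sentinel is an assumption from the PR description:

```shell
# Hedged sketch of prompt piping; "-" as the stdin sentinel is an assumption.
cat prompt.txt | mlx_lm.generate --model mlx-community/Mistral-7B-Instruct-v0.3-4bit --prompt -
```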
James Zhao
bf921afcbe Make sure to import the correct "version" module when installing mlx_whisper and mlx_lm from local source code. (#969)
* Make sure to import the correct "version" module when installing the
mlx_whisper package from local source code.

* Make sure to import the correct "version" module when installing the mlx_lm package from local source code

* fix

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-09-03 13:16:21 -07:00
Awni Hannun
3c6e8b11af fix (#965) 2024-08-30 05:56:27 -07:00
L
fc93c55723 feat(mlx_lm): Nemotron (#949)
* feat: Nemotron

https://huggingface.co/nvidia/Minitron-4B-Base

This is basically Llama with partial RoPE and LayerNorm instead of
RMSNorm. Also, they add 1 to the LayerNorm weight for some reason.

* fixup! feat: Nemotron

* nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-08-29 21:08:57 -07:00
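A hedged sketch of the "+1 on the LayerNorm weight" detail above: the learned scale is stored offset by one and added back when the norm is applied; illustrative, not the PR's code:

```python
# Hedged sketch: a LayerNorm whose learned scale is stored as (weight - 1),
# so 1 is added back at application time. Illustrative, not the PR's code.
import mlx.core as mx
import mlx.nn as nn

class PlusOneLayerNorm(nn.Module):
    def __init__(self, dims: int, eps: float = 1e-5):
        super().__init__()
        self.weight = mx.zeros((dims,))  # effective scale is 1 + weight
        self.bias = mx.zeros((dims,))
        self.eps = eps

    def __call__(self, x: mx.array) -> mx.array:
        mean = mx.mean(x, axis=-1, keepdims=True)
        var = mx.var(x, axis=-1, keepdims=True)
        xn = (x - mean) * mx.rsqrt(var + self.eps)
        return (1 + self.weight) * xn + self.bias
```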
Awni Hannun
b1186e2a81 Docs on prompt scaling (#963)
* docs on prompt scaling

* remove unused var

* nits
2024-08-29 15:05:17 -07:00
Angelos Katharopoulos
1003a8b2dd Add the ability to load the KV cache from a file (#956) 2024-08-28 22:11:45 -07:00
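A hedged usage sketch; the command exists per the PR title, but the flag names are assumptions and have been renamed in later versions:

```shell
# Hedged usage sketch; the command comes from the PR, but flag names are
# assumptions (renamed in later versions).
mlx_lm.cache_prompt \
    --model mlx-community/Mistral-7B-Instruct-v0.3-4bit \
    --prompt "A very long shared prefix..." \
    --kv-cache-file prefix.safetensors

mlx_lm.generate --kv-cache-file prefix.safetensors --prompt "A question about it"
```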
Angelos Katharopoulos
7f8c961287 Fix setattr for the TokenizerWrapper (#961) 2024-08-28 14:47:33 -07:00
Nripesh Niketan
bf21789b17 chore: update black pre-commit hooks to latest versions (#955) 2024-08-26 07:54:23 -07:00