Commit Graph

26 Commits

Author SHA1 Message Date
Pedro Cuenca
e2e5478da5
READMEs: fix typo in link, minor update. (#1246) 2025-02-04 11:52:32 -08:00
Awni Hannun
c4833a2f55
fix encoding with special tokens + chat template (#1189) 2025-01-03 10:50:59 -08:00
vb
1727959a27
Add mentions of MLX-my-repo. (#1129)
* Add mentions of MLX-my-repo.

* simplify

* move

* move

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-12-03 19:21:39 -08:00
Awni Hannun
0f135396ae
Generation refactor: part 2 (#1099)
* unify with stream_generate

* fixes

* nit

* some cleanup, warnings, tests

* fix test + faster min p + test

* version
2024-11-23 11:47:06 -08:00
Awni Hannun
657b4cc0aa
[MLX LM] Sampler refactor + a few improvements (#1094)
* starting

* refactor sampler/processor and a few improvements

* fix stream

* fix stream generate

* fix eos handling in stream generate
2024-11-07 16:15:24 -08:00
ilyasch2
3b526f0aa1
Add support for falcon-mamba (#1074)
* Add support for falcon-mamba

* nits

* nit

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-11-04 12:23:30 -08:00
Awni Hannun
9f34fdbda4
Wire models in MLX LM (#1069)
* wired in MLX LM

* fix synch

* comment + nit

* version

* mlx lm version

* bump to 0.19.2
2024-10-31 08:17:14 -07:00
Awni Hannun
fca087be49
More cache improvements (#1015)
* fix rotating kv cache for chat use case

* reorg + fixes to caching, unify prompt caching across types and use cases for e.g. caching during a chat

* nit in chat

* fix tests

* fix tests

* fix tests

* docs

* chat command

* comments + docs

* Define meta_state on all Cache implementations

* fixes + trim_prompt_cache api

* fix default model

---------

Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
2024-10-07 20:45:51 -07:00
Gökdeniz Gülmez
50e5ca81a8
Adding full finetuning (#903)
* Adding full model weights finetuning

* Updating the LORA.md and ACKNOWLEDGMENTS.md files.

* removing --use-dora and --fulll-training and adding --fine-tune-type

* some clean up

* reformating and fixing dora training

* updated CONFIG_DEFAULTS

* update config example

* update in the config example fie

* Update LORA.md

* merge and commit

* adding argument for dora linear layer

* clean up

* clean up in the example yaml file

* fix

* final fix before sending

* small addition to re md file

* fix for loading the fully trained model by saving all the files and configs correctly

* clean up

* removing the unnesesairy files

* changing lora layers back to 16

* removed max file size

* nits

* resolve merge

* some consistency changes

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-09-29 17:12:47 -07:00
Awni Hannun
c3e3411756
Update LLM generation docs to use chat template (#973)
* fix docs

* add template to model cards as well

* revert

* version
2024-09-07 06:06:15 -07:00
Awni Hannun
b1186e2a81
Docs on prompt scaling (#963)
* docs on prompt scaling

* remove unused var

* nits
2024-08-29 15:05:17 -07:00
Alex Wozniakowski
63800c8feb
Example of response generation with optional arguments (#853)
* Generate response with optional arguments

* Reference response generation example

* Include transformers and sentencepiece

* Update example to run Mistral-7B-Instruct-v0.3

* Link to generation example

* Style changes from pre-commit
2024-07-09 06:49:59 -07:00
Michał Kurc
43d6deb3c1
mlx_lm: Add Streaming Capability to Generate Function (#807)
* Add streaming feature to text generation function

* separate stream and regular functions

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-06-03 09:04:39 -07:00
Chen Xin
aac98ca6f4
support internlm2 (#797)
* support internlm2

* only attention projections

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-05-27 06:22:21 -07:00
Phúc H. Lê Khắc
35206806ac
Create executables for generate, lora, server, merge, convert (#682)
* feat: create executables mlx_lm.<cmd>

* nits in docs

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-04-16 16:08:49 -07:00
Muhtasham Oblokulov
81e2a80026
Add Starcoder 2 (#502)
* Add Starcoder2 model and update utils.py

* Refactor model arguments and modules in starcoder2.py

* Refactor FeedForward class to MLP in starcoder2.py

* Fix typo

* pre-commit

* Refactor starcoder2.py: Update model arguments and modules

* Fix LM head and MLP layers

* Rename  input layer norm

* Update bias in linear layers

* Refactor token embeddings in Starcoder2Model

* Rename to standard HF attention layer name

* Add LayerNorm

* Add transposed token embeddings (like in Gemma)

* Refactor MLP and TransformerBlock classes

* Add tie_word_embeddings option to ModelArgs and update Model implementation

* Add conditional check for tying word embeddings in Starcoder2Model

* Fix bias in lm_head linear layer

* Remove unused LayerNorm in stablelm

* Update transformers dependency to use GitHub repository

* fix lm head bug, revert transformer req

* Update RoPE initialization in Attention class

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-02 19:39:23 -08:00
Awni Hannun
8fd953ee2b
Support for slerp merging models (#455)
* support for slerp merging models

* docs

* update docs

* format'
2024-02-19 20:37:15 -08:00
Ashish
0b57f0eae6
Add StableLM-2 1.6B (#378)
* init

* stablelm

* add to readme

* bump version

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-26 10:28:00 -08:00
Anchen
362e88a744
feat: move lora into mlx-lm (#337)
* feat: Add lora and qlora training to mlx-lm


---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-23 08:44:37 -08:00
Shunta Saito
85c1ff8fd6
Add PLaMo-13B model as an LLM example (#303)
* Convert HF weights of PLaMo and load it to a plamo model in mlx

* Fix model inference part

* Add bos at the beginning of the prompt

* Fix convert.py to copy tokenizer.model into the converted dir

* Use the required insturction format in generate.py when "--instruct" option is specified

* Change filenames and update existing scripts

* Add README

* Add requirements.txt

* Fix plamo.py to stop generation when EOS appears

* Add quantization to convert.py

* Use mlx>=0.0.9 for mx.core.outer() in PLaMo model

* Update acknowledgements.md

* Fix card text in upload_to_hub()

* Not use prompt template when --instruct is not specified

* Ask if you trust_remote_code for loading tokenizer of PLaMo

* Check the user trusts the remote code when converting

* Remove plamo directory

* Update README

* Add PLaMo model file

* Fix the handling of cache in PLaMo and update README

* Ask if trust_remote_code only when the model is PLaMo

* Remove resolve_trust_remote_code from convert.py and use the latest transformers

* Remove code not to add EOS

* Update README to fix an example not to use noncommercial version of the model

* Remove unused imports

* Remove unnecessary description about the instruct model of PLaMo from README

* format, nits in README

* typo

---------

Co-authored-by: Shunta Saito <shunta@mitmul-mbp.local>
Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-23 07:17:24 -08:00
Sugato Ray
a445ac2895
Update docs with conda install option (#354) 2024-01-22 21:14:48 -08:00
Anchen
30be4c4734
refactor(qwen): moving qwen into mlx-lm (#312)
* refactor(qwen): moving qwen into mlx-lm

* chore: update doc

* chore: fix type hint

* add qwen model support in convert

* chore: fix doc

* chore: only load model in quantize_model

* chore: make the convert script only copy tokenizer files instead of load it and save

* chore: update docstring

* chore: remove unnecessary try catch

* chore: clean up for tokenizer and update  transformers 4.37

* nits in README

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-22 15:00:07 -08:00
Anchen
195bec2fa3
feat(mlx_lm): add mixtral support in mlx_lm (#318)
* feat: add mixtral support in mlx_lm

* chore: update doc
2024-01-15 07:18:14 -08:00
Marcel Bischoff
cd3cff0858
Phixtral (#290)
* initial

* file

* remove debug

* Adding README

* typo

* simplify readme

* nits in readmes

---------

Co-authored-by: Marcel Bischoff <marcel.bischoff@awarehq.com>
Co-authored-by: Awni Hannun <awni@apple.com>
2024-01-13 08:35:03 -08:00
Anchen
a39b735c3b
chore(mlx-lm): update phi2 model args to sync with hf config format. (#311)
* chore(mlx-lm): update phi2 model args to sync with hf config format

* chore: fix type hint
2024-01-13 07:51:45 -08:00
Awni Hannun
c6440416a2
Mlx llm package (#301)
* fix converter

* add recursive files

* remove gitignore

* remove gitignore

* add packages properly

* read me update

* remove dup readme

* relative

* fix convert

* fix community name

* fix url

* version
2024-01-12 10:25:56 -08:00