Commit Graph

398 Commits

Author SHA1 Message Date
Angelos Katharopoulos
e55a9e8cb4
Add an SPM detokenizer that doesn't trim initial space (#681) 2024-04-15 14:15:25 -07:00
Awni Hannun
d3f8e4aee9
Fix argpartition call in Mixtral and other MOES (#676)
* Update mixtral.py

* fix all moes

---------

Co-authored-by: yuhai-china <yuhai.china@gmail.com>
2024-04-12 11:00:56 -07:00
Awni Hannun
9c5554d8ee
Use async eval (#670)
* Use async eval

* bump

* bump

* remove workaround for bfloat cumsum
2024-04-11 13:18:23 -07:00
Nripesh Niketan
0250f6f38e
feat: Update black-pre-commit-mirror to version 24.3.0 (#675) 2024-04-11 07:28:26 -07:00
devonthomas35
9f472dc985
Update transformers for ⌘-R+ (#668) 2024-04-11 07:28:12 -07:00
da-z
5a4cad34ef
Always resume downloads (#674)
* Always resume downloads

* format

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-04-11 06:52:32 -07:00
Angelos Katharopoulos
eff6690952
Fix CFG for SDXL (#667) 2024-04-09 06:06:41 -07:00
Angelos Katharopoulos
1278994b56
Add streaming detokenizers (#651) 2024-04-08 22:36:01 -07:00
Awni Hannun
c68aa3c7c3
Stable lm 2 (#666)
* stable lm 2

* test and lora

* version bump

* merge stable models
2024-04-08 14:18:55 -07:00
Awni Hannun
1e2f7f50b6
fix for empty initial string (#665) 2024-04-08 10:40:05 -07:00
Awni Hannun
c386dd5f5a
Fix for cohere plus (#650)
* fix for cohere plus

* version bump
2024-04-05 14:11:24 -07:00
Awni Hannun
2bd64b78cf
Save lora config (#636)
* lora config

* comments

* version bump
2024-04-02 13:52:53 -07:00
Prince Canuma
d661440dbb
Add support for qwen2moe (#640)
* add sparsemoe block and update decoder logic

* update file name to match HF

* update name

* Code formatting

* update gates calculation

* add support for Qwen2MoE.

* fix pytest

* code formatting and fix missing comma in utils

* Remove decoder sparse step.

Co-authored-by: bozheng-hit <dsoul0621@gmail.com>

* remove gate layer anti-quantisation

* remove unused argument

---------

Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
2024-04-02 11:33:29 -07:00
Awni Hannun
78c431dc25
cleanup whisper a little (#639) 2024-03-30 13:13:58 -07:00
Chime Ogbuji
f6283ef7ce
Configurable LR schedulers (#604)
* Initial config handler and test

* Added means to run from CLI

* Update lora config loading and tests

* Constrain scheduler config (warmup and minimum LR) for each kind

* Update reference to moved schedule_config module

* Minor fix

* Fix typos

* Moved build_schedule and tests

* nits in schedule config

* flake

* fix path

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-29 13:41:10 -07:00
Awni Hannun
b80adbcc3e
DBRX (#628)
* dbrx

* format

* format

* comments

* change scores slightly

* remove inadvertent import
2024-03-28 21:03:53 -07:00
Anchen
297a908e3d
fix(mlx-lm): type hints in gguf.py (#621) 2024-03-26 07:56:01 -07:00
Anchen
0ab01b4626
fix(mlx-lm): sorted probs in top_p implementation. (#610)
* fix(mlx-lm): the top p imp

* chore: address comment
2024-03-25 15:07:55 -07:00
Awni Hannun
bbfcc103d7
cast around lora adapters (#613) 2024-03-24 19:34:51 -07:00
Awni Hannun
5a52899405
Partially stream de-tokenization (#609)
* partially stream de-tokenization

* don't break full response
2024-03-23 15:32:33 -07:00
Anchen
494cdf8e96
chore: fix lora for moe model (#608) 2024-03-23 07:22:11 -07:00
Awni Hannun
b8a348c1b8
Switch to fast RMS/LN Norm (#603)
* use nn.RMSNorm, use sdpa, cleanup

* bump mlx versions

* minor update

* use fast layer norm

* version bump

* update requirement for whisper

* update requirement for gguf
2024-03-23 07:13:51 -07:00
Anchen
fbed720d6f
chore(mlx-lm): fix the top_p implementation. (#602)
* chore(mlx-lm): clean up the top p imp

* chore: clean up

* chore: add test

* chore: address comments

* chore: clean up docs string

* chore: clean up test
2024-03-21 12:18:23 -07:00
Anchen
fe96ef342f
feat(mlx-lm): export the GGUF (fp16) format model weights from fuse.py (#555)
* wip

* wip

* feat: convert mlx model to gguf f16

* chore: convert norm layer to float32 to avoid overflow issue

* chore: add support for mixtral

* chore: clean up

* chore: remove unused import statement

* chore: clean up weight name mapping

* version and readme

* actual version bump

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-21 10:34:11 -07:00
Anchen
8f906c859a
chore(mlx-lm): enable to apply default chat template (#577)
* chore(mlx-lm): enable to apply default chat template

* Add option to use default chat template

* chore: rename the flag to use default chat template
2024-03-20 21:39:39 -07:00
Ivan Fioravanti
d2a99172a6
Add dropout parameter to lora configuration (#599)
* Add dropout parameter to lora configuration

A dropout parameter has been added to the lora configuration settings in lora_config.yaml. The LoRALinear class in utils.py has been updated to take this new parameter. Additionally, an AttributeError ('types.SimpleNamespace' object has no attribute 'prompt') related to `args.prompt` has been fixed in lora.py.

* Update lora_config.yaml

Set dropout to 0.0 in the sample config file

* format

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-20 08:44:40 -07:00
Anchen
949f63f309
chore(mlx-lm): fix print_trainable_parameters for quant models (#581)
* chore(mlx-lm): fix print_trainable_parameters for quant models

* chore: clean up

* refactor: use layer type to check quant bits

* chore: address comment
2024-03-20 08:41:03 -07:00
Matt Wronkiewicz
373dd6f2a2
Set finish_reason in response (#592) 2024-03-19 20:21:26 -07:00
Alwin Arrasyid
6c3d4c8ba2
add dequantize option to mlx_lm/convert.py (#547) 2024-03-19 19:50:08 -07:00
Chime Ogbuji
6f2fd5daea
Add mlx-lm version information to HF model card (#596)
* Add mlx-lm version information to HF model card

* Update llms/mlx_lm/utils.py

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

* Reverted indentation

* Pre-commit formatting

---------

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
2024-03-19 19:42:03 -07:00
madroid
39d5ca6427
LoRA: report last train info (#595) 2024-03-19 17:29:50 -07:00
yzimmermann
4680ef4413
Enable more BERT models (#580)
* Update convert.py

* Update model.py

* Update test.py

* Update model.py

* Update convert.py

* Add files via upload

* Update convert.py

* format

* nit

* nit

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-19 17:21:33 -07:00
madroid
b0bcd86a40
Support for OpenAI’s fine-tuning dataset format (#548)
* LoRA: move load_dataset to tuner/datasets.py file

* LoRA: support OpenAI chat format datasets

see https://platform.openai.com/docs/guides/fine-tuning/example-format

* LoRA: support OpenAI completion format datasets

* LoRA: formatting dataset timing to reduce memory footprint

* Refactor dataset item access in PromptCompletionDataset

* Update mlx_lm/LORA.md

* Update mlx_lm/LORA.md

* check Unsupported data format

* add tests, fine-tune doc

* add tests, fine-tune doc

* add jinja2 for chat template

* nits in readme

* nits in readme

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-19 16:45:46 -07:00
Abdul Fatir
e05e502c34
Fix scaling when embeddings are tied (#591) 2024-03-18 13:41:07 -07:00
Awni Hannun
e4b19bb9e1
Make attention faster for some models (#574)
* make attention faster for a couple models

* remove unused generation flags

* add comment on lora

* include text files as well
2024-03-14 21:35:54 -07:00
Angelos Katharopoulos
3f3741d229
Fix requirements and image2image strength/steps mismatch (#585) 2024-03-14 12:22:54 -07:00
sweetcard
e2205beb66
Update server.py to add --trust-remote-code to server (#578)
* Update server.py

Add --trust-remote-code to server

* format code by running pre-commit

---------

Co-authored-by: flymonk <zhou.feng@gsafer.com>
2024-03-14 07:05:19 -07:00
Sugato Ray
2cd793dd69
feat: add update_config functionality (#531)
* feat: add `update_config` functionality

- sorts the config for better readability
- updates "_name_or_path" key in config with upload_repo
- sets indentation of 4 spaces
- allows adding other key-value pairs via kwargs
- reduces code duplication
- standardizes config-update across mlx-lm

* feat: standardize updating config

Impacts:
- fuse.py
- merge.py

* update formatting

* remove commented out code

* update func: update_config to save_config

- drop kwargs
- rename func as save_config
- incorporate review suggestions

* update func: save_config

- ensure only config-saving functionality
- function does not return config as a dict anymore
- added review suggestions

* fixed formatting

* update formatting instruction in contribution guide

* nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-14 06:36:05 -07:00
madroid
485180ae91
LoRA: some minor optimizations (#573)
* init training_args in training scope

* Add trainable parameters percentage
2024-03-13 20:26:30 -07:00
madroid
d4e1de1d5b
add peak_memory info to training callback (#572) 2024-03-13 20:17:10 -07:00
Race
376bb9cc44
bert encoder inherits from nn.Module now (#571) 2024-03-13 10:24:21 -07:00
Awni Hannun
14fe868825
version (#570) 2024-03-13 10:09:36 -07:00
Prince Canuma
76c3244cc5
Add support for Cohere's Command-R (#565)
* initial commit for command-R

* update mlp, layernorm, lm_head and model args

* add custom layernorm

* add default to tie_word_embeddings

* add layernorm weight type and refactor

* update layernorm (bias conditional) in model/layers

* fix layer norm use traditional rope

* add test

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-13 07:03:36 -07:00
Anchen
3535408c99
chore(mlx-lm): fix tie_word_embeddings for qwen2 (#566)
* chore: fix tie_word_embeddings for qwen2

* chore: default tie_word_embeddings to True
2024-03-12 21:34:32 -07:00
Awni Hannun
39084e81c2
Some improvements to LoRA (#528)
* set cache_limit

* remove set cache_limit

* cleanup

* add gradient checkpointing

* fix sort

* monkey patch call for checkpoint

* fix example config
2024-03-12 20:02:03 -07:00
Chime Ogbuji
e56d9015ef
LoRA on all linear transformer block layers (#546)
* Add --lora-all-linear option to apply LoRA to all linear transformer block layers

* Moved to YAML config and added specification of rank & alpha

* nits in config, more tests

* nit

* run tests for prs

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-12 07:37:40 -07:00
devonthomas35
fe5edee360
Fix image2image for SDXL (#563)
---------

Co-authored-by: Angelos Katharopoulos <katharas@gmail.com>
2024-03-11 12:18:47 -07:00
zweifisch
d0fa6cfcae
feat: stable-diffusion t2i add --seed (#558) 2024-03-10 06:12:54 -07:00
Awni Hannun
ad3cf5ed98
dropout 0 as default (#549) 2024-03-08 13:07:10 -08:00
Angelos Katharopoulos
3a9e6c3f70
Stable diffusion XL (#516) 2024-03-08 10:24:19 -08:00