Angelos Katharopoulos
e55a9e8cb4
Add an SPM detokenizer that doesn't trim initial space ( #681 )
2024-04-15 14:15:25 -07:00
Awni Hannun
d3f8e4aee9
Fix argpartition call in Mixtral and other MoEs ( #676 )
...
* Update mixtral.py
* fix all moes
---------
Co-authored-by: yuhai-china <yuhai.china@gmail.com>
2024-04-12 11:00:56 -07:00
Awni Hannun
9c5554d8ee
Use async eval ( #670 )
...
* Use async eval
* bump
* bump
* remove workaround for bfloat cumsum
2024-04-11 13:18:23 -07:00
Nripesh Niketan
0250f6f38e
feat: Update black-pre-commit-mirror to version 24.3.0 ( #675 )
2024-04-11 07:28:26 -07:00
devonthomas35
9f472dc985
Update transformers for ⌘-R+ ( #668 )
2024-04-11 07:28:12 -07:00
da-z
5a4cad34ef
Always resume downloads ( #674 )
...
* Always resume downloads
* format
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-04-11 06:52:32 -07:00
Angelos Katharopoulos
eff6690952
Fix CFG for SDXL ( #667 )
2024-04-09 06:06:41 -07:00
Angelos Katharopoulos
1278994b56
Add streaming detokenizers ( #651 )
2024-04-08 22:36:01 -07:00
Awni Hannun
c68aa3c7c3
Stable lm 2 ( #666 )
...
* stable lm 2
* test and lora
* version bump
* merge stable models
2024-04-08 14:18:55 -07:00
Awni Hannun
1e2f7f50b6
fix for empty initial string ( #665 )
2024-04-08 10:40:05 -07:00
Awni Hannun
c386dd5f5a
Fix for cohere plus ( #650 )
...
* fix for cohere plus
* version bump
2024-04-05 14:11:24 -07:00
Awni Hannun
2bd64b78cf
Save lora config ( #636 )
...
* lora config
* comments
* version bump
2024-04-02 13:52:53 -07:00
Prince Canuma
d661440dbb
Add support for qwen2moe ( #640 )
...
* add sparsemoe block and update decoder logic
* update file name to match HF
* update name
* Code formatting
* update gates calculation
* add support for Qwen2MoE.
* fix pytest
* code formatting and fix missing comma in utils
* Remove decoder sparse step.
Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
* remove gate layer anti-quantisation
* remove unused argument
---------
Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
2024-04-02 11:33:29 -07:00
Awni Hannun
78c431dc25
cleanup whisper a little ( #639 )
2024-03-30 13:13:58 -07:00
Chime Ogbuji
f6283ef7ce
Configurable LR schedulers ( #604 )
...
* Initial config handler and test
* Added means to run from CLI
* Update lora config loading and tests
* Constrain scheduler config (warmup and minimum LR) for each kind
* Update reference to moved schedule_config module
* Minor fix
* Fix typos
* Moved build_schedule and tests
* nits in schedule config
* flake
* fix path
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-29 13:41:10 -07:00
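For context on the schedules that #604 makes configurable: a warmup phase ramps the learning rate up before a decay toward a minimum value. Below is a toy, self-contained sketch of a warmup-plus-cosine schedule with a minimum-LR floor; the function name and signature are illustrative and not the mlx_lm API.
```python
import math

def warmup_cosine(step: int, base_lr: float, warmup_steps: int,
                  total_steps: int, min_lr: float = 0.0) -> float:
    """Toy schedule: linear warmup, then cosine decay toward min_lr (illustrative only)."""
    if step < warmup_steps:
        # Linear warmup from near zero up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Cosine decay from base_lr down to min_lr over the remaining steps.
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (base_lr - min_lr) * cosine
```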
Awni Hannun
b80adbcc3e
DBRX ( #628 )
...
* dbrx
* format
* format
* comments
* change scores slightly
* remove inadvertent import
2024-03-28 21:03:53 -07:00
Anchen
297a908e3d
fix(mlx-lm): type hints in gguf.py ( #621 )
2024-03-26 07:56:01 -07:00
Anchen
0ab01b4626
fix(mlx-lm): sorted probs in top_p implementation. ( #610 )
...
* fix(mlx-lm): the top_p implementation
* chore: address comment
2024-03-25 15:07:55 -07:00
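The two top_p fixes in this log (#610 above and #602 further down) concern nucleus sampling, where a token is drawn from the smallest set of highest-probability tokens whose cumulative mass exceeds top_p. A minimal numpy sketch of that idea, for illustration only and not the mlx_lm sampler:
```python
import numpy as np

def top_p_sample(logits: np.ndarray, top_p: float = 0.9, temperature: float = 1.0) -> int:
    """Illustrative nucleus (top-p) sampling over sorted probabilities."""
    scaled = logits / temperature
    probs = np.exp(scaled - np.max(scaled))
    probs /= probs.sum()
    # Sort probabilities in descending order and accumulate their mass.
    order = np.argsort(-probs)
    sorted_probs = probs[order]
    cumulative = np.cumsum(sorted_probs)
    # Keep the smallest prefix whose mass reaches top_p (always at least one token).
    cutoff = int(np.searchsorted(cumulative, top_p)) + 1
    kept = sorted_probs[:cutoff] / sorted_probs[:cutoff].sum()
    return int(np.random.choice(order[:cutoff], p=kept))
```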
Awni Hannun
bbfcc103d7
cast around lora adapters ( #613 )
2024-03-24 19:34:51 -07:00
Awni Hannun
5a52899405
Partially stream de-tokenization ( #609 )
...
* partially stream de-tokenization
* don't break full response
2024-03-23 15:32:33 -07:00
Anchen
494cdf8e96
chore: fix lora for moe model ( #608 )
2024-03-23 07:22:11 -07:00
Awni Hannun
b8a348c1b8
Switch to fast RMS/LN Norm ( #603 )
...
* use nn.RMSNorm, use sdpa, cleanup
* bump mlx versions
* minor update
* use fast layer norm
* version bump
* update requirement for whisper
* update requirement for gguf
2024-03-23 07:13:51 -07:00
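#603 switches models to nn.RMSNorm and the fast layer norm. For reference, RMS normalization divides each feature vector by its root-mean-square and applies a learned scale; a plain numpy restatement of that formula, for illustration only and not the fused mlx kernel:
```python
import numpy as np

def rms_norm(x: np.ndarray, weight: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    # Normalize each feature vector by its root-mean-square, then apply the learned scale.
    # The eps default here is illustrative.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return weight * (x / rms)
```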
Anchen
fbed720d6f
chore(mlx-lm): fix the top_p implementation. ( #602 )
...
* chore(mlx-lm): clean up the top_p implementation
* chore: clean up
* chore: add test
* chore: address comments
* chore: clean up docs string
* chore: clean up test
2024-03-21 12:18:23 -07:00
Anchen
fe96ef342f
feat(mlx-lm): export the GGUF (fp16) format model weights from fuse.py ( #555 )
...
* wip
* wip
* feat: convert mlx model to gguf f16
* chore: convert norm layer to float32 to avoid overflow issue
* chore: add support for mixtral
* chore: clean up
* chore: remove unused import statement
* chore: clean up weight name mapping
* version and readme
* actual version bump
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-21 10:34:11 -07:00
Anchen
8f906c859a
chore(mlx-lm): enable applying the default chat template ( #577 )
...
* chore(mlx-lm): enable applying the default chat template
* Add option to use default chat template
* chore: rename the flag to use default chat template
2024-03-20 21:39:39 -07:00
Ivan Fioravanti
d2a99172a6
Add dropout parameter to lora configuration ( #599 )
...
* Add dropout parameter to lora configuration
A dropout parameter has been added to the LoRA configuration settings in lora_config.yaml, and the LoRALinear class in utils.py has been updated to take this new parameter. Additionally, an AttributeError ('types.SimpleNamespace' object has no attribute 'prompt') caused by `args.prompt` has been removed from lora.py.
* Update lora_config.yaml
Set dropout to 0.0 in the sample config file
* format
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-20 08:44:40 -07:00
Anchen
949f63f309
chore(mlx-lm): fix print_trainable_parameters for quant models ( #581 )
...
* chore(mlx-lm): fix print_trainable_parameters for quant models
* chore: clean up
* refactor: use layer type to check quant bits
* chore: address comment
2024-03-20 08:41:03 -07:00
Matt Wronkiewicz
373dd6f2a2
Set finish_reason in response ( #592 )
2024-03-19 20:21:26 -07:00
Alwin Arrasyid
6c3d4c8ba2
add dequantize option to mlx_lm/convert.py ( #547 )
2024-03-19 19:50:08 -07:00
Chime Ogbuji
6f2fd5daea
Add mlx-lm version information to HF model card ( #596 )
...
* Add mlx-lm version information to HF model card
* Update llms/mlx_lm/utils.py
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* Reverted indentation
* Pre-commit formatting
---------
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
2024-03-19 19:42:03 -07:00
madroid
39d5ca6427
LoRA: report last train info ( #595 )
2024-03-19 17:29:50 -07:00
yzimmermann
4680ef4413
Enable more BERT models ( #580 )
...
* Update convert.py
* Update model.py
* Update test.py
* Update model.py
* Update convert.py
* Add files via upload
* Update convert.py
* format
* nit
* nit
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-19 17:21:33 -07:00
madroid
b0bcd86a40
Support for OpenAI’s fine-tuning dataset format ( #548 )
...
* LoRA: move load_dataset to tuner/datasets.py file
* LoRA: support OpenAI chat format datasets
see https://platform.openai.com/docs/guides/fine-tuning/example-format
* LoRA: support OpenAI completion format datasets
* LoRA: formatting dataset timing to reduce memory footprint
* Refactor dataset item access in PromptCompletionDataset
* Update mlx_lm/LORA.md
* Update mlx_lm/LORA.md
* check for unsupported data format
* add tests, fine-tune doc
* add tests, fine-tune doc
* add jinja2 for chat template
* nits in readme
* nits in readme
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-19 16:45:46 -07:00
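#548 adds loaders for the OpenAI fine-tuning formats linked in the commit body. In the chat format, each line of the training .jsonl file is a JSON object with a `messages` list of role/content pairs; the snippet below writes one such line (the sample content is made up):
```python
import json

example = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What does LoRA fine-tuning do?"},
        {"role": "assistant", "content": "It trains small low-rank adapters instead of the full weights."},
    ]
}
# One JSON object per line in the .jsonl dataset file.
print(json.dumps(example))
```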
Abdul Fatir
e05e502c34
Fix scaling when embeddings are tied ( #591 )
2024-03-18 13:41:07 -07:00
Awni Hannun
e4b19bb9e1
Make attention faster for some models ( #574 )
...
* make attention faster for a couple models
* remove unused generation flags
* add comment on lora
* include text files as well
2024-03-14 21:35:54 -07:00
Angelos Katharopoulos
3f3741d229
Fix requirements and image2image strength/steps mismatch ( #585 )
2024-03-14 12:22:54 -07:00
sweetcard
e2205beb66
Update server.py to add --trust-remote-code to server ( #578 )
...
* Update server.py
Add --trust-remote-code to server
* format code by running pre-commit
---------
Co-authored-by: flymonk <zhou.feng@gsafer.com>
2024-03-14 07:05:19 -07:00
Sugato Ray
2cd793dd69
feat: add update_config functionality ( #531 )
...
* feat: add `update_config` functionality
- sorts the config for better readability
- updates "_name_or_path" key in config with upload_repo
- sets indentation of 4 spaces
- allows adding other key-value pairs via kwargs
- reduces code duplication
- standardizes config-update across mlx-lm
* feat: standardize updating config
Impacts:
- fuse.py
- merge.py
* update formatting
* remove commented out code
* update func: update_config to save_config
- drop kwargs
- rename func as save_config
- incorporate review suggestions
* update func: save_config
- ensure only config-saving functionality
- function does not return config as a dict anymore
- added review suggestions
* fixed formatting
* update formatting instruction in contribution guide
* nits
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-14 06:36:05 -07:00
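A rough sketch of the kind of helper #531 describes (sorted keys, 4-space indentation, no return value); the name matches the commit's save_config, but the body is illustrative and not the actual mlx_lm code:
```python
import json
from pathlib import Path

def save_config(config: dict, config_path: str) -> None:
    """Illustrative sketch of a config-saving helper, not the mlx_lm implementation."""
    # Sort keys for readability and write with 4-space indentation, as described above.
    ordered = {key: config[key] for key in sorted(config)}
    with open(Path(config_path), "w") as f:
        json.dump(ordered, f, indent=4)
```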
madroid
485180ae91
LoRA: some minor optimizations ( #573 )
...
* init training_args in training scope
* Add trainable parameters percentage
2024-03-13 20:26:30 -07:00
madroid
d4e1de1d5b
add peak_memory info to training callback ( #572 )
2024-03-13 20:17:10 -07:00
Race
376bb9cc44
bert encoder inherits from nn.Module now ( #571 )
2024-03-13 10:24:21 -07:00
Awni Hannun
14fe868825
version ( #570 )
2024-03-13 10:09:36 -07:00
Prince Canuma
76c3244cc5
Add support for Cohere's Command-R ( #565 )
...
* initial commit for command-R
* update mlp, layernorm, lm_head and model args
* add custom layernorm
* add default to tie_word_embeddings
* add layernorm weight type and refactor
* update layernorm (bias conditional) in model/layers
* fix layer norm use traditional rope
* add test
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-13 07:03:36 -07:00
Anchen
3535408c99
chore(mlx-lm): fix tie_word_embeddings for qwen2 ( #566 )
...
* chore: fix tie_word_embeddings for qwen2
* chore: default tie_word_embeddings to True
2024-03-12 21:34:32 -07:00
Awni Hannun
39084e81c2
Some improvements to LoRA ( #528 )
...
* set cache_limit
* remove set cache_limit
* cleanup
* add gradient checkpointing
* fix sort
* monkey patch call for checkpoint
* fix example config
2024-03-12 20:02:03 -07:00
Chime Ogbuji
e56d9015ef
LoRA on all linear transformer block layers ( #546 )
...
* Add --lora-all-linear option to apply LoRA to all linear transformer block layers
* Moved to YAML config and added specification of rank & alpha
* nits in config, more tests
* nit
* run tests for prs
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-03-12 07:37:40 -07:00
devonthomas35
fe5edee360
Fix image2image for SDXL ( #563 )
...
---------
Co-authored-by: Angelos Katharopoulos <katharas@gmail.com>
2024-03-11 12:18:47 -07:00
zweifisch
d0fa6cfcae
feat: stable-diffusion t2i add --seed ( #558 )
2024-03-10 06:12:54 -07:00
Awni Hannun
ad3cf5ed98
dropout 0 as default ( #549 )
2024-03-08 13:07:10 -08:00
Angelos Katharopoulos
3a9e6c3f70
Stable diffusion XL ( #516 )
2024-03-08 10:24:19 -08:00