Gökdeniz Gülmez
dd29e74b89
Merge branch 'ml-explore:main' into adding-support-for-mamba2
2025-01-22 14:19:06 +01:00
Awni Hannun
9a3ddc3e65
some fixes for pipeline parallel deep seek r1 ( #1216 )
2025-01-21 19:40:29 -08:00
Goekdeniz-Guelmez
a4b716e65d
small optimization
2025-01-22 00:15:02 +01:00
Victor Nogueira
df1406735b
Fix dataset variable name, in datasets.py ( #1212 )
2025-01-21 14:12:43 -08:00
Goekdeniz-Guelmez
12e9f34524
removing unnecessary lines and cleaning up
2025-01-21 23:06:40 +01:00
Goekdeniz-Guelmez
c13de475f6
removing custom RMSNorm class
2025-01-21 22:52:45 +01:00
Goekdeniz-Guelmez
a6a92cb91f
codestral inference actually works now
2025-01-21 21:01:39 +01:00
Goekdeniz-Guelmez
5a6ada2df0
getting really close:
...
python -m mlx_lm.generate --model /Users/gokdenizgulmez/Desktop/Mamba-Codestral-7B-v0.1-4bit --prompt "# A function that computes fibonacci
def fibonacci(" -m 64
==========
n):
print(f"{os.path.abspath(".")/data/data/data/com.android.launcher.png)
## 🙌🏼 🙌 🙌 🙌 🙌 🙌 🙌
class _State(Enum):
def __init__ (self
==========
Prompt: 16 tokens, 84.547 tokens-per-sec
Generation: 64 tokens, 13.774 tokens-per-sec
Peak memory: 4.139 GB
2025-01-21 20:44:51 +01:00
Goekdeniz-Guelmez
eb432f4b7d
inference with the original mamba2 model works but still not with codestral. working:
...
rokyang/mamba2-130m-hf
rokyang/mamba2-370m-hf
rokyang/mamba2-780m-hf
rokyang/mamba2-1.3b-hf
rokyang/mamba2-2.7b-hf
2025-01-21 19:38:07 +01:00
Gökdeniz Gülmez
be4bc7a090
Merge branch 'ml-explore:main' into adding-support-for-mamba2
2025-01-21 10:57:21 +01:00
Goekdeniz-Guelmez
e96c17d061
inference works
2025-01-20 19:50:08 +01:00
Goekdeniz-Guelmez
db514f24c8
update
2025-01-20 19:44:05 +01:00
Goekdeniz-Guelmez
531ac96481
fixing cache
2025-01-20 18:26:21 +01:00
Jarrett
07f88f8057
fix(lora): add back store_true default args ( #1205 )
2025-01-16 11:15:42 -08:00
Awni Hannun
50f0a7f6d9
add internlm3 ( #1206 )
2025-01-15 14:55:41 -08:00
Ivan Fioravanti
6ae6c72c2e
reduction moved to CPU in case of distributed training ( #1200 )
2025-01-14 17:20:42 -08:00
Goekdeniz-Guelmez
dd4957f3da
adding correct initialisation of dt, A and D
2025-01-13 21:28:43 +01:00
Gökdeniz Gülmez
5509ef8e52
Merge branch 'ml-explore:main' into adding-support-for-mamba2
2025-01-13 20:16:04 +01:00
Awni Hannun
c117af83b8
fix gpt bigcode ( #1204 )
2025-01-13 10:22:32 -08:00
Chime Ogbuji
0228c46434
Custom local dataset features ( #1085 )
...
* Generalize prompt_feature and completion_feature for use in local datasets to facilitate compatibility with many other training dataset formats.
* Persist configured prompt/completion key
* rebase + nits
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2025-01-13 10:01:18 -08:00
Prince Canuma
bf2da36fc6
Fix Cohere2: mask shape error (long context) ( #1202 )
...
* fix mask shape error (long context)
* Update llms/mlx_lm/models/cohere2.py
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
* revert layer_idx
* black formatting
* Update cohere2.py
* format
---------
Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
Co-authored-by: Awni Hannun <awni@apple.com>
2025-01-12 12:58:08 -08:00
Xingjun.Wang
514502da22
Support snapshot_download for ModelScope ( #1194 )
...
* add MLX_USE_MODELSCOPE env
* update
* update snapshot_download
* update
* remove modelscope dependency and add import check
* update
* nits
* fix
---------
Co-authored-by: wangxingjun778 <jason@U-C7X6TX5G-2239.local>
Co-authored-by: Awni Hannun <awni@apple.com>
2025-01-10 15:29:34 -08:00
Awni Hannun
93c5cfd781
Add a speculative decoding generator ( #1155 )
...
* add a speculative decoding generator
* fix
* fixes
* optional kwarg pop
2025-01-10 15:27:08 -08:00
Awni Hannun
5cae0a60e6
deepseek v3 model with pipeline parallelism ( #1191 )
...
* deepseekv3
* use upload_large_file instead of deprecated multi commit
* add pipeline generation and example
* comment
* get fp16 working
* use mlx==0.22
2025-01-09 15:55:53 -08:00
Jarrett
40b88eff48
fix(lora): config yaml & arg default merge bug ( #1196 )
2025-01-09 11:33:54 -08:00
Pedro Cuenca
b8f0cacfa8
Use upload_large_folder ( #1193 )
2025-01-07 09:18:31 -08:00
Awni Hannun
9183fe8b6d
fix ( #1192 )
2025-01-06 10:12:07 -08:00
Chime Ogbuji
f2619f507c
Add support for fewshot and apply chat template lm_eval functionality ( #1180 )
...
* Add support for multiturn fewshot examples and chat templates
Added two new arguments to the evaluation script: `--fewshot-as-multiturn` and `--apply-chat-template` which correspond to lm_eval options of similar names and are very often used to ensure apples-to-apples comparisons of lm_evaluation results
* Add HF overrides for methods needed by added options
* don't add duplicate bos
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2025-01-06 07:58:43 -08:00
Angelos Katharopoulos
25ec2d8c44
Change the eos-token argument for mlx_lm.generate ( #1176 )
2025-01-05 22:26:05 -08:00
Awni Hannun
c4833a2f55
fix encoding with special tokens + chat template ( #1189 )
2025-01-03 10:50:59 -08:00
Ivan Fioravanti
3a58c36109
Improvements to mlx_lm.manage ( #1178 )
...
* improvements to manage. Default value is N and size added to deletion confirmation.
* Fixing case for no case
* nits
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2025-01-01 07:25:57 -08:00
Goekdeniz-Guelmez
8deada9d11
optimizations
2024-12-27 17:52:14 +01:00
Goekdeniz-Guelmez
4e94e87f57
nits
2024-12-27 15:41:54 +01:00
Goekdeniz-Guelmez
3384d38a83
nits
2024-12-27 15:37:41 +01:00
Goekdeniz-Guelmez
2ed51946ab
still gibberish
2024-12-27 15:36:37 +01:00
Goekdeniz-Guelmez
f4cbe27b0f
new set but still gibberish
2024-12-27 15:27:09 +01:00
Goekdeniz-Guelmez
d044db959d
update
2024-12-27 15:17:45 +01:00
Alex Barron
d4ef909d4a
Length masking for batch inputs ( #1173 )
...
* length masking
* add mask to mlx_lm model interface
* remove lengths
* fix test:
* comment + fix
2024-12-18 19:43:52 -08:00
Awni Hannun
db109184b7
Fix no template prompt + top_k sampling ( #1166 )
...
* fix no template prompt
* add top_k sampling
* fix chinese
2024-12-18 18:46:50 -08:00
Goekdeniz-Guelmez
0ae536c423
update: using einsum on some lines, making it faster, but still generates gibberish on Codestral
2024-12-18 19:32:22 +01:00
Gökdeniz Gülmez
7996a6f4fd
Merge branch 'ml-explore:main' into adding-support-for-mamba2
2024-12-18 18:35:43 +01:00
Billel Mokeddem
845efddc8c
Fix decoding manually added tokens ( #1164 )
...
* Fix decoding manually added tokens
* fix + test
* nit
* nit
* no lag bpe
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2024-12-17 09:54:29 -08:00
Gökdeniz Gülmez
68533e2a8f
Merge branch 'ml-explore:main' into adding-support-for-mamba2
2024-12-17 11:14:40 +01:00
Prince Canuma
dfa4dd6c93
Add support for cohere2 ( #1157 )
...
* add support for cohere2
* revert to act_fn to silu
* fix tests and sliding window attention
* add tests
* add to tuner
* fix sliding window
* add coauthor :)
Co-authored-by: n8programs <43304488+N8python@users.noreply.github.com>
* Add rotating kvcache to save space
* some nits
* style
* nits
---------
Co-authored-by: n8programs <43304488+N8python@users.noreply.github.com>
Co-authored-by: N8 <n8@n8programs.com>
Co-authored-by: Awni Hannun <awni@apple.com>
2024-12-16 08:01:03 -08:00
Ikko Eltociear Ashimine
fc0674d2d8
chore: update evaluate.py ( #1159 )
...
occurence -> occurrence
2024-12-15 06:06:29 -08:00
Goekdeniz-Guelmez
dff4e52910
adding the model names in the LORA.md file and removing unused functions from mamba2.py
2024-12-12 22:52:00 +01:00
Awni Hannun
9f2ea5892e
Bpe stream without space ( #1154 )
...
* bpe streaming detokenization without space
* version bump
2024-12-12 13:13:50 -08:00
Goekdeniz-Guelmez
a883e39f41
optimizing the code for faster inference but still generates gibberish
2024-12-12 21:08:33 +01:00
Awni Hannun
2ba0e36683
[mlx-lm] Use top p in server ( #1144 )
...
* use top p in server
* couple other fixes
2024-12-12 11:12:21 -08:00
Angelos Katharopoulos
19abf3dcaa
Replace unicode errors instead of raising exception ( #1146 )
2024-12-12 11:10:41 -08:00