Commit Graph

390 Commits

Author SHA1 Message Date
Goekdeniz-Guelmez
a527cdb39b fix: prevent gradients from flowing through the reference model's logits 2025-02-09 17:02:58 +01:00
Goekdeniz-Guelmez
54179901b5 fix 2025-02-09 15:41:47 +01:00
Goekdeniz-Guelmez
39e9469059 freeze ref model 2025-02-09 15:30:51 +01:00
Goekdeniz-Guelmez
9ba6146a76 fix 2025-02-09 14:32:50 +01:00
Gökdeniz Gülmez
94dcd0f63e
Merge branch 'ml-explore:main' into adding-GRPO-training 2025-02-06 08:15:58 +01:00
Goekdeniz-Guelmez
bcfa55d882 updates 2025-02-05 15:02:12 +01:00
Goekdeniz-Guelmez
0a19522ec4 updates 2025-02-05 14:38:09 +01:00
Goekdeniz-Guelmez
35a2d99cf9 small fix 2025-02-05 11:30:21 +01:00
Goekdeniz-Guelmez
a33cad84b4 updates 2025-02-05 09:48:00 +01:00
Goekdeniz-Guelmez
d84ad0cf86 fix testing 2025-02-05 08:53:30 +01:00
Goekdeniz-Guelmez
2a8e6f6e44 update 2025-02-05 08:47:03 +01:00
Goekdeniz-Guelmez
0a09a93454 fix cache handling 2025-02-05 08:44:06 +01:00
Pedro Cuenca
e2e5478da5
READMEs: fix typo in link, minor update. (#1246) 2025-02-04 11:52:32 -08:00
Goekdeniz-Guelmez
7b0141455e better create_dataset 2025-02-04 10:43:00 +01:00
Goekdeniz-Guelmez
bd1a42ec2f adding args into dataset handling 2025-02-04 10:22:34 +01:00
Goekdeniz-Guelmez
7173840283 first successful training run 2025-02-04 09:18:45 +01:00
Awni Hannun
21d0ab6e8a
fix deepseek sharding (#1242) 2025-02-03 16:59:50 -08:00
Gökdeniz Gülmez
0989c073b0
Optimizations for mamba1 (#1213)
* added mx.einsum() operations: before: 41.293 tokens-per-sec, after: 57.822 tokens-per-sec

* Fused operations in delta, B, C = ...: before: 57.822 tokens-per-sec, after: 83.890 tokens-per-sec

* Pre-computing A_log: before: 83.890 tokens-per-sec, after: 85.848 tokens-per-sec

* Update MambaBlock: batched input processing, improved cache handling, pre-computed constants, cleaner state management, explicit return values. Before: 82.442 tokens-per-sec, after: 129.130 tokens-per-sec.

* cleaning up and adding apple copyright to helium modelfile

* update Copyright to this year

* nits + even faster

---------

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
2025-02-03 13:36:08 -08:00
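The fusion described in the bullets above (computing delta, B, and C from one projection instead of three) can be sketched as follows. The actual change lives in the MLX mamba1 model; this is a minimal NumPy illustration of the same einsum-plus-split pattern, with made-up dimensions and weight names:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical; the real sizes come from the mamba1 config).
batch, seq, d_inner = 2, 5, 8
dt_rank, d_state = 3, 4

x = rng.standard_normal((batch, seq, d_inner))

# Unfused: three separate projections for delta, B, and C.
W_delta = rng.standard_normal((d_inner, dt_rank))
W_B = rng.standard_normal((d_inner, d_state))
W_C = rng.standard_normal((d_inner, d_state))
delta_u = x @ W_delta
B_u = x @ W_B
C_u = x @ W_C

# Fused: concatenate the weights once, run a single einsum, then split --
# one big matmul instead of three small ones.
W_fused = np.concatenate([W_delta, W_B, W_C], axis=1)
fused = np.einsum("bld,dk->blk", x, W_fused)
delta_f, B_f, C_f = np.split(fused, [dt_rank, dt_rank + d_state], axis=-1)

# The fused path reproduces the unfused results exactly.
assert np.allclose(delta_u, delta_f)
assert np.allclose(B_u, B_f) and np.allclose(C_u, C_f)
```

The speedup in the PR comes from replacing several small kernel launches with one larger contraction; the math is unchanged, as the assertions confirm.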
Goekdeniz-Guelmez
ca32424043 updates 2025-02-03 21:57:26 +01:00
Goekdeniz-Guelmez
54e295ea80 fix name funcs 2025-02-03 19:56:11 +01:00
Goekdeniz-Guelmez
06f9c29c94 print func name 2025-02-03 19:47:40 +01:00
Goekdeniz-Guelmez
40bca770ae fixes 2025-02-03 19:43:49 +01:00
Goekdeniz-Guelmez
05d921b788 optims 2025-02-03 19:37:05 +01:00
Awni Hannun
d9924d08d1
Fix no validation in lora (#1241) 2025-02-03 09:55:24 -08:00
Goekdeniz-Guelmez
1d9e4802f0 first working prototype, will try training out at home 2025-02-03 12:05:29 +01:00
Goekdeniz-Guelmez
23d75cd7ad starting first training test run 2025-02-03 10:08:28 +01:00
Goekdeniz-Guelmez
41ff5364d7 Merge branch 'adding-GRPO-training' of https://github.com/Goekdeniz-Guelmez/mlx-examples into adding-GRPO-training 2025-02-03 09:19:00 +01:00
Goekdeniz-Guelmez
a3ed632422 dataset wrapper done 2025-02-03 09:13:17 +01:00
Gökdeniz Gülmez
734d6f4a69
Merge branch 'ml-explore:main' into adding-GRPO-training 2025-02-03 09:07:20 +01:00
Goekdeniz-Guelmez
d034ca369e adding function for R1 2025-02-03 08:26:42 +01:00
Awni Hannun
9c2ef38d4d
only download local shard (#1240) 2025-02-02 13:58:44 -08:00
Goekdeniz-Guelmez
243c9621d9 update lora.py 2025-01-31 21:10:44 +01:00
Goekdeniz-Guelmez
a57d553fc1 update 2025-01-31 16:57:43 +01:00
Goekdeniz-Guelmez
80bcf68956 grpo_trainer should be done 2025-01-31 16:54:18 +01:00
Goekdeniz-Guelmez
6c58aa995c updates 2025-01-31 16:27:31 +01:00
Goekdeniz-Guelmez
93370ff1c3 updates and fixes for the KL div lines 2025-01-30 23:55:40 +01:00
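The KL-divergence lines this commit touches are not shown here, but GRPO-style trainers commonly penalize drift from the reference policy with the unbiased per-token estimator exp(r) - r - 1, where r = log p_ref - log p_cur (the "k3" estimator used in the GRPO/DeepSeekMath formulation). A minimal NumPy sketch, assuming that estimator; the function name is illustrative, not this repo's actual code:

```python
import numpy as np

def kl_estimate(cur_logps: np.ndarray, ref_logps: np.ndarray) -> np.ndarray:
    """Per-token k3 estimate of KL(cur || ref): exp(r) - r - 1 with
    r = ref_logps - cur_logps. Non-negative, and zero when the two agree."""
    r = ref_logps - cur_logps
    return np.exp(r) - r - 1.0

# Token-level log-probs under the current and reference policies.
cur = np.log(np.array([0.5, 0.3, 0.2]))
ref = np.log(np.array([0.4, 0.4, 0.2]))

kl = kl_estimate(cur, ref)
assert np.all(kl >= 0.0)                   # the estimator is never negative
assert np.allclose(kl_estimate(cur, cur), 0.0)  # identical policies give zero
```

Note this pairs with the later "freeze ref model" and "prevent gradients from flowing through the reference model's logits" commits: ref_logps must be treated as constants so the penalty only pushes the current policy.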
Gökdeniz Gülmez
b1e573d6e8
Merge branch 'ml-explore:main' into adding-GRPO-training 2025-01-29 15:07:52 +01:00
Goekdeniz-Guelmez
5e0ae83487 initial commit, gn 2025-01-29 00:19:07 +01:00
Awni Hannun
e8afb59de4
better overflow correction (#1229) 2025-01-28 14:37:30 -08:00
Anchen
7a83077cd7
chore(mlx-lm): support text type content in messages (#1225)
* chore(mlx-lm): support text type content

* chore: optimize the message content processing

* nits + format

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2025-01-27 17:13:50 -08:00
Awni Hannun
f44a52e2dc
batched min p and fix spec gen sampling (#1222) 2025-01-27 15:40:31 -08:00
Gökdeniz Gülmez
77faa14ba4
adding support for kyutai's helium (#1208)
* initial commit

* adding helium into training

* Update ACKNOWLEDGMENTS.md

* nits

* nits

* fixes / nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2025-01-26 07:19:07 -08:00
Awni Hannun
9a3ddc3e65
some fixes for pipeline parallel deep seek r1 (#1216) 2025-01-21 19:40:29 -08:00
Victor Nogueira
df1406735b
Fix dataset variable name, in datasets.py (#1212) 2025-01-21 14:12:43 -08:00
Jarrett
07f88f8057
fix(lora): add back store_true default args (#1205) 2025-01-16 11:15:42 -08:00
Awni Hannun
50f0a7f6d9
add internlm3 (#1206) 2025-01-15 14:55:41 -08:00
Ivan Fioravanti
6ae6c72c2e
reduction moved to CPU in case of distributed training (#1200) 2025-01-14 17:20:42 -08:00
Awni Hannun
c117af83b8
fix gpt bigcode (#1204) 2025-01-13 10:22:32 -08:00
Chime Ogbuji
0228c46434
Custom local dataset features (#1085)
* Generalize prompt_feature and completion_feature for use in local datasets to facilitate compatibility with many other training dataset formats.

* Persist configured prompt/completion key

* rebase + nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2025-01-13 10:01:18 -08:00
Prince Canuma
bf2da36fc6
Fix Cohere2: mask shape error (long context) (#1202)
* fix mask shape error (long context)

* Update llms/mlx_lm/models/cohere2.py

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

* revert layer_idx

* black formatting

* Update cohere2.py

* format

---------

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
Co-authored-by: Awni Hannun <awni@apple.com>
2025-01-12 12:58:08 -08:00
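For context on the kind of bug fixed in #1202: with sliding-window attention and a KV cache, the attention mask must be shaped (q_len, kv_len) rather than (q_len, q_len), or generation breaks once the context outgrows a single forward pass. A hedged NumPy sketch of such a mask; the function name and exact window semantics are illustrative assumptions, not Cohere2's actual implementation:

```python
import numpy as np

def sliding_window_mask(q_len: int, kv_len: int, window: int) -> np.ndarray:
    """Boolean mask of shape (q_len, kv_len): True where attention is allowed.
    Query i sits at absolute position kv_len - q_len + i and may attend
    causally to the last `window` key positions up to and including itself."""
    q_pos = np.arange(kv_len - q_len, kv_len)[:, None]  # absolute query positions
    k_pos = np.arange(kv_len)[None, :]                  # absolute key positions
    causal = k_pos <= q_pos                             # no attending to the future
    in_window = k_pos > q_pos - window                  # stay inside the window
    return causal & in_window

# Two new queries against six cached keys, window of four.
m = sliding_window_mask(q_len=2, kv_len=6, window=4)
assert m.shape == (2, 6)  # keyed to the cache length -- the long-context pitfall
```

The key point is that the second dimension tracks the full key/value length from the cache, which is what a mask built only from the incoming query length gets wrong.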