Commit Graph

162 Commits

Author SHA1 Message Date
Goekdeniz-Guelmez
06ff47012f match PyTorch implementation for loss calculation 2025-03-11 09:00:21 +01:00
Goekdeniz-Guelmez
f1961f1b79 fix batch size 2025-03-09 00:26:41 +01:00
Goekdeniz-Guelmez
e88f0fad4b clean up 2025-03-09 00:18:33 +01:00
Goekdeniz-Guelmez
0bc2a881ad generation should be fixed now 2025-03-09 00:16:40 +01:00
Gökdeniz Gülmez
46d6146102 Merge branch 'ml-explore:main' into adding-GRPO-training 2025-03-08 22:41:10 +01:00
Gökdeniz Gülmez
56d2db23e1 adding OLMoE architecture (#1321) 2025-03-05 13:46:06 -08:00
* initial commit
* update ACKNOWLEDGMENTS.md
* adding olmoe to training
* clean up
* faster generation
* remove sanitize method
* more clean ups
* adding SwitchGLU
* clean up
* a little faster and adding norm_topk_prob
* formatted
Goekdeniz-Guelmez
f13a0d04ca separate functions 2025-03-05 15:28:12 +01:00
Goekdeniz-Guelmez
d723ddfeda updates 2025-03-05 14:49:56 +01:00
Goekdeniz-Guelmez
9a36452519 updates 2025-03-05 14:42:34 +01:00
Goekdeniz-Guelmez
326935be49 updates 2025-03-05 14:40:23 +01:00
Goekdeniz-Guelmez
2d2f39f96e updates 2025-03-05 14:25:55 +01:00
Goekdeniz-Guelmez
1f89453295 eos token return fix 2025-03-05 14:00:51 +01:00
Goekdeniz-Guelmez
2bde97fe13 minor speed improvement 2025-03-05 13:55:24 +01:00
Goekdeniz-Guelmez
3dfb21267b updates 2025-03-05 12:59:41 +01:00
Goekdeniz-Guelmez
132225a018 updates 2025-03-01 22:23:33 +01:00
Goekdeniz-Guelmez
925e11439b updates 2025-02-28 22:07:24 +01:00
Goekdeniz-Guelmez
15d53279ae batching fix 2025-02-28 16:02:40 +01:00
Goekdeniz-Guelmez
fab2dc2688 small fix 2025-02-26 15:21:57 +01:00
Goekdeniz-Guelmez
53185c7f3d last update, gn 2025-02-24 22:20:07 +01:00
Goekdeniz-Guelmez
e4eac9c97b adding custom system message integration in dataset, more optimizations (generation is now faster with the same RAM usage), fix for the identical generations, separated the reward functions into a separate file. 2025-02-24 20:49:22 +01:00
Gökdeniz Gülmez
bd5f081ca5 Merge branch 'ml-explore:main' into adding-GRPO-training 2025-02-22 19:11:26 +01:00
Goekdeniz-Guelmez
9705ed908e fix wrong generation in train 2025-02-22 17:21:08 +01:00
Goekdeniz-Guelmez
d9c4c6e60c clean up and readding temperature argument 2025-02-22 02:34:56 +01:00
Goekdeniz-Guelmez
d653371e3d nits 2025-02-22 02:12:02 +01:00
Goekdeniz-Guelmez
235348c211 generation speed improvement in training too from 3 t/s to 15 t/s 2025-02-22 02:03:01 +01:00
Goekdeniz-Guelmez
79de353530 nits 2025-02-22 01:05:58 +01:00
Goekdeniz-Guelmez
c51b0a2715 fix 2025-02-22 00:21:47 +01:00
Goekdeniz-Guelmez
710bc1490e training mode working too; went from 2 toks/sec to 30 toks/sec with raw 1.5B model 2025-02-21 22:42:15 +01:00
Goekdeniz-Guelmez
6086137131 Huge speed improvement in validation mode. 2025-02-21 22:08:49 +01:00
Goekdeniz-Guelmez
2f20107d9b slightly faster generation + prints out an example generation in validation mode, more optimization in training function 2025-02-21 16:02:27 +01:00
Awni Hannun
85669451d0 Fix num layers in fine tune (#1294) 2025-02-20 13:32:01 -08:00
Goekdeniz-Guelmez
541f0be937 fix generation cutoff in evaluation 2025-02-17 14:39:38 +01:00
Goekdeniz-Guelmez
6a6bd53e43 removing print and switching some variables in the math 2025-02-15 15:38:51 +01:00
Goekdeniz-Guelmez
5ec4790656 removing comments + adding temperature + reward weighting 2025-02-15 15:29:22 +01:00
Goekdeniz-Guelmez
baeb9f117f redundancy fix + nits 2025-02-14 09:09:59 +01:00
Goekdeniz-Guelmez
65a49dda0e nits 2025-02-13 21:46:30 +01:00
Goekdeniz-Guelmez
8179b99436 quick prompting fix 2025-02-12 19:24:35 +01:00
Goekdeniz-Guelmez
a7273f6a56 small fix 2025-02-12 18:30:12 +01:00
Gökdeniz Gülmez
3823154014 Merge branch 'ml-explore:main' into adding-GRPO-training 2025-02-12 11:10:10 +01:00
Goekdeniz-Guelmez
e33d9d509b updates 2025-02-12 11:07:53 +01:00
Goekdeniz-Guelmez
c42e858d7e Merge branch 'adding-GRPO-training' of https://github.com/Goekdeniz-Guelmez/mlx-examples into adding-GRPO-training 2025-02-12 08:57:33 +01:00
Goekdeniz-Guelmez
5aeefc8c47 update new iterate batches function + nits 2025-02-12 08:57:26 +01:00
Awni Hannun
ec30dc3538 hunyuan finetune (#1270) 2025-02-11 16:49:35 -08:00
Awni Hannun
42413c5d85 fix lora timings after validation (#1278) 2025-02-11 16:48:55 -08:00
Goekdeniz-Guelmez
978deab589 small fix 2025-02-11 17:48:42 +01:00
Goekdeniz-Guelmez
35ecc17042 fix 2025-02-11 17:07:08 +01:00
Goekdeniz-Guelmez
e80bf95182 fix 2025-02-11 09:26:43 +01:00
Goekdeniz-Guelmez
e96afe9e9f updates 2025-02-11 09:09:28 +01:00
Goekdeniz-Guelmez
88ca747e9e nits 2025-02-10 19:46:19 +01:00
Goekdeniz-Guelmez
b7bc811507 nits 2025-02-10 19:45:19 +01:00