Commit Graph

435 Commits

Author SHA1 Message Date
Goekdeniz-Guelmez
9705ed908e fix wrong generation in train 2025-02-22 17:21:08 +01:00
Goekdeniz-Guelmez
d9c4c6e60c clean up and re-adding temperature argument 2025-02-22 02:34:56 +01:00
Goekdeniz-Guelmez
d653371e3d nits 2025-02-22 02:12:02 +01:00
Goekdeniz-Guelmez
235348c211 generation speed improvement in training too from 3 t/s to 15 t/s 2025-02-22 02:03:01 +01:00
Goekdeniz-Guelmez
79de353530 nits 2025-02-22 01:05:58 +01:00
Goekdeniz-Guelmez
c51b0a2715 fix 2025-02-22 00:21:47 +01:00
Goekdeniz-Guelmez
710bc1490e training mode working too; went from 2 toks/sec to 30 toks/sec with raw 1.5B model 2025-02-21 22:42:15 +01:00
Goekdeniz-Guelmez
6086137131 Huge speed improvement in validation mode. 2025-02-21 22:08:49 +01:00
Goekdeniz-Guelmez
2f20107d9b slightly faster generation + prints out an example generation in validation mode, more optimization in training function 2025-02-21 16:02:27 +01:00
Goekdeniz-Guelmez
541f0be937 fix generation cutoff in evaluation 2025-02-17 14:39:38 +01:00
Gökdeniz Gülmez
1eea135a20 Merge branch 'ml-explore:main' into adding-GRPO-training 2025-02-17 14:25:03 +01:00
Goekdeniz-Guelmez
6a6bd53e43 removing print and switching some variables in the math 2025-02-15 15:38:51 +01:00
Goekdeniz-Guelmez
5ec4790656 removing comments + adding temperature + reward weighting 2025-02-15 15:29:22 +01:00
Goekdeniz-Guelmez
baeb9f117f redundancy fix + nits 2025-02-14 09:09:59 +01:00
Matthias Neumayer
96bf37008e Update README.md to include how to set temperature (#1280)
* Update README.md to include how to set temperature

* nits

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2025-02-13 19:32:56 -08:00
Awni Hannun
7b07b14e67 add logits processor to spec gen (#1260) 2025-02-13 19:19:53 -08:00
Goekdeniz-Guelmez
65a49dda0e nits 2025-02-13 21:46:30 +01:00
Goekdeniz-Guelmez
8179b99436 quick prompting fix 2025-02-12 19:24:35 +01:00
Goekdeniz-Guelmez
a7273f6a56 small fix 2025-02-12 18:30:12 +01:00
Gökdeniz Gülmez
3823154014 Merge branch 'ml-explore:main' into adding-GRPO-training 2025-02-12 11:10:10 +01:00
Goekdeniz-Guelmez
e33d9d509b updates 2025-02-12 11:07:53 +01:00
Goekdeniz-Guelmez
c42e858d7e Merge branch 'adding-GRPO-training' of https://github.com/Goekdeniz-Guelmez/mlx-examples into adding-GRPO-training 2025-02-12 08:57:33 +01:00
Goekdeniz-Guelmez
5aeefc8c47 update new iterate batches function + nits 2025-02-12 08:57:26 +01:00
Awni Hannun
ec30dc3538 hunyuan finetune (#1270) 2025-02-11 16:49:35 -08:00
Awni Hannun
42413c5d85 fix lora timings after validation (#1278) 2025-02-11 16:48:55 -08:00
Awni Hannun
f8cbf159e0 fix sharding for more even number of layers (#1276) 2025-02-11 16:26:59 -08:00
Awni Hannun
e879ea70e1 fix generation evaluations (#1277) 2025-02-11 16:10:30 -08:00
Matt Clayton
3d677f0870 Add "from_draft" to GenerationResponse (#1272)
* Add from_draft field in GenerationResponse

* Cleanup

* Re-work for minimal changes, add test

* Fix comment
2025-02-11 15:41:02 -08:00
Goekdeniz-Guelmez
978deab589 small fix 2025-02-11 17:48:42 +01:00
Goekdeniz-Guelmez
35ecc17042 fix 2025-02-11 17:07:08 +01:00
Goekdeniz-Guelmez
e80bf95182 fix 2025-02-11 09:26:43 +01:00
Goekdeniz-Guelmez
e96afe9e9f updates 2025-02-11 09:09:28 +01:00
Goekdeniz-Guelmez
88ca747e9e nits 2025-02-10 19:46:19 +01:00
Goekdeniz-Guelmez
b7bc811507 nits 2025-02-10 19:45:19 +01:00
Goekdeniz-Guelmez
e5aa2c3b5d nits 2025-02-10 17:51:14 +01:00
Goekdeniz-Guelmez
f88e897019 removing helper functions 2025-02-10 16:07:28 +01:00
Goekdeniz-Guelmez
d9da35f458 nits 2025-02-10 10:52:32 +01:00
Gökdeniz Gülmez
0dac286539 Merge branch 'main' into adding-GRPO-training 2025-02-10 10:43:22 +01:00
Chime Ogbuji
5865899c81 Completion only fine-tuning of instruction models with collections of HF datasets (#1103)
- Optional completion only fine-tuning with `--mask-prompt`
- Collections of Hugging Face datasets

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2025-02-09 20:12:34 -08:00
Sri Harsha Pamu
1ced1b00ca rm temp argument (#1267) 2025-02-09 11:39:11 -08:00
Goekdeniz-Guelmez
00712522ba rebase loss calculation 2025-02-09 17:13:05 +01:00
Goekdeniz-Guelmez
a527cdb39b fix: prevent gradients from flowing through the reference model's logits 2025-02-09 17:02:58 +01:00
Goekdeniz-Guelmez
54179901b5 fix 2025-02-09 15:41:47 +01:00
Goekdeniz-Guelmez
39e9469059 freeze ref model 2025-02-09 15:30:51 +01:00
Goekdeniz-Guelmez
9ba6146a76 fix 2025-02-09 14:32:50 +01:00
Awni Hannun
1503bd4f55 support hunyuan 7b (#1263) 2025-02-08 15:46:47 -08:00
Awni Hannun
31611b62d7 Add IBM granite model (#1265)
* add granite

* add thinking option
2025-02-08 15:46:15 -08:00
Awni Hannun
6120a5f376 Faster DSv2/3 expert score computation (#1257)
* fix deepseek sharding (#1242)

* compile and use put along axis in deep seek routing function
2025-02-07 10:24:57 -08:00
Awni Hannun
52c41b5b5a Fix prompt cache for models without chat template (#1250)
* fix deepseek sharding (#1242)

* fix prompt cache with no chat template
2025-02-06 11:10:58 -08:00
Gökdeniz Gülmez
94dcd0f63e Merge branch 'ml-explore:main' into adding-GRPO-training 2025-02-06 08:15:58 +01:00