Goekdeniz-Guelmez
6a6bd53e43
removing print and switching some variables in the math
2025-02-15 15:38:51 +01:00
Goekdeniz-Guelmez
5ec4790656
removing comments + adding temperature + reward weighting
2025-02-15 15:29:22 +01:00
Goekdeniz-Guelmez
baeb9f117f
redundancy fix + nits
2025-02-14 09:09:59 +01:00
Goekdeniz-Guelmez
65a49dda0e
nits
2025-02-13 21:46:30 +01:00
Goekdeniz-Guelmez
8179b99436
quick prompting fix
2025-02-12 19:24:35 +01:00
Goekdeniz-Guelmez
a7273f6a56
small fix
2025-02-12 18:30:12 +01:00
Gökdeniz Gülmez
3823154014
Merge branch 'ml-explore:main' into adding-GRPO-training
2025-02-12 11:10:10 +01:00
Goekdeniz-Guelmez
e33d9d509b
updates
2025-02-12 11:07:53 +01:00
Goekdeniz-Guelmez
c42e858d7e
Merge branch 'adding-GRPO-training' of https://github.com/Goekdeniz-Guelmez/mlx-examples into adding-GRPO-training
2025-02-12 08:57:33 +01:00
Goekdeniz-Guelmez
5aeefc8c47
update new iterate batches function + nits
2025-02-12 08:57:26 +01:00
Awni Hannun
ec30dc3538
hunyuan finetune (#1270)
2025-02-11 16:49:35 -08:00
Awni Hannun
42413c5d85
fix lora timings after validation (#1278)
2025-02-11 16:48:55 -08:00
Awni Hannun
f8cbf159e0
fix sharding for more even number of layers (#1276)
2025-02-11 16:26:59 -08:00
Awni Hannun
e879ea70e1
fix generation evaluations (#1277)
2025-02-11 16:10:30 -08:00
Matt Clayton
3d677f0870
Add "from_draft" to GenerationResponse (#1272)
* Add from_draft field in GenerationResponse
* Cleanup
* Re-work for minimal changes, add test
* Fix comment
2025-02-11 15:41:02 -08:00
Goekdeniz-Guelmez
978deab589
small fix
2025-02-11 17:48:42 +01:00
Goekdeniz-Guelmez
35ecc17042
fix
2025-02-11 17:07:08 +01:00
Goekdeniz-Guelmez
e80bf95182
fix
2025-02-11 09:26:43 +01:00
Goekdeniz-Guelmez
e96afe9e9f
updates
2025-02-11 09:09:28 +01:00
Awni Hannun
bded1a8fcd
fix looping in whisper (#1273)
2025-02-10 13:04:35 -08:00
Goekdeniz-Guelmez
88ca747e9e
nits
2025-02-10 19:46:19 +01:00
Goekdeniz-Guelmez
b7bc811507
nits
2025-02-10 19:45:19 +01:00
Goekdeniz-Guelmez
e5aa2c3b5d
nits
2025-02-10 17:51:14 +01:00
Goekdeniz-Guelmez
f88e897019
removing helper functions
2025-02-10 16:07:28 +01:00
Goekdeniz-Guelmez
d9da35f458
nits
2025-02-10 10:52:32 +01:00
Gökdeniz Gülmez
0dac286539
Merge branch 'main' into adding-GRPO-training
2025-02-10 10:43:22 +01:00
Chime Ogbuji
5865899c81
Completion only fine-tuning of instruction models with collections of HF datasets (#1103)
- Optional completion only fine-tuning with `--mask-prompt`
- Collections of Hugging Face datasets
---------
Co-authored-by: Awni Hannun <awni@apple.com>
2025-02-09 20:12:34 -08:00
Sri Harsha Pamu
1ced1b00ca
rm temp argument (#1267)
2025-02-09 11:39:11 -08:00
Goekdeniz-Guelmez
00712522ba
rebase loss calculation
2025-02-09 17:13:05 +01:00
Goekdeniz-Guelmez
a527cdb39b
fix: prevent gradients from flowing through the reference model's logits
2025-02-09 17:02:58 +01:00
Goekdeniz-Guelmez
54179901b5
fix
2025-02-09 15:41:47 +01:00
Goekdeniz-Guelmez
39e9469059
freeze ref model
2025-02-09 15:30:51 +01:00
Goekdeniz-Guelmez
9ba6146a76
fix
2025-02-09 14:32:50 +01:00
Awni Hannun
f58c7de901
Some improvements to speedup alignment computation in MLX Whisper (#1259)
* some improvements to speedup alignment computation in MLX Whisper
* fix alignment
2025-02-08 15:47:00 -08:00
Awni Hannun
1503bd4f55
support hunyuan 7b (#1263)
2025-02-08 15:46:47 -08:00
Awni Hannun
31611b62d7
Add IBM granite model (#1265)
* add granite
* add thinking option
2025-02-08 15:46:15 -08:00
Awni Hannun
6120a5f376
Faster DSv2/3 expert score computation (#1257)
* fix deepseek sharding (#1242)
* compile and use put along axis in deep seek routing function
2025-02-07 10:24:57 -08:00
Awni Hannun
52c41b5b5a
Fix prompt cache for models without chat template (#1250)
* fix deepseek sharding (#1242)
* fix prompt cache with no chat template
2025-02-06 11:10:58 -08:00
Nripesh Niketan
747c08e202
Chore: pre-commit bump (#1253)
2025-02-06 09:06:31 -08:00
Gökdeniz Gülmez
94dcd0f63e
Merge branch 'ml-explore:main' into adding-GRPO-training
2025-02-06 08:15:58 +01:00
Goekdeniz-Guelmez
bcfa55d882
updates
2025-02-05 15:02:12 +01:00
Goekdeniz-Guelmez
0a19522ec4
updates
2025-02-05 14:38:09 +01:00
Goekdeniz-Guelmez
35a2d99cf9
small fix
2025-02-05 11:30:21 +01:00
Goekdeniz-Guelmez
a33cad84b4
updates
2025-02-05 09:48:00 +01:00
Goekdeniz-Guelmez
d84ad0cf86
fix testing
2025-02-05 08:53:30 +01:00
Goekdeniz-Guelmez
2a8e6f6e44
update
2025-02-05 08:47:03 +01:00
Goekdeniz-Guelmez
0a09a93454
fix cache handling
2025-02-05 08:44:06 +01:00
Pedro Cuenca
e2e5478da5
READMEs: fix typo in link, minor update. (#1246)
2025-02-04 11:52:32 -08:00
Goekdeniz-Guelmez
7b0141455e
better create_dataset
2025-02-04 10:43:00 +01:00
Goekdeniz-Guelmez
bd1a42ec2f
adding args into dataset handling
2025-02-04 10:22:34 +01:00