Goekdeniz-Guelmez | a33cad84b4 | updates | 2025-02-05 09:48:00 +01:00
Goekdeniz-Guelmez | 2a8e6f6e44 | update | 2025-02-05 08:47:03 +01:00
Goekdeniz-Guelmez | 0a09a93454 | fix cache handling | 2025-02-05 08:44:06 +01:00
Goekdeniz-Guelmez | 7173840283 | first successful training run | 2025-02-04 09:18:45 +01:00
Goekdeniz-Guelmez | ca32424043 | updates | 2025-02-03 21:57:26 +01:00
Goekdeniz-Guelmez | 54e295ea80 | fix name funcs | 2025-02-03 19:56:11 +01:00
Goekdeniz-Guelmez | 06f9c29c94 | print func name | 2025-02-03 19:47:40 +01:00
Goekdeniz-Guelmez | 40bca770ae | fixes | 2025-02-03 19:43:49 +01:00
Goekdeniz-Guelmez | 05d921b788 | optims | 2025-02-03 19:37:05 +01:00
Goekdeniz-Guelmez | 1d9e4802f0 | first working prototype, will try training out at home | 2025-02-03 12:05:29 +01:00
Goekdeniz-Guelmez | 23d75cd7ad | starting first training test run | 2025-02-03 10:08:28 +01:00
Goekdeniz-Guelmez | a3ed632422 | dataset wrapper done | 2025-02-03 09:13:17 +01:00
Goekdeniz-Guelmez | d034ca369e | adding function for R1 | 2025-02-03 08:26:42 +01:00
Goekdeniz-Guelmez | 243c9621d9 | update lora.py | 2025-01-31 21:10:44 +01:00
Goekdeniz-Guelmez | a57d553fc1 | update | 2025-01-31 16:57:43 +01:00
Goekdeniz-Guelmez | 80bcf68956 | grpo_trainer should be done | 2025-01-31 16:54:18 +01:00
Goekdeniz-Guelmez | 6c58aa995c | updates | 2025-01-31 16:27:31 +01:00
Goekdeniz-Guelmez | 93370ff1c3 | updates and fixing the KL div lines | 2025-01-30 23:55:40 +01:00
Goekdeniz-Guelmez | 5e0ae83487 | initial commit, gn | 2025-01-29 00:19:07 +01:00