Goekdeniz-Guelmez
|
1f89453295
|
eos token return fix
|
2025-03-05 14:00:51 +01:00 |
|
Goekdeniz-Guelmez
|
2bde97fe13
|
minor speed improvement
|
2025-03-05 13:55:24 +01:00 |
|
Goekdeniz-Guelmez
|
3dfb21267b
|
updates
|
2025-03-05 12:59:41 +01:00 |
|
Goekdeniz-Guelmez
|
132225a018
|
updates
|
2025-03-01 22:23:33 +01:00 |
|
Goekdeniz-Guelmez
|
925e11439b
|
updates
|
2025-02-28 22:07:24 +01:00 |
|
Goekdeniz-Guelmez
|
15d53279ae
|
batching fix
|
2025-02-28 16:02:40 +01:00 |
|
Goekdeniz-Guelmez
|
fab2dc2688
|
small fix
|
2025-02-26 15:21:57 +01:00 |
|
Goekdeniz-Guelmez
|
53185c7f3d
|
last update, gn
|
2025-02-24 22:20:07 +01:00 |
|
Goekdeniz-Guelmez
|
e4eac9c97b
|
adding custom system message integration in dataset, more optimizations (generation is now faster with the same RAM usage), fix for the identical generations, separated the reward functions into a separate file.
|
2025-02-24 20:49:22 +01:00 |
|
Goekdeniz-Guelmez
|
9705ed908e
|
fix wrong generation in train
|
2025-02-22 17:21:08 +01:00 |
|
Goekdeniz-Guelmez
|
d9c4c6e60c
|
clean up and re-adding temperature argument
|
2025-02-22 02:34:56 +01:00 |
|
Goekdeniz-Guelmez
|
d653371e3d
|
nits
|
2025-02-22 02:12:02 +01:00 |
|
Goekdeniz-Guelmez
|
235348c211
|
generation speed improvement in training too, from 3 t/s to 15 t/s
|
2025-02-22 02:03:01 +01:00 |
|
Goekdeniz-Guelmez
|
79de353530
|
nits
|
2025-02-22 01:05:58 +01:00 |
|
Goekdeniz-Guelmez
|
c51b0a2715
|
fix
|
2025-02-22 00:21:47 +01:00 |
|
Goekdeniz-Guelmez
|
710bc1490e
|
training mode working too, went from 2 toks/sec to 30 toks/sec with raw 1.5B model
|
2025-02-21 22:42:15 +01:00 |
|
Goekdeniz-Guelmez
|
6086137131
|
Huge speed improvement in validation mode.
|
2025-02-21 22:08:49 +01:00 |
|
Goekdeniz-Guelmez
|
2f20107d9b
|
slightly faster generation + prints out an example generation in validation mode, more optimization in training function
|
2025-02-21 16:02:27 +01:00 |
|
Goekdeniz-Guelmez
|
541f0be937
|
fix generation cutoff in evaluation
|
2025-02-17 14:39:38 +01:00 |
|
Goekdeniz-Guelmez
|
6a6bd53e43
|
removing print and switching some variables in the math
|
2025-02-15 15:38:51 +01:00 |
|
Goekdeniz-Guelmez
|
5ec4790656
|
removing comments + adding temperature + reward weighting
|
2025-02-15 15:29:22 +01:00 |
|
Goekdeniz-Guelmez
|
e33d9d509b
|
updates
|
2025-02-12 11:07:53 +01:00 |
|
Goekdeniz-Guelmez
|
5aeefc8c47
|
update new iterate batches function + nits
|
2025-02-12 08:57:26 +01:00 |
|
Goekdeniz-Guelmez
|
e80bf95182
|
fix
|
2025-02-11 09:26:43 +01:00 |
|
Goekdeniz-Guelmez
|
e96afe9e9f
|
updates
|
2025-02-11 09:09:28 +01:00 |
|
Goekdeniz-Guelmez
|
88ca747e9e
|
nits
|
2025-02-10 19:46:19 +01:00 |
|
Goekdeniz-Guelmez
|
b7bc811507
|
nits
|
2025-02-10 19:45:19 +01:00 |
|
Goekdeniz-Guelmez
|
e5aa2c3b5d
|
nits
|
2025-02-10 17:51:14 +01:00 |
|
Goekdeniz-Guelmez
|
f88e897019
|
removing helper functions
|
2025-02-10 16:07:28 +01:00 |
|
Goekdeniz-Guelmez
|
d9da35f458
|
nits
|
2025-02-10 10:52:32 +01:00 |
|
Goekdeniz-Guelmez
|
00712522ba
|
rebase loss calculation
|
2025-02-09 17:13:05 +01:00 |
|
Goekdeniz-Guelmez
|
a527cdb39b
|
fix: prevent gradients from flowing through the reference model's logits
|
2025-02-09 17:02:58 +01:00 |
|
Goekdeniz-Guelmez
|
54179901b5
|
fix
|
2025-02-09 15:41:47 +01:00 |
|
Goekdeniz-Guelmez
|
9ba6146a76
|
fix
|
2025-02-09 14:32:50 +01:00 |
|
Goekdeniz-Guelmez
|
bcfa55d882
|
updates
|
2025-02-05 15:02:12 +01:00 |
|
Goekdeniz-Guelmez
|
0a19522ec4
|
updates
|
2025-02-05 14:38:09 +01:00 |
|
Goekdeniz-Guelmez
|
35a2d99cf9
|
small fix
|
2025-02-05 11:30:21 +01:00 |
|
Goekdeniz-Guelmez
|
a33cad84b4
|
updates
|
2025-02-05 09:48:00 +01:00 |
|
Goekdeniz-Guelmez
|
2a8e6f6e44
|
update
|
2025-02-05 08:47:03 +01:00 |
|
Goekdeniz-Guelmez
|
0a09a93454
|
fix cache handling
|
2025-02-05 08:44:06 +01:00 |
|
Goekdeniz-Guelmez
|
7173840283
|
first successful training run
|
2025-02-04 09:18:45 +01:00 |
|
Goekdeniz-Guelmez
|
ca32424043
|
updates
|
2025-02-03 21:57:26 +01:00 |
|
Goekdeniz-Guelmez
|
54e295ea80
|
fix name funcs
|
2025-02-03 19:56:11 +01:00 |
|
Goekdeniz-Guelmez
|
06f9c29c94
|
print func name
|
2025-02-03 19:47:40 +01:00 |
|
Goekdeniz-Guelmez
|
40bca770ae
|
fixes
|
2025-02-03 19:43:49 +01:00 |
|
Goekdeniz-Guelmez
|
05d921b788
|
optims
|
2025-02-03 19:37:05 +01:00 |
|
Goekdeniz-Guelmez
|
1d9e4802f0
|
first working prototype, will try training out at home
|
2025-02-03 12:05:29 +01:00 |
|
Goekdeniz-Guelmez
|
23d75cd7ad
|
starting first training test run
|
2025-02-03 10:08:28 +01:00 |
|
Goekdeniz-Guelmez
|
a3ed632422
|
dataset wrapper done
|
2025-02-03 09:13:17 +01:00 |
|
Goekdeniz-Guelmez
|
d034ca369e
|
adding function for R1
|
2025-02-03 08:26:42 +01:00 |
|