Commit Graph

15 Commits

Author        SHA1        Message                                     Date
Juarez Bochi  f26e81ccc9  Fix layer norm                              2023-12-17 07:47:52 -05:00
Juarez Bochi  4ec2b6eec3  Utils to compare encoder output             2023-12-17 07:20:24 -05:00
Juarez Bochi  7e42349f4c  Use position bias in all layers             2023-12-17 07:19:32 -05:00
Juarez Bochi  203f550ef9  Decode (broken after 1st token)             2023-12-16 14:53:50 -05:00
Juarez Bochi  31da1b0dab  LM head                                     2023-12-16 14:44:15 -05:00
Juarez Bochi  d12db65eeb  No scaling, no encoder mask                 2023-12-16 14:24:13 -05:00
Juarez Bochi  64e7eaccb8  Fix relative_attention_max_distance config  2023-12-16 11:18:17 -05:00
Juarez Bochi  2a8ee32b02  Fix default prompt                          2023-12-16 08:17:08 -05:00
Juarez Bochi  392b7a2f98  translate pytorch to mx                     2023-12-15 16:51:01 -05:00
Juarez Bochi  330f024d1c  Move position biases to attention module   2023-12-15 11:30:17 -05:00
Juarez Bochi  d0497ddc0b  Load decoder weights                        2023-12-15 10:50:04 -05:00
Juarez Bochi  009ed0179c  Load position bias embeddings               2023-12-15 10:16:11 -05:00
Juarez Bochi  62924d8135  Pass config to all modules, fix ln          2023-12-14 15:51:03 -05:00
Juarez Bochi  c0001a94f2  Load all encoder weights                    2023-12-14 15:38:41 -05:00
Juarez Bochi  bca5ca4f98  Add skeleton                                2023-12-14 15:21:36 -05:00