Juarez Bochi | 64e53e8415 | Pass ln2 to cross attention | 2023-12-18 15:05:18 -05:00
Awni Hannun | e899271275 | nits | 2023-12-18 11:01:16 -08:00
Awni Hannun | 29e642a482 | readme updates | 2023-12-18 10:58:43 -08:00
Juarez Bochi | 36fd88509e | Rescale output before projecting on vocab | 2023-12-18 13:43:03 -05:00
Juarez Bochi | 511f572b6c | Increase hf max_length | 2023-12-18 13:35:44 -05:00
Juarez Bochi | 66e1c0f050 | Fix type for attention mask | 2023-12-18 11:39:17 -05:00
Juarez Bochi | 5ae339f6d2 | Add hf generation for comparison | 2023-12-18 11:35:16 -05:00
Juarez Bochi | 305a52dde8 | Run hf_t5 with any model | 2023-12-18 11:25:14 -05:00
Juarez Bochi | 0779417903 | Fix --encode-only | 2023-12-18 11:19:44 -05:00
Juarez Bochi | 83b68a5bdb | Fix relative position scale | 2023-12-18 11:13:44 -05:00
Juarez Bochi | 9d3ee016c9 | Add readme.md for t5 | 2023-12-18 08:50:36 -05:00
Juarez Bochi | 4bc8f49043 | Add gitignore | 2023-12-18 08:42:45 -05:00
Juarez Bochi | 54b82198d0 | Uncomment bidirectional param | 2023-12-18 08:42:27 -05:00
Juarez Bochi | 55f204dd3a | Load config from HF to support any model | 2023-12-18 08:42:06 -05:00
Juarez Bochi | b2a3782a96 | Add argument to generate float16 npz | 2023-12-18 08:21:20 -05:00
Juarez Bochi | 09e851499a | Stream output | 2023-12-18 08:09:56 -05:00
Juarez Bochi | 689eda9937 | Fix T5.__call__ | 2023-12-18 08:00:01 -05:00
Awni Hannun | 34843ddeb2 | format | 2023-12-17 21:30:28 -08:00
Awni Hannun | c468edc4e3 | bug fix with bidirectional only for encoder, add offset to position bias | 2023-12-17 21:22:00 -08:00
Awni Hannun | 688a6e1e78 | with cache | 2023-12-17 17:35:53 -08:00
Juarez Bochi | 29bfb93455 | Measure tokens/s | 2023-12-17 10:53:49 -05:00
Juarez Bochi | 90d3a15ba2 | Stop on eos | 2023-12-17 08:59:03 -05:00
Juarez Bochi | 61fda57eba | Remove prints | 2023-12-17 08:52:54 -05:00
Juarez Bochi | 152e85fade | Concatenate tokens | 2023-12-17 08:51:16 -05:00
Juarez Bochi | daea1dcddf | Use position bias in decoder | 2023-12-17 08:40:10 -05:00
Juarez Bochi | 7dcf2b688d | Fix decoder mask | 2023-12-17 08:34:21 -05:00
Juarez Bochi | f26e81ccc9 | Fix layer norm | 2023-12-17 07:47:52 -05:00
Juarez Bochi | 4ec2b6eec3 | Utils to compare encoder output | 2023-12-17 07:20:24 -05:00
Juarez Bochi | 7e42349f4c | Use position bias in all layers | 2023-12-17 07:19:32 -05:00
Juarez Bochi | 203f550ef9 | Decode (broken after 1st token) | 2023-12-16 14:53:50 -05:00
Juarez Bochi | 31da1b0dab | LM head | 2023-12-16 14:44:15 -05:00
Juarez Bochi | d12db65eeb | No scaling, no encoder mask | 2023-12-16 14:24:13 -05:00
Juarez Bochi | 64e7eaccb8 | Fix relative_attention_max_distance config | 2023-12-16 11:18:17 -05:00
Juarez Bochi | 2a8ee32b02 | Fix default prompt | 2023-12-16 08:17:08 -05:00
Juarez Bochi | 392b7a2f98 | translate pytorch to mx | 2023-12-15 16:51:01 -05:00
Juarez Bochi | 330f024d1c | Move position biases to attention module | 2023-12-15 11:30:17 -05:00
Juarez Bochi | d0497ddc0b | Load decoder weights | 2023-12-15 10:50:04 -05:00
Juarez Bochi | 009ed0179c | Load position bias embeddings | 2023-12-15 10:16:11 -05:00
Juarez Bochi | 62924d8135 | Pass config to all modules, fix ln | 2023-12-14 15:51:03 -05:00
Juarez Bochi | c0001a94f2 | Load all encoder weights | 2023-12-14 15:38:41 -05:00
Juarez Bochi | bca5ca4f98 | Add skeleton | 2023-12-14 15:21:36 -05:00
Awni Hannun | 0e88a6afa1 | Merge pull request #103 from arpitingle/patch-1 (added phi in readme) | 2023-12-14 10:19:40 -08:00
arpit | 5b08da2395 | Update README.md | 2023-12-14 23:40:50 +05:30
Awni Hannun | 92efa32060 | Merge pull request #97 from jbarrow/main (Phi-2) | 2023-12-14 09:21:26 -08:00
Awni Hannun | 8f60d60814 | cleanup conversion to use single qkv matrix | 2023-12-14 09:19:44 -08:00
Awni Hannun | 0c1c500714 | update readme | 2023-12-14 08:37:34 -08:00
Awni Hannun | 3d2a23184a | change file name for consistency, update readme. | 2023-12-14 08:34:24 -08:00
Awni Hannun | 840c0c36c2 | don't drop last tokens | 2023-12-14 08:27:44 -08:00
Awni Hannun | 1613e608a9 | fix args, update README, remove extra files | 2023-12-14 08:18:01 -08:00
Awni Hannun | a8d4149147 | fix fp16 + nits | 2023-12-14 08:08:28 -08:00