136 Commits

Author SHA1 Message Date
Juarez Bochi
f26e81ccc9
Fix layer norm 2023-12-17 07:47:52 -05:00
Juarez Bochi
4ec2b6eec3
Utils to compare encoder output 2023-12-17 07:20:24 -05:00
Juarez Bochi
7e42349f4c
Use position bias in all layers 2023-12-17 07:19:32 -05:00
Juarez Bochi
203f550ef9
Decode (broken after 1st token) 2023-12-16 14:53:50 -05:00
Juarez Bochi
31da1b0dab
LM head 2023-12-16 14:44:15 -05:00
Juarez Bochi
d12db65eeb
No scaling, no encoder mask 2023-12-16 14:24:13 -05:00
Juarez Bochi
64e7eaccb8
Fix relative_attention_max_distance config 2023-12-16 11:18:17 -05:00
Juarez Bochi
2a8ee32b02
Fix default prompt 2023-12-16 08:17:08 -05:00
Juarez Bochi
392b7a2f98
translate pytorch to mx 2023-12-15 16:51:01 -05:00
Juarez Bochi
330f024d1c
Move position biases to attention module 2023-12-15 11:30:17 -05:00
Juarez Bochi
d0497ddc0b
Load decoder weights 2023-12-15 10:50:04 -05:00
Juarez Bochi
009ed0179c
Load position bias embeddings 2023-12-15 10:16:11 -05:00
Juarez Bochi
62924d8135
Pass config to all modules, fix ln 2023-12-14 15:51:03 -05:00
Juarez Bochi
c0001a94f2
Load all encoder weights 2023-12-14 15:38:41 -05:00
Juarez Bochi
bca5ca4f98
Add skeleton 2023-12-14 15:21:36 -05:00
Awni Hannun
0e88a6afa1
Merge pull request #103 from arpitingle/patch-1
added phi in readme
2023-12-14 10:19:40 -08:00
arpit
5b08da2395
Update README.md 2023-12-14 23:40:50 +05:30
Awni Hannun
92efa32060
Merge pull request #97 from jbarrow/main
Phi-2
2023-12-14 09:21:26 -08:00
Awni Hannun
8f60d60814 cleanup conversion to use single qkv matrix 2023-12-14 09:19:44 -08:00
Awni Hannun
0c1c500714 update readme 2023-12-14 08:37:34 -08:00
Awni Hannun
3d2a23184a change file name for consistency, update readme. 2023-12-14 08:34:24 -08:00
Awni Hannun
840c0c36c2 don't drop last tokens 2023-12-14 08:27:44 -08:00
Awni Hannun
1613e608a9 fix args, update README, remove extra files 2023-12-14 08:18:01 -08:00
Awni Hannun
a8d4149147 fix fp16 + nits 2023-12-14 08:08:28 -08:00
Awni Hannun
b11997122d
Merge pull request #98 from finnless/patch-1
Fix typo in stable_diffusion README
2023-12-14 07:13:19 -08:00
Awni Hannun
363108d7b3
Merge pull request #102 from burakbudanur/main
Corrected the typo in 'ffn_dim_multiplier' in and added 'rope_theta' …
2023-12-14 07:12:20 -08:00
Burak Budanur
f691e00e5a Corrected the typo in 'ffn_dim_multiplier' in and added 'rope_theta' to the list unused. Without these, llama examples did not run. 2023-12-14 14:02:11 +01:00
Awni Hannun
88d7b67e6e add cache + generation, clean up some stuff 2023-12-13 22:26:33 -08:00
Nolan
0ce7618bc9
Fix typo in stable_diffusion README 2023-12-13 20:51:39 -08:00
Joe Barrow
a466cc5191 phi-2 draft 2023-12-13 22:23:38 -05:00
Awni Hannun
af2e2b40f9
Merge pull request #96 from Stv-X/typo-fix
Typo fix in whisper/README
2023-12-13 16:28:03 -08:00
Stv.X
cbae83e011 Corrected spelling of terms in whisper/README.md 2023-12-14 08:15:26 +08:00
Awni Hannun
9c7e996ff0
Merge pull request #51 from jbarrow/main
Update BERT to take advantage of bias param in MultiHeadAttention
2023-12-13 15:20:29 -08:00
Joe Barrow
9f4e63acbf Update to mlx>=0.0.5 2023-12-13 17:48:07 -05:00
Awni Hannun
c88468755b
Merge pull request #94 from jbax3/patch-1
Update README.md to fix git-lfs command
2023-12-13 14:19:14 -08:00
jbax3
1505e49a62
Update README.md to fix git-lfs command 2023-12-13 15:51:27 -06:00
Awni Hannun
8d83960a55
Merge pull request #93 from jbochi/patch-1
Fix convert.py instructions for Bert model
2023-12-13 08:47:52 -08:00
Juarez Bochi
03fe6896de
Fix convert.py instructions for Bert model
It just adds the missing backslash.
2023-12-13 11:37:02 -05:00
Awni Hannun
700b67fa3a
Merge pull request #90 from bofenghuang/fix-fp16
Fix whisper fp16 inference
2023-12-13 07:29:10 -08:00
Awni Hannun
3b7cfeb8ed
Merge pull request #88 from dastrobu/meta-form-url
fix "request access" form url for Llama models
2023-12-13 07:20:51 -08:00
bofenghuang
4b1a06c0cb Fix fp16 2023-12-13 11:07:47 +01:00
Daniel Strobusch
5515c2a75b
fix "request access" form url for Llama models 2023-12-13 10:19:29 +01:00
Awni Hannun
74c4ed40d2
Merge pull request #76 from bofenghuang/add-whisper-large-v3
Add whisper-large-v3
2023-12-12 20:22:31 -08:00
Awni Hannun
a614e951c4
Merge pull request #82 from ml-explore/llamav2
llama v2 with sharded weights
2023-12-12 17:08:24 -08:00
Awni Hannun
a99e9d551e hf correction 2023-12-12 17:08:04 -08:00
Awni Hannun
d3bd2e5d68
Merge pull request #79 from ml-explore/whisper_fp16
Enable FP16 for Whisper
2023-12-12 17:05:21 -08:00
Awni Hannun
66253a324c
Merge pull request #84 from iammerrick/patch-1
Update convert.py
2023-12-12 17:02:21 -08:00
Awni Hannun
b7081feb62
Merge pull request #86 from 1-ashraful-islam/patch-2
Update README.md with recently added examples
2023-12-12 17:01:02 -08:00
Ashraful Islam
2e6a6c32ae
Update README.md
updates readme with recently added examples
2023-12-12 18:26:13 -06:00
Merrick Christensen
2206e8f7d9
Update convert.py
Docs are right, however, the code has a typo.
2023-12-12 14:33:33 -07:00