Enable more BERT models (#580)

* Update convert.py

* Update model.py

* Update test.py

* Update model.py

* Update convert.py

* Add files via upload

* Update convert.py

* format

* nit

* nit

---------

Co-authored-by: Awni Hannun <awni@apple.com>
Author: yzimmermann
Date: 2024-03-20 01:21:33 +01:00
Committed by: GitHub
parent b0bcd86a40
commit 4680ef4413
4 changed files with 76 additions and 68 deletions


@@ -1,6 +1,6 @@
 # BERT
-An implementation of BERT [(Devlin, et al., 2019)](https://aclanthology.org/N19-1423/) within MLX.
+An implementation of BERT [(Devlin, et al., 2019)](https://aclanthology.org/N19-1423/) in MLX.
 ## Setup
@@ -38,12 +38,12 @@ output, pooled = model(**tokens)
 ```
 The `output` contains a `Batch x Tokens x Dims` tensor, representing a vector
-for every input token. If you want to train anything at a **token-level**,
-you'll want to use this.
+for every input token. If you want to train anything at the **token-level**,
+use this.
 The `pooled` contains a `Batch x Dims` tensor, which is the pooled
 representation for each input. If you want to train a **classification**
-model, you'll want to use this.
+model, use this.
 ## Test
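
The distinction the README draws between `output` (token-level) and `pooled` (sequence-level) can be sketched with plain arrays. This is a minimal illustration, not a real model run: the shapes are hypothetical, and the mean-pool here merely stands in for whatever pooling the model applies.

```python
import numpy as np

# Hypothetical shapes matching the README: a batch of 2 inputs,
# 5 tokens each, hidden size 4.
batch, tokens, dims = 2, 5, 4
output = np.random.rand(batch, tokens, dims)  # Batch x Tokens x Dims
pooled = output.mean(axis=1)                  # stand-in for the pooled Batch x Dims tensor

# Token-level task (e.g. tagging): keep one vector per token.
per_token_features = output                   # shape (2, 5, 4)

# Classification task: keep one vector per input sequence.
per_sequence_features = pooled                # shape (2, 4)
```

For a classifier you would feed `per_sequence_features` into a linear head; for a tagger, `per_token_features`.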