CLIP (ViT) (#315)

mirror of https://github.com/ml-explore/mlx-examples.git synced 2025-10-24 06:28:07 +08:00

* probably approximately correct CLIPTextEncoder

* implemented CLIPEncoderLayer as built-in nn.TransformerEncoderLayer

* replaced embedding layer with simple matrix

* implemented ViT

* added ViT tests

* fixed tests

* added pooler_output for text

* implemented complete CLIPModel

* implemented init

* implemented convert.py and from_pretrained

* fixed some minor bugs and added the README.md

* removed tokenizer unused comments

* removed unused deps

* updated ACKNOWLEDGEMENTS.md

* Feat: Image Processor for CLIP (#1)

@nkasmanoff:
* clip image processor
* added example usage

* refactored image preprocessing

* deleted unused image_config.py

* removed preprocessing port

* added dependency to mlx-data

* fixed attribution and moved photos to assets

* implemented a simple port of CLIPImageProcessor

* review changes

* PR review changes

* renamed overly verbose arg

* updated README.md

* nits in readme / conversion

* simplify some stuff, remove unneeded inits

* remove more init stuff

* more simplify

* make test a unit test

* update main readme

* readme nits

---------

Co-authored-by: Noah Kasmanoff <nkasmanoff@gmail.com>
Co-authored-by: Awni Hannun <awni@apple.com>

Author: Gabrijel Boduljak
Date: 2024-01-31 23:19:53 +01:00
Committed by: GitHub
Parent: ba3a9355d1
Commit: 94358219cf

14 changed files with 890 additions and 0 deletions

BIN clip/assets/cat.jpeg Normal file (binary file not shown; size: 193 KiB)
