Goekdeniz-Guelmez | 1d851069ea | nits | 2024-11-10 17:21:18 +01:00
Goekdeniz-Guelmez | 1a6688384d | implemented multi-token inputs, but still generating gibberish | 2024-11-10 17:19:00 +01:00
Goekdeniz-Guelmez | 2f95b361a8 | removed the custom Mamba2Cache and updated the existing MambaCache, but still only one input token and outputs gibberish | 2024-11-10 16:57:03 +01:00
Gökdeniz Gülmez | 49d3f188f8 | Merge branch 'ml-explore:main' into adding-support-for-mamba2 | 2024-11-10 16:36:02 +01:00
Goekdeniz-Guelmez | 3a499f9735 | fixed inference slowness, but it can't handle multiple token inputs and is generating gibberish | 2024-11-10 16:35:07 +01:00
Goekdeniz-Guelmez | 800b60239c | save checkpoint | 2024-11-10 14:36:26 +01:00
Awni Hannun | 657b4cc0aa | [MLX LM] Sampler refactor + a few improvements (#1094) | 2024-11-07 16:15:24 -08:00
  * starting
  * refactor sampler/processor and a few improvements
  * fix stream
  * fix stream generate
  * fix eos handling in stream generate
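(For context on the refactor above: the change splits generation into logits processors that rewrite logits and a sampler that turns logits into the next token id. The sketch below illustrates that split in minimal form; the function names are illustrative, not the exact mlx_lm API.)

```python
# Minimal sketch of the sampler idea behind #1094 (illustrative names, not the
# exact mlx_lm API): a sampler is a callable mapping log-probabilities to a
# sampled token id, so greedy and temperature sampling become interchangeable.
import mlx.core as mx

def make_temperature_sampler(temp: float = 1.0):
    def sample(logprobs: mx.array) -> mx.array:
        if temp == 0:
            return mx.argmax(logprobs, axis=-1)      # greedy decoding
        return mx.random.categorical(logprobs * (1 / temp))
    return sample

sampler = make_temperature_sampler(0.7)  # passed to the generation loop as a callable
```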
Goekdeniz-Guelmez | 906f972d36 | save push | 2024-11-06 16:35:46 +01:00
Angelos Katharopoulos | ed9e81dd58 | Fix rotating kv cache size (#1093) | 2024-11-05 10:24:24 -08:00
Awni Hannun | 6fd1f70f73 | fix spm decoder multi-byte (#1092) | 2024-11-05 06:06:26 -08:00
ilyasch2 | 3b526f0aa1 | Add support for falcon-mamba (#1074) | 2024-11-04 12:23:30 -08:00
  * Add support for falcon-mamba
  * nits
  * nit
  Co-authored-by: Awni Hannun <awni@apple.com>
Anchen | 82e3338987 | chore(mlx-lm): add max token arg for mlx_lm.chat (#1089) | 2024-11-04 06:06:34 -08:00
  * chore(mlx-lm): add max token arg for mlx_lm.chat
  * chore: update the default max token value
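(A hedged usage sketch for the change above: the new argument caps how many tokens each chat reply may generate. The `--max-tokens` spelling is an assumption based on the PR title and other mlx_lm commands; check `python -m mlx_lm.chat --help` for the exact flag and default.)

```python
# Hypothetical invocation of the chat REPL with a reply-length cap; the flag
# name is assumed from the PR title ("max token arg"), not verified here.
import subprocess

subprocess.run(["python", "-m", "mlx_lm.chat", "--max-tokens", "256"])
```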
Angelos Katharopoulos | 331148d8ec | Enable distributed LoRA training (#821) | 2024-11-02 18:02:31 -07:00
Awni Hannun | 0f799947d0 | fix (#1079) | 2024-11-01 16:30:32 -07:00
Awni Hannun | e510987870 | Clear cache every now and then (#1081) | 2024-11-01 14:15:32 -07:00
  * clear cache every now and then
  * don't need user arg anymore
Alex Barron | 85ffd2c96a | Quantized KV Cache (#1075) | 2024-10-31 16:59:52 -07:00
  * add QuantizedKVCache
  * simplify
  * add tests
  * single sdpa function
  * fix sed
  * in place
  * fix tests
  * support different k and v head dims
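(To illustrate the idea behind the Quantized KV Cache change above: keys and values are stored in low-bit form and dequantized when attention needs them, trading a little compute for much less cache memory. The class below is a toy sketch built on MLX's public quantize/dequantize ops, not the PR's actual QuantizedKVCache.)

```python
# Toy sketch only, not the PR's QuantizedKVCache: each decoding step's keys and
# values are stored quantized and dequantized again before attention. Shapes
# and the per-step packing are simplifications.
import mlx.core as mx

class ToyQuantizedKVCache:
    def __init__(self, bits: int = 4, group_size: int = 64):
        self.bits, self.group_size = bits, group_size
        self.keys, self.values = [], []  # lists of (packed, scales, biases)

    def update(self, k: mx.array, v: mx.array):
        # k, v: (n_kv_heads, head_dim) for one step; head_dim must be a
        # multiple of group_size for mx.quantize.
        q = lambda x: mx.quantize(x, group_size=self.group_size, bits=self.bits)
        self.keys.append(q(k))
        self.values.append(q(v))

    def state(self):
        # Dequantize and stack along a new sequence axis for attention.
        dq = lambda t: mx.dequantize(*t, group_size=self.group_size, bits=self.bits)
        return (mx.stack([dq(t) for t in self.keys], axis=-2),
                mx.stack([dq(t) for t in self.values], axis=-2))
```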
Awni Hannun | 9f34fdbda4 | Wire models in MLX LM (#1069) | 2024-10-31 08:17:14 -07:00
  * wired in MLX LM
  * fix synch
  * comment + nit
  * version
  * mlx lm version
  * bump to 0.19.2
Goekdeniz-Guelmez | 58b448dc0b | updates | 2024-10-30 21:23:13 +01:00
Gökdeniz Gülmez | ffc7ab06a0 | Merge branch 'ml-explore:main' into adding-support-for-mamba2 | 2024-10-30 17:04:38 +01:00
Awni Hannun | 8fe9539af7 | Fix detokenizer space match for quote (#1072) | 2024-10-27 15:06:07 -07:00
  * fix + test
  * remove transformer flax/torch warning
  * format
hschaeufler | ab4bf05c6e | Update lora_config.yaml with new param: num_layers (#1068) | 2024-10-26 09:34:46 -07:00
Gökdeniz Gülmez | 3b70708201 | Merge branch 'ml-explore:main' into adding-support-for-mamba2 | 2024-10-25 08:57:37 +02:00
Goekdeniz-Guelmez | 7c8849e795 | update | 2024-10-24 16:16:42 +02:00
Awni Hannun | 9000e280ae | fix mamba models conversion (#1065) | 2024-10-22 15:44:08 -07:00
Goekdeniz-Guelmez | a677638c4b | inference works but is hella slow | 2024-10-22 23:06:06 +02:00
Goekdeniz-Guelmez | 9ab581d678 | notes | 2024-10-22 22:10:53 +02:00
Goekdeniz-Guelmez | e43a2ab229 | not working, probably incorrect cache handling | 2024-10-22 22:04:25 +02:00
Goekdeniz-Guelmez | 55485b98e8 | update | 2024-10-22 21:23:47 +02:00
madroid | d1d480867b | LoRA: update tools datasets docs (#1063) | 2024-10-22 12:19:11 -07:00
  * LoRA: update tools datasets docs
  * nits
  * nits
  Co-authored-by: Awni Hannun <awni@apple.com>
Goekdeniz-Guelmez | 758597eaa8 | adding multi token input and correct cache handling in ssm step | 2024-10-22 20:44:23 +02:00
Awni Hannun | 66e7bcb886 | override dtype with quant (#1062) | 2024-10-22 09:56:45 -07:00
Goekdeniz-Guelmez | 5326d9373a | Merge branch 'adding-support-for-mamba2' of https://github.com/Goekdeniz-Guelmez/mlx-examples into adding-support-for-mamba2 | 2024-10-22 18:26:05 +02:00
Goekdeniz-Guelmez | b9c57cd429 | generation works! trying training now | 2024-10-22 18:25:59 +02:00
Gökdeniz Gülmez | 0ef73f3a2d | Merge branch 'ml-explore:main' into adding-support-for-mamba2 | 2024-10-21 15:14:19 +02:00
aronson | 743763bc2e | Handle empty string case in maybe_trim_space (#1055) | 2024-10-20 20:46:43 -07:00
  * Handle empty string case in maybe_trim_space
  * nit
  Co-authored-by: Awni Hannun <awni@apple.com>
Goekdeniz-Guelmez | c1634ce81b | still generating gibberish | 2024-10-20 18:41:28 +02:00
Goekdeniz-Guelmez | ab4cf1d1cf | generation works but outputs gibberish | 2024-10-20 18:04:34 +02:00
Goekdeniz-Guelmez | 4ab5139c05 | quick save | 2024-10-20 16:11:39 +02:00
Goekdeniz-Guelmez | cd036ccfb5 | fix generation works too (almost) | 2024-10-16 21:13:36 +02:00
Goekdeniz-Guelmez | 181d6abedc | Merge branch 'adding-support-for-mamba2' of https://github.com/Goekdeniz-Guelmez/mlx-examples into adding-support-for-mamba2 | 2024-10-16 21:09:42 +02:00
Goekdeniz-Guelmez | 8073cb486c | adding debug statements (somehow generation only goes through the first MambaMixer block pass) | 2024-10-16 21:09:30 +02:00
Gökdeniz Gülmez | 855fcc4327 | Merge branch 'ml-explore:main' into adding-support-for-mamba2 | 2024-10-16 18:57:55 +02:00
Awni Hannun | 605c4854f1 | Prompt caching in mlx_lm.server (#1026) | 2024-10-14 10:57:22 -07:00
  * caching in server
  * nits
  * fix tests
  * don't throw if no metal
  * comments
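(Usage sketch for the server-side prompt caching above: two requests that share a long prefix, where the second can reuse the cached prompt instead of re-processing it. The endpoint follows the OpenAI-compatible API that mlx_lm.server exposes; the host, port, and prompt contents are placeholders for whatever the server was started with.)

```python
# Sketch: issue two chat requests sharing a long system prompt against a local
# mlx_lm.server instance. The URL is a placeholder; response parsing assumes
# the OpenAI-style schema the server mimics.
import requests

URL = "http://localhost:8080/v1/chat/completions"
shared_context = "..."  # a long document or system prompt reused across requests

for question in ["Summarize the document.", "List its key terms."]:
    resp = requests.post(URL, json={
        "messages": [
            {"role": "system", "content": shared_context},
            {"role": "user", "content": question},
        ],
        "max_tokens": 128,
    })
    print(resp.json()["choices"][0]["message"]["content"])
```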
Awni Hannun | 8dca1a2f60 | Tokenizer updates + tests (#1024) | 2024-10-14 10:48:46 -07:00
  * tokenizer updates + tests
  * nit
  * add can_trim_prompt_cache
  * nits
Awni Hannun | c799133998 | Make llm async eval less brittle (#1040) | 2024-10-14 10:25:24 -07:00
  * Make llm async eval less brittle
  * nit
Gökdeniz Gülmez | 3f1c1dde6a | Merge branch 'ml-explore:main' into adding-support-for-mamba2 | 2024-10-14 16:32:00 +02:00
Shunta Saito | 7612c646f3 | Fix PLaMo model to support Grouped Query Attention (#1037) | 2024-10-12 15:26:50 -07:00
Goekdeniz-Guelmez | 00ba27fe6c | adding debug statements | 2024-10-11 21:36:41 +02:00
Goekdeniz-Guelmez | 6f88dd59d7 | quick clean up and fix | 2024-10-11 21:08:13 +02:00
Goekdeniz-Guelmez | 9c075a71f8 | Merge branch 'adding-support-for-mamba2' of https://github.com/Goekdeniz-Guelmez/mlx-examples into adding-support-for-mamba2 | 2024-10-11 20:54:35 +02:00