mlx-examples

mirror of https://github.com/ml-explore/mlx-examples.git synced 2025-08-21 12:06:51 +08:00

Author	SHA1	Message	Date
Goekdeniz-Guelmez	6a367fa31e	update Copyright to this year	2025-01-28 21:04:03 +01:00
Goekdeniz-Guelmez	0d4f2c4dc0	cleaning up and adding apple copyright to helium modelfile	2025-01-28 21:02:50 +01:00
Goekdeniz-Guelmez	dfd51f16d6	Update MambaBlock, Batched Input Processing, Improved Cache Handling, Pre-computed Constants, Cleaner State Management, Explicit Return Values. Before: 82.442 tokens-per-sec, after: 129.130 tokens-per-sec.	2025-01-20 18:59:16 +01:00
Goekdeniz-Guelmez	db582e4f9e	Pre-computing A_log. After: 83.890 tokens-per-sec, before: 85.848 tokens-per-sec	2025-01-20 18:42:39 +01:00
Goekdeniz-Guelmez	0d4f2c4dc0	Fused Operations in delta, B, C = ... (before: 57.822 tokens-per-sec, after: 83.890 tokens-per-sec)	2025-01-20 18:39:22 +01:00
Goekdeniz-Guelmez	e43ac7c90e	added mx.einsum() operations: before: 41.293 tokens-per-sec, after: 57.822 tokens-per-sec	2025-01-20 18:37:58 +01:00
ilyasch2	3b526f0aa1	Add support for falcon-mamba (#1074) * Add support for falcon-mamba * nits * nit --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-11-04 12:23:30 -08:00
Awni Hannun	9000e280ae	fix mamba models conversion (#1065)	2024-10-22 15:44:08 -07:00
Awni Hannun	fca087be49	More cache improvements (#1015) * fix rotating kv cache for chat use case * reorg + fixes to caching, unify prompt caching across types and use cases for e.g. caching during a chat * nit in chat * fix tests * fix tests * fix tests * docs * chat command * comments + docs * Define meta_state on all Cache implementations * fixes + trim_prompt_cache api * fix default model --------- Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>	2024-10-07 20:45:51 -07:00
Gökdeniz Gülmez	76710f61af	Adding support for mamba (#940) * initial commit * initial commit * Adding first lines * adding x, and dt projection layers * adding the clamping mechanism * First successful inference * last commit for today - added custom generate function and it works as expected, will try training and then with loading a model from the hub * clean up * save up * almost * update * update * fixed cache handling * fixed loading * added separate generate_step method in the model and also in the utils to automatically use the generate step method in the model class * quick update * still not working * save * still not working * initial commit * utils.py logits = logits[:, -1, :] TypeError: tuple indices must be integers or slices, not tuple * update * update * Fixing the Batching Depthwise Convolution and multi token input * fixing generate and logits outputs * Done! * Fixing the cache handling, generating works now trying training * update ACKNOWLEDGEMENTS * removing the model_type if stuff in the _step loop in generate_step and adding MambaCache in base.py for training easier generations and removing mamba in tuner/utils. * quick clean up * update trainer/utils for right initialisation of the layers for LoRA, but not working. * clean up * Further update to trainer/utils for correct layer selection. Successful training * removing extra mamba-infer.py file * clean up, reformatting will come later * reformat and big clean up, final commit * some speedups and cleanups * fix test * nits * nits --------- Co-authored-by: Awni Hannun <awni@apple.com>	2024-09-28 07:02:53 -07:00

10 Commits