* Add support for multiturn fewshot examples and chat templates
Added two new arguments to the evaluation script: `--fewshot-as-multiturn` and `--apply-chat-template` which correspond to lm_eval options of similar names and are very often used to ensure apples-to-apples comparisons of lm_evaluation results
* Add HF overrides for methods needed by added options
* don't add duplicate bos
---------
Co-authored-by: Awni Hannun <awni@apple.com>
* fix rotating kv cache for chat use case
* reorg + fixes to caching, unify prompt caching across types and use cases for e.g. caching during a chat
* nit in chat
* fix tests
* fix tests
* fix tests
* docs
* chat command
* comments + docs
* Define meta_state on all Cache implementations
* fixes + trim_prompt_cache api
* fix default model
---------
Co-authored-by: Angelos Katharopoulos <a_katharopoulos@apple.com>
* Make sure to import the correct "version" module when installing the
mlx_whisper package from local source code.
* Make sure to import the correct "version" module when installing the mlx_lm package from local source code
* fix
---------
Co-authored-by: Awni Hannun <awni@apple.com>
* Add hf_dataset configuration for using HF hub-hosted datasets for (Q)LoRA training
* Pre-commit formatting
* Fix YAML config example
* Print DS info
* Include name
* Add hf_dataset parameter default
* Remove TextHFDataset and CompletionsHFDataset and use Dataset and CompletionsDataset instead, adding a text_key constructor argument to the former (and changing it to work with a provided data structure instead of just from a JSON file), and prompt_key and completion_key arguments to the latter with defaults for backwards compatibility.
* nits
* update docs
---------
Co-authored-by: Awni Hannun <awni@apple.com>
* Add model management functionality for local caches
This commit introduces a set of command-line utilities for managing MLX models downloaded and saved locally in Hugging Face cache. The functionalities include scanning existing models, retrieving detailed information about a specific model, and deleting a model by its name.
* Added mlx_lm.model to setup.py
* nits
---------
Co-authored-by: Awni Hannun <awni@apple.com>
* LoRA: Improve validation error for LoRA layer count exceeding model layer
This commit enhances the error handling when the specified LoRA layer count exceeds the total number of layers in the model. It clarifies the error message to provide actionable feedback for users, guiding them to adjust their input parameters accordingly.
* format + nits
---------
Co-authored-by: Awni Hannun <awni@apple.com>