Add support for fewshot and apply chat template lm_eval functionality (#1180)

* Add support for multiturn fewshot examples and chat templates

Added two new arguments to the evaluation script: `--fewshot-as-multiturn` and `--apply-chat-template`. They correspond to the lm_eval options of similar names, which are often used to ensure apples-to-apples comparisons of lm_eval results.

* Add HF overrides for methods needed by added options

* Don't add a duplicate BOS token
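
As a rough illustration of the changes listed above, here is a minimal sketch and not the code from this commit. Only the two flag names come from the commit message. The `build_parser` and `strip_duplicate_bos` helpers, the use of `store_true` flags, and the integer token IDs are assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: only the two flag names are taken from the commit
    # message; treating them as boolean switches is an assumption.
    parser = argparse.ArgumentParser(description="Illustrative evaluation CLI")
    parser.add_argument(
        "--fewshot-as-multiturn",
        action="store_true",
        help="Present few-shot examples as a multi-turn conversation.",
    )
    parser.add_argument(
        "--apply-chat-template",
        action="store_true",
        help="Format prompts with the model's chat template.",
    )
    return parser


def strip_duplicate_bos(token_ids, bos_id):
    """Collapse repeated leading BOS tokens into a single one.

    Hypothetical helper: a chat template may already emit a BOS token, so
    tokenizing the templated text again can prepend a second one.
    """
    ids = list(token_ids)
    while len(ids) >= 2 and ids[0] == bos_id and ids[1] == bos_id:
        ids.pop(0)
    return ids


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--fewshot-as-multiturn", "--apply-chat-template"])
print(args.fewshot_as_multiturn, args.apply_chat_template)  # True True
print(strip_duplicate_bos([1, 1, 5, 6], bos_id=1))  # [1, 5, 6]
```

Both flags default to `False` in this sketch, so existing invocations would behave as before unless a flag is passed explicitly.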

---------

Co-authored-by: Awni Hannun <awni@apple.com>
Chime Ogbuji
2025-01-06 10:58:43 -05:00
committed by GitHub
parent 25ec2d8c44
commit f2619f507c
3 changed files with 45 additions and 20 deletions


@@ -27,8 +27,8 @@ setup(
     packages=["mlx_lm", "mlx_lm.models", "mlx_lm.tuner"],
     python_requires=">=3.8",
     extras_require={
-        "testing": ["datasets"],
-        "evaluation": ["lm-eval"],
+        "test": ["datasets"],
+        "evaluate": ["lm-eval", "tqdm"],
     },
     entry_points={
         "console_scripts": [