Commit Graph

7 Commits

Author SHA1 Message Date
Alex Barron
f787c08585 comments 2025-01-23 12:31:59 -08:00
Alex Barron
d5f49d65b9 ordering 2025-01-23 06:37:47 -08:00
Alex Barron
4385363c0f distributed evaluate 2025-01-23 06:37:45 -08:00
Chime Ogbuji
f2619f507c
Add support for fewshot and apply chat template lm_eval functionality (#1180)
* Add support for multiturn fewshot examples and chat templates

Added two new arguments to the evaluation script: `--fewshot-as-multiturn` and `--apply-chat-template` which correspond to lm_eval options of similar names and are very often used to ensure apples-to-apples comparisons of lm_evaluation results

* Add HF overrides for methods needed by added options

* don't add duplicate bos

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2025-01-06 07:58:43 -08:00
Awni Hannun
c4833a2f55
fix encoding with special tokens + chat template (#1189) 2025-01-03 10:50:59 -08:00
Ikko Eltociear Ashimine
fc0674d2d8
chore: update evaluate.py (#1159)
occurence -> occurrence
2024-12-15 06:06:29 -08:00
Alex Barron
cd8cf28c39
mlx_lm.evaluate (#1140)
* Add evaluation script

* only write top level results

* add lm eval version

* typo

* create output dir

* relative import

* comment

---------

Co-authored-by: David Grangier <dgrangier@users.noreply.github.com>
2024-12-08 12:20:10 -08:00