Add support for fewshot and apply chat template lm_eval functionality (#1180)

* Add support for multiturn fewshot examples and chat templates

Added two new arguments to the evaluation script: `--fewshot-as-multiturn` and `--apply-chat-template` which correspond to lm_eval options of similar names and are very often used to ensure apples-to-apples comparisons of lm_evaluation results

* Add HF overrides for methods needed by added options

* don't add duplicate bos

---------

Co-authored-by: Awni Hannun <awni@apple.com>
This commit is contained in:
Chime Ogbuji
2025-01-06 10:58:43 -05:00
committed by GitHub
parent 25ec2d8c44
commit f2619f507c
3 changed files with 45 additions and 20 deletions

View File

@@ -32,7 +32,7 @@ jobs:
pip install --upgrade pip
pip install unittest-xml-reporting
cd llms/
pip install -e ".[testing]"
pip install -e ".[test]"
- run:
name: Run Python tests
command: |