Add support for fewshot and apply chat template lm_eval functionality (#1180)

* Add support for multiturn fewshot examples and chat templates Added two new arguments to the evaluation script: `--fewshot-as-multiturn` and `--apply-chat-template` which correspond to lm_eval options of similar names and are very often used to ensure apples-to-apples comparisons of lm_evaluation results * Add HF overrides for methods needed by added options * don't add duplicate bos --------- Co-authored-by: Awni Hannun <awni@apple.com>
2025-12-16 02:08:55 +08:00 · 2025-01-06 10:58:43 -05:00
parent 25ec2d8c44
commit f2619f507c
3 changed files with 45 additions and 20 deletions
--- a/.circleci/config.yml
+++ b/.circleci/config.yml
@@ -32,7 +32,7 @@ jobs:
            pip install --upgrade pip
            pip install unittest-xml-reporting
            cd llms/
-            pip install -e ".[testing]"
+            pip install -e ".[test]"
      - run:
          name: Run Python tests
          command: |