Commit Graph

  • 3aaf2d6c9f Merge branch 'main' into adding-dpo-training Gökdeniz Gülmez 2025-03-08 10:07:48 +0100
  • 292979d447 Merge branch 'ml-explore:main' into main chaihahaha 2025-03-08 12:06:04 +0800
  • d2e02b3aae fix mixed quant option (#1326) Awni Hannun 2025-03-07 08:35:48 -0800
  • c72811da57 fix mixed quant option Awni Hannun 2025-03-07 06:55:35 -0800
  • 595f5da146 remove lm head if unused (#1324) Awni Hannun 2025-03-06 15:35:47 -0800
  • 877d2a345b Change DEFAULT_SEED to None for stochastic generation by default (#1323) cavit99 2025-03-06 14:49:35 +0000
  • bc2fcec230 Update llms/mlx_lm/generate.py Awni Hannun 2025-03-06 06:45:59 -0800
  • 421b0219a9 Update llms/mlx_lm/chat.py Awni Hannun 2025-03-06 06:45:52 -0800
  • 717e415ad4 remove lm head if unused Awni Hannun 2025-03-06 06:18:46 -0800
  • 5a4252f290 Change DEFAULT_SEED to None for stochastic generation by default Cavit Erginsoy 2025-03-05 23:24:15 +0000
  • 32d10036de fix flaky test (#1322) Awni Hannun 2025-03-05 14:00:09 -0800
  • e150621095 Adding multiple optimizers to mlx lm (#1315) Gökdeniz Gülmez 2025-03-05 22:54:54 +0100
  • 4d77043886 fix flaky test Awni Hannun 2025-03-05 13:53:35 -0800
  • 56d2db23e1 adding OLMoE architecture (#1321) Gökdeniz Gülmez 2025-03-05 22:46:06 +0100
  • e7267d30f8 Distributed support cifar (#1301) Angelos Katharopoulos 2025-03-05 13:33:15 -0800
  • 499a9f0758 Merge branch 'ml-explore:main' into adding-support-for-OLMoE Gökdeniz Gülmez 2025-03-05 18:38:43 +0100
  • cae3705109 udpate Goekdeniz-Guelmez 2025-03-05 15:32:58 +0100
  • f13a0d04ca seperate functions Goekdeniz-Guelmez 2025-03-05 15:28:12 +0100
  • d723ddfeda updates Goekdeniz-Guelmez 2025-03-05 14:49:56 +0100
  • 9a36452519 updates Goekdeniz-Guelmez 2025-03-05 14:42:34 +0100
  • 326935be49 updates Goekdeniz-Guelmez 2025-03-05 14:40:23 +0100
  • 2d2f39f96e updates Goekdeniz-Guelmez 2025-03-05 14:25:55 +0100
  • 1f89453295 eos token return fix Goekdeniz-Guelmez 2025-03-05 14:00:51 +0100
  • 2bde97fe13 minor speed improvement Goekdeniz-Guelmez 2025-03-05 13:55:24 +0100
  • 3dfb21267b updates Goekdeniz-Guelmez 2025-03-05 12:59:41 +0100
  • 0df9491ff0 using default arguments Goekdeniz-Guelmez 2025-03-05 09:47:40 +0100
  • 5b091daeec removing muon Goekdeniz-Guelmez 2025-03-05 09:41:47 +0100
  • 64ed426518 Changed the switch to set opt_class Goekdeniz-Guelmez 2025-03-05 09:40:36 +0100
  • a1ff1bf72a formated Goekdeniz-Guelmez 2025-03-05 09:30:09 +0100
  • e4c56625f0 a little faster and adding norm_topk_prob Goekdeniz-Guelmez 2025-03-05 00:16:20 +0100
  • 4ca2cd5759 clean up Goekdeniz-Guelmez 2025-03-05 00:12:29 +0100
  • 140285080d adding SwitchGLU Goekdeniz-Guelmez 2025-03-05 00:09:17 +0100
  • f621218ff5 Tool use example (#1316) Awni Hannun 2025-03-04 13:53:20 -0800
  • 230abad426 more clean ups Goekdeniz-Guelmez 2025-03-04 22:44:24 +0100
  • d828bc0c2d remove sanitize method Goekdeniz-Guelmez 2025-03-04 22:42:49 +0100
  • 86c665f7b4 nits Awni Hannun 2025-03-04 13:38:48 -0800
  • 65aa2ec849 use a bool mask for attention (#1319) Awni Hannun 2025-03-04 12:47:32 -0800
  • 8b6beea3be faster generation Goekdeniz-Guelmez 2025-03-04 21:26:21 +0100
  • fd63c68280 clean up Goekdeniz-Guelmez 2025-03-04 21:24:13 +0100
  • bbde6ea4bc adding olmoe to training Goekdeniz-Guelmez 2025-03-04 21:08:55 +0100
  • ef8ec7a27a udpate ACKNOWLEDGMENTS.md Goekdeniz-Guelmez 2025-03-04 20:59:40 +0100
  • 10ee8fd718 initial commit Goekdeniz-Guelmez 2025-03-04 20:56:42 +0100
  • a67213a01c use a bool mask for attention Awni Hannun 2025-03-04 08:14:29 -0800
  • c817743333 Merge branch 'ml-explore:main' into adding-GRPO-training Gökdeniz Gülmez 2025-03-03 22:13:42 +0100
  • 2954398002 Merge branch 'ml-explore:main' into adding-dpo-training Gökdeniz Gülmez 2025-03-03 22:13:31 +0100
  • 3023ae0cd3 Merge branch 'ml-explore:main' into adding-orpo-training Gökdeniz Gülmez 2025-03-03 22:13:14 +0100
  • 1bc3476a46 chore(lora): Add real-time log buffering fix for nohup execution (#1311) Pierre-Louis 2025-03-03 09:12:33 -0500
  • 269faa5fa4 Fix plamo2 model to use rms_norm (#1308) Shunta Saito 2025-03-03 23:12:02 +0900
  • 925f5621b0 tool use example Awni Hannun 2025-03-01 11:33:49 -0800
  • 132225a018 updates Goekdeniz-Guelmez 2025-03-01 22:23:33 +0100
  • 60df71bcbc update YAML example file Goekdeniz-Guelmez 2025-03-01 15:01:52 +0100
  • eed093b0ec adding more customized YAML configuartion Goekdeniz-Guelmez 2025-03-01 15:00:29 +0100
  • b0a2edbcf3 initial commmit Goekdeniz-Guelmez 2025-03-01 14:56:06 +0100
  • c119a7a4a5 updates Goekdeniz-Guelmez 2025-03-01 12:47:13 +0100
  • bb261aadcb updates Goekdeniz-Guelmez 2025-03-01 12:42:39 +0100
  • c03dc8df1f run pre-commit hooks chaihahaha 2025-03-01 09:54:08 +0800
  • 0af315c8ea format code chaihahaha 2025-03-01 09:50:40 +0800
  • 66b630b4f6 Fix average_stats Angelos Katharopoulos 2025-02-28 16:00:06 -0800
  • 8aeea10901 Merge branch 'main' into adding-dpo-training Gökdeniz Gülmez 2025-02-28 22:10:56 +0100
  • 6a3912be7f Merge branch 'main' into adding-orpo-training Gökdeniz Gülmez 2025-02-28 22:10:21 +0100
  • 925e11439b updates Goekdeniz-Guelmez 2025-02-28 22:07:19 +0100
  • 80e10a59d7 Merge branch 'main' into adding-GRPO-training Gökdeniz Gülmez 2025-02-28 21:16:02 +0100
  • 845cd8c01e support kimi + more options in chat mode (#1312) Awni Hannun 2025-02-28 11:33:18 -0800
  • b2108a0de6 Allow mask prompt in config (#1314) Awni Hannun 2025-02-28 11:33:04 -0800
  • f96a25c7d8 chore(lora): running pre-commit hook Pierre-Louis Létoquart 2025-02-28 14:12:21 -0500
  • c1aca340b2 support kimi + more options in chat mode Awni Hannun 2025-02-28 07:54:00 -0800
  • a42122d85d Allow mask prompt in config Awni Hannun 2025-02-28 09:33:48 -0800
  • 3ddd41e288 chore(lora): remove python 3.7+ check Pierre-Louis 2025-02-28 11:26:54 -0500
  • ae0a39d1b4 chore(lora): Add real-time log buffering fix for nohup execution Pierre-Louis 2025-02-28 10:54:16 -0500
  • 15d53279ae batching fix Goekdeniz-Guelmez 2025-02-28 16:02:40 +0100
  • 313d4a2ac9 summarize segsum Goekdeniz-Guelmez 2025-02-28 15:04:03 +0100
  • a04eb02257 Merge branch 'ml-explore:main' into adding-GRPO-training Gökdeniz Gülmez 2025-02-28 11:18:32 +0100
  • 71d7e99199 Remove unused imports Shunta Saito 2025-02-28 04:05:27 +0900
  • 8924bdc546 Remove sliding window attention impl. cause it should be done by using RotatingKVCache Shunta Saito 2025-02-28 03:55:52 +0900
  • ab960f80dd Fix missing variable Shunta Saito 2025-02-28 01:31:15 +0900
  • 08a8dd2507 Fix plamo2 model to use rms_norm and enable sliding window attention Shunta Saito 2025-02-28 01:17:35 +0900
  • eb73549631 Generate: Support Prefill Response (#1299) madroid 2025-02-27 23:44:00 +0800
  • 109eb4e942 nits Awni Hannun 2025-02-27 07:39:15 -0800
  • f27ed26b32 Merge branch 'ml-explore:main' into adding-GRPO-training Gökdeniz Gülmez 2025-02-27 11:23:20 +0100
  • 9f9da6af23 Generate: rename prefill-prompt to prefill-response madroid 2025-02-27 12:27:08 +0800
  • 00a7379070 Fixes for phi4 mini (#1305) Awni Hannun 2025-02-26 16:21:54 -0800
  • 92daef882b Fixes for phi4 mini Awni Hannun 2025-02-26 16:11:47 -0800
  • 0f240a4c7e Use max tokens from options in mlx_lm evaluate (#1302) Awni Hannun 2025-02-26 15:46:16 -0800
  • 56e60ad5a6 fix manage for new transformers (#1304) Awni Hannun 2025-02-26 15:44:57 -0800
  • 953c8bf369 fix manage for new transformers Awni Hannun 2025-02-26 15:41:31 -0800
  • 87036dcfc4 Use max tokens from options in mlx_lm evaluate Awni Hannun 2025-02-26 12:34:36 -0800
  • b7f742ef56 Mixed quant recipes (#1300) Pedro Cuenca 2025-02-26 20:32:36 +0100
  • 3862581d57 format / nits Awni Hannun 2025-02-26 11:22:58 -0800
  • 932b196b48 updates Goekdeniz-Guelmez 2025-02-26 16:51:18 +0100
  • fab2dc2688 smoll fix Goekdeniz-Guelmez 2025-02-26 15:21:57 +0100
  • 61fad00892 updates Goekdeniz-Guelmez 2025-02-26 15:16:45 +0100
  • a683344450 correct segsum function Goekdeniz-Guelmez 2025-02-26 14:46:46 +0100
  • d20413a54d Add it to the readme and fix the rank printing in main Angelos Katharopoulos 2025-02-25 17:40:24 -0800
  • 14faec4ca2 Fix the throughput calculation Angelos Katharopoulos 2025-02-25 17:20:15 -0800
  • 8a76b421a0 Add distributed support in the CIFAR example Angelos Katharopoulos 2025-02-25 17:07:30 -0800
  • 216265bbb5 Mixed 3/6 and 2/6 recipes based on Alex Barron's Pedro Cuenca 2025-02-25 21:25:50 +0100
  • b7c0bdfd49 adding pytorch implementation Goekdeniz-Guelmez 2025-02-25 16:31:19 +0100
  • 42c3cd2084 Merge branch 'ml-explore:main' into adding-support-for-mamba2 Gökdeniz Gülmez 2025-02-25 13:27:45 +0100
  • a1c2ac2903 Merge branch 'ml-explore:main' into adding-orpo-training Gökdeniz Gülmez 2025-02-25 13:27:20 +0100
  • 3387e06ccd Merge branch 'ml-explore:main' into adding-dpo-training Gökdeniz Gülmez 2025-02-25 13:27:08 +0100