Add support for ibm granite (#758)

* add support for granite 3-8B config

* add gpt_bigcode

* add positional embedding condition.

* remove unused function

* rebase fix

* move position embedding to mask creation (see the sketch after this list)

* add to tuner and format

* refactor mask

* remove dropout layers
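
The messages above describe moving the learned positional embedding into the same step that builds the attention mask for the gpt_bigcode-style Granite models. A minimal sketch of that idea in MLX follows; the function name, the `wpe` embedding table, and reading the generation offset from `cache[0]` are illustrative assumptions, not the code from this commit.

# Illustrative sketch only; `wpe` and the cache `offset` field are assumptions.
import mlx.core as mx
import mlx.nn as nn

def add_positions_and_mask(x, wpe: nn.Embedding, cache=None):
    """Add learned absolute position embeddings while building the causal mask."""
    L = x.shape[1]
    offset = cache[0].offset if cache is not None else 0  # tokens already in the KV cache

    mask = None
    if L > 1:
        # additive causal mask for prompt processing; single-token decode needs none
        mask = nn.MultiHeadAttention.create_additive_causal_mask(L).astype(x.dtype)

    # absolute positions continue from the cache offset during generation
    x = x + wpe(mx.arange(offset, offset + L))
    return x, mask
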
Author: Prince Canuma
Date: 2024-05-22 05:16:31 +02:00
Committed by: GitHub
Parent: 9fc6efbd90
Commit: b044ce2acf
4 changed files with 238 additions and 20 deletions


@@ -106,6 +106,9 @@ def linear_to_lora_layers(
     if model.model_type == "qwen2_moe":
         keys.add("mlp.gate")
         keys.add("mlp.shared_expert_gate")
+    elif model.model_type == "gpt_bigcode":
+        keys = set(["attn.c_attn"])
     elif model.model_type == "olmo":
         keys = set(["att_proj"])
     elif model.model_type == "openelm":