LoRA on all linear transformer block layers (#546)

* Add --lora-all-linear option to apply LoRA to all linear transformer block layers

* Moved to YAML config and added specification of rank & alpha

* nits in config, more tests

* nit

* run tests for PRs

---------

Co-authored-by: Awni Hannun <awni@apple.com>
Chime Ogbuji
2024-03-12 10:37:40 -04:00
committed by GitHub
parent fe5edee360
commit e56d9015ef
8 changed files with 163 additions and 40 deletions


@@ -48,3 +48,12 @@ test_batches: 500
# Maximum sequence length.
max_seq_length: 2048
# LoRA parameters can only be specified in a config file
lora_parameters:
  # The layer keys to apply LoRA to.
  # These will be applied for the last lora_layers
  keys: ["self_attn.q_proj", "self_attn.v_proj"]
  rank: 8
  alpha: 16.0
  scale: 10.0
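
As a rough sketch (not taken from this diff), the same lora_parameters block could be extended to cover every linear layer in a transformer block, which is what the original --lora-all-linear option targeted. The attention and MLP module names below assume a Llama-style model and may differ for other architectures:

# Hypothetical config: apply LoRA to all linear layers in the block.
# Module names assume a Llama-style model (an assumption, not part of
# the commit above); adjust keys to match the target architecture.
lora_parameters:
  keys:
    - "self_attn.q_proj"
    - "self_attn.k_proj"
    - "self_attn.v_proj"
    - "self_attn.o_proj"
    - "mlp.gate_proj"
    - "mlp.up_proj"
    - "mlp.down_proj"
  rank: 8
  alpha: 16.0
  scale: 10.0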