mirror of
https://github.com/ml-explore/mlx-examples.git
synced 2025-09-01 04:14:38 +08:00
YAML configuration for mlx_lm.lora (#503)
* Convert mlx_lm.lora to use YAML configuration * pre-commit run fixes * Fix loading of config file * Remove invalid YAML from doc * Update command-line options and YAML parameter overriding, per feedback in #503 * Minor wording change * Positional argument * Moved config to a (-c/--config) flag * Removed CLI option defaults (since CLI options take precedence and their defaults are in CONFIG_DEFAULTS) * pre-commit format updates * Fix handling of CLI option defaults * Prevent None values of unspecified CLI options from overwriting values from CONFIG_DEFAULTS * nits --------- Co-authored-by: Awni Hannun <awni@apple.com>
This commit is contained in:
50
llms/mlx_lm/examples/lora_config.yaml
Normal file
50
llms/mlx_lm/examples/lora_config.yaml
Normal file
@@ -0,0 +1,50 @@
---
# LoRA fine-tuning configuration for mlx_lm.lora (passed via -c/--config).
# CLI options take precedence over values in this file.

# The path to the local model directory or Hugging Face repo.
model: "mlx_model"

# Whether or not to train (boolean)
train: true

# Directory with {train, valid, test}.jsonl files
data: "/path/to/training/data"

# The PRNG seed
seed: 0

# Number of layers to fine-tune
lora_layers: 16

# Minibatch size.
batch_size: 4

# Iterations to train for.
iters: 100

# Number of validation batches, -1 uses the entire validation set.
val_batches: 25

# Adam learning rate.
# NOTE: written as 1.0e-5 (not 1e-5) so YAML 1.1 loaders such as PyYAML
# resolve it as a float; a bare "1e-5" loads as a string under PyYAML.
learning_rate: 1.0e-5

# Number of training steps between loss reporting.
steps_per_report: 10

# Number of training steps between validations.
steps_per_eval: 200

# Load path to resume training with the given adapter weights.
resume_adapter_file: null

# Save/load path for the trained adapter weights.
adapter_file: "adapters.npz"

# Save the model every N iterations.
save_every: 100

# Evaluate on the test set after training
test: false

# Number of test set batches, -1 uses the entire test set.
test_batches: 500

# Maximum sequence length.
max_seq_length: 2048
Reference in New Issue
Block a user