Mirror of https://github.com/ml-explore/mlx-examples.git (synced 2025-08-28 15:50:57 +08:00)

Updates to LoRA documentation

Commit 8ec802f468 (parent 14a75f3f03)
````diff
@@ -283,7 +283,7 @@ it on the command line. For example, pass `--data mlx-community/wikisql` to
 train on the pre-formatted WikiSQL data.
 
 Otherwise, provide a mapping of keys in the dataset to the features MLX LM
-expects. Use a YAML config to specify the Hugging Face dataset arguments. For
+expects. Use a YAML config to specify the Hugging Face (HF) dataset arguments. For
 example:
 
 ```yaml
````
````diff
@@ -295,11 +295,29 @@ hf_dataset:
 
 - Use `prompt_feature` and `completion_feature` to specify keys for a
   `completions` dataset. Use `text_feature` to specify the key for a `text`
-  dataset.
+  dataset. Use `chat_feature` to specify the key for a `chat` dataset.
 
 - To specify the train, valid, or test splits, set the corresponding
   `{train,valid,test}_split` argument.
 
+You can specify a list of HF datasets using the `hf_datasets` (plural) configuration, which is a list of records
+each with the same structure as above. For example:
+
+```yaml
+hf_datasets:
+- hf_dataset:
+    name: "Open-Orca/OpenOrca"
+    train_split: "train[:90%]"
+    valid_split: "train[-10%:]"
+    prompt_feature: "question"
+    completion_feature: "response"
+- hf_dataset:
+    name: "trl-lib/ultrafeedback_binarized"
+    train_split: "train[:90%]"
+    valid_split: "train[-10%:]"
+    chat_feature: "chosen"
+```
+
 - Arguments specified in `config` will be passed as keyword arguments to
   [`datasets.load_dataset`](https://huggingface.co/docs/datasets/v2.20.0/en/package_reference/loading_methods#datasets.load_dataset).
 
````
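As a rough illustration of the configuration this change documents, the sketch below walks a config and lists what might be handed to `datasets.load_dataset` for each split. It is not the MLX LM loader: `dataset_kind`, `plan_loads`, the precedence order of the feature keys, and the skipping of unset splits are all assumptions made for the example, and the config is written as a plain Python dict (`EXAMPLE_CONFIG`) rather than parsed from YAML.

```python
from __future__ import annotations

from typing import Any

# The same two datasets as the documented `hf_datasets` example, as a dict.
EXAMPLE_CONFIG: dict[str, Any] = {
    "hf_datasets": [
        {"hf_dataset": {
            "name": "Open-Orca/OpenOrca",
            "train_split": "train[:90%]",
            "valid_split": "train[-10%:]",
            "prompt_feature": "question",
            "completion_feature": "response",
        }},
        {"hf_dataset": {
            "name": "trl-lib/ultrafeedback_binarized",
            "train_split": "train[:90%]",
            "valid_split": "train[-10%:]",
            "chat_feature": "chosen",
        }},
    ]
}


def dataset_kind(record: dict[str, Any]) -> str:
    """Hypothetical helper: infer the dataset format from the feature keys."""
    # Assumed precedence: chat, then completions, then text.
    if "chat_feature" in record:
        return "chat"
    if "prompt_feature" in record and "completion_feature" in record:
        return "completions"
    if "text_feature" in record:
        return "text"
    raise ValueError(f"no feature keys set for dataset {record.get('name')!r}")


def plan_loads(config: dict[str, Any]) -> list[tuple[str, dict[str, Any]]]:
    """Hypothetical helper: return (kind, load_dataset kwargs) for each split."""
    # Accept either the plural list or a single `hf_dataset` record.
    if "hf_datasets" in config:
        records = [entry["hf_dataset"] for entry in config["hf_datasets"]]
    else:
        records = [config["hf_dataset"]]

    plans = []
    for record in records:
        kind = dataset_kind(record)
        for split_name in ("train", "valid", "test"):
            split = record.get(f"{split_name}_split")
            if split is None:
                continue  # this sketch skips splits that are not set
            kwargs = {
                "path": record["name"],
                "split": split,
                # Entries under `config` are forwarded as keyword arguments.
                **record.get("config", {}),
            }
            # A real loader could now call datasets.load_dataset(**kwargs);
            # the call is left out so the sketch runs without the library.
            plans.append((kind, kwargs))
    return plans


for kind, kwargs in plan_loads(EXAMPLE_CONFIG):
    print(kind, kwargs)
```

With `EXAMPLE_CONFIG`, this plans four loads: a train and a validation split for each dataset, the first pair tagged `completions` and the second `chat`.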