FLUX: Optimize dataset loading logic (#1038)
@@ -21,8 +21,9 @@ The dependencies are minimal, namely:
 - `huggingface-hub` to download the checkpoints.
 - `regex` for the tokenization
-- `tqdm`, `PIL`, and `numpy` for the `txt2image.py` script
+- `tqdm`, `PIL`, and `numpy` for the scripts
 - `sentencepiece` for the T5 tokenizer
+- `datasets` for using an HF dataset directly

 You can install all of the above with the `requirements.txt` as follows:
@@ -118,17 +119,12 @@ Finetuning
 The `dreambooth.py` script supports LoRA finetuning of FLUX-dev (and schnell
-but ymmv) on a provided image dataset. The dataset folder must have an
-`index.json` file with the following format:
+but ymmv) on a provided image dataset. The dataset folder must have a
+`train.jsonl` file with the following format:

-```json
-{
-    "data": [
-        {"image": "path-to-image-relative-to-dataset", "text": "Prompt to use with this image"},
-        {"image": "path-to-image-relative-to-dataset", "text": "Prompt to use with this image"},
-        {"image": "path-to-image-relative-to-dataset", "text": "Prompt to use with this image"},
-        ...
-    ]
-}
+```jsonl
+{"image": "path-to-image-relative-to-dataset", "prompt": "Prompt to use with this image"}
+{"image": "path-to-image-relative-to-dataset", "prompt": "Prompt to use with this image"}
+...
 ```

 The training script by default trains for 600 iterations with a batch size of
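The `train.jsonl` format introduced above can be read with a few lines of standard-library Python. The following is a minimal sketch, not the script's actual loader; the helper name `load_local_dataset` is illustrative:

```python
import json
from pathlib import Path

def load_local_dataset(dataset_path):
    # Illustrative helper, not the actual loader in dreambooth.py.
    root = Path(dataset_path)
    samples = []
    with open(root / "train.jsonl") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            entry = json.loads(line)  # one JSON object per line
            # "image" is resolved relative to the dataset folder
            samples.append((root / entry["image"], entry["prompt"]))
    return samples
```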
@@ -150,19 +146,15 @@ The training images are the following 5 images [^2]:
 

-We start by making the following `index.json` file and placing it in the same
+We start by making the following `train.jsonl` file and placing it in the same
 folder as the images.

-```json
-{
-    "data": [
-        {"image": "00.jpg", "text": "A photo of sks dog"},
-        {"image": "01.jpg", "text": "A photo of sks dog"},
-        {"image": "02.jpg", "text": "A photo of sks dog"},
-        {"image": "03.jpg", "text": "A photo of sks dog"},
-        {"image": "04.jpg", "text": "A photo of sks dog"}
-    ]
-}
+```jsonl
+{"image": "00.jpg", "prompt": "A photo of sks dog"}
+{"image": "01.jpg", "prompt": "A photo of sks dog"}
+{"image": "02.jpg", "prompt": "A photo of sks dog"}
+{"image": "03.jpg", "prompt": "A photo of sks dog"}
+{"image": "04.jpg", "prompt": "A photo of sks dog"}
 ```

 Subsequently we finetune FLUX using the following command:
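Rather than typing the file by hand, an equivalent `train.jsonl` can be generated programmatically. A short sketch, assuming the images sit in the placeholder dataset folder used in the command below:

```python
import json
from pathlib import Path

dataset = Path("path/to/dreambooth/dataset/dog6")

# Write one JSON object per line, matching the jsonl format above.
with open(dataset / "train.jsonl", "w") as f:
    for i in range(5):
        record = {"image": f"{i:02d}.jpg", "prompt": "A photo of sks dog"}
        f.write(json.dumps(record) + "\n")
```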
@@ -175,6 +167,17 @@ python dreambooth.py \
 path/to/dreambooth/dataset/dog6
 ```

+Or you can directly use the pre-processed Hugging Face dataset [mlx-community/dreambooth-dog6](https://huggingface.co/datasets/mlx-community/dreambooth-dog6) for fine-tuning.
+
+```shell
+python dreambooth.py \
+  --progress-prompt 'A photo of an sks dog lying on the sand at a beach in Greece' \
+  --progress-every 600 --iterations 1200 --learning-rate 0.0001 \
+  --lora-rank 4 --grad-accumulate 8 \
+  mlx-community/dreambooth-dog6
+```
+
 The training requires approximately 50GB of RAM and on an M2 Ultra it takes a
 bit more than 1 hour.
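Before training against the Hugging Face dataset, it can be inspected locally with the `datasets` package. A minimal sketch, assuming a standard `train` split and that the column names mirror the local `image`/`prompt` fields:

```python
from datasets import load_dataset

# Download (and cache) the pre-processed DreamBooth dog dataset.
ds = load_dataset("mlx-community/dreambooth-dog6", split="train")

print(ds)  # row count and column names
sample = ds[0]
# Assumed to mirror the local jsonl fields:
print(sample["prompt"])
```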