FLUX.1 [dev] Fine-tuned with Leaf Images

FLUX.1 [dev] is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions.

Install diffusers

pip install -U diffusers

Model description

These are LoRA adaptation weights for the FLUX.1 [dev] model (black-forest-labs/FLUX.1-dev). The base model is gated, so you must first get access to it before loading this LoRA adapter.

This LoRA adapter has rank=64 and alpha=64 and was trained for 4,000 steps. Earlier checkpoints are also available in this repository; you can load them by passing their file name as the weight_name parameter (see the example below).
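
For context, in the standard LoRA formulation the low-rank update is scaled by alpha/rank, so rank=64 and alpha=64 correspond to a scale of 1.0. The sketch below works through that arithmetic and the number of parameters one adapted layer would add. The 3072x3072 layer shape and the helper name lora_param_count are assumptions chosen for illustration; they are not taken from this repository's training code.

```python
# Minimal sketch of standard LoRA arithmetic (not this repository's training code).
rank = 64
alpha = 64

# In standard LoRA the adapted weight is W + (alpha / rank) * (B @ A).
scale = alpha / rank

def lora_param_count(d_in, d_out, r):
    """Parameters added by one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)

# Hypothetical layer shape, used only for illustration.
d_in, d_out = 3072, 3072
added = lora_param_count(d_in, d_out, rank)
full = d_in * d_out

print(f"scale = {scale}")
print(f"LoRA parameters for this layer: {added} (full weight: {full})")
```

With these assumed dimensions the adapter adds roughly 4% of the parameters of the full weight matrix for that layer.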

Trigger keywords

The training images were associated with the keyword <leaf microstructure> during fine-tuning.

Dataset used for training: lamm-mit/leaf-flux-images-and-captions

You should use <leaf microstructure> to trigger this feature during image generation.
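
If prompts are assembled programmatically, a small guard can make sure the trigger keyword is present. This is a minimal sketch; the helper name ensure_trigger and the choice to append the keyword at the end are assumptions for illustration, and where the keyword sits in the prompt may affect the result.

```python
TRIGGER = "<leaf microstructure>"

def ensure_trigger(prompt, trigger=TRIGGER):
    """Return the prompt unchanged if it contains the trigger, else append the trigger."""
    if trigger in prompt:
        return prompt
    return f"{prompt.rstrip()} {trigger}"

# Already contains the keyword: returned as-is
print(ensure_trigger("A cube that looks like a <leaf microstructure>."))

# Missing the keyword: it is appended
print(ensure_trigger("A vase on a wooden table, with a surface inspired by "))
```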

How to use

Defining some helper functions:

import os
from datetime import datetime
from PIL import Image

def generate_filename(base_name, extension=".png"):
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return f"{base_name}_{timestamp}{extension}"

def save_image(image, directory, base_name="image_grid"):
    filename = generate_filename(base_name)
    file_path = os.path.join(directory, filename)
    image.save(file_path)
    print(f"Image saved as {file_path}")

def image_grid(imgs, rows, cols, save=True, save_dir='generated_images', base_name="image_grid",
               save_individual_files=False):
    """Paste imgs into a rows x cols grid, optionally saving the grid and each image."""
    assert len(imgs) == rows * cols, "Number of images must equal rows * cols"

    # Create the output directory only when one is given
    if save_dir:
        os.makedirs(save_dir, exist_ok=True)

    w, h = imgs[0].size
    grid = Image.new('RGB', size=(cols * w, rows * h))

    for i, img in enumerate(imgs):
        # Fill the grid row by row, left to right
        grid.paste(img, box=(i % cols * w, i // cols * h))
        if save_individual_files and save_dir:
            save_image(img, save_dir, base_name=base_name+f'_{i}-of-{len(imgs)}_')

    if save and save_dir:
        save_image(grid, save_dir, base_name)

    return grid
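
To make the paste positions used in image_grid concrete, the following sketch computes the same top-left coordinates without touching any images. The function name grid_boxes is hypothetical and exists only for this illustration.

```python
def grid_boxes(n_images, cols, w, h):
    """Top-left paste coordinates for each image, filled row by row."""
    return [((i % cols) * w, (i // cols) * h) for i in range(n_images)]

# Four 1024x1024 images arranged in a 2x2 grid
boxes = grid_boxes(4, cols=2, w=1024, h=1024)
print(boxes)  # [(0, 0), (1024, 0), (0, 1024), (1024, 1024)]
```

The column index comes from i % cols and the row index from i // cols, which is why the number of images must equal rows * cols.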

Text-to-image

Model loading:

from diffusers import FluxPipeline
import torch

repo_id = 'lamm-mit/leaf-L-FLUX.1-dev'

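pipeline = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
)
# max_sequence_length is a generation-time argument, not a loading option;
# pass it in the pipeline(...) call if you need to change it.

# Optional: uncomment to save VRAM by offloading model components to the CPU.
# If you enable offloading, skip the pipeline.to('cuda') call below.
#pipeline.enable_model_cpu_offload()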

adapter='leaf-flux.safetensors' #Step 4000, final step
#adapter='leaf-flux-step-3000.safetensors' #Step 3000
#adapter='leaf-flux-step-3500.safetensors' #Step 3500

pipeline.load_lora_weights(repo_id, weight_name=adapter) #You need to use the weight_name parameter since the repo includes multiple checkpoints

pipeline=pipeline.to('cuda')
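
When comparing checkpoints, it can help to pick the weight file by training step rather than editing the adapter string by hand. The sketch below is a hypothetical helper (adapter_for_step is not part of diffusers or this repository) that only knows the three file names listed in the example above.

```python
# Checkpoint file names as listed in the loading example; no others are assumed.
KNOWN_CHECKPOINTS = {
    4000: "leaf-flux.safetensors",  # final step
    3500: "leaf-flux-step-3500.safetensors",
    3000: "leaf-flux-step-3000.safetensors",
}

def adapter_for_step(step):
    """Return the weight file name for a known training step."""
    if step not in KNOWN_CHECKPOINTS:
        known = ", ".join(str(s) for s in sorted(KNOWN_CHECKPOINTS))
        raise ValueError(f"Unknown step {step}; known steps: {known}")
    return KNOWN_CHECKPOINTS[step]

print(adapter_for_step(3500))

# Intended use with the pipeline from the example above:
# pipeline.load_lora_weights(repo_id, weight_name=adapter_for_step(3500))
```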

Image generation - Example #1:

prompt="""Generate a futuristic, eco-friendly architectural concept utilizing a biomimetic composite material that integrates the structural efficiency of spider silk with the adaptive porosity of plant tissues. Utilize the following key features:

* Fibrous architecture inspired by spider silk, represented by sinuous lines and curved forms.
* Interconnected, spherical nodes reminiscent of plant cell walls, emphasizing growth and adaptation.
* Open cellular structures echoing the permeable nature of plant leaves, suggesting dynamic exchanges and self-regulation capabilities.
* Gradations of opacity and transparency inspired by the varying densities found in plant tissues, highlighting functional differentiation and multi-functionality.
"""

num_samples = 2   # images per pipeline call (grid columns)
num_rows = 2      # number of pipeline calls (grid rows)
n_steps = 25
guidance_scale = 3.5
all_images = []
for _ in range(num_rows):
    images = pipeline(
        prompt,
        num_inference_steps=n_steps,
        num_images_per_prompt=num_samples,
        guidance_scale=guidance_scale,
    ).images
    all_images.extend(images)

grid = image_grid(all_images, num_rows, num_samples, save_individual_files=True)
grid

Image generation - Example #2:

prompt="""A cube that looks like a <leaf microstructure>, with a wrap-around sign that says 'MATERIOMICS'. 

The cube is placed in a stunning mountain landscape with snow.

The photo is taken with a Sony A1 camera, bokeh, during the golden hour.
"""

num_samples = 1   # images per pipeline call (grid columns)
num_rows = 1      # number of pipeline calls (grid rows)
n_steps = 25
guidance_scale = 5.
all_images = []
for _ in range(num_rows):
    images = pipeline(
        prompt,
        num_inference_steps=n_steps,
        num_images_per_prompt=num_samples,
        guidance_scale=guidance_scale,
        height=1024, width=1920,
    ).images
    all_images.extend(images)

grid = image_grid(all_images, num_rows, num_samples, save_individual_files=True)
grid

Image generation - Example #3 (different aspect ratio, e.g. 1024x1920):

prompt="""A sign with letters inspired by the patterns in <leaf microstructure>, it says "MATERIOMICS".
The sign is placed in a stunning mountain landscape with snow. The photo is taken with a Sony A1 camera, bokeh, during the golden hour.
"""

num_samples = 1   # images per pipeline call (grid columns)
num_rows = 1      # number of pipeline calls (grid rows)
n_steps = 25
guidance_scale = 5.
all_images = []
for _ in range(num_rows):
    images = pipeline(
        prompt,
        num_inference_steps=n_steps,
        num_images_per_prompt=num_samples,
        guidance_scale=guidance_scale,
        height=1024, width=1920,
    ).images
    all_images.extend(images)

grid = image_grid(all_images, num_rows, num_samples, save_individual_files=True)
grid
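
When trying other aspect ratios, latent diffusion pipelines generally expect height and width to be multiples of a fixed factor. The sketch below assumes a factor of 16 for FLUX; that value is an assumption you should verify against your installed diffusers version, and the helper name snap_to_multiple is hypothetical.

```python
def snap_to_multiple(value, multiple=16):
    """Round value down to the nearest multiple, never below one multiple."""
    return max(multiple, (value // multiple) * multiple)

# The sizes used in the examples are already multiples of 16
print(snap_to_multiple(1024), snap_to_multiple(1920))

# A size that is not a multiple of 16 gets rounded down
print(snap_to_multiple(1080))
```

The resulting values can then be passed as the height and width arguments of the pipeline call.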

Image generation - Example #4:

prompt="""A cube that looks like a leaf microstructure, placed in a stunning mountain landscape with snow.

The photo is taken with a Sony A1 camera, bokeh, during the golden hour.
"""

num_samples = 2   # images per pipeline call (grid columns)
num_rows = 2      # number of pipeline calls (grid rows)
n_steps = 25
guidance_scale = 15.
all_images = []
for _ in range(num_rows):
    images = pipeline(
        prompt,
        num_inference_steps=n_steps,
        num_images_per_prompt=num_samples,
        guidance_scale=guidance_scale,
        height=1024, width=1024,
    ).images
    all_images.extend(images)

grid = image_grid(all_images, num_rows, num_samples, save_individual_files=True)
grid

Image generation - Example #5:

prompt="""A jar of round <leaf microstructure> cookies with a piece of white tape that says "Materiomics Cookies". Looks tasty. Old fashioned.
"""

num_samples = 2   # images per pipeline call (grid columns)
num_rows = 2      # number of pipeline calls (grid rows)
n_steps = 25
guidance_scale = 15.
all_images = []
for _ in range(num_rows):
    images = pipeline(
        prompt,
        num_inference_steps=n_steps,
        num_images_per_prompt=num_samples,
        guidance_scale=guidance_scale,
        height=1024, width=1024,
    ).images
    all_images.extend(images)

grid = image_grid(all_images, num_rows, num_samples, save_individual_files=True)
grid

@article{LuLuuBuehler2024,
  title={Fine-tuning large language models for domain adaptation: Exploration of training strategies, scaling, model merging and synergistic capabilities},
  author={Wei Lu and Rachel K. Luu and Markus J. Buehler},
  journal={arXiv: https://arxiv.org/abs/2409.03444},
  year={2024},
}
