Image Generation Using Diffusers

Convert and Optimize Model

Download and convert a model (e.g. stabilityai/stable-diffusion-xl-base-1.0) from Hugging Face to the OpenVINO format:

optimum-cli export openvino --model stabilityai/stable-diffusion-xl-base-1.0 --weight-format int4 --trust-remote-code stable_diffusion_xl_base_1_0_ov

See all supported Image Generation Models.

info

Refer to the Model Preparation guide for detailed instructions on how to download, convert and optimize models for OpenVINO GenAI.

Run Model Using OpenVINO GenAI

OpenVINO GenAI supports the following diffusion model pipelines:

Text2ImagePipeline

import openvino_genai as ov_genai
from PIL import Image

# Directory with the converted model, e.g. the output of the export step above
model_path = "stable_diffusion_xl_base_1_0_ov"
prompt = "a photo of an astronaut riding a horse on mars"  # illustrative prompt

pipe = ov_genai.Text2ImagePipeline(model_path, "CPU")
image_tensor = pipe.generate(prompt)

image = Image.fromarray(image_tensor.data[0])
image.save("image.bmp")
tip

Use "CPU" or "GPU" as the device without any other code changes.

Image2ImagePipeline

import openvino_genai as ov_genai
import openvino as ov
from PIL import Image
import numpy as np

def read_image(path: str) -> ov.Tensor:
    pic = Image.open(path).convert("RGB")
    image_data = np.array(pic)[None]  # [None] adds a batch dimension
    return ov.Tensor(image_data)

input_image_data = read_image("input_image.jpg")

pipe = ov_genai.Image2ImagePipeline(model_path, "CPU")
image_tensor = pipe.generate(prompt, image=input_image_data, strength=0.8)

image = Image.fromarray(image_tensor.data[0])
image.save("image.bmp")

InpaintingPipeline

import openvino_genai as ov_genai
import openvino as ov
from PIL import Image
import numpy as np

def read_image(path: str) -> ov.Tensor:
    pic = Image.open(path).convert("RGB")
    image_data = np.array(pic)[None]
    return ov.Tensor(image_data)

input_image_data = read_image("input_image.jpg")
mask_image = read_image("mask.jpg")

pipe = ov_genai.InpaintingPipeline(model_path, "CPU")
image_tensor = pipe.generate(prompt, image=input_image_data, mask_image=mask_image)

image = Image.fromarray(image_tensor.data[0])
image.save("image.bmp")

Additional Usage Options

tip

Check out Python and C++ image generation samples.

Use Different Generation Parameters

Generation Configuration Workflow

  1. Get the model default config with get_generation_config()
  2. Modify parameters
  3. Apply the updated config using one of the following methods:
    • Use set_generation_config(config)
    • Pass config directly to generate() (e.g. generate(prompt, config))
    • Specify options as inputs in the generate() method (e.g. generate(prompt, num_inference_steps=30))

Image Generation Configuration

You can adjust several parameters to control the image generation process, including dimensions and the number of inference steps:

import openvino_genai as ov_genai
from PIL import Image

pipe = ov_genai.Text2ImagePipeline(model_path, "CPU")
image_tensor = pipe.generate(
    prompt,
    width=512,
    height=512,
    num_images_per_prompt=1,
    num_inference_steps=30,
    guidance_scale=7.5
)

image = Image.fromarray(image_tensor.data[0])
image.save("image.bmp")
Understanding Image Generation Parameters
  • width: The width of the resulting image(s).
  • height: The height of the resulting image(s).
  • num_images_per_prompt: Specifies how many image variations to generate in a single request for the same prompt.
  • num_inference_steps: Defines denoising iteration count. Higher values increase quality and generation time, lower values generate faster with less detail.
  • guidance_scale: Balances prompt adherence vs. creativity. Higher values follow prompt more strictly, lower values allow more creative freedom.
  • rng_seed: Controls randomness for reproducible results. Same seed produces identical images across runs.

For the full list of generation parameters, refer to the Image Generation Config API.

Working with LoRA Adapters

For image generation models like Stable Diffusion, LoRA adapters can modify the generation process to produce images with specific artistic styles, content types, or quality enhancements.

Refer to the LoRA Adapters guide for more details on working with LoRA adapters.