LoRA Adapters
Overview
Low-Rank Adaptation (LoRA) is a technique for efficiently fine-tuning large models without changing the base model's weights. LoRA adapters enable customization of model outputs for specific tasks, styles, or domains while requiring significantly fewer computational resources than full fine-tuning.
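As general background (not specific to OpenVINO GenAI), LoRA keeps the original weight matrix frozen and learns a small low-rank update. The adapted layer computes:

W' = W_0 + \frac{\alpha}{r} B A, \qquad B \in \mathbb{R}^{d \times r}, \; A \in \mathbb{R}^{r \times k}, \; r \ll \min(d, k)

Only B and A are trained, which is why adapter files are small. The alpha weights used in the examples below scale each adapter's contribution in a broadly similar way.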
OpenVINO GenAI provides built-in support for LoRA adapters in text generation and image generation pipelines. This capability allows you to dynamically switch between or combine multiple adapters without recompiling the model.
Info: See Supported Models for the list of models that support LoRA adapters.
Key Features
- Dynamic Adapter Application: Apply LoRA adapters at runtime without model recompilation.
- Multiple Adapter Support: Blend effects from multiple adapters with different weights.
- Adapter Switching: Change adapters between generation calls without pipeline reconstruction.
- Safetensors Format: Support for industry-standard safetensors format for adapter files.
Using LoRA Adapters
Python
import openvino_genai as ov_genai

# Path to the OpenVINO-converted base model directory
model_path = "path/to/openvino_model"

# Add multiple adapters with different weights
adapter_config = ov_genai.AdapterConfig()
adapter1 = ov_genai.Adapter("path/to/lora1.safetensors")
adapter2 = ov_genai.Adapter("path/to/lora2.safetensors")
adapter_config.add(adapter1, alpha=0.5)
adapter_config.add(adapter2, alpha=0.5)

# Initialize the pipeline with the adapters
pipe = ov_genai.LLMPipeline(
    model_path,
    "CPU",
    adapters=adapter_config
)

# Generate with the current adapters
output1 = pipe.generate("Generate story about", max_new_tokens=100)

# Switch to a different adapter configuration for this call
new_config = ov_genai.AdapterConfig()
new_config.add(adapter1, alpha=1.0)
output2 = pipe.generate(
    "Generate story about",
    max_new_tokens=100,
    adapters=new_config
)
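An adapter configuration passed to generate() only affects that call. Based on the pattern above, passing an empty AdapterConfig should run the base model without any adapter for a single call; a short sketch continuing the Python example:

# Run the base model without any adapter for this call only
output_base = pipe.generate(
    "Generate story about",
    max_new_tokens=100,
    adapters=ov_genai.AdapterConfig()
)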
C++

#include <string>
#include "openvino/genai/llm_pipeline.hpp"

int main() {
    // Path to the OpenVINO-converted base model directory
    std::string model_path = "path/to/openvino_model";

    // Add multiple adapters with different weights
    ov::genai::AdapterConfig adapter_config;
    ov::genai::Adapter adapter1("path/to/lora1.safetensors");
    ov::genai::Adapter adapter2("path/to/lora2.safetensors");
    adapter_config.add(adapter1, 0.5f);
    adapter_config.add(adapter2, 0.5f);

    // Initialize the pipeline with the adapters
    ov::genai::LLMPipeline pipe(
        model_path,
        "CPU",
        ov::genai::adapters(adapter_config)
    );

    // Generate with the current adapters
    auto output1 = pipe.generate("Generate story about", ov::genai::max_new_tokens(100));

    // Switch to a different adapter configuration for this call
    ov::genai::AdapterConfig new_config;
    new_config.add(adapter1, 1.0f);
    auto output2 = pipe.generate(
        "Generate story about",
        ov::genai::adapters(new_config),
        ov::genai::max_new_tokens(100)
    );

    return 0;
}
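The same AdapterConfig mechanism applies to image generation pipelines. The sketch below is a minimal Python example assuming a Stable Diffusion model converted for OpenVINO and a style LoRA in safetensors format; the file paths, prompt, and generation parameters are placeholders.

Python

import openvino_genai as ov_genai
from PIL import Image

# Placeholder paths: a converted Stable Diffusion model and a style LoRA
adapter = ov_genai.Adapter("path/to/style_lora.safetensors")
adapter_config = ov_genai.AdapterConfig()
adapter_config.add(adapter, alpha=0.8)

pipe = ov_genai.Text2ImagePipeline("path/to/sd_model", "CPU", adapters=adapter_config)

image_tensor = pipe.generate(
    "a watercolor painting of a lighthouse",
    width=512,
    height=512,
    num_inference_steps=20
)
Image.fromarray(image_tensor.data[0]).save("result.png")

As with text generation, a different AdapterConfig (or an empty one) can be passed to an individual generate() call to change or disable the style for that image.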
LoRA Adapter Sources
- Hugging Face: Browse adapters for various models at huggingface.co/models using the "LoRA" filter; a short download sketch follows this list.
- Civitai: For Stable Diffusion models, Civitai offers a wide range of LoRA adapters for various styles and subjects.
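To use an adapter hosted on the Hugging Face Hub, the safetensors file can be downloaded with the huggingface_hub package and passed to ov_genai.Adapter. The repository and file names below are placeholders, not a real adapter:

Python

from huggingface_hub import hf_hub_download
import openvino_genai as ov_genai

# Placeholder repository and file names - replace with a real LoRA repository
lora_path = hf_hub_download(
    repo_id="some-user/some-lora-adapter",
    filename="adapter_model.safetensors"
)
adapter = ov_genai.Adapter(lora_path)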