Supported Models

Models Compatibility

Other models with similar architectures may also work successfully even if not explicitly validated. Consider testing any unlisted models to verify compatibility with your specific use case.

Large Language Models (LLMs)

LoRA Support

LLM pipeline supports LoRA adapters.

Architecture	Models	Example HuggingFace Models
`AquilaModel`	Aquila	BAAI/Aquila-7B BAAI/AquilaChat-7B BAAI/Aquila2-7B BAAI/AquilaChat2-7B
`ArcticForCausalLM`	Snowflake	Snowflake/snowflake-arctic-instruct Snowflake/snowflake-arctic-base
`BaichuanForCausalLM`	Baichuan2	baichuan-inc/Baichuan2-7B-Chat baichuan-inc/Baichuan2-13B-Chat
`BloomForCausalLM`	Bloom	bigscience/bloom-560m bigscience/bloom-1b1 bigscience/bloom-1b7 bigscience/bloom-3b bigscience/bloom-7b1
`BloomForCausalLM`	Bloomz	bigscience/bloomz-560m bigscience/bloomz-1b1 bigscience/bloomz-1b7 bigscience/bloomz-3b bigscience/bloomz-7b1
`ChatGLMModel`	ChatGLM	THUDM/chatglm2-6b THUDM/chatglm3-6b THUDM/glm-4-9b THUDM/glm-4-9b-chat
`CodeGenForCausalLM`	CodeGen	Salesforce/codegen-350m-multi Salesforce/codegen-2B-multi Salesforce/codegen-6B-multi Salesforce/codegen-16B-multi Salesforce/codegen-350m-mono Salesforce/codegen-2B-mono Salesforce/codegen-6B-mono Salesforce/codegen-16B-mono Salesforce/codegen2-1B_P Salesforce/codegen2-3_7B_P Salesforce/codegen2-7B_P Salesforce/codegen2-16B_P
`CohereForCausalLM`	Aya	CohereLabs/aya-23-8B CohereLabs/aya-expanse-8b CohereLabs/aya-23-35B
`CohereForCausalLM`	C4AI Command R	CohereLabs/c4ai-command-r7b-12-2024 CohereLabs/c4ai-command-r-v01
`DbrxForCausalLM`	DBRX	databricks/dbrx-instruct databricks/dbrx-base
`DeciLMForCausalLM`	DeciLM	Deci/DeciLM-7B Deci/DeciLM-7B-instruct
`DeepseekForCausalLM`	DeepSeek-MoE	deepseek-ai/deepseek-moe-16b-base deepseek-ai/deepseek-moe-16b-chat
`DeepseekV2ForCausalLM`	DeepSeekV2	deepseek-ai/DeepSeek-V2-Lite deepseek-ai/DeepSeek-V2-Lite-Chat deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct
`DeepseekV3ForCausalLM`	DeepSeekV3	deepseek-ai/DeepSeek-V3 deepseek-ai/DeepSeek-V3-Base deepseek-ai/DeepSeek-R1 deepseek-ai/DeepSeek-R1-Zero
`ExaoneForCausalLM`	Exaone	LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct LGAI-EXAONE/EXAONE-3.5-32B-Instruct LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct
`Exaone4ForCausalLM`	Exaone 4.0	LGAI-EXAONE/EXAONE-4.0-1.2B
`FalconForCausalLM`	Falcon	tiiuae/falcon-11B tiiuae/falcon-7b tiiuae/falcon-7b-instruct tiiuae/falcon-40b tiiuae/falcon-40b-instruct
`GemmaForCausalLM`	Gemma	google/gemma-2b google/gemma-2b-it google/gemma-1.1-2b-it google/codegemma-2b google/codegemma-1.1-2b google/gemma-7b google/gemma-7b-it google/gemma-1.1-7b-it google/codegemma-7b google/codegemma-7b-it google/codegemma-1.1-7b-it
`Gemma2ForCausalLM`	Gemma2	google/gemma-2-2b google/gemma-2-2b-it google/gemma-2-9b google/gemma-2-9b-it google/gemma-2-27b google/gemma-2-27b-it
`Gemma3ForCausalLM`	Gemma3	google/gemma-3-270m google/gemma-3-270m-it google/gemma-3-1b-it google/gemma-3-1b-pt
`GlmForCausalLM`	GLM	THUDM/glm-edge-1.5b-chat THUDM/glm-edge-4b-chat THUDM/glm-4-9b-hf THUDM/glm-4-9b-chat-hf THUDM/glm-4-9b-chat-1m-hf
`GPT2LMHeadModel`	GPT2	openai-community/gpt2 openai-community/gpt2-medium openai-community/gpt2-large openai-community/gpt2-xl distilbert/distilgpt2
`GPT2LMHeadModel`	CodeParrot	codeparrot/codeparrot-small codeparrot/codeparrot-small-code-to-text codeparrot/codeparrot-small-text-to-code codeparrot/codeparrot-small-multi codeparrot/codeparrot
`GPTBigCodeForCausalLM`	StarCoder	bigcode/starcoderbase-1b bigcode/starcoderbase-3b bigcode/starcoderbase-7b bigcode/starcoderbase bigcode/starcoder bigcode/octocoder HuggingFaceH4/starchat-alpha HuggingFaceH4/starchat-beta
`GPTJForCausalLM`	GPT-J	EleutherAI/gpt-j-6b crumb/Instruct-GPT-J
`GPTNeoForCausalLM`	GPT Neo	EleutherAI/gpt-neo-1.3B EleutherAI/gpt-neo-2.7B
`GPTNeoXForCausalLM`	GPT NeoX	EleutherAI/gpt-neox-20b
	Dolly	databricks/dolly-v2-3b databricks/dolly-v2-7b databricks/dolly-v2-12b
	RedPajama	ikala/redpajama-3b-chat togethercomputer/RedPajama-INCITE-Chat-3B-v1 togethercomputer/RedPajama-INCITE-Instruct-3B-v1 togethercomputer/RedPajama-INCITE-7B-Chat togethercomputer/RedPajama-INCITE-7B-Instruct
`GPTNeoXJapaneseForCausalLM`	GPT NeoX Japanese	abeja/gpt-neox-japanese-2.7b
`GptOssForCausalLM`	GPT-OSS	openai/gpt-oss-20b
`GraniteForCausalLM`	Granite	ibm-granite/granite-3.2-2b-instruct ibm-granite/granite-3.2-8b-instruct ibm-granite/granite-3.1-2b-instruct ibm-granite/granite-3.1-8b-instruct ibm-granite/granite-3.0-2b-instruct ibm-granite/granite-3.0-8b-instruct
`GraniteMoeForCausalLM`	GraniteMoE	ibm-granite/granite-3.1-1b-a400m-instruct ibm-granite/granite-3.1-3b-a800m-instruct ibm-granite/granite-3.0-1b-a400m-instruct ibm-granite/granite-3.0-3b-a800m-instruct
`InternLMForCausalLM`	InternLM	internlm/internlm-chat-7b internlm/internlm-7b
`InternLM2ForCausalLM`	InternLM2	internlm/internlm2-chat-1_8b internlm/internlm2-1_8b internlm/internlm2-chat-7b internlm/internlm2-7b internlm/internlm2-chat-20b internlm/internlm2-20b internlm/internlm2_5-1_8b-chat internlm/internlm2_5-1_8b internlm/internlm2_5-7b-chat internlm/internlm2_5-7b internlm/internlm2_5-20b-chat internlm/internlm2_5-20b
`JAISLMHeadModel`	Jais	inceptionai/jais-13b-chat inceptionai/jais-13b
`LlamaForCausalLM`	Llama 3	meta-llama/Llama-3.2-1B meta-llama/Llama-3.2-1B-Instruct meta-llama/Llama-3.2-3B meta-llama/Llama-3.2-3B-Instruct meta-llama/Llama-3.1-8B meta-llama/Llama-3.1-8B-Instruct meta-llama/Meta-Llama-3-8B meta-llama/Meta-Llama-3-8B-Instruct meta-llama/Llama-3.3-70B-Instruct meta-llama/Llama-3.1-70B meta-llama/Llama-3.1-70B-Instruct meta-llama/Meta-Llama-3-70B meta-llama/Meta-Llama-3-70B-Instruct deepseek-ai/DeepSeek-R1-Distill-Llama-8B deepseek-ai/DeepSeek-R1-Distill-Llama-70B
	Llama 2	meta-llama/Llama-2-13b-chat-hf meta-llama/Llama-2-13b-hf meta-llama/Llama-2-7b-chat-hf meta-llama/Llama-2-7b-hf meta-llama/Llama-2-70b-chat-hf meta-llama/Llama-2-70b-hf microsoft/Llama2-7b-WhoIsHarryPotter
	Falcon3	tiiuae/Falcon3-1B-Instruct tiiuae/Falcon3-1B-Base tiiuae/Falcon3-3B-Instruct tiiuae/Falcon3-3B-Base tiiuae/Falcon3-7B-Instruct tiiuae/Falcon3-7B-Base tiiuae/Falcon3-10B-Instruct tiiuae/Falcon3-10B-Base
	OpenLLaMA	openlm-research/open_llama_13b openlm-research/open_llama_3b openlm-research/open_llama_3b_v2 openlm-research/open_llama_7b openlm-research/open_llama_7b_v2
	TinyLlama	TinyLlama/TinyLlama-1.1B-Chat-v1.0
`MPTForCausalLM`	MPT	mosaicml/mpt-7b mosaicml/mpt-7b-instruct mosaicml/mpt-7b-chat mosaicml/mpt-30b mosaicml/mpt-30b-instruct mosaicml/mpt-30b-chat
`MiniCPMForCausalLM`	MiniCPM	openbmb/MiniCPM-1B-sft-bf16 openbmb/MiniCPM-2B-dpo-fp16 openbmb/MiniCPM-2B-sft-fp32 openbmb/MiniCPM-2B-dpo-fp32 openbmb/MiniCPM-2B-sft-bf16 openbmb/MiniCPM-2B-dpo-bf16 openbmb/MiniCPM4-0.5B openbmb/MiniCPM4-8B
`MiniCPM3ForCausalLM`	MiniCPM3	openbmb/MiniCPM3-4B
`MistralForCausalLM`	Mistral	mistralai/Mistral-7B-Instruct-v0.1 mistralai/Mistral-7B-Instruct-v0.2 mistralai/Mistral-7B-Instruct-v0.3 mistralai/Mistral-Nemo-Instruct-2407 mistralai/Mistral-Nemo-Base-2407 mistralai/Mistral-7B-v0.1 mistralai/Mistral-7B-v0.3
	Notus	argilla/notus-7b-v1
	Zephyr	HuggingFaceH4/zephyr-7b-beta
	Neural Chat	Intel/neural-chat-7b-v3-3 Intel/neural-chat-7b-v3-2 Intel/neural-chat-7b-v3-1 Intel/neural-chat-7b-v3
`MixtralForCausalLM`	Mixtral	mistralai/Mixtral-8x7B-Instruct-v0.1 mistralai/Mixtral-8x7B-v0.1
`OlmoForCausalLM`	OLMo	allenai/OLMo-1B-hf allenai/OLMo-7B-hf allenai/OLMo-7B-Twin-2T-hf allenai/OLMo-7B-Instruct-hf allenai/OLMo-7B-0724-Instruct-hf allenai/OLMo-7B-0724-SFT-hf
`OPTForCausalLM`	OPT	facebook/opt-125m facebook/opt-350m facebook/opt-1.3b facebook/opt-2.7b facebook/opt-6.7b facebook/opt-13b
`OrionForCausalLM`	Orion	OrionStarAI/Orion-14B-Chat OrionStarAI/Orion-14B-LongChat OrionStarAI/Orion-14B-Base
`PhiForCausalLM`	Phi	microsoft/phi-2 microsoft/phi-1_5
`Phi3ForCausalLM`	Phi3	microsoft/Phi-3-mini-4k-instruct microsoft/Phi-3-mini-128k-instruct microsoft/Phi-3-medium-4k-instruct microsoft/Phi-3-medium-128k-instruct microsoft/Phi-3.5-mini-instruct microsoft/Phi-4-mini-instruct microsoft/phi-4 microsoft/Phi-4-reasoning
`PhimoeForCausalLM`	Phi-3.5-MoE	microsoft/Phi-3.5-MoE-instruct
`QWenLMHeadModel`	Qwen	Qwen/Qwen-1_8B-Chat Qwen/Qwen-1_8B-Chat-Int4 Qwen/Qwen-1_8B Qwen/Qwen-7B-Chat Qwen/Qwen-7B-Chat-Int4 Qwen/Qwen-7B Qwen/Qwen-14B-Chat Qwen/Qwen-14B-Chat-Int4 Qwen/Qwen-14B Qwen/Qwen-72B-Chat Qwen/Qwen-72B-Chat-Int4 Qwen/Qwen-72B
`Qwen2ForCausalLM`	Qwen2	Qwen/Qwen2.5-0.5B-Instruct Qwen/Qwen2.5-1.5B-Instruct Qwen/Qwen2.5-3B-Instruct Qwen/Qwen2.5-7B-Instruct Qwen/Qwen2.5-14B-Instruct Qwen/Qwen2.5-32B-Instruct Qwen/Qwen2.5-72B-Instruct Qwen/Qwen2-0.5B-Instruct Qwen/Qwen2-1.5B-Instruct Qwen/Qwen2-7B-Instruct Qwen/Qwen2-72B-Instruct Qwen/Qwen1.5-0.5B-Chat Qwen/Qwen1.5-1.8B-Chat Qwen/Qwen1.5-4B-Chat Qwen/Qwen1.5-7B-Chat Qwen/Qwen1.5-14B-Chat Qwen/Qwen1.5-32B-Chat Qwen/Qwen1.5-7B-Chat-GPTQ-Int4 Qwen/QwQ-32B deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B deepseek-ai/DeepSeek-R1-Distill-Qwen-7B deepseek-ai/DeepSeek-R1-Distill-Qwen-14B deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
`Qwen2MoeForCausalLM`	Qwen2MoE	Qwen/Qwen2-57B-A14B-Instruct Qwen/Qwen2-57B-A14B Qwen/Qwen1.5-MoE-A2.7B-Chat Qwen/Qwen1.5-MoE-A2.7B
`Qwen3ForCausalLM`	Qwen3	Qwen/Qwen3-0.6B Qwen/Qwen3-1.7B Qwen/Qwen3-4B Qwen/Qwen3-8B Qwen/Qwen3-14B Qwen/Qwen3-32B Qwen/Qwen3-0.6B-Base Qwen/Qwen3-1.7B-Base Qwen/Qwen3-4B-Base Qwen/Qwen3-8B-Base Qwen/Qwen3-14B-Base
`Qwen3MoeForCausalLM`	Qwen3MoE	Qwen/Qwen3-30B-A3B Qwen/Qwen3-30B-A3B-Base
`StableLmForCausalLM`	StableLM	stabilityai/stablelm-zephyr-3b stabilityai/stablelm-2-1_6b stabilityai/stablelm-2-12b stabilityai/stablelm-2-zephyr-1_6b stabilityai/stablelm-3b-4e1t
`Starcoder2ForCausalLM`	Startcoder2	bigcode/starcoder2-3b bigcode/starcoder2-7b bigcode/starcoder2-15b
`XGLMForCausalLM`	XGLM	facebook/xglm-564M facebook/xglm-1.7B facebook/xglm-2.9B facebook/xglm-4.5B facebook/xglm-7.5B
`XverseForCausalLM`	Xverse	xverse/XVERSE-7B xverse/XVERSE-7B-Chat xverse/XVERSE-13B xverse/XVERSE-13B-Chat xverse/XVERSE-65B xverse/XVERSE-65B-Chat

info

The LLM pipeline can work with other similar topologies produced by optimum-intel with the same model signature. The model is required to have the following inputs after the conversion:

input_ids contains the tokens.
attention_mask is filled with 1.
beam_idx selects beams.
position_ids (optional) encodes a position of currently generating token in the sequence and a single logits output.

note

Models should belong to the same family and have the same tokenizers.

Image Generation Models

Architecture	Text to Image	Image to Image	Inpainting	LoRA Support	Example HuggingFace Models
`Latent Consistency Model`	✅	✅	✅	✅	SimianLuo/LCM_Dreamshaper_v7
`Stable Diffusion`	✅	✅	✅	✅	CompVis/stable-diffusion-v1-1 CompVis/stable-diffusion-v1-2 CompVis/stable-diffusion-v1-3 CompVis/stable-diffusion-v1-4 junnyu/stable-diffusion-v1-4-paddle jcplus/stable-diffusion-v1-5 stable-diffusion-v1-5/stable-diffusion-v1-5 botp/stable-diffusion-v1-5 dreamlike-art/dreamlike-anime-1.0 stabilityai/stable-diffusion-2 stabilityai/stable-diffusion-2-base stabilityai/stable-diffusion-2-1 bguisard/stable-diffusion-nano-2-1 justinpinkney/pokemon-stable-diffusion stablediffusionapi/architecture-tuned-model IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-EN-v0.1 ZeroCool94/stable-diffusion-v1-5 pcuenq/stable-diffusion-v1-4 rinna/japanese-stable-diffusion benjamin-paine/stable-diffusion-v1-5 philschmid/stable-diffusion-v1-4-endpoints naclbit/trinart_stable_diffusion_v2 Fictiverse/Stable_Diffusion_PaperCut_Model
`Stable Diffusion Inpainting`	❌	❌	✅	✅	stabilityai/stable-diffusion-2-inpainting stable-diffusion-v1-5/stable-diffusion-inpainting botp/stable-diffusion-v1-5-inpainting parlance/dreamlike-diffusion-1.0-inpainting
`Stable Diffusion XL`	✅	✅	✅	✅	stabilityai/stable-diffusion-xl-base-0.9 stabilityai/stable-diffusion-xl-base-1.0 stabilityai/sdxl-turbo cagliostrolab/animagine-xl-4.0
`Stable Diffusion XL Inpainting`	❌	❌	✅	✅	diffusers/stable-diffusion-xl-1.0-inpainting-0.1
`Stable Diffusion 3`	✅	✅	✅	❌	stabilityai/stable-diffusion-3-medium-diffusers stabilityai/stable-diffusion-3.5-medium stabilityai/stable-diffusion-3.5-large stabilityai/stable-diffusion-3.5-large-turbo tensorart/stable-diffusion-3.5-medium-turbo tensorart/stable-diffusion-3.5-large-TurboX
`Flux`	✅	✅	✅	✅	black-forest-labs/FLUX.1-schnell shuttleai/shuttle-3-diffusion shuttleai/shuttle-3.1-aesthetic shuttleai/shuttle-jaguar

Video Generation Models

Architecture	Text to Video	Image to Video	LoRA Support	Example HuggingFace Models
`LTX-Video`	✅	❌	❌	Lightricks/LTX-Video

Visual Language Models (VLMs)

LoRA Support

VLM pipeline does not support LoRA adapters.

Architecture	Models	Example HuggingFace Models
`InternVLChat`	InternVLChatModel (Notes)	OpenGVLab/InternVL2-1B OpenGVLab/InternVL2-2B OpenGVLab/InternVL2-4B OpenGVLab/InternVL2-8B OpenGVLab/InternVL2_5-1B OpenGVLab/InternVL2_5-2B OpenGVLab/InternVL2_5-4B OpenGVLab/InternVL2_5-8B OpenGVLab/InternVL3-1B OpenGVLab/InternVL3-2B OpenGVLab/InternVL3-8B OpenGVLab/InternVL3-9B OpenGVLab/InternVL3-14B
`LLaVA`	LLaVA-v1.5	llava-hf/llava-1.5-7b-hf
`nanoLLaVA`	nanoLLaVA (Notes)	qnguyen3/nanoLLaVA
`nanoLLaVA`	nanoLLaVA-1.5	qnguyen3/nanoLLaVA-1.5
`LLaVA-NeXT`	LLaVA-v1.6	llava-hf/llava-v1.6-mistral-7b-hf llava-hf/llava-v1.6-vicuna-7b-hf llava-hf/llama3-llava-next-8b-hf
`LLaVA-NeXT-Video`	LLaVA-Next-Video	llava-hf/LLaVA-NeXT-Video-7B-hf
`MiniCPMO`	MiniCPM-o-2_6 (Notes)	openbmb/MiniCPM-o-2_6
`MiniCPMV`	MiniCPM-V-2_6	openbmb/MiniCPM-V-2_6
`Phi3VForCausalLM`	phi3_v (Notes)	microsoft/Phi-3-vision-128k-instruct microsoft/Phi-3.5-vision-instruct
`Phi4MMForCausalLM`	phi4mm (Notes)	microsoft/Phi-4-multimodal-instruct
`Qwen2-VL`	Qwen2-VL	Qwen/Qwen2-VL-2B-Instruct Qwen/Qwen2-VL-7B-Instruct Qwen/Qwen2-VL-2B Qwen/Qwen2-VL-7B
`Qwen2.5-VL`	Qwen2.5-VL	Qwen/Qwen2.5-VL-3B-Instruct Qwen/Qwen2.5-VL-7B-Instruct
`Qwen3-VL`	Qwen3-VL (Notes)	Qwen/Qwen3-VL-2B-Instruct Qwen/Qwen3-VL-2B-Thinking Qwen/Qwen3-VL-4B-Instruct Qwen/Qwen3-VL-4B-Thinking Qwen/Qwen3-VL-8B-Instruct Qwen/Qwen3-VL-8B-Thinking Qwen/Qwen3-VL-32B-Instruct Qwen/Qwen3-VL-32B-Thinking
`Gemma3ForConditionalGeneration`	gemma3	google/gemma-3-4b-it google/gemma-3-12b-it google/gemma-3-27b-it

VLM Models Notes

InternVL2

To convert InternVL2 models, timm and einops are required:

pip install timm einops

MiniCPMO

openbmb/MiniCPM-o-2_6 doesn't support transformers>=4.52 which is required for optimum-cli export.
--task image-text-to-text is required for optimum-cli export openvino --trust-remote-code because image-text-to-text isn't MiniCPM-o-2_6's native task.

phi3_v

Models' configs aren't consistent. It's required to override the default eos_token_id with the one from a tokenizer:

generation_config.set_eos_token_id(pipe.get_tokenizer().get_eos_token_id())

phi4mm

Apply https://huggingface.co/microsoft/Phi-4-multimodal-instruct/discussions/78/files to fix the model export for transformers>=4.50

Qwen3-VL

The model requires transformers>=4.57 for the export with optimum-cli.

nanoLLaVA

The model requires transformers>=4.48 for the export with optimum-cli.

Speech Recognition Models (Whisper-based)

LoRA Support

Speech recognition pipeline does not support LoRA adapters.

Architecture	Models	Example HuggingFace Models
`WhisperForConditionalGeneration`	Whisper	openai/whisper-tiny openai/whisper-tiny.en openai/whisper-base openai/whisper-base.en openai/whisper-small openai/whisper-small.en openai/whisper-medium openai/whisper-medium.en openai/whisper-large-v3
`WhisperForConditionalGeneration`	Distil-Whisper	distil-whisper/distil-small.en distil-whisper/distil-medium.en distil-whisper/distil-large-v3

Speech Generation Models

LoRA Support

Speech generation pipeline does not support LoRA adapters.

Architecture	Models	Example HuggingFace Models
`SpeechT5ForTextToSpeech`	SpeechT5 TTS	microsoft/speecht5_tts

Text Embeddings Models

LoRA Support

Text embeddings pipeline does not support LoRA adapters.

Architecture	Example HuggingFace Models
`BertModel`	BAAI/bge-small-en-v1.5 BAAI/bge-base-en-v1.5 BAAI/bge-large-en-v1.5 sentence-transformers/all-MiniLM-L12-v2 mixedbread-ai/mxbai-embed-large-v1 mixedbread-ai/mxbai-embed-xsmall-v1 WhereIsAI/UAE-Large-V1
`MPNetForMaskedLM`	sentence-transformers/all-mpnet-base-v2 sentence-transformers/multi-qa-mpnet-base-dot-v1
`RobertaForMaskedLM`	sentence-transformers/all-distilroberta-v1
`XLMRobertaModel`	mixedbread-ai/deepset-mxbai-embed-de-large-v1 intfloat/multilingual-e5-large-instruct intfloat/multilingual-e5-large
`Qwen3ForCausalLM`	Qwen/Qwen3-Embedding-0.6B Qwen/Qwen3-Embedding-4B Qwen/Qwen3-Embedding-8B

Text Embeddings Models Notes

Qwen3 Embedding models require --task feature-extraction during the conversion with optimum-cli.

Text Rerank Models

LoRA Support

Text rerank pipeline does not support LoRA adapters.

Architecture	`optimum-cli` task	Example HuggingFace Models
`BertForSequenceClassification`	`text-classification`	cross-encoder/ms-marco-MiniLM-L2-v2 cross-encoder/ms-marco-MiniLM-L4-v2 cross-encoder/ms-marco-MiniLM-L6-v2 cross-encoder/ms-marco-MiniLM-L12-v2 cross-encoder/ms-marco-TinyBERT-L2-v2 tomaarsen/reranker-MiniLM-L12-gooaq-bce
`XLMRobertaForSequenceClassification`	`text-classification`	BAAI/bge-reranker-v2-m3
`ModernBertForSequenceClassification`	`text-classification`	tomaarsen/reranker-ModernBERT-base-gooaq-bce tomaarsen/reranker-ModernBERT-large-gooaq-bce Alibaba-NLP/gte-reranker-modernbert-base
`Qwen3ForCausalLM`	`text-generation-with-past`	Qwen/Qwen3-Reranker-0.6B Qwen/Qwen3-Reranker-4B Qwen/Qwen3-Reranker-8B

Text Rerank Models Notes

Text Rerank models require appropriate --task provided during the conversion with optimum-cli. Task can be found in the table above.

Hugging Face Notes

Some models may require access request submission on the Hugging Face page to be downloaded.

If https://huggingface.co/ is down, the conversion step won't be able to download the models.

Large Language Models (LLMs)​

Image Generation Models​

Video Generation Models​

Visual Language Models (VLMs)​

InternVL2​

MiniCPMO​

phi3_v​

phi4mm​

Qwen3-VL​

nanoLLaVA​

Speech Recognition Models (Whisper-based)​

Speech Generation Models​

Text Embeddings Models​

Text Rerank Models​

Large Language Models (LLMs)

Image Generation Models

Video Generation Models

Visual Language Models (VLMs)

InternVL2

MiniCPMO

phi3_v

phi4mm

Qwen3-VL

nanoLLaVA

Speech Recognition Models (Whisper-based)

Speech Generation Models

Text Embeddings Models

Text Rerank Models