ZStack Resource Center

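Each inference template supports a different set of models, so check template compatibility before you deploy a model. This chapter lists the models that are compatible with the system inference templates. You can upload any model in the compatibility lists to the AI model platform and deploy it with the corresponding system inference template.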

Note: If you use a custom inference template, refer to the official compatibility documentation for that template.
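Before uploading a model, it can help to compare the architecture declared in the model's config.json with the Architecture column of the tables in this chapter. The sketch below is a minimal illustration rather than a platform feature: it assumes a Hugging Face style model directory whose config.json contains an `architectures` list, and the `SUPPORTED` dictionary and the `compatible_templates` helper are hypothetical names holding only a small, incomplete excerpt of the tables.

```python
import json
import tempfile
from pathlib import Path

# Illustrative, incomplete excerpt of the compatibility tables (assumption:
# you copy in the rows that matter for your own deployment).
SUPPORTED = {
    "vLLM 0.20.2": {"Qwen3ForCausalLM", "LlamaForCausalLM", "Gemma4ForCausalLM"},
    "vLLM 0.17.1": {"Qwen3ForCausalLM", "LlamaForCausalLM"},
}


def read_architectures(model_dir):
    """Return the `architectures` list from config.json (empty if the key is absent)."""
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    return config.get("architectures", [])


def compatible_templates(model_dir, supported=SUPPORTED):
    """Return the templates in `supported` that list any of the model's architectures."""
    archs = set(read_architectures(model_dir))
    return sorted(name for name, known in supported.items() if archs & known)


# Usage with a throwaway directory that stands in for a downloaded model.
with tempfile.TemporaryDirectory() as model_dir:
    config = {"architectures": ["Qwen3ForCausalLM"]}
    (Path(model_dir) / "config.json").write_text(json.dumps(config), encoding="utf-8")
    print(compatible_templates(model_dir))  # -> ['vLLM 0.17.1', 'vLLM 0.20.2']
```

An empty result only means the architecture is missing from the excerpt you copied, so check the full tables before concluding that a model is unsupported.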

vLLM 0.20.2
vLLM 0.17.1
vLLM 0.11.0
vLLM 0.9.2
vLLM 0.8.5
vllm-ascend-v0.18.0
vllm-ascend-v0.17.0rc1
Diffusers 0.37.0
Transformers 5.3.0
Sentence Transformers 5.3.0
llama.cpp-b6152
SGLang-0.5.11
MindIE 2.3.0
MindIE 1.0.0

vLLM 0.20.2

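The following tables list the model architectures, model names, and examples that are compatible with this template. For usage details and caveats for each type of model in the compatibility lists, see the official vLLM documentation.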

Table 1 Text-only language models | Generative models | Text generation
Architecture	Model	HuggingFace model examples
AfmoeForCausalLM	Afmoe	TBA
ApertusForCausalLM	Apertus	swiss-ai/Apertus-8B-2509, swiss-ai/Apertus-70B-Instruct-2509, etc.
AquilaForCausalLM	Aquila, Aquila2	BAAI/Aquila-7B, BAAI/AquilaChat-7B, etc.
ArceeForCausalLM	Arcee (AFM)	arcee-ai/AFM-4.5B-Base, etc.
ArcticForCausalLM	Arctic	Snowflake/snowflake-arctic-base, Snowflake/snowflake-arctic-instruct, etc.
AXK1ForCausalLM	A.X-K1	skt/A.X-K1, etc.
BaiChuanForCausalLM	Baichuan2, Baichuan	baichuan-inc/Baichuan2-13B-Chat, baichuan-inc/Baichuan-7B, etc.
BailingMoeForCausalLM	Ling	inclusionAI/Ling-lite-1.5, inclusionAI/Ling-plus, etc.
BailingMoeV2ForCausalLM	Ling	inclusionAI/Ling-mini-2.0, etc.
BailingMoeV2_5ForCausalLM	Ling	inclusionAI/Ling-2.5-1T, inclusionAI/Ring-2.5-1T
BambaForCausalLM	Bamba	ibm-ai-platform/Bamba-9B-fp8, ibm-ai-platform/Bamba-9B
BloomForCausalLM	BLOOM, BLOOMZ, BLOOMChat	bigscience/bloom, bigscience/bloomz, etc.
ChatGLMModel, ChatGLMForConditionalGeneration	ChatGLM	zai-org/chatglm2-6b, zai-org/chatglm3-6b, thu-coai/ShieldLM-6B-chatglm3, etc.
CohereForCausalLM, Cohere2ForCausalLM	Command-R, Command-A	CohereLabs/c4ai-command-r-v01, CohereLabs/c4ai-command-r7b-12-2024, CohereLabs/c4ai-command-a-03-2025, CohereLabs/command-a-reasoning-08-2025, etc.
CwmForCausalLM	CWM	facebook/cwm, etc.
DbrxForCausalLM	DBRX	databricks/dbrx-base, databricks/dbrx-instruct, etc.
DeciLMForCausalLM	DeciLM	nvidia/Llama-3_3-Nemotron-Super-49B-v1, etc.
DeepseekForCausalLM	DeepSeek	deepseek-ai/deepseek-llm-67b-base, deepseek-ai/deepseek-llm-7b-chat, etc.
DeepseekV2ForCausalLM	DeepSeek-V2	deepseek-ai/DeepSeek-V2, deepseek-ai/DeepSeek-V2-Chat, etc.
DeepseekV3ForCausalLM	DeepSeek-V3	deepseek-ai/DeepSeek-V3, deepseek-ai/DeepSeek-R1, deepseek-ai/DeepSeek-V3.1, etc.
DeepseekV4ForCausalLM	DeepSeek-V4	deepseek-ai/DeepSeek-V4-Flash, deepseek-ai/DeepSeek-V4-Pro, etc.
Dots1ForCausalLM	dots.llm1	rednote-hilab/dots.llm1.base, rednote-hilab/dots.llm1.inst, etc.
DotsOCRForCausalLM	dots_ocr	rednote-hilab/dots.ocr
Ernie4_5ForCausalLM	Ernie4.5	baidu/ERNIE-4.5-0.3B-PT, etc.
Ernie4_5_MoeForCausalLM	Ernie4.5MoE	baidu/ERNIE-4.5-21B-A3B-PT, baidu/ERNIE-4.5-300B-A47B-PT, etc.
ExaoneForCausalLM	EXAONE-3	LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct, etc.
ExaoneMoEForCausalLM	K-EXAONE	LGAI-EXAONE/K-EXAONE-236B-A23B, etc.
Exaone4ForCausalLM	EXAONE-4	LGAI-EXAONE/EXAONE-4.0-32B, etc.
Fairseq2LlamaForCausalLM	Llama (fairseq2 format)	mgleize/fairseq2-dummy-Llama-3.2-1B, etc.
FalconForCausalLM	Falcon	tiiuae/falcon-7b, tiiuae/falcon-40b, tiiuae/falcon-rw-7b, etc.
FalconMambaForCausalLM	FalconMamba	tiiuae/falcon-mamba-7b, tiiuae/falcon-mamba-7b-instruct, etc.
FalconH1ForCausalLM	Falcon-H1	tiiuae/Falcon-H1-34B-Base, tiiuae/Falcon-H1-34B-Instruct, etc.
FlexOlmoForCausalLM	FlexOlmo	allenai/FlexOlmo-7x7B-1T, allenai/FlexOlmo-7x7B-1T-RT, etc.
GemmaForCausalLM	Gemma	google/gemma-2b, google/gemma-1.1-2b-it, etc.
Gemma2ForCausalLM	Gemma 2	google/gemma-2-9b, google/gemma-2-27b, etc.
Gemma3ForCausalLM	Gemma 3	google/gemma-3-1b-it, etc.
Gemma3nForCausalLM	Gemma 3n	google/gemma-3n-E2B-it, google/gemma-3n-E4B-it, etc.
Gemma4ForCausalLM	Gemma 4	google/gemma-4-E2B-it, etc.
GlmForCausalLM	GLM-4	zai-org/glm-4-9b-chat-hf, etc.
Glm4ForCausalLM	GLM-4-0414	zai-org/GLM-4-32B-0414, etc.
Glm4MoeForCausalLM	GLM-4.5, GLM-4.6, GLM-4.7	zai-org/GLM-4.5, etc.
Glm4MoeLiteForCausalLM	GLM-4.7-Flash	zai-org/GLM-4.7-Flash, etc.
GPT2LMHeadModel	GPT-2	openai-community/gpt2, openai-community/gpt2-xl, etc.
GPTBigCodeForCausalLM	StarCoder, SantaCoder, WizardCoder	bigcode/starcoder, bigcode/gpt_bigcode-santacoder, WizardLM/WizardCoder-15B-V1.0, etc.
GPTJForCausalLM	GPT-J	EleutherAI/gpt-j-6b, nomic-ai/gpt4all-j, etc.
GPTNeoXForCausalLM	GPT-NeoX, Pythia, OpenAssistant, Dolly V2, StableLM	EleutherAI/gpt-neox-20b, EleutherAI/pythia-12b, OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5, databricks/dolly-v2-12b, stabilityai/stablelm-tuned-alpha-7b, etc.
GptOssForCausalLM	GPT-OSS	openai/gpt-oss-120b, openai/gpt-oss-20b
GraniteForCausalLM	Granite 3.0, Granite 3.1, PowerLM	ibm-granite/granite-3.0-2b-base, ibm-granite/granite-3.1-8b-instruct, ibm/PowerLM-3b, etc.
GraniteMoeForCausalLM	Granite 3.0 MoE, PowerMoE	ibm-granite/granite-3.0-1b-a400m-base, ibm-granite/granite-3.0-3b-a800m-instruct, ibm/PowerMoE-3b, etc.
GraniteMoeHybridForCausalLM	Granite 4.0 MoE Hybrid	ibm-granite/granite-4.0-tiny-preview, etc.
GraniteMoeSharedForCausalLM	Granite MoE Shared	ibm-research/moe-7b-1b-active-shared-experts (test model)
GritLM	GritLM	parasail-ai/GritLM-7B-vllm
Grok1ModelForCausalLM	Grok1	hpcai-tech/grok-1
Grok1ForCausalLM	Grok2	xai-org/grok-2
HunYuanDenseV1ForCausalLM	Hunyuan Dense	tencent/Hunyuan-7B-Instruct
HunYuanMoEV1ForCausalLM	Hunyuan-A13B	tencent/Hunyuan-A13B-Instruct, tencent/Hunyuan-A13B-Pretrain, tencent/Hunyuan-A13B-Instruct-FP8, etc.
HYV3ForCausalLM	HY3	tencent/Hy3-preview-Base, tencent/Hy3-preview
HyperCLOVAXForCausalLM	HyperCLOVAX-SEED-Think-14B	naver-hyperclovax/HyperCLOVAX-SEED-Think-14B
InternLMForCausalLM	InternLM	internlm/internlm-7b, internlm/internlm-chat-7b, etc.
InternLM2ForCausalLM	InternLM2	internlm/internlm2-7b, internlm/internlm2-chat-7b, etc.
InternLM3ForCausalLM	InternLM3	internlm/internlm3-8b-instruct, etc.
IQuestCoderForCausalLM	IQuestCoderV1	IQuestLab/IQuest-Coder-V1-40B-Instruct, etc.
IQuestLoopCoderForCausalLM	IQuestLoopCoderV1	IQuestLab/IQuest-Coder-V1-40B-Loop-Instruct, etc.
JAISLMHeadModel	Jais	inceptionai/jais-13b, inceptionai/jais-13b-chat, inceptionai/jais-30b-v3, inceptionai/jais-30b-chat-v3, etc.
Jais2ForCausalLM	Jais2	inceptionai/Jais-2-8B-Chat, inceptionai/Jais-2-70B-Chat, etc.
JambaForCausalLM	Jamba	ai21labs/AI21-Jamba-1.5-Large, ai21labs/AI21-Jamba-1.5-Mini, ai21labs/Jamba-v0.1, etc.
KimiLinearForCausalLM	Kimi-Linear-48B-A3B-Base, Kimi-Linear-48B-A3B-Instruct	moonshotai/Kimi-Linear-48B-A3B-Base, moonshotai/Kimi-Linear-48B-A3B-Instruct
Lfm2ForCausalLM	LFM2	LiquidAI/LFM2-1.2B, LiquidAI/LFM2-700M, LiquidAI/LFM2-350M, etc.
Lfm2MoeForCausalLM	LFM2MoE	LiquidAI/LFM2-8B-A1B-preview, etc.
LlamaForCausalLM	Llama 3.1, Llama 3, Llama 2, LLaMA, Yi	meta-llama/Meta-Llama-3.1-405B-Instruct, meta-llama/Meta-Llama-3.1-70B, meta-llama/Meta-Llama-3-70B-Instruct, meta-llama/Llama-2-70b-hf, 01-ai/Yi-34B, etc.
LongcatFlashForCausalLM	LongCat-Flash	meituan-longcat/LongCat-Flash-Chat, meituan-longcat/LongCat-Flash-Chat-FP8
MambaForCausalLM	Mamba	state-spaces/mamba-130m-hf, state-spaces/mamba-790m-hf, state-spaces/mamba-2.8b-hf, etc.
Mamba2ForCausalLM	Mamba2	mistralai/Mamba-Codestral-7B-v0.1, etc.
MiMoForCausalLM	MiMo	XiaomiMiMo/MiMo-7B-RL, etc.
MiMoV2FlashForCausalLM	MiMoV2Flash	XiaomiMiMo/MiMo-V2-Flash, etc.
MiniCPMForCausalLM	MiniCPM	openbmb/MiniCPM-2B-sft-bf16, openbmb/MiniCPM-2B-dpo-bf16, openbmb/MiniCPM-S-1B-sft, etc.
MiniCPM3ForCausalLM	MiniCPM3	openbmb/MiniCPM3-4B, etc.
MiniMaxForCausalLM	MiniMax-Text	MiniMaxAI/MiniMax-Text-01-hf, etc.
MiniMaxM2ForCausalLM	MiniMax-M2, MiniMax-M2.1	MiniMaxAI/MiniMax-M2, etc.
MistralForCausalLM	Ministral-3, Mistral, Mistral-Instruct	mistralai/Ministral-3-3B-Instruct-2512, mistralai/Mistral-7B-v0.1, mistralai/Mistral-7B-Instruct-v0.1, etc.
MistralLarge3ForCausalLM	Mistral-Large-3-675B-Base-2512, Mistral-Large-3-675B-Instruct-2512	mistralai/Mistral-Large-3-675B-Base-2512, mistralai/Mistral-Large-3-675B-Instruct-2512, etc.
MixtralForCausalLM	Mixtral-8x7B, Mixtral-8x7B-Instruct	mistralai/Mixtral-8x7B-v0.1, mistralai/Mixtral-8x7B-Instruct-v0.1, mistral-community/Mixtral-8x22B-v0.1, etc.
MPTForCausalLM	MPT, MPT-Instruct, MPT-Chat, MPT-StoryWriter	mosaicml/mpt-7b, mosaicml/mpt-7b-storywriter, mosaicml/mpt-30b, etc.
NemotronForCausalLM	Nemotron-3, Nemotron-4, Minitron	nvidia/Minitron-8B-Base, mgoin/Nemotron-4-340B-Base-hf-FP8, etc.
NemotronHForCausalLM	Nemotron-H	nvidia/Nemotron-H-8B-Base-8K, nvidia/Nemotron-H-47B-Base-8K, nvidia/Nemotron-H-56B-Base-8K, etc.
OlmoForCausalLM	OLMo	allenai/OLMo-1B-hf, allenai/OLMo-7B-hf, etc.
Olmo2ForCausalLM	OLMo2	allenai/OLMo-2-0425-1B, etc.
Olmo3ForCausalLM	OLMo3	allenai/Olmo-3-7B-Instruct, allenai/Olmo-3-32B-Think, etc.
OlmoHybridForCausalLM	OLMo Hybrid	allenai/Olmo-Hybrid-7B
OlmoeForCausalLM	OLMoE	allenai/OLMoE-1B-7B-0924, allenai/OLMoE-1B-7B-0924-Instruct, etc.
OPTForCausalLM	OPT, OPT-IML	facebook/opt-66b, facebook/opt-iml-max-30b, etc.
OrionForCausalLM	Orion	OrionStarAI/Orion-14B-Base, OrionStarAI/Orion-14B-Chat, etc.
OuroForCausalLM	ouro	ByteDance/Ouro-1.4B, ByteDance/Ouro-2.6B, etc.
PanguEmbeddedForCausalLM	openPangu-Embedded-7B	FreedomIntelligence/openPangu-Embedded-7B-V1.1
PanguProMoEV2ForCausalLM	openpangu-pro-moe-v2	N/A
PanguUltraMoEForCausalLM	openpangu-ultra-moe-718b-model	FreedomIntelligence/openPangu-Ultra-MoE-718B-V1.1
Param2MoEForCausalLM	param2moe	bharatgenai/Param2-17B-A2.4B-Thinking, etc.
PhiForCausalLM	Phi	microsoft/phi-1_5, microsoft/phi-2, etc.
Phi3ForCausalLM	Phi-4, Phi-3	microsoft/Phi-4-mini-instruct, microsoft/Phi-4, microsoft/Phi-3-mini-4k-instruct, microsoft/Phi-3-mini-128k-instruct, microsoft/Phi-3-medium-128k-instruct, etc.
PhiMoEForCausalLM	Phi-3.5-MoE	microsoft/Phi-3.5-MoE-instruct, etc.
PersimmonForCausalLM	Persimmon	adept/persimmon-8b-base, adept/persimmon-8b-chat, etc.
Plamo2ForCausalLM	PLaMo2	pfnet/plamo-2-1b, pfnet/plamo-2-8b, etc.
Plamo3ForCausalLM	PLaMo3	pfnet/plamo-3-nict-2b-base, pfnet/plamo-3-nict-8b-base, etc.
QWenLMHeadModel	Qwen	Qwen/Qwen-7B, Qwen/Qwen-7B-Chat, etc.
Qwen2ForCausalLM	QwQ, Qwen2	Qwen/QwQ-32B-Preview, Qwen/Qwen2-7B-Instruct, Qwen/Qwen2-7B, etc.
Qwen2MoeForCausalLM	Qwen2MoE	Qwen/Qwen1.5-MoE-A2.7B, Qwen/Qwen1.5-MoE-A2.7B-Chat, etc.
Qwen3ForCausalLM	Qwen3	Qwen/Qwen3-8B, etc.
Qwen3MoeForCausalLM	Qwen3MoE	Qwen/Qwen3-30B-A3B, etc.
Qwen3NextForCausalLM	Qwen3NextMoE	Qwen/Qwen3-Next-80B-A3B-Instruct, etc.
RWForCausalLM	Falcon RW	tiiuae/falcon-40b, etc.
Rnj1ForCausalLM	Rnj1	EssentialAI/rnj-1-instruct, etc.
SarvamMoEForCausalLM	Sarvam 2	sarvamai/sarvam2-30b-a3b, etc.
SarvamMLAForCausalLM	Sarvam 2	sarvamai/sarvam2-105b-a9b, etc.
SeedOssForCausalLM	SeedOss	ByteDance-Seed/Seed-OSS-36B-Instruct, etc.
SolarForCausalLM	Solar Pro	upstage/solar-pro-preview-instruct, etc.
StableLmForCausalLM	StableLM	stabilityai/stablelm-3b-4e1t, stabilityai/stablelm-base-alpha-7b-v2, etc.
StableLMEpochForCausalLM	StableLM Epoch	stabilityai/stablelm-zephyr-3b, etc.
Starcoder2ForCausalLM	Starcoder2	bigcode/starcoder2-3b, bigcode/starcoder2-7b, bigcode/starcoder2-15b, etc.
Step1ForCausalLM	Step-Audio	stepfun-ai/Step-Audio-EditX, etc.
Step3p5ForCausalLM	Step-3.5-flash	stepfun-ai/Step-3.5-Flash, etc.
TeleChatForCausalLM	TeleChat	chuhac/TeleChat2-35B, etc.
TeleChat2ForCausalLM	TeleChat2	Tele-AI/TeleChat2-3B, Tele-AI/TeleChat2-7B, Tele-AI/TeleChat2-35B, etc.
TeleChat3ForCausalLM	TeleChat3	Tele-AI/TeleChat3-36B-Thinking, Tele-AI/TeleChat3-Coder-36B-Thinking, etc.
TeleFLMForCausalLM	TeleFLM	CofeAI/FLM-2-52B-Instruct-2407, CofeAI/Tele-FLM, etc.
XverseForCausalLM	XVERSE	xverse/XVERSE-7B-Chat, xverse/XVERSE-13B-Chat, xverse/XVERSE-65B-Chat, etc.
MiniMaxM1ForCausalLM	MiniMax-Text	MiniMaxAI/MiniMax-M1-40k, MiniMaxAI/MiniMax-M1-80k, etc.
MiniMaxText01ForCausalLM	MiniMax-Text	MiniMaxAI/MiniMax-Text-01, etc.
Zamba2ForCausalLM	Zamba2	Zyphra/Zamba2-7B-instruct, Zyphra/Zamba2-2.7B-instruct, Zyphra/Zamba2-1.2B-instruct, etc.
SmolLM3ForCausalLM	SmolLM3	HuggingFaceTB/SmolLM3-3B

Table 2 Text-only language models | Pooling models | Embedding
Architecture	Model	HuggingFace model examples
BertModel	BERT-based	BAAI/bge-base-en-v1.5, Snowflake/snowflake-arctic-embed-xs, etc.
BertSpladeSparseEmbeddingModel	SPLADE	naver/splade-v3
ErnieModel	BERT-like Chinese ERNIE	shibing624/text2vec-base-chinese-sentence
Gemma2Model^C	Gemma 2-based	BAAI/bge-multilingual-gemma2, etc.
Gemma3TextModel^C	Gemma 3-based	google/embeddinggemma-300m, etc.
GritLM	GritLM	parasail-ai/GritLM-7B-vllm
GteModel	Arctic-Embed-2.0-M	Snowflake/snowflake-arctic-embed-m-v2.0
GteNewModel	mGTE-TRM (see note)	Alibaba-NLP/gte-multilingual-base, etc.
JinaEmbeddingsV5Model^C	Qwen3-based with task-specific LoRA adapters	jinaai/jina-embeddings-v5-text-small (see note)
LlamaBidirectionalModel^C	Llama-based with bidirectional attention	nvidia/llama-nemotron-embed-1b-v2, etc.
LlamaModel^C, LlamaForCausalLM^C, MistralModel^C, etc.	Llama-based	intfloat/e5-mistral-7b-instruct, etc.
ModernBertModel	ModernBERT-based	Alibaba-NLP/gte-modernbert-base, etc.
NomicBertModel	Nomic BERT	nomic-ai/nomic-embed-text-v1, nomic-ai/nomic-embed-text-v2-moe, Snowflake/snowflake-arctic-embed-m-long, etc.
Qwen2Model^C, Qwen2ForCausalLM^C	Qwen2-based	ssmits/Qwen2-7B-Instruct-embed-base (see note), Alibaba-NLP/gte-Qwen2-7B-instruct (see note), etc.
Qwen3Model^C, Qwen3ForCausalLM^C	Qwen3-based	Qwen/Qwen3-Embedding-0.6B, etc.
RobertaModel, RobertaForMaskedLM	RoBERTa-based	sentence-transformers/all-roberta-large-v1, etc.
VoyageQwen3BidirectionalEmbedModel^C	Voyage Qwen3-based with bidirectional attention	voyageai/voyage-4-nano, etc.
XLMRobertaModel	XLMRobertaModel-based	BAAI/bge-m3 (see note), intfloat/multilingual-e5-base, jinaai/jina-embeddings-v3 (see note), etc.
Model^C, ForCausalLM^C, etc.	Generative models	N/A

Table 3 Text-only language models | Pooling models | Reward
Architecture	Model	HuggingFace model examples
JambaForSequenceClassification	Jamba	ai21labs/Jamba-tiny-reward-dev, etc.
Qwen3ForSequenceClassification^C	Qwen3-based	Skywork/Skywork-Reward-V2-Qwen3-0.6B, etc.
LlamaForSequenceClassification^C	Llama-based	Skywork/Skywork-Reward-V2-Llama-3.2-1B, etc.
Model^C, ForCausalLM^C, etc.	Generative models	N/A
InternLM2ForRewardModel	InternLM2-based	internlm/internlm2-1_8b-reward, internlm/internlm2-7b-reward, etc.
Qwen2ForRewardModel	Qwen2-based	Qwen/Qwen2.5-Math-RM-72B, etc.
LlamaForCausalLM	Llama-based	peiyi9979/math-shepherd-mistral-7b-prm, etc.
Qwen2ForProcessRewardModel	Qwen2-based	Qwen/Qwen2.5-Math-PRM-7B, etc.

Table 4 Text-only language models | Pooling models | Classification
Architecture	Model	HuggingFace model examples
ErnieForSequenceClassification	BERT-like Chinese ERNIE	Forrest20231206/ernie-3.0-base-zh-cls
GPT2ForSequenceClassification	GPT2	nie3e/sentiment-polish-gpt2-small
Qwen2ForSequenceClassification^C	Qwen2-based	jason9693/Qwen2.5-1.5B-apeach
Model^C, ForCausalLM^C, etc.	Generative models	N/A

Table 5 Text-only language models | Pooling models | Cross-encoder / Reranking
Architecture	Model	HuggingFace model examples
BertForSequenceClassification	BERT-based	cross-encoder/ms-marco-MiniLM-L-6-v2, etc.
GemmaForSequenceClassification	Gemma-based	BAAI/bge-reranker-v2-gemma (see note), etc.
GteNewForSequenceClassification	mGTE-TRM (see note)	Alibaba-NLP/gte-multilingual-reranker-base, etc.
LlamaBidirectionalForSequenceClassification^C	Llama-based with bidirectional attention	nvidia/llama-nemotron-rerank-1b-v2, etc.
Qwen2ForSequenceClassification^C	Qwen2-based	mixedbread-ai/mxbai-rerank-base-v2 (see note), etc.
Qwen3ForSequenceClassification^C	Qwen3-based	tomaarsen/Qwen3-Reranker-0.6B-seq-cls, Qwen/Qwen3-Reranker-0.6B (see note), etc.
RobertaForSequenceClassification	RoBERTa-based	cross-encoder/quora-roberta-base, etc.
XLMRobertaForSequenceClassification	XLM-RoBERTa-based	BAAI/bge-reranker-v2-m3, etc.
Model^C, ForCausalLM^C, etc.	Generative models	N/A

Table 6 Text-only language models | Pooling models | Token classification
Architecture	Model	HuggingFace model examples
BertForTokenClassification	BERT-based	boltuix/NeuroBERT-NER (see note), etc.
ErnieForTokenClassification	BERT-like Chinese ERNIE	gyr66/Ernie-3.0-base-chinese-finetuned-ner
ModernBertForTokenClassification	ModernBERT-based	disham993/electrical-ner-ModernBERT-base
Qwen3ForTokenClassification^C	Qwen3-based	bd2lcco/Qwen3-0.6B-finetuned
Model^C, ForCausalLM^C, etc.	Generative models	N/A
InternLM2ForRewardModel	InternLM2-based	internlm/internlm2-1_8b-reward, internlm/internlm2-7b-reward, etc.
Qwen2ForRewardModel	Qwen2-based	Qwen/Qwen2.5-Math-RM-72B, etc.

Table 7 Multimodal models | Generative models | Text generation
Architecture	Model	Inputs	HuggingFace model examples
AriaForConditionalGeneration	Aria	T + I⁺	rhymes-ai/Aria
AudioFlamingo3ForConditionalGeneration	AudioFlamingo3	T + A	nvidia/audio-flamingo-3-hf, nvidia/music-flamingo-hf
AyaVisionForConditionalGeneration	Aya Vision	T + I⁺	CohereLabs/aya-vision-8b, CohereLabs/aya-vision-32b, etc.
BagelForConditionalGeneration	BAGEL	T + I⁺	ByteDance-Seed/BAGEL-7B-MoT
BeeForConditionalGeneration	Bee-8B	T + I^E+	Open-Bee/Bee-8B-RL, Open-Bee/Bee-8B-SFT
Blip2ForConditionalGeneration	BLIP-2	T + I^E	Salesforce/blip2-opt-2.7b, Salesforce/blip2-opt-6.7b, etc.
ChameleonForConditionalGeneration	Chameleon	T + I	facebook/chameleon-7b, etc.
CheersForConditionalGeneration	Cheers	T + I	ai9stars/Cheers
Cohere2VisionForConditionalGeneration	Command A Vision	T + I⁺	CohereLabs/command-a-vision-07-2025, etc.
DeepseekVLV2ForCausalLM	DeepSeek-VL2	T + I⁺	deepseek-ai/deepseek-vl2-tiny, deepseek-ai/deepseek-vl2-small, deepseek-ai/deepseek-vl2, etc.
DeepseekOCRForCausalLM	DeepSeek-OCR	T + I⁺	deepseek-ai/DeepSeek-OCR, etc.
DeepseekOCR2ForCausalLM	DeepSeek-OCR-2	T + I⁺	deepseek-ai/DeepSeek-OCR-2, etc.
Eagle2_5_VLForConditionalGeneration	Eagle2.5-VL	T + I^E+	nvidia/Eagle2.5-8B, etc.
Ernie4_5_VLMoeForConditionalGeneration	Ernie4.5-VL	T + I⁺/ V⁺	baidu/ERNIE-4.5-VL-28B-A3B-PT, baidu/ERNIE-4.5-VL-424B-A47B-PT
Exaone4_5_ForConditionalGeneration	EXAONE-4.5	T + I^E+	LGAI-EXAONE/EXAONE-4.5-33B, etc.
FuyuForCausalLM	Fuyu	T + I	adept/fuyu-8b, etc.
Gemma3ForConditionalGeneration	Gemma 3	T + I^E+	google/gemma-3-4b-it, google/gemma-3-27b-it, etc.
Gemma3nForConditionalGeneration	Gemma 3n	T + I + A	google/gemma-3n-E2B-it, google/gemma-3n-E4B-it, etc.
Gemma4ForConditionalGeneration	Gemma 4	T + I⁺ + V + A^*	google/gemma-4-E2B-it, etc.
GLM4VForCausalLM^{^}	GLM-4V	T + I	zai-org/glm-4v-9b, zai-org/cogagent-9b-20241220, etc.
Glm4vForConditionalGeneration	GLM-4.1V-Thinking	T + I^E+ + V^E+	zai-org/GLM-4.1V-9B-Thinking, etc.
Glm4vMoeForConditionalGeneration	GLM-4.5V	T + I^E+ + V^E+	zai-org/GLM-4.5V, etc.
GlmOcrForConditionalGeneration	GLM-OCR	T + I^E+	zai-org/GLM-OCR, etc.
Granite4VisionForConditionalGeneration	Granite 4 Vision	T + I^E+	ibm-granite/granite-4.1-3b-vision, etc.
GraniteSpeechForConditionalGeneration	Granite Speech	T + A	ibm-granite/granite-speech-3.3-8b
HCXVisionForCausalLM	HyperCLOVAX-SEED-Vision-Instruct-3B	T + I⁺ + V⁺	naver-hyperclovax/HyperCLOVAX-SEED-Vision-Instruct-3B
HCXVisionV2ForCausalLM	HyperCLOVAX-SEED-Think-32B	T + I⁺ + V⁺	naver-hyperclovax/HyperCLOVAX-SEED-Think-32B
H2OVLChatModel	H2OVL	T + I^E+	h2oai/h2ovl-mississippi-800m, h2oai/h2ovl-mississippi-2b, etc.
HunYuanVLForConditionalGeneration	HunyuanOCR	T + I^E+	tencent/HunyuanOCR, etc.
Idefics3ForConditionalGeneration	Idefics3	T + I	HuggingFaceM4/Idefics3-8B-Llama3, etc.
IsaacForConditionalGeneration	Isaac	T + I⁺	PerceptronAI/Isaac-0.1
InternS1ForConditionalGeneration	Intern-S1	T + I^E+ + V^E+	internlm/Intern-S1, internlm/Intern-S1-mini, etc.
InternS1ProForConditionalGeneration	Intern-S1-Pro	T + I^E+ + V^E+	internlm/Intern-S1-Pro, etc.
InternVLChatModel	InternVL 3.5, InternVL 3.0, InternVideo 2.5, InternVL 2.5, Mono-InternVL, InternVL 2.0	T + I^E+ + (V^E+)	OpenGVLab/InternVL3_5-14B, OpenGVLab/InternVL3-9B, OpenGVLab/InternVideo2_5_Chat_8B, OpenGVLab/InternVL2_5-4B, OpenGVLab/Mono-InternVL-2B, OpenGVLab/InternVL2-4B, etc.
InternVLForConditionalGeneration	InternVL 3.0 (HF format)	T + I^E+ + V^E+	OpenGVLab/InternVL3-1B-hf, etc.
KananaVForConditionalGeneration	Kanana-V	T + I⁺	kakaocorp/kanana-1.5-v-3b-instruct, etc.
KeyeForConditionalGeneration	Keye-VL-8B-Preview	T + I^E+ + V^E+	Kwai-Keye/Keye-VL-8B-Preview
KeyeVL1_5ForConditionalGeneration	Keye-VL-1_5-8B	T + I^E+ + V^E+	Kwai-Keye/Keye-VL-1_5-8B
KimiAudioForConditionalGeneration	Kimi-Audio	T + A⁺	moonshotai/Kimi-Audio-7B-Instruct
KimiK25ForConditionalGeneration	Kimi-K2.5	T + I⁺	moonshotai/Kimi-K2.5
KimiVLForConditionalGeneration	Kimi-VL-A3B-Instruct, Kimi-VL-A3B-Thinking	T + I⁺	moonshotai/Kimi-VL-A3B-Instruct, moonshotai/Kimi-VL-A3B-Thinking
LightOnOCRForConditionalGeneration	LightOnOCR-1B	T + I⁺	lightonai/LightOnOCR-1B, etc.
Lfm2VlForConditionalGeneration	LFM2-VL	T + I⁺	LiquidAI/LFM2-VL-450M, LiquidAI/LFM2-VL-3B, LiquidAI/LFM2-VL-8B-A1B, etc.
Llama4ForConditionalGeneration	Llama 4	T + I⁺	meta-llama/Llama-4-Scout-17B-16E-Instruct, meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8, meta-llama/Llama-4-Maverick-17B-128E-Instruct, etc.
Llama_Nemotron_Nano_VL	Llama Nemotron Nano VL	T + I^E+	nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1
LlavaForConditionalGeneration	LLaVA-1.5, Pixtral (HF Transformers)	T + I^E+	llava-hf/llava-1.5-7b-hf, TIGER-Lab/Mantis-8B-siglip-llama3 (see note), mistral-community/pixtral-12b, etc.
LlavaNextForConditionalGeneration	LLaVA-NeXT, Granite Vision	T + I^E+	llava-hf/llava-v1.6-mistral-7b-hf, llava-hf/llava-v1.6-vicuna-7b-hf, ibm-granite/granite-vision-3.3-2b, etc.
LlavaNextVideoForConditionalGeneration	LLaVA-NeXT-Video	T + V	llava-hf/LLaVA-NeXT-Video-7B-hf, etc.
LlavaOnevisionForConditionalGeneration	LLaVA-Onevision	T + I⁺ + V⁺	llava-hf/llava-onevision-qwen2-7b-ov-hf, llava-hf/llava-onevision-qwen2-0.5b-ov-hf, etc.
MiDashengLMModel	MiDashengLM	T + A⁺	mispeech/midashenglm-7b
MiniCPMO	MiniCPM-O	T + I^E+ + V^E+ + A^E+	openbmb/MiniCPM-o-2_6, etc.
MiniCPMV	MiniCPM-V	T + I^E+ + V^E+	openbmb/MiniCPM-V-2 (see note), openbmb/MiniCPM-Llama3-V-2_5, openbmb/MiniCPM-V-2_6, openbmb/MiniCPM-V-4, openbmb/MiniCPM-V-4_5, etc.
MiniMaxVL01ForConditionalGeneration	MiniMax-VL	T + I^E+	MiniMaxAI/MiniMax-VL-01, etc.
Mistral3ForConditionalGeneration	Mistral3 (HF Transformers)	T + I⁺	mistralai/Mistral-Small-3.1-24B-Instruct-2503, etc.
MolmoForCausalLM	Molmo	T + I⁺	allenai/Molmo-7B-D-0924, allenai/Molmo-7B-O-0924, etc.
Molmo2ForConditionalGeneration	Molmo2	T + I⁺ / V	allenai/Molmo2-4B, allenai/Molmo2-8B, allenai/Molmo2-O-7B
MusicFlamingoForConditionalGeneration	MusicFlamingo	T + A	nvidia/music-flamingo-2601-hf, nvidia/music-flamingo-think-2601-hf
NVLM_D_Model	NVLM-D 1.0	T + I⁺	nvidia/NVLM-D-72B, etc.
OpenCUAForConditionalGeneration	OpenCUA-7B	T + I^E+	xlangai/OpenCUA-7B
OpenPanguVLForConditionalGeneration	openpangu-VL	T + I^E+ + V^E+	FreedomIntelligence/openPangu-VL-7B
Ovis	Ovis2, Ovis1.6	T + I⁺	AIDC-AI/Ovis2-1B, AIDC-AI/Ovis1.6-Llama3.2-3B, etc.
Ovis2_5	Ovis2.5	T + I⁺ + V	AIDC-AI/Ovis2.5-9B, etc.
Ovis2_6ForCausalLM	Ovis2.6	T + I⁺ + V	AIDC-AI/Ovis2.6-2B, etc.
Ovis2_6_MoeForCausalLM	Ovis2.6	T + I⁺ + V	AIDC-AI/Ovis2.6-30B-A3B, etc.
PaddleOCRVLForConditionalGeneration	Paddle-OCR	T + I⁺	PaddlePaddle/PaddleOCR-VL, etc.
PaliGemmaForConditionalGeneration	PaliGemma, PaliGemma 2	T + I^E	google/paligemma-3b-pt-224, google/paligemma-3b-mix-224, google/paligemma2-3b-ft-docci-448, etc.
Phi3VForCausalLM	Phi-3-Vision, Phi-3.5-Vision	T + I^E+	microsoft/Phi-3-vision-128k-instruct, microsoft/Phi-3.5-vision-instruct, etc.
Phi4MMForCausalLM	Phi-4-multimodal	T + I⁺ / T + A⁺ / I⁺ + A⁺	microsoft/Phi-4-multimodal-instruct, etc.
Phi4ForCausalLMV	Phi-4-reasoning-vision	T + I⁺	microsoft/Phi-4-reasoning-vision-15B, etc.
PixtralForConditionalGeneration	Ministral 3 (Mistral format), Mistral 3 (Mistral format), Mistral Large 3 (Mistral format), Pixtral (Mistral format)	T + I⁺	mistralai/Ministral-3-3B-Instruct-2512, mistralai/Mistral-Small-3.1-24B-Instruct-2503, mistralai/Mistral-Large-3-675B-Instruct-2512, mistralai/Pixtral-12B-2409, etc.
QwenVLForConditionalGeneration^{^}	Qwen-VL	T + I^E+	Qwen/Qwen-VL, Qwen/Qwen-VL-Chat, etc.
Qwen2AudioForConditionalGeneration	Qwen2-Audio	T + A⁺	Qwen/Qwen2-Audio-7B-Instruct
Qwen2VLForConditionalGeneration	QVQ, Qwen2-VL	T + I^E+ + V^E+	Qwen/QVQ-72B-Preview, Qwen/Qwen2-VL-7B-Instruct, Qwen/Qwen2-VL-72B-Instruct, etc.
Qwen2_5_VLForConditionalGeneration	Qwen2.5-VL	T + I^E+ + V^E+	Qwen/Qwen2.5-VL-3B-Instruct, Qwen/Qwen2.5-VL-72B-Instruct, etc.
Qwen2_5OmniThinkerForConditionalGeneration	Qwen2.5-Omni	T + I^E+ + V^E+ + A⁺	Qwen/Qwen2.5-Omni-3B, Qwen/Qwen2.5-Omni-7B
Qwen3_5ForConditionalGeneration	Qwen3.5	T + I^E+ + V^E+	Qwen/Qwen3.5-9B-Instruct, etc.
Qwen3_5MoeForConditionalGeneration	Qwen3.5-MOE	T + I^E+ + V^E+	Qwen/Qwen3.5-35B-A3B-Instruct, etc.
Qwen3VLForConditionalGeneration	Qwen3-VL	T + I^E+ + V^E+	Qwen/Qwen3-VL-4B-Instruct, etc.
Qwen3VLMoeForConditionalGeneration	Qwen3-VL-MOE	T + I^E+ + V^E+	Qwen/Qwen3-VL-30B-A3B-Instruct, etc.
Qwen3OmniMoeThinkerForConditionalGeneration	Qwen3-Omni	T + I^E+ + V^E+ + A⁺	Qwen/Qwen3-Omni-30B-A3B-Instruct, Qwen/Qwen3-Omni-30B-A3B-Thinking
Qwen3ASRForConditionalGeneration	Qwen3-ASR	T + A⁺	Qwen/Qwen3-ASR-1.7B
RForConditionalGeneration	R-VL-4B	T + I^E+	YannQi/R-4B
SkyworkR1VChatModel	Skywork-R1V-38B	T + I	Skywork/Skywork-R1V-38B
SmolVLMForConditionalGeneration	SmolVLM2	T + I	SmolVLM2-2.2B-Instruct
Step3VLForConditionalGeneration	Step3-VL	T + I⁺	stepfun-ai/step3
StepVLForConditionalGeneration	Step3-VL-10B	T + I⁺	stepfun-ai/Step3-VL-10B
TarsierForConditionalGeneration	Tarsier	T + I^E+	omni-search/Tarsier-7b, omni-search/Tarsier-34b
Tarsier2ForConditionalGeneration^{^}	Tarsier2	T + I^E+ + V^E+	omni-research/Tarsier2-Recap-7b, omni-research/Tarsier2-7b-0115
UltravoxModel	Ultravox	T + A^E+	fixie-ai/ultravox-v0_5-llama-3_2-1b
Emu3ForConditionalGeneration	Emu3	T + I	BAAI/Emu3-Chat-hf

Table 8 Multimodal models | Generative models | Transcription
Architecture	Model	HuggingFace model examples
CohereAsrForConditionalGeneration	Cohere-Transcribe	CohereLabs/cohere-transcribe-03-2026
FireRedASR2ForConditionalGeneration	FireRedASR2	allendou/FireRedASR2-LLM-vllm, etc.
FireRedLIDForConditionalGeneration	FireRedLID	PatchyTisa/FireRedLID-vllm, etc.
FunASRForConditionalGeneration	FunASR	allendou/Fun-ASR-Nano-2512-vllm, etc.
Gemma3nForConditionalGeneration	Gemma3n	google/gemma-3n-E2B-it, google/gemma-3n-E4B-it, etc.
GlmAsrForConditionalGeneration	GLM-ASR	zai-org/GLM-ASR-Nano-2512
GraniteSpeechForConditionalGeneration	Granite Speech	ibm-granite/granite-4.0-1b-speech, ibm-granite/granite-speech-3.3-2b, etc.
Qwen3ASRForConditionalGeneration	Qwen3-ASR	Qwen/Qwen3-ASR-1.7B, etc.
Qwen3OmniMoeThinkerForConditionalGeneration	Qwen3-Omni	Qwen/Qwen3-Omni-30B-A3B-Instruct, etc.
VoxtralForConditionalGeneration	Voxtral (Mistral format)	mistralai/Voxtral-Mini-3B-2507, mistralai/Voxtral-Small-24B-2507, etc.
WhisperForConditionalGeneration	Whisper	openai/whisper-small, openai/whisper-large-v3-turbo, etc.

Table 9 Multimodal models | Generative models | Real-time transcription
Architecture	Model	HuggingFace model examples
VoxtralRealtimeGeneration	Voxtral Realtime	mistralai/Voxtral-Mini-4B-Realtime-2602
Qwen3ASRRealtimeGeneration	Qwen3-ASR Realtime	Qwen/Qwen3-ASR-0.6B

Table 10 Multimodal models | Pooling models | Embedding
Architecture	Model	Inputs	HuggingFace model examples
CLIPModel	CLIP	T / I	openai/clip-vit-base-patch32, openai/clip-vit-large-patch14, etc.
LlamaNemotronVLModel	Llama Nemotron Embedding + SigLIP	T + I	nvidia/llama-nemotron-embed-vl-1b-v2
LlavaNextForConditionalGeneration^C	LLaVA-NeXT-based	T / I	royokong/e5-v
Phi3VForCausalLM^C	Phi-3-Vision-based	T + I	TIGER-Lab/VLM2Vec-Full
Qwen3VLForConditionalGeneration^C	Qwen3-VL	T + I + V	Qwen/Qwen3-VL-Embedding-2B, etc.
SiglipModel	SigLIP, SigLIP2	T / I	google/siglip-base-patch16-224, google/siglip2-base-patch16-224
ForConditionalGeneration^C, ForCausalLM^C, etc.	Generative models	*	N/A

Table 11 Multimodal models | Pooling models | Classification
Architecture	Model	Inputs	HuggingFace model examples
Qwen2_5_VLForSequenceClassification^C	Qwen2_5_VL-based	T + I^E+ + V^E+	muziyongshixin/Qwen2.5-VL-7B-for-VideoCls
ForConditionalGeneration^C, ForCausalLM^C, etc.	Generative models	*	N/A

Table 12 Multimodal models | Pooling models | Cross-encoder / Reranking
Architecture	Model	Inputs	HuggingFace model examples
JinaVLForSequenceClassification	JinaVL-based	T + I^E+	jinaai/jina-reranker-m0, etc.
LlamaNemotronVLForSequenceClassification	Llama Nemotron Reranker + SigLIP	T + I^E+	nvidia/llama-nemotron-rerank-vl-1b-v2
Qwen3VLForSequenceClassification	Qwen3-VL-Reranker	T + I^E+ + V^E+	Qwen/Qwen3-VL-Reranker-2B (see note), etc.

Table 13 Multimodal models | Pooling models | Token classification
Architecture	Model	Inputs	HuggingFace model examples
Qwen3ASRForcedAlignerForTokenClassification	Qwen3-ForcedAligner	T + A⁺	Qwen/Qwen3-ForcedAligner-0.6B (see note)

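Notes:

^C indicates that the model can be converted to the corresponding pooling task with --convert.
* indicates that the model's capabilities are the same as those of the original model.
Modality key: T (Text), I (Image), V (Video), A (Audio).
+ indicates that multiple modalities can be supplied together; / indicates that multiple modalities are supported but cannot be used at the same time.
^E indicates that pre-computed embeddings can be supplied as input for that modality.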
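To make the Inputs notation concrete, the following sketch reads one Inputs cell and reports which modalities it combines and which of them carry the ^E marker. It is only an illustration under stated assumptions: `parse_inputs` is a hypothetical helper, it treats " + " as the joiner within one combination and "/" as the separator between whole alternative combinations, and it ignores any other superscript marker. Cells that contain only "*" are not handled and raise an error.

```python
MODALITIES = {"T": "text", "I": "image", "V": "video", "A": "audio"}


def parse_inputs(spec):
    """Parse an Inputs cell into a list of alternative combinations.

    Each combination is a list of (modality, accepts_precomputed_embeddings)
    pairs, in the order they appear in the cell.
    """
    alternatives = []
    for alternative in spec.split("/"):
        combination = []
        for token in alternative.split(" + "):
            # Drop surrounding whitespace and optional parentheses, e.g. "(V^E+)".
            token = token.strip().strip("()")
            if not token:
                continue
            letter = token[0]
            if letter not in MODALITIES:
                raise ValueError(f"unknown modality in {spec!r}: {token!r}")
            combination.append((MODALITIES[letter], "^E" in token))
        alternatives.append(combination)
    return alternatives


print(parse_inputs("T + I^E+ + V^E+"))
# -> [[('text', False), ('image', True), ('video', True)]]
print(parse_inputs("T / I"))
# -> [[('text', False)], [('image', False)]]
```

Cells that mix the two separators without repeating the shared modality, such as "T + I⁺/ V⁺", are ambiguous under this reading and are better interpreted by hand.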

vLLM 0.17.1

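The following tables list the model architectures, model names, and examples that are compatible with this template. For usage details and caveats for each type of model in the compatibility lists, see the official vLLM documentation.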

Table 14 Text-only language models | Generative models | Text generation
Architecture	Model	HuggingFace model examples
LongcatFlashForCausalLM	LongCat-Flash	meituan-longcat/LongCat-Flash-Chat, meituan-longcat/LongCat-Flash-Chat-FP8
Zamba2ForCausalLM	Zamba2	Zyphra/Zamba2-7B-instruct, Zyphra/Zamba2-2.7B-instruct, Zyphra/Zamba2-1.2B-instruct, etc.
MiniMaxText01ForCausalLM	MiniMax-Text	MiniMaxAI/MiniMax-Text-01, etc.
MiniMaxM1ForCausalLM	MiniMax-Text	MiniMaxAI/MiniMax-M1-40k, MiniMaxAI/MiniMax-M1-80k, etc.
XverseForCausalLM	XVERSE	xverse/XVERSE-7B-Chat, xverse/XVERSE-13B-Chat, xverse/XVERSE-65B-Chat, etc.
TeleFLMForCausalLM	TeleFLM	CofeAI/FLM-2-52B-Instruct-2407, CofeAI/Tele-FLM, etc.
TeleChat2ForCausalLM	TeleChat2	Tele-AI/TeleChat2-3B, Tele-AI/TeleChat2-7B, Tele-AI/TeleChat2-35B, etc.
TeleChatForCausalLM	TeleChat	chuhac/TeleChat2-35B, etc.
Step1ForCausalLM	Step-Audio	stepfun-ai/Step-Audio-EditX, etc.
Step3p5ForCausalLM	Step-3.5-flash	stepfun-ai/step-3.5-flash, etc.
SolarForCausalLM	Solar Pro	upstage/solar-pro-preview-instruct, etc.
Starcoder2ForCausalLM	Starcoder2	bigcode/starcoder2-3b, bigcode/starcoder2-7b, bigcode/starcoder2-15b, etc.
StableLMEpochForCausalLM	StableLM Epoch	stabilityai/stablelm-zephyr-3b, etc.
StableLmForCausalLM	StableLM	stabilityai/stablelm-3b-4e1t, stabilityai/stablelm-base-alpha-7b-v2, etc.
SeedOssForCausalLM	SeedOss	ByteDance-Seed/Seed-OSS-36B-Instruct, etc.
RWForCausalLM	Falcon RW	tiiuae/falcon-40b, etc.
QWenLMHeadModel	Qwen	Qwen/Qwen-7B, Qwen/Qwen-7B-Chat, etc.
Qwen2MoeForCausalLM	Qwen2MoE	Qwen/Qwen1.5-MoE-A2.7B, Qwen/Qwen1.5-MoE-A2.7B-Chat, etc.
Qwen2ForCausalLM	QwQ, Qwen2	Qwen/QwQ-32B-Preview, Qwen/Qwen2-7B-Instruct, Qwen/Qwen2-7B, etc.
Qwen3ForCausalLM	Qwen3	Qwen/Qwen3-8B, etc.
Qwen3MoeForCausalLM	Qwen3MoE	Qwen/Qwen3-MoE-15B-A2B, etc.
Qwen3NextForCausalLM	Qwen3NextMoE	Qwen/Qwen3-Next-80B-A3B-Instruct, etc.
Plamo3ForCausalLM	PLaMo3	pfnet/plamo-3-nict-2b-base, pfnet/plamo-3-nict-8b-base, etc.
Plamo2ForCausalLM	PLaMo2	pfnet/plamo-2-1b, pfnet/plamo-2-8b, etc.
PersimmonForCausalLM	Persimmon	adept/persimmon-8b-base, adept/persimmon-8b-chat, etc.
PhiMoEForCausalLM	Phi-3.5-MoE	microsoft/Phi-3.5-MoE-instruct, etc.
PhiForCausalLM	Phi	microsoft/phi-1_5, microsoft/phi-2, etc.
Phi3ForCausalLM	Phi-4, Phi-3	microsoft/Phi-4, microsoft/Phi-3-mini-4k-instruct, microsoft/Phi-3-mini-128k-instruct, microsoft/Phi-3-medium-128k-instruct, etc.
PanguUltraMoEForCausalLM	openpangu-ultra-moe-718b-model	FreedomIntelligence/openPangu-Ultra-MoE-718B-V1.1
PanguProMoEV2ForCausalLM	openpangu-pro-moe-v2	-
PanguEmbeddedForCausalLM	openPangu-Embedded-7B	FreedomIntelligence/openPangu-Embedded-7B-V1.1
OuroForCausalLM	ouro	ByteDance/Ouro-1.4B, ByteDance/Ouro-2.6B, etc.
OrionForCausalLM	Orion	OrionStarAI/Orion-14B-Base, OrionStarAI/Orion-14B-Chat, etc.
OPTForCausalLM	OPT, OPT-IML	facebook/opt-66b, facebook/opt-iml-max-30b, etc.
OlmoForCausalLM	OLMo	allenai/OLMo-1B-hf, allenai/OLMo-7B-hf, etc.
OlmoeForCausalLM	OLMoE	allenai/OLMoE-1B-7B-0924, allenai/OLMoE-1B-7B-0924-Instruct, etc.
Olmo2ForCausalLM	OLMo2	allenai/OLMo2-7B-1124, etc.
Olmo3ForCausalLM	OLMo3	TBA
NemotronHForCausalLM	Nemotron-H	nvidia/Nemotron-H-8B-Base-8K, nvidia/Nemotron-H-47B-Base-8K, nvidia/Nemotron-H-56B-Base-8K, etc.
NemotronForCausalLM	Nemotron-3, Nemotron-4, Minitron	nvidia/Minitron-8B-Base, mgoin/Nemotron-4-340B-Base-hf-FP8, etc.
MPTForCausalLM	MPT, MPT-Instruct, MPT-Chat, MPT-StoryWriter	mosaicml/mpt-7b, mosaicml/mpt-7b-storywriter, mosaicml/mpt-30b, etc.
MixtralForCausalLM	Mixtral-8x7B, Mixtral-8x7B-Instruct	mistralai/Mixtral-8x7B-v0.1, mistralai/Mixtral-8x7B-Instruct-v0.1, mistral-community/Mixtral-8x22B-v0.1, etc.
MistralLarge3ForCausalLM	Mistral-Large-3-675B-Base-2512, Mistral-Large-3-675B-Instruct-2512	mistralai/Mistral-Large-3-675B-Base-2512, mistralai/Mistral-Large-3-675B-Instruct-2512, etc.
MistralForCausalLM	Ministral-3, Mistral, Mistral-Instruct	mistralai/Ministral-3-3B-Instruct-2512, mistralai/Mistral-7B-v0.1, mistralai/Mistral-7B-Instruct-v0.1, etc.
MiniMaxForCausalLM	MiniMax-Text	MiniMaxAI/MiniMax-Text-01-hf, etc.
MiniMaxM2ForCausalLM	MiniMax-M2, MiniMax-M2.1	MiniMaxAI/MiniMax-M2, etc.
MiniCPM3ForCausalLM	MiniCPM3	openbmb/MiniCPM3-4B , etc.
MiniCPMForCausalLM	MiniCPM	openbmb/MiniCPM-2B-sft-bf16, openbmb/MiniCPM-2B-dpo-bf16, openbmb/MiniCPM-S-1B-sft, etc.
MiMoV2FlashForCausalLM	MiMoV2Flash	XiaomiMiMo/MiMo-V2-Flash, etc.
MiMoForCausalLM	MiMo	XiaomiMiMo/MiMo-7B-RL, etc.
MambaForCausalLM	Mamba	state-spaces/mamba-130m-hf , state-spaces/mamba-790m-hf , state-spaces/mamba-2.8b-hf , etc.
Mamba2ForCausalLM	Mamba2	mistralai/Mamba-Codestral-7B-v0.1, etc.
LlamaForCausalLM	Llama 3.1, Llama 3, Llama 2, LLaMA, Yi	meta-llama/Meta-Llama-3.1-405B-Instruct , meta-llama/Meta-Llama-3.1-70B , meta-llama/Meta-Llama-3-70B-Instruct , meta-llama/Llama-2-70b-hf , 01-ai/Yi-34B , etc.
Lfm2MoeForCausalLM	LFM2MoE	LiquidAI/LFM2-8B-A1B-preview, etc.
Lfm2ForCausalLM	LFM2	LiquidAI/LFM2-1.2B, LiquidAI/LFM2-700M, LiquidAI/LFM2-350M, etc.
KimiLinearForCausalLM	Kimi-Linear-48B-A3B-Base, Kimi-Linear-48B-A3B-Instruct	moonshotai/Kimi-Linear-48B-A3B-Base, moonshotai/Kimi-Linear-48B-A3B-Instruct
JambaForCausalLM	Jamba	ai21labs/AI21-Jamba-1.5-Large , ai21labs/AI21-Jamba-1.5-Mini , ai21labs/Jamba-v0.1 , etc.
Jais2ForCausalLM	Jais2	inceptionai/Jais-2-8B-Chat, inceptionai/Jais-2-70B-Chat, etc.
JAISLMHeadModel	Jais	inceptionai/jais-13b , inceptionai/jais-13b-chat , inceptionai/jais-30b-v3 , inceptionai/jais-30b-chat-v3 , etc.
IQuestCoderForCausalLM	IQuestCoderV1	IQuestLab/IQuest-Coder-V1-40B-Instruct, etc.
IQuestLoopCoderForCausalLM	IQuestLoopCoderV1	IQuestLab/IQuest-Coder-V1-40B-Loop-Instruct, etc.
InternLM3ForCausalLM	InternLM3	internlm/internlm3-8b-instruct , etc.
InternLM2ForCausalLM	InternLM2	internlm/internlm2-7b , internlm/internlm2-chat-7b , etc.
InternLMForCausalLM	InternLM	internlm/internlm-7b, internlm/internlm-chat-7b, etc.
HunYuanMoEV1ForCausalLM	Hunyuan-A13B	tencent/Hunyuan-A13B-Instruct, tencent/Hunyuan-A13B-Pretrain, tencent/Hunyuan-A13B-Instruct-FP8, etc.
HunYuanDenseV1ForCausalLM	Hunyuan Dense	tencent/Hunyuan-7B-Instruct-0124
Grok1ForCausalLM	Grok2	xai-org/grok-2
Grok1ModelForCausalLM	Grok1	hpcai-tech/grok-1
GritLM	GritLM	parasail-ai/GritLM-7B-vllm
GraniteMoeSharedForCausalLM	Granite MoE Shared	ibm-research/moe-7b-1b-active-shared-experts (test model)
GraniteMoeHybridForCausalLM	Granite 4.0 MoE Hybrid	ibm-granite/granite-4.0-tiny-preview, etc.
GraniteMoeForCausalLM	Granite 3.0 MoE, PowerMoE	ibm-granite/granite-3.0-1b-a400m-base , ibm-granite/granite-3.0-3b-a800m-instruct , ibm/PowerMoE-3b , etc.
GraniteForCausalLM	Granite 3.0, Granite 3.1, PowerLM	ibm-granite/granite-3.0-2b-base , ibm-granite/granite-3.1-8b-instruct , ibm/PowerLM-3b , etc.
GptOssForCausalLM	GPT-OSS	openai/gpt-oss-120b, openai/gpt-oss-20b
GPTNeoXForCausalLM	GPT-NeoX, Pythia, OpenAssistant, Dolly V2, StableLM	EleutherAI/gpt-neox-20b , EleutherAI/pythia-12b , OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5 , databricks/dolly-v2-12b , stabilityai/stablelm-tuned-alpha-7b , etc.
GPTJForCausalLM	GPT-J	EleutherAI/gpt-j-6b , nomic-ai/gpt4all-j , etc.
GPTBigCodeForCausalLM	StarCoder, SantaCoder, WizardCoder	bigcode/starcoder , bigcode/gpt_bigcode-santacoder , WizardLM/WizardCoder-15B-V1.0 , etc.
GPT2LMHeadModel	GPT-2	gpt2 , gpt2-xl , etc.
Glm4MoeLiteForCausalLM	GLM-4.7-Flash	zai-org/GLM-4.7-Flash, etc.
Glm4MoeForCausalLM	GLM-4.5, GLM-4.6, GLM-4.7	zai-org/GLM-4.5, etc.
Glm4ForCausalLM	GLM-4-0414	THUDM/GLM-4-32B-0414, etc.
GlmForCausalLM	GLM-4	THUDM/glm-4-9b-chat-hf , etc.
Gemma3nForCausalLM	Gemma 3n	google/gemma-3n-E2B-it, google/gemma-3n-E4B-it, etc.
Gemma3ForCausalLM	Gemma 3	google/gemma-3-1b-it, etc.
Gemma2ForCausalLM	Gemma 2	google/gemma-2-9b, google/gemma-2-27b, etc.
GemmaForCausalLM	Gemma	google/gemma-2b , google/gemma-7b , etc.
FlexOlmoForCausalLM	FlexOlmo	allenai/FlexOlmo-7x7B-1T, allenai/FlexOlmo-7x7B-1T-RT, etc.
FalconH1ForCausalLM	Falcon-H1	tiiuae/Falcon-H1-34B-Base, tiiuae/Falcon-H1-34B-Instruct, etc.
FalconMambaForCausalLM	FalconMamba	tiiuae/falcon-mamba-7b , tiiuae/falcon-mamba-7b-instruct , etc.
FalconForCausalLM	Falcon	tiiuae/falcon-7b , tiiuae/falcon-40b , tiiuae/falcon-rw-7b , etc.
Fairseq2LlamaForCausalLM	Llama (fairseq2 format)	mgleize/fairseq2-dummy-Llama-3.2-1B, etc.
Exaone4ForCausalLM	EXAONE-4	LGAI-EXAONE/EXAONE-4.0-32B, etc.
ExaoneMoEForCausalLM	K-EXAONE	LGAI-EXAONE/K-EXAONE-236B-A23B, etc.
ExaoneForCausalLM	EXAONE-3	LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct , etc.
Ernie4_5_MoeForCausalLM	Ernie4.5MoE	baidu/ERNIE-4.5-21B-A3B-PT, baidu/ERNIE-4.5-300B-A47B-PT, etc.
Ernie4_5ForCausalLM	Ernie4.5	baidu/ERNIE-4.5-0.3B-PT, etc.
DotsOCRForCausalLM	dots_ocr	rednote-hilab/dots.ocr
Dots1ForCausalLM	dots.llm1	rednote-hilab/dots.llm1.base, rednote-hilab/dots.llm1.inst, etc.
DeepseekV3ForCausalLM	DeepSeek-V3	deepseek-ai/DeepSeek-V3-Base, deepseek-ai/DeepSeek-V3, etc.
DeepseekV2ForCausalLM	DeepSeek-V2	deepseek-ai/DeepSeek-V2, deepseek-ai/DeepSeek-V2-Chat, etc.
DeepseekForCausalLM	DeepSeek	deepseek-ai/deepseek-llm-67b-base, deepseek-ai/deepseek-llm-7b-chat, etc.
DeciLMForCausalLM	DeciLM	Deci/DeciLM-7B , Deci/DeciLM-7B-instruct , etc.
DbrxForCausalLM	DBRX	databricks/dbrx-base , databricks/dbrx-instruct , etc.
CwmForCausalLM	CWM	facebook/cwm, etc.
CohereForCausalLM , Cohere2ForCausalLM	Command-R, Command-A	CohereForAI/c4ai-command-r-v01 , CohereForAI/c4ai-command-r7b-12-2024 , etc.
ChatGLMModel, ChatGLMForConditionalGeneration	ChatGLM	THUDM/chatglm2-6b , THUDM/chatglm3-6b , etc.
BloomForCausalLM	BLOOM, BLOOMZ, BLOOMChat	bigscience/bloom , bigscience/bloomz , etc.
BambaForCausalLM	Bamba	ibm-ai-platform/Bamba-9B-fp8, ibm-ai-platform/Bamba-9B
BailingMoeV2ForCausalLM	Ling	inclusionAI/Ling-mini-2.0, etc.
BailingMoeForCausalLM	Ling	inclusionAI/Ling-lite-1.5, inclusionAI/Ling-plus, etc.
BaiChuanForCausalLM	Baichuan2, Baichuan	baichuan-inc/Baichuan2-13B-Chat , baichuan-inc/Baichuan-7B , etc.
AXK1ForCausalLM	A.X-K1	skt/A.X-K1, etc.
ArcticForCausalLM	Arctic	Snowflake/snowflake-arctic-base , Snowflake/snowflake-arctic-instruct , etc.
ArceeForCausalLM	Arcee (AFM)	arcee-ai/AFM-4.5B-Base, etc.
AquilaForCausalLM	Aquila, Aquila2	BAAI/Aquila-7B , BAAI/AquilaChat-7B , etc.
ApertusForCausalLM	Apertus	swiss-ai/Apertus-8B-2509, swiss-ai/Apertus-70B-Instruct-2509, etc.
AfmoeForCausalLM	Afmoe	TBA
BailingMoeV2_5ForCausalLM	Ling-V2.5 / Ring-V2.5	inclusionAI/Ling-mini-2.5, inclusionAI/Ring-mini-2.5, etc.
SmolLM3ForCausalLM	SmolLM3	HuggingFaceTB/SmolLM3-3B, etc.

Table 15 Text-only language models | Pooling models | Embedding
Architecture	Model	Example HuggingFace models
BertModel^C	BERT-based	BAAI/bge-base-en-v1.5 , etc.
BertSpladeSparseEmbeddingModel	SPLADE	naver/splade-v3
Gemma2Model^C	Gemma 2-based	BAAI/bge-multilingual-gemma2 , etc.
Gemma3TextModel^C	Gemma 3-based	google/embeddinggemma-300m, etc.
GritLM	GritLM	parasail-ai/GritLM-7B-vllm
GteModel^C	Arctic-Embed-2.0-M	Snowflake/snowflake-arctic-embed-m-v2.0
GteNewModel^C	mGTE-TRM	Alibaba-NLP/gte-multilingual-base, etc.
ModernBertModel^C	ModernBERT-based	Alibaba-NLP/gte-modernbert-base, etc.
NomicBertModel^C	Nomic BERT	nomic-ai/nomic-embed-text-v1, nomic-ai/nomic-embed-text-v2-moe, Snowflake/snowflake-arctic-embed-m-long, etc.
LlamaBidirectionalModel^C	Llama-based with bidirectional attention	nvidia/llama-nemotron-embed-1b-v2, etc.
LlamaModel^C, LlamaForCausalLM^C, MistralModel^C, etc.	Llama-based	intfloat/e5-mistral-7b-instruct , etc.
Qwen2Model^C, Qwen2ForCausalLM^C	Qwen2-based	ssmits/Qwen2-7B-Instruct-embed-base (see note), Alibaba-NLP/gte-Qwen2-7B-instruct (see note), etc.
Qwen3Model^C, Qwen3ForCausalLM^C	Qwen3-based	Qwen/Qwen3-Embedding-0.6B, etc.
RobertaModel, RobertaForMaskedLM	RoBERTa-based	sentence-transformers/all-roberta-large-v1, etc.
VoyageQwen3BidirectionalEmbedModel^C	Voyage Qwen3-based with bidirectional attention	voyageai/voyage-4-nano, etc.
*Model^C, *ForCausalLM^C, etc.	Generative models	N/A

Note:

^C indicates that the model can be converted into an embedding model with --convert embed.
* indicates that feature support is the same as for the original model.
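
As a rough illustration of the ^C note, the sketch below assembles a `vllm serve` command line that passes --convert for a generative checkpoint. It assumes vLLM is launched directly from its CLI rather than through a system inference template. `serve_command` is a hypothetical helper, and the accepted values are limited to the three conversions named in the notes of this chapter.

```python
import shlex
from typing import List, Optional

# Conversions named in the notes of this chapter (embed, reward, classify).
CONVERSIONS = ("embed", "reward", "classify")


def serve_command(model: str, convert: Optional[str] = None) -> List[str]:
    """Build an argv list for `vllm serve`, optionally adding --convert."""
    argv = ["vllm", "serve", model]
    if convert is not None:
        if convert not in CONVERSIONS:
            raise ValueError(f"unsupported conversion: {convert!r}")
        argv += ["--convert", convert]
    return argv


if __name__ == "__main__":
    # Example model taken from a ^C row of the embedding table.
    print(shlex.join(serve_command("intfloat/e5-mistral-7b-instruct", "embed")))
```

Returning a list rather than a single string keeps the arguments safe to hand to `subprocess.run` without shell quoting.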

Table 16 Text-only language models | Pooling models | Reward
Architecture	Model	Example HuggingFace models
InternLM2ForRewardModel	InternLM2-based	internlm/internlm2-1_8b-reward , internlm/internlm2-7b-reward , etc.
LlamaForCausalLM^C	Llama-based	peiyi9979/math-shepherd-mistral-7b-prm , etc.
Qwen2ForRewardModel	Qwen2-based	Qwen/Qwen2.5-Math-RM-72B , etc.
Qwen2ForProcessRewardModel	Qwen2-based	Qwen/Qwen2.5-Math-PRM-7B , Qwen/Qwen2.5-Math-PRM-72B , etc.

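Note:

^C indicates that the model can be converted into a reward model with --convert reward.
* indicates that feature support is the same as for the original model.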

Table 17 Text-only language models | Pooling models | Classification (--task classify)
Architecture	Model	Example HuggingFace models
JambaForSequenceClassification	Jamba	ai21labs/Jamba-tiny-reward-dev , etc.
GPT2ForSequenceClassification	GPT2	nie3e/sentiment-polish-gpt2-small
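*Model^C, *ForCausalLM^C, etc.	Generative models	N/A

Note:

^C indicates that the model can be converted into a classification model with --convert classify.
* indicates that feature support is the same as for the original model.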

Table 18 Text-only language models | Pooling models | Cross-encoder / Reranker
Architecture	Model	Example HuggingFace models
BertForSequenceClassification	BERT-based	cross-encoder/ms-marco-MiniLM-L-6-v2 , etc.
GemmaForSequenceClassification	Gemma-based	BAAI/bge-reranker-v2-gemma, etc.
GteNewForSequenceClassification	mGTE-TRM	Alibaba-NLP/gte-multilingual-reranker-base, etc.
LlamaBidirectionalForSequenceClassification^C	Llama-based with bidirectional attention	nvidia/llama-nemotron-rerank-1b-v2, etc.
Qwen2ForSequenceClassification^C	Qwen2-based	mixedbread-ai/mxbai-rerank-base-v2, etc.
Qwen3ForSequenceClassification^C	Qwen3-based	tomaarsen/Qwen3-Reranker-0.6B-seq-cls, Qwen/Qwen3-Reranker-0.6B, etc.
RobertaForSequenceClassification	RoBERTa-based	cross-encoder/quora-roberta-base , etc.
XLMRobertaForSequenceClassification	XLM-RoBERTa-based	BAAI/bge-reranker-v2-m3 , etc.
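*Model^C, *ForCausalLM^C, etc.	Generative models	N/A

Note:

^C indicates that the model can be converted into a classification model with --convert classify.
* indicates that feature support is the same as for the original model.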

Table 19 Text-only language models | Pooling models | Token classification
Architecture	Model	Example HuggingFace models
BertForTokenClassification	bert-based	boltuix/NeuroBERT-NER, etc.
ModernBertForTokenClassification	ModernBERT-based	disham993/electrical-ner-ModernBERT-base
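
The three-column tables above can be turned into a lookup keyed by architecture name. The following is a minimal sketch, assuming the data rows are available as tab-separated text exactly as laid out here (architecture, model, examples) and that the header row is not passed in. `parse_rows` and the field names are hypothetical.

```python
from typing import Dict


def parse_rows(table_text: str) -> Dict[str, dict]:
    """Index tab-separated data rows by architecture name.

    A trailing ^C on an architecture is stripped and recorded as
    convertible=True; rows with fewer than three cells are skipped.
    """
    index: Dict[str, dict] = {}
    for line in table_text.splitlines():
        cells = [cell.strip() for cell in line.split("\t")]
        if len(cells) < 3:
            continue
        architectures, model, examples = cells[0], cells[1], cells[2]
        # One row may list several architectures separated by commas.
        for arch in architectures.split(","):
            arch = arch.strip()
            if not arch or arch == "etc.":
                continue
            convertible = arch.endswith("^C")
            name = arch[:-2] if convertible else arch
            index[name] = {
                "model": model,
                "examples": examples,
                "convertible": convertible,
            }
    return index


if __name__ == "__main__":
    rows = (
        "BertModel^C\tBERT-based\tBAAI/bge-base-en-v1.5, etc.\n"
        "RobertaModel, RobertaForMaskedLM\tRoBERTa-based\t"
        "sentence-transformers/all-roberta-large-v1, etc."
    )
    for name, row in parse_rows(rows).items():
        print(name, row["convertible"])
```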

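Table 20 Multimodal models | Generative models | Text generation
Architecture	Model	Inputs	Example HuggingFace models	Notes
AriaForConditionalGeneration	Aria	T + I⁺	rhymes-ai/Aria	Modalities: T = Text, I = Image, V = Video, A = Audio. Symbols: + means the modalities can be supplied together; for example, T + I accepts text only, image only, or text plus image. / means several modalities are supported but cannot be combined; for example, T / I accepts text only or image only, but not text plus image. ^E means pre-computed embeddings can be supplied for that modality. ⁺ means multiple items of that modality can be supplied per text prompt (written as + when it follows ^E, as in I^E+).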
AudioFlamingo3ForConditionalGeneration	AudioFlamingo3	T + A⁺	nvidia/audio-flamingo-3-hf, nvidia/music-flamingo-hf
AyaVisionForConditionalGeneration	Aya Vision	T + I⁺	CohereForAI/aya-vision-8b, CohereForAI/aya-vision-32b, etc.
BagelForConditionalGeneration	BAGEL	T + I⁺	ByteDance-Seed/BAGEL-7B-MoT
BeeForConditionalGeneration	Bee-8B	T + I^E+
Blip2ForConditionalGeneration	BLIP-2	T + I^E	Salesforce/blip2-opt-2.7b, Salesforce/blip2-opt-6.7b, etc.
ChameleonForConditionalGeneration	Chameleon	T + I	facebook/chameleon-7b, etc.
Cohere2VisionForConditionalGeneration	Command A Vision	T + I⁺	CohereLabs/command-a-vision-07-2025, etc.
DeepseekVLV2ForCausalLM	DeepSeek-VL2	T + I⁺	deepseek-ai/deepseek-vl2-tiny, deepseek-ai/deepseek-vl2-small, deepseek-ai/deepseek-vl2, etc.
DeepseekOCRForCausalLM	DeepSeek-OCR	T + I⁺	deepseek-ai/DeepSeek-OCR, etc.
DeepseekOCR2ForCausalLM	DeepSeek-OCR-2	T + I⁺	deepseek-ai/DeepSeek-OCR-2, etc.
Eagle2_5_VLForConditionalGeneration	Eagle2.5-VL	T + I^E+	nvidia/Eagle2.5-8B, etc.
Ernie4_5_VLMoeForConditionalGeneration	Ernie4.5-VL	T + I⁺ / V⁺	baidu/ERNIE-4.5-VL-28B-A3B-PT, baidu/ERNIE-4.5-VL-424B-A47B-PT
FuyuForCausalLM	Fuyu	T + I	adept/fuyu-8b, etc.
Gemma3ForConditionalGeneration	Gemma 3	T + I⁺	google/gemma-3-4b-it, google/gemma-3-27b-it, etc.
Gemma3nForConditionalGeneration	Gemma 3n	T + I + A	google/gemma-3n-E2B-it, google/gemma-3n-E4B-it, etc.
GLM4VForCausalLM^	GLM-4V	T + I	zai-org/glm-4v-9b, zai-org/cogagent-9b-20241220, etc.
Glm4vForConditionalGeneration	GLM-4.1V-Thinking	T + I^E+ + V^E+	zai-org/GLM-4.1V-9B-Thinking, etc.
Glm4vMoeForConditionalGeneration	GLM-4.5V	T + I^E+ + V^E+	zai-org/GLM-4.5V, etc.
GlmOcrForConditionalGeneration	GLM-OCR	T + I^E+	zai-org/GLM-OCR, etc.
GraniteSpeechForConditionalGeneration	Granite Speech	T + A	ibm-granite/granite-speech-3.3-8b
H2OVLChatModel	H2OVL	T + I^E+	h2oai/h2ovl-mississippi-800m, h2oai/h2ovl-mississippi-2b, etc.
HCXVisionForCausalLM	HyperCLOVAX-SEED-Vision-Instruct-3B	T + I⁺ + V⁺	naver-hyperclovax/HyperCLOVAX-SEED-Vision-Instruct-3B
HunYuanVLForConditionalGeneration	HunyuanOCR	T + I^E+	tencent/HunyuanOCR, etc.
Idefics3ForConditionalGeneration	Idefics3	T + I	HuggingFaceM4/Idefics3-8B-Llama3, etc.
InternS1ForConditionalGeneration	Intern-S1	T + I^E+ + V^E+	internlm/Intern-S1, etc.
InternS1ProForConditionalGeneration	Intern-S1-Pro	T + I^E+ + V^E+	internlm/Intern-S1-Pro, etc.
InternVLChatModel	InternVL 3.5, InternVL 3.0, InternVL 2.5, Mono-InternVL, InternVL 2.0	T + I^E+ + (V^E+)	OpenGVLab/InternVL3_5-14B, OpenGVLab/InternVL3-9B, OpenGVLab/InternVideo2_5_Chat_8B, OpenGVLab/InternVL2_5-4B, OpenGVLab/Mono-InternVL-2B, OpenGVLab/InternVL2-4B, etc.
InternVLForConditionalGeneration	InternVL 3.0 (HF format)	T + I^E+ + V^E+	OpenGVLab/InternVL3-1B-hf, etc.
IsaacForConditionalGeneration	Isaac	T + I⁺	PerceptronAI/Isaac-0.1
KananaVForConditionalGeneration	Kanana-V	T + I⁺	kakaocorp/kanana-1.5-v-3b-instruct, etc.
KeyeForConditionalGeneration	Keye-VL-8B-Preview	T + I^E+ + V^E+	Kwai-Keye/Keye-VL-8B-Preview
KeyeVL1_5ForConditionalGeneration	Keye-VL-1_5-8B	T + I^E+ + V^E+	Kwai-Keye/Keye-VL-1_5-8B
KimiK25ForConditionalGeneration	Kimi-K2.5	T + I⁺	moonshotai/Kimi-K2.5
KimiVLForConditionalGeneration	Kimi-VL-A3B-Instruct, Kimi-VL-A3B-Thinking	T + I⁺	moonshotai/Kimi-VL-A3B-Instruct, moonshotai/Kimi-VL-A3B-Thinking
LightOnOCRForConditionalGeneration	LightOnOCR-1B	T + I⁺	lightonai/LightOnOCR-1B, etc.
Lfm2VlForConditionalGeneration	LFM2-VL	T + I⁺	LiquidAI/LFM2-VL-450M, LiquidAI/LFM2-VL-3B, LiquidAI/LFM2-VL-8B-A1B, etc.
Llama4ForConditionalGeneration	Llama 4	T + I⁺	meta-llama/Llama-4-Scout-17B-16E-Instruct, meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8, meta-llama/Llama-4-Maverick-17B-128E-Instruct, etc.
Llama_Nemotron_Nano_VL	Llama Nemotron Nano VL	T + I^E+	nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1
LlavaForConditionalGeneration	LLaVA-1.5, Pixtral (HF Transformers)	T + I^E+	llava-hf/llava-1.5-7b-hf, TIGER-Lab/Mantis-8B-siglip-llama3 (see note), etc.
LlavaNextForConditionalGeneration	LLaVA-NeXT	T + I^E+	llava-hf/llava-v1.6-mistral-7b-hf, llava-hf/llava-v1.6-vicuna-7b-hf, etc.
LlavaNextVideoForConditionalGeneration	LLaVA-NeXT-Video	T + V	llava-hf/LLaVA-NeXT-Video-7B-hf, etc.
LlavaOnevisionForConditionalGeneration	LLaVA-Onevision	T + I⁺ + V⁺	llava-hf/llava-onevision-qwen2-7b-ov-hf, llava-hf/llava-onevision-qwen2-0.5b-ov-hf, etc.
MiDashengLMModel	MiDashengLM	T + A⁺	mispeech/midashenglm-7b
MiniCPMO	MiniCPM-O	T + I^E+ + V^E+ + A^E+	openbmb/MiniCPM-o-2_6, etc.
MiniCPMV	MiniCPM-V	T + I^E+ + V^E+	openbmb/MiniCPM-V-2 (see note), openbmb/MiniCPM-Llama3-V-2_5, openbmb/MiniCPM-V-2_6, etc.
MiniMaxVL01ForConditionalGeneration	MiniMax-VL	T + I^E+	MiniMaxAI/MiniMax-VL-01, etc.
Mistral3ForConditionalGeneration	Mistral3 (HF Transformers)	T + I⁺	mistralai/Mistral-Small-3.1-24B-Instruct-2503, etc.
MolmoForCausalLM	Molmo	T + I⁺	allenai/Molmo-7B-D-0924, allenai/Molmo-72B-0924, etc.
Molmo2ForConditionalGeneration	Molmo2	T + I⁺ / V	allenai/Molmo2-4B, allenai/Molmo2-8B, allenai/Molmo2-O-7B
NVLM_D_Model	NVLM-D 1.0	T + I^E+	nvidia/NVLM-D-72B, etc.
OpenCUAForConditionalGeneration	OpenCUA-7B	T + I^E+	xlangai/OpenCUA-7B
OpenPanguVLForConditionalGeneration	openpangu-VL	T + I^E+ + V^E+	FreedomIntelligence/openPangu-VL-7B
Ovis	Ovis2, Ovis1.6	T + I⁺	AIDC-AI/Ovis2-1B, AIDC-AI/Ovis1.6-Llama3.2-3B, etc.
Ovis2_5	Ovis2.5	T + I⁺ + V	AIDC-AI/Ovis2.5-9B, etc.
Ovis2_6ForCausalLM	Ovis2.6	T + I⁺ + V	AIDC-AI/Ovis2.6-2B, etc.
Ovis2_6_MoeForCausalLM	Ovis2.6	T + I⁺ + V	AIDC-AI/Ovis2.6-30B-A3B, etc.
PaddleOCRVLForConditionalGeneration	Paddle-OCR	T + I⁺	PaddlePaddle/PaddleOCR-VL, etc.
PaliGemmaForConditionalGeneration	PaliGemma, PaliGemma 2	T + I^E	google/paligemma-3b-pt-224, google/paligemma-3b-mix-224, google/paligemma2-3b-ft-docci-448, etc.
Phi3VForCausalLM	Phi-3-Vision, Phi-3.5-Vision	T + I^E+	microsoft/Phi-3-vision-128k-instruct, microsoft/Phi-3.5-vision-instruct, etc.
Phi4MMForCausalLM	Phi-4-multimodal	T + I⁺ / T + A⁺ / I⁺ + A⁺	microsoft/Phi-4-multimodal-instruct, etc.
PixtralForConditionalGeneration	Ministral 3 (Mistral format), Mistral 3 (Mistral format), Mistral Large 3 (Mistral format), Pixtral (Mistral format)	T + I⁺	mistralai/Pixtral-12B-2409, mistral-community/pixtral-12b (see note), etc.
QwenVLForConditionalGeneration	Qwen-VL	T + I^E+	Qwen/Qwen-VL, Qwen/Qwen-VL-Chat, etc.
Qwen2AudioForConditionalGeneration	Qwen2-Audio	T + A⁺	Qwen/Qwen2-Audio-7B-Instruct
Qwen2VLForConditionalGeneration	QVQ, Qwen2-VL	T + I^E+ + V^E+	Qwen/QVQ-72B-Preview, Qwen/Qwen2-VL-7B-Instruct, Qwen/Qwen2-VL-72B-Instruct, etc.
Qwen2_5_VLForConditionalGeneration	Qwen2.5-VL	T + I^E+ + V^E+	Qwen/Qwen2.5-VL-3B-Instruct, Qwen/Qwen2.5-VL-72B-Instruct, etc.
Qwen2_5OmniThinkerForConditionalGeneration	Qwen2.5-Omni	T + I^E+ + V^E+ + A⁺	Qwen/Qwen2.5-Omni-7B
Qwen3VLForConditionalGeneration	Qwen3-VL	T + I^E+ + V^E+	Qwen/Qwen3-VL-4B-Instruct, etc.
Qwen3VLMoeForConditionalGeneration	Qwen3-VL-MOE	T + I^E+ + V^E+	Qwen/Qwen3-VL-30B-A3B-Instruct, etc.
Qwen3OmniMoeThinkerForConditionalGeneration	Qwen3-Omni	T + I^E+ + V^E+ + A⁺	Qwen/Qwen3-Omni-30B-A3B-Instruct, Qwen/Qwen3-Omni-30B-A3B-Thinking
Qwen3_5ForConditionalGeneration	Qwen3.5	T + I^E+ + V^E+	Qwen/Qwen3.5-9B-Instruct, etc.
Qwen3_5MoeForConditionalGeneration	Qwen3.5-MOE	T + I^E+ + V^E+	Qwen/Qwen3.5-35B-A3B-Instruct, etc.
RForConditionalGeneration	R-VL-4B	T + I^E+	YannQi/R-4B
SkyworkR1VChatModel	Skywork-R1V-38B	T + I	Skywork/Skywork-R1V-38B
SmolVLMForConditionalGeneration	SmolVLM2	T + I	HuggingFaceTB/SmolVLM2-2.2B-Instruct
Step3VLForConditionalGeneration	Step3-VL	T + I⁺	stepfun-ai/step3
StepVLForConditionalGeneration	Step3-VL-10B	T + I⁺	stepfun-ai/Step3-VL-10B
TarsierForConditionalGeneration	Tarsier	T + I^E+	omni-research/Tarsier-7b, omni-research/Tarsier-34b
Tarsier2ForConditionalGeneration^	Tarsier2	T + I^E+ + V^E+	omni-research/Tarsier2-Recap-7b, omni-research/Tarsier2-7b-0115
UltravoxModel	Ultravox	T + A^E+	fixie-ai/ultravox-v0_5-llama-3_2-1b
Emu3ForConditionalGeneration	Emu3	T + I⁺	BAAI/Emu3-Chat
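
The Inputs column can also be read programmatically. The sketch below is one possible parser for cells such as T + I^E+ + V^E+, assuming operators are separated from modalities by spaces as in the table and that / splits the cell into alternatives at the top level. That precedence is an assumption, not something this chapter defines. `Modality` and `parse_inputs` are hypothetical names, and parentheses are simply ignored.

```python
import re
from dataclasses import dataclass
from typing import List

NAMES = {"T": "text", "I": "image", "V": "video", "A": "audio"}
# A modality letter, an optional ^E marker, then an optional + or ⁺ marker.
TOKEN = re.compile(r"([TIVA])(\^E)?([+⁺])?")


@dataclass(frozen=True)
class Modality:
    name: str
    embeddings: bool  # ^E: pre-computed embeddings accepted
    multiple: bool    # + or ⁺: several items per text prompt


def parse_inputs(cell: str) -> List[List[Modality]]:
    """Return one list of combinable modalities per '/'-separated alternative."""
    alternatives = []
    cleaned = cell.replace("(", " ").replace(")", " ")
    for part in cleaned.split("/"):
        group = []
        for token in part.split():
            if token == "+":  # a standalone + joins two modalities
                continue
            match = TOKEN.fullmatch(token)
            if match is None:
                raise ValueError(f"unrecognised token: {token!r}")
            letter, emb, multi = match.groups()
            group.append(Modality(NAMES[letter], emb is not None, multi is not None))
        alternatives.append(group)
    return alternatives


if __name__ == "__main__":
    for group in parse_inputs("T + I^E+ + V^E+"):
        print([(m.name, m.embeddings, m.multiple) for m in group])
```

Splitting on whitespace first is what lets the parser tell a joining + apart from the multiple-items marker attached to a modality.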

Table 21 Multimodal models | Generative models | Transcription
Architecture	Model	Example HuggingFace models
FireRedASR2ForConditionalGeneration	FireRedASR2	allendou/FireRedASR2-LLM-vllm, etc.
FunASRForConditionalGeneration	FunASR	allendou/Fun-ASR-Nano-2512-vllm, etc.
Gemma3nForConditionalGeneration	Gemma3n	google/gemma-3n-E2B-it, google/gemma-3n-E4B-it, etc.
GlmAsrForConditionalGeneration	GLM-ASR	zai-org/GLM-ASR-Nano-2512
GraniteSpeechForConditionalGeneration	Granite Speech	ibm-granite/granite-speech-3.3-2b, ibm-granite/granite-speech-3.3-8b, etc.
Qwen3ASRForConditionalGeneration	Qwen3-ASR	Qwen/Qwen3-ASR-1.7B, etc.
Qwen3OmniMoeThinkerForConditionalGeneration	Qwen3-Omni	Qwen/Qwen3-Omni-30B-A3B-Instruct, etc.
VoxtralForConditionalGeneration	Voxtral (Mistral format)	mistralai/Voxtral-Mini-3B-2507, mistralai/Voxtral-Small-24B-2507, etc.
WhisperForConditionalGeneration	Whisper	openai/whisper-small, openai/whisper-large-v3-turbo, etc.

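Table 22 Multimodal models | Pooling models | Embedding
Architecture	Model	Inputs	Example HuggingFace models	Notes
CLIPModel	CLIP	T / I	openai/clip-vit-base-patch32, openai/clip-vit-large-patch14, etc.	Modalities: T = Text, I = Image, V = Video, A = Audio. Symbols: + means the modalities can be supplied together; for example, T + I accepts text only, image only, or text plus image. / means several modalities are supported but cannot be combined; for example, T / I accepts text only or image only, but not text plus image.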
ColModernVBertForRetrieval	ColModernVBERT	T / I	ModernVBERT/colmodernvbert-merged
LlamaNemotronVLModel	Llama Nemotron Embedding + SigLIP	T + I	nvidia/llama-nemotron-embed-vl-1b-v2
LlavaNextForConditionalGeneration^C	LLaVA-NeXT-based	T / I	royokong/e5-v
Phi3VForCausalLM^C	Phi-3-Vision-based	T + I	TIGER-Lab/VLM2Vec-Full
Qwen3VLForConditionalGeneration^C	Qwen3-VL	T + I + V	Qwen/Qwen3-VL-Embedding-2B, etc.
SiglipModel	SigLIP, SigLIP2	T / I	google/siglip-base-patch16-224, google/siglip2-base-patch16-224
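*ForConditionalGeneration^C, *ForCausalLM^C, etc.	Generative models	/	N/A

Note:

^C indicates that the model can be converted into an embedding model with --convert embed.
* indicates that feature support is the same as for the original model.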

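Table 23 Multimodal models | Pooling models | Cross-encoder / Reranker
Architecture	Model	Inputs	Example HuggingFace models	Notes
JinaVLForSequenceClassification	JinaVL-based	T + I^E+	jinaai/jina-reranker-m0, etc.	Modalities: T = Text, I = Image, V = Video, A = Audio. Symbols: + means the modalities can be supplied together; for example, T + I accepts text only, image only, or text plus image. / means several modalities are supported but cannot be combined; for example, T / I accepts text only or image only, but not text plus image.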
LlamaNemotronVLForSequenceClassification	Llama Nemotron Reranker + SigLIP	T + I^E+	nvidia/llama-nemotron-rerank-vl-1b-v2
Qwen3VLForSequenceClassification	Qwen3-VL-Reranker	T + I^E+ + V^E+	Qwen/Qwen3-VL-Reranker-2B, etc.

vLLM 0.11.0

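The following lists the model architectures, model names, and examples compatible with this template. For more on how to use each type of model in the compatibility list and what to watch out for, see the official vLLM documentation.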
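
One way to check a model against the tables before uploading it is to compare the `architectures` field of its Hugging Face config.json with the Architecture column. The following is a minimal sketch under that assumption. `SUPPORTED` holds only three names copied from Table 24 for illustration, and `model_architectures` and `is_compatible` are hypothetical helpers.

```python
import json
from typing import FrozenSet, List

# Illustrative excerpt of the Architecture column; extend it from the tables.
SUPPORTED: FrozenSet[str] = frozenset(
    {"LlamaForCausalLM", "Qwen3ForCausalLM", "MixtralForCausalLM"}
)


def model_architectures(config_text: str) -> List[str]:
    """Return the architecture names declared in a config.json document."""
    config = json.loads(config_text)
    return list(config.get("architectures") or [])


def is_compatible(config_text: str, supported: FrozenSet[str] = SUPPORTED) -> bool:
    """True if any declared architecture appears in the supported set."""
    return any(arch in supported for arch in model_architectures(config_text))


if __name__ == "__main__":
    sample = '{"architectures": ["Qwen3ForCausalLM"], "model_type": "qwen3"}'
    print(is_compatible(sample))
```

A match on the architecture name is only a first filter; the notes and the official documentation for each model type still apply.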

Table 24 Text-only language models | Generative models | Text generation
Architecture	Model	Example HuggingFace models
Zamba2ForCausalLM	Zamba2	Zyphra/Zamba2-7B-instruct, Zyphra/Zamba2-2.7B-instruct, Zyphra/Zamba2-1.2B-instruct, etc.
LongcatFlashForCausalLM	LongCat-Flash	meituan-longcat/LongCat-Flash-Chat, meituan-longcat/LongCat-Flash-Chat-FP8
MiniMaxText01ForCausalLM	MiniMax-Text	MiniMaxAI/MiniMax-Text-01, etc.
MiniMaxM1ForCausalLM	MiniMax-Text	MiniMaxAI/MiniMax-M1-40k, MiniMaxAI/MiniMax-M1-80k, etc.
XverseForCausalLM	XVERSE	xverse/XVERSE-7B-Chat , xverse/XVERSE-13B-Chat , xverse/XVERSE-65B-Chat , etc.
TeleFLMForCausalLM	TeleFLM	CofeAI/FLM-2-52B-Instruct-2407, CofeAI/Tele-FLM, etc.
TeleChat2ForCausalLM	TeleChat2	TeleAI/TeleChat2-3B , TeleAI/TeleChat2-7B , TeleAI/TeleChat2-35B , etc.
Starcoder2ForCausalLM	Starcoder2	bigcode/starcoder2-3b , bigcode/starcoder2-7b , bigcode/starcoder2-15b , etc.
StableLmForCausalLM	StableLM	stabilityai/stablelm-3b-4e1t , stabilityai/stablelm-base-alpha-7b-v2 , etc.
SolarForCausalLM	Solar Pro	upstage/solar-pro-preview-instruct , etc.
SeedOssForCausalLM	SeedOss	ByteDance-Seed/Seed-OSS-36B-Instruct, etc.
QWenLMHeadModel	Qwen	Qwen/Qwen-7B , Qwen/Qwen-7B-Chat , etc.
Qwen2MoeForCausalLM	Qwen2MoE	Qwen/Qwen1.5-MoE-A2.7B , Qwen/Qwen1.5-MoE-A2.7B-Chat , etc.
Qwen2ForCausalLM	QwQ, Qwen2	Qwen/QwQ-32B-Preview , Qwen/Qwen2-7B-Instruct , Qwen/Qwen2-7B , etc.
Qwen3ForCausalLM	Qwen3	Qwen/Qwen3-8B, etc.
Qwen3MoeForCausalLM	Qwen3MoE	Qwen/Qwen3-MoE-15B-A2B, etc.
Qwen3NextForCausalLM	Qwen3NextMoE	Qwen/Qwen3-Next-80B-A3B-Instruct, etc.
Plamo2ForCausalLM	PLaMo2	pfnet/plamo-2-1b, pfnet/plamo-2-8b, etc.
PersimmonForCausalLM	Persimmon	adept/persimmon-8b-base, adept/persimmon-8b-chat, etc.
Phi4FlashForCausalLM	Phi-4-mini-flash-reasoning	microsoft/Phi-4-mini-instruct, etc.
PhiMoEForCausalLM	Phi-3.5-MoE	microsoft/Phi-3.5-MoE-instruct , etc.
PhiForCausalLM	Phi	microsoft/phi-1_5 , microsoft/phi-2 , etc.
Phi3SmallForCausalLM	Phi-3-Small	microsoft/Phi-3-small-8k-instruct , microsoft/Phi-3-small-128k-instruct , etc.
Phi3ForCausalLM	Phi-4, Phi-3	microsoft/Phi-4 , microsoft/Phi-3-mini-4k-instruct , microsoft/Phi-3-mini-128k-instruct , microsoft/Phi-3-medium-128k-instruct , etc.
OrionForCausalLM	Orion	OrionStarAI/Orion-14B-Base , OrionStarAI/Orion-14B-Chat , etc.
OPTForCausalLM	OPT, OPT-IML	facebook/opt-66b , facebook/opt-iml-max-30b , etc.
OlmoForCausalLM	OLMo	allenai/OLMo-1B-hf , allenai/OLMo-7B-hf , etc.
OlmoeForCausalLM	OLMoE	allenai/OLMoE-1B-7B-0924 , allenai/OLMoE-1B-7B-0924-Instruct , etc.
Olmo2ForCausalLM	OLMo2	allenai/OLMo2-7B-1124 , etc.
Olmo3ForCausalLM	OLMo3	TBA
NemotronHForCausalLM	Nemotron-H	nvidia/Nemotron-H-8B-Base-8K, nvidia/Nemotron-H-47B-Base-8K, nvidia/Nemotron-H-56B-Base-8K, etc.
NemotronForCausalLM	Nemotron-3, Nemotron-4, Minitron	nvidia/Minitron-8B-Base , mgoin/Nemotron-4-340B-Base-hf-FP8 , etc.
MPTForCausalLM	MPT, MPT-Instruct, MPT-Chat, MPT-StoryWriter	mosaicml/mpt-7b , mosaicml/mpt-7b-storywriter , mosaicml/mpt-30b , etc.
MotifForCausalLM	Motif-1-Tiny	Motif-Technologies/Motif-2.6B, Motif-Technologies/Motif-2.6b-v1.1-LC, etc.
MixtralForCausalLM	Mixtral-8x7B, Mixtral-8x7B-Instruct	mistralai/Mixtral-8x7B-v0.1 , mistralai/Mixtral-8x7B-Instruct-v0.1 , mistral-community/Mixtral-8x22B-v0.1 , etc.
MistralForCausalLM	Mistral, Mistral-Instruct	mistralai/Mistral-7B-v0.1 , mistralai/Mistral-7B-Instruct-v0.1 , etc.
MiniCPM3ForCausalLM	MiniCPM3	openbmb/MiniCPM3-4B , etc.
MiniCPMForCausalLM	MiniCPM	openbmb/MiniCPM-2B-sft-bf16, openbmb/MiniCPM-2B-dpo-bf16, openbmb/MiniCPM-S-1B-sft, etc.
MambaForCausalLM	Mamba	state-spaces/mamba-130m-hf , state-spaces/mamba-790m-hf , state-spaces/mamba-2.8b-hf , etc.
Mamba2ForCausalLM	Mamba2	mistralai/Mamba-Codestral-7B-v0.1, etc.
MiMoForCausalLM	MiMo	XiaomiMiMo/MiMo-7B-RL, etc.
LlamaForCausalLM	Llama 3.1, Llama 3, Llama 2, LLaMA, Yi	meta-llama/Meta-Llama-3.1-405B-Instruct , meta-llama/Meta-Llama-3.1-70B , meta-llama/Meta-Llama-3-70B-Instruct , meta-llama/Llama-2-70b-hf , 01-ai/Yi-34B , etc.
Lfm2ForCausalLM	LFM2	LiquidAI/LFM2-1.2B, LiquidAI/LFM2-700M, LiquidAI/LFM2-350M, etc.
JambaForCausalLM	Jamba	ai21labs/AI21-Jamba-1.5-Large , ai21labs/AI21-Jamba-1.5-Mini , ai21labs/Jamba-v0.1 , etc.
JAISLMHeadModel	Jais	inceptionai/jais-13b , inceptionai/jais-13b-chat , inceptionai/jais-30b-v3 , inceptionai/jais-30b-chat-v3 , etc.
InternLM3ForCausalLM	InternLM3	internlm/internlm3-8b-instruct , etc.
InternLM2ForCausalLM	InternLM2	internlm/internlm2-7b , internlm/internlm2-chat-7b , etc.
InternLMForCausalLM	InternLM	internlm/internlm-7b, internlm/internlm-chat-7b, etc.
HCXVisionForCausalLM	HyperCLOVAX-SEED-Vision-Instruct-3B	naver-hyperclovax/HyperCLOVAX-SEED-Vision-Instruct-3B
HunYuanMoEV1ForCausalLM	Hunyuan-80B-A13B	tencent/Hunyuan-A13B-Instruct, tencent/Hunyuan-A13B-Pretrain, tencent/Hunyuan-A13B-Instruct-FP8, etc.
HunYuanDenseV1ForCausalLM	Hunyuan-7B-Instruct-0124	tencent/Hunyuan-7B-Instruct-0124
Grok1ModelForCausalLM	Grok1	hpcai-tech/grok-1
GritLM	GritLM	parasail-ai/GritLM-7B-vllm
GraniteMoeSharedForCausalLM	Granite MoE Shared	ibm-research/moe-7b-1b-active-shared-experts (test model)
GraniteMoeHybridForCausalLM	Granite 4.0 MoE Hybrid	ibm-granite/granite-4.0-tiny-preview, etc.
GraniteMoeForCausalLM	Granite 3.0 MoE, PowerMoE	ibm-granite/granite-3.0-1b-a400m-base , ibm-granite/granite-3.0-3b-a800m-instruct , ibm/PowerMoE-3b , etc.
GraniteForCausalLM	Granite 3.0, Granite 3.1, PowerLM	ibm-granite/granite-3.0-2b-base , ibm-granite/granite-3.1-8b-instruct , ibm/PowerLM-3b , etc.
GptOssForCausalLM	GPT-OSS	openai/gpt-oss-120b, openai/gpt-oss-20b
GPTNeoXForCausalLM	GPT-NeoX, Pythia, OpenAssistant, Dolly V2, StableLM	EleutherAI/gpt-neox-20b , EleutherAI/pythia-12b , OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5 , databricks/dolly-v2-12b , stabilityai/stablelm-tuned-alpha-7b , etc.
GPTJForCausalLM	GPT-J	EleutherAI/gpt-j-6b , nomic-ai/gpt4all-j , etc.
GPTBigCodeForCausalLM	StarCoder, SantaCoder, WizardCoder	bigcode/starcoder , bigcode/gpt_bigcode-santacoder , WizardLM/WizardCoder-15B-V1.0 , etc.
GPT2LMHeadModel	GPT-2	gpt2 , gpt2-xl , etc.
Glm4MoeForCausalLM	GLM-4.5	zai-org/GLM-4.5, etc.
Glm4ForCausalLM	GLM-4-0414	THUDM/GLM-4-32B-0414, etc.
GlmForCausalLM	GLM-4	THUDM/glm-4-9b-chat-hf , etc.
Gemma3nForCausalLM	Gemma 3n	google/gemma-3n-E2B-it, google/gemma-3n-E4B-it, etc.
Gemma3ForCausalLM	Gemma 3	google/gemma-3-1b-it, etc.
Gemma2ForCausalLM	Gemma 2	google/gemma-2-9b, google/gemma-2-27b, etc.
GemmaForCausalLM	Gemma	google/gemma-2b , google/gemma-7b , etc.
FalconH1ForCausalLM	Falcon-H1	tiiuae/Falcon-H1-34B-Base, tiiuae/Falcon-H1-34B-Instruct, etc.
FalconMambaForCausalLM	FalconMamba	tiiuae/falcon-mamba-7b , tiiuae/falcon-mamba-7b-instruct , etc.
FalconForCausalLM	Falcon	tiiuae/falcon-7b , tiiuae/falcon-40b , tiiuae/falcon-rw-7b , etc.
Fairseq2LlamaForCausalLM	Llama (fairseq2 format)	mgleize/fairseq2-dummy-Llama-3.2-1B, etc.
Exaone4ForCausalLM	EXAONE-4	LGAI-EXAONE/EXAONE-4.0-32B, etc.
ExaoneForCausalLM	EXAONE-3	LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct , etc.
Ernie4_5_MoeForCausalLM	Ernie4.5MoE	baidu/ERNIE-4.5-21B-A3B-PT, baidu/ERNIE-4.5-300B-A47B-PT, etc.
Ernie4_5ForCausalLM	Ernie4.5	baidu/ERNIE-4.5-0.3B-PT, etc.
DotsOCRForCausalLM	dots_ocr	rednote-hilab/dots.ocr
Dots1ForCausalLM	dots.llm1	rednote-hilab/dots.llm1.base, rednote-hilab/dots.llm1.inst, etc.
DeepseekV3ForCausalLM	DeepSeek-V3	deepseek-ai/DeepSeek-V3-Base, deepseek-ai/DeepSeek-V3, etc.
DeepseekV2ForCausalLM	DeepSeek-V2	deepseek-ai/DeepSeek-V2, deepseek-ai/DeepSeek-V2-Chat, etc.
DeepseekForCausalLM	DeepSeek	deepseek-ai/deepseek-llm-67b-base, deepseek-ai/deepseek-llm-7b-chat, etc.
DeciLMForCausalLM	DeciLM	Deci/DeciLM-7B , Deci/DeciLM-7B-instruct , etc.
DbrxForCausalLM	DBRX	databricks/dbrx-base , databricks/dbrx-instruct , etc.
CohereForCausalLM , Cohere2ForCausalLM	Command-R	CohereForAI/c4ai-command-r-v01 , CohereForAI/c4ai-command-r7b-12-2024 , etc.
ChatGLMModel, ChatGLMForConditionalGeneration	ChatGLM	THUDM/chatglm2-6b , THUDM/chatglm3-6b , etc.
BloomForCausalLM	BLOOM, BLOOMZ, BLOOMChat	bigscience/bloom , bigscience/bloomz , etc.
BambaForCausalLM	Bamba	ibm-ai-platform/Bamba-9B-fp8, ibm-ai-platform/Bamba-9B
BailingMoeV2ForCausalLM	Ling	inclusionAI/Ling-mini-2.0, etc.
BailingMoeForCausalLM	Ling	inclusionAI/Ling-lite-1.5, inclusionAI/Ling-plus, etc.
BaiChuanForCausalLM	Baichuan2, Baichuan	baichuan-inc/Baichuan2-13B-Chat , baichuan-inc/Baichuan-7B , etc.
ArcticForCausalLM	Arctic	Snowflake/snowflake-arctic-base , Snowflake/snowflake-arctic-instruct , etc.
ArceeForCausalLM	Arcee (AFM)	arcee-ai/AFM-4.5B-Base, etc.
AquilaForCausalLM	Aquila, Aquila2	BAAI/Aquila-7B , BAAI/AquilaChat-7B , etc.
ApertusForCausalLM	Apertus	swiss-ai/Apertus-8B-2509, swiss-ai/Apertus-70B-Instruct-2509, etc.

Table 25 Text-only language models | Pooling models | Embedding
Architecture	Model	Example HuggingFace models
BertModel^C	BERT-based	BAAI/bge-base-en-v1.5 , etc.
Gemma2Model^C	Gemma2-based	BAAI/bge-multilingual-gemma2 , etc.
Gemma3TextModel^C	Gemma 3-based	google/embeddinggemma-300m, etc.
GritLM	GritLM	parasail-ai/GritLM-7B-vllm
GteModel^C	Arctic-Embed-2.0-M	Snowflake/snowflake-arctic-embed-m-v2.0
GteNewModel^C	mGTE-TRM	Alibaba-NLP/gte-multilingual-base, etc.
ModernBertModel^C	ModernBERT-based	Alibaba-NLP/gte-modernbert-base, etc.
NomicBertModel^C	Nomic BERT	nomic-ai/nomic-embed-text-v1, nomic-ai/nomic-embed-text-v2-moe, Snowflake/snowflake-arctic-embed-m-long, etc.
LlamaModel^C, LlamaForCausalLM^C, MistralModel^C, etc.	Llama-based	intfloat/e5-mistral-7b-instruct , etc.
Qwen2Model^C, Qwen2ForCausalLM^C	Qwen2-based	ssmits/Qwen2-7B-Instruct-embed-base (see note), Alibaba-NLP/gte-Qwen2-7B-instruct (see note), etc.
Qwen3Model^C, Qwen3ForCausalLM^C	Qwen3-based	Qwen/Qwen3-Embedding-0.6B, etc.
RobertaModel, RobertaForMaskedLM	RoBERTa-based	sentence-transformers/all-roberta-large-v1, etc.
*Model^C, *ForCausalLM^C, etc.	Generative models	N/A

Note:

^C indicates that the model can be converted into an embedding model with --convert embed.
* indicates that feature support is the same as for the original model.

Table 26 Text-only language models | Pooling models | Reward
Architecture	Model	Example HuggingFace models
InternLM2ForRewardModel	InternLM2-based	internlm/internlm2-1_8b-reward , internlm/internlm2-7b-reward , etc.
LlamaForCausalLM^C	Llama-based	peiyi9979/math-shepherd-mistral-7b-prm, etc.
Qwen2ForRewardModel	Qwen2-based	Qwen/Qwen2.5-Math-RM-72B , etc.
Qwen2ForProcessRewardModel	Qwen2-based	Qwen/Qwen2.5-Math-PRM-7B , Qwen/Qwen2.5-Math-PRM-72B , etc.
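*Model^C, *ForCausalLM^C, etc.	Generative models	N/A

Note:

^C indicates that the model can be converted into a reward model with --convert reward.
* indicates that feature support is the same as for the original model.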

Table 27 Text-only language models | Pooling models | Classification (--task classify)
Architecture	Model	Example HuggingFace models
JambaForSequenceClassification	Jamba	ai21labs/Jamba-tiny-reward-dev , etc.
GPT2ForSequenceClassification	GPT2	nie3e/sentiment-polish-gpt2-small
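*Model^C, *ForCausalLM^C, etc.	Generative models	N/A

Note:

^C indicates that the model can be converted into a classification model with --convert classify.
* indicates that feature support is the same as for the original model.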

Table 28 Text-only language models | Pooling models | Cross-encoder / Reranker
Architecture	Model	Example HuggingFace models
BertForSequenceClassification	BERT-based	cross-encoder/ms-marco-MiniLM-L-6-v2 , etc.
GemmaForSequenceClassification	Gemma-based	BAAI/bge-reranker-v2-gemma, etc.
GteNewForSequenceClassification	mGTE-TRM	Alibaba-NLP/gte-multilingual-reranker-base, etc.
Qwen2ForSequenceClassification	Qwen2-based	mixedbread-ai/mxbai-rerank-base-v2, etc.
Qwen3ForSequenceClassification	Qwen3-based	tomaarsen/Qwen3-Reranker-0.6B-seq-cls, Qwen/Qwen3-Reranker-0.6B, etc.
RobertaForSequenceClassification	RoBERTa-based	cross-encoder/quora-roberta-base , etc.
XLMRobertaForSequenceClassification	XLM-RoBERTa-based	BAAI/bge-reranker-v2-m3 , etc.
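*Model^C, *ForCausalLM^C, etc.	Generative models	N/A

Note:

^C indicates that the model can be converted into a classification model with --convert classify.
* indicates that feature support is the same as for the original model.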

Table 29 Text-only language models | Pooling models | Token classification
Architecture	Model	Example HuggingFace models
BertForTokenClassification	bert-based	boltuix/NeuroBERT-NER, etc.

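Table 30 Multimodal models | Generative models | Text generation
Architecture	Model	Inputs	Example HuggingFace models	Notes
AriaForConditionalGeneration	Aria	T + I⁺	rhymes-ai/Aria	Modalities: T = Text, I = Image, V = Video, A = Audio. Symbols: + means the modalities can be supplied together; for example, T + I accepts text only, image only, or text plus image. / means several modalities are supported but cannot be combined; for example, T / I accepts text only or image only, but not text plus image. ^E means pre-computed embeddings can be supplied for that modality. ⁺ means multiple items of that modality can be supplied per text prompt (written as + when it follows ^E, as in I^E+).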
AyaVisionForConditionalGeneration	Aya Vision	T + I⁺	CohereForAI/aya-vision-8b, CohereForAI/aya-vision-32b, etc.
Blip2ForConditionalGeneration	BLIP-2	T + I^E	Salesforce/blip2-opt-2.7b, Salesforce/blip2-opt-6.7b, etc.
ChameleonForConditionalGeneration	Chameleon	T + I	facebook/chameleon-7b, etc.
Cohere2VisionForConditionalGeneration	Command A Vision	T + I⁺	CohereLabs/command-a-vision-07-2025, etc.
DeepseekVLV2ForCausalLM	DeepSeek-VL2	T + I⁺	deepseek-ai/deepseek-vl2-tiny, deepseek-ai/deepseek-vl2-small, deepseek-ai/deepseek-vl2, etc.
Ernie4_5_VLMoeForConditionalGeneration	Ernie4.5-VL	T + I⁺ / V⁺	baidu/ERNIE-4.5-VL-28B-A3B-PT, baidu/ERNIE-4.5-VL-424B-A47B-PT
FuyuForCausalLM	Fuyu	T + I	adept/fuyu-8b, etc.
Gemma3ForConditionalGeneration	Gemma 3	T + I⁺	google/gemma-3-4b-it, google/gemma-3-27b-it, etc.
Gemma3nForConditionalGeneration	Gemma 3n	T + I + A	google/gemma-3n-E2B-it, google/gemma-3n-E4B-it, etc.
GLM4VForCausalLM^	GLM-4V	T + I	zai-org/glm-4v-9b, zai-org/cogagent-9b-20241220, etc.
Glm4vForConditionalGeneration	GLM-4.1V-Thinking	T + I^E+ + V^E+	zai-org/GLM-4.1V-9B-Thinking, etc.
Glm4vMoeForConditionalGeneration	GLM-4.5V	T + I^E+ + V^E+	zai-org/GLM-4.5V, etc.
GraniteSpeechForConditionalGeneration	Granite Speech	T + A	ibm-granite/granite-speech-3.3-8b
H2OVLChatModel	H2OVL	T + I^E+	h2oai/h2ovl-mississippi-800m, h2oai/h2ovl-mississippi-2b, etc.
Idefics3ForConditionalGeneration	Idefics3	T + I	HuggingFaceM4/Idefics3-8B-Llama3, etc.
InternS1ForConditionalGeneration	Intern-S1	T + I^E+ + V^E+	internlm/Intern-S1, etc.
InternVLChatModel	InternVL 3.5, InternVL 3.0, InternVL 2.5, Mono-InternVL, InternVL 2.0	T + I^E+ + (V^E+)	OpenGVLab/InternVL3_5-14B, OpenGVLab/InternVL3-9B, OpenGVLab/InternVideo2_5_Chat_8B, OpenGVLab/InternVL2_5-4B, OpenGVLab/Mono-InternVL-2B, OpenGVLab/InternVL2-4B, etc.
InternVLForConditionalGeneration	InternVL 3.0 (HF format)	T + I^E+ + V^E+	OpenGVLab/InternVL3-1B-hf, etc.
KeyeForConditionalGeneration	Keye-VL-8B-Preview	T + I^E+ + V^E+	Kwai-Keye/Keye-VL-8B-Preview
KeyeVL1_5ForConditionalGeneration	Keye-VL-1_5-8B	T + I^E+ + V^E+	Kwai-Keye/Keye-VL-1_5-8B
KimiVLForConditionalGeneration	Kimi-VL-A3B-Instruct, Kimi-VL-A3B-Thinking	T + I⁺	moonshotai/Kimi-VL-A3B-Instruct, moonshotai/Kimi-VL-A3B-Thinking
Llama4ForConditionalGeneration	Llama 4	T + I⁺	meta-llama/Llama-4-Scout-17B-16E-Instruct, meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8, meta-llama/Llama-4-Maverick-17B-128E-Instruct, etc.
Llama_Nemotron_Nano_VL	Llama Nemotron Nano VL	T + I^E+	nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1
LlavaForConditionalGeneration	LLaVA-1.5	T + I^E+	llava-hf/llava-1.5-7b-hf, TIGER-Lab/Mantis-8B-siglip-llama3 (see note), etc.
LlavaNextForConditionalGeneration	LLaVA-NeXT	T + I^E+	llava-hf/llava-v1.6-mistral-7b-hf, llava-hf/llava-v1.6-vicuna-7b-hf, etc.
LlavaNextVideoForConditionalGeneration	LLaVA-NeXT-Video	T + V	llava-hf/LLaVA-NeXT-Video-7B-hf, etc.
LlavaOnevisionForConditionalGeneration	LLaVA-Onevision	T + I⁺ + V⁺	llava-hf/llava-onevision-qwen2-7b-ov-hf, llava-hf/llava-onevision-qwen2-0.5b-ov-hf, etc.
MiDashengLMModel	MiDashengLM	T + A⁺	mispeech/midashenglm-7b
MiniCPMO	MiniCPM-O	T + I^E+ + V^E+ + A^E+	openbmb/MiniCPM-o-2_6, etc.
MiniCPMV	MiniCPM-V	T + I^E+ + V^E+	openbmb/MiniCPM-V-2 (see note), openbmb/MiniCPM-Llama3-V-2_5, openbmb/MiniCPM-V-2_6, etc.
MiniMaxVL01ForConditionalGeneration	MiniMax-VL	T + I^E+	MiniMaxAI/MiniMax-VL-01, etc.
Mistral3ForConditionalGeneration	Mistral3 (HF Transformers)	T + I⁺	mistralai/Mistral-Small-3.1-24B-Instruct-2503, etc.
MolmoForCausalLM	Molmo	T + I	allenai/Molmo-7B-D-0924, allenai/Molmo-72B-0924, etc.
NVLM_D_Model	NVLM-D 1.0	T + I^E+	nvidia/NVLM-D-72B, etc.
Ovis	Ovis2, Ovis1.6	T + I⁺	AIDC-AI/Ovis2-1B, AIDC-AI/Ovis1.6-Llama3.2-3B, etc.
Ovis2_5	Ovis2.5	T + I⁺ + V	AIDC-AI/Ovis2.5-9B, etc.
PaliGemmaForConditionalGeneration	PaliGemma, PaliGemma 2	T + I^E	google/paligemma-3b-pt-224, google/paligemma-3b-mix-224, google/paligemma2-3b-ft-docci-448, etc.
Phi3VForCausalLM	Phi-3-Vision, Phi-3.5-Vision	T + I^E+	microsoft/Phi-3-vision-128k-instruct, microsoft/Phi-3.5-vision-instruct, etc.
Phi4MMForCausalLM	Phi-4-multimodal	T + I⁺ / T + A⁺ / I⁺ + A⁺	microsoft/Phi-4-multimodal-instruct, etc.
PixtralForConditionalGeneration	Pixtral	T + I⁺	mistralai/Pixtral-12B-2409, mistral-community/pixtral-12b (see note), etc.
QwenVLForConditionalGeneration	Qwen-VL	T + I^E+	Qwen/Qwen-VL, Qwen/Qwen-VL-Chat, etc.
Qwen2AudioForConditionalGeneration	Qwen2-Audio	T + A⁺	Qwen/Qwen2-Audio-7B-Instruct
Qwen2VLForConditionalGeneration	QVQ, Qwen2-VL	T + I^E+ + V^E+	Qwen/QVQ-72B-Preview, Qwen/Qwen2-VL-7B-Instruct, Qwen/Qwen2-VL-72B-Instruct, etc.
Qwen2_5_VLForConditionalGeneration	Qwen2.5-VL	T + I^E+ + V^E+	Qwen/Qwen2.5-VL-3B-Instruct, Qwen/Qwen2.5-VL-72B-Instruct, etc.
Qwen2_5OmniThinkerForConditionalGeneration	Qwen2.5-Omni	T + I^E+ + V^E+ + A⁺	Qwen/Qwen2.5-Omni-7B
Qwen3VLForConditionalGeneration	Qwen3-VL	T + I^E+ + V^E+	Qwen/Qwen3-VL-4B-Instruct, etc.
Qwen3VLMoeForConditionalGeneration	Qwen3-VL-MOE	T + I^E+ + V^E+	Qwen/Qwen3-VL-30B-A3B-Instruct, etc.
RForConditionalGeneration	R-VL-4B	T + I^E+	YannQi/R-4B
SkyworkR1VChatModel	Skywork-R1V-38B	T + I	Skywork/Skywork-R1V-38B
SmolVLMForConditionalGeneration	SmolVLM2	T + I	SmolVLM2-2.2B-Instruct
Step3VLForConditionalGeneration	Step3-VL	T + I⁺	stepfun-ai/step3
TarsierForConditionalGeneration	Tarsier	T + I^E+	omni-search/Tarsier-7b, omni-search/Tarsier-34b
Tarsier2ForConditionalGeneration^	Tarsier2	T + I^E+ + V^E+	omni-research/Tarsier2-Recap-7b, omni-research/Tarsier2-7b-0115
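
To make the Inputs notation easier to check mechanically, the following is a minimal sketch of a parser for the simplest form of the notation: modalities joined by " + ", with an optional ^E marker and an optional trailing plus sign. The `Modality` class and `parse_inputs` function are hypothetical names introduced for illustration. Specifications that use "/" alternatives or parentheses are deliberately not handled.

```python
from dataclasses import dataclass

# Letter codes as defined in the Notes column of the table.
NAMES = {"T": "text", "I": "image", "V": "video", "A": "audio"}


@dataclass(frozen=True)
class Modality:
    name: str
    embeddings: bool  # ^E: pre-computed embeddings are accepted
    multiple: bool    # ⁺ (or +): several items per text prompt


def parse_inputs(spec: str) -> list[Modality]:
    """Parse a spec such as 'T + I^E+ + V^E+' (no '/' or parentheses)."""
    result = []
    for token in spec.split(" + "):
        token = token.strip()
        letter, flags = token[0], token[1:]
        result.append(
            Modality(
                name=NAMES[letter],
                embeddings="E" in flags,
                multiple="+" in flags or "⁺" in flags,
            )
        )
    return result


for item in parse_inputs("T + I^E+ + V^E+"):
    print(item)
```

Splitting on " + " with surrounding spaces keeps the plus sign that directly follows ^E attached to its modality instead of treating it as a separator.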

Table 31 Multimodal models | Generative models | Transcription
Architecture	Models	Example HuggingFace models
WhisperForConditionalGeneration	Whisper	openai/whisper-small, openai/whisper-large-v3-turbo, etc.
VoxtralForConditionalGeneration	Voxtral (Mistral format)	mistralai/Voxtral-Mini-3B-2507, mistralai/Voxtral-Small-24B-2507, etc.
Gemma3nForConditionalGeneration	Gemma3n	google/gemma-3n-E2B-it, google/gemma-3n-E4B-it, etc.

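Table 32 Multimodal models | Pooling models | Embedding
Architecture	Models	Inputs	Example HuggingFace models	Notes
LlavaNextForConditionalGeneration^C	LLaVA-NeXT-based	T / I	royokong/e5-v	Modalities: T = Text, I = Image, V = Video, A = Audio. Symbols: + means the modalities can be supplied together; for example, T+I accepts text only, image only, or text plus image. / means several modalities are supported but cannot be combined; for example, T/I accepts text only or image only, but not text plus image.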
Phi3VForCausalLM^C	Phi-3-Vision-based	T + I	TIGER-Lab/VLM2Vec-Full
*ForConditionalGeneration^C, *ForCausalLM^C, etc.	Generative models	/	N/A

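Notes:

^C indicates that the model can be converted into an embedding model via --convert embed.
* indicates that the model's features are the same as those of the original model.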

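Table 33 Multimodal models | Pooling models | Cross-encoder / Reranker
Architecture	Models	Inputs	Example HuggingFace models	Notes
JinaVLForSequenceClassification	JinaVL-based	T + I^E+	jinaai/jina-reranker-m0, etc.	Modalities: T = Text, I = Image, V = Video, A = Audio. Symbols: + means the modalities can be supplied together; for example, T+I accepts text only, image only, or text plus image. / means several modalities are supported but cannot be combined; for example, T/I accepts text only or image only, but not text plus image.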

vLLM 0.9.2

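The tables below list the model architectures, model names, and example models compatible with this template. For usage details and caveats for the models in the compatibility list, see the official vLLM documentation.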

Table 34 Text-only language models | Generative models | Text generation (--task generate)
Architecture	Models	Example HuggingFace models
Zamba2ForCausalLM	Zamba2	Zyphra/Zamba2-7B-instruct, Zyphra/Zamba2-2.7B-instruct, Zyphra/Zamba2-1.2B-instruct, etc.
MiniMaxText01ForCausalLM	MiniMax-Text	MiniMaxAI/MiniMax-Text-01, etc.
MiniMaxM1ForCausalLM	MiniMax-Text	MiniMaxAI/MiniMax-M1-40k, MiniMaxAI/MiniMax-M1-80k, etc.
XverseForCausalLM	XVERSE	xverse/XVERSE-7B-Chat, xverse/XVERSE-13B-Chat, xverse/XVERSE-65B-Chat, etc.
TeleFLMForCausalLM	TeleFLM	CofeAI/FLM-2-52B-Instruct-2407, CofeAI/Tele-FLM, etc.
TeleChat2ForCausalLM	TeleChat2	TeleAI/TeleChat2-3B, TeleAI/TeleChat2-7B, TeleAI/TeleChat2-35B, etc.
Starcoder2ForCausalLM	Starcoder2	bigcode/starcoder2-3b, bigcode/starcoder2-7b, bigcode/starcoder2-15b, etc.
StableLmForCausalLM	StableLM	stabilityai/stablelm-3b-4e1t, stabilityai/stablelm-base-alpha-7b-v2, etc.
SolarForCausalLM	Solar Pro	upstage/solar-pro-preview-instruct, etc.
QWenLMHeadModel	Qwen	Qwen/Qwen-7B, Qwen/Qwen-7B-Chat, etc.
Qwen2MoeForCausalLM	Qwen2MoE	Qwen/Qwen1.5-MoE-A2.7B, Qwen/Qwen1.5-MoE-A2.7B-Chat, etc.
Qwen2ForCausalLM	QwQ, Qwen2	Qwen/QwQ-32B-Preview, Qwen/Qwen2-7B-Instruct, Qwen/Qwen2-7B, etc.
Qwen3ForCausalLM	Qwen3	Qwen/Qwen3-8B, etc.
Qwen3MoeForCausalLM	Qwen3MoE	Qwen/Qwen3-MoE-15B-A2B, etc.
Plamo2ForCausalLM	PLaMo2	pfnet/plamo-2-1b, pfnet/plamo-2-8b, etc.
PersimmonForCausalLM	Persimmon	adept/persimmon-8b-base, adept/persimmon-8b-chat, etc.
PhiMoEForCausalLM	Phi-3.5-MoE	microsoft/Phi-3.5-MoE-instruct, etc.
PhiForCausalLM	Phi	microsoft/phi-1_5, microsoft/phi-2, etc.
Phi3SmallForCausalLM	Phi-3-Small	microsoft/Phi-3-small-8k-instruct, microsoft/Phi-3-small-128k-instruct, etc.
Phi3ForCausalLM	Phi-4, Phi-3	microsoft/Phi-4, microsoft/Phi-3-mini-4k-instruct, microsoft/Phi-3-mini-128k-instruct, microsoft/Phi-3-medium-128k-instruct, etc.
OrionForCausalLM	Orion	OrionStarAI/Orion-14B-Base, OrionStarAI/Orion-14B-Chat, etc.
OPTForCausalLM	OPT, OPT-IML	facebook/opt-66b, facebook/opt-iml-max-30b, etc.
OlmoForCausalLM	OLMo	allenai/OLMo-1B-hf, allenai/OLMo-7B-hf, etc.
OlmoeForCausalLM	OLMoE	allenai/OLMoE-1B-7B-0924, allenai/OLMoE-1B-7B-0924-Instruct, etc.
Olmo2ForCausalLM	OLMo2	allenai/OLMo2-7B-1124, etc.
NemotronHForCausalLM	Nemotron-H	nvidia/Nemotron-H-8B-Base-8K, nvidia/Nemotron-H-47B-Base-8K, nvidia/Nemotron-H-56B-Base-8K, etc.
NemotronForCausalLM	Nemotron-3, Nemotron-4, Minitron	nvidia/Minitron-8B-Base, mgoin/Nemotron-4-340B-Base-hf-FP8, etc.
MPTForCausalLM	MPT, MPT-Instruct, MPT-Chat, MPT-StoryWriter	mosaicml/mpt-7b, mosaicml/mpt-7b-storywriter, mosaicml/mpt-30b, etc.
MixtralForCausalLM	Mixtral-8x7B, Mixtral-8x7B-Instruct	mistralai/Mixtral-8x7B-v0.1, mistralai/Mixtral-8x7B-Instruct-v0.1, mistral-community/Mixtral-8x22B-v0.1, etc.
MistralForCausalLM	Mistral, Mistral-Instruct	mistralai/Mistral-7B-v0.1, mistralai/Mistral-7B-Instruct-v0.1, etc.
MiniCPM3ForCausalLM	MiniCPM3	openbmb/MiniCPM3-4B, etc.
MiniCPMForCausalLM	MiniCPM	openbmb/MiniCPM-2B-sft-bf16, openbmb/MiniCPM-2B-dpo-bf16, openbmb/MiniCPM-S-1B-sft, etc.
Mamba2ForCausalLM	Mamba2	mistralai/Mamba-Codestral-7B-v0.1, etc.
MambaForCausalLM	Mamba	state-spaces/mamba-130m-hf, state-spaces/mamba-790m-hf, state-spaces/mamba-2.8b-hf, etc.
LlamaForCausalLM	Llama 3.1, Llama 3, Llama 2, LLaMA, Yi	meta-llama/Meta-Llama-3.1-405B-Instruct, meta-llama/Meta-Llama-3.1-70B, meta-llama/Meta-Llama-3-70B-Instruct, meta-llama/Llama-2-70b-hf, 01-ai/Yi-34B, etc.
JambaForCausalLM	Jamba	ai21labs/AI21-Jamba-1.5-Large, ai21labs/AI21-Jamba-1.5-Mini, ai21labs/Jamba-v0.1, etc.
JAISLMHeadModel	Jais	inceptionai/jais-13b, inceptionai/jais-13b-chat, inceptionai/jais-30b-v3, inceptionai/jais-30b-chat-v3, etc.
InternLMForCausalLM	InternLM	internlm/internlm-7b, internlm/internlm-chat-7b, etc.
InternLM3ForCausalLM	InternLM3	internlm/internlm3-8b-instruct, etc.
InternLM2ForCausalLM	InternLM2	internlm/internlm2-7b, internlm/internlm2-chat-7b, etc.
HunYuanMoEV1ForCausalLM	Hunyuan-80B-A13B	tencent/Hunyuan-A13B-Instruct, tencent/Hunyuan-A13B-Pretrain, tencent/Hunyuan-A13B-Instruct-FP8, etc.
Grok1ModelForCausalLM	Grok1	hpcai-tech/grok-1
GritLM	GritLM	parasail-ai/GritLM-7B-vllm
GraniteMoeSharedForCausalLM	Granite MoE Shared	ibm-research/moe-7b-1b-active-shared-experts (test model)
GraniteMoeHybridForCausalLM	Granite 4.0 MoE Hybrid	ibm-granite/granite-4.0-tiny-preview, etc.
GraniteMoeForCausalLM	Granite 3.0 MoE, PowerMoE	ibm-granite/granite-3.0-1b-a400m-base, ibm-granite/granite-3.0-3b-a800m-instruct, ibm/PowerMoE-3b, etc.
GraniteForCausalLM	Granite 3.0, Granite 3.1, PowerLM	ibm-granite/granite-3.0-2b-base, ibm-granite/granite-3.1-8b-instruct, ibm/PowerLM-3b, etc.
GPTNeoXForCausalLM	GPT-NeoX, Pythia, OpenAssistant, Dolly V2, StableLM	EleutherAI/gpt-neox-20b, EleutherAI/pythia-12b, OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5, databricks/dolly-v2-12b, stabilityai/stablelm-tuned-alpha-7b, etc.
GPTJForCausalLM	GPT-J	EleutherAI/gpt-j-6b, nomic-ai/gpt4all-j, etc.
GPTBigCodeForCausalLM	StarCoder, SantaCoder, WizardCoder	bigcode/starcoder, bigcode/gpt_bigcode-santacoder, WizardLM/WizardCoder-15B-V1.0, etc.
GPT2LMHeadModel	GPT-2	gpt2, gpt2-xl, etc.
Glm4ForCausalLM	GLM-4-0414	THUDM/GLM-4-32B-0414, etc.
GlmForCausalLM	GLM-4	THUDM/glm-4-9b-chat-hf, etc.
Gemma3nForConditionalGeneration	Gemma 3n	google/gemma-3n-E2B-it, google/gemma-3n-E4B-it, etc.
Gemma3ForCausalLM	Gemma 3	google/gemma-3-1b-it, etc.
Gemma2ForCausalLM	Gemma 2	google/gemma-2-9b, google/gemma-2-27b, etc.
GemmaForCausalLM	Gemma	google/gemma-2b, google/gemma-7b, etc.
FalconH1ForCausalLM	Falcon-H1	tiiuae/Falcon-H1-34B-Base, tiiuae/Falcon-H1-34B-Instruct, etc.
FalconMambaForCausalLM	FalconMamba	tiiuae/falcon-mamba-7b, tiiuae/falcon-mamba-7b-instruct, etc.
FalconForCausalLM	Falcon	tiiuae/falcon-7b, tiiuae/falcon-40b, tiiuae/falcon-rw-7b, etc.
ExaoneForCausalLM	EXAONE-3	LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct, etc.
Ernie4_5_MoeForCausalLM	Ernie4.5MoE	baidu/ERNIE-4.5-21B-A3B-PT, baidu/ERNIE-4.5-300B-A47B-PT, etc.
Ernie4_5_ForCausalLM	Ernie4.5	baidu/ERNIE-4.5-0.3B-PT, etc.
Dots1ForCausalLM	dots.llm1	rednote-hilab/dots.llm1.base, rednote-hilab/dots.llm1.inst, etc.
DeepseekV3ForCausalLM	DeepSeek-V3	deepseek-ai/DeepSeek-V3-Base, deepseek-ai/DeepSeek-V3, etc.
DeepseekV2ForCausalLM	DeepSeek-V2	deepseek-ai/DeepSeek-V2, deepseek-ai/DeepSeek-V2-Chat, etc.
DeepseekForCausalLM	DeepSeek	deepseek-ai/deepseek-llm-67b-base, deepseek-ai/deepseek-llm-7b-chat, etc.
DeciLMForCausalLM	DeciLM	Deci/DeciLM-7B, Deci/DeciLM-7B-instruct, etc.
DbrxForCausalLM	DBRX	databricks/dbrx-base, databricks/dbrx-instruct, etc.
CohereForCausalLM, Cohere2ForCausalLM	Command-R	CohereForAI/c4ai-command-r-v01, CohereForAI/c4ai-command-r7b-12-2024, etc.
ChatGLMModel, ChatGLMForConditionalGeneration	ChatGLM	THUDM/chatglm2-6b, THUDM/chatglm3-6b, etc.
BloomForCausalLM	BLOOM, BLOOMZ, BLOOMChat	bigscience/bloom, bigscience/bloomz, etc.
BartForConditionalGeneration	BART	facebook/bart-base, facebook/bart-large-cnn, etc.
BambaForCausalLM	Bamba	ibm-ai-platform/Bamba-9B-fp8, ibm-ai-platform/Bamba-9B
BaiChuanForCausalLM	Baichuan2, Baichuan	baichuan-inc/Baichuan2-13B-Chat, baichuan-inc/Baichuan-7B, etc.
ArcticForCausalLM	Arctic	Snowflake/snowflake-arctic-base, Snowflake/snowflake-arctic-instruct, etc.
AquilaForCausalLM	Aquila, Aquila2	BAAI/Aquila-7B, BAAI/AquilaChat-7B, etc.
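
Hugging Face model directories normally declare their architecture in the `architectures` field of `config.json`, which is the value to compare against the Architecture column. The sketch below reads that field from a local model directory and checks it against a small excerpt of the column. The function names and the `SUPPORTED` set are hypothetical and only illustrate the lookup; they are not part of the platform or of vLLM.

```python
import json
from pathlib import Path

# Small, illustrative excerpt of the Architecture column; extend as needed.
SUPPORTED = {"Qwen3ForCausalLM", "LlamaForCausalLM", "MistralForCausalLM"}


def declared_architectures(model_dir: str) -> list[str]:
    """Return the `architectures` list from <model_dir>/config.json."""
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    return list(config.get("architectures", []))


def is_listed(model_dir: str, supported: set[str] = SUPPORTED) -> bool:
    """True if at least one declared architecture is in the supported set."""
    return any(arch in supported for arch in declared_architectures(model_dir))
```

Calling `is_listed` on a downloaded model directory before uploading it gives a quick first check; a match on the architecture name alone does not guarantee that a particular checkpoint will load.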

Table 35 Text-only language models | Pooling models | Text embedding (--task embed)
Architecture	Models	Example HuggingFace models
BertModel	BERT-based	BAAI/bge-base-en-v1.5, etc.
Gemma2Model	Gemma2-based	BAAI/bge-multilingual-gemma2, etc.
GritLM	GritLM	parasail-ai/GritLM-7B-vllm
GteModel	Arctic-Embed-2.0-M	Snowflake/snowflake-arctic-embed-m-v2.0
GteNewModel	mGTE-TRM	Alibaba-NLP/gte-multilingual-base, etc.
ModernBertModel	ModernBERT-based	Alibaba-NLP/gte-modernbert-base, etc.
NomicBertModel	Nomic BERT	nomic-ai/nomic-embed-text-v1, nomic-ai/nomic-embed-text-v2-moe, Snowflake/snowflake-arctic-embed-m-long, etc.
LlamaModel, LlamaForCausalLM, MistralModel, etc.	Llama-based	intfloat/e5-mistral-7b-instruct, etc.
Qwen2Model, Qwen2ForCausalLM	Qwen2-based	ssmits/Qwen2-7B-Instruct-embed-base (see note), Alibaba-NLP/gte-Qwen2-7B-instruct (see note), etc.
RobertaModel, RobertaForMaskedLM	RoBERTa-based	sentence-transformers/all-roberta-large-v1, etc.
Qwen3Model, Qwen3ForCausalLM	Qwen3-based	Qwen/Qwen3-Embedding-0.6B, etc.

Table 36 Text-only language models | Pooling models | Reward modeling (--task reward)
Architecture	Models	Example HuggingFace models
InternLM2ForRewardModel	InternLM2-based	internlm/internlm2-1_8b-reward, internlm/internlm2-7b-reward, etc.
LlamaForCausalLM	Llama-based	peiyi9979/math-shepherd-mistral-7b-prm, etc.
Qwen2ForRewardModel	Qwen2-based	Qwen/Qwen2.5-Math-RM-72B, etc.
Qwen2ForProcessRewardModel	Qwen2-based	Qwen/Qwen2.5-Math-PRM-7B, Qwen/Qwen2.5-Math-PRM-72B, etc.

Table 37 Text-only language models | Pooling models | Classification (--task classify)
Architecture	Models	Example HuggingFace models
JambaForSequenceClassification	Jamba	ai21labs/Jamba-tiny-reward-dev, etc.
Qwen2ForSequenceClassification	Qwen2-based	jason9693/Qwen2.5-1.5B-apeach, etc.

Table 38 Text-only language models | Pooling models | Sentence pair scoring (--task score)
Architecture	Models	Example HuggingFace models
BertForSequenceClassification	BERT-based	cross-encoder/ms-marco-MiniLM-L-6-v2, etc.
Qwen2ForSequenceClassification	Qwen2-based	mixedbread-ai/mxbai-rerank-base-v2, etc.
Qwen3ForSequenceClassification	Qwen3-based	tomaarsen/Qwen3-Reranker-0.6B-seq-cls, Qwen/Qwen3-Reranker-0.6B, etc.
RobertaForSequenceClassification	RoBERTa-based	cross-encoder/quora-roberta-base, etc.
XLMRobertaForSequenceClassification	XLM-RoBERTa-based	BAAI/bge-reranker-v2-m3, etc.

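Table 40 Multimodal models | Generative models | Text generation (--task generate)
Architecture	Models	Inputs	Example HuggingFace models	Notes
AriaForConditionalGeneration	Aria	T + I⁺	rhymes-ai/Aria	Modalities: T = Text, I = Image, V = Video, A = Audio. Symbols: + means the modalities can be supplied together; for example, T+I accepts text only, image only, or text plus image. / means several modalities are supported but cannot be combined; for example, T/I accepts text only or image only, but not text plus image. ^E means pre-computed embeddings can be supplied for this modality. ⁺ means multiple items of this modality can be supplied per text prompt.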
AyaVisionForConditionalGeneration	Aya Vision	T + I⁺	CohereForAI/aya-vision-8b, CohereForAI/aya-vision-32b, etc.
Blip2ForConditionalGeneration	BLIP-2	T + I^E	Salesforce/blip2-opt-2.7b, Salesforce/blip2-opt-6.7b, etc.
ChameleonForConditionalGeneration	Chameleon	T + I	facebook/chameleon-7b, etc.
DeepseekVLV2ForCausalLM^	DeepSeek-VL2	T + I⁺	deepseek-ai/deepseek-vl2-tiny, deepseek-ai/deepseek-vl2-small, deepseek-ai/deepseek-vl2, etc.
Florence2ForConditionalGeneration	Florence-2	T + I	microsoft/Florence-2-base, microsoft/Florence-2-large, etc.
FuyuForCausalLM	Fuyu	T + I	adept/fuyu-8b, etc.
Gemma3ForConditionalGeneration	Gemma 3	T + I⁺	google/gemma-3-4b-it, google/gemma-3-27b-it, etc.
GLM4VForCausalLM^	GLM-4V	T + I	THUDM/glm-4v-9b, THUDM/cogagent-9b-20241220, etc.
Glm4vForConditionalGeneration	GLM-4.1V-Thinking	T + I^E+ + V^E+	THUDM/GLM-4.1V-9B-Thinking, etc.
GraniteSpeechForConditionalGeneration	Granite Speech	T + A	ibm-granite/granite-speech-3.3-8b
H2OVLChatModel	H2OVL	T + I^E+	h2oai/h2ovl-mississippi-800m, h2oai/h2ovl-mississippi-2b, etc.
Idefics3ForConditionalGeneration	Idefics3	T + I	HuggingFaceM4/Idefics3-8B-Llama3, etc.
InternVLChatModel	InternVL 2.5, Mono-InternVL, InternVL 2.0	T + I^E+	OpenGVLab/InternVL2_5-4B, OpenGVLab/Mono-InternVL-2B, OpenGVLab/InternVL2-4B, etc.
KeyeForConditionalGeneration	Keye-VL-8B-Preview	T + I^E+ + V^E+	Kwai-Keye/Keye-VL-8B-Preview
KimiVLForConditionalGeneration	Kimi-VL-A3B-Instruct, Kimi-VL-A3B-Thinking	T + I⁺	moonshotai/Kimi-VL-A3B-Instruct, moonshotai/Kimi-VL-A3B-Thinking
Llama4ForConditionalGeneration	Llama 4	T + I⁺	meta-llama/Llama-4-Scout-17B-16E-Instruct, meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8, meta-llama/Llama-4-Maverick-17B-128E-Instruct, etc.
LlavaForConditionalGeneration	LLaVA-1.5	T + I^E+	llava-hf/llava-1.5-7b-hf, TIGER-Lab/Mantis-8B-siglip-llama3 (see note), etc.
LlavaNextForConditionalGeneration	LLaVA-NeXT	T + I^E+	llava-hf/llava-v1.6-mistral-7b-hf, llava-hf/llava-v1.6-vicuna-7b-hf, etc.
LlavaNextVideoForConditionalGeneration	LLaVA-NeXT-Video	T + V	llava-hf/LLaVA-NeXT-Video-7B-hf, etc.
LlavaOnevisionForConditionalGeneration	LLaVA-Onevision	T + I⁺ + V⁺	llava-hf/llava-onevision-qwen2-7b-ov-hf, llava-hf/llava-onevision-qwen2-0.5b-ov-hf, etc.
MiniCPMO	MiniCPM-O	T + I^E+ + V^E+ + A^E+	openbmb/MiniCPM-o-2_6, etc.
MiniCPMV	MiniCPM-V	T + I^E+ + V^E+	openbmb/MiniCPM-V-2 (see note), openbmb/MiniCPM-Llama3-V-2_5, openbmb/MiniCPM-V-2_6, etc.
MiniMaxVL01ForConditionalGeneration	MiniMax-VL	T + I^E+	MiniMaxAI/MiniMax-VL-01, etc.
Mistral3ForConditionalGeneration	Mistral3	T + I⁺	mistralai/Mistral-Small-3.1-24B-Instruct-2503, etc.
MllamaForConditionalGeneration	Llama 3.2	T + I⁺	meta-llama/Llama-3.2-90B-Vision-Instruct, meta-llama/Llama-3.2-11B-Vision, etc.
MolmoForCausalLM	Molmo	T + I	allenai/Molmo-7B-D-0924, allenai/Molmo-72B-0924, etc.
NVLM_D_Model	NVLM-D 1.0	T + I^E+	nvidia/NVLM-D-72B, etc.
Ovis	Ovis2, Ovis1.6	T + I⁺	AIDC-AI/Ovis2-1B, AIDC-AI/Ovis1.6-Llama3.2-3B, etc.
PaliGemmaForConditionalGeneration	PaliGemma, PaliGemma 2	T + I^E	google/paligemma-3b-pt-224, google/paligemma-3b-mix-224, google/paligemma2-3b-ft-docci-448, etc.
Phi3VForCausalLM	Phi-3-Vision, Phi-3.5-Vision	T + I^E+	microsoft/Phi-3-vision-128k-instruct, microsoft/Phi-3.5-vision-instruct, etc.
Phi4MMForCausalLM	Phi-4-multimodal	T + I⁺ / T + A⁺ / I⁺ + A⁺	microsoft/Phi-4-multimodal-instruct, etc.
PixtralForConditionalGeneration	Pixtral	T + I⁺	mistralai/Pixtral-12B-2409, mistral-community/pixtral-12b (see note), etc.
QwenVLForConditionalGeneration^	Qwen-VL	T + I^E+	Qwen/Qwen-VL, Qwen/Qwen-VL-Chat, etc.
Qwen2AudioForConditionalGeneration	Qwen2-Audio	T + A⁺	Qwen/Qwen2-Audio-7B-Instruct
Qwen2VLForConditionalGeneration	QVQ, Qwen2-VL	T + I^E+ + V^E+	Qwen/QVQ-72B-Preview, Qwen/Qwen2-VL-7B-Instruct, Qwen/Qwen2-VL-72B-Instruct, etc.
Qwen2_5_VLForConditionalGeneration	Qwen2.5-VL	T + I^E+ + V^E+	Qwen/Qwen2.5-VL-3B-Instruct, Qwen/Qwen2.5-VL-72B-Instruct, etc.
Qwen2_5OmniThinkerForConditionalGeneration	Qwen2.5-Omni	T + I^E+ + V^E+ + A⁺	Qwen/Qwen2.5-Omni-7B
SkyworkR1VChatModel	Skywork-R1V-38B	T + I	Skywork/Skywork-R1V-38B
SmolVLMForConditionalGeneration	SmolVLM2	T + I	SmolVLM2-2.2B-Instruct
TarsierForConditionalGeneration	Tarsier	T + I^E+	omni-search/Tarsier-7b, omni-search/Tarsier-34b
Tarsier2ForConditionalGeneration^	Tarsier2	T + I^E+ + V^E+	omni-research/Tarsier2-Recap-7b, omni-research/Tarsier2-7b-0115

Table 41 Multimodal models | Generative models | Transcription (--task transcription)
Architecture	Models	Example HuggingFace models
WhisperForConditionalGeneration	Whisper	openai/whisper-small, openai/whisper-large-v3-turbo, etc.

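Table 42 Multimodal models | Pooling models | Text embedding (--task embed)
Architecture	Models	Inputs	Example HuggingFace models	Notes
LlavaNextForConditionalGeneration	LLaVA-NeXT-based	T / I	royokong/e5-v	Modalities: T = Text, I = Image, V = Video, A = Audio. Symbols: + means the modalities can be supplied together; for example, T+I accepts text only, image only, or text plus image. / means several modalities are supported but cannot be combined; for example, T/I accepts text only or image only, but not text plus image.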
Phi3VForCausalLM	Phi-3-Vision-based	T + I	TIGER-Lab/VLM2Vec-Full
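
The --task values that appear in the table titles of the vLLM 0.9.2 section can be collected into a small lookup. The sketch below maps a table category to its --task value and formats a `vllm serve` command line as text without running it. The dictionary keys, the `serve_command` helper, and the example model choice are assumptions made for illustration.

```python
import shlex

# --task values as they appear in the table titles of the vLLM 0.9.2 section.
TASK_BY_CATEGORY = {
    "text generation": "generate",
    "text embedding": "embed",
    "reward modeling": "reward",
    "classification": "classify",
    "sentence pair scoring": "score",
    "transcription": "transcription",
}


def serve_command(model: str, category: str) -> str:
    """Format (but do not run) a `vllm serve` command for a table category."""
    task = TASK_BY_CATEGORY[category.lower()]
    return shlex.join(["vllm", "serve", model, "--task", task])


print(serve_command("BAAI/bge-base-en-v1.5", "Text embedding"))
```

An unknown category raises `KeyError`, which surfaces a mistyped category instead of silently falling back to a default task.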

vLLM 0.8.5

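The tables below list the model architectures, model names, and example models compatible with this template. For usage details and caveats for the models in the compatibility list, see the official vLLM documentation.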

Table 43 Text-only language models | Generative models | Text generation (--task generate)
Architecture	Models	Example HuggingFace models
Zamba2ForCausalLM	Zamba2	Zyphra/Zamba2-7B-instruct, Zyphra/Zamba2-2.7B-instruct, Zyphra/Zamba2-1.2B-instruct, etc.
MiniMaxText01ForCausalLM	MiniMax-Text	MiniMaxAI/MiniMax-Text-01, etc.
XverseForCausalLM	XVERSE	xverse/XVERSE-7B-Chat, xverse/XVERSE-13B-Chat, xverse/XVERSE-65B-Chat, etc.
TeleFLMForCausalLM	TeleFLM	CofeAI/FLM-2-52B-Instruct-2407, CofeAI/Tele-FLM, etc.
TeleChat2ForCausalLM	TeleChat2	TeleAI/TeleChat2-3B, TeleAI/TeleChat2-7B, TeleAI/TeleChat2-35B, etc.
Starcoder2ForCausalLM	Starcoder2	bigcode/starcoder2-3b, bigcode/starcoder2-7b, bigcode/starcoder2-15b, etc.
StableLmForCausalLM	StableLM	stabilityai/stablelm-3b-4e1t, stabilityai/stablelm-base-alpha-7b-v2, etc.
SolarForCausalLM	Solar Pro	upstage/solar-pro-preview-instruct, etc.
QWenLMHeadModel	Qwen	Qwen/Qwen-7B, Qwen/Qwen-7B-Chat, etc.
Qwen2MoeForCausalLM	Qwen2MoE	Qwen/Qwen1.5-MoE-A2.7B, Qwen/Qwen1.5-MoE-A2.7B-Chat, etc.
Qwen2ForCausalLM	QwQ, Qwen2	Qwen/QwQ-32B-Preview, Qwen/Qwen2-7B-Instruct, Qwen/Qwen2-7B, etc.
Qwen3ForCausalLM	Qwen3	Qwen/Qwen3-8B, etc.
Qwen3MoeForCausalLM	Qwen3MoE	Qwen/Qwen3-MoE-15B-A2B, etc.
Plamo2ForCausalLM	PLaMo2	pfnet/plamo-2-1b, pfnet/plamo-2-8b, etc.
PersimmonForCausalLM	Persimmon	adept/persimmon-8b-base, adept/persimmon-8b-chat, etc.
PhiMoEForCausalLM	Phi-3.5-MoE	microsoft/Phi-3.5-MoE-instruct, etc.
PhiForCausalLM	Phi	microsoft/phi-1_5, microsoft/phi-2, etc.
Phi3SmallForCausalLM	Phi-3-Small	microsoft/Phi-3-small-8k-instruct, microsoft/Phi-3-small-128k-instruct, etc.
Phi3ForCausalLM	Phi-4, Phi-3	microsoft/Phi-4, microsoft/Phi-3-mini-4k-instruct, microsoft/Phi-3-mini-128k-instruct, microsoft/Phi-3-medium-128k-instruct, etc.
OrionForCausalLM	Orion	OrionStarAI/Orion-14B-Base, OrionStarAI/Orion-14B-Chat, etc.
OPTForCausalLM	OPT, OPT-IML	facebook/opt-66b, facebook/opt-iml-max-30b, etc.
OlmoForCausalLM	OLMo	allenai/OLMo-1B-hf, allenai/OLMo-7B-hf, etc.
OlmoeForCausalLM	OLMoE	allenai/OLMoE-1B-7B-0924, allenai/OLMoE-1B-7B-0924-Instruct, etc.
Olmo2ForCausalLM	OLMo2	allenai/OLMo2-7B-1124, etc.
NemotronForCausalLM	Nemotron-3, Nemotron-4, Minitron	nvidia/Minitron-8B-Base, mgoin/Nemotron-4-340B-Base-hf-FP8, etc.
MPTForCausalLM	MPT, MPT-Instruct, MPT-Chat, MPT-StoryWriter	mosaicml/mpt-7b, mosaicml/mpt-7b-storywriter, mosaicml/mpt-30b, etc.
MixtralForCausalLM	Mixtral-8x7B, Mixtral-8x7B-Instruct	mistralai/Mixtral-8x7B-v0.1, mistralai/Mixtral-8x7B-Instruct-v0.1, mistral-community/Mixtral-8x22B-v0.1, etc.
MistralForCausalLM	Mistral, Mistral-Instruct	mistralai/Mistral-7B-v0.1, mistralai/Mistral-7B-Instruct-v0.1, etc.
MiniCPM3ForCausalLM	MiniCPM3	openbmb/MiniCPM3-4B, etc.
MiniCPMForCausalLM	MiniCPM	openbmb/MiniCPM-2B-sft-bf16, openbmb/MiniCPM-2B-dpo-bf16, openbmb/MiniCPM-S-1B-sft, etc.
MambaForCausalLM	Mamba	state-spaces/mamba-130m-hf, state-spaces/mamba-790m-hf, state-spaces/mamba-2.8b-hf, etc.
LlamaForCausalLM	Llama 3.1, Llama 3, Llama 2, LLaMA, Yi	meta-llama/Meta-Llama-3.1-405B-Instruct, meta-llama/Meta-Llama-3.1-70B, meta-llama/Meta-Llama-3-70B-Instruct, meta-llama/Llama-2-70b-hf, 01-ai/Yi-34B, etc.
JambaForCausalLM	Jamba	ai21labs/AI21-Jamba-1.5-Large, ai21labs/AI21-Jamba-1.5-Mini, ai21labs/Jamba-v0.1, etc.
JAISLMHeadModel	Jais	inceptionai/jais-13b, inceptionai/jais-13b-chat, inceptionai/jais-30b-v3, inceptionai/jais-30b-chat-v3, etc.
InternLMForCausalLM	InternLM	internlm/internlm-7b, internlm/internlm-chat-7b, etc.
InternLM3ForCausalLM	InternLM3	internlm/internlm3-8b-instruct, etc.
InternLM2ForCausalLM	InternLM2	internlm/internlm2-7b, internlm/internlm2-chat-7b, etc.
Grok1ModelForCausalLM	Grok1	hpcai-tech/grok-1
GritLM	GritLM	parasail-ai/GritLM-7B-vllm
GraniteMoeSharedForCausalLM	Granite MoE Shared	ibm-research/moe-7b-1b-active-shared-experts (test model)
GraniteMoeForCausalLM	Granite 3.0 MoE, PowerMoE	ibm-granite/granite-3.0-1b-a400m-base, ibm-granite/granite-3.0-3b-a800m-instruct, ibm/PowerMoE-3b, etc.
GraniteForCausalLM	Granite 3.0, Granite 3.1, PowerLM	ibm-granite/granite-3.0-2b-base, ibm-granite/granite-3.1-8b-instruct, ibm/PowerLM-3b, etc.
GPTNeoXForCausalLM	GPT-NeoX, Pythia, OpenAssistant, Dolly V2, StableLM	EleutherAI/gpt-neox-20b, EleutherAI/pythia-12b, OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5, databricks/dolly-v2-12b, stabilityai/stablelm-tuned-alpha-7b, etc.
GPTJForCausalLM	GPT-J	EleutherAI/gpt-j-6b, nomic-ai/gpt4all-j, etc.
GPTBigCodeForCausalLM	StarCoder, SantaCoder, WizardCoder	bigcode/starcoder, bigcode/gpt_bigcode-santacoder, WizardLM/WizardCoder-15B-V1.0, etc.
GPT2LMHeadModel	GPT-2	gpt2, gpt2-xl, etc.
Glm4ForCausalLM	GLM-4-0414	THUDM/GLM-4-32B-0414, etc.
GlmForCausalLM	GLM-4	THUDM/glm-4-9b-chat-hf, etc.
Gemma3ForCausalLM	Gemma 3	google/gemma-3-1b-it, etc.
Gemma2ForCausalLM	Gemma 2	google/gemma-2-9b, google/gemma-2-27b, etc.
GemmaForCausalLM	Gemma	google/gemma-2b, google/gemma-7b, etc.
FalconMambaForCausalLM	FalconMamba	tiiuae/falcon-mamba-7b, tiiuae/falcon-mamba-7b-instruct, etc.
FalconForCausalLM	Falcon	tiiuae/falcon-7b, tiiuae/falcon-40b, tiiuae/falcon-rw-7b, etc.
ExaoneForCausalLM	EXAONE-3	LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct, etc.
DeepseekV3ForCausalLM	DeepSeek-V3	deepseek-ai/DeepSeek-V3-Base, deepseek-ai/DeepSeek-V3, etc.
DeepseekV2ForCausalLM	DeepSeek-V2	deepseek-ai/DeepSeek-V2, deepseek-ai/DeepSeek-V2-Chat, etc.
DeepseekForCausalLM	DeepSeek	deepseek-ai/deepseek-llm-67b-base, deepseek-ai/deepseek-llm-7b-chat, etc.
DeciLMForCausalLM	DeciLM	Deci/DeciLM-7B, Deci/DeciLM-7B-instruct, etc.
DbrxForCausalLM	DBRX	databricks/dbrx-base, databricks/dbrx-instruct, etc.
CohereForCausalLM, Cohere2ForCausalLM	Command-R	CohereForAI/c4ai-command-r-v01, CohereForAI/c4ai-command-r7b-12-2024, etc.
ChatGLMModel, ChatGLMForConditionalGeneration	ChatGLM	THUDM/chatglm2-6b, THUDM/chatglm3-6b, etc.
BloomForCausalLM	BLOOM, BLOOMZ, BLOOMChat	bigscience/bloom, bigscience/bloomz, etc.
BartForConditionalGeneration	BART	facebook/bart-base, facebook/bart-large-cnn, etc.
BambaForCausalLM	Bamba	ibm-ai-platform/Bamba-9B-fp8, ibm-ai-platform/Bamba-9B
BaiChuanForCausalLM	Baichuan2, Baichuan	baichuan-inc/Baichuan2-13B-Chat, baichuan-inc/Baichuan-7B, etc.
ArcticForCausalLM	Arctic	Snowflake/snowflake-arctic-base, Snowflake/snowflake-arctic-instruct, etc.
AquilaForCausalLM	Aquila, Aquila2	BAAI/Aquila-7B, BAAI/AquilaChat-7B, etc.
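
Before uploading a model, one way to check it against the architecture column above is to read the `architectures` field of its Hugging Face `config.json`. The sketch below is illustrative only: `SUPPORTED_ARCHITECTURES` is a small hand-copied subset of the table, and `find_supported_architectures` and `demo` are hypothetical helpers, not part of the platform.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical subset of the architecture column above; extend as needed.
SUPPORTED_ARCHITECTURES = {
    "GemmaForCausalLM",
    "Gemma2ForCausalLM",
    "Gemma3ForCausalLM",
    "GraniteForCausalLM",
    "DeepseekV3ForCausalLM",
    "FalconForCausalLM",
}


def find_supported_architectures(model_dir):
    """Return the architectures declared in config.json that are in the subset."""
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as fh:
        config = json.load(fh)
    declared = config.get("architectures") or []
    return [name for name in declared if name in SUPPORTED_ARCHITECTURES]


def demo():
    """Write a stand-in config.json to a temporary directory and check it."""
    with tempfile.TemporaryDirectory() as tmp:
        config = {"architectures": ["Gemma2ForCausalLM"], "model_type": "gemma2"}
        (Path(tmp) / "config.json").write_text(json.dumps(config), encoding="utf-8")
        return find_supported_architectures(tmp)


if __name__ == "__main__":
    print(demo())
```

An empty result means none of the declared architectures is in the copied subset, so the full table should be consulted before deploying.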

Table 44 Text-only language models | Pooling models | Text embedding (--task embed)
Architecture	Models	Example HuggingFace Models
BertModel	BERT-based	BAAI/bge-base-en-v1.5, etc.
Gemma2Model	Gemma2-based	BAAI/bge-multilingual-gemma2, etc.
GritLM	GritLM	parasail-ai/GritLM-7B-vllm
LlamaModel, LlamaForCausalLM, MistralModel, etc.	Llama-based	intfloat/e5-mistral-7b-instruct, etc.
Qwen2Model, Qwen2ForCausalLM	Qwen2-based	ssmits/Qwen2-7B-Instruct-embed-base (see note), Alibaba-NLP/gte-Qwen2-7B-instruct (see note), etc.
RobertaModel, RobertaForMaskedLM	RoBERTa-based	sentence-transformers/all-roberta-large-v1, etc.
XLMRobertaModel	XLM-RoBERTa-based	intfloat/multilingual-e5-large, etc.

Table 45 Text-only language models | Pooling models | Reward modeling (--task reward)
Architecture	Models	Example HuggingFace Models
InternLM2ForRewardModel	InternLM2-based	internlm/internlm2-1_8b-reward, internlm/internlm2-7b-reward, etc.
LlamaForCausalLM	Llama-based	peiyi9979/math-shepherd-mistral-7b-prm, etc.
Qwen2ForRewardModel	Qwen2-based	Qwen/Qwen2.5-Math-RM-72B, etc.
Qwen2ForProcessRewardModel	Qwen2-based	Qwen/Qwen2.5-Math-PRM-7B, Qwen/Qwen2.5-Math-PRM-72B, etc.

Table 46 Text-only language models | Pooling models | Classification (--task classify)
Architecture	Models	Example HuggingFace Models
JambaForSequenceClassification	Jamba	ai21labs/Jamba-tiny-reward-dev, etc.
Qwen2ForSequenceClassification	Qwen2-based	jason9693/Qwen2.5-1.5B-apeach, etc.

Table 47 Text-only language models | Pooling models | Sentence pair scoring (--task score)
Architecture	Models	Example HuggingFace Models
BertForSequenceClassification	BERT-based	cross-encoder/ms-marco-MiniLM-L-6-v2, etc.
RobertaForSequenceClassification	RoBERTa-based	cross-encoder/quora-roberta-base, etc.
XLMRobertaForSequenceClassification	XLM-RoBERTa-based	BAAI/bge-reranker-v2-m3, etc.
ModernBertForSequenceClassification	ModernBert-based	Alibaba-NLP/gte-reranker-modernbert-base, etc.
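
The captions of the four pooling tables above each name a `--task` value (embed, reward, classify, score). A minimal sketch of turning that pairing into a launch command follows. `TASKS_BY_ARCHITECTURE` is a hand-copied subset of those tables, `build_serve_command` is a hypothetical helper, and the `vllm serve MODEL --task TASK` command shape is an assumption based on the flag shown in the captions; the system template may assemble its command differently.

```python
# Hand-copied subset of the pooling tables: architecture -> listed --task values.
TASKS_BY_ARCHITECTURE = {
    "BertModel": ("embed",),
    "XLMRobertaModel": ("embed",),
    "LlamaForCausalLM": ("embed", "reward"),
    "Qwen2ForRewardModel": ("reward",),
    "Qwen2ForSequenceClassification": ("classify",),
    "BertForSequenceClassification": ("score",),
}


def build_serve_command(model, architecture, task):
    """Return an argv list for serving `model`, or raise if the pair is not listed."""
    allowed = TASKS_BY_ARCHITECTURE.get(architecture)
    if allowed is None:
        raise ValueError(f"{architecture} is not in the copied subset")
    if task not in allowed:
        raise ValueError(f"{architecture} is listed for {allowed}, not {task!r}")
    # Assumed command shape; adjust to the actual template if it differs.
    return ["vllm", "serve", model, "--task", task]


if __name__ == "__main__":
    print(build_serve_command("BAAI/bge-base-en-v1.5", "BertModel", "embed"))
```

Some architectures, such as LlamaForCausalLM, appear under more than one task, which is why the mapping stores a tuple of tasks rather than a single value.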

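Table 48 Multimodal models | Generative models | Text generation
Architecture	Models	Inputs	Example HuggingFace Models	Notes
AriaForConditionalGeneration	Aria	T + I⁺	rhymes-ai/Aria	Modalities: T = Text, I = Image, V = Video, A = Audio. Symbols: + joins modalities that can be supplied together; for example, T + I accepts text only, image only, or text plus image. / separates modalities that cannot be combined; for example, T / I accepts text only or image only, but not text plus image. ^E: pre-computed embeddings can be supplied for this modality. ⁺: multiple items of this modality can be supplied per text prompt.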
AyaVisionForConditionalGeneration	Aya Vision	T + I⁺	CohereForAI/aya-vision-8b, CohereForAI/aya-vision-32b, etc.
Blip2ForConditionalGeneration	BLIP-2	T + I^E	Salesforce/blip2-opt-2.7b, Salesforce/blip2-opt-6.7b, etc.
ChameleonForConditionalGeneration	Chameleon	T + I	facebook/chameleon-7b, etc.
DeepseekVLV2ForCausalLM	DeepSeek-VL2	T + I⁺	deepseek-ai/deepseek-vl2-tiny, deepseek-ai/deepseek-vl2-small, deepseek-ai/deepseek-vl2, etc.
Florence2ForConditionalGeneration	Florence-2	T + I	microsoft/Florence-2-base, microsoft/Florence-2-large, etc.
FuyuForCausalLM	Fuyu	T + I	adept/fuyu-8b, etc.
Gemma3ForConditionalGeneration	Gemma 3	T + I⁺	google/gemma-3-4b-it, google/gemma-3-27b-it, etc.
GLM4VForCausalLM	GLM-4V	T + I	THUDM/glm-4v-9b, THUDM/cogagent-9b-20241220, etc.
GraniteSpeechForConditionalGeneration	Granite Speech	T + A	ibm-granite/granite-speech-3.3-8b
H2OVLChatModel	H2OVL	T + I^E+	h2oai/h2ovl-mississippi-800m, h2oai/h2ovl-mississippi-2b, etc.
Idefics3ForConditionalGeneration	Idefics3	T + I	HuggingFaceM4/Idefics3-8B-Llama3, etc.
InternVLChatModel	InternVL 2.5, Mono-InternVL, InternVL 2.0	T + I^E+	OpenGVLab/InternVL2_5-4B, OpenGVLab/Mono-InternVL-2B, OpenGVLab/InternVL2-4B, etc.
KimiVLForConditionalGeneration	Kimi-VL-A3B-Instruct, Kimi-VL-A3B-Thinking	T + I⁺	moonshotai/Kimi-VL-A3B-Instruct, moonshotai/Kimi-VL-A3B-Thinking
Llama4ForConditionalGeneration	Llama 4	T + I⁺	meta-llama/Llama-4-Scout-17B-16E-Instruct, meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8, meta-llama/Llama-4-Maverick-17B-128E-Instruct, etc.
LlavaForConditionalGeneration	LLaVA-1.5	T + I^E+	llava-hf/llava-1.5-7b-hf, TIGER-Lab/Mantis-8B-siglip-llama3 (see note), etc.
LlavaNextForConditionalGeneration	LLaVA-NeXT	T + I^E+	llava-hf/llava-v1.6-mistral-7b-hf, llava-hf/llava-v1.6-vicuna-7b-hf, etc.
LlavaNextVideoForConditionalGeneration	LLaVA-NeXT-Video	T + V	llava-hf/LLaVA-NeXT-Video-7B-hf, etc.
LlavaOnevisionForConditionalGeneration	LLaVA-Onevision	T + I⁺ + V⁺	llava-hf/llava-onevision-qwen2-7b-ov-hf, llava-hf/llava-onevision-qwen2-0.5b-ov-hf, etc.
MiniCPMO	MiniCPM-O	T + I^E+ + V^E+ + A^E+	openbmb/MiniCPM-o-2_6, etc.
MiniCPMV	MiniCPM-V	T + I^E+ + V^E+	openbmb/MiniCPM-V-2 (see note), openbmb/MiniCPM-Llama3-V-2_5, openbmb/MiniCPM-V-2_6, etc.
Mistral3ForConditionalGeneration	Mistral3	T + I⁺	mistralai/Mistral-Small-3.1-24B-Instruct-2503, etc.
MllamaForConditionalGeneration	Llama 3.2	T + I⁺	meta-llama/Llama-3.2-90B-Vision-Instruct, meta-llama/Llama-3.2-11B-Vision, etc.
MolmoForCausalLM	Molmo	T + I	allenai/Molmo-7B-D-0924, allenai/Molmo-72B-0924, etc.
NVLM_D_Model	NVLM-D 1.0	T + I^E+	nvidia/NVLM-D-72B, etc.
PaliGemmaForConditionalGeneration	PaliGemma, PaliGemma 2	T + I^E	google/paligemma-3b-pt-224, google/paligemma-3b-mix-224, google/paligemma2-3b-ft-docci-448, etc.
Phi3VForCausalLM	Phi-3-Vision, Phi-3.5-Vision	T + I^E+	microsoft/Phi-3-vision-128k-instruct, microsoft/Phi-3.5-vision-instruct, etc.
Phi4MMForCausalLM	Phi-4-multimodal	T + I⁺ / T + A⁺ / I⁺ + A⁺	microsoft/Phi-4-multimodal-instruct, etc.
PixtralForConditionalGeneration	Pixtral	T + I⁺	mistralai/Pixtral-12B-2409, mistral-community/pixtral-12b (see note), etc.
QwenVLForConditionalGeneration	Qwen-VL	T + I^E+	Qwen/Qwen-VL, Qwen/Qwen-VL-Chat, etc.
Qwen2AudioForConditionalGeneration	Qwen2-Audio	T + A⁺	Qwen/Qwen2-Audio-7B-Instruct
Qwen2VLForConditionalGeneration	QVQ, Qwen2-VL	T + I^E+ + V^E+	Qwen/QVQ-72B-Preview, Qwen/Qwen2-VL-7B-Instruct, Qwen/Qwen2-VL-72B-Instruct, etc.
Qwen2_5_VLForConditionalGeneration	Qwen2.5-VL	T + I^E+ + V^E+	Qwen/Qwen2.5-VL-3B-Instruct, Qwen/Qwen2.5-VL-72B-Instruct, etc.
Qwen2_5OmniThinkerForConditionalGeneration	Qwen2.5-Omni	T + I^E+ + V^E+ + A⁺	Qwen/Qwen2.5-Omni-7B
SkyworkR1VChatModel	Skywork-R1V-38B	T + I	Skywork/Skywork-R1V-38B
SmolVLMForConditionalGeneration	SmolVLM2	T + I	SmolVLM2-2.2B-Instruct
UltravoxModel	Ultravox	T + A^E+	fixie-ai/ultravox-v0_3
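
The input notation used in the third column of the table above (letters joined by `+`, alternatives separated by `/`, with `^E` and a trailing plus as markers) can be read mechanically. The following is a rough sketch of such a reader, not something the platform provides: `ModalitySpec`, `parse_token`, and `parse_inputs` are hypothetical names, and the parsing rules are an interpretation of the legend in the notes column.

```python
import re
from dataclasses import dataclass

MODALITIES = {"T": "text", "I": "image", "V": "video", "A": "audio"}


@dataclass(frozen=True)
class ModalitySpec:
    name: str         # "text", "image", "video" or "audio"
    embeddings: bool  # True when the token carries the ^E marker
    multiple: bool    # True when the token ends with + or ⁺


def parse_token(token):
    """Parse one token such as 'T', 'I⁺' or 'V^E+'."""
    token = token.strip()
    if not token or token[0] not in MODALITIES:
        raise ValueError(f"unknown modality token: {token!r}")
    suffix = token[1:]
    return ModalitySpec(
        name=MODALITIES[token[0]],
        embeddings="^E" in suffix,
        multiple=suffix.endswith(("+", "⁺")),
    )


def parse_inputs(cell):
    """Split a cell on '/' into alternatives, then on a spaced '+' into tokens."""
    alternatives = []
    for part in cell.split("/"):
        tokens = re.split(r"\s+\+\s+", part.strip())
        alternatives.append([parse_token(tok) for tok in tokens])
    return alternatives


if __name__ == "__main__":
    for spec in parse_inputs("T + I^E+ + V^E+")[0]:
        print(spec)
```

Splitting only on a `+` that has whitespace on both sides keeps the trailing multi-item marker attached to its modality letter.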

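Table 49 Multimodal models | Pooling models | Text embedding
Architecture	Models	Inputs	Example HuggingFace Models	Notes
LlavaNextForConditionalGeneration	LLaVA-NeXT-based	T / I	royokong/e5-v	Modalities: T = Text, I = Image, V = Video, A = Audio. Symbols: + joins modalities that can be supplied together; for example, T + I accepts text only, image only, or text plus image. / separates modalities that cannot be combined; for example, T / I accepts text only or image only, but not text plus image.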
Phi3VForCausalLM	Phi-3-Vision-based	T + I	TIGER-Lab/VLM2Vec-Full
Qwen2VLForConditionalGeneration	Qwen2-VL-based	T + I	MrLight/dse-qwen2-2b-mrl-v1

vllm-ascend-v0.17.0rc1

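The following lists the models that this template supports. For usage details and caveats for each model in the compatibility list, see the official vllm-ascend documentation.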


Models
Aria
Baichuan
Baichuan2
Bert
DeepSeek Distill (Qwen/Llama)
DeepSeek R1
DeepSeek V3.2
DeepSeek V3/3.1
Ernie4.5
Ernie4.5-Moe
Gemma-2
Gemma-3
Gemma3
GLM-4.x
GLM-5
Internlm
Kimi-K2-Thinking
DeepseekOCR2
MiniMax-M2.5
Llama2/3/3.1/3.2
Llama3.2
LLaVA-Next
LLaVA-Next-Video
MiniCPM
MiniCPM-V
MiniCPM3
Mistral/Mistral-Instruct
DeepSeek V2.5
Mllama
MiniMax-Text
Mistral3
Molmo
PaddleOCR-VL
Llama4
Keye-VL-8B-Preview
Florence-2
GLM-4V
InternVL2.0/2.5/3.0 InternVideo2.5/Mono-InternVL
Whisper
Ultravox
Phi-3-Vision/Phi-3.5-Vision
Phi-3/4
Phi-4-mini
QVQ
Qwen2
Qwen2-Audio
Qwen2-based
Qwen2-VL
Qwen2.5
Qwen2.5-Omni
Qwen2.5-VL
Qwen3
Qwen3-based
Qwen3-Coder
Qwen3-Embedding
Qwen3-Moe
Qwen3-Next
Qwen3-Omni
Qwen3-Omni-30B-A3B-Thinking
Qwen3-Reranker
Qwen3-VL
Qwen3-VL-Embedding
Qwen3-VL-MOE
Qwen3.5-397B-A17B
Qwen3.5-27B
Qwen3-VL-Reranker
QwQ-32B
XLM-RoBERTa-based

vllm-ascend-v0.18.0

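The following lists the models that this template supports. For usage details and caveats for each model in the compatibility list, see the official vllm-ascend documentation.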


Models
Aria
Baichuan
Baichuan2
Bert
DeepSeek Distill (Qwen/Llama)
DeepSeek R1
DeepSeek V3.2
DeepSeek V3/3.1
Ernie4.5
Ernie4.5-Moe
Gemma-2
Gemma-3
Gemma3
GLM-4.x
GLM-5
Internlm
Kimi-K2-Thinking
DeepseekOCR2
MiniMax-M2.5
MiniMax-M2.7
Llama2/3/3.1/3.2
Llama3.2
LLaVA-Next
LLaVA-Next-Video
MiniCPM
MiniCPM-V
MiniCPM3
Mistral/Mistral-Instruct
DeepSeek V2.5
Mllama
MiniMax-Text
Mistral3
Molmo
PaddleOCR-VL
Llama4
Keye-VL-8B-Preview
Florence-2
GLM-4V
InternVL2.0/2.5/3.0 InternVideo2.5/Mono-InternVL
Whisper
Ultravox
Phi-3-Vision/Phi-3.5-Vision
Phi-3/4
Phi-4-mini
QVQ
Qwen2
Qwen2-Audio
Qwen2-based
Qwen2-VL
Qwen2.5
Qwen2.5-Omni
Qwen2.5-VL
Qwen3
Qwen3-based
Qwen3-Coder
Qwen3-Embedding
Qwen3-Moe
Qwen3-Next
Qwen3-Omni
Qwen3-Omni-30B-A3B-Thinking
Qwen3-Reranker
Qwen3-VL
Qwen3-VL-Embedding
Qwen3-VL-MOE
Qwen3.5-397B-A17B
Qwen3.5-27B
Qwen3.5-35B-A3B
Qwen3.6-27B
Qwen3.6-35B-A3B
Qwen3-VL-Reranker
QwQ-32B
XLM-RoBERTa-based

Diffusers 0.37.0

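The following lists the models compatible with this template. For usage details and caveats for each model in the compatibility list, see the official Diffusers documentation.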


Models
AutoModel
ControlNetModel
ControlNetUnionModel
FluxControlNetModel
HunyuanDiT2DControlNetModel
SanaControlNetModel
SD3ControlNetModel
SparseControlNetModel
AllegroTransformer3DModel
AuraFlowTransformer2DModel
BriaFiboTransformer2DModel
BriaTransformer2DModel
ChromaTransformer2DModel
ChronoEditTransformer3DModel
CogVideoXTransformer3DModel
CogView3PlusTransformer2DModel
CogView4Transformer2DModel
ConsisIDTransformer3DModel
CosmosTransformer3DModel
DiTTransformer2DModel
EasyAnimateTransformer3DModel
FluxTransformer2DModel
Flux2Transformer2DModel
GlmImageTransformer2DModel
HeliosTransformer3DModel
HiDreamImageTransformer2DModel
HunyuanDiT2DModel
HunyuanImageTransformer2DModel
HunyuanVideo15Transformer3DModel
HunyuanVideoTransformer3DModel
LatteTransformer3DModel
LTX2VideoTransformer3DModel
LTXVideoTransformer3DModel
LongCatImageTransformer2DModel
Lumina2Transformer2DModel
LuminaNextDiT2DModel
MochiTransformer3DModel
OmniGenTransformer2DModel
OvisImageTransformer2DModel
PixArtTransformer2DModel
PriorTransformer
QwenImageTransformer2DModel
SanaTransformer2DModel
SanaVideoTransformer3DModel
SD3Transformer2DModel
SkyReelsV2Transformer3DModel
StableAudioDiTModel
Transformer2DModel
TransformerTemporalModel
WanAnimateTransformer3DModel
WanTransformer3DModel
ZImageTransformer2DModel
StableCascadeUNet
UNet1DModel
UNet2DConditionModel
UNet2DModel
UNet3DConditionModel
UNetMotionModel
UVit2DModel
AsymmetricAutoencoderKL
AutoencoderDC
AutoencoderKL
AutoencoderKLAllegro
AutoencoderKLCogVideoX
AutoencoderKLCosmos
AutoencoderKLHunyuanImage
AutoencoderKLHunyuanImageRefiner
AutoencoderKLHunyuanVideo
AutoencoderKLHunyuanVideo15
AutoencoderKLLTX2Audio
AutoencoderKLLTX2Video
AutoencoderKLLTXVideo
AutoencoderKLMagvit
AutoencoderKLMochi
AutoencoderKLQwenImage
AutoencoderKLWan
ConsistencyDecoderVAE
AutoencoderOobleck
AutoencoderRAE
AutoencoderTiny
VQModel

Transformers 5.3.0

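The following lists the models compatible with this template. For usage details and caveats for each model in the compatibility list, see the official Transformers documentation.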


Models
AFMoE
Aimv2
ALBERT
ALIGN
AltCLIP
Apertus
Arcee
Aria
Audio Spectrogram Transformer
AudioFlamingo3
Autoformer
AyaVision
Bamba
Bark
BART
BARThez
BARTpho
BEiT
BERT
Bert Generation
BertJapanese
BERTweet
BigBird
BigBird-Pegasus
BioGpt
BiT
BitNet
Blenderbot
BlenderbotSmall
BLIP
BLIP-2
BLOOM
BLT
BridgeTower
BROS
ByT5
CamemBERT
CANINE
Chameleon
Chinese-CLIP
CLAP
CLIP
CLIPSeg
CLVP
Code World Model (CWM)
CodeGen
CodeLlama
Cohere
Cohere2
Cohere2Vision
ColModernVBert
ColPali
ColQwen2
Conditional DETR
ConvBERT
ConvNeXT
ConvNeXTV2
CPM
CPM-Ant
CSM
CTRL
CvT
D-FINE
DAB-DETR
dac
Data2VecAudio
Data2VecText
Data2VecVision
DBRX
DeBERTa
DeBERTa-v2
Decision Transformer
DeepSeek-V2
DeepSeek-V3
DeepseekVL
DeepseekVLHybrid
Deformable DETR
DeiT
DePlot
Depth Anything
Depth Anything V2
DepthPro
DETR
Dia
DialoGPT
DiffLlama
DiNAT
DINOv2
DINOv2 with Registers
DINOv3
DistilBERT
DiT
Doge
DonutSwin
dots1
DPR
DPT
EdgeTAM
EdgeTamVideo
EfficientLoFTR
EfficientNet
ELECTRA
Emu3
EnCodec
Encoder decoder
EoMT
EoMT-DINOv3
ERNIE
Ernie4_5
Ernie4_5_MoE
ernie4_5_vl_moe
ESM
EuroBERT
Evolla
EXAONE-4.0
EXAONE-MoE
FairSeq Machine-Translation
Falcon
Falcon3
FalconH1
FalconMamba
FastSpeech2Conformer
FastVLM
FLAN-T5
FLAN-UL2
FlauBERT
FLAVA
FlexOlmo
Florence2
FNet
FocalNet
Funnel Transformer
Fuyu
Gemma
Gemma2
Gemma3
Gemma3n
GIT
GLM-4
GLM-4-0414
GLM-4.5, GLM-4.6, GLM-4.7
GLM-4.7-Flash
GLM-ASR
GLM-Image
Glm46V
glm4v
glm4v_moe
GlmMoeDsa
GlmOcr
GLPN
GOT-OCR2
GPT Neo
GPT NeoX
GPT NeoX Japanese
GPT-J
GPT-Sw3
GPTBigCode
GptOss
Granite
GraniteMoe
GraniteMoeHybrid
GraniteMoeShared
GraniteSpeech
GraniteVision
Grounding DINO
GroupViT
Helium
HerBERT
HGNet-V2
Hiera
Higgs Audio V2
Hubert
HunYuanDenseV1
HunYuanMoEV1
I-BERT
I-JEPA
IDEFICS
Idefics2
Idefics3
ImageGPT
Informer
InstructBLIP
InstructBlipVideo
InternVL
Jais2
Jamba
Janus
JetMoe
KOSMOS-2
KOSMOS-2.5
Kyutai Speech-To-Text
LASR
LayoutLM
LayoutLMv2
LayoutLMv3
LayoutXLM
LED
LeViT
LFM2
LFM2-VL
LFM2Moe
LightGlue
LightOnOcr
LiLT
LLaMA
Llama2
Llama3
Llama4
LLaVa
LLaVA-NeXT
LLaVa-NeXT-Video
LLaVA-Onevision
LongCatFlash
Longformer
LongT5
LUKE
LW-DETR
LXMERT
M2M100
MADLAD-400
Mamba
Mamba2
Marian
MarkupLM
Mask2Former
MaskFormer
MatCha
mBART
mBART-50
Megatron-BERT
Megatron-GPT2
MetaCLIP 2
MGP-STR
Mimi
MiniMax
MiniMax-M2
Ministral
Ministral3
Mistral
Mistral3
Mixtral
MLCD
mllama
mLUKE
MM Grounding DINO
MMS
MobileBERT
MobileNetV1
MobileNetV2
MobileViT
MobileViTV2
ModernBert
ModernBERTDecoder
ModernVBert
Moonshine
Moonshine Streaming
Moshi
MPNet
MPT
MRA
MT5
MusicGen
MusicGen Melody
MVP
myt5
NanoChat
Nemotron
nemotron_h
NLLB
NLLB-MOE
Nougat
Nyströmformer
OLMo
OLMo Hybrid
OLMo2
Olmo3
OLMoE
OmDet-Turbo
OneFormer
OpenAI GPT
OpenAI GPT-2
OPT
Ovis2
OWL-ViT
OWLv2
PaddleOCRVL
PaliGemma
Parakeet
PatchTSMixer
PatchTST
PE Audio
PE Audio Video
PE Video
Pegasus
PEGASUS-X
Perceiver
PerceptionLM
Persimmon
Phi
Phi-3
Phi4 Multimodal
PhiMoE
PhoBERT
Pix2Struct
Pixio
Pixtral
PLBart
PoolFormer
Pop2Piano
PP-DocLayoutV2
PP-DocLayoutV3
Prompt Depth Anything
ProphetNet
PVT
Pyramid Vision Transformer v2 (PVTv2)
Qwen2
Qwen2.5-Omni
Qwen2.5-VL
Qwen2Audio
Qwen2MoE
Qwen2VL
Qwen3
Qwen3-Omni-MoE
Qwen3.5
Qwen3.5 Moe
Qwen3MoE
Qwen3Next
Qwen3VL
Qwen3VLMoe
RAG
RecurrentGemma
Reformer
RegNet
RemBERT
ResNet
RoBERTa
RoBERTa-PreLayerNorm
RoCBert
RoFormer
RT-DETR
RT-DETRv2
RWKV
SAM
SAM2
SAM2 Video
SAM3
SAM3 Video
Sam3Tracker
Sam3TrackerVideo
SeamlessM4T
SeamlessM4Tv2
Seed-Oss
SegFormer
SegGPT
Segment Anything High Quality
SEW
SEW-D
ShieldGemma2
SigLIP
SigLIP2
SmolLM3
SmolVLM
SolarOpen
Speech Encoder decoder
Speech2Text
SpeechT5
Splinter
SqueezeBERT
StableLm
Starcoder2
SuperGlue
SuperPoint
SwiftFormer
Swin Transformer
Swin Transformer V2
Swin2SR
SwitchTransformers
T5
T5Gemma
T5Gemma2
T5v1.1
Table Transformer
TAPAS
TextNet
Time Series Transformer
TimesFM
TimesFM 2.5
TimeSformer
Timm Wrapper
TrOCR
TVP
UDOP
UL2
UMT5
UniSpeech
UniSpeechSat
UnivNet
UPerNet
V-JEPA 2
VaultGemma
VibeVoice ASR
VideoLlama3
VideoLlava
VideoMAE
ViLT
VipLlava
Vision Encoder decoder
VisionTextDualEncoder
VisualBERT
ViT
VitDet
ViTMAE
ViTMatte
ViTMSN
ViTPose
VITS
ViViT
Voxtral
VoxtralRealtime
Wav2Vec2
Wav2Vec2-BERT
Wav2Vec2-Conformer
Wav2Vec2Phoneme
WavLM
Whisper
X-CLIP
X-Codec
X-MOD
XGLM
XLM
XLM-RoBERTa
XLM-RoBERTa-XL
XLM-V
XLNet
XLS-R
XLSR-Wav2Vec2
xLSTM
YOLOS
YOSO
Youtu-LLM
Zamba
Zamba2
ZoeDepth

Sentence Transformers 5.3.0

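The following lists the models compatible with this template. For usage details and caveats for each model in the compatibility list, see the official Sentence Transformers documentation.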


Models
all-MiniLM-L12-v1
all-MiniLM-L12-v2
all-MiniLM-L6-v1
all-MiniLM-L6-v2
all-distilroberta-v1
all-mpnet-base-v1
all-mpnet-base-v2
all-roberta-large-v1
average_word_embeddings_glove.6B.300d
average_word_embeddings_komninos
distiluse-base-multilingual-cased-v1
distiluse-base-multilingual-cased-v2
gtr-t5-base
gtr-t5-large
gtr-t5-xxl
gtr-t5-xl
msmarco-bert-base-dot-v5
msmarco-distilbert-dot-v5
msmarco-distilbert-base-tas-b
msmarco-distilbert-cos-v5
msmarco-MiniLM-L12-cos-v5
msmarco-MiniLM-L6-cos-v5
multi-qa-MiniLM-L6-cos-v1
multi-qa-MiniLM-L6-dot-v1
multi-qa-distilbert-cos-v1
multi-qa-distilbert-dot-v1
multi-qa-mpnet-base-cos-v1
multi-qa-mpnet-base-dot-v1
paraphrase-MiniLM-L12-v2
paraphrase-MiniLM-L3-v2
paraphrase-MiniLM-L6-v2
paraphrase-TinyBERT-L6-v2
paraphrase-albert-small-v2
paraphrase-distilroberta-base-v2
paraphrase-mpnet-base-v2
paraphrase-multilingual-MiniLM-L12-v2
paraphrase-multilingual-mpnet-base-v2
LaBSE
sentence-t5-base
sentence-t5-large
sentence-t5-xl
sentence-t5-xxl
clip-ViT-L-14
clip-ViT-B-16
clip-ViT-B-32
clip-ViT-B-32-multilingual-v1
Qwen/Qwen3-VL-Embedding-2B
hkunlp/instructor-base
hkunlp/instructor-large
hkunlp/instructor-xl
allenai-specter

Note: The models above are the official models provided by Sentence Transformers. For additional supported community models, see the Sentence Transformers community models.

llama.cpp-b6152

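The following lists the models compatible with this template. For usage details and caveats for each model in the compatibility list, see the official llama.cpp documentation.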


Model type	Models
Text-only models	LLaMA
	LLaMA 2
	LLaMA 3
	Mistral 7B
	Mixtral MoE
	DBRX
	Falcon
	Chinese LLaMA / Alpaca
	Chinese LLaMA-2 / Alpaca-2
	Vigogne (French)
	BERT
	Koala
	Baichuan 1 & 2 + derivations
	Aquila 1 & 2
	Starcoder models
	Refact
	MPT
	Bloom
	Yi models
	StableLM models
	Deepseek models
	Qwen models
	PLaMo-13B
	Phi models
	PhiMoE
	GPT-2
	Orion 14B
	InternLM2
	CodeShell
	Gemma
	Mamba
	Grok-1
	Xverse
	Command-R models
	SEA-LION
	GritLM-7B + GritLM-8x7B
	OLMo
	OLMo 2
	OLMoE
	Granite models
	GPT-NeoX + Pythia
	Snowflake-Arctic MoE
	Smaug
	Poro 34B
	Bitnet b1.58 models
	Flan T5
	Open Elm models
	ChatGLM3-6b + ChatGLM4-9b + GLMEdge-1.5b + GLMEdge-4b
	GLM-4-0414
	SmolLM
	EXAONE-3.0-7.8B-Instruct
	FalconMamba Models
	Jais
	Bielik-11B-v2.3
	RWKV-6
	QRWKV-6
	GigaChat-20B-A3B
	Trillion-7B-preview
	Ling models
	LFM2 models
Multimodal models	LLaVA 1.5 models, LLaVA 1.6 models
	BakLLaVA
	Obsidian
	ShareGPT4V
	MobileVLM 1.7B/3B models
	Yi-VL
	Mini CPM
	Moondream
	Bunny
	GLM-EDGE
	Qwen2-VL

SGLang-0.5.11

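The following lists the models compatible with this template. For usage details and caveats for each model in the compatibility list, see the official SGLang documentation.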


Model type	Models
Large language models	DeepSeek (v1, v2, v3/R1)
	Kimi K2 (Thinking, Instruct)
	Kimi Linear (48B-A3B)
	GPT-OSS
	Qwen (3.5, 3, 3MoE, 3Next, 2.5, 2 series)
	Llama (2, 3.x, 4 series)
	Mistral (Mixtral, NeMo, Small3)
	Gemma (v1, v2, v3)
	Phi (Phi-1.5, Phi-2, Phi-3, Phi-4, Phi-MoE series)
	MiniCPM (v3, 4B)
	OLMo (2, 3)
	OLMoE (Open MoE)
	MiniMax-M2 (M2, M2.1, M2.5)
	StableLM (3B, 7B)
	Command-(R,A) (Cohere)
	DBRX (Databricks)
	Grok (xAI)
	ChatGLM (GLM-130B family)
	InternLM 2 (7B, 20B)
	ExaONE 3 (Korean-English)
	Baichuan 2 (7B, 13B)
	XVERSE (MoE)
	SmolLM (135M–1.7B)
	GLM-4 (Multilingual 9B)
	MiMo (7B series)
	ERNIE-4.5 (4.5, 4.5MoE series)
	Arcee AFM-4.5B
	Persimmon (8B)
	Solar (10.7B)
	Tele FLM (52B-1T)
	Ling (16.8B–290B)
	Granite 3.0, 3.1 (IBM)
	Granite 3.0 MoE (IBM)
	GPT-J (6B)
	Orion (14B)
	Llama Nemotron Super (v1, v1.5, NVIDIA)
	Llama Nemotron Ultra (v1, NVIDIA)
	NVIDIA Nemotron Nano 2.0
	NVIDIA Nemotron 3 Super (NVIDIA)
	NVIDIA Nemotron 3 Nano (NVIDIA)
	StarCoder2 (3B-15B)
	Jet-Nemotron
	Trinity (Nano, Mini)
	LFM2 (350M, 1.2B)
	LFM2-MoE (8B-A1B, 24B-A2B)
	Falcon-H1 (0.5B–34B)
	Hunyuan-Large (389B, MoE)
	IBM Granite 4.0 (Hybrid, Dense)
	Sarvam 2 (30B-A2B, 105B-A10B)
	Laguna XS.2 (poolside)
Multimodal models	Qwen-VL (Qwen2-VL, Qwen2.5-VL, Qwen3-VL, Qwen3-Omni)
	DeepSeek-VL2
	DeepSeek-OCR / OCR-2
	Janus-Pro (1B, 7B)
	MiniCPM-V / MiniCPM-o
	Llama 3.2 Vision (11B)
	LLaVA (v1.5 & v1.6)
	LLaVA-NeXT (8B, 72B)
	LLaVA-OneVision
	Gemma 3 (Multimodal)
	Kimi-VL (A3B)
	Mistral-Small-3.1-24B
	Phi-4-multimodal-instruct
	MiMo-VL (7B)
	GLM-4.5V (106B) / GLM-4.1V(9B)
	GLM-OCR
	DotsVLM (General/OCR)
	DotsVLM-OCR
	NVILA (8B, 15B, Lite-2B, Lite-8B, Lite-15B)
	NVIDIA Nemotron Nano 2.0 VL
	Ernie4.5-VL
	JetVLM
	Step3-VL (10B)
	Qwen3-ASR (0.6B, 1.7B)
	Qwen3-Omni
	LFM2-VL
Audio transcription models	Whisper
	Qwen3-ASR (0.6B, 1.7B)
Diffusion language models	LLaDA2.0 (mini, flash)
	SDAR (JetLM, dense/MoE)
Embedding models	E5 (Llama/Mistral based)
	GTE-Qwen2
	Qwen3-Embedding
	BGE
	GME (Multimodal)
	CLIP
Reward models	Llama (3.1 Reward / LlamaForSequenceClassification)
	Gemma 2 (27B Reward / Gemma2ForSequenceClassification)
	InternLM 2 (Reward / InternLM2ForRewardModel)
	Qwen2.5 (Reward - Math / Qwen2ForRewardModel)
	Qwen2.5 (Reward - Sequence / Qwen2ForSequenceClassification)
Reranker models	BGE-Reranker (BgeRerankModel)
	Qwen3-Reranker (decoder-only yes/no)
	Qwen3-VL-Reranker (multimodal yes/no)
Classification models	LlamaForSequenceClassification
	Qwen2ForSequenceClassification
	Qwen3ForSequenceClassification
	BertForSequenceClassification
	Gemma2ForSequenceClassification

MindIE 2.3.0

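The following lists the models compatible with this template. For usage details and caveats for each model in the compatibility list, see the official MindIE documentation.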


Model type	Models
Large language models	Qwen3-235B-A22B
	Qwen3-30B-A3B
	DeepSeek-R1-0528
	DeepSeek-V2-236B
	DeepSeek-V3-0324
	DeepSeek-V3.1
	Mixtral-8x7B-Instruct-V0.1
	Mixtral-8x22B-Instruct-V0.1
	Kimi K2
	GLM4.5
	Ernie 4.5
	DeepSeek-R1-Distill-Llama-8B
	DeepSeek-R1-Distill-Llama-70B
	DeepSeek-R1-Distill-Qwen-1.5B
	DeepSeek-R1-Distill-Qwen-7B
	DeepSeek-R1-Distill-Qwen-14B
	Qwen2-7B-Instruct
	Qwen2-72B-Instruct
	Qwen2.5-7B-Instruct
	Qwen2.5-14B-Instruct
	Qwen2.5-32B-Instruct
	Qwen2.5-72B-Instruct
	Qwen3-4B
	Qwen3-8B
	Qwen3-14B
	Qwen3-32B
	LLaMA3-8B
	LLaMA3-70B
	LLaMA3.1-8B
	LLaMA3.1-70B
	LLaMA3.1-405B
	ChatGLM3-6B
	GLM4-9B
	Baichuan2-7B
	Baichuan2-13B
	Bloom-7B
Multimodal understanding models	GLM-4V-9B
	MiniCPM-V2.6-8B
	InternVL2-8B
	InternVL2-40B
	InternVL2.5-8B
	InternVL2.5-78B
	Qwen2-Audio-7B-Instruct
	Qwen2-VL-7B-Instruct
	Qwen2-VL-72B-Instruct
	Qwen2.5-VL-7B-Instruct
	Qwen2.5-VL-32B-Instruct
	Qwen2.5-VL-72B-Instruct
	VITA1.5-8B
Multimodal generation models	Stable Diffusion 1.5
	Stable Diffusion 2.1
	Stable Diffusion XL
	Stable Diffusion XL_lighting
	Stable Diffusion 3
	Stable Video Diffusion
	Stable Audio Open v1.0
	OpenSora v1.2
	OpenSoraPlan v1.2
	OpenSoraPlan v1.3
	DiT
	sd-webui
	CogView3-Plus-3B
	CogVideoX-2B
	CogVideoX-5B
	FLUX.1-dev
	HunyuanDiT
	HunyuanVideo
	Wan2.1-T2V-14B
	Wan2.1-I2V-14B
	Wan2.2-T2V-A14B
	Wan2.2-I2V-A14B
	Wan2.2-TI2V-5B

MindIE 1.0.0

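The following lists the models compatible with this template. For usage details and caveats for each model in the compatibility list, see the official MindIE documentation.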


Model type	Models
Large language models	DeepSeek-V2-Lite-16B
	DeepSeek-V2-236B
	Qwen2.5-72B
	Qwen2.5-32B
	Qwen2.5-14B
	Qwen2.5-7B
	Qwen2-57B-A14B
	Qwen2-72B
	Qwen2-7B
	Qwen1.5-0.5B
	Qwen1.5-1.8B
	Qwen1.5-4B
	Qwen-7B
	Qwen-14B
	Qwen-72B
	LLaMA3-8B
	LLaMA3-70B
	LLaMA3.1-8B
	LLaMA3.1-70B
	LLaMA3.1-405B
	LLaMA-7B
	LLaMA-13B
	LLaMA-33B
	LLaMA-65B
	LLaMA2-7B
	LLaMA2-13B
	LLaMA2-70B
	ChatGLM2-6B
	ChatGLM3-6B
	ChatGLM3-6B-32K
	GLM4-9B-Chat
	Baichuan2-7B
	Baichuan2-13B
	Bloom-7B
	Bloom-176B
	CodeLLaMA-34B
	StarCoder-15.5B
	StarCoder2-15B
	Yi-6B-200K
	Yi-34B-200K
	CodeGeeX2-6B
	CodeShell-7B
	Gemma-7B
	GPT-NEOX-20B
	Ziya-Coding-34B
	InternLM2-20B
	InternLM-20B
	InternLM2-7B
	Mixtral-8x7B-Instruct-V0.1
	Mixtral-8x22B-Instruct-V0.1
	Vicuna-13B
Embedding models	bge-large-zh-v1.5
	bge-reranker-large
	bge-m3
Multimodal understanding models	InternVL-Chat-V1-2
	InternVL-Chat-V1-5
	InternVL2-8B
	InternVL2-40B
	Qwen-VL-9.6B
	Qwen2-Audio-7B-Instruct
	Qwen2-VL-7B-Instruct
	internLM-xcomposer2-vl-7B
	internLM-XComposer2-4KHD-7B
	LLava-1.6-mistral-7B
	LLava-1.6-vicuna-7B
	LLava-1.6-vicuna-13B
	LLava-v1.6-34b-hf
	LLava-next-video-34b
	LLava-next-video-7b
	LLava-v1.5-13B
	LLava-v1.5-7B
	MiniCPM-Llama3-V-2_5
	MiniCPM-V-2
Multimodal generation models	Stable Diffusion 1.5
	Stable Diffusion 2.1
	Stable Diffusion XL
	Stable Diffusion XL_controlnet
	Stable Diffusion XL_inpainting
	Stable Diffusion XL_prompt_weight
	Stable Diffusion 3
	Stable Video Diffusion
	Stable Audio Open v1.0
	OpenSora v1.2
	DiT
	sd-webui
	CogView3-Plus-3B
	HunyuanDiT
