System-level intelligence

Signal
before scale

Built on Shannon signals, entropy folding, and neural-symbolic routing.

A router should feel like a system brain: encoder-guided, entropy-aware, and ruthlessly clear.

Signals: 13

13 signal families spanning intent, safety, modality, context, and preference.

Selection: 12

12 selectors across symbolic policy, latency heuristics, reinforcement learning, and ML routing.

Surfaces: 03

One architecture across cpu-local, amd-local, and ci-k8s.

Quick start

One supported local path. Copy the installer, run it, then open the dashboard.

Install locally in one line.

The supported first-run path is a single installer that sets up the CLI and local serve flow on macOS and Linux.

One-liner install (macOS / Linux)
curl -fsSL https://vllm-semantic-router.com/zh-Hans/install.sh | bash

Installs into ~/.local/share/vllm-sr, writes ~/.local/bin/vllm-sr, and keeps Windows on the manual pip flow in the docs.

Core logic

Neural-symbolic routing, kept legible.

Encoder priors, Shannon mapping, entropy folding, and model selection stay visible from research prototypes to production paths.

Signal extraction

Encoder signals turn raw requests into legible semantic state.

Decision engine

Neural signals meet symbolic rules in auditable routing logic.

Plugin chain

Cache, safety, rewrite, and tracing attach as composable behaviors.

Intent-to-policy compile

Natural language intent compiles into neural-symbolic policy before execution begins.

Research-grade model selection

Selection stays measurable enough for papers, benchmarks, and production tuning.

System docs

Docs, papers, and product routes read as one system, not scattered collateral.

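Routing blueprint

How the system works

Use the interactive demo to understand signal extraction, decision logic, and model routing behavior.

Shannon mapping

A structural mapping from communication theory to the routing pipeline.

The user request is the raw source message before encoding.

Built on encoder models

Encoder-driven intelligence

Specialized encoder models extract semantics from every request: understanding intent, ranking relevance, and classifying content across modalities in real time.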

Signal surfaces

Sequence classification, token labeling, embeddings, and reranking collapse into one system-intelligence layer.

SEQ_CLS: Sequence classification for domain, jailbreak, fact-check, and feedback routing.
TOKEN: Token labeling for PII and safety-sensitive spans that need localized intervention.
EMBED: Embedding and rerank paths for semantic cache, similarity search, and candidate scoring.
MOD: Multimodal detection that routes text, image, and audio inputs to the appropriate modality model.

Encoder pipeline (diagram)

Input: "Is machine learning related to AI?"
Tokenizer: [CLS] Is machine learning related to AI? [SEP]
Embedding: h₀ = Σ (token embedding, segment embedding, position embedding)
Encoder block (×N): multi-head attention, add & norm, feed-forward, add & norm

Signals:
CLS, sentence-level (CLS token): [CLS] → linear head → "computer_science". TaskType: SEQ_CLS. Used for domain, jailbreak, fact-check, feedback, and modality.
BIO, token-level (per token): each token → BIO label → O O B-LOC I-LOC O. TaskType: TOKEN_CLS. Used for PII detection.
EMB, bi-encoder: mean-pooling(h₁..hₙ) → [0.23, -0.41, 0.87, ...]. TaskType: EMBEDDING. Used for semantic cache, similarity, Complexity-CL, and Jailbreak-CL.
RER, cross-encoder: [CLS] query [SEP] candidate [SEP] → score. TaskType: CROSS_LEARNING. Used for rerank and multi-modal.
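The token-level signal in the diagram emits one BIO label per token (for example O O B-LOC I-LOC O). A minimal sketch of how such labels could be grouped into spans follows. The function name `bio_to_spans` and the sample sentence are illustrative assumptions, not part of the router's API.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_type, text) spans.

    Hypothetical helper for illustration only. A stray I- tag that does
    not continue the current span is treated like O and dropped.
    """
    spans: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Close the span in progress, if any.
        if current_tokens:
            spans.append((current_type, " ".join(current_tokens)))

    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            flush()
            current_type = label[2:]
            current_tokens = [token]
        elif label.startswith("I-") and current_type == label[2:]:
            current_tokens.append(token)
        else:
            flush()
            current_type = None
            current_tokens = []
    flush()
    return spans


tokens = ["Ship", "to", "New", "York", "today"]
labels = ["O", "O", "B-LOC", "I-LOC", "O"]
print(bio_to_spans(tokens, labels))  # [('LOC', 'New York')]
```

Returning typed spans rather than raw labels is one way to support the localized intervention mentioned for PII, since a downstream step can mask or rewrite only the flagged text.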
Bi-Encoder embedding

Encodes queries and candidates independently into dense vectors for similarity search and semantic caching.
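The bi-encoder path can be sketched in a few lines: mean-pool the per-token hidden states into one vector, then compare vectors with cosine similarity to decide whether a cached entry is close enough to reuse. This is a sketch under stated assumptions: the helper names, the dictionary cache, and the 0.9 threshold are hypothetical, and a real deployment would take hidden states from an encoder model rather than hand-written lists.

```python
import math
from typing import Dict, List, Optional, Sequence


def mean_pool(hidden_states: Sequence[Sequence[float]]) -> List[float]:
    """Average per-token vectors h_1..h_n into one sentence vector."""
    n = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(h[i] for h in hidden_states) / n for i in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cache_lookup(
    query_vec: Sequence[float],
    cache: Dict[str, Sequence[float]],
    threshold: float = 0.9,  # assumed value, tune per workload
) -> Optional[str]:
    """Return the key of the most similar cached vector at or above threshold."""
    best_key, best_score = None, threshold
    for key, vec in cache.items():
        score = cosine_similarity(query_vec, vec)
        if score >= best_score:
            best_key, best_score = key, score
    return best_key


cache = {"what is ml": [1.0, 0.0], "weather today": [0.0, 1.0]}
query = mean_pool([[0.8, 0.0], [1.0, 0.2]])
print(cache_lookup(query, cache))
```

Because queries and candidates are encoded independently, candidate vectors can be computed once and stored, which is what makes this path suitable for caching and large-scale similarity search.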

Cross-Encoder learning

Joint cross-attention scores query-candidate pairs for high-precision reranking.

Classification

In-house BERT-based classifiers for domain, jailbreak, PII, and fact-check detection, covering multiple signals.

Full attention

Bidirectional attention across tokens and sentences: full context in both directions, with no causal mask.
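The difference between bidirectional attention and a causal mask can be shown on a toy score matrix. The sketch below is illustrative only: `attention_weights` is a hypothetical helper that applies a row-wise softmax, optionally hiding future positions the way a decoder-style model would.

```python
import math
from typing import List


def attention_weights(scores: List[List[float]], causal: bool = False) -> List[List[float]]:
    """Row-wise softmax over raw attention scores.

    With causal=True, position i cannot attend to positions j > i.
    With causal=False (encoder style), every token sees every other token.
    """
    n = len(scores)
    weights = []
    for i in range(n):
        row = [
            float("-inf") if (causal and j > i) else scores[i][j]
            for j in range(n)
        ]
        row_max = max(row)  # finite, since position i itself is never masked
        exps = [math.exp(s - row_max) for s in row]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


uniform = [[0.0] * 3 for _ in range(3)]
print(attention_weights(uniform))               # each row spreads weight over all 3 tokens
print(attention_weights(uniform, causal=True))  # row 0 attends only to token 0
```

With uniform scores, the bidirectional rows split weight evenly across all three tokens, while the causal rows give the first token no view of what follows it.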

2DMSE

Adaptively adjusts the number of embedding layers and dimensions at inference time, balancing compute and accuracy on demand.

MRL

Truncate embedding vectors to any dimension without retraining, balancing accuracy and speed per request.
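Truncation of this kind is usually just a slice followed by re-normalization. The sketch below assumes the embedding came from a model trained with Matryoshka Representation Learning, so that the leading dimensions carry most of the information; `truncate_embedding` is a hypothetical helper, not a router API.

```python
import math
from typing import List, Sequence


def truncate_embedding(vec: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components and rescale to unit length.

    Re-normalizing keeps cosine and dot-product scores comparable
    across different truncation sizes.
    """
    if not 1 <= dim <= len(vec):
        raise ValueError(f"dim must be between 1 and {len(vec)}, got {dim}")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0:
        return head  # nothing to rescale
    return [x / norm for x in head]


full = [3.0, 4.0, 12.0]
print(truncate_embedding(full, 2))  # shorter vector, cheaper to compare
print(truncate_embedding(full, 3))  # full size, normalized
```

A caller could pick `dim` per request, using a small size for latency-sensitive lookups and the full size when accuracy matters more.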

Contributors

Meet our team

The outstanding people behind vLLM Semantic Router

Huamin Chen (Maintainer): Distinguished Engineer @Red Hat
Chen Wang (Maintainer): Senior Staff Research Scientist @IBM
Yue Zhu (Maintainer): Staff Research Scientist @IBM
Xunzhuo Liu (Maintainer): Intelligent Routing @vLLM
Senan Zedan (Committer): R&D Manager @Red Hat
samzong (Committer): AI Infrastructure / Cloud-Native PM @DaoCloud
Liav Weiss (Committer): Software Engineer @Red Hat
Asaad Balum (Committer): Senior Software Engineer @Red Hat
Yehudit (Committer): Software Engineer @Red Hat
Noa Limoy (Committer): Software Engineer @Red Hat
JaredforReal (Committer): Software Engineer @Z.ai
Srinivas A (Committer): Software Engineer @Yokogawa
carlory (Committer): Open Source Engineer @DaoCloud
Yossi Ovadia (Committer): Senior Principal Engineer @Red Hat
Jintao Zhang (Committer): Senior Software Engineer @Kong
yuluo-yx (Committer): Individual Contributor
cryo-zd (Committer): Individual Contributor
OneZero-Y (Committer): Individual Contributor
aeft (Committer): Individual Contributor
Hao Wu (Committer): Individual Contributor
Qiping Pan (Committer): Individual Contributor

Maintainers, committers, and contributors across research, infrastructure, and open-source operations.

View all team members
Documentation

Architecture, written to be used.

Install, configure, train, and operate from one dense documentation graph.

Docs index
Community

Research and builders in one loop.

Papers, working groups, and contributors evolve the same system in public.

Community routes