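Configuration
This guide covers the configuration options for Semantic Router. The system uses a single YAML configuration file to control Signal-Driven Routing, Plugin Chain processing, and model selection.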
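Architecture Overview
The configuration defines three main layers:
- Signal Extraction Layer: defines 7 types of signals (keyword, embedding, domain, fact_check, user_feedback, preference, language)
- Decision Engine: combines signals using AND/OR operators to make routing decisions
- Plugin Chain: configures plugins for caching, security, and optimization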
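The Decision Engine combines signals with AND as well as OR, but the sample configuration in this guide only uses OR. As a minimal sketch, an AND rule might look like the fragment below. It assumes that operator: "AND" accepts the same rules/conditions shape as "OR" and requires every listed condition to match. The decision name math_strict and the priority value are hypothetical and chosen only for illustration; the signal names are the ones defined in the sample configuration.

```yaml
# Hypothetical decision: "math_strict" and priority 20 are illustrative only.
# Assumption: "AND" uses the same rules/conditions layout as "OR".
decisions:
  - name: math_strict
    description: "Route only when the keyword and domain signals both match"
    priority: 20
    rules:
      operator: "AND" # assumed: all conditions must match
      conditions:
        - type: "keyword"
          name: "math_keywords"
        - type: "domain"
          name: "mathematics"
    modelRefs:
      - model: your-model
        use_reasoning: true
```

Requiring both a cheap keyword match and a domain classification is one way to reduce false positives compared with an OR rule, at the cost of missing queries that trigger only one of the two signals.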
Configuration File
The configuration file is located at config/config.yaml. The following structure is based on the actual implementation:
# config/config.yaml - actual configuration structure
# BERT model used for semantic similarity
bert_model:
  model_id: sentence-transformers/all-MiniLM-L12-v2
  threshold: 0.6
  use_cpu: true
# Semantic cache
semantic_cache:
  backend_type: "memory" # Options: "memory" or "milvus"
  enabled: false
  similarity_threshold: 0.8 # Global default threshold
  max_entries: 1000
  ttl_seconds: 3600
  eviction_policy: "fifo" # Options: "fifo", "lru", "lfu"
# Automatic tool selection
tools:
  enabled: false
  top_k: 3
  similarity_threshold: 0.2
  tools_db_path: "config/tools_db.json"
  fallback_to_empty: true
# Jailbreak protection
prompt_guard:
  enabled: false # Global default - can be overridden per category
  use_modernbert: true
  model_id: "models/jailbreak_classifier_modernbert-base_model"
  threshold: 0.7
  use_cpu: true
# vLLM endpoints - your backend models
vllm_endpoints:
  - name: "endpoint1"
    address: "192.168.1.100" # Replace with your server's IP address
    port: 11434
    models:
      - "your-model" # Replace with your model
    weight: 1
# Model configuration
model_config:
  "your-model":
    pii_policy:
      allow_by_default: true
      pii_types_allowed: ["EMAIL_ADDRESS", "PERSON"]
    preferred_endpoints: ["endpoint1"]
  # Example: DeepSeek model with a custom name
  "ds-v31-custom":
    reasoning_family: "deepseek" # Use DeepSeek reasoning syntax
    preferred_endpoints: ["endpoint1"]
  # Example: Qwen3 model with a custom name
  "my-qwen3-model":
    reasoning_family: "qwen3" # Use Qwen3 reasoning syntax
    # Note: "endpoint2" is not defined under vllm_endpoints in this example
    preferred_endpoints: ["endpoint2"]
  # Example: model without reasoning support
  "phi4":
    preferred_endpoints: ["endpoint1"]
# Classification models
classifier:
  category_model:
    model_id: "models/category_classifier_modernbert-base_model"
    use_modernbert: true
    threshold: 0.6
    use_cpu: true
  pii_model:
    model_id: "models/pii_classifier_modernbert-base_presidio_token_model"
    use_modernbert: true
    threshold: 0.7
    use_cpu: true
# Signals - signal extraction configuration
signals:
  # Keyword-based signals (fast pattern matching)
  keywords:
    - name: "math_keywords"
      operator: "OR"
      keywords:
        - "calculate"
        - "equation"
        - "solve"
        - "derivative"
        - "integral"
      case_sensitive: false
    - name: "code_keywords"
      operator: "OR"
      keywords:
        - "function"
        - "class"
        - "debug"
        - "compile"
      case_sensitive: false
  # Embedding-based signals (semantic similarity)
  embeddings:
    - name: "code_debug"
      threshold: 0.70
      candidates:
        - "how to debug the code"
        - "troubleshooting steps for my code"
      aggregation_method: "max"
    - name: "math_intent"
      threshold: 0.75
      candidates:
        - "solve mathematical problem"
        - "calculate the result"
      aggregation_method: "max"
  # Domain signals (MMLU classification)
  domains:
    - name: "mathematics"
      description: "Mathematical and computational problems"
      mmlu_categories:
        - "abstract_algebra"
        - "college_mathematics"
        - "elementary_mathematics"
    - name: "computer_science"
      description: "Programming and computer science"
      mmlu_categories:
        - "computer_security"
        - "machine_learning"
  # Fact-check signals (detect when verification is needed)
  fact_check:
    - name: "needs_verification"
      description: "Queries requiring fact verification"
  # User feedback signals (satisfaction analysis)
  user_feedbacks:
    - name: "correction_needed"
      description: "User indicates previous answer was wrong"
  # Preference signals (LLM-based matching)
  preferences:
    - name: "complex_reasoning"
      description: "Requires deep reasoning and analysis"
      llm_endpoint: "http://localhost:11434"
# Categories - define domain categories
categories:
  - name: math
  - name: computer science
  - name: other
# Decisions - combine signals to make routing decisions
decisions:
  - name: math
    description: "Route mathematical queries"
    priority: 10
    rules:
      operator: "OR" # Match any condition
      conditions:
        - type: "keyword"
          name: "math_keywords"
        - type: "embedding"
          name: "math_intent"
        - type: "domain"
          name: "mathematics"
    modelRefs:
      - model: your-model
        use_reasoning: true # Enable reasoning for math problems
    # Optional: decision-level plugins
    plugins:
      - type: "semantic-cache"
        configuration:
          enabled: true
          similarity_threshold: 0.9 # Math problems need a higher threshold
      - type: "jailbreak"
        configuration:
          enabled: true
      - type: "pii"
        configuration:
          enabled: true
          threshold: 0.8
      - type: "system_prompt"
        configuration:
          enabled: true
          prompt: "You are a mathematics expert. Solve problems step by step."
  - name: computer_science
    description: "Route computer science queries"
    priority: 10
    rules:
      operator: "OR"
      conditions:
        - type: "keyword"
          name: "code_keywords"
        - type: "embedding"
          name: "code_debug"
        - type: "domain"
          name: "computer_science"
    modelRefs:
      - model: your-model
        use_reasoning: true # Enable reasoning for code
    plugins:
      - type: "semantic-cache"
        configuration:
          enabled: true
          similarity_threshold: 0.85
      - type: "system_prompt"
        configuration:
          enabled: true
          prompt: "You are a programming expert. Provide clear code examples."
  - name: other
    description: "Route general queries"
    priority: 5
    rules:
      operator: "OR"
      conditions:
        - type: "domain"
          name: "other"
    modelRefs:
      - model: your-model
        use_reasoning: false # General queries do not use reasoning
    plugins:
      - type: "semantic-cache"
        configuration:
          enabled: true
          similarity_threshold: 0.75 # General queries use a lower threshold
default_model: your-model
# Reasoning family configuration - defines how different model families handle reasoning syntax
reasoning_families:
  deepseek:
    type: "chat_template_kwargs"
    parameter: "thinking"
  qwen3:
    type: "chat_template_kwargs"
    parameter: "enable_thinking"
  gpt-oss:
    type: "reasoning_effort"
    parameter: "reasoning_effort"
  gpt:
    type: "reasoning_effort"
    parameter: "reasoning_effort"
# Global default reasoning effort level
default_reasoning_effort: "medium"
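Assign reasoning families in the model_config block above: each model sets reasoning_family (see ds-v31-custom and my-qwen3-model in the example). Models that do not support reasoning syntax simply omit the field (for example, phi4).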
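As a further illustration, the fragment below sketches how a model might be assigned to the gpt-oss family, which the sample configuration defines but does not use. The model name my-gpt-oss-model is hypothetical. The sketch assumes that the reasoning_family value is expected to match a key under reasoning_families.

```yaml
# Hypothetical entry: "my-gpt-oss-model" is an illustrative name only.
model_config:
  "my-gpt-oss-model":
    reasoning_family: "gpt-oss" # assumed to match a key under reasoning_families
    preferred_endpoints: ["endpoint1"]
```

Under this assumption, a decision that sets use_reasoning: true for this model would presumably express reasoning through the reasoning_effort parameter, with default_reasoning_effort supplying the level when none is given. The model would likely also need to appear in the models list of endpoint1 under vllm_endpoints.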