feat(api): interview personas and reply-length policy, oral normalization, background voice, and end-to-end input normalization

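Chat interview
- Add a persona system (default / warm_listener / curious_guide) and a background_voice tone layer
- Reply length is now decided in one place by compute_reply_plan (brief / standard / expanded), incorporating an information-density heuristic
- Input normalization (input_normalize): the orchestration layer can optionally normalize the user's spoken-style input via rules or an LLM before passing it to the model and to memory retrieval
- Memory evidence injection: retrieve memory evidence based on the user's utterance and inject it into the prompt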
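
A minimal sketch of how a three-tier reply plan could be chosen. The caps are the Settings defaults shown in this commit's diff; the function name `sketch_reply_plan`, the `ReplyPlan` type, the `has_new_detail` flag, and the length thresholds are hypothetical stand-ins for illustration, not the actual compute_reply_plan logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplyPlan:
    tier: str
    max_tokens: int
    max_chars_per_segment: int


# (max_tokens, max_chars_per_segment) per tier, copied from the Settings defaults.
_CAPS = {
    "brief": (260, 200),
    "standard": (380, 260),
    "expanded": (520, 380),
}


def sketch_reply_plan(
    user_text: str,
    *,
    has_new_detail: bool = False,
    cadre_military: bool = False,
) -> ReplyPlan:
    """Pick a reply tier from a crude density heuristic (assumed, not the real one)."""
    text = user_text.strip()
    # Hypothetical thresholds: very short input -> brief, long or detailed -> expanded.
    if len(text) <= 8 and not has_new_detail:
        tier = "brief"
    elif has_new_detail or len(text) >= 120:
        tier = "expanded"
    else:
        tier = "standard"

    max_tokens, max_chars = _CAPS[tier]
    # Only the standard tier is relaxed for the cadre/military case (+40 / +40).
    if cadre_military and tier == "standard":
        max_tokens += 40
        max_chars += 40
    return ReplyPlan(tier, max_tokens, max_chars)
```

Keeping the tier choice and the caps in one function means the token limit sent to the model and the code-side truncation cannot drift apart.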

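Memoir
- Oral normalization (oral_normalize): the segment's original text is preserved; the story pipeline uses a derived clean draft as narrative input
- Batch gate for segment enqueueing: accumulated character count plus a maximum wait in seconds, to reduce fragmentary submissions
- Minor tweaks to fidelity_check / prompts / narrative_agent
- Alembic 0005: clean up cross-chapter story foreign keys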
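
The batch gate described above can be sketched roughly as follows. The defaults mirror memoir_segment_batch_min_chars and memoir_segment_batch_max_wait_seconds from the diff; the function name `should_submit_batch` and its parameters are hypothetical, and the real Celery wiring (debounce, task scheduling) is not shown.

```python
def should_submit_batch(
    accumulated_chars: int,
    first_enqueued_at: float,
    now: float,
    min_chars: int = 50,
    max_wait_seconds: float = 60.0,
) -> bool:
    """Return True when the pending segment batch should be submitted.

    accumulated_chars: total stripped character count of the pending segments.
    first_enqueued_at / now: timestamps in seconds (e.g. from time.monotonic()).
    """
    # min_chars == 0 disables the gate: submit as soon as the debounce fires.
    if min_chars <= 0:
        return True
    # Enough oral text has accumulated.
    if accumulated_chars >= min_chars:
        return True
    # Too little text, but the oldest segment has waited long enough.
    return (now - first_enqueued_at) >= max_wait_seconds
```

The wait bound guarantees that a short trailing remark is still processed instead of sitting in the queue indefinitely.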

Infra
- Add ffmpeg to the Dockerfile
- Add new dependencies to pyproject.toml and sync uv.lock
- Fill in the new config options in .env.example / .env.production

Tests
- Add test_background_voice, test_chat_input_normalize, test_experience_regressions
- Extend test_interview_prompts, test_interview_reply_length, test_story_route_oral_invariant

Made-with: Cursor
Kevin
2026-03-31 23:55:26 +08:00
parent 42ae2a5e91
commit 69a673e6c6
44 changed files with 2998 additions and 259 deletions


@@ -54,20 +54,55 @@ class Settings(BaseSettings):
embedding_base_url: str = "https://open.bigmodel.cn/api/paas/v4"
embedding_model: str = "embedding-3"
# ── Chat interview (short replies: token cap + code-side truncation, see reply_limits) ──
chat_interview_max_tokens: int = 320
# ── Chat interview (token cap + code-side truncation, see reply_limits) ──
chat_interview_max_tokens: int = 380
chat_interview_max_segments: int = 2
chat_interview_max_chars_per_segment: int = 220
chat_interview_max_chars_per_segment: int = 260
# Interview: tighter caps when the user's input this turn is very short (see interview_reply_length)
chat_interview_brief_max_tokens: int = Field(default=260, ge=64, le=2048)
chat_interview_brief_max_chars_per_segment: int = Field(default=200, ge=32, le=2000)
# Interview: expanded caps when there are new details, emotion, or a long passage
chat_interview_expanded_max_tokens: int = Field(default=520, ge=64, le=4096)
chat_interview_expanded_max_chars_per_segment: int = Field(
default=380, ge=32, le=4000
)
# When the cadre/military inference hits, slightly relax the standard tier on top of its bucketed caps (brief/expanded unchanged)
chat_interview_cadre_military_standard_extra_tokens: int = Field(
default=40, ge=0, le=512
)
chat_interview_cadre_military_standard_extra_chars: int = Field(
default=40, ge=0, le=2000
)
chat_opening_max_tokens: int = 256
chat_profile_followup_max_tokens: int = 280
chat_era_context_enabled: bool = True
# Interview: each turn, use the LLM to determine the user's main life stage and update MemoirState.current_stage; when False, use keywords only
chat_stage_detection_enabled: bool = True
chat_stage_detection_max_tokens: int = 128
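# Interview persona: default | warm_listener | curious_guide; unknown values fall back to default
chat_interview_persona: str = "default"
# Interview: retrieve memories based on the user's utterance this turn and inject them into the prompt; when off, MemoryService.retrieve is not called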
chat_memory_retrieval_enabled: bool = True
chat_memory_top_k: int = Field(default=8, ge=1, le=30)
chat_memory_evidence_max_chars: int = Field(default=4096, ge=256, le=50_000)
# ── Memoir narrative fidelity check (FidelityCheckAgent) ────────────────
memoir_fidelity_check_enabled: bool = True
memoir_fidelity_check_max_tokens: int = 512
# Oral normalization (applied before narrative / fidelity; the stored segment original is not modified): off | rules | llm
memoir_oral_normalize_enabled: bool = True
memoir_oral_normalize_mode: str = "rules"
memoir_oral_normalize_llm_max_tokens: int = Field(default=512, ge=64, le=4096)
memoir_oral_normalize_llm_max_input_chars: int = Field(
default=8000, ge=64, le=50_000
)
# Chat: the model consumes the clean draft (the stored segment original is unchanged); shares the rule layer with memoir, configured independently
chat_input_normalize_enabled: bool = True
chat_input_normalize_mode: str = "rules" # off | rules | llm
chat_input_normalize_llm_max_tokens: int = Field(default=512, ge=64, le=4096)
chat_input_normalize_llm_max_input_chars: int = Field(
default=8000, ge=64, le=50_000
)
# ── ASR ───────────────────────────────────────────────────
asr_provider: str = "whisper"
@@ -163,9 +198,15 @@ class Settings(BaseSettings):
evidence_top_k_default: int = Field(default=10, ge=1, le=50)
evidence_top_k_large_batch: int = Field(default=5, ge=1, le=50)
evidence_large_batch_threshold: int = Field(default=3, ge=1, le=100)
# If the narrative output is too short relative to the oral text, fall back to the original oral text (ratio and floor)
memoir_narrative_fallback_body_ratio: float = 0.5
memoir_narrative_fallback_min_chars: int = 20
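# Fallback when the narrative output is extremely short relative to the oral text: guards only against extreme compression (0.3 = triggers only when the model output is under 30% of the oral text)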
memoir_narrative_fallback_body_ratio: float = 0.3
memoir_narrative_fallback_min_chars: int = 15
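# Memoir Celery: hold submission while the accumulated stripped oral character count is below this value (0 = disabled; submit after debounce only)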
memoir_segment_batch_min_chars: int = Field(default=50, ge=0, le=50_000)
# Maximum wait (seconds) since the first segment of the batch was enqueued; submit on timeout even if the character count is insufficient
memoir_segment_batch_max_wait_seconds: float = Field(
default=60.0, ge=0.0, le=3600.0
)
# ── Memory retrieval and enrichment ─────────────────────────────────────
# True: when query is empty, still return the rolling summary + recent facts/timeline (no chunk FTS)