feat(api): interview personas, reply-length policy, oral normalization, background voice, and end-to-end input normalization

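Chat interview
- Add a persona system (default / warm_listener / curious_guide) and a background_voice tone layer
- Reply length is now decided in one place by compute_reply_plan (brief / standard / expanded), which incorporates an information-density heuristic
- Input normalization (input_normalize): the orchestration layer can optionally normalize the user's colloquial input via rules or an LLM before passing it to the model and to memory retrieval
- Memory evidence injection: retrieve memory evidence based on the user's utterance and inject it into the prompt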
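
The reply-length bullet above could be sketched roughly as follows. This is not the compute_reply_plan from this commit: the function name compute_reply_plan_sketch, the thresholds, the signal definition, and the direction of the mapping (denser input gets a longer reply) are all assumptions chosen for illustration.

```python
import re

# Hypothetical thresholds; the real values are not given in the commit message.
BRIEF_MAX_CHARS = 15
EXPANDED_MIN_CHARS = 80
EXPANDED_MIN_SIGNALS = 4


def compute_reply_plan_sketch(user_text: str) -> str:
    """Return "brief", "standard", or "expanded" for a user utterance.

    Assumed heuristic: input length plus a crude information-density
    signal (digit runs and clause punctuation) decide the reply tier.
    """
    text = user_text.strip()
    length = len(text)
    # Count runs of digits (dates, ages) and clause separators as "detail" signals.
    digit_runs = len(re.findall(r"\d+", text))
    clause_marks = len(re.findall(r"[，。；、,.;]", text))
    signals = digit_runs + clause_marks

    if length < BRIEF_MAX_CHARS and signals == 0:
        return "brief"
    if length >= EXPANDED_MIN_CHARS or signals >= EXPANDED_MIN_SIGNALS:
        return "expanded"
    return "standard"
```

Under these assumptions, a short acknowledgement such as "ok" maps to brief, while an utterance with several dates and clauses maps to expanded.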

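Memoir
- Oral normalization (oral_normalize): the segment's original text is preserved; the story pipeline takes a derived clean draft as its narrative input
- Segment enqueue batch gate: accumulated character count plus a maximum wait in seconds, to reduce fragmented submissions
- Minor tweaks to fidelity_check / prompts / narrative_agent
- Alembic 0005: clean up cross-chapter story foreign keys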
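
The batch gate bullet above (accumulated character count plus maximum wait) might look something like the sketch below. The class name SegmentBatchGate, its fields, and the default thresholds are hypothetical; only the idea of passing a per-segment text_char_count into the queue comes from this commit.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SegmentBatchGate:
    """Hypothetical gate: release queued segments once enough characters
    have accumulated or the oldest queued segment has waited long enough."""

    min_chars: int = 200            # assumed character threshold
    max_wait_seconds: float = 30.0  # assumed wait threshold
    _segment_ids: List[str] = field(default_factory=list)
    _char_total: int = 0
    _first_queued_at: Optional[float] = None

    def add(self, segment_id: str, text_char_count: int,
            now: Optional[float] = None) -> List[str]:
        """Queue one segment; return a batch if the gate opens, else []."""
        now = time.monotonic() if now is None else now
        if self._first_queued_at is None:
            self._first_queued_at = now
        self._segment_ids.append(segment_id)
        self._char_total += max(0, text_char_count)
        return self.flush_if_ready(now)

    def flush_if_ready(self, now: Optional[float] = None) -> List[str]:
        """Return and clear the pending batch if either threshold is met."""
        if not self._segment_ids:
            return []
        now = time.monotonic() if now is None else now
        waited = now - self._first_queued_at
        if self._char_total >= self.min_chars or waited >= self.max_wait_seconds:
            batch = self._segment_ids
            self._segment_ids = []
            self._char_total = 0
            self._first_queued_at = None
            return batch
        return []
```

In this sketch a periodic timer would need to call flush_if_ready so that a lone short segment is still released after max_wait_seconds.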

Infra
- Add ffmpeg to the Dockerfile
- Add new dependencies to pyproject.toml and sync uv.lock
- Fill in the new configuration options in .env.example / .env.production

Tests
- Add test_background_voice, test_chat_input_normalize, test_experience_regressions
- Extend test_interview_prompts, test_interview_reply_length, test_story_route_oral_invariant

Made-with: Cursor
Kevin
2026-03-31 23:55:26 +08:00
parent 42ae2a5e91
commit 69a673e6c6
44 changed files with 2998 additions and 259 deletions


@@ -11,6 +11,7 @@ from datetime import datetime, timezone
from fastapi import WebSocket, WebSocketDisconnect, status
from starlette.websockets import WebSocketState
+from app.agents.chat.background_voice import infer_background_voice
from app.agents.chat.prompts_profile import format_user_profile_context
from app.core.db import AsyncSessionLocal
from app.core.dependencies import get_asr_provider
@@ -201,6 +202,9 @@ async def websocket_endpoint(
conversation_id=conversation_id,
memoir_state=state,
user_profile_context=user_profile_context,
+background_voice=infer_background_voice(
+user.occupation
+),
)
)
ai_msg_id = await ConversationHistoryStore(
@@ -300,7 +304,9 @@ async def websocket_endpoint(
await db.commit()
await db.refresh(segment)
await background_runner.queue_message(
-conversation.user_id, segment.id
+conversation.user_id,
+segment.id,
+text_char_count=len(text_message.strip()),
)
await process_user_message(
@@ -563,7 +569,9 @@ async def websocket_endpoint(
await db.commit()
await db.refresh(segment)
await background_runner.queue_message(
-conversation.user_id, segment.id
+conversation.user_id,
+segment.id,
+text_char_count=len((asr_text or "").strip()),
)
if asr_text and not asr_text.startswith("转写失败"):