feat(api): interview persona and reply-length policy, oral normalization, background voice, and end-to-end input normalization

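Chat interview
- Add a persona system (default / warm_listener / curious_guide) and a background_voice tone layer
- Reply length is now decided in one place by compute_reply_plan (brief / standard / expanded), which blends in an information-density heuristic
- Input normalization (input_normalize): the orchestration layer can optionally normalize the user's spoken-style text via rules/llm before passing it to the model and to memory retrieval
- Memory evidence injection: retrieve memory evidence based on the user's message and inject it into the prompt

Memoir
- Oral normalization (oral_normalize): the original segment text is preserved; the story pipeline takes a derived clean draft as its narrative input
- Segment enqueue batching gate: accumulated character count + maximum wait seconds, to reduce fragmented submissions
- Minor tweaks to fidelity_check / prompts / narrative_agent
- Alembic 0005: clean up cross-chapter story foreign keys

Infra
- Add ffmpeg to the Dockerfile
- Add new dependencies to pyproject.toml and sync uv.lock
- Fill in the new config entries in .env.example / .env.production

Tests
- Add test_background_voice, test_chat_input_normalize, test_experience_regressions
- Extend test_interview_prompts, test_interview_reply_length, test_story_route_oral_invariant

Made-with: Cursor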
2026-03-31 23:55:26 +08:00
parent 42ae2a5e91
commit 69a673e6c6
44 changed files with 2998 additions and 259 deletions
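The segment batching gate in the commit message (accumulated character count plus a maximum wait in seconds) can be illustrated with a minimal sketch. This is not the implementation in this commit: the class name `SegmentBatchGate`, its methods, and the threshold defaults are hypothetical, chosen only to show how a `text_char_count` passed at enqueue time could drive a flush decision.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class SegmentBatchGate:
    """Hypothetical gate: release a batch once enough characters have
    accumulated or the oldest pending segment has waited too long."""

    min_chars: int = 200             # assumed threshold, not from the commit
    max_wait_seconds: float = 30.0   # assumed threshold, not from the commit
    clock: Callable[[], float] = time.monotonic  # injectable for tests
    _segment_ids: List[int] = field(default_factory=list, init=False)
    _char_total: int = field(default=0, init=False)
    _first_seen: Optional[float] = field(default=None, init=False)

    def add(self, segment_id: int, text_char_count: int) -> List[int]:
        """Record a segment; return the batch to submit, or [] to keep waiting."""
        now = self.clock()
        if self._first_seen is None:
            self._first_seen = now
        self._segment_ids.append(segment_id)
        self._char_total += max(0, text_char_count)
        return self._flush_if_ready(now)

    def poll(self) -> List[int]:
        """Check the time-based condition without adding a segment."""
        return self._flush_if_ready(self.clock())

    def _flush_if_ready(self, now: float) -> List[int]:
        if not self._segment_ids or self._first_seen is None:
            return []
        waited = now - self._first_seen
        if self._char_total >= self.min_chars or waited >= self.max_wait_seconds:
            batch = self._segment_ids
            # Reset state so the next segment starts a fresh batch.
            self._segment_ids = []
            self._char_total = 0
            self._first_seen = None
            return batch
        return []
```

In this sketch a caller would invoke `add(segment.id, text_char_count)` on each incoming message and call `poll()` from a periodic task, so that a short final utterance is still submitted once the wait limit passes.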
--- a/api/app/features/conversation/ws/router.py
+++ b/api/app/features/conversation/ws/router.py
@@ -11,6 +11,7 @@ from datetime import datetime, timezone
 from fastapi import WebSocket, WebSocketDisconnect, status
 from starlette.websockets import WebSocketState

+from app.agents.chat.background_voice import infer_background_voice
 from app.agents.chat.prompts_profile import format_user_profile_context
 from app.core.db import AsyncSessionLocal
 from app.core.dependencies import get_asr_provider
@@ -201,6 +202,9 @@ async def websocket_endpoint(
                                conversation_id=conversation_id,
                                memoir_state=state,
                                user_profile_context=user_profile_context,
+                                background_voice=infer_background_voice(
+                                    user.occupation
+                                ),
                            )
                        )
                        ai_msg_id = await ConversationHistoryStore(
@@ -300,7 +304,9 @@ async def websocket_endpoint(
                            await db.commit()
                            await db.refresh(segment)
                            await background_runner.queue_message(
-                                conversation.user_id, segment.id
+                                conversation.user_id,
+                                segment.id,
+                                text_char_count=len(text_message.strip()),
                            )

                            await process_user_message(
@@ -563,7 +569,9 @@ async def websocket_endpoint(
                                await db.commit()
                                await db.refresh(segment)
                                await background_runner.queue_message(
-                                    conversation.user_id, segment.id
+                                    conversation.user_id,
+                                    segment.id,
+                                    text_char_count=len((asr_text or "").strip()),
                                )

                                if asr_text and not asr_text.startswith("转写失败"):