refactor(api,expo): converge multi-agent and session handling, remove memoir compatibility layer, drastically trim backend test suite

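- Align with the "multi-agent convergence" and "memoir stories-first / markdown-first" directions: tighten runtime contracts, remove transitional compatibility paths and dual-track logic, and update the client and docs to match.
- Chat: make ChatOrchestrator the entry point for real-time orchestration; delete the standalone conversation_agent and trim prompts.
- Memoir: delete memory_agent; converge MemoirOrchestrator, classification / story_route, and prompts onto the main prepare_batches + run_story_pipeline_for_category_batch path.
- Move the agents-side processor into the feature layer as background_runner, and remove the duplicate/outdated processor wrappers under features.
- Add history_store, reinforcing the model in which conversation_messages is the DB source of truth and Redis is a cache.
- Adjust models, repo, service, and session_history; trim WS message_types; restructure the pipeline and router.
- Remove legacy paths such as chapter placeholders and whole-chapter regeneration; chapter list and cover logic now require an associated story; tighten cover eligibility and enqueueing.
- Shrink helpers, repo, service, router, reading_segment_materialize, story_pipeline_sync, pdf_service, etc. around canonical markdown / cover_asset_id; delete redundant code such as memoir_images/provider.
- tasks: heavily slim down memoir_tasks, chapter_cover_tasks, etc.; align story_image_tasks etc. with the current image tasks.
- core: minor adjustments to config, logging, redis, and task_tracker.
- auth / user / payment / quota: remove outdated endpoints or logic on the router or service side (e.g. the payment router has fewer lines).
- Sync notes or variables in pyproject.toml, development.sh, .env.example / .env.production, README, etc.
- Alembic 0001_initial_schema: minor tweaks (small changes consistent with the current schema narrative).
- Memoir: align types / mappers / api and the chapter and memoir pages with the backend contract; adjust markdown-renderer.
- Voice: delete voice/player and trim voice-segment-store accordingly.
- api/tests: delete conftest and the vast majority of existing test files (websocket_baseline, conversation, memoir images, PDF, SMS, etc.); this is a deliberate reduction, signaling that the tests are to be rebuilt according to backend-test-system.
- docs: add a plan summary for multi-agent convergence and compatibility-layer removal; update the story-first design, backend-test-system, multi-agent-refactor-plan, implementation summary, etc.

BREAKING CHANGE: The backend's external contracts, the memoir chapter fields, and the behavior of several routes/tasks have changed; a large number of API tests were removed, so any CI that depends on those cases must add tests under the new strategy or adjust its pipeline.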
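The "DB as source of truth, Redis as cache" model mentioned for history_store can be illustrated with a small sketch. Everything below is hypothetical: the `HistoryStore` class, its method names, and the in-memory dictionaries standing in for the `conversation_messages` table and Redis are assumptions for illustration, not the code in this commit.

```python
from dataclasses import dataclass, field


@dataclass
class HistoryStore:
    """Hypothetical sketch: the database is authoritative, the cache is disposable."""

    # Stand-in for the conversation_messages table (assumption: keyed by session id).
    db: dict[str, list[str]] = field(default_factory=dict)
    # Stand-in for Redis; it may be emptied at any time without losing data.
    cache: dict[str, list[str]] = field(default_factory=dict)

    def append(self, session_id: str, message: str) -> None:
        # Write to the source of truth first, then drop the cached copy
        # so the next read rebuilds it from the database.
        self.db.setdefault(session_id, []).append(message)
        self.cache.pop(session_id, None)

    def load(self, session_id: str) -> list[str]:
        # Serve from the cache when possible; on a miss, rebuild from the database.
        if session_id not in self.cache:
            self.cache[session_id] = list(self.db.get(session_id, []))
        return list(self.cache[session_id])
```

Invalidating on write rather than updating the cache in place keeps the cache from ever holding something the database does not, at the cost of one extra database read after each write.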
2026-03-22 16:45:57 +08:00
parent 70070216c4
commit 786ebf8ae6
122 changed files with 2802 additions and 7941 deletions
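The `get_story_batch_plan_prompt` template added in the diff below asks the model for a `units` array whose `segment_ids` cover the batch exactly once and in order, with `target_story_id` drawn from the candidates on append. A caller would plausibly want to check those rules before acting on the response. The following is a minimal sketch under that assumption; `validate_batch_plan` and its parameters are hypothetical and do not appear in this commit, and it assumes the batch's segment ids are unique.

```python
import json


def validate_batch_plan(raw: str, segment_ids: list[str], candidate_ids: set[str]) -> list[dict]:
    """Check a batch-plan JSON string against the rules stated in the prompt (hypothetical helper)."""
    units = json.loads(raw)["units"]

    # Concatenated segment_ids must equal the input list: same ids, same order,
    # which rules out duplicates and omissions when the input ids are unique.
    flattened = [sid for unit in units for sid in unit["segment_ids"]]
    if flattened != segment_ids:
        raise ValueError("segment_ids must cover the batch exactly once, in order")

    for unit in units:
        if not unit["segment_ids"]:
            raise ValueError("each unit needs at least one segment")
        decision = unit["decision"]
        if decision == "append_story":
            # Append is only valid against an id copied from the candidate list.
            if unit.get("target_story_id") not in candidate_ids:
                raise ValueError("append_story requires a candidate target_story_id")
        elif decision == "new_story":
            if not unit.get("new_story_title"):
                raise ValueError("new_story requires new_story_title")
        else:
            raise ValueError(f"unknown decision: {decision!r}")
    return units
```

Comparing the flattened list against the input in a single equality check enforces coverage and ordering together, so no separate duplicate detection is needed.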
--- a/api/app/agents/memoir/prompts.py
+++ b/api/app/agents/memoir/prompts.py
@@ -386,10 +386,11 @@ def get_narrative_json_prompt(
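 1. Extract the core content related to life experiences from the conversation; filter out filler words, small talk, and interactions with the AI
 2. Write in the first person, recasting the content as fluent written narrative; do not quote the conversation verbatim
 3. Output only the rewrite of the new content; do not repeat existing content
-4. Start a new paragraph roughly every 200-300 characters
-5. If connecting context is provided, make sure the new content follows on from it naturally
-6. **Do not use Markdown tables** (no `|` pipe tables)
-7. **Do not use `#` or `##` to write story or chapter titles**; titles are managed by the system
+4. **This batch of input corresponds to one self-contained narrative unit**: stay on a single theme/chain of events, and do not bring in other topics or memories unrelated to the conversation above
+5. Start a new paragraph roughly every 200-300 characters
+6. If connecting context is provided, make sure the new content follows on from it naturally
+7. **Do not use Markdown tables** (no `|` pipe tables)
+8. **Do not use `#` or `##` to write story or chapter titles**; titles are managed by the system

 ## Output format (strict JSON)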
 {{
@@ -417,6 +418,8 @@ def get_story_route_prompt(
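 - append_story: the content clearly continues or supplements the theme and timeline of an existing story, and can be mapped to a specific candidate id
 - new_story: a new topic, a fragment from a new life stage, or content that does not fit any candidate story well enough

+Here a "story" means **a life experience that can be told on its own**: a single theme or one chain of events. Do not assume this batch contains several unrelated stories (multiple segments are handled by other steps in the system).
+
 Current chapter (writing container):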
 - category: {chapter_category}
 - title: {chapter_title}
@@ -441,6 +444,54 @@ def get_story_route_prompt(
 """


+def get_story_batch_plan_prompt(
+    *,
+    chapter_category: str,
+    chapter_title: str,
+    segments_json: str,
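+    candidate_stories_json: str,
+) -> str:
+    """Multiple segments under one chapter category: split them into write units (each unit is new or append). Output is strict JSON."""
+    return f"""You are a memoir editing assistant. Below is a batch of the user's spoken segments under a single chapter category, **in chronological order** (each segment has an id and text).
+
+## Definition of a "story" (must be followed)
+A "story" = **a life experience that can be told on its own**: a single theme or one chain of events that can stand alone as a piece. If the topic changes, the timeline jumps to a different event, or the people/main thread clearly change, treat it as a **new story** (new_story) rather than appending it to the same one.
+
+## Task
+Split the segments in this batch into **several contiguous blocks** (each block contains at least one segment, the order must not be changed, and every segment must belong to exactly one block). For each block, decide:
+- **append_story**: the content clearly continues or supplements the theme and timeline of **an existing candidate story**, and can be mapped to a specific candidate id
+- **new_story**: a new topic, a fragment that does not fit any candidate story well enough, or one that should stand alone as its own piece
+
+Current chapter (writing container):
+- category: {chapter_category}
+- title: {chapter_title}
+
+[Spoken segments in this batch] (JSON array; array order is the order in which they were spoken)
+{segments_json}
+
+[Candidate stories] (for append, only ids from this list may be chosen; ids must be copied verbatim)
+{candidate_stories_json}
+
+## Output JSON (exactly one object, no markdown)
+{{
+  "units": [
+    {{
+      "segment_ids": ["<segment ids in this block, listed in order>"],
+      "decision": "new_story" | "append_story",
+      "target_story_id": "<uuid or null; required for append and must come from the candidates>",
+      "new_story_title": "<short title, 6-20 characters; required for new_story, may be null for append>",
+      "reason": "<one-sentence reason in Chinese, optional>"
+    }}
+  ]
+}}
+
+Rules:
+- Concatenating the `segment_ids` of all `units` must cover every id in this batch **with no duplicates and no omissions**, in the same order as the [Spoken segments in this batch] array
+- If you cannot confidently match a candidate, choose new_story for that block
+- new_story_title should summarize the block's content and must not duplicate a candidate title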
+"""
+
+
 def format_evidence_chunks_for_prompt(evidence: dict) -> str:
    """Format the retrieve_evidence result into short text for use in the narrative prompt."""
    chunks = evidence.get("relevant_chunks") or []