feat(memory,conversation): memory enrichment/evidence packs, timeline idempotency field, and end-to-end conversation segmentation

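Database
- Add migration 0003: foreign key timeline_events.memory_source_id → memory_sources, so timeline writes can be made idempotent per ingest source

Backend - Memory
- Add post-ingest LLM enrichment (summaries/facts/timeline), with a configurable on/off switch and a maximum character count
- Add evidence pack assembly: merges retrieval results from chunks, summaries, facts, timeline, stories, etc.; supports switches such as whether to still include rolling when the query is empty
- Extend repo/retriever/service/router/schemas/summarizer/timeline/extractor, etc.; update the memory-retrieval.md doc

Backend - Conversation WS
- Add PING/PONG; segmented ASR logging and empty-audio handling; clearer error messages for transcription failure and "no assistant reply"
- Persist multi-segment assistant replies with a unified separator, consistent with the segmentation logic

Backend - Agent
- reply_limits: split replies by [SPLIT] and by paragraph, with a guaranteed non-empty fallback, for multi-segment delivery over WS and TTS

Backend - Memoir tasks
- transcript ingest records source_id; task success …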
2026-03-27 16:01:28 +08:00
parent 1374f6e8f5
commit e4bf0710c7
70 changed files with 3404 additions and 557 deletions
--- a/api/app/agents/memoir/fidelity_check_agent.py
+++ b/api/app/agents/memoir/fidelity_check_agent.py
@@ -1,6 +1,7 @@
 """
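 FidelityCheckAgent: compares the user's oral account with the narrative JSON output and decides whether it contains obvious fabrication or overreach.
-On failure, the pipeline falls back to the oral text (see story_pipeline_sync).
+For continuation merges (append), pass `existing_canonical_markdown` so the existing story body also counts as an allowed source.
+On failure, the pipeline falls back (see story_pipeline_sync): "existing + oral" for a continuation, the original oral text for a new story.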
 """

 from __future__ import annotations
@@ -43,6 +44,7 @@ class FidelityCheckAgent:
        oral_text: str,
        narrative_json: str,
        llm: Any,
+        existing_canonical_markdown: str | None = None,
    ) -> bool:
        if not llm or not settings.memoir_fidelity_check_enabled:
            return True
@@ -50,8 +52,32 @@ class FidelityCheckAgent:
        gen = (narrative_json or "").strip()
        if not oral or not gen:
            return True
+        existing = (existing_canonical_markdown or "").strip()
        _log_suspicious_years_not_in_oral(oral, gen)
-        prompt = f"""你是事实核对员。比较下面两段文字。
+        if existing:
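+            prompt = f"""You are a fact checker. This is a **continuation merge**: the model must combine the "existing story body" and "this round's oral account" into a single piece. The draft **may and should** keep facts from the existing body (rewording and merging paragraphs is fine) and weave in the new facts from this round's oral account.
+
+[User's oral account this round] (added in the user's own words this round)
+{oral[:8000]}
+
+[Existing story body] (already persisted; may appear or be reworded in the draft; content found here does **not** count as fabricated this round)
+{existing[:12000]}
+
+[Model-generated JSON narrative]
+{gen[:16000]}
+
+Decide: does the draft contain specific names, places, times, numbers, event details, or dialogue that are **clearly absent from both this round's oral account and the existing story body**, or present information found only in excerpts/archives as the user's first-hand experience?
+If the content can be attributed to a reasonable reorganization of the "existing story" or "this round's oral account", pass=true.
+If there is obvious fabrication or overreach that cannot be attributed, pass=false.
+
+**JSON output**: output exactly one valid JSON object.
+{{"pass": true, "reason": null}}
+or
+{{"pass": false, "reason": "one-sentence explanation"}}
+
+Output JSON only, no other text."""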
+        else:
+            prompt = f"""You are a fact checker. Compare the two texts below.

 [User's oral account] (first-hand content)
 {oral[:8000]}