api/docs/memoir_reliability.md

# Memoir & memory reliability

This document summarizes production-oriented behavior for the memoir narrative pipeline, memory evidence, compaction, and async orchestration.

## Correlation ID (`memoir_correlation_id`)

- Phase 1 (`process_memoir_phase1`) generates a UUID at task start and logs `event=memoir_phase1_* … memoir_correlation_id=`.
- Phase 2 receives it via Celery `kwargs` and resolves it through `effective_correlation_id`: an explicit id wins; otherwise the Celery task id is used.
- The same id is passed into `run_story_pipeline_for_category_batch`, structured logs, and `compaction_extra` when scheduling memory compaction after Phase 2.
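
The resolution rule above can be sketched as follows. This is a minimal illustration, not the project's code: `resolve_correlation_id` is a hypothetical stand-in for `effective_correlation_id` (whose real signature is not shown here), and the final fallback to a fresh UUID is an assumption.

```python
import uuid
from typing import Optional


def new_correlation_id() -> str:
    """Generate a fresh id, as Phase 1 does at task start."""
    return str(uuid.uuid4())


def resolve_correlation_id(
    explicit_id: Optional[str],
    celery_task_id: Optional[str],
) -> str:
    """Hypothetical stand-in for `effective_correlation_id`.

    An explicit id wins; otherwise the Celery task id is used.
    Falling back to a new UUID when both are missing is an assumption.
    """
    if explicit_id:
        return explicit_id
    if celery_task_id:
        return celery_task_id
    return new_correlation_id()
```

With this rule, a Phase 2 task that receives the Phase 1 id in its `kwargs` keeps it, and a task started without one can still be traced by its Celery task id.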

## Feature flags (`app.core.config.Settings`)

| Flag | Default | Purpose |
|------|---------|---------|
| `memoir_fidelity_fail_open_on_parse_error` | `False` | When `True`, fidelity JSON/LLM failures pass the gate even for new stories. Keep `False` unless ops needs a rollback. |
| `memoir_narrative_evidence_overlap_min_chars` | `14` | Minimum character overlap required by the deterministic check between body and evidence plain text. |
| `memoir_title_slots_require_body_or_oral_match` | `True` | Restricts title-generation slot inputs to those that overlap the body or oral text. |
| `memory_compaction_enabled` | `True` | Enables soft-exclusion of near-duplicate chunks; requires a Celery worker plus **Beat** for the periodic `memory_compaction_sweep`. |
| `memoir_recompose_retry_on_lock_contention` | `True` | Chapter recompose retries with backoff when the chapter pipeline lock is held. |
| `memoir_phase2_singleflight_immediate` | `True` | Immediate Phase 2 `send_task` uses a stable `task_id` per user/category to reduce duplicate queue entries. |
| `chapter_pipeline_lock_ttl_seconds` | `360` | Shared lock TTL for Phase 2 and `recompose_chapter`; set it to cover the longest expected runtime of either task. |
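
To make `memoir_narrative_evidence_overlap_min_chars` concrete, here is one way a deterministic overlap check could work. It is a sketch under assumptions: the function names are hypothetical, and measuring overlap as the longest shared contiguous run of characters is a guess at the rule, not the confirmed implementation.

```python
from difflib import SequenceMatcher


def longest_shared_run(body: str, evidence: str) -> int:
    """Length of the longest contiguous substring present in both texts."""
    # autojunk=False keeps the result deterministic for long inputs.
    matcher = SequenceMatcher(None, body, evidence, autojunk=False)
    match = matcher.find_longest_match(0, len(body), 0, len(evidence))
    return match.size


def passes_overlap_gate(body: str, evidence: str, min_chars: int = 14) -> bool:
    """Hypothetical gate: the body must share a run of at least `min_chars`."""
    return longest_shared_run(body, evidence) >= min_chars
```

No LLM is involved, so the same inputs always give the same verdict.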

## Memory compaction → facts

When a chunk is soft-excluded as a near-duplicate loser, `mark_facts_stale_for_excluded_chunk_sync` sets the `MemoryFact` rows linked to it by `source_chunk_id`, and currently in status `confirmed` or `candidate`, to **`stale`**. Default search/browse paths retrieve only `confirmed` facts, so stale facts drop out of downstream retrieval.
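
The status transition can be illustrated in memory. This sketch assumes a simplified `Fact` record and hypothetical function names; the real `mark_facts_stale_for_excluded_chunk_sync` operates on database rows and may differ.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Statuses that are eligible to become stale, per the description above.
STALEABLE_STATUSES = {"confirmed", "candidate"}


@dataclass
class Fact:
    """Simplified stand-in for a `MemoryFact` row."""
    fact_id: int
    source_chunk_id: int
    status: str


def mark_facts_stale(facts: Iterable[Fact], excluded_chunk_id: int) -> int:
    """Mark eligible facts of the excluded chunk as stale; return the count."""
    changed = 0
    for fact in facts:
        if fact.source_chunk_id == excluded_chunk_id and fact.status in STALEABLE_STATUSES:
            fact.status = "stale"
            changed += 1
    return changed


def default_retrieval(facts: Iterable[Fact]) -> List[Fact]:
    """Default search/browse paths see only confirmed facts."""
    return [fact for fact in facts if fact.status == "confirmed"]
```

Facts linked to other chunks, and facts in any other status, are left untouched.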

## Acceptance-oriented metrics (log queries)

Monitor structured log events:

- `event=fidelity_parse_fail_closed` / `fidelity_check_fail`
- `event=memoir_phase2_*` with `memoir_correlation_id`
- `memory_compaction_exclude` / `memory_compaction_facts_staled`
- `event=recompose_chapter status=lock_busy_retry`
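
For a quick tally of these events from raw log lines, a small parser is enough. The sketch assumes logs are plain text with whitespace-separated `key=value` fields; the helper names and sample lines are illustrative, not taken from production.

```python
import re
from collections import Counter
from typing import Dict, Iterable

# Matches whitespace-delimited key=value fields such as event=... status=...
_PAIR = re.compile(r"(\w+)=(\S+)")


def parse_fields(line: str) -> Dict[str, str]:
    """Extract key=value fields from one log line."""
    return dict(_PAIR.findall(line))


def count_events(lines: Iterable[str]) -> Counter:
    """Count lines per (event, status); status is '' when absent."""
    counts: Counter = Counter()
    for line in lines:
        fields = parse_fields(line)
        if "event" in fields:
            counts[(fields["event"], fields.get("status", ""))] += 1
    return counts


# Made-up sample lines for illustration only.
SAMPLE_LINES = [
    "WARNING event=recompose_chapter status=lock_busy_retry",
    "WARNING event=recompose_chapter status=lock_busy_retry",
    "WARNING event=fidelity_parse_fail_closed memoir_correlation_id=abc",
    "INFO unrelated line without fields",
]
```

Calling `count_events(SAMPLE_LINES)` groups the two lock-contention retries under one key, which is the shape needed for a rate or alert threshold.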

## Tests

Targeted regressions live under `api/tests/`:

- `test_fidelity_gate.py`, `test_narrative_boundary_regressions.py`
- `test_memory_consistency_rules.py`, `test_memoir_idempotency.py`
- `test_recompose_retry_policy.py`
- `test_llm_json_call.py`, `test_stage_slot_registry.py`

## LLM JSON (`llm_json_call`) and compat strip

- Standard path: `response_format=json_object` → `json.loads` → Pydantic validate.
- On decode failure only, `extract_json_payload` runs once (fence / brace strip). A hit emits **`event=llm_json_compat_strip_hit`** at WARNING.
- **Step 13 (sunset)**: observe this event in production for ~1–2 weeks; if zero hits, remove the compat branch from `app.core.llm_call` and migrate remaining callers off `extract_json_payload` for JSON-mode paths.