
# Memoir & memory reliability

This document summarizes production-oriented behavior for the memoir narrative pipeline, memory evidence, compaction, and async orchestration.

## Correlation ID (`memoir_correlation_id`)

- Phase 1 (`process_memoir_phase1`) generates a UUID at task start and logs `event=memoir_phase1_* … memoir_correlation_id=`.
- Phase 2 receives it via Celery `kwargs` and resolves the final id with `effective_correlation_id`: an explicit id wins; otherwise the Celery task id is used.
- The same id is passed into `run_story_pipeline_for_category_batch`, structured logs, and `compaction_extra` when scheduling memory compaction after Phase 2.
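
The precedence rule above can be sketched as follows. This is a minimal illustration, not the project's code: the signature of `effective_correlation_id` is assumed, and the final fallback to a fresh UUID when neither id is present is an assumption added so the sketch always returns a value.

```python
import uuid
from typing import Optional


def effective_correlation_id(
    explicit_id: Optional[str],
    celery_task_id: Optional[str],
) -> str:
    """Pick the correlation id: explicit id first, then the Celery task id.

    Hypothetical signature; the real helper may take different arguments.
    """
    if explicit_id:
        return explicit_id
    if celery_task_id:
        return celery_task_id
    # Assumption: generate a new id rather than returning an empty value.
    return str(uuid.uuid4())
```

With this shape, a Phase 2 task that was handed the Phase 1 UUID keeps logging under that UUID, while a Phase 2 task started without one still gets a traceable id.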

## Feature flags (`app.core.config.Settings`)

| Flag | Default | Purpose |
|------|---------|---------|
| `memoir_fidelity_fail_open_on_parse_error` | `False` | When `True`, fidelity JSON/LLM failures pass the gate even for new stories. Intended only as an ops rollback switch. |
| `memoir_narrative_evidence_overlap_min_chars` | `14` | Minimum character overlap required by the deterministic check between the narrative body and the evidence plain text. |
| `memoir_title_slots_require_body_or_oral_match` | `True` | Narrows title-generation slot inputs to body/oral overlap. |
| `memory_fact_search_use_recent_fallback` | `False` | When `False`, fact FTS misses do **not** fall back to “recent confirmed facts” (reduces contradictory/unrelated facts in prompts). |
| `memoir_recompose_retry_on_lock_contention` | `True` | Chapter recompose retries with backoff when the chapter pipeline lock is held. |
| `memoir_phase2_singleflight_immediate` | `True` | Immediate Phase 2 `send_task` uses a stable `task_id` per user/category to reduce duplicate queue entries. |
| `chapter_pipeline_lock_ttl_seconds` | `360` | Shared lock TTL for Phase 2 and `recompose_chapter`; tune it to the longest expected runtime of either task. |
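
Two of the flags above describe behavior that is easy to picture in code: the stable `task_id` behind `memoir_phase2_singleflight_immediate` and the backoff behind `memoir_recompose_retry_on_lock_contention`. The sketch below is illustrative only. The function names, the id format, and the backoff constants are all hypothetical; the source does not specify them.

```python
def phase2_task_id(user_id: str, category: str) -> str:
    """Build a stable task id so repeated sends for one user/category share an id.

    Hypothetical format; the real id scheme is not documented here.
    """
    return f"memoir-phase2:{user_id}:{category}"


def lock_retry_delay(
    attempt: int,
    base_seconds: float = 5.0,   # assumed starting delay
    cap_seconds: float = 360.0,  # assumed cap, mirroring the lock TTL default
) -> float:
    """Exponential backoff delay for retry number `attempt` (0-based)."""
    return min(cap_seconds, base_seconds * (2 ** attempt))
```

A stable id such as this would typically be passed as the `task_id` argument when sending the task, and a delay like the one above as the retry countdown. Capping the delay at the lock TTL is a design guess: waiting longer than the TTL gains nothing, because the lock will have expired by then.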

## Memory compaction → facts

When a chunk is soft-excluded as a near-duplicate loser, `mark_facts_stale_for_excluded_chunk_sync` sets linked `MemoryFact` rows (matched by `source_chunk_id`, with status `confirmed` or `candidate`) to **`stale`**. The default search/browse paths retrieve only `confirmed` facts, so staled facts no longer appear downstream.
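
The staling rule can be modeled in memory as follows. This is a simplified sketch, not the real implementation: the `Fact` dataclass stands in for the `MemoryFact` model, and both function names are hypothetical stand-ins for database-backed code.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Statuses that are eligible to be marked stale, per the rule above.
STALEABLE_STATUSES = {"confirmed", "candidate"}


@dataclass
class Fact:
    """In-memory stand-in for a MemoryFact row (fields assumed)."""
    id: int
    source_chunk_id: int
    status: str


def mark_facts_stale(facts: Iterable[Fact], excluded_chunk_id: int) -> int:
    """Mark eligible facts linked to the excluded chunk as stale; return the count."""
    staled = 0
    for fact in facts:
        if fact.source_chunk_id == excluded_chunk_id and fact.status in STALEABLE_STATUSES:
            fact.status = "stale"
            staled += 1
    return staled


def default_search(facts: Iterable[Fact]) -> List[Fact]:
    """Default retrieval path: only confirmed facts are returned."""
    return [fact for fact in facts if fact.status == "confirmed"]
```

In this model, facts with other statuses on the excluded chunk are left untouched, and facts on other chunks are unaffected.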

## Acceptance-oriented metrics (log queries)

Monitor structured log events:

- `event=fidelity_parse_fail_closed` / `fidelity_check_fail`
- `event=memoir_phase2_*` with `memoir_correlation_id`
- `memory_compaction_exclude` / `memory_compaction_facts_staled`
- `event=recompose_chapter status=lock_busy_retry`
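
To follow one memoir run across these events, log lines can be grouped by `memoir_correlation_id`. The sketch below assumes logs are emitted as whitespace-separated `key=value` pairs, as the event names above suggest; the helper names are hypothetical, and a real deployment would more likely run an equivalent query in its log backend.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Matches key=value pairs where the value contains no whitespace (assumed format).
_PAIR = re.compile(r"(\w+)=(\S+)")


def parse_fields(line: str) -> Dict[str, str]:
    """Extract key=value pairs from one structured log line."""
    return dict(_PAIR.findall(line))


def events_by_correlation_id(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group event names by memoir_correlation_id, preserving log order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        fields = parse_fields(line)
        correlation_id = fields.get("memoir_correlation_id")
        if correlation_id and "event" in fields:
            grouped[correlation_id].append(fields["event"])
    return dict(grouped)
```

Lines without a correlation id, such as a bare `event=recompose_chapter status=lock_busy_retry`, are skipped by this grouping and need a separate count.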

## Tests

Targeted regressions live under `api/tests/`:

- `test_fidelity_gate.py`, `test_narrative_boundary_regressions.py`
- `test_memory_consistency_rules.py`, `test_memoir_idempotency.py`
- `test_recompose_retry_policy.py`