# Memoir & memory reliability

This document summarizes production-oriented behavior for the memoir narrative pipeline, memory evidence, compaction, and async orchestration.
## Correlation ID (`memoir_correlation_id`)

- Phase 1 (`process_memoir_phase1`) generates a UUID at task start and logs `event=memoir_phase1_* … memoir_correlation_id=`.
- Phase 2 receives it via Celery `kwargs` and resolves it with `effective_correlation_id` (an explicit id wins; otherwise the Celery task id is used).
- The same id is passed into `run_story_pipeline_for_category_batch`, structured logs, and `compaction_extra` when scheduling memory compaction after Phase 2.
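The resolution rule above can be sketched as follows. This is a minimal illustration, not the project's code: the function signatures, the use of `uuid4`, and the sample task id are assumptions; only the names `memoir_correlation_id` and `effective_correlation_id` come from this document.

```python
import uuid
from typing import Optional


def new_memoir_correlation_id() -> str:
    """Phase 1: mint a fresh id at task start (uuid4 is an assumption)."""
    return str(uuid.uuid4())


def effective_correlation_id(explicit_id: Optional[str], celery_task_id: str) -> str:
    """Phase 2: an explicit id wins; otherwise fall back to the Celery task id.

    The signature is hypothetical; only the precedence rule is documented.
    """
    if explicit_id:
        return explicit_id
    return celery_task_id


# Phase 1 forwards the id to Phase 2 through the task kwargs.
phase1_id = new_memoir_correlation_id()
phase2_kwargs = {"memoir_correlation_id": phase1_id}

# Explicit id present: it is used as-is.
resolved = effective_correlation_id(
    phase2_kwargs.get("memoir_correlation_id"), celery_task_id="celery-task-123"
)

# No explicit id: the (made-up) Celery task id is used instead.
fallback = effective_correlation_id(None, celery_task_id="celery-task-123")
```

Keeping the fallback inside one helper means every log line and downstream call in Phase 2 sees the same id, whether or not Phase 1 supplied one.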
## Feature flags (`app.core.config.Settings`)

| Flag | Default | Purpose |
|------|---------|---------|
| `memoir_fidelity_fail_open_on_parse_error` | `False` | When `True`, fidelity JSON/LLM failures pass the gate even for new stories; intended only as an ops-driven rollback switch. |
| `memoir_narrative_evidence_overlap_min_chars` | `14` | Minimum character overlap for the deterministic check between body and evidence plain text. |
| `memoir_title_slots_require_body_or_oral_match` | `True` | Narrows title-generation slot inputs to those that overlap the body or oral text. |
| `memory_compaction_enabled` | `True` | Soft-excludes near-duplicate chunks; requires a Celery worker plus **Beat** for the periodic `memory_compaction_sweep`. |
| `memoir_recompose_retry_on_lock_contention` | `True` | Chapter recompose retries with backoff when the chapter pipeline lock is held. |
| `memoir_phase2_singleflight_immediate` | `True` | Immediate Phase 2 `send_task` uses a stable `task_id` per user/category to reduce duplicate queue entries. |
| `chapter_pipeline_lock_ttl_seconds` | `360` | Shared lock TTL for Phase 2 and `recompose_chapter`; tune to the longest expected runtime. |
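As an illustration of what a deterministic overlap check governed by `memoir_narrative_evidence_overlap_min_chars` might look like, the sketch below measures the longest run of characters shared by body and evidence. The document does not describe the actual algorithm, so the longest-common-substring approach, the function names, and the sample strings are all assumptions.

```python
from difflib import SequenceMatcher

# Mirrors the documented default of memoir_narrative_evidence_overlap_min_chars.
MIN_OVERLAP_CHARS = 14


def longest_shared_run(body: str, evidence: str) -> int:
    """Length of the longest contiguous substring common to both texts."""
    # autojunk=False keeps the result deterministic for long inputs.
    matcher = SequenceMatcher(None, body, evidence, autojunk=False)
    match = matcher.find_longest_match(0, len(body), 0, len(evidence))
    return match.size


def has_evidence_overlap(
    body: str, evidence: str, min_chars: int = MIN_OVERLAP_CHARS
) -> bool:
    """True when body and evidence share a run of at least min_chars characters."""
    return longest_shared_run(body, evidence) >= min_chars


# Made-up sample texts for illustration only.
body_text = "She grew up beside the old lighthouse in Maine."
evidence_text = "we lived beside the old lighthouse for years"

supported = has_evidence_overlap(body_text, evidence_text)
unsupported = has_evidence_overlap("abc", "xyz")
```

A character-level threshold like this is cheap and reproducible, but it only detects verbatim reuse; paraphrased evidence would need a different check.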
## Memory compaction → facts

When a chunk is soft-excluded as a near-duplicate loser, `mark_facts_stale_for_excluded_chunk_sync` marks the `MemoryFact` rows linked to it through `source_chunk_id` as **`stale`**, provided their status is `confirmed` or `candidate`. Default search/browse paths retrieve only `confirmed` facts, so staled facts drop out of them.
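The staling rule can be sketched in memory as follows. This is not the real `mark_facts_stale_for_excluded_chunk_sync`, which presumably operates on database rows; the `Fact` dataclass, the helper names, and the `rejected` status used to show an untouched row are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Only facts in these statuses are staled, per the rule described above.
STALEABLE_STATUSES = {"confirmed", "candidate"}


@dataclass
class Fact:
    """Hypothetical stand-in for a MemoryFact row."""
    id: int
    source_chunk_id: int
    status: str


def mark_facts_stale_for_chunk(facts: Iterable[Fact], excluded_chunk_id: int) -> List[int]:
    """Set matching facts to 'stale' and return the ids that changed."""
    staled = []
    for fact in facts:
        if fact.source_chunk_id == excluded_chunk_id and fact.status in STALEABLE_STATUSES:
            fact.status = "stale"
            staled.append(fact.id)
    return staled


def default_retrieval(facts: Iterable[Fact]) -> List[Fact]:
    """Default search/browse paths only see confirmed facts."""
    return [fact for fact in facts if fact.status == "confirmed"]


facts = [
    Fact(1, 10, "confirmed"),
    Fact(2, 10, "candidate"),
    Fact(3, 10, "rejected"),   # hypothetical status, left untouched
    Fact(4, 11, "confirmed"),  # different chunk, left untouched
]
staled_ids = mark_facts_stale_for_chunk(facts, excluded_chunk_id=10)
visible_ids = [fact.id for fact in default_retrieval(facts)]
```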
## Acceptance-oriented metrics (log queries)

Monitor structured log events:

- `event=fidelity_parse_fail_closed` / `fidelity_check_fail`
- `event=memoir_phase2_*` with `memoir_correlation_id`
- `memory_compaction_exclude` / `memory_compaction_facts_staled`
- `event=recompose_chapter status=lock_busy_retry`
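Where no log-query tooling is at hand, the events listed above can be tallied from raw log lines with a small script. The sketch below is an assumption-heavy example: the exact log line layout is not specified in this document, so it matches on the documented event names as plain substrings, and the sample lines are made up.

```python
from collections import Counter
from typing import Iterable

# Substrings taken from the event names listed above.
WATCHED_MARKERS = (
    "fidelity_parse_fail_closed",
    "fidelity_check_fail",
    "memoir_phase2_",
    "memory_compaction_exclude",
    "memory_compaction_facts_staled",
    "status=lock_busy_retry",
)


def count_watched_events(lines: Iterable[str]) -> Counter:
    """Count how many log lines contain each watched marker."""
    counts: Counter = Counter()
    for line in lines:
        for marker in WATCHED_MARKERS:
            if marker in line:
                counts[marker] += 1
    return counts


# Made-up log lines; the real format may differ.
sample_lines = [
    "WARNING event=fidelity_parse_fail_closed story_id=1",
    "INFO event=memoir_phase2_start memoir_correlation_id=abc",
    "INFO event=memoir_phase2_done memoir_correlation_id=abc",
    "INFO event=recompose_chapter status=lock_busy_retry attempt=2",
]
counts = count_watched_events(sample_lines)
```

Markers that never appear simply report zero, because `Counter` returns `0` for missing keys.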
## Tests

Targeted regressions live under `api/tests/`:

- `test_fidelity_gate.py`, `test_narrative_boundary_regressions.py`
- `test_memory_consistency_rules.py`, `test_memoir_idempotency.py`
- `test_recompose_retry_policy.py`
- `test_llm_json_call.py`, `test_stage_slot_registry.py`
## LLM JSON (`llm_json_call`) and compat strip

- Standard path: `response_format=json_object` → `json.loads` → Pydantic validate.
- On decode failure only, `extract_json_payload` runs once (fence / brace strip). A hit emits **`event=llm_json_compat_strip_hit`** at WARNING.
- **Step 13 (sunset)**: observe this event in production for ~1–2 weeks; if zero hits, remove the compat branch from `app.core.llm_call` and migrate remaining callers off `extract_json_payload` for JSON-mode paths.
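The parse-then-fallback flow described above can be sketched with the standard library alone. This is an illustration under stated assumptions: the body of `extract_json_payload` here is a guess at what a fence/brace strip does, `parse_llm_json` and `require_title` are hypothetical names, and a plain callable stands in for the Pydantic validation step.

```python
import json
import logging
import re
from typing import Any, Callable

logger = logging.getLogger("llm_json_sketch")

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable
_FENCE_RE = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE), re.DOTALL
)


def extract_json_payload(raw: str) -> str:
    """Guessed fence/brace strip: prefer a fenced block, then the outermost braces."""
    fenced = _FENCE_RE.search(raw)
    if fenced:
        raw = fenced.group(1)
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found")
    return raw[start : end + 1]


def parse_llm_json(raw: str, validate: Callable[[Any], Any]) -> Any:
    """Standard path first; the compat strip runs once, only on decode failure."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = json.loads(extract_json_payload(raw))
        # Logged only when the strip actually rescued the payload.
        logger.warning("event=llm_json_compat_strip_hit")
    return validate(data)


def require_title(data: Any) -> Any:
    """Stand-in for schema validation (the document names Pydantic)."""
    if not isinstance(data, dict) or "title" not in data:
        raise ValueError("missing title")
    return data


clean = parse_llm_json('{"title": "A"}', require_title)
fenced_raw = "Here you go:\n" + FENCE + 'json\n{"title": "Harbor"}\n' + FENCE
rescued = parse_llm_json(fenced_raw, require_title)
```

Logging the warning only after the second `json.loads` succeeds keeps the hit count meaningful for the sunset decision: a payload that fails both attempts raises instead of counting as a compat hit.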