Memoir & memory reliability
This document summarizes production-oriented behavior for the memoir narrative pipeline, memory evidence, compaction, and async orchestration.
Correlation ID (memoir_correlation_id)
- Phase 1 (`process_memoir_phase1`) generates a UUID at task start and logs `event=memoir_phase1_* … memoir_correlation_id=`.
- Phase 2 receives it via Celery `kwargs` and combines it with `effective_correlation_id` (explicit id wins, else Celery task id).
- The same id is passed into `run_story_pipeline_for_category_batch`, structured logs, and `compaction_extra` when scheduling memory compaction after Phase 2.
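The precedence rule above (explicit id wins, else Celery task id) can be sketched as follows. This is a minimal illustration, not the project's code: `new_memoir_correlation_id` and `resolve_effective_correlation_id` are hypothetical names, and the real `effective_correlation_id` may have a different signature.

```python
import uuid
from typing import Optional


def new_memoir_correlation_id() -> str:
    """Phase 1 side: mint a fresh id at task start (hypothetical helper)."""
    return str(uuid.uuid4())


def resolve_effective_correlation_id(
    explicit_id: Optional[str], celery_task_id: Optional[str]
) -> Optional[str]:
    """Phase 2 side: an explicit id wins; otherwise fall back to the Celery task id.

    Assumption: blank or whitespace-only explicit ids count as missing.
    """
    if explicit_id and explicit_id.strip():
        return explicit_id.strip()
    return celery_task_id or None
```

Under these assumptions, Phase 2 would call the resolver once at task start and reuse the result for logs, the story pipeline, and `compaction_extra`, so every downstream record carries the same id.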
Feature flags (app.core.config.Settings)
| Flag | Default | Purpose |
|---|---|---|
| `memoir_fidelity_fail_open_on_parse_error` | `False` | When `True`, fidelity JSON/LLM failures pass the gate even for new stories (enable only as an ops rollback measure). |
| `memoir_narrative_evidence_overlap_min_chars` | `14` | Deterministic overlap check between body and evidence plain text. |
| `memoir_title_slots_require_body_or_oral_match` | `True` | Narrows title-generation slot inputs to body/oral overlap. |
| `memory_compaction_enabled` | `True` | Near-duplicate chunk soft-exclude; requires Celery worker + Beat for periodic `memory_compaction_sweep`. |
| `memoir_recompose_retry_on_lock_contention` | `True` | Chapter recompose retries with backoff when the chapter pipeline lock is held. |
| `memoir_phase2_singleflight_immediate` | `True` | Immediate Phase 2 `send_task` uses a stable `task_id` per user/category to reduce duplicate queue entries. |
| `chapter_pipeline_lock_ttl_seconds` | `360` | Shared lock TTL for Phase 2 and `recompose_chapter`; tune to the longest expected runtime. |
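To make the `memoir_narrative_evidence_overlap_min_chars` flag concrete, here is one way a deterministic overlap check could work. It is a sketch under stated assumptions: the real check may normalize or compare text differently, and `longest_shared_run` and `has_evidence_overlap` are hypothetical names. It uses the longest common substring from the standard library's `difflib`.

```python
from difflib import SequenceMatcher

MIN_OVERLAP_CHARS = 14  # mirrors the documented default of the flag


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences do not matter."""
    return " ".join(text.lower().split())


def longest_shared_run(body: str, evidence: str) -> int:
    """Length of the longest contiguous substring shared by both texts."""
    a, b = _normalize(body), _normalize(evidence)
    # autojunk=False disables difflib's popularity heuristic for long inputs.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return matcher.find_longest_match(0, len(a), 0, len(b)).size


def has_evidence_overlap(
    body: str, evidence: str, min_chars: int = MIN_OVERLAP_CHARS
) -> bool:
    """True when body and evidence share a run of at least min_chars characters."""
    return longest_shared_run(body, evidence) >= min_chars
```

A substring check like this is deterministic and cheap, which suits a gate that should not depend on an LLM call; the threshold trades false passes (too low) against rejecting paraphrased but grounded text (too high).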
Memory compaction → facts
When a chunk is soft-excluded as a near-duplicate loser, `mark_facts_stale_for_excluded_chunk_sync` sets the linked `MemoryFact` rows (matched by `source_chunk_id`, with status `confirmed` or `candidate`) to `stale`. Downstream fact retrieval returns only `confirmed` facts on the default search/browse paths.
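The stale-marking rule can be illustrated with an in-memory model. This is a sketch only: the `MemoryFact` dataclass below is a stand-in for the real ORM model, and `mark_facts_stale_for_excluded_chunk` and `default_searchable` are hypothetical helpers, not the project's functions.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Statuses that are eligible to be flipped to "stale", per the description above.
STALE_ELIGIBLE = {"confirmed", "candidate"}


@dataclass
class MemoryFact:
    """Stand-in for the real model; only the fields the rule needs."""
    id: int
    source_chunk_id: int
    status: str


def mark_facts_stale_for_excluded_chunk(
    facts: Iterable[MemoryFact], excluded_chunk_id: int
) -> List[int]:
    """Flip eligible facts linked to the excluded chunk to 'stale'; return their ids."""
    changed: List[int] = []
    for fact in facts:
        if fact.source_chunk_id == excluded_chunk_id and fact.status in STALE_ELIGIBLE:
            fact.status = "stale"
            changed.append(fact.id)
    return changed


def default_searchable(facts: Iterable[MemoryFact]) -> List[MemoryFact]:
    """Default search/browse paths only see confirmed facts."""
    return [fact for fact in facts if fact.status == "confirmed"]
```

Because default retrieval filters on `confirmed`, facts derived from a losing duplicate chunk drop out of search results as soon as they are marked stale, without deleting any rows.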
Acceptance-oriented metrics (log queries)
Monitor structured log events:
- `event=fidelity_parse_fail_closed` / `fidelity_check_fail`
- `event=memoir_phase2_*` with `memoir_correlation_id`
- `memory_compaction_exclude` / `memory_compaction_facts_staled`
- `event=recompose_chapter status=lock_busy_retry`
Tests
Targeted regressions live under api/tests/:
- `test_fidelity_gate.py`, `test_narrative_boundary_regressions.py`
- `test_memory_consistency_rules.py`, `test_memoir_idempotency.py`
- `test_recompose_retry_policy.py`
- `test_llm_json_call.py`, `test_stage_slot_registry.py`
LLM JSON (llm_json_call) and compat strip
- Standard path: `response_format=json_object` → `json.loads` → Pydantic validate.
- On decode failure only, `extract_json_payload` runs once (fence / brace strip). A hit emits `event=llm_json_compat_strip_hit` at WARNING.
- Step 13 (sunset): observe this event in production for ~1–2 weeks; if zero hits, remove the compat branch from `app.core.llm_call` and migrate remaining callers off `extract_json_payload` for JSON-mode paths.
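The parse path described above might look roughly like this. It is a standard-library sketch under assumptions: `extract_json_payload_sketch` and `parse_llm_json` are hypothetical stand-ins for the real helpers in `app.core.llm_call`, the exact strip rules are guessed from "fence / brace strip", and the Pydantic step is represented by a plain `validate` callable.

```python
import json
import logging
import re
from typing import Any, Callable

logger = logging.getLogger("llm_json_call_sketch")

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable
_FENCE_RE = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE), re.DOTALL
)


def extract_json_payload_sketch(raw: str) -> str:
    """Strip a Markdown fence if present, then trim to the outermost braces."""
    fenced = _FENCE_RE.search(raw)
    text = fenced.group(1) if fenced else raw
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        return text.strip()
    return text[start : end + 1]


def parse_llm_json(raw: str, validate: Callable[[Any], Any] = lambda data: data) -> Any:
    """Standard path first; the compat strip runs once, only on decode failure."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        # A second decode failure propagates to the caller unchanged.
        data = json.loads(extract_json_payload_sketch(raw))
        logger.warning("event=llm_json_compat_strip_hit")
    return validate(data)
```

Logging only when the strip actually rescues a response keeps the event count meaningful for the sunset decision: zero hits over the observation window suggests the compat branch is dead code.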