
Kevin 71fbd39e32 feat(api)!: memory single chain — async MemoryService, strict eval closure

Route all memory ingest/retrieve/enrichment/compaction through async MemoryService.
Remove legacy sync memory implementations (ingest/retrieve/compaction); Celery and
memoir Phase2 call asyncio.run into MemoryService-backed helpers.

Memoir Phase1 batch ingest uses MemoryService.ingest_transcripts_batch; drop chapters.
evidence_bundle_json mirror (Alembic 0015). Evaluation uses snapshot/link-only bundles;
raise EvidenceClosureMissing instead of partial/fallback lineage tiers.

Split memoir state into NarrativeCoverageState and InterviewControlState; delete the
_interview_meta_store adapter layer. Remove rolling-query and recent-fact fallback
settings from config and evidence assembly.

Update judges, docs, tests, and PlaygroundPage alignment.

Made-with: Cursor

2026-04-30 14:11:50 +08:00


Memoir & memory reliability

This document summarizes production-oriented behavior for the memoir narrative pipeline, memory evidence, compaction, and async orchestration.

Correlation ID (`memoir_correlation_id`)

Phase 1 (process_memoir_phase1) generates a UUID at task start and logs event=memoir_phase1_* … memoir_correlation_id=.
Phase 2 receives it via Celery kwargs and resolves it through effective_correlation_id: an explicit id wins; otherwise the Celery task id is used.
The same id is passed into run_story_pipeline_for_category_batch, structured logs, and compaction_extra when scheduling memory compaction after Phase 2.
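
The precedence rule described above can be sketched as follows. This is a minimal illustration only: `resolve_correlation_id`, its signature, and the sample task id are hypothetical stand-ins, not the project's actual `effective_correlation_id` implementation.

```python
import uuid
from typing import Optional


def resolve_correlation_id(explicit_id: Optional[str], celery_task_id: Optional[str]) -> Optional[str]:
    """Hypothetical helper: the explicit id wins, else fall back to the Celery task id."""
    if explicit_id:
        return explicit_id
    return celery_task_id


# Phase 1: generate the id once at task start.
memoir_correlation_id = str(uuid.uuid4())

# Phase 2: the id arrives via Celery kwargs; the task id is only a fallback.
phase2_kwargs = {"memoir_correlation_id": memoir_correlation_id}
resolved = resolve_correlation_id(
    phase2_kwargs.get("memoir_correlation_id"),
    "celery-task-123",  # made-up task id for illustration
)
```

With this shape, a Phase 2 run triggered without an explicit id still logs a usable identifier, while a run chained from Phase 1 keeps the original UUID across logs and compaction scheduling.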

Feature flags (`app.core.config.Settings`)

| Flag | Default | Purpose |
| --- | --- | --- |
| `memoir_fidelity_fail_open_on_parse_error` | `False` | When `True`, fidelity JSON/LLM failures pass the gate even for new stories (intended only as an ops-driven rollback). |
| `memoir_narrative_evidence_overlap_min_chars` | `14` | Minimum character count for the deterministic overlap check between body and evidence plain text. |
| `memoir_title_slots_require_body_or_oral_match` | `True` | Narrows title-generation slot inputs to those that overlap the body or oral text. |
| `memory_compaction_enabled` | `True` | Soft-excludes near-duplicate chunks; requires a Celery worker plus Beat for the periodic `memory_compaction_sweep`. |
| `memoir_recompose_retry_on_lock_contention` | `True` | Chapter recompose retries with backoff when the chapter pipeline lock is held. |
| `memoir_phase2_singleflight_immediate` | `True` | Immediate Phase 2 `send_task` uses a stable `task_id` per user/category to reduce duplicate queue entries. |
| `chapter_pipeline_lock_ttl_seconds` | `360` | Shared lock TTL for Phase 2 and `recompose_chapter`; tune it to the longest expected runtime. |
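
To illustrate what "retries with backoff" on lock contention can look like, here is a minimal sketch. The `LockBusy` exception, `run_with_lock_retry`, the attempt count, and the exponential delay schedule are all assumptions made for the example; the actual retry policy in the codebase may differ.

```python
import time
from typing import Callable


class LockBusy(Exception):
    """Hypothetical error raised when the chapter pipeline lock is held."""


def run_with_lock_retry(
    task: Callable[[], str],
    max_attempts: int = 4,
    base_delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Run task, retrying on LockBusy with exponential backoff (assumed schedule)."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        try:
            return task()
        except LockBusy:
            attempt += 1
            if attempt >= max_attempts:
                raise  # give up after the final attempt
            sleep(base_delay_seconds * 2 ** (attempt - 1))


# Demo: the lock is busy twice, then the recompose succeeds.
delays = []
calls = {"n": 0}


def flaky_recompose() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise LockBusy()
    return "recomposed"


# Record delays instead of sleeping so the demo runs instantly.
result = run_with_lock_retry(flaky_recompose, sleep=delays.append)
```

Whatever schedule is used, the total retry window should stay compatible with `chapter_pipeline_lock_ttl_seconds`, since a holder that crashes only releases the lock when the TTL expires.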

Memory compaction → facts

When a chunk is soft-excluded as the losing side of a near-duplicate pair, mark_facts_stale_for_excluded_chunk_sync sets the linked MemoryFact rows (matched by source_chunk_id, with status confirmed or candidate) to stale. Downstream fact retrieval returns only confirmed facts on the default search/browse paths.
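
The status transition can be sketched with in-memory objects. This is an illustration under stated assumptions: the `Fact` dataclass, `mark_facts_stale`, and `default_search` are hypothetical and stand in for the real ORM model and queries; the `rejected` status is invented purely to show a row that is left untouched.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fact:
    """Hypothetical stand-in for a MemoryFact row."""
    fact_id: int
    source_chunk_id: int
    status: str


def mark_facts_stale(facts: List[Fact], excluded_chunk_id: int) -> int:
    """Set confirmed/candidate facts linked to the excluded chunk to stale."""
    changed = 0
    for fact in facts:
        if fact.source_chunk_id == excluded_chunk_id and fact.status in ("confirmed", "candidate"):
            fact.status = "stale"
            changed += 1
    return changed


def default_search(facts: List[Fact]) -> List[Fact]:
    """Default retrieval path: confirmed facts only."""
    return [fact for fact in facts if fact.status == "confirmed"]


facts = [
    Fact(1, 10, "confirmed"),
    Fact(2, 10, "candidate"),
    Fact(3, 11, "confirmed"),
    Fact(4, 10, "rejected"),  # made-up status; not touched by the update
]
staled = mark_facts_stale(facts, excluded_chunk_id=10)
visible_ids = [fact.fact_id for fact in default_search(facts)]
```

After chunk 10 is excluded, only the fact from chunk 11 remains visible on the default path.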

Acceptance-oriented metrics (log queries)

Monitor structured log events:

event=fidelity_parse_fail_closed / fidelity_check_fail
event=memoir_phase2_* with memoir_correlation_id
memory_compaction_exclude / memory_compaction_facts_staled
event=recompose_chapter status=lock_busy_retry
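
As a rough aid for ad hoc checks, the events above can be tallied from raw log lines. The sketch assumes logs are plain `key=value` text with an `event=` field; the sample lines, including the `memoir_phase2_example` suffix, are made up for illustration and do not reflect real event names.

```python
import re
from collections import Counter
from typing import Iterable

EVENT_RE = re.compile(r"\bevent=(\S+)")


def count_events(lines: Iterable[str]) -> Counter:
    """Count occurrences of each event=<name> value; lines without one are skipped."""
    counts: Counter = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up sample lines in the assumed key=value format.
sample_lines = [
    "event=fidelity_check_fail",
    "event=memoir_phase2_example memoir_correlation_id=abc",
    "event=recompose_chapter status=lock_busy_retry",
    "event=fidelity_check_fail",
    "line without an event field",
]
counts = count_events(sample_lines)

# Wildcard-style total for every memoir_phase2_* event.
phase2_total = sum(n for name, n in counts.items() if name.startswith("memoir_phase2_"))
```

In production the same grouping would normally be expressed in the log platform's own query language rather than in a script.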

Tests

Targeted regressions live under api/tests/:

test_fidelity_gate.py, test_narrative_boundary_regressions.py
test_memory_consistency_rules.py, test_memoir_idempotency.py
test_recompose_retry_policy.py
test_llm_json_call.py, test_stage_slot_registry.py

LLM JSON (`llm_json_call`) and compat strip

Standard path: response_format=json_object → json.loads → Pydantic validate.
On decode failure only, extract_json_payload runs once (fence / brace strip). A hit emits event=llm_json_compat_strip_hit at WARNING.
Step 13 (sunset): observe this event in production for ~1–2 weeks; if zero hits, remove the compat branch from app.core.llm_call and migrate remaining callers off extract_json_payload for JSON-mode paths.
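
The parse-then-compat flow described above can be sketched with the standard library alone. Everything here is an assumption for illustration: `strip_to_json` is a hypothetical stand-in for `extract_json_payload`, `parse_llm_json` is not the real `llm_json_call`, and a plain callable replaces Pydantic validation to keep the example dependency-free.

```python
import json
import logging
import re
from typing import Any, Callable

logger = logging.getLogger("llm_call_sketch")

FENCE = "`" * 3  # Markdown code fence, built at runtime
_FENCE_RE = re.compile(re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE), re.DOTALL)


def strip_to_json(text: str) -> str:
    """Hypothetical compat strip: drop a Markdown fence, then trim to the outer braces."""
    match = _FENCE_RE.search(text)
    if match:
        text = match.group(1)
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        return text[start : end + 1]
    return text.strip()


def parse_llm_json(raw: str, validate: Callable[[Any], Any]) -> Any:
    """Standard path first; on decode failure only, run the compat strip once."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = json.loads(strip_to_json(raw))  # raises again if still invalid
        logger.warning("event=llm_json_compat_strip_hit")
    return validate(data)


def require_title(data: Any) -> Any:
    """Toy validator standing in for a Pydantic model."""
    if not isinstance(data, dict) or "title" not in data:
        raise ValueError("missing title")
    return data


clean = parse_llm_json('{"title": "Childhood"}', require_title)
fenced = parse_llm_json(FENCE + 'json\n{"title": "Childhood"}\n' + FENCE, require_title)
```

Because the warning is emitted only when the strip actually rescues a response, a sustained count of zero for that event is a reasonable signal that the compat branch is no longer exercised.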
