feat(eval): memoir A/B chapter judging and eval-web parity with dialogue

- Judge baseline excerpt and library chapter separately; build_memoir_compare_summary for gate, nine-dim and leaf deltas.
- Memoir SSE chapter payload: baseline_judge, compare_summary, baseline_judge_error.
- MemoirJudgeOutput: loose score coercion and post-validate clamp; memoir judge prompt caps from settings.
- app-eval-web: two-column MemoirScoreCard layout, MemoirCompareSummary, chapter blocks and CSS.
- Add memoir_compare_summary, log_events, celery_log_context, memoir_pipeline_progress; tests and migration 0014.
- Misc: memory/evidence and enrichment paths, task/orchestrator updates, internal-eval docs, env examples.
2026-04-10 10:23:43 +08:00
parent b0251e5b26
commit ac49bc7f23
59 changed files with 4773 additions and 696 deletions
--- a/api/docs/internal-eval.md
+++ b/api/docs/internal-eval.md
@@ -49,7 +49,7 @@ uv run uvicorn app.internal_main:internal_app --host 0.0.0.0 --port 8001
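 The Celery worker is shared with the main site (`celery_app` already `include`s the memoir tasks and others; it **no longer** includes the retired `evaluation_tasks` experimental batch runs). Start a worker when you need Phase1 / narrative advancement: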

 ```bash
-uv run celery -A app.tasks.celery_app worker -l info
+uv run celery -A app.tasks.celery_app worker -l info -Q celery,memory_idle
 ```

 ## Frontend (`app-eval-web`)