feat(evaluation): memoir readiness, judge/replay updates, eval web playground

Add memoir_readiness_service and router tests; extend judge schemas/services, replay_service, and conversation rubric; align story route agent, payload, prompts, and story_pipeline_sync; update agent logging, config, and DI. Document internal-eval; add replayDraft util and PlaygroundPage changes in app-eval-web.
Kevin
2026-04-08 09:38:07 +08:00
parent 99543d04c6
commit 6772e1269c
26 changed files with 1255 additions and 124 deletions


@@ -67,7 +67,7 @@ major_strengths, major_issues, insufficient_evidence, evidence_refs, confidence,
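`confidence`: a decimal between 0 and 1 indicating your overall confidence in this scoring (higher when the evidence is sufficient).
-`total_score` must equal the sum of the 15 sub-items above (maximum 100).
+`total_score` must equal the sum of the 15 sub-items above (maximum 100). **Before outputting, add up the 15 items one by one to verify**; do not default to 100 unless every item is at full marks (for example, if the four emotion items are 9+8+6+6 and all other blocks are at full marks, the total is 99, not 100).
The aggregate scores emotion_score, information_score, persona_score, structure_score, and question_score may be omitted (the server recomputes them).
Output JSON only.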
"""