Kevin
5972b0e721
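feat(evaluation): 100-point rubric for finished drafts, evidence-based review, and evaluation console adjustments
- Tighten the memoir sub-item caps so they sum to 100 points; remove the 110-point conversion and raw_dimension_total
- judge_memoir now concatenates the original interview with an optional exported baseline; when no evidence is available, the prompt asks for conservative scoring on authenticity-related items
- Automated evaluation runs and manual chapter/story reviews now both carry transcript evidence (aggregated by session/user, with truncation)
- Interview scoring remains the emotion-enhanced version: 15 sub-items, 100 points total
- Change the evaluation console's default baseline to the zuckxu exported MD; remove the per-turn user-utterance alignment table and its related logic
- Add unit tests for the judge schema and memoir prompt assembly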
2026-04-07 10:36:22 +08:00