feat: make memoir evidence lineage and internal evaluation traceable; also align the local evaluation console with CI

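Database and models: add several migrations (chapter evidence snapshots, conversation lineage, memory fact/timeline lineage, etc.) so that the provenance linking finished drafts to conversations/memories is persisted in the table schema.
Business flow: sessions and WS, the memoir/story pipeline, and memory writes and enrichment now carry lineage links and snapshots; add modules such as chapter evidence snapshots and the evaluation-side EvalTraceService to make it easier to assemble evidence packages for review.
Internal evaluation: automated runs and manual memoir reviews share the same traceable evidence; rubric/judge scripts and docs are adjusted accordingly.
app-eval-web: the Memoir/experiment detail views can expand to show the evidence summary and evidence_trace (including conversation turn ids); the API port used by the Vite proxy and injected by development.sh now matches the current default internal evaluation port, so the page no longer connects to the wrong service after a port change.
Engineering misc: GitHub Actions / repository docs updated; adapters and payment/quota/plan code receive minor changes or follow-up cleanup for the main change; added/expanded ?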
Kevin
2026-04-08 15:37:09 +08:00
parent 6772e1269c
commit 309a051038
109 changed files with 4125 additions and 858 deletions


@@ -6,6 +6,7 @@ import uuid
 from typing import Any
 from sqlalchemy import select
+from sqlalchemy.dialects.postgresql import insert as pg_insert
 from sqlalchemy.ext.asyncio import AsyncSession
 from app.features.evaluation.models import (
@@ -270,8 +271,10 @@ async def add_turn(
     judge_scores_json: dict[str, Any] | None,
     judge_rationale: str | None,
 ) -> EvalRunTurn:
-    row = EvalRunTurn(
-        id=_id(),
+    """Insert or update the turn for the same (run_id, turn_index), avoiding UniqueViolation on Celery retries."""
+    tid = _id()
+    ins = pg_insert(EvalRunTurn).values(
+        id=tid,
         run_id=run_id,
         turn_index=turn_index,
         user_utterance=user_utterance,
@@ -280,8 +283,27 @@ async def add_turn(
         judge_scores_json=judge_scores_json,
         judge_rationale=judge_rationale,
     )
-    db.add(row)
+    stmt = ins.on_conflict_do_update(
+        constraint="uq_eval_run_turn_index",
+        set_={
+            "user_utterance": ins.excluded.user_utterance,
+            "assistant_reply": ins.excluded.assistant_reply,
+            "duration_ms": ins.excluded.duration_ms,
+            "judge_scores_json": ins.excluded.judge_scores_json,
+            "judge_rationale": ins.excluded.judge_rationale,
+        },
+    )
+    await db.execute(stmt)
     await db.flush()
+    res = await db.execute(
+        select(EvalRunTurn)
+        .where(
+            EvalRunTurn.run_id == run_id,
+            EvalRunTurn.turn_index == turn_index,
+        )
+        .limit(1)
+    )
+    row = res.scalar_one()
     return row
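
The idempotent upsert in add_turn can be tried in isolation with a minimal sketch. It swaps in the standard-library sqlite3 module for PostgreSQL and SQLAlchemy, so the table layout, the reduced column set, and the helper make_db are assumptions made for illustration and are not the project's code. It shows that repeating a write for the same (run_id, turn_index), as a retried task would, updates the existing row instead of raising a uniqueness error.

```python
import sqlite3
import uuid


def make_db() -> sqlite3.Connection:
    """Create an in-memory table with a unique (run_id, turn_index) pair.

    Hypothetical schema: only a subset of the columns seen in the diff.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE eval_run_turn (
            id TEXT PRIMARY KEY,
            run_id TEXT NOT NULL,
            turn_index INTEGER NOT NULL,
            user_utterance TEXT,
            assistant_reply TEXT,
            CONSTRAINT uq_eval_run_turn_index UNIQUE (run_id, turn_index)
        )
        """
    )
    return conn


def add_turn(conn, run_id, turn_index, user_utterance, assistant_reply):
    """Insert the turn, or overwrite its payload if the pair already exists."""
    conn.execute(
        """
        INSERT INTO eval_run_turn
            (id, run_id, turn_index, user_utterance, assistant_reply)
        VALUES (?, ?, ?, ?, ?)
        ON CONFLICT (run_id, turn_index) DO UPDATE SET
            user_utterance = excluded.user_utterance,
            assistant_reply = excluded.assistant_reply
        """,
        (uuid.uuid4().hex, run_id, turn_index, user_utterance, assistant_reply),
    )
    # Re-read the row so the caller sees the persisted id, which is the
    # original id when the statement took the update path.
    return conn.execute(
        "SELECT id, run_id, turn_index, user_utterance, assistant_reply "
        "FROM eval_run_turn WHERE run_id = ? AND turn_index = ?",
        (run_id, turn_index),
    ).fetchone()


if __name__ == "__main__":
    conn = make_db()
    first = add_turn(conn, "run-1", 0, "hello", "draft reply")
    retry = add_turn(conn, "run-1", 0, "hello", "final reply")
    print(first[0] == retry[0], retry[4])
```

Because the update path leaves id untouched, the freshly generated id is discarded on a retry. That is presumably why the diff re-selects the row by (run_id, turn_index) rather than constructing the return value from tid.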