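- Adopt hash routing and a session-style shell (Playground / Datasets / Experiments / Versions / Memoir)
- Extract api, types, hooks (polling, notifications, experiment SSE), and NoticeContext
- Playground: two-column baseline/actual generation view, replay, streaming auto-scoring, and ScoreCard
- Datasets: regression set and case lists, Markdown/JSON import, session snapshots
- Experiments: create experiments, submit runs, SSE progress, DiffTable, and gate display
- Styles and accessibility: DM Sans + JetBrains Mono, responsive sidebar, ? keyboard-shortcut help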
- Extend evaluation API: schemas, router, repo, admin and execution services
- Improve user export markdown importer; add fixtures and importer tests
- Session catalog repo/service updates; internal app wiring and docs
- Add internal-eval.sh helper; refresh app-eval-web (App, styles, Vite)