yangshilin
17b9fa3466
fix:
...
1. 修复登录界面文字被遮挡问题
2. 大字模式关闭后显示异常问题
3. 重新调整大字模式是否开启时的字体显示效果
2026-04-10 20:35:57 +08:00
Kevin
204ae24697
Merge branch 'eval/elapsed-time-memoir-batch-chunk' into development
2026-04-10 10:27:41 +08:00
Kevin
ac49bc7f23
feat(eval): memoir A/B chapter judging and eval-web parity with dialogue
...
- Judge baseline excerpt and library chapter separately; build_memoir_compare_summary for gate, nine-dim and leaf deltas.
- Memoir SSE chapter payload: baseline_judge, compare_summary, baseline_judge_error.
- MemoirJudgeOutput: loose score coercion and post-validate clamp; memoir judge prompt caps from settings.
- app-eval-web: two-column MemoirScoreCard layout, MemoirCompareSummary, chapter blocks and CSS.
- Add memoir_compare_summary, log_events, celery_log_context, memoir_pipeline_progress; tests and migration 0014.
- Misc: memory/evidence and enrichment paths, task/orchestrator updates, internal-eval docs, env examples.
2026-04-10 10:25:15 +08:00
yangshilin
e1341c6d18
feat:
...
1. 建立问题库大纲,对应每个人生阶段槽位
2. 鼓励使用更生活化的交流语言共情与总结
3. 降低评审模型可能发生截断的概率
4. 成稿质量维度强化情感表达和上下文连贯性
2026-04-09 15:32:35 +08:00
Kevin
6772e1269c
feat(evaluation): memoir readiness, judge/replay updates, eval web playground
...
Add memoir_readiness_service and router tests; extend judge schemas/services, replay_service, and conversation rubric; align story route agent, payload, prompts, and story_pipeline_sync; update agent logging, config, and DI. Document internal-eval; add replayDraft util and PlaygroundPage changes in app-eval-web.
2026-04-08 09:43:34 +08:00
Kevin
99543d04c6
feat(eval): internal-eval stack, judge fixes, and eval web overhaul
...
- Merge internal-eval into development.sh (single Celery/infra); internal-eval.sh
wraps with LIFE_ECHO_WITH_INTERNAL_EVAL; EVAL_ATTACH_ONLY for attaching 8001
when :8000 is already up; document in api/docs/internal-eval.md.
- Evaluation: transcript_for_judge, judge error surfacing, rubric/schema tweaks,
execution_service and router updates; tests for judge and composite eval.
- Memory: ingest nested transaction for embedding/enrichment rollback safety.
- Conversation WS: logger.exception for pipeline errors (avoid loguru KeyError).
- app-eval-web: Playground saved replays, dialogue turns helper, hash user_id
for Memoir; Memoir chapter baseline↔DB row compare with title heuristics;
Stories page (#memoir-stories); Markdown + copy buttons; toolbar/panel UI;
react-markdown; development proxy and fixture updates.
2026-04-07 17:18:47 +08:00
Kevin
5972b0e721
feat(evaluation): 成稿 100 分 rubric、证据评审与评测台调整
...
- 回忆录细项上限收紧为合计 100 分,去掉 110 折算与 raw_dimension_total
- judge_memoir 拼接原始访谈与可选导出基线;无证据时提示保守打真实性相关分
- 自动评测 run 与手动章节/故事评审统一带 transcript 证据(会话/用户聚合、截断)
- 访谈打分仍为情绪强化版 15 细项、总分 100
- 评测台默认基准改为 zuckxu 导出 MD;移除逐轮用户句对齐表及相关逻辑
- 新增 judge schema 与 memoir prompt 组装的单元测试
2026-04-07 10:36:22 +08:00
Kevin
2fded6fbd9
refactor(chat): AI-native prompts, remove interview heuristics
...
- Drop interview_reply_length and utterance_substance; always run stage LLM
and memory retrieval when enabled; trim Settings fields and .env.example.
- Replace guided/opening prompts with compact fact blocks plus unified
behavior guidance; slim background_voice and persona to tone hints.
- InterviewAgent uses fixed chat_interview max_tokens/chars/segments.
Also includes stacked work: profile followup/extract path, evaluation rubric
and judge schema updates, transcript SPLIT handling in execution service,
user export markdown split tests, and golden case fixture.
2026-04-06 22:23:46 +08:00
Kevin
b75edacb5f
feat/ 导出开发容器内的数据用于评估
2026-04-03 14:44:46 +08:00