life-echo/api/app/features/conversation/input_normalize.py
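Kevin 07c6478742 feat(api): lightweight gating on the interview path, Memoir Phase1 batching, and hardening of the narrative/memory pipelines
- Add utterance_substance: short, acknowledgment, and meta utterances can skip memory retrieval, the stage LLM, and the profile-extraction LLM; configurable
- Input normalization: LLM mode defaults to voice/ASR input only; config options added to .env.example
- Memoir Phase1: optional batch LLM performs extraction and classification in one pass (falls back to per-segment processing on failure); when Extraction slots are empty, the stage is aligned with current_stage; prompt constraints tightened
- Narrative and fidelity: narrative_safety, evidence overlap / occasion anchors, title slots and résumé phrases grounded; fail-open on fidelity parse failure is configurable
- Chapter pipeline: lock TTL raised, Celery retry on lock contention, Phase2 immediate singleflight, etc.; story_pipeline_sync / chapter_compose / memoir_tasks updated together
- Memory: minor fixes to compaction / repo / summarizer / evidence; whether to fall back to the most recent facts on a fact FTS miss is configurable
- Add memoir_pipeline_trace; add memoir_reliability documentation and several regression/gating tests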
2026-04-03 10:12:59 +08:00

65 lines
2.0 KiB
Python
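"""
Chat input normalization: lets the interview Agent / orchestration layer apply controlled preprocessing to ASR and keyboard input (rules / optional LLM).
Does not change the original text persisted for a segment; it only produces a derived clean draft for the model side.
Shares one set of deterministic rules with memoir, so chat and memoir do not interpret the same sentence differently.
"""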
from __future__ import annotations

from typing import Any

from app.core.config import settings
from app.core.text_normalize import apply_oral_rules, llm_normalize_text
from app.core.logging import get_logger

logger = get_logger(__name__)

# Chat reuses the same deterministic oral-text rules as memoir.
apply_conversation_input_rules = apply_oral_rules


def _llm_normalize_chat_input(text: str, llm: Any) -> str | None:
    """Fix only obvious typos and homophone errors without adding facts; return None on failure."""
    return llm_normalize_text(
        text,
        llm,
        max_input_chars=int(settings.chat_input_normalize_llm_max_input_chars),
        max_tokens=int(settings.chat_input_normalize_llm_max_tokens),
        agent_name="chat_input_normalize.llm",
    )


def normalize_chat_input_for_agent(
    text: str,
    *,
    llm: Any | None = None,
    is_from_voice: bool = False,
) -> str:
    """
    Single exit point on the chat side, shared by the orchestration layer and InterviewAgent.

    - Globally disabled: original text
    - off: original text
    - rules: rules only
    - llm: rules first, then optional LLM; with no llm or on failure, keep the rule result
    - chat_input_normalize_llm_voice_only: when mode=llm, call the LLM only if is_from_voice is true
    """
    if not settings.chat_input_normalize_enabled:
        return text or ""
    mode = (settings.chat_input_normalize_mode or "rules").strip().lower()
    if mode == "off":
        return text or ""
    # The deterministic rules run in every mode except "off".
    base = apply_conversation_input_rules(text or "")
    if mode != "llm":
        return base
    effective_llm = llm
    # With voice-only enabled, typed input never reaches the LLM.
    if settings.chat_input_normalize_llm_voice_only and not is_from_voice:
        effective_llm = None
    refined = _llm_normalize_chat_input(base, effective_llm)
    if refined is not None:
        return refined
    return base
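
The mode dispatch described in the docstring of normalize_chat_input_for_agent can be tried in isolation with the sketch below. It is a minimal stand-in, not the project's implementation: `SketchSettings`, `sketch_rules`, `sketch_normalize`, and `fake_llm` are hypothetical names, the rules step only collapses whitespace as a placeholder for `apply_oral_rules`, and the LLM is modelled as a plain callable that returns a string or None. The explicit `llm is None` guard is also an assumption about how a missing LLM should behave, taken from the docstring rather than from `llm_normalize_text`.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SketchSettings:
    """Hypothetical stand-in for the chat_input_normalize_* settings."""
    enabled: bool = True
    mode: str = "rules"  # "off" | "rules" | "llm"
    llm_voice_only: bool = True


def sketch_rules(text: str) -> str:
    """Placeholder for the deterministic rules: only collapses whitespace."""
    return " ".join(text.split())


def sketch_normalize(
    text: Optional[str],
    cfg: SketchSettings,
    *,
    llm: Optional[Callable[[str], Optional[str]]] = None,
    is_from_voice: bool = False,
) -> str:
    raw = text or ""
    if not cfg.enabled:
        return raw  # globally disabled: original text
    mode = (cfg.mode or "rules").strip().lower()
    if mode == "off":
        return raw  # off: original text
    base = sketch_rules(raw)
    if mode != "llm":
        return base  # rules: rules only
    # Assumption: no LLM, or typed input under voice-only, keeps the rule result.
    if llm is None or (cfg.llm_voice_only and not is_from_voice):
        return base
    try:
        refined = llm(base)
    except Exception:
        refined = None  # an LLM failure falls back to the rule result
    return refined if refined is not None else base


def fake_llm(text: str) -> Optional[str]:
    """Hypothetical LLM that fixes a single typo."""
    return text.replace("teh", "the")


cfg = SketchSettings(mode="llm")
print(sketch_normalize("  teh   cat ", cfg, llm=fake_llm, is_from_voice=True))   # the cat
print(sketch_normalize("  teh   cat ", cfg, llm=fake_llm, is_from_voice=False))  # teh cat
```

With voice-only enabled, the typed input in the second call gets the rule result only, which matches the last bullet of the docstring.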