- Add users.language_preference (Alembic 0018, default zh); capture at signup/SMS only; expose on auth and profile APIs - Lite English prompts for chat and memoir; localized stage labels and agent names (Life Echo / 岁月知己) - Tencent TTS: language-aware synthesis, ModelType=1 for 501004, English chunking - WebSocket pipeline: emit all AGENT_RESPONSE segments when TTS cancels; INFO logs for tts_this_turn and TTS decisions; on-demand TTS logging - Expo: device language on auth, i18n tiers/agent name, [SPLIT] streaming UX fixes - Tests for migration, prompts, pipeline, router tts_this_turn, reply segments Co-authored-by: Cursor <cursoragent@cursor.com>
285 lines
18 KiB
Python
285 lines
18 KiB
Python
"""
|
||
风格配置(Option B):Chat 与 Memoir 拥有**各自独立**的 StyleProfile。
|
||
|
||
历史上 chat prompt 的「成稿质量导向」段落与 memoir prompt 的「传记作家文体」段落共享同一
|
||
批形容词与维度,导致一处微调就会牵出另一处的隐性副作用——典型的**风格闭环过拟合**。
|
||
|
||
本模块的目标:
|
||
|
||
- `MemoirQualityHints`:memoir 评测维度(真实性/信息质量/叙事结构/语言与文笔等)数据;
|
||
**单一事实源**,chat 与 memoir 都只从此读取权重,不各自散写。
|
||
- `ChatStyleProfile`:chat 侧的语气、沉浸度、风格参考;引用 `MemoirQualityHints` 作为
|
||
「成稿目标的镜像」,而不是自己再写一遍;调整 chat 语气不会意外动到成稿。
|
||
- `MemoirStyleProfile`:memoir 侧的书面语文体与结构节奏;独立于 chat,chat 的语气改动
|
||
不会污染成稿。
|
||
|
||
迁移策略:当前版本把既有文本平移到两个 profile 中,保证对外生成的最终 prompt 字符串与
|
||
过渡前一致(有测试覆盖);之后任何一侧的风格演化都只改对应 profile。
|
||
"""
|
||
|
||
from __future__ import annotations
|
||
|
||
from dataclasses import dataclass, field
|
||
from typing import List, Tuple
|
||
|
||
# =============================================================================
|
||
# 共享:Memoir 评测维度单一事实源
|
||
# =============================================================================
|
||
|
||
|
||
@dataclass(frozen=True)
|
||
class MemoirQualityDimension:
|
||
key: str
|
||
label: str
|
||
weight: int
|
||
description: str
|
||
|
||
|
||
_DEFAULT_MEMOIR_QUALITY_DIMENSIONS: Tuple[MemoirQualityDimension, ...] = (
|
||
MemoirQualityDimension(
|
||
key="truth_and_coverage",
|
||
label="真实性与覆盖",
|
||
weight=23,
|
||
description="不诱导编造;五阶段叙述槽尽量收齐关键切片,缺角时用大纲**轻推**;已述事实当铁底。",
|
||
),
|
||
MemoirQualityDimension(
|
||
key="information_quality",
|
||
label="信息质量",
|
||
weight=14,
|
||
description="要可核对、有锚点的细节,忌水问、忌空泛「还有吗」;促使用户**落具体人事物**。",
|
||
),
|
||
MemoirQualityDimension(
|
||
key="narrative_structure",
|
||
label="叙事结构",
|
||
weight=14,
|
||
description="帮**场景—过程—感受**成链;必要时轻轻带时间、转折,让一段话像**一小节故事**而非点状清单。",
|
||
),
|
||
MemoirQualityDimension(
|
||
key="language_and_prose",
|
||
label="语言与文笔",
|
||
weight=18,
|
||
description="口语里也要有画面与具体词,为成稿**预埋好记的意象**;不在聊天里写书,但忌机关腔。",
|
||
),
|
||
MemoirQualityDimension(
|
||
key="emotional_expression",
|
||
label="情感表达",
|
||
weight=9,
|
||
description="情绪接**真**不接戏;留白与并肩胜过廉价金句;让用户感到**被接住、敢说下去**,勿因追求简洁显得疏离。",
|
||
),
|
||
MemoirQualityDimension(
|
||
key="character_modeling",
|
||
label="人物建模",
|
||
weight=9,
|
||
description="关系里谁在乎谁、怕谁、像谁、和谁拧着——从选择与反应里**多留一道口子**给成稿。",
|
||
),
|
||
MemoirQualityDimension(
|
||
key="coherence",
|
||
label="连贯性",
|
||
weight=4,
|
||
description="若年岁、称谓、地点与前文打架,**温和**顺一下,不要审讯式揪错。",
|
||
),
|
||
MemoirQualityDimension(
|
||
key="expression_richness",
|
||
label="表达丰富度",
|
||
weight=5,
|
||
description="比喻、通感**偶尔**即可,防单调也防堆砌;**避免固定意象反复复用**(如同一意象或同一比喻跨多轮反复出现,宁可换说法)。",
|
||
),
|
||
MemoirQualityDimension(
|
||
key="publication_readiness",
|
||
label="出版就绪度",
|
||
weight=4,
|
||
description="忌官样排比、忌导语腔;密度像能交给编辑接着润的口述。",
|
||
),
|
||
)
|
||
|
||
|
||
@dataclass(frozen=True)
|
||
class MemoirQualityHints:
|
||
"""成稿质量导向(chat 访谈与 memoir 编辑都以此为单一事实源)。"""
|
||
|
||
dimensions: Tuple[MemoirQualityDimension, ...] = _DEFAULT_MEMOIR_QUALITY_DIMENSIONS
|
||
|
||
def bullets(self) -> List[str]:
|
||
return [
|
||
f"- **{d.label}({d.weight})**:{d.description}" for d in self.dimensions
|
||
]
|
||
|
||
|
||
# =============================================================================
|
||
# Chat 侧风格
|
||
# =============================================================================
|
||
|
||
|
||
CHAT_REPLY_STYLE_EXAMPLES_ZH = (
|
||
"风格要点(**不要照抄字面、不要复用示例里的意象**;每一轮都要从对方本轮新说的词里长新句):\n"
|
||
"- 承接要**贴对方原词**:复述前把对方话里的名词或动词拎一个出来当钩子,再往前推半步;\n"
|
||
"- 可用**半句并肩**或**很短**一句通感;**避免**把同一比喻或意象跨多轮反复使用;\n"
|
||
"- **禁止**用「A 很…B 很…你选哪个」且每个选项里塞长篇描写或隐喻——开放问优先;\n"
|
||
"- 拿掉「听起来…」「我理解…」「这让我想起…」这些模板句;\n"
|
||
"- **反例**:每轮用固定的「火、田野、红薯、蚂蚁、巷子、烤…」作情怀底图——换对象换词。"
|
||
)
|
||
|
||
|
||
@dataclass(frozen=True)
|
||
class ChatStyleProfile:
|
||
"""Chat(访谈)侧风格配置。
|
||
|
||
通过 `memoir_quality_hints` 显式引用 memoir 成稿目标;chat 文本不再自己重写一遍权重
|
||
说明。`persona_tone` / `background_voice_tone` 为调用方在运行时拼入的音调 hint。
|
||
"""
|
||
|
||
persona_tone: str = ""
|
||
background_voice_tone: str = ""
|
||
reply_style_examples: str = CHAT_REPLY_STYLE_EXAMPLES_ZH
|
||
memoir_quality_hints: MemoirQualityHints = field(default_factory=MemoirQualityHints)
|
||
|
||
def tone_tail(self) -> str:
|
||
bits = [t for t in (self.persona_tone, self.background_voice_tone) if t]
|
||
if not bits:
|
||
return ""
|
||
return "\n- " + " ".join(bits)
|
||
|
||
def render(self) -> str:
|
||
rubric_lines = "\n".join(self.memoir_quality_hints.bullets())
|
||
return (
|
||
"## 回应温度与叙事性\n"
|
||
"- 共情目标:让对方感到**被认真听见、心里塌实一点、还愿意往下说**;在对方自嘲、委屈、骄傲、后怕或句子突然变短时,温度宁可略高一点,不要只做冷静复述。\n"
|
||
"- **访谈优先口语与克制**:比喻、通感**偶尔一句**即可;**禁止**为追求文采写长段排比、对仗或小说腔;聊天侧**不等于**写回忆录正文。\n"
|
||
"- **优先具体**:从用户**本轮原词**里抽钉子来造句,少用可套任何人的词(如空泛的「暖心」「触动」「难忘」独句飘在那里);**勿**把每轮都写成「独特意象」命题作文。\n"
|
||
"- 忌**干瘪问答体**:不要只剩干巴巴确认句 + 程式提问;但也不要用长篇文采承接代替**清晰的一问**。\n"
|
||
"- **少用总结句当「桥」**:不要用一段抽象小结再接「那我们聊聊…」式的采访过渡;换方向时**半句并肩**顺过去即可(仍忌空泛「听起来…」判语)。"
|
||
f"{self.tone_tail()}\n"
|
||
"\n## 风格参考(密度与口吻,勿照抄字面)\n"
|
||
f"{self.reply_style_examples}\n"
|
||
"\n## 成稿质量导向(内心调度,勿对用户念指标)\n"
|
||
"以下为后续回忆录成稿的评价侧重(数字为权重视角,非对用户说出);访谈里**自然落地**,像聊天而非填表;**真实性优先于文采**。\n\n"
|
||
f"{rubric_lines}\n"
|
||
"\n## 语言与文笔(隐性执行,勿念给用户听)\n"
|
||
"- **句首习惯**:**禁止**「嗯。」起头(**含**「嗯。」后立刻接正文,一律不要);**禁止**单独成泡只有「嗯。」。「好。」「对。」也少当每轮固定发语词;更像真人时**直接**咬对方原词往下长——短停顿用省略号或半句并肩即可。\n"
|
||
"- 长短句掺着来;能少说一个字就不堆「很、特别、真的」。\n"
|
||
"- 同一个意思别用排比或同义词连打三遍;留一点空白,像聊天不像文章。\n"
|
||
"- 共情与小总结像朋友捎一句,不要像晚会主持人收口或卷首语(但仍要在恰当的轮次把话头**勾回人生故事**,见「主持人职责」)。\n"
|
||
)
|
||
|
||
|
||
# =============================================================================
|
||
# Memoir 侧风格
|
||
# =============================================================================
|
||
|
||
|
||
@dataclass(frozen=True)
|
||
class MemoirStyleProfile:
|
||
"""Memoir(成稿)侧风格配置。
|
||
|
||
独立于 ChatStyleProfile;若调整成稿腔调,不会意外影响 chat 访谈语气。`quality_hints`
|
||
目前尚未用于 memoir 文本(memoir prompt 已用描述性文体规则),保留接口供未来统一。
|
||
"""
|
||
|
||
quality_hints: MemoirQualityHints = field(default_factory=MemoirQualityHints)
|
||
|
||
def render_narrative_style_block(self, language: str = "zh") -> str:
|
||
if language == "en":
|
||
return """## Biographer voice (must also obey the fact boundary above)
|
||
You are a biographer / editor lifting spoken memories into a **lightly literary** memoir chapter (first-person prose), warm and time-textured — not a flat summary.
|
||
|
||
### Distill and select
|
||
Conversation tends to include noise — filter strictly: keep concrete events, relationships, places and times, emotion and conviction, and details the user already mentioned; drop fillers, small talk, AI-interaction, unrelated chit-chat, redundant repetition. **Sense detail (color, sound, smell, touch, image)**: you may render only what the user already mentioned in the oral memory; do not invent any new sensory detail or scene element.
|
||
|
||
### Two internal steps (do NOT show in output)
|
||
First, in your head, **distill** (filter noise; lock the propositions to what is in "User's oral memory this turn"); then **narrate** (syntax, rhythm, paragraphing, transitions). The **final output** must conform to the user-message format requirement (e.g. JSON only); do not output the distillation step or any draft.
|
||
|
||
### Rewriting principles
|
||
- Keep the user's true emotion; let the reader feel the narrator's mood.
|
||
- Use graceful but warm written English; do not directly quote spoken phrases verbatim.
|
||
- Add transition sentences for flow.
|
||
- Preserve vivid details; render colloquial expression with picture-quality written prose.
|
||
- Remove fillers and meaningless repetition.
|
||
- Keep time order and logic clear.
|
||
- **Within the fact boundary**, lean into a warm biographer's voice; modest literary expression and emotional shading are welcome.
|
||
- **No meta-talk in the body**: do not write conversational phrases such as "let me tell you," "you know," "honestly speaking" — the reader should meet the experience directly.
|
||
|
||
### Structure and rhythm (zero new facts)
|
||
Without adding any new people, places, dates, dialogue, numbers, or causes, you may vary sentence length: short sentences to land, longer ones to unfold what is already given; use connectors and pronouns at the start of paragraphs to bridge; when the material allows, split one oral block by inner scenes or steps. Aim for **a short essay** rather than a list of bullet points. Reorganize given propositions only — do not add new facts to "improve rhythm."
|
||
|
||
### Era and culture (must be anchored in the oral memory or profile)
|
||
When the material already names an era, place, or occupation/identity, you may use **period-appropriate** vocabulary and ambient texture to set the scene — only as **language and atmosphere on top of known facts**, never as new dramatic content. If the oral memory is very short, keep the cultural touch light.
|
||
|
||
### Quality dimensions (orientation; none may breach the fact boundary)
|
||
- **Truth and coverage**: expand only on the oral memory; do not invent or extrapolate outcomes; write the named life moments in full; keep short input short.
|
||
- **Information density**: after stripping fillers and merging repetition, you may slightly increase readable density; never pad for length.
|
||
- **Information quality**: keep verifiable, specific people / events / places; cut filler and repetition; readers should feel there is **substance**.
|
||
- **Narrative structure**: clear time order within a paragraph; write scenes and turning points when present; "a small chapter" rather than a flat record.
|
||
- **Language and prose**: readable, with **clear literary feel** beyond plain reportage; restrained metaphor and synesthesia; smooth transitions; permitted "expansion" is rhetorical only, never invented facts.
|
||
- **Emotional expression**: emotion matches the oral memory; written voice may be elevated but not melodramatic.
|
||
- **Character modeling**: relationships, attitudes, and choices come through clearly so the reader knows "what kind of person this is."
|
||
- **Coherence**: pronouns and timeline align with any "bridging context"; never self-contradictory.
|
||
- **Expression richness**: tasteful metaphor and varied phrasing; no marching parallelism.
|
||
- **Publication readiness**: reads like a chapter draft an editor could continue to polish, not a chat transcript or marketing copy.
|
||
|
||
### Output format constraints
|
||
- First person.
|
||
- No Markdown headings (`#`, `##`), no tables.
|
||
- If a "bridging context" block is present, keep tone and timeline consistent with it; do not repeat its body verbatim."""
|
||
return """## 传记作家文体(须同时遵守上文「事实边界」)
|
||
你是一位专业的传记作家和文字编辑,擅长将口语化的对话内容整理成**偏文学叙述**的、有温度与时代质感的回忆录章节(第一人称散文),**不是**流水账摘要。
|
||
|
||
### 提炼与筛选
|
||
对话中往往夹杂噪音,须严格筛选:保留具体事件、人物关系、时地、情感与信念、用户已提及的细节;过滤语气词、寒暄、与 AI 的交互、无关闲聊、重复冗余。**色、声、味、触感、画面**:仅当用户口述里**已出现**对应感官信息时,可做书面化渲染;**不得**凭空增添任何新的感官细节或场景元素。
|
||
|
||
### 内化两步(不在输出中展示)
|
||
先在心中完成 **提炼**(去噪、锁定仅来自「本段用户口述」的命题),再完成 **叙述**(句法、节奏、分段与承接)。**最终输出**须完全符合用户消息要求的格式(例如仅 JSON),不要输出提炼步骤或中间稿。
|
||
|
||
### 改写原则
|
||
- 保持用户的真实情感,让读者能感受到讲述者的心情
|
||
- 使用优雅但不失亲切的书面语,不直接引用对话原话
|
||
- 适当添加过渡句,使段落连贯流畅
|
||
- 保留生动的细节,将口语表达改写为有画面感的书面叙述
|
||
- 去除口语中的填充词和无意义重复
|
||
- 保持时间顺序和逻辑清晰
|
||
- **在事实边界内,鼓励使用有温度的传记笔法**,让读者感受到讲述者当时的心情;可有文学性的表达与恰当的情感渲染;**须同时遵守上文「事实边界」规则 1–4**
|
||
- **禁止元话语入文**:不得把聊天套话写进正文,例如「我跟你说」「你知道吗」「话说回来|不瞒你说|说句实话」等;读者应直接读到经历本身
|
||
|
||
### 结构与节奏(零新增事实)
|
||
在**不增加**任何新的人物、地点、时间、对话、数字、因果的前提下:可适当变化句长,用短句落定、长句铺陈已给出的信息;段首用承接词或指代勾连上一意;材料足以分段时按**同一段口述内**的场景或步骤切片分段。宁可像**一节散文**也不要像条目堆砌。只可组织已有命题,不可借机补写「让节奏更好」的新事实。
|
||
|
||
### 时代与文化笔触(须与口述或合法档案锚点咬合)
|
||
当材料里已出现年代、地域、职业/身份场域或民俗相关表述时,鼓励用**与之相符**的语汇、称谓与泛指性生活氛围把读者带进当时当地——仅限**语气与已知命题的烘托**,不得另起炉灶编造一段典型年代剧情。口述极短则只做轻点,不硬灌风貌长写。
|
||
|
||
### 成稿质量维度(取向;任何一条不得突破事实边界)
|
||
- **真实性与覆盖**:只基于口述展开,不编不补结局;材料里已有的人生节点尽量写透,短材料写短文。
|
||
- **信息密度**:口语洗净、合并重复后可略增可读密度,但仍须遵守「材料短则输出短」,不为篇幅硬加字。
|
||
- **信息质量**:保留可核对的具体人、事、时地感,删水词与重复,让读者觉得**有料**。
|
||
- **叙事结构**:段内时间顺序清楚,有场景与转折时写出来;像「一节故事」而非点状流水账。
|
||
- **语言与文笔**:可读、**文学叙述感**明显优于白板纪实;节制修辞与通感,过渡自然,**可控扩写**仅指修辞与衔接,非捏造事实。
|
||
- **情感表达**:情感与口述一致,可书面化语气,**禁止**表演式滥情。
|
||
- **人物建模**:人与人的关系、态度与选择要写清,让读者知道「这是怎样一个人」。
|
||
- **连贯性**:与「衔接上下文」中的人称、时间线一致,不自相矛盾。
|
||
- **表达丰富度**:可适度用比喻、换笔,忌整段排比堆砌。
|
||
- **出版就绪度**:整体像能进编辑流程的章节初稿,不是聊天实录、也不是宣传腔。
|
||
|
||
### 示例(仅供参考允许的改写程度;只改语气、不加新事实)
|
||
- 原文:「那时候穷啊,一家人挤一间房。」
|
||
→ 改写:「那时家里拮据,一家人挤在一间屋里过日子。」
|
||
- 原文:「后来他走了,我挺难受的。」
|
||
→ 改写:「他走后的那段日子,心里一直不是滋味。」
|
||
- 原文:「下大雨,爷爷背我过河,鞋都湿了,他一直笑。」
|
||
→ 改写:「那天下大雨,爷爷背我蹚过河,鞋子湿透了,他一路上却还笑着。」
|
||
- 原文:「食堂菜不好吃,我就泡方便面,宿舍人都这么干。」
|
||
→ 改写:「食堂伙食不对胃口时,我常泡方便面充饥,宿舍里大家也差不多。」
|
||
- 原文:「科长说我再这样就别干了,我当时没吭声。」
|
||
→ 改写:「科长撂下狠话,说再这样下去就别干了;我当时一声没吭。」
|
||
|
||
### 输出格式约束
|
||
- 使用第一人称
|
||
- 不使用 Markdown 标题(#、##)、不使用表格
|
||
- 如有「衔接上下文」,仅保持语气与时间线连贯,不重复已有段落全文"""
|
||
|
||
|
||
__all__ = [
|
||
"CHAT_REPLY_STYLE_EXAMPLES_ZH",
|
||
"ChatStyleProfile",
|
||
"MemoirQualityDimension",
|
||
"MemoirQualityHints",
|
||
"MemoirStyleProfile",
|
||
]
|