refactor(api,expo): converge multi-agent and session handling, remove memoir compatibility layer, drastically trim backend test suite

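- Align with the "multi-agent convergence" and memoir "stories-first / markdown-first" directions: tighten runtime contracts,
  remove transitional compatibility paths and dual-track logic, and update the client and docs in step.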

- Chat: make ChatOrchestrator the real-time orchestration entry point; delete the standalone conversation_agent and trim prompts.
- Memoir: delete memory_agent; converge MemoirOrchestrator, classification / story_route, and prompts onto the
  prepare_batches + run_story_pipeline_for_category_batch main path.
- Move the agents-side processor into the feature layer as background_runner, and remove duplicate/obsolete
  processor wrappers under features.

- Add history_store to reinforce the model in which conversation_messages is the DB source of truth and Redis is a cache.
- Adjust models, repo, service, and session_history; trim WS message_types; restructure pipeline and router.
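
A minimal sketch of the "DB as source of truth, Redis as cache" model, not the actual history_store code. HistoryStore, FakeDB, FakeCache, and all of their methods are hypothetical names, and in-memory dicts stand in for the conversation_messages table and for Redis:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeDB:
    """Hypothetical stand-in for the conversation_messages table (source of truth)."""
    rows: dict[str, list[str]] = field(default_factory=dict)

    def append(self, conversation_id: str, message: str) -> None:
        self.rows.setdefault(conversation_id, []).append(message)

    def load(self, conversation_id: str) -> list[str]:
        return list(self.rows.get(conversation_id, []))


@dataclass
class FakeCache:
    """Hypothetical stand-in for Redis: entries may disappear at any time."""
    entries: dict[str, list[str]] = field(default_factory=dict)

    def get(self, key: str) -> Optional[list[str]]:
        value = self.entries.get(key)
        return list(value) if value is not None else None

    def set(self, key: str, value: list[str]) -> None:
        self.entries[key] = list(value)

    def delete(self, key: str) -> None:
        self.entries.pop(key, None)


class HistoryStore:
    """Assumed shape: writes go to the DB, reads go through the cache."""

    def __init__(self, db: FakeDB, cache: FakeCache) -> None:
        self.db = db
        self.cache = cache

    def append(self, conversation_id: str, message: str) -> None:
        # Persist first, then invalidate: the cache never holds the only copy.
        self.db.append(conversation_id, message)
        self.cache.delete(conversation_id)

    def history(self, conversation_id: str) -> list[str]:
        cached = self.cache.get(conversation_id)
        if cached is not None:
            return cached
        # Cache miss (or eviction): rebuild from the DB and repopulate.
        messages = self.db.load(conversation_id)
        self.cache.set(conversation_id, messages)
        return messages
```

Under these assumptions, losing the cache costs only a reload: history() falls back to the DB and repopulates the entry.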

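- Remove legacy paths such as chapter placeholders and whole-chapter regeneration; chapter list and cover logic now require
  a story association; tighten cover eligibility and enqueue.
- Shrink helpers, repo, service, router, reading_segment_materialize, story_pipeline_sync, pdf_service, and others
  around canonical markdown / cover_asset_id; delete redundant code such as memoir_images/provider.
- tasks: slim down memoir_tasks, chapter_cover_tasks, and others substantially; align story_image_tasks and others with the current image tasks.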

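- core: minor adjustments to config, logging, redis, and task_tracker.
- auth / user / payment / quota: remove obsolete endpoints or logic on the router or service side (for example, the payment router has fewer lines).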

- pyproject.toml, development.sh, .env.example / .env.production, README, and others: sync descriptions or variables.

- Alembic 0001_initial_schema: minor tweaks (small changes that keep it consistent with the current schema).

- Memoir: align types / mappers / api, the chapter page, and the memoir page with the backend contract; adjust markdown-renderer.
- Voice: delete voice/player and trim voice-segment-store accordingly.

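- api/tests: delete conftest and the vast majority of existing test files (websocket_baseline, conversation, memoir
  images, PDF, SMS, etc.); this is an intentional reduction, signalling that the suite is to be rebuilt per backend-test-system.
- docs: add a summary of the multi-agent convergence and compatibility-layer removal plan; update the story-first design, backend-test-system,
  multi-agent-refactor-plan, implementation summary, and others.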

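BREAKING CHANGE: the backend's external contracts, memoir chapter fields, and the behavior of several routes/tasks have changed; a large number of API tests were removed,
  so any CI that depends on those cases must add tests under the new strategy or adjust the pipeline.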
Kevin
2026-03-22 16:45:57 +08:00
parent 70070216c4
commit 786ebf8ae6
122 changed files with 2802 additions and 7941 deletions


@@ -2,110 +2,48 @@
PDF generation service (moved from services into the memoir feature)
"""
from app.core.logging import get_logger
from io import BytesIO
from typing import List, Optional
import httpx
from PIL import Image
from reportlab.lib.pagesizes import A4
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib.styles import ParagraphStyle, getSampleStyleSheet
from reportlab.lib.units import inch
from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.cidfonts import UnicodeCIDFont
from reportlab.platypus import (
Image as ReportLabImage,
)
from reportlab.platypus import (
PageBreak,
Paragraph,
SimpleDocTemplate,
Spacer,
)
from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.cidfonts import UnicodeCIDFont
from app.core.logging import get_logger
from app.features.memoir.asset_resolver import (
collect_asset_ids_from_markdown,
split_markdown_by_asset_refs,
strip_legacy_image_placeholders,
strip_image_placeholders,
)
from app.features.memoir.chapter_markdown_compose import (
materialize_chapter_pdf_markdown_from_loaded_chapter,
)
from app.features.memoir.helpers import (
_chapter_markdown,
sections_to_content_and_images,
)
from app.features.memoir.helpers import _chapter_markdown
logger = get_logger(__name__)
def _chapter_markdown_for_pdf(chapter) -> str:
"""When stories are arranged, the PDF uses the materialized "## story title + body" form; otherwise it falls back to the chapter's canonical markdown."""
links = getattr(chapter, "story_links", None) or []
if links and any(getattr(l, "story", None) for l in links):
if links and any(getattr(link, "story", None) for link in links):
return materialize_chapter_pdf_markdown_from_loaded_chapter(chapter)
return _chapter_markdown(chapter)
from app.features.memoir.memoir_images.parser import PLACEHOLDER_RE
from app.features.memoir.memoir_images.schema import (
IMAGE_STATUS_COMPLETED,
normalize_image_assets,
)
from app.features.memoir.memoir_images.storage import (
CosDownloadUrlError,
TencentCosStorageService,
mark_image_delivery_unavailable,
resolve_image_storage_key,
)
logger = get_logger(__name__)
def strip_image_placeholders(text: str) -> str:
return PLACEHOLDER_RE.sub("", text or "").strip()
def split_content_blocks(content: str, images: list[dict]) -> list[dict]:
blocks: list[dict] = []
remaining = content
for image in sorted(images or [], key=lambda item: item.get("index", 0)):
placeholder = image.get("placeholder")
if not placeholder or placeholder not in remaining:
continue
before, remaining = remaining.split(placeholder, 1)
cleaned_before = strip_image_placeholders(before)
if cleaned_before:
blocks.append({"type": "text", "value": cleaned_before})
if image.get("status") == IMAGE_STATUS_COMPLETED and image.get("url"):
blocks.append({"type": "image", "url": image["url"]})
cleaned_remaining = strip_image_placeholders(remaining)
if cleaned_remaining:
blocks.append({"type": "text", "value": cleaned_remaining})
return blocks
def _prepare_pdf_image_assets(images: list[dict]) -> list[dict]:
storage = TencentCosStorageService.from_env()
prepared_assets: list[dict] = []
for item in normalize_image_assets(images):
asset = dict(item)
storage_key = resolve_image_storage_key(asset)
if asset.get("status") == IMAGE_STATUS_COMPLETED and storage_key:
try:
asset["url"] = storage.get_download_url(storage_key)
except CosDownloadUrlError as exc:
logger.warning(
"PDF 图片签名失败: key=%s, retryable=%s, request_id=%s, error=%s",
storage_key,
exc.retryable,
exc.request_id,
exc,
)
asset = mark_image_delivery_unavailable(asset)
except Exception as exc:
logger.warning("PDF 图片签名失败: key=%s, error=%s", storage_key, exc)
asset = mark_image_delivery_unavailable(asset)
prepared_assets.append(asset)
return prepared_assets
def _fit_image_size(
image_bytes: bytes, max_width: float, max_height: float
) -> tuple[float, float]:
@@ -180,12 +118,6 @@ class PDFService:
story.append(Spacer(1, 0.2 * inch))
# When story_links exist, inject a ## heading per story in the chapter (unlike the materialized chapter body, which omits story titles)
markdown = _chapter_markdown_for_pdf(chapter)
_, images_list = sections_to_content_and_images(chapter)
if not markdown:
markdown = getattr(chapter, "content", "") or ""
if not images_list:
images_list = list(getattr(chapter, "images", None) or [])
prepared_images = _prepare_pdf_image_assets(images_list)
blocks: list[dict]
if asset_url_map and collect_asset_ids_from_markdown(markdown):
blocks = split_markdown_by_asset_refs(
@@ -194,11 +126,14 @@ class PDFService:
)
for b in blocks:
if b.get("type") == "text":
b["value"] = strip_legacy_image_placeholders(
b.get("value") or ""
)
b["value"] = strip_image_placeholders(b.get("value") or "")
else:
blocks = split_content_blocks(markdown, prepared_images)
cleaned_markdown = strip_image_placeholders(markdown or "")
blocks = (
[{"type": "text", "value": cleaned_markdown}]
if cleaned_markdown
else []
)
for block in blocks:
if block["type"] == "text":
paragraphs = block["value"].split("\n\n")