feat: surgery pipeline API, video inference, voice confirm, and tests
- Add FastAPI routes for surgery start/end, results, pending confirmation (WAV upload), and health checks.
- Implement RTSP/Hikvision capture, consumable classification, session manager, MinIO/Baidu voice resolution, and DB persistence.
- Add documentation (client API, video backends, staging checklist) and sample camera/RTSP config.
- Add pytest suite (API contract, session manager, voice, repositories, pipeline persistence) and httpx dev dependency.
- Replace deprecated HTTP_422_UNPROCESSABLE_ENTITY with HTTP_422_UNPROCESSABLE_CONTENT.
- Fix SurgeryPipeline DB reads to use an explicit transaction with autobegin disabled.
Made-with: Cursor
2026-04-21 18:33:54 +08:00
from __future__ import annotations

from threading import Lock
from typing import Any

from aip import AipSpeech

from app.config import Settings, settings as _default_settings


class BaiduSpeechNotConfiguredError(RuntimeError):
    """Raised when the API is called without `BAIDU_APP_ID` / `BAIDU_API_KEY` / `BAIDU_SECRET_KEY` configured."""


class BaiduSpeechService:
    """Baidu short-speech recognition (asr) and online speech synthesis (synthesis), built on `AipSpeech` from `baidu-aip`."""

    def __init__(self, app_settings: Settings | None = None) -> None:
        self._s = app_settings or _default_settings
        self._client: AipSpeech | None = None
        self._lock = Lock()

    @property
    def configured(self) -> bool:
        return self._s.baidu_speech_configured

    def _client_or_raise(self) -> AipSpeech:
        if not self.configured:
            raise BaiduSpeechNotConfiguredError(
                "Baidu speech is not configured: set BAIDU_APP_ID, BAIDU_API_KEY, and BAIDU_SECRET_KEY."
            )
        # Create the client lazily, under the lock, so concurrent callers share one instance.
        with self._lock:
            if self._client is None:
                client = AipSpeech(
                    self._s.baidu_speech_app_id,
                    self._s.baidu_speech_api_key,
                    self._s.baidu_speech_secret_key,
                )
                if self._s.baidu_speech_connection_timeout_ms is not None:
                    client.setConnectionTimeoutInMillis(
                        self._s.baidu_speech_connection_timeout_ms
                    )
                if self._s.baidu_speech_socket_timeout_ms is not None:
                    client.setSocketTimeoutInMillis(self._s.baidu_speech_socket_timeout_ms)
                self._client = client
            return self._client

    def asr(
        self,
        speech: bytes | None = None,
        format: str = "pcm",
        rate: int = 16000,
        options: dict[str, Any] | None = None,
    ) -> dict[str, Any]:
        """Short-speech recognition. Returns Baidu's JSON response (including `err_no`, `result`, etc.).

        Always uses the Mandarin model (`dev_pid` comes from configuration), so that an
        omitted parameter does not fall back to the server-side default and produce,
        for example, English-biased results.
        """
        merged: dict[str, Any] = dict(options or {})
        # The configured dev_pid always wins over any caller-supplied value.
        merged["dev_pid"] = int(self._s.baidu_speech_asr_dev_pid)
        return self._client_or_raise().asr(speech, format, rate, merged)

    def synthesis(
        self,
        text: str,
        lang: str = "zh",
        ctp: int = 1,
        options: dict[str, Any] | None = None,
    ) -> bytes | dict[str, Any]:
        """Online speech synthesis. Returns audio bytes on success, or an error-info dict on failure."""
        return self._client_or_raise().synthesis(text, lang, ctp, options)
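
The `asr` docstring says the response is Baidu's JSON with `err_no` and `result`. A caller might unpack it roughly as sketched below. This is a minimal illustration, not part of the module: `AsrFailedError` and `first_transcript` are hypothetical names, and the sketch assumes that `err_no` is `0` on success, that `result` is a list of candidate strings, and that failures carry an `err_msg` field.

```python
from __future__ import annotations

from typing import Any


class AsrFailedError(RuntimeError):
    """Hypothetical error type for a non-zero `err_no` (not in the original module)."""


def first_transcript(response: dict[str, Any]) -> str:
    """Return the first candidate transcript from an ASR response dict.

    Assumes `err_no == 0` means success and `result` is a list of strings.
    """
    err_no = response.get("err_no")
    if err_no != 0:
        # `err_msg` is assumed to be present on failures; fall back to None otherwise.
        raise AsrFailedError(
            f"ASR failed: err_no={err_no}, err_msg={response.get('err_msg')!r}"
        )
    candidates = response.get("result") or []
    if not candidates:
        return ""
    # Strip surrounding whitespace from the top candidate.
    return str(candidates[0]).strip()
```

With a configured service, this could be used as `first_transcript(service.asr(pcm_bytes))`, where `service` is a `BaiduSpeechService` instance and `pcm_bytes` holds 16 kHz PCM audio. Raising on a non-zero `err_no` keeps recognition failures from being mistaken for an empty transcript.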