feat(voice-client): PySide6 desktop client and Windows build scripts

Add voice_confirmation_client (poll, TTS MP3 playback, mic WAV resolve),
PyInstaller spec, start/build helpers, and API unit tests.

Pending manual testing: end-to-end on OR workstations and packaged exe.

Made-with: Cursor
voice_confirmation_client/README.md
Normal file
@@ -0,0 +1,80 @@
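# Operating Room Consumables Voice Confirmation Client (Desktop)

Standalone desktop program: it polls `GET /client/surgeries/{surgery_id}/pending-confirmation` at a configurable interval (default **5 seconds**), plays the **MP3 prompt** returned by the server, records the doctor's microphone as **16 kHz mono WAV**, and calls `POST .../pending-confirmation/{confirmation_id}/resolve` (`multipart` field name `audio`). The protocol follows [docs/客户端手术通信接口说明.md](../docs/客户端手术通信接口说明.md).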
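
As a rough illustration of the two endpoints named above, the sketch below builds their URLs with the standard library only. The helper names `pending_url` and `resolve_url` are hypothetical, and the base URL in the usage lines is just an example value; the client's actual request code lives in `voice_confirmation_client/core/api.py`.

```python
from urllib.parse import quote, urljoin


def _base(base_url: str) -> str:
    # Normalize so urljoin appends to the base path instead of replacing its last segment.
    return base_url.rstrip("/") + "/"


def pending_url(base_url: str, surgery_id: str) -> str:
    """URL polled with GET to ask whether a confirmation is pending (hypothetical helper)."""
    return urljoin(_base(base_url), f"client/surgeries/{surgery_id}/pending-confirmation")


def resolve_url(base_url: str, surgery_id: str, confirmation_id: str) -> str:
    """URL that receives the recorded WAV via POST, as multipart field `audio`."""
    cid = quote(confirmation_id, safe="")  # percent-encode every reserved character
    return urljoin(
        _base(base_url),
        f"client/surgeries/{surgery_id}/pending-confirmation/{cid}/resolve",
    )


print(pending_url("http://127.0.0.1:38080", "123456"))
print(resolve_url("http://127.0.0.1:38080", "123456", "abc/1"))
```

Percent-encoding the confirmation ID keeps an ID that happens to contain `/` from being read as an extra path segment.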

## Environment

- Python **3.13+** (same as the main project)
- Install the optional dependency group **`voice-client`** (PySide6, httpx, numpy, sounddevice)

```bash
cd /path/to/operation-room-monitor-server
uv sync --group voice-client
```

## Running (development)

If the project has no `build-system` configured, `uv` may not register the `voice-confirmation-client` command, so the recommended way is:

```bash
./start_voice_confirmation_client.sh
```

Or, from the repository root:

```bash
uv run --group voice-client python -m voice_confirmation_client
```

Windows (from the repository root):

```bat
start_voice_confirmation_client.bat
```

If the entry point is available, you can also run:

```bash
uv run --group voice-client voice-confirmation-client
```

In the UI, enter the **server Base URL** and the **6-digit surgery ID**, then click **Start monitoring**.
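
The 6-digit rule for the surgery ID can be checked along these lines. This is only a sketch: `is_valid_surgery_id` is a hypothetical helper, and it deliberately uses `[0-9]` so that only ASCII digits pass, which is an assumption about what the server expects.

```python
import re


def is_valid_surgery_id(value: str) -> bool:
    """Return True only for exactly six ASCII digits (hypothetical helper)."""
    # fullmatch rejects partial matches such as "1234567" or "12345a".
    return re.fullmatch(r"[0-9]{6}", value.strip()) is not None


print(is_valid_surgery_id("123456"))
print(is_valid_surgery_id("12345"))
```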
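
## Audio notes

- **MP3 playback**: a bundled `ffplay` is tried first, then `ffplay` (ffmpeg) on `PATH`, then `afplay` on macOS; on Windows the last resort is opening the file with the system's default player. For offline environments, place `ffplay` in `voice_confirmation_client/bin/` (the `bin/` directory inside the package directory).
- **Recording**: by default **sounddevice** records 16 kHz mono audio, which is saved as 16-bit PCM WAV (same format as the browser demo). Optionally tick **Prefer ffmpeg for recording** (requires a local ffmpeg and working device arguments; the default Windows device name may need adjusting on site, see `default_ffmpeg_input_args` in `voice_confirmation_client/core/record.py`).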
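
To make the target audio format concrete, here is a minimal sketch that packs float samples into a 16 kHz mono 16-bit PCM WAV using only the standard library. The function name `floats_to_wav_16k_mono` is hypothetical; the client itself works on NumPy arrays captured by sounddevice, so treat this as an illustration of the format rather than the shipped recording code.

```python
import io
import struct
import wave


def floats_to_wav_16k_mono(samples: list[float]) -> bytes:
    """Pack float samples in [-1.0, 1.0] into a 16 kHz mono 16-bit PCM WAV."""
    # Clip first so out-of-range input cannot overflow the 16-bit range.
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    pcm = struct.pack(f"<{len(clipped)}h", *(int(s * 32767.0) for s in clipped))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)      # mono
        wf.setsampwidth(2)      # 2 bytes per sample = 16-bit
        wf.setframerate(16000)  # 16 kHz
        wf.writeframes(pcm)
    return buf.getvalue()


wav_bytes = floats_to_wav_16k_mono([0.0, 0.5, -0.5, 2.0])
print(len(wav_bytes))
```

The resulting bytes are what would be uploaded as the `audio` multipart field.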
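
## Packaging (PyInstaller)

Build on the **target operating system** (do not cross-compile a Qt desktop application).

```bash
uv sync --group voice-client-build
uv run --group voice-client-build pyinstaller voice_client.spec --noconfirm
# or
uv run --group voice-client-build python scripts/build_voice_client.py
```

**One-step Windows build (from the repository root)**: double-click `build_voice_confirmation_client.bat` or run it in `cmd`; for a clean build, add the `--clean` argument (this deletes `build/` and `dist/` first).

Output directory: `dist/voice-confirmation-client/` (a directory distribution that contains the executable). On Windows the executable is `voice-confirmation-client.exe`.

**Notes**:

- The bundle is large (it includes PySide6); antivirus software may flag PyInstaller-built executables as false positives, so ask hospital IT to allowlist it if needed.
- **macOS**: an unsigned or un-notarized `.app` may have to be allowed manually under "Privacy & Security"; an official release requires Apple Developer signing and notarization.
- **Optional**: put the `ffmpeg`/`ffplay` binaries into `voice_confirmation_bin/` inside the packaged directory and the program will prefer them (either add `datas` to the spec so the directory is bundled, or copy it into the distribution directory by hand).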
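
The lookup order for bundled helper binaries can be pictured roughly as follows. This is a simplified sketch, not the shipped code: `find_helper` and its parameters are hypothetical names, and it assumes only the `voice_confirmation_bin/` and `bin/` directory names mentioned in this README.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_helper(name: str, frozen_base: Optional[Path], package_dir: Path) -> Optional[str]:
    """Return a path to `name` (e.g. "ffplay"), preferring bundled copies over PATH."""
    candidates: list[Path] = []
    if frozen_base is not None:
        # Packaged build: helpers copied next to the bundled application files.
        candidates.append(frozen_base / "voice_confirmation_bin")
    # Development checkout: optional bin/ inside the package directory.
    candidates.append(package_dir / "bin")
    for directory in candidates:
        for filename in (name, f"{name}.exe"):
            path = directory / filename
            if path.is_file():
                return str(path)
    # Nothing bundled: fall back to whatever is installed on PATH (may be None).
    return shutil.which(name)
```

Checking bundled directories before `PATH` lets an offline workstation run with the binaries shipped alongside the program, regardless of what is installed system-wide.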
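
## On-site troubleshooting

1. **Network**: the client machine must be able to reach the monitoring service's HTTP/HTTPS port (`38080` by default in the docs).
2. **Microphone**: pick the correct device under "Input device"; if the list is empty, check the system's privacy permissions for the microphone.
3. **Nothing pending**: a 404 from polling is the normal case; untick "Hide 404 polling logs" to watch the request rhythm.
4. **Resolve failed**: use **Retry this round** to replay the prompt, record, and upload again, or use **Replay prompt only** to hear the prompt clearly.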
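
Items 3 and 4 boil down to a small decision per poll result. The sketch below is an illustrative simplification with hypothetical names (`next_action` and its parameters); the real worker tracks more state, such as a busy flag and a settings generation counter.

```python
from typing import Optional


def next_action(
    status: int,
    confirmation_id: str,
    handled_id: Optional[str],
    failed_id: Optional[str],
    retry_requested: bool,
) -> str:
    """Return "idle", "wait", or "run" for one poll result (illustrative only)."""
    if status == 404:
        return "idle"  # nothing pending; this is the normal case
    if status != 200 or not confirmation_id:
        return "wait"  # unexpected response: log it and poll again later
    if retry_requested:
        return "run"  # the operator asked for another attempt
    if confirmation_id in (handled_id, failed_id):
        return "wait"  # already attempted; do not replay the prompt on every poll
    return "run"  # new confirmation: play, record, upload
```

Keeping a failed confirmation in the "wait" state until someone asks for a retry avoids replaying the prompt in the operating room on every polling cycle.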
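
## Differences from the browser demo

- The browser demo (`scripts/demo_client/`) polls every **10 seconds** by default; this client defaults to **5 seconds**, adjustable in the UI.
- This client has no "start/end surgery" buttons; a surgery must be started by the existing workflow or by another client calling `POST /client/surgeries/start`.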
voice_confirmation_client/__init__.py
Normal file
@@ -0,0 +1,3 @@
"""Desktop voice confirmation client for OR monitor API (pending-confirmation loop)."""

__version__ = "0.1.0"
voice_confirmation_client/__main__.py
Normal file
@@ -0,0 +1,20 @@
"""Entry: `python -m voice_confirmation_client` or `voice-confirmation-client`."""

from __future__ import annotations

import sys


def main() -> None:
    from PySide6.QtWidgets import QApplication

    from voice_confirmation_client.gui.main_window import MainWindow

    app = QApplication(sys.argv)
    win = MainWindow()
    win.show()
    raise SystemExit(app.exec())


if __name__ == "__main__":
    main()
voice_confirmation_client/core/__init__.py
Normal file
@@ -0,0 +1,3 @@
from voice_confirmation_client.core.monitor_worker import MonitorWorker

__all__ = ["MonitorWorker"]
voice_confirmation_client/core/api.py
Normal file
@@ -0,0 +1,87 @@
"""HTTP client for pending-confirmation and resolve endpoints."""

from __future__ import annotations

import json
from dataclasses import dataclass
from typing import Any
from urllib.parse import quote, urljoin

import httpx


@dataclass
class PendingConfirmationPayload:
    surgery_id: str
    confirmation_id: str
    prompt_text: str
    prompt_audio_mp3_base64: str
    options: list[dict[str, Any]]
    model_top1_label: str
    model_top1_confidence: float
    created_at: str
    raw: dict[str, Any]


class ConfirmationApiClient:
    def __init__(self, base_url: str, timeout: float = 60.0) -> None:
        self._base = base_url.rstrip("/") + "/"
        self._timeout = timeout
        self._client = httpx.Client(timeout=timeout)

    @property
    def base_url_normalized(self) -> str:
        return self._base

    def close(self) -> None:
        self._client.close()

    def _url(self, path: str) -> str:
        return urljoin(self._base, path.lstrip("/"))

    def get_pending(self, surgery_id: str) -> tuple[int, dict[str, Any] | str]:
        url = self._url(f"client/surgeries/{surgery_id}/pending-confirmation")
        r = self._client.get(url)
        text = r.text
        if not text:
            return r.status_code, {}
        try:
            body: dict[str, Any] | str = json.loads(text)
        except json.JSONDecodeError:
            body = text
        return r.status_code, body

    def parse_pending(self, body: dict[str, Any]) -> PendingConfirmationPayload:
        return PendingConfirmationPayload(
            surgery_id=str(body.get("surgery_id", "")),
            confirmation_id=str(body["confirmation_id"]),
            prompt_text=str(body.get("prompt_text", "")),
            prompt_audio_mp3_base64=str(body.get("prompt_audio_mp3_base64", "")),
            options=list(body.get("options") or []),
            model_top1_label=str(body.get("model_top1_label", "")),
            model_top1_confidence=float(body.get("model_top1_confidence", 0.0)),
            created_at=str(body.get("created_at", "")),
            raw=body,
        )

    def post_resolve(
        self,
        surgery_id: str,
        confirmation_id: str,
        wav_bytes: bytes,
        filename: str = "voice.wav",
    ) -> tuple[int, dict[str, Any] | str]:
        cid_enc = quote(confirmation_id, safe="")
        url = self._url(
            f"client/surgeries/{surgery_id}/pending-confirmation/{cid_enc}/resolve"
        )
        files = {"audio": (filename, wav_bytes, "audio/wav")}
        r = self._client.post(url, files=files)
        text = r.text
        if not text:
            return r.status_code, {}
        try:
            body: dict[str, Any] | str = json.loads(text)
        except json.JSONDecodeError:
            body = text
        return r.status_code, body
voice_confirmation_client/core/monitor_worker.py
Normal file
@@ -0,0 +1,347 @@
"""Background polling + play + record + resolve (threaded, Qt-free)."""

from __future__ import annotations

import re
import threading
import time
from collections.abc import Callable
from dataclasses import dataclass, field
from typing import Any

from voice_confirmation_client.core.api import ConfirmationApiClient
from voice_confirmation_client.core.playback import play_mp3_from_base64
from voice_confirmation_client.core.record import record_wav_16k_mono


@dataclass
class MonitorSettings:
    base_url: str = "http://127.0.0.1:38080"
    surgery_id: str = ""
    interval_sec: float = 5.0
    record_seconds: float = 8.0
    dry_run: bool = False
    hide_404_logs: bool = True
    prefer_ffmpeg_record: bool = False
    sounddevice_device: int | str | None = None


@dataclass
class _MutableState:
    generation: int = 0
    busy: bool = False
    spoken_cid: str | None = None
    failed_resolve_cid: str | None = None
    force_retry: bool = False
    last_payload: dict[str, Any] | None = None


class MonitorWorker:
    """Polls pending-confirmation; on new item plays MP3, records WAV, POSTs resolve."""

    def __init__(
        self,
        *,
        on_log: Callable[[str], None] | None = None,
        on_state: Callable[[str], None] | None = None,
        on_pending: Callable[[dict[str, Any] | None], None] | None = None,
    ) -> None:
        self._on_log = on_log
        self._on_state = on_state
        self._on_pending = on_pending
        self._settings = MonitorSettings()
        self._settings_lock = threading.Lock()
        self._state = _MutableState()
        self._state_lock = threading.Lock()
        self._stop = threading.Event()
        self._wake = threading.Event()
        self._monitoring = threading.Event()
        self._thread: threading.Thread | None = None
        self._api: ConfirmationApiClient | None = None
        self._api_base: str | None = None
        self._api_lock = threading.Lock()

    def set_settings(self, **kwargs: Any) -> None:
        with self._settings_lock:
            old_sid = self._settings.surgery_id
            for k, v in kwargs.items():
                if hasattr(self._settings, k):
                    setattr(self._settings, k, v)
            sid_changed = (
                "surgery_id" in kwargs and self._settings.surgery_id != old_sid
            )
        with self._state_lock:
            # Any settings change invalidates a pipeline that is in flight.
            self._state.generation += 1
            if sid_changed:
                self._state.spoken_cid = None
                self._state.failed_resolve_cid = None
                self._state.last_payload = None
                self._state.force_retry = False
        if sid_changed:
            # Notify outside the lock; a new surgery has no pending item yet.
            self._emit_pending(None)

    def start_thread(self) -> None:
        if self._thread and self._thread.is_alive():
            return
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, name="VoiceMonitor", daemon=True)
        self._thread.start()

    def stop_thread(self) -> None:
        self._stop.set()
        self._wake.set()
        if self._thread:
            self._thread.join(timeout=8.0)
            self._thread = None
        with self._api_lock:
            if self._api:
                self._api.close()
                self._api = None
                self._api_base = None

    def set_monitoring(self, active: bool) -> None:
        if active:
            self._monitoring.set()
            self._wake.set()
        else:
            self._monitoring.clear()
            with self._state_lock:
                self._state.generation += 1

    def retry_failed(self) -> None:
        with self._state_lock:
            self._state.force_retry = True
        self._wake.set()

    def replay_prompt_only(self) -> None:
        """Play last pending MP3 again (GUI button); no record/upload."""
        threading.Thread(target=self._replay_prompt_job, name="ReplayPrompt", daemon=True).start()

    def _replay_prompt_job(self) -> None:
        with self._state_lock:
            payload = self._state.last_payload
        if not payload:
            self._log("No pending confirmation to replay")
            return
        b64 = payload.get("prompt_audio_mp3_base64") or ""
        if not b64:
            self._log("Current task has no MP3 data")
            return
        self._emit_state("Playing prompt (manual replay)…")
        try:
            play_mp3_from_base64(str(b64))
        except Exception as e:
            self._log(f"Replay failed: {e}")
        finally:
            self._emit_state("Idle")

    def _log(self, msg: str) -> None:
        if self._on_log:
            self._on_log(msg)

    def _emit_state(self, s: str) -> None:
        if self._on_state:
            self._on_state(s)

    def _emit_pending(self, p: dict[str, Any] | None) -> None:
        if self._on_pending:
            self._on_pending(p)

    def _get_api(self, base_url: str) -> ConfirmationApiClient:
        norm = base_url.rstrip("/") + "/"
        with self._api_lock:
            # Recreate the HTTP client only when the base URL actually changes.
            if self._api is None or self._api_base != norm:
                if self._api:
                    self._api.close()
                self._api = ConfirmationApiClient(base_url)
                self._api_base = norm
            return self._api

    def _run(self) -> None:
        while not self._stop.is_set():
            if not self._monitoring.is_set():
                time.sleep(0.15)
                continue

            # Work on a private copy so the GUI can change settings concurrently.
            with self._settings_lock:
                cfg = MonitorSettings(
                    base_url=self._settings.base_url,
                    surgery_id=self._settings.surgery_id,
                    interval_sec=self._settings.interval_sec,
                    record_seconds=self._settings.record_seconds,
                    dry_run=self._settings.dry_run,
                    hide_404_logs=self._settings.hide_404_logs,
                    prefer_ffmpeg_record=self._settings.prefer_ffmpeg_record,
                    sounddevice_device=self._settings.sounddevice_device,
                )

            if not re.fullmatch(r"\d{6}", cfg.surgery_id or ""):
                self._emit_state("Invalid surgery ID (6 digits required)")
                self._wake.wait(timeout=1.0)
                self._wake.clear()
                continue

            api = self._get_api(cfg.base_url)

            # Snapshot shared state, then release the lock before any waiting.
            with self._state_lock:
                busy = self._state.busy
                gen_before = self._state.generation
            if busy:
                self._wake.wait(timeout=0.5)
                self._wake.clear()
                continue

            try:
                status, body = api.get_pending(cfg.surgery_id)
            except Exception as e:
                self._log(f"GET pending failed: {e}")
                self._wait_interval(cfg.interval_sec)
                continue

            # Discard the response if settings changed while the request was in flight.
            with self._state_lock:
                if self._state.generation != gen_before:
                    continue
                if self._state.busy:
                    continue

            if status == 404:
                with self._state_lock:
                    self._state.last_payload = None
                    self._state.spoken_cid = None
                    self._state.failed_resolve_cid = None
                self._emit_pending(None)
                if not cfg.hide_404_logs:
                    self._log("Nothing pending")
                self._emit_state("Polling (nothing pending)")
                self._wait_interval(cfg.interval_sec)
                continue

            if status != 200 or not isinstance(body, dict):
                self._log(f"GET pending unexpected HTTP {status}: {body}")
                self._wait_interval(cfg.interval_sec)
                continue

            cid = str(body.get("confirmation_id") or "")
            if not cid:
                self._wait_interval(cfg.interval_sec)
                continue

            with self._state_lock:
                self._state.last_payload = body
                failed = self._state.failed_resolve_cid
                force = self._state.force_retry
                spoken = self._state.spoken_cid

                if failed is not None and failed != cid:
                    self._state.failed_resolve_cid = None
                    self._state.force_retry = False
                    failed = None

                # Skip when this id already failed (awaiting a manual retry) or was
                # already handled; the server may keep returning the same id.
                skip = not force and (failed == cid or (spoken == cid and failed is None))
                if not skip:
                    self._state.force_retry = False
                    self._state.busy = True
                    self._state.spoken_cid = cid

            # Emit and wait outside the lock so GUI-thread calls such as
            # retry_failed() are never blocked for a whole polling interval.
            self._emit_pending(body)
            if skip:
                self._wait_interval(cfg.interval_sec)
                continue

            try:
                self._pipeline_play_record_resolve(cfg, api, body, cid)
            finally:
                with self._state_lock:
                    self._state.busy = False

            self._wake.clear()
            self._wait_interval(cfg.interval_sec)

    def _wait_interval(self, interval_sec: float) -> None:
        self._wake.wait(timeout=max(0.5, interval_sec))
        self._wake.clear()

    def _pipeline_play_record_resolve(
        self,
        cfg: MonitorSettings,
        api: ConfirmationApiClient,
        body: dict[str, Any],
        cid: str,
    ) -> None:
        gen_lock = self._state_lock
        with gen_lock:
            gen_run = self._state.generation

        try:
            self._emit_state("Playing prompt…")
            play_mp3_from_base64(str(body.get("prompt_audio_mp3_base64") or ""))
        except Exception as e:
            self._log(f"Playback failed: {e}")
            with gen_lock:
                self._state.failed_resolve_cid = cid
            self._emit_state("Playback failed (retry available)")
            return

        # Abort quietly if settings changed or monitoring stopped during playback.
        with gen_lock:
            if self._state.generation != gen_run:
                return

        try:
            self._emit_state("Recording…")
            wav = record_wav_16k_mono(
                cfg.record_seconds,
                device=cfg.sounddevice_device,
                prefer_ffmpeg=cfg.prefer_ffmpeg_record,
            )
        except Exception as e:
            self._log(f"Recording failed: {e}")
            with gen_lock:
                self._state.failed_resolve_cid = cid
            self._emit_state("Recording failed (retry available)")
            return

        with gen_lock:
            if self._state.generation != gen_run:
                return

        if cfg.dry_run:
            self._log(f"[dry-run] recorded {len(wav)} bytes, upload skipped")
            with gen_lock:
                self._state.failed_resolve_cid = None
                self._state.spoken_cid = None
                self._state.generation += 1
            self._emit_state("Idle (dry-run)")
            return

        try:
            self._emit_state("Uploading for recognition…")
            st, res = api.post_resolve(cfg.surgery_id, cid, wav)
        except Exception as e:
            self._log(f"POST resolve failed: {e}")
            with gen_lock:
                self._state.failed_resolve_cid = cid
            self._emit_state("Upload failed (retry available)")
            return

        if st == 200 and isinstance(res, dict) and res.get("status") == "accepted":
            self._log(
                f"Confirmed: {res.get('message', '')} "
                f"(resolved_label={res.get('resolved_label')!r})"
            )
            with gen_lock:
                self._state.failed_resolve_cid = None
                self._state.spoken_cid = None
                self._state.last_payload = None
                self._state.generation += 1
            self._emit_pending(None)
            self._emit_state("Idle")
            return

        self._log(f"resolve not accepted, HTTP {st}: {res}")
        with gen_lock:
            self._state.failed_resolve_cid = cid
        self._emit_state("Resolve/upload rejected (retry available)")
voice_confirmation_client/core/paths.py
Normal file
@@ -0,0 +1,47 @@
"""Resolve bundled helper binaries (ffplay/ffmpeg) next to the package or PyInstaller extract dir."""

from __future__ import annotations

import sys
from pathlib import Path


def package_root() -> Path:
    """The `voice_confirmation_client` package directory (parent of `core/`)."""
    return Path(__file__).resolve().parent.parent


def frozen_base() -> Path | None:
    """PyInstaller onefile/onedir: sys._MEIPASS or executable dir."""
    if getattr(sys, "frozen", False):
        meipass = getattr(sys, "_MEIPASS", None)
        if meipass:
            return Path(meipass)
        return Path(sys.executable).resolve().parent
    return None


def bin_dir() -> Path:
    """`voice_confirmation_bin/` under the frozen base if present, else `bin/` in the package dir."""
    fb = frozen_base()
    if fb is not None:
        d = fb / "voice_confirmation_bin"
        if d.is_dir():
            return d
    return package_root() / "bin"


def find_ffplay() -> Path | None:
    for name in ("ffplay", "ffplay.exe"):
        p = bin_dir() / name
        if p.is_file():
            return p
    return None


def find_ffmpeg() -> Path | None:
    for name in ("ffmpeg", "ffmpeg.exe"):
        p = bin_dir() / name
        if p.is_file():
            return p
    return None
voice_confirmation_client/core/playback.py
Normal file
@@ -0,0 +1,61 @@
"""Play MP3 bytes via system player or bundled ffplay."""

from __future__ import annotations

import base64
import os
import shutil
import subprocess
import sys
import tempfile
import time
from pathlib import Path

from voice_confirmation_client.core.paths import find_ffplay


def play_mp3_from_base64(b64: str) -> None:
    # Strip any whitespace or line breaks embedded in the base64 text.
    raw_b64 = "".join((b64 or "").split())
    if not raw_b64:
        raise ValueError("empty prompt_audio_mp3_base64")
    data = base64.b64decode(raw_b64, validate=False)
    with tempfile.NamedTemporaryFile(suffix=".mp3", delete=False) as f:
        f.write(data)
        tmp = f.name
    try:
        _play_mp3_path(Path(tmp))
    finally:
        try:
            os.unlink(tmp)
        except OSError:
            pass


def _play_mp3_path(path: Path) -> None:
    # Prefer the bundled ffplay, then one found on PATH.
    bundled = find_ffplay()
    ffplay = str(bundled) if bundled and bundled.is_file() else shutil.which("ffplay")
    if ffplay:
        subprocess.run(
            [ffplay, "-nodisp", "-autoexit", "-loglevel", "quiet", str(path)],
            check=True,
            timeout=600,
        )
        return
    if sys.platform == "darwin":
        subprocess.run(["afplay", str(path)], check=True, timeout=600)
        return
    if os.name == "nt":
        # Last resort: hand the file to the default associated player.
        # startfile() returns immediately, so the fixed sleep is only a rough
        # stand-in for the prompt length before the caller removes the temp file.
        os.startfile(str(path))  # type: ignore[attr-defined]
        time.sleep(5)
        return
    raise RuntimeError(
        "No MP3 player found. Install ffmpeg (ffplay) or run on macOS with afplay."
    )
voice_confirmation_client/core/record.py
Normal file
@@ -0,0 +1,94 @@
"""Record microphone to 16 kHz mono WAV (sounddevice or ffmpeg)."""

from __future__ import annotations

import io
import subprocess
import sys
import tempfile
import wave
from pathlib import Path

import numpy as np

from voice_confirmation_client.core.paths import find_ffmpeg


def record_wav_16k_mono(
    duration_sec: float,
    *,
    device: int | str | None = None,
    prefer_ffmpeg: bool = False,
    ffmpeg_input_args: list[str] | None = None,
) -> bytes:
    """Return WAV file bytes (16-bit PCM, 16 kHz, mono)."""
    if prefer_ffmpeg:
        bundled = find_ffmpeg()
        ffmpeg_bin = str(bundled) if bundled and bundled.is_file() else shutil_which_ffmpeg()
        if ffmpeg_bin:
            return _record_ffmpeg(ffmpeg_bin, duration_sec, ffmpeg_input_args)
    return _record_sounddevice(duration_sec, device=device)


def shutil_which_ffmpeg() -> str | None:
    import shutil

    return shutil.which("ffmpeg")


def _record_sounddevice(duration_sec: float, device: int | str | None) -> bytes:
    import sounddevice as sd

    samplerate = 16000
    frames = int(duration_sec * samplerate)
    kwargs: dict = {"samplerate": samplerate, "channels": 1, "dtype": "float32"}
    if device is not None and device != "":
        kwargs["device"] = device
    recording = sd.rec(frames, **kwargs)
    sd.wait()
    mono = np.clip(recording.reshape(-1), -1.0, 1.0)
    pcm = (mono * 32767.0).astype(np.int16)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(samplerate)
        wf.writeframes(pcm.tobytes())
    return buf.getvalue()


def default_ffmpeg_input_args() -> list[str]:
    if sys.platform == "darwin":
        return ["-f", "avfoundation", "-i", ":0"]
    if sys.platform == "win32":
        return ["-f", "dshow", "-i", "audio=Microphone"]
    return ["-f", "alsa", "-i", "default"]


def _record_ffmpeg(
    ffmpeg_bin: str, duration_sec: float, ffmpeg_input_args: list[str] | None
) -> bytes:
    input_args = ffmpeg_input_args if ffmpeg_input_args else default_ffmpeg_input_args()
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
        out = tmp.name
    try:
        cmd = [
            ffmpeg_bin,
            "-y",
            "-loglevel",
            "error",
            *input_args,
            "-t",
            str(duration_sec),
            "-ar",
            "16000",
            "-ac",
            "1",
            "-sample_fmt",
            "s16",
            out,
        ]
        subprocess.run(cmd, check=True, timeout=int(duration_sec) + 45)
        return Path(out).read_bytes()
    finally:
        Path(out).unlink(missing_ok=True)
voice_confirmation_client/gui/__init__.py
Normal file
@@ -0,0 +1 @@
"""PySide6 desktop GUI."""
voice_confirmation_client/gui/main_window.py
Normal file
@@ -0,0 +1,198 @@
"""Main PySide6 window for the voice confirmation client."""

from __future__ import annotations

import json
from datetime import datetime
from typing import Any

from PySide6.QtCore import Qt, Signal, QObject
from PySide6.QtGui import QCloseEvent
from PySide6.QtWidgets import (
    QCheckBox,
    QComboBox,
    QDoubleSpinBox,
    QFormLayout,
    QGroupBox,
    QHBoxLayout,
    QLabel,
    QLineEdit,
    QMainWindow,
    QMessageBox,
    QPushButton,
    QPlainTextEdit,
    QSplitter,
    QVBoxLayout,
    QWidget,
)

from voice_confirmation_client.core.monitor_worker import MonitorWorker


class _Bridge(QObject):
    # Signals carry worker-thread callbacks over to the GUI thread.
    log_line = Signal(str)
    state_text = Signal(str)
    pending_payload = Signal(object)


class MainWindow(QMainWindow):
    def __init__(self) -> None:
        super().__init__()
        self.setWindowTitle("Operating Room Consumables Voice Confirmation Client")
        self.resize(920, 640)

        self._bridge = _Bridge()
        self._bridge.log_line.connect(self._append_log)
        self._bridge.pending_payload.connect(self._show_pending)

        self._worker = MonitorWorker(
            on_log=lambda m: self._bridge.log_line.emit(m),
            on_state=lambda s: self._bridge.state_text.emit(s),
            on_pending=lambda p: self._bridge.pending_payload.emit(p),
        )
        self._worker.start_thread()

        central = QWidget()
        self.setCentralWidget(central)
        root = QVBoxLayout(central)

        form_box = QGroupBox("Connection and surgery")
        form = QFormLayout(form_box)
        self._base_url = QLineEdit("http://127.0.0.1:38080")
        self._surgery_id = QLineEdit("")
        self._surgery_id.setPlaceholderText("6 digits, e.g. 123456")
        self._interval = QDoubleSpinBox()
        self._interval.setRange(1.0, 120.0)
        self._interval.setValue(5.0)
        self._interval.setSuffix(" s")
        self._record_sec = QDoubleSpinBox()
        self._record_sec.setRange(2.0, 60.0)
        self._record_sec.setValue(8.0)
        self._record_sec.setSuffix(" s")
        form.addRow("Server Base URL", self._base_url)
        form.addRow("Surgery ID (surgery_id)", self._surgery_id)
        form.addRow("Polling interval", self._interval)
        form.addRow("Recording duration", self._record_sec)
        root.addWidget(form_box)

        adv = QGroupBox("Audio / Debug")
        adv_l = QFormLayout(adv)
        self._device_combo = QComboBox()
        self._device_combo.addItem("System default microphone", None)
        self._populate_input_devices()
        self._prefer_ffmpeg = QCheckBox(
            "Prefer ffmpeg for recording (requires local ffmpeg and working device arguments)"
        )
        self._hide_404 = QCheckBox("Hide 404 polling logs (recommended)")
        self._hide_404.setChecked(True)
        self._dry_run = QCheckBox("Dry-run: record but do not upload")
        adv_l.addRow("Input device", self._device_combo)
        adv_l.addRow(self._prefer_ffmpeg)
        adv_l.addRow(self._hide_404)
        adv_l.addRow(self._dry_run)
        root.addWidget(adv)

        btn_row = QHBoxLayout()
        self._btn_start = QPushButton("Start monitoring")
        self._btn_stop = QPushButton("Stop monitoring")
        self._btn_stop.setEnabled(False)
        self._btn_retry = QPushButton("Retry this round (play + record + upload)")
        self._btn_replay = QPushButton("Replay prompt only")
        btn_row.addWidget(self._btn_start)
        btn_row.addWidget(self._btn_stop)
        btn_row.addWidget(self._btn_retry)
        btn_row.addWidget(self._btn_replay)
        btn_row.addStretch()
        root.addLayout(btn_row)

        self._status_label = QLabel("Idle")
        root.addWidget(self._status_label)
        self._bridge.state_text.connect(self._status_label.setText)

        split = QSplitter(Qt.Orientation.Horizontal)
        self._pending_view = QPlainTextEdit()
        self._pending_view.setReadOnly(True)
        self._pending_view.setPlaceholderText("Pending confirmation details will appear here…")
        self._log = QPlainTextEdit()
        self._log.setReadOnly(True)
        self._log.setPlaceholderText("Log…")
        split.addWidget(self._pending_view)
        split.addWidget(self._log)
        split.setSizes([360, 520])
        root.addWidget(split, stretch=1)

        self._btn_start.clicked.connect(self._start_monitoring)
        self._btn_stop.clicked.connect(self._stop_monitoring)
        self._btn_retry.clicked.connect(self._worker.retry_failed)
        self._btn_replay.clicked.connect(self._worker.replay_prompt_only)

        self._apply_settings_silent()

    def _show_pending(self, payload: object) -> None:
        if payload is None:
            self._pending_view.clear()
            return
        if not isinstance(payload, dict):
            self._pending_view.setPlainText(str(payload))
            return
        try:
            text = json.dumps(payload, ensure_ascii=False, indent=2)
        except (TypeError, ValueError):
            text = str(payload)
        self._pending_view.setPlainText(text)

    def _populate_input_devices(self) -> None:
        try:
            import sounddevice as sd
        except ImportError:
            return
        try:
            devices = sd.query_devices()
            hostapis = sd.query_hostapis()
        except Exception:
            return
        for i, d in enumerate(devices):
            # List only devices that can capture audio.
            if d.get("max_input_channels", 0) <= 0:
                continue
            ha = hostapis[d["hostapi"]]["name"] if d.get("hostapi") is not None else ""
            label = f"{i}: {d.get('name', '')} ({ha})"
            self._device_combo.addItem(label, i)

    def _apply_settings_silent(self) -> None:
        dev_data = self._device_combo.currentData()
        self._worker.set_settings(
            base_url=self._base_url.text().strip(),
            surgery_id=self._surgery_id.text().strip(),
            interval_sec=float(self._interval.value()),
            record_seconds=float(self._record_sec.value()),
            dry_run=self._dry_run.isChecked(),
            hide_404_logs=self._hide_404.isChecked(),
            prefer_ffmpeg_record=self._prefer_ffmpeg.isChecked(),
            sounddevice_device=dev_data,
        )

    def _start_monitoring(self) -> None:
        sid = self._surgery_id.text().strip()
        if len(sid) != 6 or not sid.isdigit():
            QMessageBox.warning(self, "Validation failed", "The surgery ID must be exactly 6 digits.")
            return
        self._apply_settings_silent()
        self._worker.set_monitoring(True)
        self._btn_start.setEnabled(False)
        self._btn_stop.setEnabled(True)
        self._append_log("--- Monitoring started ---")

    def _stop_monitoring(self) -> None:
        self._worker.set_monitoring(False)
        self._btn_start.setEnabled(True)
        self._btn_stop.setEnabled(False)
        self._append_log("--- Monitoring stopped ---")
        self._status_label.setText("Stopped")

    def _append_log(self, line: str) -> None:
        ts = datetime.now().strftime("%H:%M:%S")
        self._log.appendPlainText(f"[{ts}] {line}")
        sb = self._log.verticalScrollBar()
        sb.setValue(sb.maximum())

    def closeEvent(self, event: QCloseEvent) -> None:
        self._worker.stop_thread()
        event.accept()