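feat: voice confirmation, integration testing, and ops enhancements

- Voice: ordinal parsing (first / second, etc.), parse-failure counting, and API detail.retry_remaining;
  Baidu ASR dev_pid pinned to Mandarin; SurgeryPipelineError accepts extra, which is merged into the HTTP detail.
- Demo: demo routes and fake RTSP, client index and README; BackendResolver and config adjustments.
- Observability: consumption TSV log, voice file log, terminal Markdown helper; related tests and dependency updates.
- Note: .env is still gitignored, so local secrets are not part of this commit.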
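
The ordinal parsing and retry counting described above are implemented in the backend, which this part of the diff does not show. The following is only a minimal JavaScript sketch of the idea under stated assumptions: `parseOrdinal`, `resolveWithRetries`, the `maxRetries` default of 3, and the message text are all hypothetical names and values; only the `detail.retry_remaining` field comes from the commit message.

```javascript
// Hypothetical sketch, not the backend implementation.
// Maps a spoken ordinal such as "第二个" or "第2个" to a zero-based option index.
const CN_DIGITS = {
  "一": 1, "二": 2, "两": 2, "三": 3, "四": 4, "五": 5,
  "六": 6, "七": 7, "八": 8, "九": 9, "十": 10,
};

function parseOrdinal(transcript, optionCount) {
  const m = /第\s*([一二两三四五六七八九十]|\d+)\s*个?/.exec(transcript || "");
  if (!m) return null;
  const n = /^\d+$/.test(m[1]) ? Number(m[1]) : CN_DIGITS[m[1]];
  // Reject ordinals that point outside the offered options.
  return n >= 1 && n <= optionCount ? n - 1 : null;
}

// Assumed policy: each failed parse consumes one retry out of maxRetries.
function resolveWithRetries(transcript, optionCount, failuresSoFar, maxRetries = 3) {
  const index = parseOrdinal(transcript, optionCount);
  if (index !== null) return { ok: true, index };
  const retryRemaining = Math.max(0, maxRetries - (failuresSoFar + 1));
  return {
    ok: false,
    detail: { message: "could not parse an option", retry_remaining: retryRemaining },
  };
}
```

A client can then show `detail.retry_remaining` next to the failure message, which is what the demo page does for HTTP 422 responses.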

Made-with: Cursor
Kevin
2026-04-23 14:24:20 +08:00
parent 42720f81cf
commit 0c05463617
39 changed files with 3030 additions and 143 deletions


@@ -137,6 +137,25 @@
white-space: pre-wrap;
word-break: break-word;
}
.log-hint {
margin-top: 6px;
padding: 6px 8px;
font-size: 11px;
line-height: 1.4;
color: #fcd34d;
background: rgba(245, 158, 11, 0.12);
border: 1px solid rgba(245, 158, 11, 0.35);
border-radius: 4px;
}
#orch-status-banner { border: 1px solid var(--border); }
.callout-ok {
background: rgba(34, 197, 94, 0.12);
border: 1px solid rgba(34, 197, 94, 0.4);
border-radius: 8px;
padding: 10px 12px;
margin: 0 0 10px;
line-height: 1.5;
}
.log-time { color: var(--muted); font-size: 11px; }
.badge {
display: inline-block;
@@ -171,6 +190,7 @@
.muted { color: var(--muted); }
.err { color: var(--danger); }
.ok { color: var(--accent-2); }
.warn { color: var(--warn); }
.small { font-size: 12px; }
.grow { flex: 1; }
audio { width: 100%; margin-top: 8px; }
@@ -180,6 +200,18 @@
.layout { grid-template-columns: 1fr; }
.log { position: static; height: auto; max-height: 50vh; }
}
pre.cmd {
background: var(--panel-2);
border: 1px solid var(--border);
border-radius: 6px;
padding: 10px 12px;
font-size: 11px;
line-height: 1.45;
overflow-x: auto;
margin: 8px 0 0;
white-space: pre-wrap;
word-break: break-all;
}
</style>
</head>
<body>
@@ -187,6 +219,7 @@
<main>
<section class="card">
<h1>Operation Room Monitor · Demo Client</h1>
<p id="orch-status-banner" class="small" style="display:none;margin:8px 0 0;padding:8px 10px;border-radius:6px"></p>
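<p class="muted small">Manually trigger the five <code>/client/*</code> endpoints; audio recorded from the local microphone is encoded as WAV and uploaded to the voice-confirmation endpoint.</p>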
<div class="row" style="margin-top:10px">
<div>
@@ -200,16 +233,74 @@
</div>
<div class="actions">
<button id="btn-health" class="secondary">GET /health</button>
<button type="button" class="secondary" id="btn-orch-status" title="Check whether the one-click orchestration endpoint is registered">GET orchestrator status</button>
<span id="health-status" class="small muted"></span>
</div>
</section>
<section class="card">
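<h2>Debug: two video streams (for one-click orchestration / no real camera)</h2>
<p class="callout-ok small">
Choose videos for <strong>stream 1 / stream 2</strong>, tick the one-click orchestration box in §4.1, then click the start button; the server starts fake RTSP streams and writes <code>VIDEO_RTSP_URLS_JSON_FILE</code>. If one-click mode is unavailable, follow <code>scripts/demo_client/README.md</code> to run
<code>fake_rtsp_from_file.py</code> manually on the host and set the environment variables.
</p>
<h3>Two video streams (file selection for the §4.1 one-click flow; each stream's <code>RTSP_PATH</code> / <code>camera_id</code> must match the API configuration, e.g. <code>demo1</code> / <code>demo2</code>)</h3>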
<div class="row" style="margin-top:10px; align-items:stretch; grid-template-columns:1fr 1fr">
<div class="debug-stream" id="debug-stream-1" style="border:1px solid var(--border); border-radius:8px; padding:10px">
<h3 style="margin:0 0 8px; color:var(--accent)">Stream 1</h3>
<label>Video (the one-click upload takes precedence; optionally type a local path as a note)</label>
<input id="debug-vpath-1" type="text" placeholder="/path/a.mp4 or ./a.mp4" />
<div class="actions" style="margin-top:6px; align-items:center">
<input type="file" id="debug-vfile-1" accept="video/*" hidden />
<button type="button" class="secondary" id="btn-dbg-pick-1">Choose…</button>
<span id="debug-hint-1" class="small muted"></span>
</div>
<div class="row" style="margin-top:8px">
<div>
<label>RTSP path name <code>RTSP_PATH</code> (last URL segment; must differ between the two streams, e.g. <code>demo1</code>)</label>
<input id="debug-rpath-1" type="text" value="demo1" />
</div>
<div>
<label>camera_id</label>
<input id="debug-cam-1" type="text" value="or-cam-01" />
</div>
</div>
</div>
<div class="debug-stream" id="debug-stream-2" style="border:1px solid var(--border); border-radius:8px; padding:10px">
<h3 style="margin:0 0 8px; color:var(--accent)">Stream 2</h3>
<label>Video (the one-click upload takes precedence; optionally type a local path as a note)</label>
<input id="debug-vpath-2" type="text" placeholder="/path/b.mp4 or ./b.mp4" />
<div class="actions" style="margin-top:6px; align-items:center">
<input type="file" id="debug-vfile-2" accept="video/*" hidden />
<button type="button" class="secondary" id="btn-dbg-pick-2">Choose…</button>
<span id="debug-hint-2" class="small muted"></span>
</div>
<div class="row" style="margin-top:8px">
<div>
<label>RTSP path name <code>RTSP_PATH</code></label>
<input id="debug-rpath-2" type="text" value="demo2" />
</div>
<div>
<label>camera_id</label>
<input id="debug-cam-2" type="text" value="or-cam-02" />
</div>
</div>
</div>
</div>
<p id="debug-file-note" class="muted small" style="margin:8px 0 0">
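One-click orchestration <strong>uploads directly</strong> the files you choose here for stream 1 / stream 2. Choosing a file fills the box with <code>./filename</code> for display only; the actual upload uses the file picker, so there is no need to edit the path in the box.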
</p>
<div class="actions" style="margin-top:8px">
<button type="button" class="secondary" id="btn-debug-apply-cams" title="Write both camera_id values into camera_ids in §4.1">Copy camera_id values to Start surgery</button>
</div>
</section>
<section class="card">
<h2>§4.1 Start surgery</h2>
<div class="row">
<div>
<label>camera_ids (comma-separated, at least one)</label>
<input id="camera-ids" type="text" value="or-cam-01" />
<input id="camera-ids" type="text" value="or-cam-01,or-cam-02" />
</div>
<div>
<label>candidate_consumables<span id="labels-hint" class="badge">loading…</span></label>
@@ -218,8 +309,14 @@
</div>
</div>
</div>
<p class="small muted" style="margin:8px 0 0">
<label style="display:inline-flex;align-items:flex-start;gap:8px;cursor:pointer;max-width:52rem">
<input type="checkbox" id="orch-oneclick" style="margin-top:2px" />
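<span><strong>One-click orchestration</strong>: clicking the button below uploads the two videos chosen for <strong>stream 1 / stream 2</strong> in the Debug section; the monitoring service, running in <strong>an environment that can execute docker and ffmpeg</strong>, then starts fake RTSP streams, writes <code>VIDEO_RTSP_URLS_JSON_FILE</code>, and starts recording (requires <code>DEMO_ORCHESTRATOR_ENABLED=true</code> and that file on a writable mount; see the README). When unticked, the button sends the ordinary JSON start request (you must start the fake streams yourself first).</span>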
</label>
</p>
<div class="actions">
<button id="btn-start">POST /client/surgeries/start</button>
<button id="btn-start">Start surgery</button>
<button id="btn-load-all-labels" class="secondary" type="button">Load all labels</button>
<button id="btn-clear-labels" class="secondary" type="button">Clear</button>
</div>
@@ -245,10 +342,14 @@
<div class="actions">
<button id="btn-pending" class="secondary">Fetch one pending confirmation</button>
<label class="small" style="display:flex;align-items:center;gap:6px;cursor:pointer">
<input id="auto-poll" type="checkbox" /> Auto-poll (2s)
<input id="auto-poll" type="checkbox" checked /> Auto-poll (10s)
</label>
<label class="small" style="display:flex;align-items:center;gap:6px;cursor:pointer" title="Read the prompt aloud when a pending confirmation is fetched (Baidu TTS or the browser)">
<input id="tts-pending" type="checkbox" checked /> TTS on pending confirmation
</label>
<span id="voice-status" class="small muted"></span>
</div>
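<p id="voice-pipeline-hint" class="small muted" style="margin:6px 0 0">Default policy: a detection with <strong>Top1 confidence &lt; 0.9</strong> that reaches the voice lower bound is usually <strong>queued for confirmation</strong>; one at or above <code>VIDEO_AUTO_CONFIRM_CONFIDENCE</code> (default 0.9) whose label is in <code>candidate_consumables</code> is <strong>recorded directly as vision</strong>, and fetching pending confirmations then returns 404. <code>VIDEO_AUTO_CONFIRM_CONFIDENCE</code> can be adjusted through environment variables. To confirm, upload a WAV in the voice confirmation (recording) section.</p>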
<div id="pending-render" class="pending-box" hidden></div>
</section>
@@ -296,7 +397,7 @@
const surgeryId = () => $("surgery-id").value.trim();
const logEl = $("log");
function addLog(method, url, status, body, { error = false } = {}) {
function addLog(method, url, status, body, { error = false, hint = "" } = {}) {
const item = document.createElement("div");
item.className = "log-item";
const time = new Date().toLocaleTimeString();
@@ -318,6 +419,12 @@
catch { bodyEl.textContent = String(body); }
}
item.appendChild(bodyEl);
if (hint) {
const h = document.createElement("div");
h.className = "log-hint";
h.textContent = hint;
item.appendChild(h);
}
logEl.insertBefore(item, logEl.children[1] ?? null);
}
@@ -345,6 +452,40 @@
return { res, body: parsed };
}
async function apiMultipart(path, formData) {
const url = baseUrl() + path;
const bu = baseUrl();
console.info("[demo-client] orchestrate request", { baseUrl: bu, path, fullUrl: url });
let res;
try {
res = await fetch(url, { method: "POST", body: formData });
} catch (e) {
console.error("[demo-client] orchestrate network error", e);
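const netHint = "Cannot connect to " + url + ". Check that the server Base URL points at the monitoring API (default :38080); when this page is served from :38081, do not set Base URL to the demo page itself.";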
addLog("POST (orchestrate)", url, "NETWORK", String(e), { error: true, hint: netHint });
throw e;
}
const text = await res.text();
let parsed;
try { parsed = text ? JSON.parse(text) : null; } catch { parsed = text; }
const err = !res.ok;
let hint = "";
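if (res.status === 404) {
hint = "HTTP 404: this path is not registered on the server. Common causes: 1) DEMO_ORCHESTRATOR_ENABLED=true is not set, or the main process was not restarted, so POST /internal/demo/orchestrate-and-start is not mounted; 2) the server Base URL is wrong: it must point at the main API, e.g. http://127.0.0.1:38080, not this demo static site on :38081. Use the orchestrator-status button or open the browser console to see the [demo-client] logs.";
} else if (res.status === 400 && parsed && (parsed.detail || "").toString().indexOf("VIDEO_RTSP") >= 0) {
hint = "A writable VIDEO_RTSP_URLS_JSON_FILE must be configured; under Docker, bind-mount it to the same path inside the container.";
} else if (res.status === 503) {
hint = "Synthesizing the fake RTSP streams or starting recording failed; see the response body and the main service terminal log (demo orchestrate-and-start / ffmpeg / docker).";
}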
if (err) {
console.error("[demo-client] orchestrate response", { status: res.status, statusText: res.statusText, body: parsed, url });
} else {
console.info("[demo-client] orchestrate ok", { status: res.status, url });
}
addLog("POST (orchestrate)", url, res.status, parsed, { error: err, hint });
return { res, body: parsed };
}
// ============================================================
// Surgery ID validation
// ============================================================
@@ -427,6 +568,42 @@
};
$("btn-clear-labels").onclick = () => { tags = []; renderTags(); };
// ============================================================
// Orchestrator status (independent of the one-click switch; used to diagnose 404s)
// ============================================================
async function refreshOrchStatus() {
const b = $("orch-status-banner");
const url = baseUrl() + "/internal/demo/orchestrator-status";
try {
const res = await fetch(url);
const text = await res.text();
let data;
try { data = text ? JSON.parse(text) : null; } catch { data = { raw: text }; }
console.info("[demo-client] GET orchestrator-status", { url, httpStatus: res.status, data });
addLog("GET (orchestrator status)", url, res.status, data, { error: !res.ok });
b.style.display = "block";
if (!res.ok) {
b.style.background = "rgba(239, 68, 68, 0.1)";
b.style.color = "var(--text)";
b.textContent = "Could not fetch " + url + " (HTTP " + res.status + "). Set the server Base URL to the main API, e.g. http://127.0.0.1:38080.";
return;
}
const on = data.orchestrator_enabled === true;
const fset = data.video_rtsp_urls_json_file_set === true;
b.style.background = on && fset ? "rgba(34, 197, 94, 0.1)" : "rgba(245, 158, 11, 0.12)";
b.style.color = "var(--text)";
const fp = data.video_rtsp_urls_json_file || "(not set)";
b.innerHTML = on
? ("One-click <code>POST " + (data.orchestrate_path || "/internal/demo/orchestrate-and-start") + "</code>" + (fset ? " is available; RTSP mapping file: " : " is registered; RTSP mapping file: ") + "<code>" + fp + "</code>")
: ("One-click start is <strong>not registered</strong>: set <code>DEMO_ORCHESTRATOR_ENABLED=true</code> in the main service .env and <strong>restart</strong>. Currently " + (data.orchestrate_path || "") + " returns 404.");
} catch (e) {
console.error("[demo-client] orchestrator-status failed", e);
b.style.display = "block";
b.style.background = "rgba(239, 68, 68, 0.1)";
b.textContent = "Orchestrator status request failed: " + e;
}
}
// ============================================================
// §health
// ============================================================
@@ -435,6 +612,7 @@
$("health-status").textContent = `HTTP ${res.status}`;
$("health-status").className = "small " + (res.ok ? "ok" : "err");
};
$("btn-orch-status").onclick = () => { refreshOrchStatus(); };
// ============================================================
// §4.1 start
@@ -442,6 +620,30 @@
$("btn-start").onclick = async () => {
const sid = ensureSurgeryId();
if (!sid) return;
if ($("orch-oneclick") && $("orch-oneclick").checked) {
const f1 = $("debug-vfile-1").files[0];
const f2 = $("debug-vfile-2").files[0];
if (!f1 || !f2) {
alert("Please choose a video file for both stream 1 and stream 2 in the Debug section above first.");
return;
}
const fd = new FormData();
fd.append("video1", f1, f1.name);
fd.append("video2", f2, f2.name);
fd.append("surgery_id", sid);
fd.append("camera_1", ($("debug-cam-1").value || "or-cam-01").trim() || "or-cam-01");
fd.append("camera_2", ($("debug-cam-2").value || "or-cam-02").trim() || "or-cam-02");
fd.append("rtsp_path_1", ($("debug-rpath-1").value || "demo1").trim() || "demo1");
fd.append("rtsp_path_2", ($("debug-rpath-2").value || "demo2").trim() || "demo2");
fd.append("candidate_consumables_json", JSON.stringify([...tags]));
const { res, body } = await apiMultipart("/internal/demo/orchestrate-and-start", fd);
if (!res.ok) {
const detail = (body && (body.detail !== undefined)) ? body.detail : body;
const errText = (typeof detail === "object" && detail !== null) ? JSON.stringify(detail, null, 2) : String(detail || body || "error");
alert("One-click start failed, HTTP " + res.status + "\n\n" + errText);
}
return;
}
const camera_ids = $("camera-ids").value.split(",").map(s => s.trim()).filter(Boolean);
if (camera_ids.length === 0) { alert("camera_ids needs at least 1 entry"); return; }
await apiJson("POST", "/client/surgeries/start", {
@@ -508,14 +710,111 @@
};
// ============================================================
// §4.4 pending-confirmation
// §4.4 pending-confirmation + optional TTS
// ============================================================
let pollTimer = null;
let lastTtsConfirmationId = null;
function pickZhTtsVoice() {
if (!window.speechSynthesis) return null;
const vs = window.speechSynthesis.getVoices() || [];
return (
vs.find((v) => /^zh/i.test((v.lang || "") + (v.voiceURI || ""))) ||
vs.find((v) => (v.lang || "").startsWith("zh")) ||
null
);
}
function speakTextPromise(text) {
return new Promise((resolve, reject) => {
if (!text || !window.speechSynthesis) {
resolve();
return;
}
try {
window.speechSynthesis.cancel();
const u = new SpeechSynthesisUtterance(text);
u.lang = "zh-CN";
const v = pickZhTtsVoice();
if (v) u.voice = v;
u.rate = 0.95;
u.onend = () => resolve();
u.onerror = (ev) => reject(ev.error || new Error("tts"));
window.speechSynthesis.speak(u);
} catch (e) {
reject(e);
}
});
}
/** Prefer playing the Baidu MP3 from GET /prompt-audio; fall back to speechSynthesis when the fetch fails or returns a non-OK status. */
async function playPromptTts(surgeryId, confirmationId, textFallback) {
const path = `/client/surgeries/${surgeryId}/pending-confirmation/${encodeURIComponent(confirmationId)}/prompt-audio`;
const u = baseUrl() + path;
try {
const res = await fetch(u);
if (res.ok) {
const blob = await res.blob();
const o = URL.createObjectURL(blob);
return new Promise((resolve, reject) => {
const a = new Audio();
a.preload = "auto";
a.src = o;
a.onended = () => {
URL.revokeObjectURL(o);
resolve();
};
a.onerror = () => {
URL.revokeObjectURL(o);
reject(new Error("Audio element playback failed"));
};
const p = a.play();
if (p && typeof p.catch === "function") {
p.catch((err) => {
URL.revokeObjectURL(o);
reject(err);
});
}
});
}
} catch (e) {
console.warn("[demo-client] prompt-audio unavailable, falling back to browser TTS", e);
}
return speakTextPromise((textFallback || "").trim());
}
if (window.speechSynthesis) {
window.speechSynthesis.addEventListener("voiceschanged", () => {});
}
$("surgery-id").addEventListener("input", () => {
lastTtsConfirmationId = null;
});
async function fetchPendingOnce() {
const sid = surgeryId();
if (!/^\d{6}$/.test(sid)) return;
const { res, body } = await apiJson("GET", `/client/surgeries/${sid}/pending-confirmation`);
const path = `/client/surgeries/${sid}/pending-confirmation`;
const url = baseUrl() + path;
let res;
try {
res = await fetch(url);
} catch (e) {
addLog("GET", url, "NETWORK", String(e), { error: true });
return;
}
const raw = await res.text();
let body;
try {
body = raw ? JSON.parse(raw) : null;
} catch {
body = raw;
}
if (res.status === 404) {
// Having no pending confirmation is the normal case; skip the response log on the right to reduce noise
} else {
addLog("GET", url, res.status, body);
}
const box = $("pending-render");
if (res.status === 200 && body && body.confirmation_id) {
box.hidden = false;
@@ -528,6 +827,12 @@
<div style="margin-top:4px"><strong>prompt_text:</strong> ${body.prompt_text || ""}</div>
<div style="margin-top:4px"><strong>Top1:</strong> ${body.model_top1_label} <span class="muted">(${(body.model_top1_confidence * 100).toFixed(1)}%)</span></div>
<div style="margin-top:6px"><strong>options:</strong>${opts || '<div class="muted">(none)</div>'}</div>`;
const pt = (body.prompt_text || "").trim();
const ttsOn = $("tts-pending") && $("tts-pending").checked;
if (ttsOn && pt && body.confirmation_id !== lastTtsConfirmationId) {
lastTtsConfirmationId = body.confirmation_id;
void playPromptTts(sid, body.confirmation_id, pt).catch((e) => console.warn(e));
}
} else if (res.status === 404) {
box.hidden = false;
box.innerHTML = '<span class="muted">No pending confirmations.</span>';
@@ -538,16 +843,20 @@
}
$("btn-pending").onclick = fetchPendingOnce;
$("auto-poll").onchange = (e) => {
function applyAutoPoll() {
if (pollTimer) { clearInterval(pollTimer); pollTimer = null; }
if (e.target.checked) {
if ($("auto-poll") && $("auto-poll").checked) {
$("voice-status").textContent = "Auto-polling…";
pollTimer = setInterval(fetchPendingOnce, 2000);
pollTimer = setInterval(fetchPendingOnce, 10000);
fetchPendingOnce();
} else {
$("voice-status").textContent = "";
}
};
}
$("auto-poll").onchange = applyAutoPoll;
if ($("auto-poll") && $("auto-poll").checked) {
applyAutoPoll();
}
// ============================================================
// §4.5 Recording (mic → WAV 16kHz mono PCM)
@@ -706,12 +1015,91 @@
let parsed;
try { parsed = text ? JSON.parse(text) : null; } catch { parsed = text; }
addLog("POST (multipart)", url, res.status, parsed);
if (res.ok) {
recordingWav = null;
$("btn-resolve").disabled = true;
$("audio-preview").hidden = true;
$("btn-download").style.display = "none";
lastTtsConfirmationId = null;
$("rec-info").textContent = "Submitted; fetching the next pending confirmation…";
$("rec-info").className = "ok small";
await fetchPendingOnce();
if ($("auto-poll") && $("auto-poll").checked) {
$("voice-status").textContent = "Auto-polling…";
}
} else if (res.status === 422 && parsed && parsed.detail && typeof parsed.detail === "object") {
const d = parsed.detail;
if (d.message) {
let line = "Parsing failed: " + d.message;
if (typeof d.retry_remaining === "number") {
line += " (retry_remaining=" + d.retry_remaining + ")";
}
$("rec-info").textContent = line;
$("rec-info").className = "warn small";
}
}
};
// ============================================================
// Debug: two streams for one-click upload (stream 1 / stream 2)
// ============================================================
$("btn-dbg-pick-1").onclick = () => $("debug-vfile-1").click();
$("debug-vfile-1").addEventListener("change", (e) => {
const f = e.target.files && e.target.files[0];
if (!f) return;
$("debug-vpath-1").value = "./" + f.name;
$("debug-hint-1").textContent = "Selected: " + f.name;
});
$("btn-dbg-pick-2").onclick = () => $("debug-vfile-2").click();
$("debug-vfile-2").addEventListener("change", (e) => {
const f = e.target.files && e.target.files[0];
if (!f) return;
$("debug-vpath-2").value = "./" + f.name;
$("debug-hint-2").textContent = "Selected: " + f.name;
});
$("btn-debug-apply-cams").onclick = () => {
const a = ($("debug-cam-1").value || "or-cam-01").trim() || "or-cam-01";
const b = ($("debug-cam-2").value || "or-cam-02").trim() || "or-cam-02";
$("camera-ids").value = a + "," + b;
};
(function setupDebugVideoDrop() {
function bindStreamCard(el, vpathId, hintId) {
if (!el) return;
el.addEventListener("dragover", (ev) => {
ev.preventDefault();
el.style.outline = "1px dashed var(--accent)";
});
el.addEventListener("dragleave", () => {
el.style.outline = "";
});
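el.addEventListener("drop", (ev) => {
ev.preventDefault();
el.style.outline = "";
const f = ev.dataTransfer && ev.dataTransfer.files && ev.dataTransfer.files[0];
const looksVideo =
f &&
(/^video\//.test(f.type || "") ||
/\.(mp4|mov|mkv|avi|webm|m4v|mpeg|mpg)$/i.test(f.name || ""));
if (!looksVideo) {
$(hintId).textContent = "Please drop a video file";
return;
}
// Copy the dropped file into this card's hidden file input; the one-click
// upload reads the file from that input, not from the path text box.
const input = el.querySelector('input[type="file"]');
if (input && typeof DataTransfer === "function") {
const dt = new DataTransfer();
dt.items.add(f);
input.files = dt.files;
}
$(vpathId).value = "./" + f.name;
$(hintId).textContent = "Selected: " + f.name + " (drag and drop)";
});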
}
bindStreamCard($("debug-stream-1"), "debug-vpath-1", "debug-hint-1");
bindStreamCard($("debug-stream-2"), "debug-vpath-2", "debug-hint-2");
})();
// ============================================================
// Boot
// ============================================================
loadLabels();
$("base-url").addEventListener("change", () => { refreshOrchStatus(); });
refreshOrchStatus();
</script>
</body>
</html>