operating-room-monitor-server/docs/Docker部署.md
op 156e4ce095 Align API container UID with host and harden RTSP slice readiness.
Run compose api as HOST_UID/GID with cache under /tmp, poll slice files for
ready_event when ffmpeg stderr is silent, invoke batch via venv python, exclude
logs from build context, and document Docker cache/VLC troubleshooting.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-27 10:57:27 +08:00

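# Docker Compose Deployment (NVIDIA GPU)
This document describes how to deploy the full backend (FastAPI + PostgreSQL + MinIO) with Docker Compose on an **NVIDIA GPU server**, and how to start the demo client and the voice confirmation page **manually**.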
## Repository Layout
```
operation-room-monitor/
  backend/   # API + DB + MinIO (docker compose)
  clients/   # standalone frontends (started manually)
  docs/      # documentation
```
## Architecture
| Component | How it is deployed | Default port(s) |
|------|----------|----------|
| API + PostgreSQL + MinIO | `cd backend && docker compose up -d --build` | 38080 / 45432 / 19000 |
| Demo client | `clients/demo-client/start.sh` | 38081 |
| Voice confirmation page | `clients/voice-confirmation/start.sh` | 8080 |
---
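## 1. Prerequisites
- Docker Compose V2, the NVIDIA driver, and the NVIDIA Container Toolkit
- Copy `backend/.env.example` to `backend/.env` and fill in the values
- Algorithm subprocess package: `backend/algorithm_subprocesses/5.15/` (contains `main.py` and `weights/`; it is `COPY`ed into the container at image build time, so do not exclude the whole directory in `.dockerignore`)
- CJK fonts for annotated videos: the image already installs `fonts-noto-cjk` and `fonts-wqy-microhei` (used by `visualize_result_video.py` to draw consumable labels)
- Doctor recognition (MediaPipe Pose): the image already installs Mesa/GLVND libraries such as `libgles2`, `libegl1`, `libegl-mesa0`, `libglx-mesa0`, and `libgl1-mesa-dri`; the build stage runs `import mediapipe` to verify that `libGLESv2.so.2` is available. The subprocess forces the CPU delegate. If an error about `libGLESv2.so.2` still appears, run **`docker compose build --no-cache api`** and restart (do not reuse an old tarball image)
- Optional fallback weights: `backend/app/resources/actionformer_epoch_045.pth.tar`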
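
The list above can be partly checked before the first build. The following is a minimal sketch, not part of the repository: the function names `check_algo_package`, `report`, and `preflight` are hypothetical, and the script assumes it is run from the repository root. It does not verify the NVIDIA Container Toolkit itself; the GPU verification step after startup covers that.

```bash
# Returns 0 if the algorithm package directory contains main.py and weights/.
check_algo_package() {
  [ -f "$1/main.py" ] && [ -d "$1/weights" ]
}

# Runs a command quietly and prints an OK/MISSING line for the given label.
report() {
  label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "OK       $label"
  else
    echo "MISSING  $label"
    return 1
  fi
}

# Checks the prerequisites that can be tested from the host shell.
# Returns non-zero if any check fails.
preflight() {
  backend_dir="${1:-backend}"
  failed=0
  report "Docker Compose V2" docker compose version || failed=1
  report "NVIDIA driver (nvidia-smi)" nvidia-smi || failed=1
  report "$backend_dir/.env" test -f "$backend_dir/.env" || failed=1
  report "algorithm package 5.15" \
    check_algo_package "$backend_dir/algorithm_subprocesses/5.15" || failed=1
  return "$failed"
}
```

Calling `preflight backend` from the repository root prints one line per check and returns a non-zero status if anything is missing.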
---
## 2. Start the Backend
```bash
cd backend
docker compose up -d --build
```
Health check:
```bash
curl -sf http://127.0.0.1:38080/health
```
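
Right after `docker compose up`, the API may need some time before `/health` responds. A small retry helper can wait for it instead of repeating the `curl` by hand. This is a sketch only: `wait_until` is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary.

```bash
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_until <attempts> <delay_seconds> <command...>
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (up to 30 attempts, 2 seconds apart):
#   wait_until 30 2 curl -sf http://127.0.0.1:38080/health && echo "API is up"
```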
GPU verification:
```bash
docker compose exec api python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"
```
Stop / reset:
```bash
docker compose down
docker compose down -v # also deletes the PostgreSQL / MinIO volumes
```
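### API image build fails: `invalid tar header` / `unpigz: corrupted`
If `uv sync` has already succeeded but the error appears during the **exporting / unpacking** stage, the cause is usually a **corrupted local Docker layer cache or storage**, not the Dockerfile.
Work through the following steps in order: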
```bash
cd backend
chmod +x scripts/rebuild-api-image.sh
# Clear the cache and rebuild (recommended)
./scripts/rebuild-api-image.sh
# If that still fails: restart Docker, then run it again
RESTART_DOCKER=1 ./scripts/rebuild-api-image.sh
# If it fails again: fall back to the legacy builder (no BuildKit)
COMPOSE_DOCKER_CLI_BUILD=0 DOCKER_BUILDKIT=0 docker compose build api --no-cache
docker compose up -d --force-recreate api
```
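Equivalent manual steps: `docker builder prune -af`, then `docker rmi -f backend-api:latest`, then `docker compose build api --no-cache`.
Make sure the root partition has enough free space (at least 20 GB recommended); when space is short, exporting large layers is also prone to corruption.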
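
The free-space check can be scripted. The sketch below uses the POSIX `df -Pk` output format; `has_free_gb` is a hypothetical helper name, and it treats 1 GB as 1024 × 1024 KiB blocks.

```bash
# Returns 0 if the filesystem holding <path> has at least <min_gb> GiB available.
# Usage: has_free_gb <path> <min_gb>
has_free_gb() {
  path="$1"
  min_gb="$2"
  # With -P the data is on line 2; column 4 is the available space in 1024-byte blocks.
  avail_kb="$(df -Pk "$path" | awk 'NR == 2 { print $4 }')"
  [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Example:
#   has_free_gb / 20 || echo "less than 20 GB free on the root partition"
```

If Docker's data directory lives on a separate filesystem, passing that path instead of `/` checks the volume that actually holds the image layers.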
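### RTSP slices cannot be opened with VLC on the host
By default the API container writes to `./logs` as **root**, so the slices are owned by `root:root`. A regular user can still read them with `cat`, but sandboxed applications such as the **Snap build of VLC** often report Permission denied.
In `backend/.env`, set the UID/GID to match the host (see `HOST_UID` / `HOST_GID` / `DOCKER_GID` in `.env.example`), then recreate the API container: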
```bash
cd backend
docker compose up -d --force-recreate api
```
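
The values for these variables can be read from the host with `id`. The helper below is a sketch: `host_id_lines` is a hypothetical name, and it assumes that `DOCKER_GID` means the GID of the host's `docker` group, which should be confirmed against `.env.example`.

```bash
# Print .env lines describing the current host user.
host_id_lines() {
  echo "HOST_UID=$(id -u)"
  echo "HOST_GID=$(id -g)"
  # Assumption: DOCKER_GID is the GID of the host's "docker" group.
  if command -v getent >/dev/null 2>&1; then
    docker_gid="$(getent group docker | cut -d: -f3 || true)"
    if [ -n "$docker_gid" ]; then
      echo "DOCKER_GID=$docker_gid"
    fi
  fi
}

# Example: review the output, then copy the lines into backend/.env
#   host_id_lines
```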
Slices that are **already** owned by root need a one-time ownership fix (optional):
```bash
sudo chown -R "$(id -u):$(id -g)" backend/logs/rtsp_segments
```
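
To see whether any slices still have a different owner, before or after the `chown`, a `find` filter on the file owner is enough. `foreign_owned` is a hypothetical helper name used only for this sketch.

```bash
# List everything under <dir> that is not owned by the current user.
# Usage: foreign_owned <dir>
foreign_owned() {
  find "$1" ! -user "$(id -u)" -print
}

# Example (run from the repository root):
#   foreign_owned backend/logs/rtsp_segments | head
```

Empty output means every file and directory under the given path already belongs to the current user.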
---
## 3. Start the Clients Manually
```bash
# Run each command from the repository root, in its own terminal
cd clients/demo-client && ./start.sh
cd clients/voice-confirmation && ./start.sh
```
In the browser, set the Base URL to `http://<GPU-server-IP>:38080`.
---
## 4. Related Documents
- [部署版使用指南.md](部署版使用指南.md)
- [客户端手术通信接口说明.md](客户端手术通信接口说明.md)
- [clients/demo-client/README.md](../clients/demo-client/README.md)
- [clients/voice-confirmation/README.md](../clients/voice-confirmation/README.md)
- [离线镜像tarball部署.md](离线镜像tarball部署.md)