hsz a5951461c4 first commit: operating-room consumables offline inference code

Co-authored-by: Cursor <cursoragent@cursor.com>

2026-05-26 14:22:44 +08:00

README.md

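Operating-Room Consumables Offline Inference Package

Main entry point: python main.py (reads configs/default_config.yaml, or the YAML file given with --config)

Features

Input: main-view MP4 (io.video) + product Excel file (io.excel; whitelist and product codes)
Output: tab-separated results file (io.out, default output/result.txt)
Pipeline: VideoSwin features → ActionFormer temporal segmentation → per-segment YOLO consumable inference → optional merging of adjacent film-tearing segments → doctor info on the last line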
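
The optional merge step in the pipeline could look roughly like the sketch below. This README does not document the package's actual merge rule, so everything here is an assumption for illustration: the `Segment` type, the `"tear_film"` label, and the `max_gap_sec` threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start_sec: float
    end_sec: float
    label: str  # hypothetical action label, e.g. "tear_film"


def merge_adjacent(segments, label, max_gap_sec=1.0):
    """Merge consecutive segments carrying `label` when the gap between them is small."""
    merged = []
    for seg in sorted(segments, key=lambda s: s.start_sec):
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and prev.label == label
            and seg.label == label
            and seg.start_sec - prev.end_sec <= max_gap_sec
        ):
            # Extend the previous segment instead of emitting a new one.
            merged[-1] = Segment(prev.start_sec, max(prev.end_sec, seg.end_sec), label)
        else:
            merged.append(seg)
    return merged
```

Segments with other labels, or separated by more than `max_gap_sec`, pass through unchanged.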

The output format is identical to that of the 5.17 development package (12-column TSV plus a final 医生信息： ("doctor info") line).

Installation

cd /path/to/this-directory

# 1. First install the torch / torchvision builds that match your CUDA version, following https://pytorch.org
# 2. Install dependencies
pip install -r requirements.txt
pip install -e code/actionformer_release/libs/utils

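Running

Place the MP4 to be analyzed and the Excel file in input/ (or set absolute paths in the YAML)
Confirm that all 5 model files are present in weights/
Confirm that doctor_identity_package/doctor_info.pth exists (doctor identification is enabled by default)
Recommended: copy configs/run_tracking_template.yaml and edit io.video / io.out
Run: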

python main.py --config configs/run_your_video.yaml
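
The checks listed above can be automated before launching a long run. The following is a minimal sketch, not part of the package: the `preflight` helper is hypothetical, and because this README does not name the 5 weight files, it only counts files in weights/ rather than checking specific names.

```python
from pathlib import Path


def preflight(root, video, expected_weights=5):
    """Return a list of human-readable problems; an empty list means ready to run."""
    root = Path(root)
    problems = []

    # Count non-hidden files in weights/ (file names are not documented here).
    weights_dir = root / "weights"
    n_weights = 0
    if weights_dir.is_dir():
        n_weights = sum(
            1 for p in weights_dir.iterdir()
            if p.is_file() and not p.name.startswith(".")
        )
    if n_weights < expected_weights:
        problems.append(f"weights/: found {n_weights} files, expected {expected_weights}")

    # Doctor identification is enabled by default and needs this file.
    doctor_info = root / "doctor_identity_package" / "doctor_info.pth"
    if not doctor_info.is_file():
        problems.append(f"missing {doctor_info}")

    # Relative video paths are resolved against the package root.
    video_path = Path(video)
    if not video_path.is_absolute():
        video_path = root / video_path
    if not video_path.is_file():
        problems.append(f"missing video {video_path}")

    return problems
```

Calling `preflight(".", "input/remuxed/xxx_h264.mp4")` from the package root and printing the returned list gives a quick readiness report.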

HEVC video (4K main view)

VideoSwin feature extraction may fail to decode original HEVC footage; transcode to H.264 first:

./scripts/remux_hevc.sh /path/to/source.mp4
# Default output: input/remuxed/<stem>_h264.mp4

Then point io.video in the YAML at the transcoded file.
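
To decide whether a file needs this step, the codec can be queried up front. The sketch below assumes `ffprobe` (from FFmpeg) is on the PATH, which this README does not state; the helper names are hypothetical, and `default_remux_path` only mirrors the default output name documented above.

```python
import subprocess
from pathlib import Path


def video_codec(path):
    """Return the codec name of the first video stream, as reported by ffprobe."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=codec_name",
         "-of", "default=noprint_wrappers=1:nokey=1", str(path)],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()


def needs_transcode(codec_name):
    # ffprobe reports H.265 streams as "hevc".
    return codec_name.lower() == "hevc"


def default_remux_path(source, out_dir="input/remuxed"):
    """Mirror the documented default output name: input/remuxed/<stem>_h264.mp4."""
    return Path(out_dir) / f"{Path(source).stem}_h264.mp4"
```

If `needs_transcode(video_codec(src))` is true, run scripts/remux_hevc.sh on `src` and use `default_remux_path(src)` as the io.video value.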

Debug (time segments taken from Excel, skipping ActionFormer)

python main_debug.py \
  --video input/remuxed/xxx_h264.mp4 \
  --excel input/视频中的商品信息表.xlsx \
  --out output/result_debug.txt \
  --config configs/run_tracking_template.yaml

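Visualization (optional)

After the main pipeline finishes, generate an MP4 annotated with hand boxes and consumable labels as a separate step: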

python visualize_result_video.py \
  --video input/remuxed/xxx_h264.mp4 \
  --result-txt output/result.txt \
  --out-video output/result_vis.mp4

Output format

By default, a 12-column TSV (tab-separated); the last line of the file is 医生信息：... ("doctor info").

rank	start_sec	end_sec	product_id_top1	top1_name	top1_conf	...
医生信息：付玉峰 (id=24503, conf=0.8552)
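
A result file in this layout could be read back as sketched below. The parser makes no assumption about the column names beyond what is shown above and returns each TSV row as a list of strings; whether the first row is a header is not stated here, so that is left to the caller. The regular expression for the doctor line is inferred from the single sample line and should be treated as an assumption.

```python
import re

DOCTOR_PREFIX = "医生信息："
# Pattern inferred from the one sample line in this README (assumption).
DOCTOR_RE = re.compile(
    r"^(?P<name>.+?)\s*\(id=(?P<id>\d+),\s*conf=(?P<conf>[0-9.]+)\)$"
)


def parse_result(text):
    """Split result text into TSV rows (lists of strings) and the doctor line."""
    rows, doctor = [], None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith(DOCTOR_PREFIX):
            raw = line[len(DOCTOR_PREFIX):].strip()
            m = DOCTOR_RE.match(raw)
            # Fall back to the raw string if the line does not match the sample shape.
            doctor = (
                {"name": m["name"], "id": int(m["id"]), "conf": float(m["conf"])}
                if m else {"raw": raw}
            )
        else:
            rows.append(line.split("\t"))
    return rows, doctor
```

Reading the file with `open("output/result.txt", encoding="utf-8").read()` and passing the text to `parse_result` yields the rows and, when doctor identification is enabled, a small dict for the final line.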

To disable doctor identification, set doctor_identity.enabled: false in the YAML.
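
The YAML keys mentioned in this README (io.video, io.out, doctor_identity.enabled) could also be set from a script when preparing a per-video config. This is a minimal sketch under two assumptions: that the dotted names correspond to nested YAML mappings, and that PyYAML is installed. Both helpers are hypothetical and not part of the package.

```python
import copy


def set_dotted(cfg, dotted_key, value):
    """Return a copy of the nested dict `cfg` with `dotted_key` (e.g. "io.video") set."""
    out = copy.deepcopy(cfg)
    node = out
    *parents, leaf = dotted_key.split(".")
    for key in parents:
        # Create intermediate mappings when they are missing.
        node = node.setdefault(key, {})
    node[leaf] = value
    return out


def write_run_config(template_path, out_path, overrides):
    """Load a YAML template, apply dotted-key overrides, and write a new file."""
    import yaml  # PyYAML; assumed to be installed, not confirmed by this README

    with open(template_path, encoding="utf-8") as f:
        cfg = yaml.safe_load(f) or {}
    for key, value in overrides.items():
        cfg = set_dotted(cfg, key, value)
    with open(out_path, "w", encoding="utf-8") as f:
        yaml.safe_dump(cfg, f, allow_unicode=True, sort_keys=False)
```

For example, `write_run_config("configs/run_tracking_template.yaml", "configs/run_your_video.yaml", {"io.video": "input/remuxed/xxx_h264.mp4", "doctor_identity.enabled": False})` would produce a config with doctor identification turned off. Note that round-tripping through `safe_load` / `safe_dump` drops any comments in the template.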

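Directory structure

├── main.py                      # Main pipeline entry point
├── main_debug.py                # Debug entry point (time segments from Excel)
├── visualize_result_video.py    # Optional visualization
├── configs/
│   ├── default_config.yaml
│   └── run_tracking_template.yaml
├── scripts/remux_hevc.sh        # HEVC → H.264 transcoding
├── src/                         # Configuration and orchestration
├── weights/                     # 5 model files
├── input/remuxed/               # Transcoded videos
├── output/                      # Result txt files
├── doctor_identity_package/
└── code/                        # Algorithm subtree (normally do not modify)

Not included: simulated RTSP streaming, training scripts.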
