# PyTorchVideo Training (No SlowFast)

Your environment currently has a PyTorchVideo / torchvision mismatch that breaks `pytorchvideo.transforms`.
So we train with **PyTorchVideo decoding + a small custom transform pipeline** (no SlowFast).

## 1) Prereqs

- You already created CSV files (e.g. `train.csv`, `val.csv`) with lines of the form:
  - `relative/path/to/video.mp4 <label_int>`
- Your videos live under:
  - `/home/ubuntu/data/fish/fish_action_videos`
- Your CSV folder is:
  - `/home/ubuntu/projects/FishAction/data/fish/fish_action_training_dataset`

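
Before starting a long run, it can help to confirm that every CSV line parses and every listed video exists under the video folder. The following is a minimal sketch, not part of the training script: `check_split_csv` is a hypothetical helper name, and it assumes each line is a path followed by whitespace and an integer label, as in the format shown above.

```python
from collections import Counter
from pathlib import Path


def check_split_csv(csv_path, path_prefix):
    """Validate one split file whose lines look like `relative/path.mp4 <label_int>`.

    Returns (label_counts, problems). `problems` is a list of readable strings;
    an empty list means every line parsed and every video file was found.
    """
    label_counts = Counter()
    problems = []
    prefix = Path(path_prefix)

    with open(csv_path, encoding="utf-8") as handle:
        for line_no, raw in enumerate(handle, start=1):
            line = raw.strip()
            if not line:
                continue  # tolerate blank lines
            # Split on the last run of whitespace so paths containing spaces survive.
            parts = line.rsplit(maxsplit=1)
            if len(parts) != 2:
                problems.append(f"line {line_no}: expected '<path> <label>'")
                continue
            rel_path, label_text = parts
            try:
                label = int(label_text)
            except ValueError:
                problems.append(f"line {line_no}: label {label_text!r} is not an integer")
                continue
            if not (prefix / rel_path).is_file():
                problems.append(f"line {line_no}: missing video {rel_path}")
                continue
            label_counts[label] += 1

    return label_counts, problems
```

For example, `check_split_csv(csv_dir / "train.csv", "/home/ubuntu/data/fish/fish_action_videos")` returns the per-label clip counts plus a list of problem lines, which also makes class imbalance visible before training.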
## 2) Train (fine-tune pretrained X3D)

From repo root:

```bash
cd /home/ubuntu/projects/FishAction

python train_pytorchvideo_x3d.py \
  --csv_dir /home/ubuntu/projects/FishAction/data/fish/fish_action_training_dataset \
  --path_prefix /home/ubuntu/data/fish/fish_action_videos \
  --model x3d_m \
  --pretrained \
  --num_frames 16 \
  --sampling_rate 5 \
  --batch_size 4 \
  --epochs 30 \
  --num_workers 4 \
  --amp \
  --output_dir /home/ubuntu/projects/FishAction/checkpoints/ptv_x3d_m
```

Notes:
- `--pretrained` uses `torch.hub` and downloads the weights into its cache (`~/.cache/torch/hub/` by default) on first run.
- If you hit an out-of-memory (OOM) error, lower `--batch_size` to `2` or `1`.

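
To make `--num_frames 16` and `--sampling_rate 5` concrete, the sketch below works out which source frames one clip would cover and how many seconds of video that spans. It assumes the common convention that `sampling_rate` is the stride between sampled frames and that clip duration is `num_frames * sampling_rate / fps`. The 30 fps default and both function names are illustrative assumptions; the script's actual sampling logic is not shown here.

```python
def clip_frame_indices(num_frames=16, sampling_rate=5, start_frame=0):
    """Source-frame indices for one clip: `num_frames` frames, `sampling_rate` apart."""
    return [start_frame + i * sampling_rate for i in range(num_frames)]


def clip_duration_seconds(num_frames=16, sampling_rate=5, fps=30.0):
    """Seconds of video one clip spans, using num_frames * sampling_rate / fps."""
    # fps=30.0 is an assumed frame rate; substitute the real rate of your videos.
    return num_frames * sampling_rate / fps


indices = clip_frame_indices()
print(indices[:4], "...", indices[-1])
print(round(clip_duration_seconds(), 2))
```

With the flags above this gives indices 0, 5, 10, ..., 75, and at the assumed 30 fps each clip spans roughly 2.67 seconds of video. Videos shorter than that may need different values.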
## 3) Outputs

The script writes into `--output_dir`:
- `config.json`
- `checkpoint_last.pt`
- `checkpoint_best.pt`
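
A quick way to check a finished (or interrupted) run is to see which of these files exist and to read back `config.json`. This is a minimal sketch: `summarize_run` is a hypothetical helper, and it assumes only that `config.json` is ordinary JSON. The contents of the `.pt` files are not documented here, so the sketch does not open them.

```python
import json
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = ("config.json", "checkpoint_last.pt", "checkpoint_best.pt")


def summarize_run(output_dir):
    """Report which expected files exist in a run directory and load config.json if present."""
    run_dir = Path(output_dir)
    present = {name: (run_dir / name).is_file() for name in EXPECTED_FILES}
    config = None
    if present["config.json"]:
        with open(run_dir / "config.json", encoding="utf-8") as handle:
            config = json.load(handle)
    return present, config
```

For example, `summarize_run("/home/ubuntu/projects/FishAction/checkpoints/ptv_x3d_m")` returns a dict of file-presence flags and the parsed config. The checkpoints themselves can presumably be inspected with `torch.load(path, map_location="cpu")`, but their keys depend on how the script saves them.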