# Training Guide: Fine-tuning SlowFast for Fish Action Classification

This guide will help you fine-tune a pretrained SlowFast model on your fish action dataset.

## Prerequisites

1. ✅ You have generated CSV files using `prepare_fish_dataset.py`
2. ✅ Your videos are organized in `~/data/fish/fish_action_videos/`
3. ✅ CSV files are in `./data/fish/fish_action_training_dataset/`

## Step 1: Download Pretrained Model

Download a pretrained SlowFast model from the [Model Zoo](https://github.com/facebookresearch/SlowFast/blob/main/MODEL_ZOO.md).

### Recommended Models:

**SlowFast 8x8 R50 (Kinetics 400)** - Good balance of accuracy and speed:

```bash
# Create checkpoints directory
mkdir -p checkpoints

# Download pretrained model (Caffe2 format)
wget https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/kinetics400/SLOWFAST_8x8_R50.pkl \
  -O checkpoints/SLOWFAST_8x8_R50.pkl
```

Or download PyTorch format if available:

```bash
# PyTorch format (if available)
wget -O checkpoints/SLOWFAST_8x8_R50.pyth
```

## Step 2: Configure Your Training

Edit the config file `configs/fish_action_SLOWFAST_8x8_R50.yaml`:

1. **Set pretrained model path:**

   ```yaml
   TRAIN:
     CHECKPOINT_FILE_PATH: "checkpoints/SLOWFAST_8x8_R50.pkl"
     CHECKPOINT_TYPE: caffe2  # or "pytorch" if using .pyth file
   ```

2. **Verify dataset paths:**

   ```yaml
   DATA:
     PATH_TO_DATA_DIR: "/home/ubuntu/projects/FishAction/data/fish/fish_action_training_dataset"
     PATH_PREFIX: "/home/ubuntu/data/fish/fish_action_videos"
   ```

3. **Adjust batch size** based on your GPU memory:

   ```yaml
   TRAIN:
     BATCH_SIZE: 8  # Reduce if you get OOM errors
   ```

4. **Number of classes** is already set to 5:

   ```yaml
   MODEL:
     NUM_CLASSES: 5
   ```

## Step 3: Start Training

### Basic Training Command

```bash
cd /home/ubuntu/projects/FishAction
python slowfast/tools/run_net.py \
  --cfg configs/fish_action_SLOWFAST_8x8_R50.yaml \
  NUM_GPUS 1
```

### Training with Command-Line Overrides

You can override config values from the command line:

```bash
python slowfast/tools/run_net.py \
  --cfg configs/fish_action_SLOWFAST_8x8_R50.yaml \
  TRAIN.BATCH_SIZE 4 \
  NUM_GPUS 1 \
  SOLVER.MAX_EPOCH 30 \
  SOLVER.BASE_LR 0.005
```

### Multi-GPU Training

If you have multiple GPUs:

```bash
python slowfast/tools/run_net.py \
  --cfg configs/fish_action_SLOWFAST_8x8_R50.yaml \
  NUM_GPUS 2 \
  TRAIN.BATCH_SIZE 16
```

## Step 4: Monitor Training

Training logs will be saved to:

- Console output
- `checkpoints/fish_action/logs/` (if TensorBoard is enabled)

### Check Training Progress

```bash
# View latest checkpoint
ls -lh checkpoints/fish_action/

# View logs
tail -f checkpoints/fish_action/logs/*.log
```

## Step 5: Evaluate Model

After training, evaluate on validation set:

```bash
python slowfast/tools/run_net.py \
  --cfg configs/fish_action_SLOWFAST_8x8_R50.yaml \
  TRAIN.ENABLE False \
  TEST.ENABLE True \
  TEST.CHECKPOINT_FILE_PATH checkpoints/fish_action/checkpoints/checkpoint_epoch_00050.pyth \
  NUM_GPUS 1
```

## Step 6: Test on Test Set

```bash
python slowfast/tools/run_net.py \
  --cfg configs/fish_action_SLOWFAST_8x8_R50.yaml \
  TRAIN.ENABLE False \
  TEST.ENABLE True \
  TEST.CHECKPOINT_FILE_PATH checkpoints/fish_action/checkpoints/checkpoint_epoch_00050.pyth \
  DATA.PATH_TO_DATA_DIR /home/ubuntu/projects/FishAction/data/fish/fish_action_training_dataset \
  NUM_GPUS 1
```

## Troubleshooting

### Out of Memory (OOM) Errors

Reduce batch size:

```yaml
TRAIN:
  BATCH_SIZE: 4  # or even 2
```

Or reduce number of frames:

```yaml
DATA:
  NUM_FRAMES: 16  # instead of 32
```

### Model Not Loading

1. **Check checkpoint path:**

   ```bash
   ls -lh checkpoints/SLOWFAST_8x8_R50.pkl
   ```

2. **Verify checkpoint type:**
   - `.pkl` files are usually Caffe2 format
   - `.pyth` files are PyTorch format
   - Set `CHECKPOINT_TYPE` accordingly

3. **If using Caffe2 checkpoint:**

   ```yaml
   TRAIN:
     CHECKPOINT_TYPE: caffe2
   ```

### Dataset Not Found

1. **Verify CSV files exist:**

   ```bash
   ls -lh data/fish/fish_action_training_dataset/*.csv
   ```

2. **Check video paths in CSV:**

   ```bash
   head data/fish/fish_action_training_dataset/train.csv
   ```

3. **Verify PATH_PREFIX:**
   - Should be absolute path to video directory
   - Videos should be accessible at: `PATH_PREFIX + path_in_csv`

### Slow Training

1. **Increase NUM_WORKERS** (if you have CPU cores):

   ```yaml
   DATA_LOADER:
     NUM_WORKERS: 8
   ```

2. **Use mixed precision training** (if supported):

   ```yaml
   TRAIN:
     MIXED_PRECISION: True
   ```

## Tips for Better Results

1. **Learning Rate:** Start with 0.01 for fine-tuning, can go lower (0.001) if overfitting
2. **Epochs:** Monitor validation accuracy - stop if it plateaus
3. **Data Augmentation:** Already enabled in config (random crop, flip, etc.)
4. **Early Stopping:** Manually stop if validation accuracy doesn't improve

## Next Steps

After training:

1. Evaluate on test set
2. Use the best checkpoint for inference
3. Create inference script for new videos

## Example: Complete Training Session

```bash
# 1. Download pretrained model
mkdir -p checkpoints
wget https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/kinetics400/SLOWFAST_8x8_R50.pkl \
  -O checkpoints/SLOWFAST_8x8_R50.pkl

# 2. Edit config file (set checkpoint path and verify dataset paths)

# 3. Start training
python slowfast/tools/run_net.py \
  --cfg configs/fish_action_SLOWFAST_8x8_R50.yaml \
  NUM_GPUS 1

# 4. After training, evaluate
python slowfast/tools/run_net.py \
  --cfg configs/fish_action_SLOWFAST_8x8_R50.yaml \
  TRAIN.ENABLE False \
  TEST.ENABLE True \
  TEST.CHECKPOINT_FILE_PATH checkpoints/fish_action/checkpoints/checkpoint_epoch_00050.pyth \
  NUM_GPUS 1
```

## Your Classes

Based on your dataset preparation, your 5 classes are:

- 0: feeding
- 1: normal_underwater
- 2: normal_upperwater
- 3: scared_underwater
- 4: scared_upperwater

Check `data/fish/fish_action_training_dataset/label_map.txt` for the exact mapping.
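
To confirm that the label IDs above actually appear in your CSV files and that every video resolves under `PATH_PREFIX`, a small check script can help before you start a long training run. The sketch below is not part of the repository. The helper name `check_split` is hypothetical, and it assumes each CSV line has the form `<relative_video_path> <label>` with a single space as separator and that the splits are named `train.csv`, `val.csv`, and `test.csv`. Adjust the separator and file names if `prepare_fish_dataset.py` writes something different.

```python
import os
from collections import Counter


def check_split(csv_path, path_prefix, separator=" "):
    """Return (label_counts, missing_paths) for one split CSV.

    Assumption: each non-empty line is "<relative_video_path><separator><label>".
    """
    label_counts = Counter()
    missing = []
    with open(csv_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # Split on the last separator so paths that contain it still parse.
            rel_path, label = line.rsplit(separator, 1)
            label_counts[int(label)] += 1
            # Mirrors the "PATH_PREFIX + path_in_csv" rule from Troubleshooting.
            full_path = os.path.join(path_prefix, rel_path)
            if not os.path.isfile(full_path):
                missing.append(full_path)
    return label_counts, missing


if __name__ == "__main__":
    # Paths taken from the Prerequisites section; split names are assumptions.
    data_dir = "data/fish/fish_action_training_dataset"
    prefix = os.path.expanduser("~/data/fish/fish_action_videos")
    for split in ("train", "val", "test"):
        csv_path = os.path.join(data_dir, split + ".csv")
        if not os.path.isfile(csv_path):
            print(f"{split}: {csv_path} not found")
            continue
        counts, missing = check_split(csv_path, prefix)
        print(f"{split}: labels={dict(sorted(counts.items()))} missing={len(missing)}")
        for path in missing[:5]:
            print(f"  missing: {path}")
```

Run it from the project root. A label outside 0-4, or a non-empty list of missing files, usually points to a mismatch between the CSVs and `MODEL.NUM_CLASSES` or `DATA.PATH_PREFIX`.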
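
The evaluation commands in this guide hardcode `checkpoint_epoch_00050.pyth`, which will not exist if you trained for a different number of epochs (for example with `SOLVER.MAX_EPOCH 30`). As a minimal sketch, the hypothetical helper below picks the highest-numbered checkpoint in a directory, assuming files follow the `checkpoint_epoch_NNNNN.pyth` naming used in the commands above. Note that the latest checkpoint is not necessarily the best one by validation accuracy.

```python
import os
import re

# Matches the checkpoint naming used in this guide, e.g. checkpoint_epoch_00050.pyth
CHECKPOINT_RE = re.compile(r"^checkpoint_epoch_(\d+)\.pyth$")


def latest_checkpoint(checkpoint_dir):
    """Return the path of the highest-epoch checkpoint, or None if there is none."""
    best_epoch, best_name = -1, None
    for name in os.listdir(checkpoint_dir):
        match = CHECKPOINT_RE.match(name)
        if match and int(match.group(1)) > best_epoch:
            best_epoch, best_name = int(match.group(1)), name
    if best_name is None:
        return None
    return os.path.join(checkpoint_dir, best_name)


if __name__ == "__main__":
    # Directory taken from the evaluation commands in this guide.
    ckpt_dir = "checkpoints/fish_action/checkpoints"
    if os.path.isdir(ckpt_dir):
        print(latest_checkpoint(ckpt_dir))
    else:
        print(f"{ckpt_dir} does not exist yet")
```

The printed path can be passed as the value of `TEST.CHECKPOINT_FILE_PATH` in the evaluation command.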