Training Guide: Fine-tuning SlowFast for Fish Action Classification
This guide will help you fine-tune a pretrained SlowFast model on your fish action dataset.
Prerequisites
- ✅ You have generated CSV files using prepare_fish_dataset.py
- ✅ Your videos are organized in ~/data/fish/fish_action_videos/
- ✅ CSV files are in ./data/fish/fish_action_training_dataset/
Step 1: Download Pretrained Model
Download a pretrained SlowFast model from the Model Zoo.
Recommended Models:
SlowFast 8x8 R50 (Kinetics 400) - Good balance of accuracy and speed:
# Create checkpoints directory
mkdir -p checkpoints
# Download pretrained model (Caffe2 format)
wget https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/kinetics400/SLOWFAST_8x8_R50.pkl \
-O checkpoints/SLOWFAST_8x8_R50.pkl
Or download PyTorch format if available:
# PyTorch format (if available)
wget <pytorch_model_url> -O checkpoints/SLOWFAST_8x8_R50.pyth
Step 2: Configure Your Training
Edit the config file configs/fish_action_SLOWFAST_8x8_R50.yaml:
- Set pretrained model path:
  TRAIN:
    CHECKPOINT_FILE_PATH: "checkpoints/SLOWFAST_8x8_R50.pkl"
    CHECKPOINT_TYPE: caffe2  # or "pytorch" if using .pyth file
- Verify dataset paths:
  DATA:
    PATH_TO_DATA_DIR: "/home/ubuntu/projects/FishAction/data/fish/fish_action_training_dataset"
    PATH_PREFIX: "/home/ubuntu/data/fish/fish_action_videos"
- Adjust batch size based on your GPU memory:
  TRAIN:
    BATCH_SIZE: 8  # Reduce if you get OOM errors
- Number of classes is already set to 5:
  MODEL:
    NUM_CLASSES: 5
Step 3: Start Training
Basic Training Command
cd /home/ubuntu/projects/FishAction
python slowfast/tools/run_net.py \
--cfg configs/fish_action_SLOWFAST_8x8_R50.yaml \
NUM_GPUS 1
Training with Command-Line Overrides
You can override config values from the command line:
python slowfast/tools/run_net.py \
--cfg configs/fish_action_SLOWFAST_8x8_R50.yaml \
TRAIN.BATCH_SIZE 4 \
NUM_GPUS 1 \
SOLVER.MAX_EPOCH 30 \
SOLVER.BASE_LR 0.005
Multi-GPU Training
If you have multiple GPUs:
python slowfast/tools/run_net.py \
--cfg configs/fish_action_SLOWFAST_8x8_R50.yaml \
NUM_GPUS 2 \
TRAIN.BATCH_SIZE 16
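When the batch size changes along with the GPU count, the learning rate is often rescaled in proportion (the linear scaling rule). The sketch below shows that arithmetic. It assumes a reference point of BATCH_SIZE 8 with BASE_LR 0.01, which are values mentioned elsewhere in this guide, not values read from your config. `scaled_lr` and `per_gpu_batch` are hypothetical helpers, not part of PySlowFast.

```python
def scaled_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Scale the learning rate in proportion to the total batch size."""
    if base_batch <= 0 or new_batch <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_batch / base_batch


def per_gpu_batch(batch_size: int, num_gpus: int) -> int:
    """Clips per GPU; the total batch is expected to split evenly across GPUs."""
    if num_gpus <= 0 or batch_size % num_gpus != 0:
        raise ValueError("TRAIN.BATCH_SIZE should be a multiple of NUM_GPUS")
    return batch_size // num_gpus


if __name__ == "__main__":
    # Assumed reference point: BATCH_SIZE 8 with BASE_LR 0.01.
    print("suggested SOLVER.BASE_LR:", scaled_lr(0.01, 8, 16))
    print("clips per GPU:", per_gpu_batch(16, 2))
```

The result can be passed on the command line as `SOLVER.BASE_LR <value>`. Treat it as a starting point and check the validation curve, since fine-tuning on a small dataset does not always follow the rule.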
Step 4: Monitor Training
Training logs will be saved to:
- Console output
- checkpoints/fish_action/logs/ (if TensorBoard is enabled)
Check Training Progress
# View latest checkpoint
ls -lh checkpoints/fish_action/
# View logs
tail -f checkpoints/fish_action/logs/*.log
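To avoid reading the epoch number off the `ls` output by hand, a small script can pick the newest checkpoint. This is a sketch that assumes checkpoints follow the `checkpoint_epoch_NNNNN.pyth` naming used in the evaluation commands of this guide and live in `checkpoints/fish_action/checkpoints`. `latest_checkpoint` is a hypothetical helper.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed file naming, e.g. checkpoint_epoch_00050.pyth
CKPT_PATTERN = re.compile(r"checkpoint_epoch_(\d+)\.pyth$")


def latest_checkpoint(ckpt_dir: str) -> Optional[Path]:
    """Return the checkpoint file with the highest epoch number, or None."""
    best_epoch, best_path = -1, None
    for path in Path(ckpt_dir).glob("checkpoint_epoch_*.pyth"):
        match = CKPT_PATTERN.search(path.name)
        if match and int(match.group(1)) > best_epoch:
            best_epoch, best_path = int(match.group(1)), path
    return best_path


if __name__ == "__main__":
    # Directory taken from the evaluation commands in this guide.
    print(latest_checkpoint("checkpoints/fish_action/checkpoints"))
```

The printed path can be used as the value of `TEST.CHECKPOINT_FILE_PATH`. Note that the newest checkpoint is not necessarily the one with the best validation accuracy.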
Step 5: Evaluate Model
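After training, evaluate a saved checkpoint (adjust the epoch in the file name to match your run). Note that in upstream PySlowFast, TEST.ENABLE True reads test.csv; validation accuracy on val.csv is logged during training: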
python slowfast/tools/run_net.py \
--cfg configs/fish_action_SLOWFAST_8x8_R50.yaml \
TRAIN.ENABLE False \
TEST.ENABLE True \
TEST.CHECKPOINT_FILE_PATH checkpoints/fish_action/checkpoints/checkpoint_epoch_00050.pyth \
NUM_GPUS 1
Step 6: Test on Test Set
python slowfast/tools/run_net.py \
--cfg configs/fish_action_SLOWFAST_8x8_R50.yaml \
TRAIN.ENABLE False \
TEST.ENABLE True \
TEST.CHECKPOINT_FILE_PATH checkpoints/fish_action/checkpoints/checkpoint_epoch_00050.pyth \
DATA.PATH_TO_DATA_DIR /home/ubuntu/projects/FishAction/data/fish/fish_action_training_dataset \
NUM_GPUS 1
Troubleshooting
Out of Memory (OOM) Errors
Reduce batch size:
TRAIN:
  BATCH_SIZE: 4  # or even 2
Or reduce number of frames:
DATA:
  NUM_FRAMES: 16  # instead of 32
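Reducing NUM_FRAMES shrinks the input to both pathways, which is why it saves memory. The sketch below works through the arithmetic. It assumes SLOWFAST.ALPHA is 4 and DATA.SAMPLING_RATE is 2; check your config for the actual values. `pathway_frames` and `raw_frame_span` are hypothetical helpers for illustration only.

```python
def pathway_frames(num_frames: int, alpha: int = 4) -> dict:
    """Frames seen by each pathway for one clip.

    The fast pathway takes every sampled frame; the slow pathway
    takes one frame out of every `alpha`.
    """
    if num_frames < alpha:
        raise ValueError("NUM_FRAMES should be at least ALPHA")
    return {"fast": num_frames, "slow": num_frames // alpha}


def raw_frame_span(num_frames: int, sampling_rate: int = 2) -> int:
    """Number of raw video frames one clip spans."""
    return num_frames * sampling_rate


if __name__ == "__main__":
    for n in (32, 16):
        print(n, pathway_frames(n), "raw span:", raw_frame_span(n))
```

Under these assumptions, halving NUM_FRAMES from 32 to 16 also halves the stretch of video each clip covers, so short actions may still fit but longer behaviors get less temporal context.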
Model Not Loading
- Check checkpoint path:
  ls -lh checkpoints/SLOWFAST_8x8_R50.pkl
- Verify checkpoint type:
  - .pkl files are usually Caffe2 format
  - .pyth files are PyTorch format
  - Set CHECKPOINT_TYPE accordingly
- If using Caffe2 checkpoint:
  TRAIN:
    CHECKPOINT_TYPE: caffe2
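Because the file extension is only a hint, it can help to look inside the file before choosing CHECKPOINT_TYPE. The following is a heuristic sketch, not a definitive test. It assumes that recent `torch.save` output is a zip archive and that Caffe2-style checkpoints are plain pickles with a top-level `"blobs"` key. `guess_checkpoint_type` is a hypothetical helper. Only run it on files you trust, since unpickling can execute code.

```python
import pickle
import zipfile


def guess_checkpoint_type(path: str) -> str:
    """Guess 'pytorch', 'caffe2', or 'unknown' from the file contents."""
    # Assumption: recent torch.save() output is a zip archive.
    if zipfile.is_zipfile(path):
        return "pytorch"
    try:
        with open(path, "rb") as f:
            # latin1 lets Python 3 read pickles written by Python 2.
            data = pickle.load(f, encoding="latin1")
    except Exception:
        # Not a plain pickle, or a required module (e.g. numpy) is missing.
        return "unknown"
    # Assumption: Caffe2-style checkpoints keep weights under "blobs".
    if isinstance(data, dict) and "blobs" in data:
        return "caffe2"
    return "unknown"


if __name__ == "__main__":
    import sys

    for ckpt in sys.argv[1:]:
        print(ckpt, "->", guess_checkpoint_type(ckpt))
```

An "unknown" result does not mean the file is broken. Older PyTorch checkpoints, for example, are not zip archives, so in that case try loading with each CHECKPOINT_TYPE and read the error message.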
Dataset Not Found
- Verify CSV files exist:
  ls -lh data/fish/fish_action_training_dataset/*.csv
- Check video paths in CSV:
  head data/fish/fish_action_training_dataset/train.csv
- Verify PATH_PREFIX:
  - Should be absolute path to video directory
  - Videos should be accessible at: PATH_PREFIX + path_in_csv
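The PATH_PREFIX + path_in_csv check can be automated for every row instead of spot-checking with `head`. This is a minimal sketch that assumes each CSV row has the form `<relative path> <label>` separated by a single space; adjust `separator` if prepare_fish_dataset.py writes something else. `missing_videos` is a hypothetical helper.

```python
import os


def missing_videos(csv_path: str, path_prefix: str, separator: str = " ") -> list:
    """Return (line_number, full_path) pairs for videos that do not exist."""
    missing = []
    with open(csv_path, "r", encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue
            # Assumed row layout: "<relative/video/path><separator><label>".
            parts = line.rsplit(separator, 1)
            if len(parts) != 2:
                raise ValueError(f"line {line_no}: expected '<path>{separator}<label>'")
            rel_path = parts[0]
            # If rel_path is absolute, os.path.join ignores path_prefix.
            full_path = os.path.join(path_prefix, rel_path)
            if not os.path.isfile(full_path):
                missing.append((line_no, full_path))
    return missing


# Example call with the paths used in this guide:
# missing_videos(
#     "data/fish/fish_action_training_dataset/train.csv",
#     "/home/ubuntu/data/fish/fish_action_videos",
# )
```

An empty list means every listed video resolves to a file. Run it for each CSV split before starting a long training job.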
Slow Training
- Increase NUM_WORKERS (if you have CPU cores):
  DATA_LOADER:
    NUM_WORKERS: 8
- Use mixed precision training (if supported):
  TRAIN:
    MIXED_PRECISION: True
Tips for Better Results
- Learning Rate: Start with 0.01 for fine-tuning; try a lower value (0.001) if the model overfits
- Epochs: Monitor validation accuracy - stop if it plateaus
- Data Augmentation: Already enabled in config (random crop, flip, etc.)
- Early Stopping: Manually stop if validation accuracy doesn't improve
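The "stop if it plateaus" advice can be made concrete with a patience rule. The sketch below assumes you copy the per-evaluation validation accuracies from the training log into a list by hand. `should_stop`, `patience`, and `min_delta` are illustrative names, not PySlowFast options.

```python
def should_stop(val_accuracies, patience: int = 5, min_delta: float = 0.0) -> bool:
    """True if the last `patience` evaluations did not beat the earlier best."""
    if patience <= 0:
        raise ValueError("patience must be positive")
    if len(val_accuracies) <= patience:
        # Not enough history yet to judge a plateau.
        return False
    best_before = max(val_accuracies[:-patience])
    recent_best = max(val_accuracies[-patience:])
    return recent_best <= best_before + min_delta


if __name__ == "__main__":
    # Made-up accuracies for illustration only.
    history = [0.50, 0.60, 0.70, 0.69, 0.70, 0.68]
    print(should_stop(history, patience=3))
```

A larger `patience` tolerates noisier validation curves, which is common with small validation sets.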
Next Steps
After training:
- Evaluate on test set
- Use the best checkpoint for inference
- Create inference script for new videos
Example: Complete Training Session
# 1. Download pretrained model
mkdir -p checkpoints
wget https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/kinetics400/SLOWFAST_8x8_R50.pkl \
-O checkpoints/SLOWFAST_8x8_R50.pkl
# 2. Edit config file (set checkpoint path and verify dataset paths)
# 3. Start training
python slowfast/tools/run_net.py \
--cfg configs/fish_action_SLOWFAST_8x8_R50.yaml \
NUM_GPUS 1
# 4. After training, evaluate
python slowfast/tools/run_net.py \
--cfg configs/fish_action_SLOWFAST_8x8_R50.yaml \
TRAIN.ENABLE False \
TEST.ENABLE True \
TEST.CHECKPOINT_FILE_PATH checkpoints/fish_action/checkpoints/checkpoint_epoch_00050.pyth \
NUM_GPUS 1
Your Classes
Based on your dataset preparation, your 5 classes are:
- 0: feeding
- 1: normal_underwater
- 2: normal_upperwater
- 3: scared_underwater
- 4: scared_upperwater
Check data/fish/fish_action_training_dataset/label_map.txt for the exact mapping.
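Once the mapping is confirmed, it is worth checking how many clips each class has, since a heavily imbalanced split affects how accuracy should be read. This sketch hard-codes the five class names listed above and assumes CSV rows of the form `<path> <label>` separated by a single space. Treat label_map.txt as the authoritative mapping. `class_counts` is a hypothetical helper.

```python
from collections import Counter

# Copied from the list above; label_map.txt remains the source of truth.
CLASS_NAMES = {
    0: "feeding",
    1: "normal_underwater",
    2: "normal_upperwater",
    3: "scared_underwater",
    4: "scared_upperwater",
}


def class_counts(csv_path: str, separator: str = " ") -> Counter:
    """Count clips per class name in a '<path><separator><label>' CSV."""
    counts = Counter({name: 0 for name in CLASS_NAMES.values()})
    with open(csv_path, "r", encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue
            label = int(line.rsplit(separator, 1)[-1])
            if label not in CLASS_NAMES:
                raise ValueError(f"line {line_no}: label {label} is outside 0-4")
            counts[CLASS_NAMES[label]] += 1
    return counts


# Example call with the path used in this guide:
# class_counts("data/fish/fish_action_training_dataset/train.csv")
```

A label outside 0-4 raises an error, which usually points to a mismatch between the CSV and MODEL.NUM_CLASSES.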