# Fish Action Dataset Preparation

This folder contains scripts to prepare your fish action video dataset for SlowFast training.

## Important: No Video Pre-splitting Required

**SlowFast does NOT require you to pre-split videos into clips.** The framework automatically:

- Loads your full-length videos
- Samples short clips (typically 2-4 seconds) during training
- Randomly samples one clip per video for training/validation
- Uniformly samples multiple clips per video for testing

The clip duration is determined by configuration parameters:

- `NUM_FRAMES`: Number of frames to sample (e.g., 8, 16, 32, 64)
- `SAMPLING_RATE`: Interval between sampled frames (e.g., 2, 8)
- `TARGET_FPS`: Target frames per second (typically 30)

Example: With `NUM_FRAMES=32` and `SAMPLING_RATE=2`, the clip duration is approximately:
`(32-1) × 2 / 30 ≈ 2.07 seconds`

## Dataset Structure

Your videos should be organized as follows:

```
~/data/fish/fish_action_videos/
├── class1/
│   ├── video1.mp4
│   ├── video2.mp4
│   └── ...
├── class2/
│   ├── video1.mp4
│   ├── video2.mp4
│   └── ...
└── ...
```

Each subfolder represents a class, and contains multiple video files.

## Usage

Run the preparation script:

```bash
python dataset/prepare_fish_dataset.py \
    --video_dir ~/data/fish/fish_action_videos \
    --output_dir ./dataset
```

### Options

- `--video_dir`: Root directory containing class subfolders (default: `~/data/fish/fish_action_videos`)
- `--output_dir`: Output directory for CSV files (default: `./dataset`)
- `--train_ratio`: Ratio of videos for training (default: 0.7)
- `--val_ratio`: Ratio of videos for validation (default: 0.15)
- `--test_ratio`: Ratio of videos for testing (default: 0.15)
- `--seed`: Random seed for reproducibility (default: 42)

## Output

The script will generate:

1. **train.csv**: Training set with format `path_to_video label`
2. **val.csv**: Validation set with format `path_to_video label`
3. **test.csv**: Test set with format `path_to_video label`
4. **label_map.txt**: Mapping of label indices to class names

## Configuration

After running the script, configure SlowFast by setting:

- `DATA.PATH_TO_DATA_DIR`: Path to the output directory (where CSV files are located)
- `DATA.PATH_PREFIX`: Path to the video directory (`~/data/fish/fish_action_videos`)

Example configuration:

```yaml
DATA:
  PATH_TO_DATA_DIR: "/home/ubuntu/projects/FishAction/dataset"
  PATH_PREFIX: "/home/ubuntu/data/fish/fish_action_videos"
```

## Supported Video Formats

The script supports common video formats:

- `.mp4`, `.avi`, `.mov`, `.mkv`, `.webm`, `.flv`, `.m4v`
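
Before starting a training run, it can help to confirm that every entry in the generated split files points at a real video with one of the extensions listed above. The following is a minimal sketch, not part of the preparation script: `check_split` is a hypothetical helper, and it assumes each line is `path_to_video label` separated by a single space, that labels are non-negative integer indices, and that relative paths resolve under `DATA.PATH_PREFIX`.

```python
import os

# Extensions listed under "Supported Video Formats".
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv", ".webm", ".flv", ".m4v"}


def check_split(csv_path, path_prefix=""):
    """Return a list of (line_number, problem) tuples for one split file.

    Assumption: each non-empty line is `path_to_video label`, split on the
    last space, and relative paths resolve under `path_prefix`.
    """
    problems = []
    with open(csv_path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            parts = line.rsplit(" ", 1)
            if len(parts) != 2 or not parts[1].isdigit():
                problems.append((line_number, "expected 'path_to_video label'"))
                continue
            # An absolute path in the file takes precedence over the prefix.
            video_path = os.path.join(os.path.expanduser(path_prefix), parts[0])
            extension = os.path.splitext(video_path)[1].lower()
            if extension not in VIDEO_EXTENSIONS:
                problems.append((line_number, "unsupported extension " + repr(extension)))
            elif not os.path.isfile(video_path):
                problems.append((line_number, "missing file " + video_path))
    return problems


if __name__ == "__main__":
    # Paths below are the defaults from this README; adjust to your setup.
    for split in ("train.csv", "val.csv", "test.csv"):
        split_path = os.path.join("./dataset", split)
        if os.path.isfile(split_path):
            for line_number, problem in check_split(split_path, "~/data/fish/fish_action_videos"):
                print(f"{split}:{line_number}: {problem}")
```

An empty result for all three files means every listed video was found. Note that this sketch splits on the last space, so it tolerates spaces in paths; whether the loader you train with does the same is worth checking separately.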