# Fish Action Dataset Preparation
This folder contains scripts to prepare your fish action video dataset for SlowFast training.
## Important: No Video Pre-splitting Required
**SlowFast does NOT require you to pre-split videos into clips.** The framework automatically:
- Loads your full-length videos
- Samples short clips (typically 2-4 seconds) during training
- Randomly samples one clip per video for training/validation
- Uniformly samples multiple clips per video for testing
The clip duration is determined by configuration parameters:
- `NUM_FRAMES`: Number of frames to sample (e.g., 8, 16, 32, 64)
- `SAMPLING_RATE`: Interval between sampled frames (e.g., 2, 8)
- `TARGET_FPS`: Target frames per second (typically 30)
Example: With `NUM_FRAMES=32`, `SAMPLING_RATE=2`, and `TARGET_FPS=30`, the sampled frames span approximately:
`(32-1) × 2 / 30 ≈ 2.07 seconds`
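
The arithmetic above can be checked with a small helper. This is only a sketch, not SlowFast code: `clip_span_seconds` is a hypothetical name, and it measures the time from the first to the last sampled frame, exactly as the formula above does.

```python
def clip_span_seconds(num_frames: int, sampling_rate: int, target_fps: float = 30.0) -> float:
    """Time between the first and last sampled frame, in seconds.

    Hypothetical helper mirroring the README formula:
    (NUM_FRAMES - 1) * SAMPLING_RATE / TARGET_FPS
    """
    if num_frames < 1 or sampling_rate < 1 or target_fps <= 0:
        raise ValueError("num_frames and sampling_rate must be >= 1, target_fps > 0")
    return (num_frames - 1) * sampling_rate / target_fps


print(round(clip_span_seconds(32, 2), 2))  # 2.07
```

Trying a few `NUM_FRAMES` / `SAMPLING_RATE` pairs this way is a quick check that the clip is long enough to cover the fish behavior you want to classify.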
## Dataset Structure
Your videos should be organized as follows:
```
~/data/fish/fish_action_videos/
├── class1/
│   ├── video1.mp4
│   ├── video2.mp4
│   └── ...
├── class2/
│   ├── video1.mp4
│   ├── video2.mp4
│   └── ...
└── ...
```
Each subfolder represents one class and contains the video files for that class.
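
As an illustration of how a folder layout like this can be turned into class labels, here is a minimal sketch. It assumes labels are assigned by sorting subfolder names alphabetically, which may not match what `prepare_fish_dataset.py` actually does. `discover_classes` and the class names `swimming` and `feeding` are hypothetical.

```python
import tempfile
from pathlib import Path


def discover_classes(video_dir):
    """Map each class subfolder name to an integer label.

    Assumption: labels follow alphabetical order of the folder names.
    """
    root = Path(video_dir).expanduser()
    names = sorted(p.name for p in root.iterdir() if p.is_dir())
    return {name: idx for idx, name in enumerate(names)}


# Demo on a throwaway directory with two made-up classes.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("swimming", "feeding"):
        (Path(tmp) / name).mkdir()
    (Path(tmp) / "notes.txt").write_text("not a class")  # plain files are ignored
    label_map = discover_classes(tmp)

print(label_map)  # {'feeding': 0, 'swimming': 1}
```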
## Usage
Run the preparation script:
```bash
python dataset/prepare_fish_dataset.py \
    --video_dir ~/data/fish/fish_action_videos \
    --output_dir ./dataset
```
### Options
- `--video_dir`: Root directory containing class subfolders (default: `~/data/fish/fish_action_videos`)
- `--output_dir`: Output directory for CSV files (default: `./dataset`)
- `--train_ratio`: Ratio of videos for training (default: 0.7)
- `--val_ratio`: Ratio of videos for validation (default: 0.15)
- `--test_ratio`: Ratio of videos for testing (default: 0.15)
- `--seed`: Random seed for reproducibility (default: 42)
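
To show how the ratio and seed options interact, the following sketch splits a list of videos reproducibly. It is an assumption about the general approach, not the script's actual code: `split_videos` is a hypothetical function, and whether the real script splits per class or across the whole dataset is not stated here.

```python
import random


def split_videos(videos, train_ratio=0.7, val_ratio=0.15, test_ratio=0.15, seed=42):
    """Shuffle with a fixed seed, then cut the list into train/val/test."""
    if abs(train_ratio + val_ratio + test_ratio - 1.0) > 1e-6:
        raise ValueError("train, val and test ratios must sum to 1")
    shuffled = sorted(videos)  # sort first so the result depends only on the seed
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_ratio)
    n_val = round(len(shuffled) * val_ratio)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


videos = [f"video{i}.mp4" for i in range(20)]
train, val, test = split_videos(videos)
print(len(train), len(val), len(test))  # 14 3 3
```

Applying a split like this separately to each class folder would keep the class proportions similar across the three sets.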
## Output
The script will generate:
1. **train.csv**: Training set with format `path_to_video label`
2. **val.csv**: Validation set with format `path_to_video label`
3. **test.csv**: Test set with format `path_to_video label`
4. **label_map.txt**: Mapping of label indices to class names
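
For a quick sanity check of the generated files, a line in the `path_to_video label` format can be parsed as sketched below. The sample paths are made up and assumed to be relative to the video directory, and `parse_split_line` is a hypothetical helper.

```python
from typing import Tuple


def parse_split_line(line: str) -> Tuple[str, int]:
    """Split one 'path_to_video label' line into (path, integer label)."""
    # Split on the last space so the label is always the final field.
    path, label = line.strip().rsplit(" ", 1)
    return path, int(label)


# Made-up sample lines in the documented format.
sample = ["class1/video1.mp4 0", "class2/video1.mp4 1"]
rows = [parse_split_line(line) for line in sample]
print(rows)  # [('class1/video1.mp4', 0), ('class2/video1.mp4', 1)]
```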
## Configuration
After running the script, configure SlowFast by setting:
- `DATA.PATH_TO_DATA_DIR`: Path to the output directory (where CSV files are located)
- `DATA.PATH_PREFIX`: Path to the video directory (`~/data/fish/fish_action_videos`)
Example configuration:
```yaml
DATA:
  PATH_TO_DATA_DIR: "/home/ubuntu/projects/FishAction/dataset"
  PATH_PREFIX: "/home/ubuntu/data/fish/fish_action_videos"
```
## Supported Video Formats
The script supports common video formats:
- `.mp4`, `.avi`, `.mov`, `.mkv`, `.webm`, `.flv`, `.m4v`
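
A filter over these extensions could look like the sketch below. Case-insensitive matching is an assumption, and `VIDEO_EXTENSIONS` and `is_supported_video` are hypothetical names rather than identifiers taken from the script.

```python
from pathlib import Path

# Extensions listed in this README.
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv", ".webm", ".flv", ".m4v"}


def is_supported_video(path) -> bool:
    """Return True if the file extension is one of the listed video formats."""
    # Lower-casing the suffix treats 'clip.MP4' the same as 'clip.mp4' (assumed behavior).
    return Path(path).suffix.lower() in VIDEO_EXTENSIONS


print(is_supported_video("class1/video1.MP4"))  # True
print(is_supported_video("class1/notes.txt"))   # False
```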