FishServer/FishAction/dataset/README.md
2026-04-08 19:32:23 +08:00

# Fish Action Dataset Preparation
This folder contains scripts to prepare your fish action video dataset for SlowFast training.
## Important: No Video Pre-splitting Required
**SlowFast does NOT require you to pre-split videos into clips.** The framework automatically:
- Loads your full-length videos
- Samples short clips (typically 2-4 seconds) during training
- Randomly samples one clip per video for training/validation
- Uniformly samples multiple clips per video for testing
The clip duration is determined by configuration parameters:
- `NUM_FRAMES`: Number of frames to sample (e.g., 8, 16, 32, 64)
- `SAMPLING_RATE`: Interval between sampled frames (e.g., 2, 8)
- `TARGET_FPS`: Target frames per second (typically 30)
Example: With `NUM_FRAMES=32` and `SAMPLING_RATE=2`, the clip duration is approximately:
`(32-1) × 2 / 30 ≈ 2.07 seconds`
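
The arithmetic above can be wrapped in a small helper for comparing settings. This is only a sketch of the formula as written in this README; the function name `clip_span_seconds` is hypothetical and is not part of SlowFast.

```python
def clip_span_seconds(num_frames: int, sampling_rate: int, target_fps: float = 30.0) -> float:
    """Time between the first and last sampled frame of a clip, in seconds.

    Implements the README formula: (NUM_FRAMES - 1) * SAMPLING_RATE / TARGET_FPS.
    """
    return (num_frames - 1) * sampling_rate / target_fps


if __name__ == "__main__":
    # Example (NUM_FRAMES, SAMPLING_RATE) pairs; adjust to match your config.
    for num_frames, sampling_rate in [(8, 8), (16, 8), (32, 2), (64, 2)]:
        span = clip_span_seconds(num_frames, sampling_rate)
        print(f"NUM_FRAMES={num_frames:2d} SAMPLING_RATE={sampling_rate} -> {span:.2f} s")
```

A longer span means each clip covers more of the action, so it helps to check that the value is shorter than your shortest video.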
## Dataset Structure
Your videos should be organized as follows:
```
~/data/fish/fish_action_videos/
├── class1/
│   ├── video1.mp4
│   ├── video2.mp4
│   └── ...
├── class2/
│   ├── video1.mp4
│   ├── video2.mp4
│   └── ...
└── ...
```
Each subfolder represents one class and contains the video files for that class.
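
Before running the preparation script, it can help to confirm that the layout is what you expect. The following is a minimal sketch, not the actual logic of `prepare_fish_dataset.py`: the function name `list_videos_by_class` is hypothetical, and the extension set is copied from the "Supported Video Formats" section of this README.

```python
from pathlib import Path

# Extensions listed under "Supported Video Formats" in this README.
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv", ".webm", ".flv", ".m4v"}


def list_videos_by_class(video_dir):
    """Return {class_name: [video file names]} for each class subfolder."""
    root = Path(video_dir).expanduser()
    videos = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Keep only files whose extension (case-insensitive) is a known video format.
        videos[class_dir.name] = sorted(
            f.name
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in VIDEO_EXTENSIONS
        )
    return videos
```

For example, `{name: len(files) for name, files in list_videos_by_class("~/data/fish/fish_action_videos").items()}` gives a per-class video count, which makes empty or misnamed class folders easy to spot.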
## Usage
Run the preparation script:
```bash
python dataset/prepare_fish_dataset.py \
    --video_dir ~/data/fish/fish_action_videos \
    --output_dir ./dataset
```
### Options
- `--video_dir`: Root directory containing class subfolders (default: `~/data/fish/fish_action_videos`)
- `--output_dir`: Output directory for CSV files (default: `./dataset`)
- `--train_ratio`: Ratio of videos for training (default: 0.7)
- `--val_ratio`: Ratio of videos for validation (default: 0.15)
- `--test_ratio`: Ratio of videos for testing (default: 0.15)
- `--seed`: Random seed for reproducibility (default: 42)
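
To illustrate how the ratio and seed options interact, here is a minimal sketch of a seeded split. It is an assumption about the general technique, not a copy of the script: `split_videos` is a hypothetical name, the split is not stratified by class, and the real script may round or group videos differently.

```python
import random


def split_videos(videos, train_ratio=0.7, val_ratio=0.15, seed=42):
    """Shuffle videos with a fixed seed and cut them into train/val/test lists.

    The test set receives whatever remains after the train and val cuts,
    so the three lists always cover every input video exactly once.
    """
    items = sorted(videos)              # sort first so the input order does not matter
    random.Random(seed).shuffle(items)  # local RNG: does not touch global random state
    n_train = round(len(items) * train_ratio)
    n_val = round(len(items) * val_ratio)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test
```

With the same seed and the same set of videos, the split is identical on every run, which is what makes results comparable across experiments.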
## Output
The script will generate:
1. **train.csv**: Training set with format `path_to_video label`
2. **val.csv**: Validation set with format `path_to_video label`
3. **test.csv**: Test set with format `path_to_video label`
4. **label_map.txt**: Mapping of label indices to class names
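
A quick way to sanity-check the generated files is to read a split back and count samples per label. The sketch below assumes each CSV line is `path_to_video label` with a single space before an integer label, as described above; the helper names `read_split` and `label_counts` are hypothetical. The layout of `label_map.txt` is not specified here, so it is not parsed.

```python
from collections import Counter
from pathlib import Path


def read_split(csv_path):
    """Parse a split file into a list of (video_path, label) tuples."""
    samples = []
    for line in Path(csv_path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last space so video paths containing spaces still parse.
        path, label = line.rsplit(" ", 1)
        samples.append((path, int(label)))
    return samples


def label_counts(samples):
    """Count how many videos carry each label index."""
    return Counter(label for _, label in samples)
```

Calling `label_counts(read_split("dataset/train.csv"))` shows whether any class is missing from the training split or heavily under-represented.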
## Configuration
After running the script, configure SlowFast by setting:
- `DATA.PATH_TO_DATA_DIR`: Path to the output directory (where CSV files are located)
- `DATA.PATH_PREFIX`: Path to the video directory (`~/data/fish/fish_action_videos`). Prefer an absolute path in the config file, since `~` may not be expanded.
Example configuration:
```yaml
DATA:
  PATH_TO_DATA_DIR: "/home/ubuntu/projects/FishAction/dataset"
  PATH_PREFIX: "/home/ubuntu/data/fish/fish_action_videos"
```
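
A mismatch between `PATH_PREFIX` and the paths stored in the CSV files is a common cause of videos failing to load. The following is a minimal sketch for catching it before training. It assumes the loader joins `PATH_PREFIX` with the path from each CSV line, and the function name `missing_videos` is hypothetical.

```python
import os


def missing_videos(csv_path, path_prefix):
    """Return the CSV paths that do not resolve to a file under path_prefix."""
    prefix = os.path.expanduser(path_prefix)
    missing = []
    with open(csv_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # The label is the last space-separated field; the rest is the path.
            video_path = line.rsplit(" ", 1)[0]
            if not os.path.isfile(os.path.join(prefix, video_path)):
                missing.append(video_path)
    return missing
```

An empty list from `missing_videos("dataset/train.csv", "/home/ubuntu/data/fish/fish_action_videos")` indicates that every training path resolves; repeat the check for `val.csv` and `test.csv`.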
## Supported Video Formats
The script supports common video formats:
- `.mp4`, `.avi`, `.mov`, `.mkv`, `.webm`, `.flv`, `.m4v`