# Fish Weight Prediction using PointNet++

This module uses PointNet++ to predict fish weight from partial point cloud data.

## Overview

We provide two approaches for fish weight prediction:

### Approach 1: Direct Regression (Original)
Directly predict the absolute weight from a single point cloud.

### Approach 2: Comparison-Based (Recommended)
Predict the weight **difference** between two point clouds. For a new fish:
1. Find the reference fish closest in length (from the known dataset)
2. Predict the weight difference using the trained model
3. Calculate: `new_weight = reference_weight + predicted_difference`

**Why comparison-based?**
- Relative comparisons are often more accurate than absolute predictions
- Leverages reference fish with known weights
- Tends to be more robust to incomplete/partial point clouds
- Tends to perform better with small datasets (~100 samples)

## Workflow

### Approach 1: Direct Regression

1. **Dataset Preparation** (`dataset.py`): Prepare training data from multiple point clouds
2. **Training** (`train_weight_regression.py`): Train PointNet++ model for weight regression
3. **Testing/Inference** (`test_weight_regression.py`): Test the trained model on new point clouds

### Approach 2: Comparison-Based (Recommended)

1. **Dataset Preparation** (`dataset.py`): Same as Approach 1
2. **Training** (`train_weight_comparison.py`): Train PointNet++ model to predict weight differences
3. **Testing/Inference** (`test_weight_comparison.py`): Test using reference dataset

## Dataset Preparation

The `dataset.py` script processes point clouds from an input folder:

- **Input**: Folder containing multiple point cloud subfolders (e.g., `output_preview/xxxx/cloud/`)
- **Process**:
  1. For each subfolder, find all PLY files
  2. Select the point cloud with the largest length, measured as its extent along the x-axis (max x - min x)
  3. Center the point cloud at the origin by subtracting its centroid (centroid = 0)
  4. Save the normalized PLY file and the corresponding weight label to the output folder
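
Steps 2 and 3 can be sketched with NumPy as follows. This is a minimal illustration, not the code in `dataset.py`: the helper names (`x_extent`, `select_longest`, `center_at_origin`) are hypothetical, and the point clouds are small synthetic arrays standing in for loaded PLY files.

```python
import numpy as np

def x_extent(points):
    """Length proxy: extent of an (N, 3) cloud along the x-axis."""
    return float(points[:, 0].max() - points[:, 0].min())

def select_longest(clouds):
    """Return the cloud with the largest x-extent."""
    return max(clouds, key=x_extent)

def center_at_origin(points):
    """Translate the cloud so that its centroid is (0, 0, 0)."""
    return points - points.mean(axis=0)

# Two synthetic stand-ins for the PLY files found in one subfolder.
short_cloud = np.array([[0.0, 0.0, 0.0], [1.0, 0.2, 0.1], [2.0, 0.1, 0.0]])
long_cloud = np.array([[1.0, 0.0, 0.0], [3.0, 0.5, 0.2], [5.0, 0.1, 0.1]])

best = select_longest([short_cloud, long_cloud])
centered = center_at_origin(best)
print(x_extent(best), centered.mean(axis=0))
```

Centering only translates the cloud, so the x-extent used as the length measure is unchanged by step 3.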

**Usage**:
```bash
python3 measure/dataset.py --input /path/to/pointclouds --labels /path/to/labels.csv --output /path/to/dataset
```

**Label CSV Format**:
```csv
subfolder_name,weight
HD1080_SN43186771_16-41-37,0.5
HD1080_SN43186771_16-41-40,0.6
...
```
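
A file in this format can be read with Python's standard `csv` module. The sketch below is an assumption about how the labels might be loaded, not the actual logic in `dataset.py`; `load_labels` is a hypothetical helper, and an in-memory string stands in for `labels.csv`.

```python
import csv
import io

def load_labels(csv_file):
    """Map each subfolder name to its weight (as a float)."""
    reader = csv.DictReader(csv_file)
    return {row["subfolder_name"]: float(row["weight"]) for row in reader}

# In-memory stand-in for labels.csv, using the two rows shown above.
sample = io.StringIO(
    "subfolder_name,weight\n"
    "HD1080_SN43186771_16-41-37,0.5\n"
    "HD1080_SN43186771_16-41-40,0.6\n"
)
labels = load_labels(sample)
print(labels)
```

With a real file, the same function can be called as `load_labels(open(path, newline=""))`.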

## Training

The `train_weight_regression.py` script trains a PointNet++ model for weight regression:

- **Model**: PointNet++ (SSG, single-scale grouping) adapted for regression
- **Data Augmentation** (to compensate for the small dataset of about 150 samples):
  - Random point sampling (different number of points)
  - Random rotation around z-axis
  - Random scaling (small variations)
  - Random jitter (noise)
  - Random point dropout
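
The five augmentations can be sketched in NumPy as below. This is an illustrative version, not the code in the training script: the function name `augment` and the parameters `max_scale`, `sigma`, and `max_dropout` (and their default values) are assumptions. For simplicity, the sampling step here draws a fixed `num_point` points in a random order rather than varying the count.

```python
import numpy as np

def augment(points, rng, num_point=1024, max_scale=0.05, sigma=0.005, max_dropout=0.2):
    """Augment an (N, 3) cloud and return a (num_point, 3) array."""
    # 1. Random point sampling (with replacement only if the cloud is too small).
    n = points.shape[0]
    idx = rng.choice(n, size=num_point, replace=n < num_point)
    pts = points[idx].copy()

    # 2. Random rotation around the z-axis.
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot_z = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    pts = pts @ rot_z.T

    # 3. Random scaling by a factor close to 1 (small variations only).
    pts = pts * rng.uniform(1.0 - max_scale, 1.0 + max_scale)

    # 4. Random jitter: small Gaussian noise on every coordinate.
    pts = pts + rng.normal(0.0, sigma, size=pts.shape)

    # 5. Random point dropout: overwrite a random subset with the first
    #    point, which keeps the output shape fixed.
    ratio = rng.uniform(0.0, max_dropout)
    drop = rng.random(num_point) < ratio
    pts[drop] = pts[0]
    return pts

rng = np.random.default_rng(0)
cloud = rng.normal(size=(2000, 3))  # synthetic stand-in for a fish cloud
augmented = augment(cloud, rng)
print(augmented.shape)
```

Because weight depends on physical size, the scaling range is kept deliberately narrow in this sketch; a wide range would change the apparent size of the fish without changing its label.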

**Usage**:
```bash
python3 measure/train_weight_regression.py \
    --data_path /path/to/dataset \
    --batch_size 16 \
    --num_point 1024 \
    --epoch 200 \
    --learning_rate 0.001
```

## Testing/Inference

The `test_weight_regression.py` script performs inference on new point clouds:

**Usage**:
```bash
# Test on a single point cloud
python3 measure/test_weight_regression.py \
    --model /path/to/checkpoint.pth \
    --ply /path/to/pointcloud.ply

# Test on a folder of point clouds
python3 measure/test_weight_regression.py \
    --model /path/to/checkpoint.pth \
    --folder /path/to/pointclouds \
    --output results.json
```

## Comparison-Based Approach (Recommended)

### Training

Train a model to predict weight differences between point cloud pairs:

```bash
python3 measure/train_weight_comparison.py \
    --data_path /path/to/dataset \
    --reference_folder /path/to/reference/dataset \
    --batch_size 8 \
    --num_point 1024 \
    --epoch 200 \
    --learning_rate 0.001 \
    --pair_strategy random  # or 'length_based'
```

**Pair Strategies:**
- `random`: Pairs are sampled at random (more diverse training pairs)
- `length_based`: Fish of similar length are paired (closer to the comparisons made at inference time)
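
The difference between the two strategies can be illustrated with a small pure-Python sketch. It is not the pairing code from `data_loader_comparison.py`: `make_pairs` and the `(name, length, weight)` tuples are hypothetical, and the sample values are made up. The target for each pair is `weight - reference_weight`, which matches `new_weight = reference_weight + predicted_difference`.

```python
import random

def make_pairs(samples, strategy="random", seed=0):
    """Pair every sample with one reference; returns (ref_name, name, weight_diff)."""
    rng = random.Random(seed)
    pairs = []
    for i, (name, length, weight) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]  # never pair a sample with itself
        if strategy == "random":
            ref = rng.choice(others)
        elif strategy == "length_based":
            # Reference = the other sample whose length is closest.
            ref = min(others, key=lambda s: abs(s[1] - length))
        else:
            raise ValueError(f"unknown pair strategy: {strategy}")
        pairs.append((ref[0], name, weight - ref[2]))
    return pairs

# Made-up (name, length, weight) samples for illustration only.
samples = [("a", 0.30, 0.50), ("b", 0.32, 0.60), ("c", 0.45, 1.20)]
for ref_name, name, diff in make_pairs(samples, "length_based"):
    print(f"{ref_name} -> {name}: {diff:+.2f}")
```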

### Testing

For a new fish point cloud, the script finds the closest reference fish by length and predicts the weight difference:

```bash
# Test on a single point cloud
python3 measure/test_weight_comparison.py \
    --model /path/to/checkpoint.pth \
    --reference_folder /path/to/reference/dataset \
    --ply /path/to/new_fish.ply

# Test on a folder of point clouds
python3 measure/test_weight_comparison.py \
    --model /path/to/checkpoint.pth \
    --reference_folder /path/to/reference/dataset \
    --folder /path/to/new_fishes \
    --output results.json
```

**How it works:**
1. Loads all reference point clouds (with known weights and lengths)
2. For each new fish, finds the closest reference by length
3. Predicts weight difference: `predicted_diff = model(reference_pc, new_pc)`
4. Calculates weight: `new_weight = reference_weight + predicted_diff`
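
The four steps above can be sketched in plain Python. Everything here is illustrative: `predict_weight`, the reference dictionaries, and `dummy_model` are hypothetical names, and `dummy_model` is a placeholder for the trained network, not a real weight model. Point clouds are tiny lists of `(x, y, z)` tuples.

```python
def x_extent(pc):
    """Length of a cloud, taken as its extent along the x-axis."""
    xs = [p[0] for p in pc]
    return max(xs) - min(xs)

def predict_weight(new_pc, references, model):
    """Steps 2-4: nearest reference by length, predicted difference, sum."""
    new_length = x_extent(new_pc)
    ref = min(references, key=lambda r: abs(r["length"] - new_length))
    predicted_diff = model(ref["pc"], new_pc)
    return ref["weight"] + predicted_diff

def dummy_model(reference_pc, new_pc):
    # Placeholder for the trained comparison model (assumption for this demo):
    # the difference grows linearly with the difference in length.
    return 2.0 * (x_extent(new_pc) - x_extent(reference_pc))

# Step 1: reference set with known weights and lengths (made-up values).
pc_a = [(0.0, 0.0, 0.0), (0.30, 0.0, 0.0)]
pc_b = [(0.0, 0.0, 0.0), (0.45, 0.0, 0.0)]
references = [
    {"pc": pc_a, "length": x_extent(pc_a), "weight": 0.5},
    {"pc": pc_b, "length": x_extent(pc_b), "weight": 1.2},
]

new_pc = [(0.0, 0.0, 0.0), (0.35, 0.0, 0.0)]
new_weight = predict_weight(new_pc, references, dummy_model)
print(round(new_weight, 3))
```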

## File Structure

```
measure/
├── README.md                    # This file
├── dataset.py                   # Dataset preparation script
├── train_weight_regression.py   # Direct regression training
├── test_weight_regression.py    # Direct regression inference
├── train_weight_comparison.py   # Comparison-based training
├── test_weight_comparison.py    # Comparison-based inference
├── pointnet2_regression.py      # Direct regression model
├── pointnet2_comparison.py      # Comparison model
├── data_loader.py               # Direct regression data loader
├── data_loader_comparison.py    # Comparison data loader
└── data/                        # Data folder (for OCR results, etc.)
```

## Notes

- **Direct Regression**: Predicts the absolute weight from a single point cloud
- **Comparison-Based**: Predicts the weight difference between two point clouds (recommended for small datasets)
- Point clouds are centered at the origin (centroid = 0) before training/inference
- Data augmentation is crucial given the small dataset size (~100-150 samples)
- The comparison model uses a shared PointNet++ encoder for both point clouds, then concatenates their features
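
The shared-encoder idea in the last note can be illustrated with a toy NumPy model. This is a deliberately simplified stand-in, not `pointnet2_comparison.py`: the encoder is a single per-point linear layer with ReLU and max-pooling instead of PointNet++ set abstraction, the weights are random rather than trained, and the class name `ToyComparisonModel` is hypothetical. What it does show is that both clouds pass through the same weights and that the two global features are concatenated before the regression head.

```python
import numpy as np

class ToyComparisonModel:
    def __init__(self, feat_dim=16, seed=0):
        rng = np.random.default_rng(seed)
        # One set of encoder weights, shared by both inputs.
        self.w_enc = rng.normal(scale=0.1, size=(3, feat_dim))
        # Regression head over the concatenated pair of features.
        self.w_head = rng.normal(scale=0.1, size=(2 * feat_dim,))

    def encode(self, points):
        """Per-point linear layer + ReLU, max-pooled to one global feature."""
        return np.maximum(points @ self.w_enc, 0.0).max(axis=0)

    def __call__(self, reference_pc, new_pc):
        feat = np.concatenate([self.encode(reference_pc), self.encode(new_pc)])
        return float(feat @ self.w_head)

rng = np.random.default_rng(1)
reference_pc = rng.normal(size=(1024, 3))  # synthetic clouds
new_pc = rng.normal(size=(1024, 3))

model = ToyComparisonModel()
predicted_diff = model(reference_pc, new_pc)
print(model.encode(new_pc).shape, predicted_diff)
```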
|