# Fish Weight Prediction – Ideas

## Multi-modal weight prediction (image + point cloud)

**Idea:** Instead of using only PointNet++ to predict weight from the 3D point cloud, use two separate models and combine their outputs:

1. **Image-based weight predictor** – A model that predicts fish weight from 2D images (e.g., RGB frames from the video).
2. **Point cloud–based weight predictor** – The existing PointNet++ model that predicts weight from 3D point clouds.

Then **combine the two feature representations** (or predictions) to produce a final weight estimate. This could be done by:

- Concatenating features from both encoders and feeding them into a small fusion head.
- Averaging or otherwise combining the two predictions (e.g., weighted average).
- Using a learned fusion module that decides how much to trust each modality.

**Rationale:** 2D images and 3D point clouds provide complementary information. Images capture texture, color, and fine visual details; point clouds capture geometry and scale. Combining both may improve robustness when one modality is noisy or incomplete (e.g., poor depth, occlusion, or low-quality segmentation).
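The three fusion options listed above could be sketched roughly as follows. This is a minimal, dependency-free illustration and not the project's actual code: the function names (`fuse_weighted`, `fuse_gated`, `fuse_features`), the mixing parameter `alpha`, the per-modality trust scores, and the single linear layer standing in for the fusion head are all hypothetical. In a real setup the features and predictions would come from the image model and the PointNet++ model, and the head weights and trust scores would be learned.

```python
import math
from typing import Sequence


def fuse_weighted(pred_image: float, pred_cloud: float, alpha: float = 0.5) -> float:
    """Late fusion: convex combination of the two weight predictions.

    alpha is the (hypothetical) share given to the image-based prediction.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * pred_image + (1.0 - alpha) * pred_cloud


def fuse_gated(pred_image: float, pred_cloud: float,
               score_image: float, score_cloud: float) -> float:
    """Gated fusion: a softmax over per-modality trust scores sets the mix.

    In a learned fusion module the scores would be produced by a small
    network; here they are plain inputs (an assumption for illustration).
    """
    m = max(score_image, score_cloud)  # subtract the max for numerical stability
    e_img = math.exp(score_image - m)
    e_pc = math.exp(score_cloud - m)
    alpha = e_img / (e_img + e_pc)
    return fuse_weighted(pred_image, pred_cloud, alpha)


def fuse_features(feat_image: Sequence[float], feat_cloud: Sequence[float],
                  head_weights: Sequence[float], head_bias: float = 0.0) -> float:
    """Feature-level fusion: concatenate both embeddings, apply a linear head.

    A single linear layer stands in for the "small fusion head".
    """
    fused = list(feat_image) + list(feat_cloud)
    if len(fused) != len(head_weights):
        raise ValueError("head_weights must match the concatenated feature length")
    return sum(w * x for w, x in zip(head_weights, fused)) + head_bias


if __name__ == "__main__":
    # Made-up predictions in grams, purely for illustration.
    print(fuse_weighted(500.0, 520.0, alpha=0.25))
    print(fuse_gated(500.0, 520.0, score_image=0.0, score_cloud=0.0))
    print(fuse_features([1.0, 2.0], [3.0], [0.5, 0.5, 1.0], head_bias=10.0))
```

The weighted average is the cheapest option to try, since it needs no retraining of either model. The gated and feature-level variants need extra trainable parameters but can, in principle, down-weight a modality on a per-sample basis.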