OpenCV
4.1.0
Open Source Computer Vision
Classes | |
class | cv::cnn_3dobj::descriptorExtractor |
Caffe-based 3D image descriptor. A class that extracts features from an image. The resulting descriptors can be used for classification and pose estimation. | |
class | cv::cnn_3dobj::icoSphere |
Icosahedron-based camera view data generator. The class creates views on a sphere, with the camera pointing towards a 3D object meshed from .ply files. | |
Because CNN-based learning algorithms show better performance on classification problems, richly labeled data becomes more useful in the training stage. 3D object classification and pose estimation form a joint task that aims to separate different poses from one another in descriptor space.
In the training stage, we prepare 2D training images generated by this module, together with their class and pose labels. We fully exploit the information in these labels by training the CNN with a joint triplet and pair-wise loss function.
Both the class label and the pose label are taken into account in the triplet loss. The loss is smaller when features from the same class and the same pose are more similar, whereas features from different classes or different poses that lie close together lead to a much larger loss.
The triplet loss is combined with a pair-wise component, which ensures that the loss is never zero and restricts the scale of the model.
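The following is a minimal sketch of how such a joint loss could be computed for a single triplet. It assumes the ratio-based triplet term and squared-distance pair term used in Wohlhart and Lepetit's descriptor learning work; the function names (`l2Distance`, `tripletTerm`, `pairTerm`, `jointLoss`) and the `margin` parameter are hypothetical and are not part of the cnn_3dobj API, and the actual training code in this module may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Euclidean distance between two descriptors of equal length.
double l2Distance(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// Triplet term (assumed form): zero once the dissimilar sample is far enough
// from the anchor relative to the similar sample, at most 1 otherwise.
double tripletTerm(const std::vector<double>& anchor,
                   const std::vector<double>& similar,
                   const std::vector<double>& dissimilar,
                   double margin) {
    const double dPos = l2Distance(anchor, similar);
    const double dNeg = l2Distance(anchor, dissimilar);
    return std::max(0.0, 1.0 - dNeg / (dPos + margin));
}

// Pair-wise term (assumed form): squared distance between two descriptors
// of the same class and pose, which pulls them together.
double pairTerm(const std::vector<double>& a, const std::vector<double>& b) {
    const double d = l2Distance(a, b);
    return d * d;
}

// Joint loss for one triplet: triplet term plus pair-wise term.
double jointLoss(const std::vector<double>& anchor,
                 const std::vector<double>& similar,
                 const std::vector<double>& dissimilar,
                 double margin) {
    return tripletTerm(anchor, similar, dissimilar, margin) +
           pairTerm(anchor, similar);
}
```

In this sketch the pair-wise term stays positive whenever the anchor and the similar sample do not coincide, which matches the role described above of keeping the overall loss away from zero.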
The training and feature extraction process is a rough implementation, using OpenCV and Caffe, of an idea by Paul Wohlhart. The principal purpose of this API is to construct a well-labeled database from .ply models for CNN training with triplet loss, and to extract features with the trained model for prediction or other pattern recognition purposes. The algorithms are organized into two main classes:
* icoSphere: methods of this class generate 2D images from a 3D model, together with their class labels and camera-view pose labels.
* descriptorExtractor: methods of this class extract descriptors from 2D images that are discriminative for category prediction and pose estimation.
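To illustrate the idea behind icosahedron-based view generation, the sketch below computes the 12 vertices of a regular icosahedron scaled to a given radius, which could serve as camera positions looking at an object placed at the origin. It is a standalone illustration and not the implementation of cv::cnn_3dobj::icoSphere; the names `Vec3` and `icosahedronViewpoints` are hypothetical, and any subdivision of the faces to obtain denser views is left out.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Simple 3D point type used only for this sketch.
struct Vec3 {
    double x, y, z;
};

// Returns the 12 vertices of a regular icosahedron centred at the origin,
// scaled so that every vertex lies on a sphere of the given radius.
std::vector<Vec3> icosahedronViewpoints(double radius) {
    assert(radius > 0.0);
    const double phi = (1.0 + std::sqrt(5.0)) / 2.0;  // golden ratio
    const double norm = std::sqrt(1.0 + phi * phi);   // length of (0, 1, phi)
    const double a = radius / norm;
    const double b = radius * phi / norm;
    const double signs[2] = {-1.0, 1.0};

    std::vector<Vec3> points;
    points.reserve(12);
    for (double s1 : signs) {
        for (double s2 : signs) {
            // Cyclic permutations of (0, +-1, +-phi), scaled to the radius.
            points.push_back({0.0, s1 * a, s2 * b});
            points.push_back({s1 * a, s2 * b, 0.0});
            points.push_back({s1 * b, 0.0, s2 * a});
        }
    }
    return points;
}
```

Each returned point can be treated as a camera centre with the optical axis directed at the origin, so the pose label of a rendered image follows directly from the vertex it was rendered from.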