OpenCV  4.1.0
Open Source Computer Vision
Modules | Classes | Typedefs | Enumerations | Functions
Deep Neural Network module

Modules

 Partial List of Implemented Layers
 
 Utilities for New Layers Registration
 

Classes

class  cv::dnn::BackendNode
 Derivatives of this class encapsulates functions of certain backends. More...
 
class  cv::dnn::BackendWrapper
 Derivatives of this class wraps cv::Mat for different backends and targets. More...
 
class  cv::dnn::Dict
 This class implements name-value dictionary, values are instances of DictValue. More...
 
struct  cv::dnn::DictValue
 This struct stores the scalar value (or array) of one of the following type: double, cv::String or int64. More...
 
class  cv::dnn::Layer
 This interface class allows to build new Layers - are building blocks of networks. More...
 
class  cv::dnn::LayerParams
 This class provides all data needed to initialize layer. More...
 
class  cv::dnn::Net
 This class allows to create and manipulate comprehensive artificial neural networks. More...
 

Typedefs

typedef std::vector< int > cv::dnn::MatShape
 

Enumerations

enum  cv::dnn::Backend {
  cv::dnn::DNN_BACKEND_DEFAULT,
  cv::dnn::DNN_BACKEND_HALIDE,
  cv::dnn::DNN_BACKEND_INFERENCE_ENGINE,
  cv::dnn::DNN_BACKEND_OPENCV,
  cv::dnn::DNN_BACKEND_VKCOM
}
 Enum of computation backends supported by layers. More...
 
enum  cv::dnn::Target {
  cv::dnn::DNN_TARGET_CPU,
  cv::dnn::DNN_TARGET_OPENCL,
  cv::dnn::DNN_TARGET_OPENCL_FP16,
  cv::dnn::DNN_TARGET_MYRIAD,
  cv::dnn::DNN_TARGET_VULKAN,
  cv::dnn::DNN_TARGET_FPGA
}
 Enum of target devices for computations. More...
 

Functions

Mat cv::dnn::blobFromImage (InputArray image, double scalefactor=1.0, const Size &size=Size(), const Scalar &mean=Scalar(), bool swapRB=false, bool crop=false, int ddepth=CV_32F)
 Creates 4-dimensional blob from image. Optionally resizes and crops image from center, subtract mean values, scales values by scalefactor, swap Blue and Red channels.
 
void cv::dnn::blobFromImage (InputArray image, OutputArray blob, double scalefactor=1.0, const Size &size=Size(), const Scalar &mean=Scalar(), bool swapRB=false, bool crop=false, int ddepth=CV_32F)
 Creates 4-dimensional blob from image.
 
Mat cv::dnn::blobFromImages (InputArrayOfArrays images, double scalefactor=1.0, Size size=Size(), const Scalar &mean=Scalar(), bool swapRB=false, bool crop=false, int ddepth=CV_32F)
 Creates 4-dimensional blob from series of images. Optionally resizes and crops images from center, subtract mean values, scales values by scalefactor, swap Blue and Red channels.
 
void cv::dnn::blobFromImages (InputArrayOfArrays images, OutputArray blob, double scalefactor=1.0, Size size=Size(), const Scalar &mean=Scalar(), bool swapRB=false, bool crop=false, int ddepth=CV_32F)
 Creates 4-dimensional blob from series of images.
 
std::vector< std::pair
< Backend, Target > > 
cv::dnn::getAvailableBackends ()
 
std::vector< Target > cv::dnn::getAvailableTargets (Backend be)
 
void cv::dnn::imagesFromBlob (const cv::Mat &blob_, OutputArrayOfArrays images_)
 Parse a 4D blob and output the images it contains as 2D arrays through a simpler data structure (std::vector<cv::Mat>).
 
void cv::dnn::NMSBoxes (const std::vector< Rect > &bboxes, const std::vector< float > &scores, const float score_threshold, const float nms_threshold, std::vector< int > &indices, const float eta=1.f, const int top_k=0)
 Performs non maximum suppression given boxes and corresponding scores.
 
void cv::dnn::NMSBoxes (const std::vector< Rect2d > &bboxes, const std::vector< float > &scores, const float score_threshold, const float nms_threshold, std::vector< int > &indices, const float eta=1.f, const int top_k=0)
 
void cv::dnn::NMSBoxes (const std::vector< RotatedRect > &bboxes, const std::vector< float > &scores, const float score_threshold, const float nms_threshold, std::vector< int > &indices, const float eta=1.f, const int top_k=0)
 
Net cv::dnn::readNet (const String &model, const String &config="", const String &framework="")
 Read deep learning network represented in one of the supported formats.
 
Net cv::dnn::readNet (const String &framework, const std::vector< uchar > &bufferModel, const std::vector< uchar > &bufferConfig=std::vector< uchar >())
 Read deep learning network represented in one of the supported formats.
 
Net cv::dnn::readNetFromCaffe (const String &prototxt, const String &caffeModel=String())
 Reads a network model stored in Caffe framework's format.
 
Net cv::dnn::readNetFromCaffe (const std::vector< uchar > &bufferProto, const std::vector< uchar > &bufferModel=std::vector< uchar >())
 Reads a network model stored in Caffe model in memory.
 
Net cv::dnn::readNetFromCaffe (const char *bufferProto, size_t lenProto, const char *bufferModel=NULL, size_t lenModel=0)
 Reads a network model stored in Caffe model in memory.
 
Net cv::dnn::readNetFromDarknet (const String &cfgFile, const String &darknetModel=String())
 Reads a network model stored in Darknet model files.
 
Net cv::dnn::readNetFromDarknet (const std::vector< uchar > &bufferCfg, const std::vector< uchar > &bufferModel=std::vector< uchar >())
 Reads a network model stored in Darknet model files.
 
Net cv::dnn::readNetFromDarknet (const char *bufferCfg, size_t lenCfg, const char *bufferModel=NULL, size_t lenModel=0)
 Reads a network model stored in Darknet model files.
 
Net cv::dnn::readNetFromModelOptimizer (const String &xml, const String &bin)
 Load a network from Intel's Model Optimizer intermediate representation.
 
Net cv::dnn::readNetFromONNX (const String &onnxFile)
 Reads a network model ONNX.
 
Net cv::dnn::readNetFromTensorflow (const String &model, const String &config=String())
 Reads a network model stored in TensorFlow framework's format.
 
Net cv::dnn::readNetFromTensorflow (const std::vector< uchar > &bufferModel, const std::vector< uchar > &bufferConfig=std::vector< uchar >())
 Reads a network model stored in TensorFlow framework's format.
 
Net cv::dnn::readNetFromTensorflow (const char *bufferModel, size_t lenModel, const char *bufferConfig=NULL, size_t lenConfig=0)
 Reads a network model stored in TensorFlow framework's format.
 
Net cv::dnn::readNetFromTorch (const String &model, bool isBinary=true, bool evaluate=true)
 Reads a network model stored in Torch7 framework's format.
 
Mat cv::dnn::readTensorFromONNX (const String &path)
 Creates blob from .pb file.
 
Mat cv::dnn::readTorchBlob (const String &filename, bool isBinary=true)
 Loads blob which was serialized as torch.Tensor object of Torch7 framework.
 
void cv::dnn::shrinkCaffeModel (const String &src, const String &dst, const std::vector< String > &layersTypes=std::vector< String >())
 Convert all weights of Caffe network to half precision floating point.
 
void cv::dnn::writeTextGraph (const String &model, const String &output)
 Create a text representation for a binary network stored in protocol buffer format.
 

Detailed Description

This module contains:

Functionality of this module is designed only for forward pass computations (i.e. network testing). A network training is in principle not supported.

Typedef Documentation

typedef std::vector<int> cv::dnn::MatShape

Enumeration Type Documentation

Enum of computation backends supported by layers.

See Also
Net::setPreferableBackend
Enumerator
DNN_BACKEND_DEFAULT 

DNN_BACKEND_DEFAULT equals to DNN_BACKEND_INFERENCE_ENGINE if OpenCV is built with Intel's Inference Engine library or DNN_BACKEND_OPENCV otherwise.

DNN_BACKEND_HALIDE 
DNN_BACKEND_INFERENCE_ENGINE 
DNN_BACKEND_OPENCV 
DNN_BACKEND_VKCOM 

Enum of target devices for computations.

See Also
Net::setPreferableTarget
Enumerator
DNN_TARGET_CPU 
DNN_TARGET_OPENCL 
DNN_TARGET_OPENCL_FP16 
DNN_TARGET_MYRIAD 
DNN_TARGET_VULKAN 
DNN_TARGET_FPGA 

FPGA device with CPU fallbacks using Inference Engine's Heterogeneous plugin.

Function Documentation

Mat cv::dnn::blobFromImage ( InputArray  image,
double  scalefactor = 1.0,
const Size &  size = Size(),
const Scalar &  mean = Scalar(),
bool  swapRB = false,
bool  crop = false,
int  ddepth = CV_32F 
)

Creates 4-dimensional blob from image. Optionally resizes and crops image from center, subtract mean values, scales values by scalefactor, swap Blue and Red channels.

Parameters
imageinput image (with 1-, 3- or 4-channels).
sizespatial size for output image
meanscalar with mean values which are subtracted from channels. Values are intended to be in (mean-R, mean-G, mean-B) order if image has BGR ordering and swapRB is true.
scalefactormultiplier for image values.
swapRBflag which indicates that swap first and last channels in 3-channel image is necessary.
cropflag which indicates whether image will be cropped after resize or not
ddepthDepth of output blob. Choose CV_32F or CV_8U.

if crop is true, input image is resized so one side after resize is equal to corresponding dimension in size and another one is equal or larger. Then, crop from the center is performed. If crop is false, direct resize without cropping and preserving aspect ratio is performed.

Returns
4-dimensional Mat with NCHW dimensions order.
Examples:
samples/dnn/classification.cpp, samples/dnn/colorization.cpp, samples/dnn/object_detection.cpp, samples/dnn/openpose.cpp, samples/dnn/segmentation.cpp, and samples/dnn/text_detection.cpp.
void cv::dnn::blobFromImage ( InputArray  image,
OutputArray  blob,
double  scalefactor = 1.0,
const Size &  size = Size(),
const Scalar &  mean = Scalar(),
bool  swapRB = false,
bool  crop = false,
int  ddepth = CV_32F 
)

Creates 4-dimensional blob from image.

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Mat cv::dnn::blobFromImages ( InputArrayOfArrays  images,
double  scalefactor = 1.0,
Size  size = Size(),
const Scalar &  mean = Scalar(),
bool  swapRB = false,
bool  crop = false,
int  ddepth = CV_32F 
)

Creates 4-dimensional blob from series of images. Optionally resizes and crops images from center, subtract mean values, scales values by scalefactor, swap Blue and Red channels.

Parameters
imagesinput images (all with 1-, 3- or 4-channels).
sizespatial size for output image
meanscalar with mean values which are subtracted from channels. Values are intended to be in (mean-R, mean-G, mean-B) order if image has BGR ordering and swapRB is true.
scalefactormultiplier for images values.
swapRBflag which indicates that swap first and last channels in 3-channel image is necessary.
cropflag which indicates whether image will be cropped after resize or not
ddepthDepth of output blob. Choose CV_32F or CV_8U.

if crop is true, input image is resized so one side after resize is equal to corresponding dimension in size and another one is equal or larger. Then, crop from the center is performed. If crop is false, direct resize without cropping and preserving aspect ratio is performed.

Returns
4-dimensional Mat with NCHW dimensions order.
void cv::dnn::blobFromImages ( InputArrayOfArrays  images,
OutputArray  blob,
double  scalefactor = 1.0,
Size  size = Size(),
const Scalar &  mean = Scalar(),
bool  swapRB = false,
bool  crop = false,
int  ddepth = CV_32F 
)

Creates 4-dimensional blob from series of images.

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

std::vector< std::pair<Backend, Target> > cv::dnn::getAvailableBackends ( )
std::vector<Target> cv::dnn::getAvailableTargets ( Backend  be)
void cv::dnn::imagesFromBlob ( const cv::Mat blob_,
OutputArrayOfArrays  images_ 
)

Parse a 4D blob and output the images it contains as 2D arrays through a simpler data structure (std::vector<cv::Mat>).

Parameters
[in]blob_4 dimensional array (images, channels, height, width) in floating point precision (CV_32F) from which you would like to extract the images.
[out]images_array of 2D Mat containing the images extracted from the blob in floating point precision (CV_32F). They are non normalized neither mean added. The number of returned images equals the first dimension of the blob (batch size). Every image has a number of channels equals to the second dimension of the blob (depth).
void cv::dnn::NMSBoxes ( const std::vector< Rect > &  bboxes,
const std::vector< float > &  scores,
const float  score_threshold,
const float  nms_threshold,
std::vector< int > &  indices,
const float  eta = 1.f,
const int  top_k = 0 
)

Performs non maximum suppression given boxes and corresponding scores.

Parameters
bboxesa set of bounding boxes to apply NMS.
scoresa set of corresponding confidences.
score_thresholda threshold used to filter boxes by score.
nms_thresholda threshold used in non maximum suppression.
indicesthe kept indices of bboxes after NMS.
etaa coefficient in adaptive threshold formula: \(nms\_threshold_{i+1}=eta\cdot nms\_threshold_i\).
top_kif >0, keep at most top_k picked indices.
Examples:
samples/dnn/object_detection.cpp, and samples/dnn/text_detection.cpp.
void cv::dnn::NMSBoxes ( const std::vector< Rect2d > &  bboxes,
const std::vector< float > &  scores,
const float  score_threshold,
const float  nms_threshold,
std::vector< int > &  indices,
const float  eta = 1.f,
const int  top_k = 0 
)
void cv::dnn::NMSBoxes ( const std::vector< RotatedRect > &  bboxes,
const std::vector< float > &  scores,
const float  score_threshold,
const float  nms_threshold,
std::vector< int > &  indices,
const float  eta = 1.f,
const int  top_k = 0 
)
Net cv::dnn::readNet ( const String &  model,
const String &  config = "",
const String &  framework = "" 
)

Read deep learning network represented in one of the supported formats.

Parameters
[in]modelBinary file contains trained weights. The following file extensions are expected for models from different frameworks:
[in]configText file contains network configuration. It could be a file with the following extensions:
[in]frameworkExplicit framework name tag to determine a format.
Returns
Net object.

This function automatically detects an origin framework of trained model and calls an appropriate function such readNetFromCaffe, readNetFromTensorflow, readNetFromTorch or readNetFromDarknet. An order of model and config arguments does not matter.

Examples:
samples/dnn/classification.cpp, samples/dnn/object_detection.cpp, samples/dnn/openpose.cpp, samples/dnn/segmentation.cpp, and samples/dnn/text_detection.cpp.
Net cv::dnn::readNet ( const String &  framework,
const std::vector< uchar > &  bufferModel,
const std::vector< uchar > &  bufferConfig = std::vector< uchar >() 
)

Read deep learning network represented in one of the supported formats.

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Parameters
[in]frameworkName of origin framework.
[in]bufferModelA buffer with a content of binary file with weights
[in]bufferConfigA buffer with a content of text file contains network configuration.
Returns
Net object.
Net cv::dnn::readNetFromCaffe ( const String &  prototxt,
const String &  caffeModel = String() 
)

Reads a network model stored in Caffe framework's format.

Parameters
prototxtpath to the .prototxt file with text description of the network architecture.
caffeModelpath to the .caffemodel file with learned network.
Returns
Net object.
Examples:
samples/dnn/colorization.cpp.
Net cv::dnn::readNetFromCaffe ( const std::vector< uchar > &  bufferProto,
const std::vector< uchar > &  bufferModel = std::vector< uchar >() 
)

Reads a network model stored in Caffe model in memory.

Parameters
bufferProtobuffer containing the content of the .prototxt file
bufferModelbuffer containing the content of the .caffemodel file
Returns
Net object.
Net cv::dnn::readNetFromCaffe ( const char *  bufferProto,
size_t  lenProto,
const char *  bufferModel = NULL,
size_t  lenModel = 0 
)

Reads a network model stored in Caffe model in memory.

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Parameters
bufferProtobuffer containing the content of the .prototxt file
lenProtolength of bufferProto
bufferModelbuffer containing the content of the .caffemodel file
lenModellength of bufferModel
Returns
Net object.
Net cv::dnn::readNetFromDarknet ( const String &  cfgFile,
const String &  darknetModel = String() 
)

Reads a network model stored in Darknet model files.

Parameters
cfgFilepath to the .cfg file with text description of the network architecture.
darknetModelpath to the .weights file with learned network.
Returns
Network object that ready to do forward, throw an exception in failure cases.
Net object.
Net cv::dnn::readNetFromDarknet ( const std::vector< uchar > &  bufferCfg,
const std::vector< uchar > &  bufferModel = std::vector< uchar >() 
)

Reads a network model stored in Darknet model files.

Parameters
bufferCfgA buffer contains a content of .cfg file with text description of the network architecture.
bufferModelA buffer contains a content of .weights file with learned network.
Returns
Net object.
Net cv::dnn::readNetFromDarknet ( const char *  bufferCfg,
size_t  lenCfg,
const char *  bufferModel = NULL,
size_t  lenModel = 0 
)

Reads a network model stored in Darknet model files.

Parameters
bufferCfgA buffer contains a content of .cfg file with text description of the network architecture.
lenCfgNumber of bytes to read from bufferCfg
bufferModelA buffer contains a content of .weights file with learned network.
lenModelNumber of bytes to read from bufferModel
Returns
Net object.
Net cv::dnn::readNetFromModelOptimizer ( const String &  xml,
const String &  bin 
)

Load a network from Intel's Model Optimizer intermediate representation.

Parameters
[in]xmlXML configuration file with network's topology.
[in]binBinary file with trained weights.
Returns
Net object. Networks imported from Intel's Model Optimizer are launched in Intel's Inference Engine backend.
Net cv::dnn::readNetFromONNX ( const String &  onnxFile)

Reads a network model ONNX.

Parameters
onnxFilepath to the .onnx file with text description of the network architecture.
Returns
Network object that ready to do forward, throw an exception in failure cases.
Net cv::dnn::readNetFromTensorflow ( const String &  model,
const String &  config = String() 
)

Reads a network model stored in TensorFlow framework's format.

Parameters
modelpath to the .pb file with binary protobuf description of the network architecture
configpath to the .pbtxt file that contains text graph definition in protobuf format. Resulting Net object is built by text graph using weights from a binary one that let us make it more flexible.
Returns
Net object.
Net cv::dnn::readNetFromTensorflow ( const std::vector< uchar > &  bufferModel,
const std::vector< uchar > &  bufferConfig = std::vector< uchar >() 
)

Reads a network model stored in TensorFlow framework's format.

Parameters
bufferModelbuffer containing the content of the pb file
bufferConfigbuffer containing the content of the pbtxt file
Returns
Net object.
Net cv::dnn::readNetFromTensorflow ( const char *  bufferModel,
size_t  lenModel,
const char *  bufferConfig = NULL,
size_t  lenConfig = 0 
)

Reads a network model stored in TensorFlow framework's format.

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Parameters
bufferModelbuffer containing the content of the pb file
lenModellength of bufferModel
bufferConfigbuffer containing the content of the pbtxt file
lenConfiglength of bufferConfig
Net cv::dnn::readNetFromTorch ( const String &  model,
bool  isBinary = true,
bool  evaluate = true 
)

Reads a network model stored in Torch7 framework's format.

Parameters
modelpath to the file, dumped from Torch by using torch.save() function.
isBinaryspecifies whether the network was serialized in ascii mode or binary.
evaluatespecifies testing phase of network. If true, it's similar to evaluate() method in Torch.
Returns
Net object.
Note
Ascii mode of Torch serializer is more preferable, because binary mode extensively use long type of C language, which has various bit-length on different systems.

The loading file must contain serialized nn.Module object with importing network. Try to eliminate a custom objects from serialazing data to avoid importing errors.

List of supported layers (i.e. object instances derived from Torch nn.Module class):

  • nn.Sequential
  • nn.Parallel
  • nn.Concat
  • nn.Linear
  • nn.SpatialConvolution
  • nn.SpatialMaxPooling, nn.SpatialAveragePooling
  • nn.ReLU, nn.TanH, nn.Sigmoid
  • nn.Reshape
  • nn.SoftMax, nn.LogSoftMax

Also some equivalents of these classes from cunn, cudnn, and fbcunn may be successfully imported.

Mat cv::dnn::readTensorFromONNX ( const String &  path)

Creates blob from .pb file.

Parameters
pathto the .pb file with input tensor.
Returns
Mat.
Mat cv::dnn::readTorchBlob ( const String &  filename,
bool  isBinary = true 
)

Loads blob which was serialized as torch.Tensor object of Torch7 framework.

Warning
This function has the same limitations as readNetFromTorch().
void cv::dnn::shrinkCaffeModel ( const String &  src,
const String &  dst,
const std::vector< String > &  layersTypes = std::vector< String >() 
)

Convert all weights of Caffe network to half precision floating point.

Parameters
srcPath to origin model from Caffe framework contains single precision floating point weights (usually has .caffemodel extension).
dstPath to destination model with updated weights.
layersTypesSet of layers types which parameters will be converted. By default, converts only Convolutional and Fully-Connected layers' weights.
Note
Shrinked model has no origin float32 weights so it can't be used in origin Caffe framework anymore. However the structure of data is taken from NVidia's Caffe fork: https://github.com/NVIDIA/caffe. So the resulting model may be used there.
void cv::dnn::writeTextGraph ( const String &  model,
const String &  output 
)

Create a text representation for a binary network stored in protocol buffer format.

Parameters
[in]modelA path to binary network.
[in]outputA path to output text file to be created.
Note
To reduce output file size, trained weights are not included.