chainer.functions.depthwise_convolution_2d

chainer.functions.depthwise_convolution_2d(x, W, b=None, stride=1, pad=0)

Two-dimensional depthwise convolution function.

This is an implementation of two-dimensional depthwise convolution. It takes two or three variables: the input image x, the filter weight W, and optionally, the bias vector b.

Notation: the following symbols denote dimensionalities.

$n$ is the batch size.
$c_I$ is the number of input channels.
$c_M$ is the channel multiplier.
$h$ and $w$ are the height and width of the input image, respectively.
$h_O$ and $w_O$ are the height and width of the output image, respectively.
$k_H$ and $k_W$ are the height and width of the filters, respectively.

Parameters

x (Variable or N-dimensional array) – Input variable of shape $(n, c_I, h, w)$.
W (Variable or N-dimensional array) – Weight variable of shape $(c_M, c_I, k_H, k_W)$.
b (Variable or N-dimensional array) – Bias variable of length $c_M * c_I$ (optional).
stride (int or pair of ints) – Stride of filter applications. stride=s and stride=(s, s) are equivalent.
pad (int or pair of ints) – Spatial padding width for input arrays. pad=p and pad=(p, p) are equivalent.

Returns

Output variable. Its shape is $(n, c_I * c_M, h_O, w_O)$.

Return type

Variable

Like Convolution2D, the DepthwiseConvolution2D function computes correlations between filters and patches of size $(k_H, k_W)$ in x. Unlike Convolution2D, however, DepthwiseConvolution2D does not sum the filter responses over input channels; it applies $c_M$ filters to each input channel separately and concatenates the results. For that reason, the output of depthwise convolution has shape $(n, c_I * c_M, h_O, w_O)$, where $c_M$ is called channel_multiplier.

$(h_O, w_O)$ is determined by the same equation as for Convolution2D.
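As a rough illustration of that equation, the sketch below computes the output length along one spatial axis. It assumes the usual floor-division form $(size + 2p - k) / s + 1$ with no dilation; the helper name conv_output_size is hypothetical and not part of Chainer.

```python
def conv_output_size(size, k, s, p):
    """Output length along one spatial axis.

    Assumes the standard formula with floor division and no dilation;
    this helper is illustrative and not a Chainer function.
    """
    return (size + 2 * p - k) // s + 1


# Example: a 32-pixel axis, 3-wide filter, stride 2, padding 1.
h_O = conv_output_size(32, k=3, s=2, p=1)
print(h_O)
```

The same helper applies to the width: pass $w$, $k_W$, and the horizontal stride and padding.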

If the bias vector is given, then it is added to all spatial locations of the output of convolution.
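The description above can be sketched as a naive NumPy reference. This is not Chainer's implementation: the function name depthwise_conv2d_ref is hypothetical, and the output channel order (input channel major, multiplier minor) is an assumption that is consistent with the documented output shape $(n, c_I * c_M, h_O, w_O)$.

```python
import numpy as np


def depthwise_conv2d_ref(x, W, b=None, stride=1, pad=0):
    """Naive depthwise correlation (illustrative sketch, not Chainer code).

    x: (n, c_I, h, w), W: (c_M, c_I, k_H, k_W), b: (c_I * c_M,) or None.
    """
    s_h, s_w = (stride, stride) if isinstance(stride, int) else stride
    p_h, p_w = (pad, pad) if isinstance(pad, int) else pad
    n, c_i, h, w = x.shape
    c_m, c_i_w, k_h, k_w = W.shape
    assert c_i == c_i_w, "x and W must agree on the number of input channels"

    # Zero-pad the two spatial axes only.
    x_p = np.pad(x, ((0, 0), (0, 0), (p_h, p_h), (p_w, p_w)), mode="constant")
    h_o = (h + 2 * p_h - k_h) // s_h + 1
    w_o = (w + 2 * p_w - k_w) // s_w + 1

    y = np.zeros((n, c_i, c_m, h_o, w_o), dtype=x.dtype)
    for i in range(h_o):
        for j in range(w_o):
            top, left = i * s_h, j * s_w
            patch = x_p[:, :, top:top + k_h, left:left + k_w]
            # Sum over the filter window only; the input-channel axis c is
            # kept, so channels are concatenated rather than added up.
            y[:, :, :, i, j] = np.einsum("nchw,mchw->ncm", patch, W)

    # Assumed channel order: input channel major, multiplier minor.
    y = y.reshape(n, c_i * c_m, h_o, w_o)
    if b is not None:
        # The bias is broadcast over the batch and all spatial locations.
        y = y + b.reshape(1, -1, 1, 1)
    return y


x = np.ones((1, 2, 4, 4), dtype=np.float32)
W = np.ones((3, 2, 3, 3), dtype=np.float32)
y = depthwise_conv2d_ref(x, W)
print(y.shape)
```

The explicit loops keep the correspondence with the text visible; a practical implementation would use an im2col or grouped-convolution formulation instead.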

See: L. Sifre. Rigid-motion scattering for image classification