chainer.functions.dilated_convolution_2d
chainer.functions.dilated_convolution_2d(x, W, b=None, stride=1, pad=0, dilate=1, cover_all=False)
Two-dimensional dilated convolution function.
This is an implementation of two-dimensional dilated convolution in ConvNets. It takes three variables: the input image x, the filter weight W, and the bias vector b.
Note
You can also perform dilated convolution by passing the dilate argument to chainer.functions.convolution_2d. The functionality is the same.
Notation: the following symbols denote dimensionalities.
\(n\) is the batch size.
\(c_I\) and \(c_O\) are the numbers of input and output channels, respectively.
\(h\) and \(w\) are the height and width of the input image, respectively.
\(k_H\) and \(k_W\) are the height and width of the filters, respectively.
- Parameters
x (Variable or N-dimensional array) – Input variable of shape \((n, c_I, h, w)\).
W (Variable or N-dimensional array) – Weight variable of shape \((c_O, c_I, k_H, k_W)\).
b (Variable or N-dimensional array) – Bias variable of length \(c_O\) (optional).
stride (int or pair of ints) – Stride of filter applications. stride=s and stride=(s, s) are equivalent.
pad (int or pair of ints) – Spatial padding width for input arrays. pad=p and pad=(p, p) are equivalent.
dilate (int or pair of ints) – Dilation factor of filter applications. dilate=d and dilate=(d, d) are equivalent.
cover_all (bool) – If True, all spatial locations are convolved into some output pixels. This may make the output size larger.
- Returns
Output variable.
- Return type
Variable
The two-dimensional dilated convolution function is defined as follows. The DilatedConvolution2D function computes correlations between filters and patches of size \((k_H, k_W)\) in x. Note that correlation here is equivalent to the inner product between expanded vectors. Within each patch, elements are sampled at intervals of the dilation factor, and patches are extracted at positions shifted by multiples of stride from the first position -pad for each spatial axis. The right-most (or bottom-most) patches do not run over the padded spatial size.
Let \((s_Y, s_X)\) be the stride of filter application, \((p_H, p_W)\) the spatial padding size, and \((d_Y, d_X)\) the dilation factor of filter application. Then, the output size \((h_O, w_O)\) is determined by the following equations:
\[\begin{split}h_O &= (h + 2p_H - k_H - (k_H - 1) * (d_Y - 1)) / s_Y + 1,\\ w_O &= (w + 2p_W - k_W - (k_W - 1) * (d_X - 1)) / s_X + 1.\end{split}\]
If the bias vector is given, then it is added to all spatial locations of the output of convolution.
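As a rough illustration of the output-size equations and the patch layout described above, the following plain-Python sketch works along a single spatial axis. The helper names dilated_conv_outsize and patch_indices are hypothetical and are not part of Chainer. The sketch assumes cover_all=False and that the division in the equations is floor division.

```python
def dilated_conv_outsize(size, k, stride=1, pad=0, dilate=1):
    """Output length along one spatial axis (hypothetical helper).

    Assumes cover_all=False and floor division for the "/" in the formula.
    """
    # A dilated filter with k taps spans k + (k - 1) * (dilate - 1) inputs.
    effective_k = k + (k - 1) * (dilate - 1)
    return (size + 2 * pad - effective_k) // stride + 1


def patch_indices(size, k, stride=1, pad=0, dilate=1):
    """Input indices read by each output position along one axis.

    Indices below 0 or greater than or equal to `size` lie in the padding.
    """
    out = dilated_conv_outsize(size, k, stride, pad, dilate)
    patches = []
    for o in range(out):
        # The first patch starts at -pad; later ones shift by `stride`.
        start = o * stride - pad
        # Taps inside a patch are `dilate` positions apart.
        patches.append([start + j * dilate for j in range(k)])
    return patches
```

For example, dilated_conv_outsize(10, 3, dilate=2) evaluates to 6, and patch_indices(7, 3, stride=2, pad=1, dilate=2) evaluates to [[-1, 1, 3], [1, 3, 5], [3, 5, 7]], where the indices -1 and 7 fall in the padded border.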