chainer.functions.convolution_nd
chainer.functions.convolution_nd(x, W, b=None, stride=1, pad=0, cover_all=False, dilate=1, groups=1)

N-dimensional convolution function.
This is an implementation of N-dimensional convolution, which generalizes the two-dimensional convolution used in ConvNets. It takes three variables: the input x, the filter weight W, and the bias vector b.

Notation: here is a notation for dimensionalities.
\(N\) is the number of spatial dimensions.
\(n\) is the batch size.
\(c_I\) and \(c_O\) are the number of the input and output channels, respectively.
\(d_1, d_2, ..., d_N\) are the size of each axis of the input’s spatial dimensions, respectively.
\(k_1, k_2, ..., k_N\) are the size of each axis of the filters, respectively.
\(l_1, l_2, ..., l_N\) are the size of each axis of the output’s spatial dimensions, respectively.
\(p_1, p_2, ..., p_N\) are the size of each axis of the spatial padding size, respectively.
Then the convolution_nd function computes correlations between filters and patches of size \((k_1, k_2, ..., k_N)\) in x. Note that correlation here is equivalent to the inner product between expanded tensors. Patches are extracted at positions shifted by multiples of stride from the first position \((-p_1, -p_2, ..., -p_N)\) for each spatial axis.

Let \((s_1, s_2, ..., s_N)\) be the stride of filter application. Then, the output size \((l_1, l_2, ..., l_N)\) is determined by the following equations:
\[l_n = (d_n + 2p_n - k_n) / s_n + 1 \ \ (n = 1, ..., N)\]

If the cover_all option is True, the filter will cover all spatial locations. So, if the last stride of the filter does not cover the end of the spatial locations, an additional stride will be applied to the end part of the spatial locations. In this case, the output size is determined by the following equations:

\[l_n = (d_n + 2p_n - k_n + s_n - 1) / s_n + 1 \ \ (n = 1, ..., N)\]
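As a quick arithmetic check of these two formulas (the concrete numbers below mirror the third spatial axis of the Example at the bottom of this page), note that cover_all only changes the result when \(d_n + 2p_n - k_n\) is not divisible by \(s_n\):

# Values taken from the Example below: d3, k3, p3, s3.
d, k, p, s = 50, 10, 5, 6

# Without cover_all: l = (d + 2p - k) / s + 1 (floor division)
l = (d + 2 * p - k) // s + 1
print(l)        # 9

# With cover_all=True: l = (d + 2p - k + s - 1) / s + 1
l_cover = (d + 2 * p - k + s - 1) // s + 1
print(l_cover)  # 10 -- one extra application covers the trailing positions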
- Parameters

  - x (Variable or N-dimensional array) – Input variable of shape \((n, c_I, d_1, d_2, ..., d_N)\).
  - W (Variable or N-dimensional array) – Weight variable of shape \((c_O, c_I, k_1, k_2, ..., k_N)\).
  - b (None or Variable or N-dimensional array) – One-dimensional bias variable with length \(c_O\) (optional).
  - stride (int or tuple of ints) – Stride of filter applications \((s_1, s_2, ..., s_N)\). stride=s is equivalent to (s, s, ..., s).
  - pad (int or tuple of ints) – Spatial padding width for input arrays \((p_1, p_2, ..., p_N)\). pad=p is equivalent to (p, p, ..., p).
  - cover_all (bool) – If True, all spatial locations are convoluted into some output pixels. It may make the output size larger. cover_all needs to be False if you want to use cuDNN.
  - dilate (int or tuple of ints) – Dilation factor of filter applications. dilate=d and dilate=(d, d, ..., d) are equivalent.
  - groups (int) – The number of groups to use grouped convolution. The default is one, where grouped convolution is not used (a shape sketch follows this list).
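To make the groups parameter concrete, here is a minimal sketch with invented sizes (the split of input channels across groups, with the weight's second axis holding c_I / groups channels, is the usual grouped-convolution convention and is assumed here):

import numpy as np
import chainer.functions as F

# Invented illustrative sizes, not taken from the documentation above.
n, c_i, c_o, g = 2, 4, 8, 2   # batch, input/output channels, groups
d1, d2, k = 16, 16, 3         # spatial input sizes and kernel size

x = np.random.uniform(-1, 1, (n, c_i, d1, d2)).astype(np.float32)
# With grouped convolution the weight's channel axis is c_i // g, not c_i.
W = np.random.uniform(-1, 1, (c_o, c_i // g, k, k)).astype(np.float32)

y = F.convolution_nd(x, W, groups=g)
print(y.shape)  # (2, 8, 14, 14) since (16 - 3) / 1 + 1 = 14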
- Returns
Output variable of shape \((n, c_O, l_1, l_2, ..., l_N)\).
- Return type

  Variable
Note
This function uses the cuDNN implementation for its forward and backward computation if ALL of the following conditions are satisfied:

- cuda.cudnn_enabled is True
- chainer.config.use_cudnn is 'always' or 'auto'
- The number of spatial dimensions is more than one.
- cover_all is False
- The input's dtype is equal to the filter weight's.
- The dtype is FP16, FP32 or FP64. (FP16 is only available when cuDNN version \(\geq\) v3.)
Convolution links can use a feature of cuDNN called autotuning, which selects the most efficient CNN algorithm for images of fixed size; this can provide a significant performance boost for fixed neural nets. To enable it, use chainer.using_config('autotune', True).
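For instance, both of the relevant configuration switches can be set with the chainer.using_config context manager. The snippet below is a minimal sketch in which x and W are assumed to be arrays already placed on the GPU:

import chainer
import chainer.functions as F

# Sketch only: x and W are assumed to be cupy arrays already on the GPU.
with chainer.using_config('use_cudnn', 'always'), \
        chainer.using_config('autotune', True):
    y = F.convolution_nd(x, W, stride=1, pad=0)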
See also

chainer.links.ConvolutionND
Example
>>> n = 10
>>> c_i, c_o = 3, 1
>>> d1, d2, d3 = 30, 40, 50
>>> k1, k2, k3 = 10, 10, 10
>>> p1, p2, p3 = 5, 5, 5
>>> x = np.random.uniform(0, 1, (n, c_i, d1, d2, d3)).astype(np.float32)
>>> x.shape
(10, 3, 30, 40, 50)
>>> W = np.random.uniform(0, 1, (c_o, c_i, k1, k2, k3)).astype(np.float32)
>>> W.shape
(1, 3, 10, 10, 10)
>>> b = np.random.uniform(0, 1, (c_o,)).astype(np.float32)
>>> b.shape
(1,)
>>> s1, s2, s3 = 2, 4, 6
>>> y = F.convolution_nd(x, W, b, stride=(s1, s2, s3), pad=(p1, p2, p3))
>>> y.shape
(10, 1, 16, 11, 9)
>>> l1 = int((d1 + 2 * p1 - k1) / s1 + 1)
>>> l2 = int((d2 + 2 * p2 - k2) / s2 + 1)
>>> l3 = int((d3 + 2 * p3 - k3) / s3 + 1)
>>> y.shape == (n, c_o, l1, l2, l3)
True
>>> y = F.convolution_nd(x, W, b, stride=(s1, s2, s3), pad=(p1, p2, p3), cover_all=True)
>>> y.shape == (n, c_o, l1, l2, l3 + 1)
True
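Continuing with the same arrays, the effect of dilate can be checked in the same way. This continuation is an illustrative sketch rather than part of the official example; it assumes the standard dilated-convolution arithmetic, in which the effective kernel size per axis is dilate * (k - 1) + 1:

>>> dil = 2
>>> y = F.convolution_nd(x, W, b, stride=(s1, s2, s3), pad=(p1, p2, p3), dilate=dil)
>>> k1_eff = dil * (k1 - 1) + 1
>>> l1_dil = (d1 + 2 * p1 - k1_eff) // s1 + 1
>>> y.shape[2] == l1_dil
True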