chainer.functions.deconvolution_2d
chainer.functions.deconvolution_2d(x, W, b=None, stride=1, pad=0, outsize=None, *, dilate=1, groups=1)
Two dimensional deconvolution function.
This is an implementation of two-dimensional deconvolution. Most deep learning frameworks and papers call this operation transposed convolution. For historical reasons (e.g. the paper Deconvolutional Networks by Zeiler et al.) and for backward compatibility, it is called deconvolution in Chainer.
It takes three variables: the input image x, the filter weight W, and the bias vector b.
Notation: the following symbols denote dimensionalities.
\(n\) is the batch size.
\(c_I\) and \(c_O\) are the number of the input and output channels, respectively.
\(h_I\) and \(w_I\) are the height and width of the input image, respectively.
\(h_K\) and \(w_K\) are the height and width of the filters, respectively.
\(h_P\) and \(w_P\) are the height and width of the spatial padding size, respectively.
Let \((s_Y, s_X)\) be the stride of filter application. Then, the output size \((h_O, w_O)\) is estimated by the following equations:
\[\begin{split}h_O &= s_Y (h_I - 1) + h_K - 2h_P,\\ w_O &= s_X (w_I - 1) + w_K - 2w_P.\end{split}\]
The output of this function can be non-deterministic when it uses cuDNN. If chainer.configuration.config.cudnn_deterministic is True and the cuDNN version is >= v3, it forces cuDNN to use a deterministic algorithm.
Deconvolution links can use a cuDNN feature called autotuning, which selects the most efficient CNN algorithm for fixed-size images and can provide a significant performance boost for fixed neural nets. To enable it, use chainer.using_config('autotune', True).
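The output-size equations earlier in this section can be sketched as a small pure-Python helper. The names deconv_outsize and conv_outsize are hypothetical and not part of Chainer's API; the sketch assumes dilate=1 and uses the values from the example in this section. It also illustrates why an explicit outsize can be useful: with a stride larger than one, several output lengths are consistent with the same input length under the corresponding forward convolution.

```python
def deconv_outsize(size, k, s, p):
    """Output length along one axis: s * (size - 1) + k - 2 * p.

    Hypothetical helper for illustration only; assumes dilate=1.
    """
    return s * (size - 1) + k - 2 * p


def conv_outsize(size, k, s, p):
    """Output length of the corresponding forward convolution (floor division)."""
    return (size + 2 * p - k) // s + 1


# Values taken from the example in this section.
h_o = deconv_outsize(5, k=10, s=5, p=5)
w_o = deconv_outsize(10, k=10, s=5, p=5)
print(h_o, w_o)  # 20 45

# All output heights that a forward convolution would map back to h_I = 5.
candidates = [h for h in range(1, 100) if conv_outsize(h, 10, 5, 5) == 5]
print(candidates)  # [20, 21, 22, 23, 24]
```

The default estimate is the smallest of these candidates. Which of the others Chainer accepts as outsize is not stated here, so treat the list as arithmetic only.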
- Parameters
x (Variable or N-dimensional array) – Input variable of shape \((n, c_I, h_I, w_I)\).
W (Variable or N-dimensional array) – Weight variable of shape \((c_I, c_O, h_K, w_K)\).
b (None or Variable or N-dimensional array) – Bias variable of length \(c_O\) (optional).
stride (int or pair of ints) – Stride of filter applications. stride=s and stride=(s, s) are equivalent.
pad (int or pair of ints) – Spatial padding width for input arrays. pad=p and pad=(p, p) are equivalent.
outsize (None or tuple of ints) – Expected output size of the deconvolution operation. It should be a pair of height and width \((h_O, w_O)\). The default value is None, in which case the output size is estimated from the input size, stride and pad.
dilate (int or pair of ints) – Dilation factor of filter applications. dilate=d and dilate=(d, d) are equivalent.
groups (int) – The number of groups to use for grouped deconvolution. The default is one, where grouped deconvolution is not used.
- Returns
Output variable of shape \((n, c_O, h_O, w_O)\).
- Return type
Variable
Example
>>> import numpy as np
>>> import chainer.functions as F
>>> n = 10
>>> c_i, c_o = 1, 3
>>> h_i, w_i = 5, 10
>>> h_k, w_k = 10, 10
>>> h_p, w_p = 5, 5
>>> x = np.random.uniform(0, 1, (n, c_i, h_i, w_i)).astype(np.float32)
>>> x.shape
(10, 1, 5, 10)
>>> W = np.random.uniform(0, 1, (c_i, c_o, h_k, w_k)).astype(np.float32)
>>> W.shape
(1, 3, 10, 10)
>>> b = np.random.uniform(0, 1, c_o).astype(np.float32)
>>> b.shape
(3,)
>>> s_y, s_x = 5, 5
>>> y = F.deconvolution_2d(x, W, b, stride=(s_y, s_x), pad=(h_p, w_p))
>>> y.shape
(10, 3, 20, 45)
>>> h_o = s_y * (h_i - 1) + h_k - 2 * h_p
>>> w_o = s_x * (w_i - 1) + w_k - 2 * w_p
>>> y.shape == (n, c_o, h_o, w_o)
True
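To make the operation itself concrete, the following is a minimal reference sketch of a single-channel transposed convolution on nested lists. It is not Chainer's implementation: naive_deconv2d is a hypothetical name, and the sketch assumes one input channel, one output channel, no bias, a single stride and pad shared by both axes, and dilate=1. Each input pixel scatters a scaled copy of the filter into the output, and the padding is then cropped from every border.

```python
def naive_deconv2d(x, w, stride=1, pad=0):
    """Single-channel transposed convolution on nested lists (illustrative only)."""
    h_i, w_i = len(x), len(x[0])
    h_k, w_k = len(w), len(w[0])
    # Size before cropping the padding: stride * (size - 1) + kernel size.
    h_full = stride * (h_i - 1) + h_k
    w_full = stride * (w_i - 1) + w_k
    full = [[0.0] * w_full for _ in range(h_full)]
    # Each input pixel adds a scaled copy of the filter to the output.
    for i in range(h_i):
        for j in range(w_i):
            for a in range(h_k):
                for b in range(w_k):
                    full[i * stride + a][j * stride + b] += x[i][j] * w[a][b]
    # Cropping `pad` rows and columns from every border gives the -2 * pad term.
    return [row[pad:w_full - pad] for row in full[pad:h_full - pad]]


x = [[1, 2], [3, 4]]
w = [[1, 1], [1, 1]]
print(naive_deconv2d(x, w, stride=1))
# [[1.0, 3.0, 2.0], [4.0, 10.0, 6.0], [3.0, 7.0, 4.0]]

# Same spatial sizes as the example above: 5x10 input, 10x10 filter.
big = naive_deconv2d([[1.0] * 10 for _ in range(5)],
                     [[1.0] * 10 for _ in range(10)], stride=5, pad=5)
print(len(big), len(big[0]))  # 20 45
```

With stride 2 and the same 2x2 inputs, the filter copies no longer overlap, so the output is a 4x4 grid in which each input value fills its own 2x2 block.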