Upgrade Guide

This guide lists the changes introduced in each release that users should be aware of when migrating from older versions. Most changes are designed not to break existing code; changes that may break it are highlighted with a box.

Chainer v5

ChainerMN Became Part of Chainer

ChainerMN, which enables multi-node distributed deep learning using Chainer, has been merged into Chainer as of v5.

Prior to Chainer v5, ChainerMN was provided as a separate chainermn package. In Chainer v5, ChainerMN is part of Chainer and is installed together with the chainer package. If you are using the chainermn package, make sure to remove it with pip uninstall chainermn before upgrading to Chainer v5 or later.
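Before upgrading, it can help to check whether a standalone chainermn distribution is still present in the environment. The following is a minimal sketch that uses the standard library importlib.metadata module (Python 3.8 or later); the helper name is_distribution_installed is illustrative and not part of Chainer.

```python
from importlib import metadata


def is_distribution_installed(name):
    """Return True if a pip-installed distribution called `name` is present."""
    try:
        metadata.version(name)
    except metadata.PackageNotFoundError:
        return False
    return True


if is_distribution_installed("chainermn"):
    print("Standalone chainermn found; run `pip uninstall chainermn` first.")
else:
    print("No standalone chainermn distribution found.")
```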

For documentation of ChainerMN, see Distributed Deep Learning with ChainerMN.

FunctionNode Classes are Hidden from chainer.functions

Prior to Chainer v5, FunctionNode classes (e.g., chainer.functions.MaxPooling2D) were exposed under chainer.functions. In Chainer v5, these classes are hidden from chainer.functions. Use the equivalent wrapper functions listed in Functions (e.g., chainer.functions.max_pooling_2d()) instead.

Some wrapper functions now provide options to access internal states to avoid directly using FunctionNode classes.

For example, suppose your existing code needs to access MaxPooling2D.indexes to later perform upsampling:

import chainer.functions as F

p = F.MaxPooling2D(2, 2)
h = p.apply((x,))[0]
...
y = F.upsampling_2d(h, p.indexes, ksize=2)

The above code may raise this error in Chainer v5:

AttributeError: module 'chainer.functions' has no attribute 'MaxPooling2D'

You can rewrite the above code using the return_indices option of chainer.functions.max_pooling_2d():

import chainer.functions as F

h, indices = F.max_pooling_2d(x, 2, 2, return_indices=True)
...
y = F.upsampling_2d(h, indices, ksize=2)

Updaters Automatically Call Optimizer.new_epoch

This change should affect only a minority of users (who call new_epoch() while using a trainer, or who implement their own Updater class).

Optimizers provide a new_epoch() method, which can be used to change the optimizer's behavior depending on the current epoch number. Prior to Chainer v5, users were expected to call this method themselves. In Chainer v5, updaters call new_epoch() automatically. If you have been calling new_epoch() manually while using a trainer (or an updater), the method would now be called twice per epoch, so you may need one of the following fixes:

  • Pass auto_new_epoch=False to the constructor of the updater (e.g., StandardUpdater) to stop new_epoch() from being called automatically by the updater.

  • Stop calling new_epoch() manually and let the updater call it.

If you implement your own Updater class, you may need to update your code to automatically call new_epoch() (you can refer to the changes introduced in #4608 to understand how to fix your updater).
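The interaction between automatic and manual calls can be illustrated with a toy model. The sketch below does not use Chainer: CountingOptimizer and ToyUpdater are hypothetical stand-ins that only mimic, in simplified form, an updater calling new_epoch() at each epoch boundary. The only name taken from the text above is the auto_new_epoch flag.

```python
class CountingOptimizer:
    """Hypothetical stand-in for an optimizer; it only tracks the epoch."""

    def __init__(self):
        self.epoch = 0

    def new_epoch(self):
        self.epoch += 1


class ToyUpdater:
    """Hypothetical updater mimicking the v5 behavior in simplified form."""

    def __init__(self, optimizer, iters_per_epoch, auto_new_epoch=True):
        self.optimizer = optimizer
        self.iters_per_epoch = iters_per_epoch
        self.auto_new_epoch = auto_new_epoch
        self.iteration = 0

    def update(self):
        # A real updater would fetch a batch and update parameters here.
        self.iteration += 1
        is_new_epoch = self.iteration % self.iters_per_epoch == 0
        if self.auto_new_epoch and is_new_epoch:
            self.optimizer.new_epoch()
        return is_new_epoch


def run(auto_new_epoch, manual_call, n_epochs=3, iters_per_epoch=4):
    """Run the toy loop and return the epoch count seen by the optimizer."""
    opt = CountingOptimizer()
    updater = ToyUpdater(opt, iters_per_epoch, auto_new_epoch)
    for _ in range(n_epochs * iters_per_epoch):
        if updater.update() and manual_call:
            opt.new_epoch()  # legacy manual call from user code
    return opt.epoch


print(run(auto_new_epoch=True, manual_call=True))    # 6: counted twice per epoch
print(run(auto_new_epoch=False, manual_call=True))   # 3: first fix
print(run(auto_new_epoch=True, manual_call=False))   # 3: second fix
```

In this toy model, either fix restores one new_epoch() call per epoch; which one fits depends on whether other code still relies on the manual call.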

Extending the Backend Namespace

In addition to chainer.backends, we introduced chainer.backend. This subpackage contains utility functions that span several backends. For instance, it includes chainer.backend.get_array_module(), which was previously available only as chainer.backends.cuda.get_array_module(). Both can be used, but the latter will be deprecated.
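Code that has to run on both older and newer releases could resolve a name from the first location that exists. The following is a generic sketch built on the standard library importlib module; import_first is a hypothetical helper, not a Chainer API.

```python
import importlib


def import_first(candidates):
    """Return the first attribute that can be imported.

    `candidates` is a list of (module_name, attribute_name) pairs,
    tried in order.
    """
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module missing in this version; try the next one
        if hasattr(module, attr_name):
            return getattr(module, attr_name)
    raise ImportError(
        "none of the candidates could be imported: %r" % (candidates,))
```

With Chainer installed, calling import_first([("chainer.backend", "get_array_module"), ("chainer.backends.cuda", "get_array_module")]) would prefer the new location and fall back to the older one.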

get_device_from_array Returns Actual Device for Empty Arrays

Prior to Chainer v5, chainer.backends.cuda.get_device_from_array() returned chainer.backends.cuda.DummyDeviceType if the array was empty. In Chainer v5, it returns the actual cupy.cuda.Device object:

>>> x = cupy.array([])
>>> chainer.backends.cuda.get_device_from_array(x)
<CUDA Device 0>

Update of Docker Images

Chainer official Docker images (see Installation for details) are now updated to use CUDA 9.2 and cuDNN 7.

To use these images, you may need to upgrade the NVIDIA driver on your host. See Requirements of nvidia-docker for details.

CuPy v5

Chainer v5 requires CuPy v5 if you need GPU support. Please see the Upgrade Guide for CuPy v5 for details.

Chainer v4

Introduction of Backend Namespace

We introduced the chainer.backends subpackage for future support of backend libraries other than NumPy and CuPy. As part of this change, the chainer.cuda module has moved to chainer.backends.cuda.

This does not break existing code: you can safely continue to use chainer.cuda (e.g., from chainer import cuda), but from chainer.backends import cuda is now the encouraged form.

Namespace Changes for Updaters

chainer.training.StandardUpdater and chainer.training.ParallelUpdater have moved to chainer.training.updaters.StandardUpdater and chainer.training.updaters.ParallelUpdater respectively, to align with the namespace convention of other subpackages. See the discussion in #2982 for more details.

This change does not break existing code: you can safely continue to use updater classes directly under chainer.training, but using chainer.training.updaters is now encouraged.

Namespace Changes for Optimizer Hooks

Optimizer hook functions have moved from chainer.optimizer.* to chainer.optimizer_hooks.*. For example, chainer.optimizer.WeightDecay is now located at chainer.optimizer_hooks.WeightDecay.

If existing code uses hooks directly under chainer.optimizer, a DeprecationWarning is shown. You are now encouraged to use chainer.optimizer_hooks instead.
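To locate remaining uses of deprecated locations during a migration, one option is to record the DeprecationWarning messages that a piece of code emits. The sketch below uses only the standard library warnings module; legacy_hook is a hypothetical stand-in for any deprecated alias and is not a Chainer API.

```python
import warnings


def legacy_hook():
    """Hypothetical stand-in for a deprecated alias such as an old hook path."""
    warnings.warn("use the new location instead", DeprecationWarning)
    return "ok"


def find_deprecations(func):
    """Call `func` and return the DeprecationWarning messages it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # keep default filters from hiding any
        func()
    return [str(w.message) for w in caught
            if issubclass(w.category, DeprecationWarning)]


print(find_deprecations(legacy_hook))
```

Alternatively, running a script with python -W error::DeprecationWarning turns such warnings into exceptions, so the traceback points at the offending line.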

Prohibition of Mixed Use of Arrays on Different Devices in Function Arguments

Argument validation of functions is now stricter: it checks the device consistency of argument variables in order to provide better error messages. Consider the following code:

import numpy as np
import cupy
import chainer
import chainer.functions as F

v1 = chainer.Variable(np.arange(10, dtype=np.float32))      # CPU
v2 = chainer.Variable(cupy.arange(10, dtype=cupy.float32))  # GPU

# The line below raises an exception because the arguments are on different devices.
F.maximum(v1, v2)

Prior to v4, the above code raised an exception like ValueError: object __array__ method not producing an array, which was difficult to understand. In v4, the error message becomes TypeError: incompatible array types are mixed in the forward input (Maximum). This kind of error usually occurs by mistake (for example, forgetting to call to_gpu for some variables).

Attention

Because argument validation is now stricter, calling functions with an intentional mix of NumPy and CuPy arrays as arguments will not work in Chainer v4. Please transfer all arrays to the same device before calling functions.
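A pre-flight check in user code can catch such mixes before a function is called. The sketch below is a simplified illustration and not Chainer's own validation: it compares only the top-level package that defines each array type (for example numpy versus cupy) and does not distinguish between GPU devices. Both helper names are hypothetical.

```python
def array_backends(*arrays):
    """Return the set of top-level packages that define the arrays' types."""
    return {type(a).__module__.split(".")[0] for a in arrays}


def assert_same_backend(*arrays):
    """Raise TypeError if the arrays come from more than one package."""
    backends = array_backends(*arrays)
    if len(backends) > 1:
        raise TypeError("mixed array backends: %s" % sorted(backends))
```

For variables such as v1 and v2 in the example above, the underlying arrays would be passed, e.g. assert_same_backend(v1.data, v2.data).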

References to Function Nodes Not Retained in TimerHook and CupyMemoryProfileHook

To reduce memory consumption, references to function nodes are no longer retained in chainer.function_hooks.CupyMemoryProfileHook and chainer.function_hooks.TimerHook. See the discussion in #4300 for more details.

Attention

Existing code that uses function nodes retained in the call_history attribute of these hooks will not work. The first element of each call_history entry is now the name of the function instead of the function node instance itself. You can define your own function hook if you need to access the function node instances.
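Code that only needs per-function statistics can work with the names directly. The sketch below assumes two-element entries of the form (function name, elapsed time), which is how TimerHook is understood to record calls; entries with a different shape would need different unpacking. The sample data is hypothetical.

```python
from collections import defaultdict


def total_by_function(call_history):
    """Sum the second element of each entry, grouped by function name."""
    totals = defaultdict(float)
    for name, value in call_history:
        totals[name] += value
    return dict(totals)


# Hypothetical entries; real ones would come from a hook's call_history.
sample_history = [
    ("LinearFunction", 0.5),
    ("ReLU", 0.25),
    ("LinearFunction", 0.25),
]
print(total_by_function(sample_history))
```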

Update of Docker Images

Chainer official Docker images (see Installation for details) are now updated to use CUDA 8.0 and cuDNN 6.0. This change was introduced because CUDA 7.5 does not support NVIDIA Pascal GPUs.

To use these images, you may need to upgrade the NVIDIA driver on your host. See Requirements of nvidia-docker for details.

CuPy v4

Chainer v4 requires CuPy v4 if you need GPU support. Please see the Upgrade Guide for CuPy v4 for details.

Chainer v3

Introduction of New-style Functions

This release introduces new-style functions (classes inheriting from FunctionNode) that support double backward (gradient of gradient). See the Release Note for v3.0.0 for the usage of this feature.

Many of the Functions have already been migrated to new-style, although some are still old-style (classes inheriting from Function). We are going to migrate more old-style functions to new-style in upcoming minor releases.

This does not break the existing code. Old-style functions (classes inheriting from Function) are still supported in v3 and future versions of Chainer.

If you are going to write new functions, it is encouraged to use FunctionNode to support double backward.

Attention

Users relying on undocumented function APIs (directly instantiating old-style classes) may experience an error like TypeError: 'SomeFunction' object is not callable after upgrading to v3. Please use the function APIs documented in Functions.

Changed Behavior of matmul Function

The behavior of chainer.functions.matmul() has been changed to behave like the corresponding NumPy function (numpy.matmul()). See the discussion in #2426 for more details.

Attention

The existing code using chainer.functions.matmul() may require modification to work with Chainer v3.

Also note that chainer.functions.batch_matmul() is deprecated as a result of this change. You can rewrite it using chainer.functions.matmul().
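Since chainer.functions.matmul() is described as following numpy.matmul(), NumPy itself can illustrate the semantics. The sketch below uses only NumPy and shows that, for 3-D inputs, matmul multiplies the matrices batch by batch, which is the kind of computation batch_matmul() code would be rewritten to. The shapes are arbitrary examples.

```python
import numpy as np

a = np.arange(2 * 3 * 4, dtype=np.float32).reshape(2, 3, 4)  # two 3x4 matrices
b = np.arange(2 * 4 * 5, dtype=np.float32).reshape(2, 4, 5)  # two 4x5 matrices

# matmul treats the leading axis as a batch axis.
batched = np.matmul(a, b)
print(batched.shape)  # (2, 3, 5)

# The same result computed one batch element at a time.
manual = np.stack([a[i].dot(b[i]) for i in range(a.shape[0])])
print(np.allclose(batched, manual))  # True
```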

Removed use_cudnn Argument in spatial_transformer_grid and spatial_transformer_sampler Functions

The use_cudnn argument has been removed from chainer.functions.spatial_transformer_grid() and chainer.functions.spatial_transformer_sampler(). See the discussion in #2955 for more details.

Attention

Existing code that uses the use_cudnn argument of chainer.functions.spatial_transformer_grid() or chainer.functions.spatial_transformer_sampler() requires modification to work with Chainer v3. Please use the configuration context (e.g., with chainer.using_config('use_cudnn', 'auto'):) to enable or disable the use of cuDNN. See Configuring Chainer for details.

CuPy v2

Chainer v3 requires CuPy v2 if you need GPU support. Please see the Upgrade Guide for CuPy v2 for details.

Chainer v2

See Upgrade Guide from v1 to v2 for the changes introduced in Chainer v2.