chainer.links.ResNet50Layers
- class chainer.links.ResNet50Layers(pretrained_model='auto', downsample_fb=False)
A pre-trained CNN model with 50 layers provided by MSRA.
When you specify the path of a pre-trained chainer model serialized as a .npz file in the constructor, this chain model automatically initializes all the parameters with it. This model is useful when you want to extract a semantic feature vector per image, or fine-tune the model on a different dataset. Note that unlike VGG16Layers, it does not automatically download a pre-trained caffemodel. This caffemodel can be downloaded at GitHub.
If you want to manually convert the pre-trained caffemodel to a chainer model that can be specified in the constructor, please use the convert_caffemodel_to_npz
classmethod instead.
ResNet50 has 25,557,096 trainable parameters, which is 43% and 58% fewer than ResNet101 and ResNet152, respectively. On the other hand, the top-5 classification accuracy on the ImageNet dataset drops by only 0.7% and 1.1% compared with ResNet101 and ResNet152, respectively. Therefore, ResNet50 may offer the best balance between accuracy and model size. It is sufficient for many cases, but some advanced models for object detection or semantic segmentation use deeper ResNets as their building blocks, so the deeper variants are provided to make reproduction work easier.
See: K. He et al., Deep Residual Learning for Image Recognition
- Parameters
pretrained_model (str) – the path to the pre-trained chainer model serialized as a .npz file. If this argument is specified as auto, it automatically loads and converts the caffemodel from $CHAINER_DATASET_ROOT/pfnet/chainer/models/ResNet-50-model.caffemodel, where $CHAINER_DATASET_ROOT is set as $HOME/.chainer/dataset unless you specify another value by modifying the environment variable. Note that in this case the converted chainer model is stored in the same directory and is used automatically from the next time on. If this argument is specified as None, the parameters are not initialized from the pre-trained model but with the default initializer used in the original paper, i.e., chainer.initializers.HeNormal(scale=1.0).
downsample_fb (bool) – If this argument is specified as False, it performs downsampling by placing stride 2 on the 1x1 convolutional layers (the original MSRA ResNet). If this argument is specified as True, it performs downsampling by placing stride 2 on the 3x3 convolutional layers (Facebook ResNet).
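As a rough illustration of where the auto mode looks for the caffemodel, the following sketch rebuilds the documented path from the environment. The helper name resolve_caffemodel_path is hypothetical and not part of Chainer; only the directory layout and the default root are taken from the description above.

```python
import os


def resolve_caffemodel_path(environ=None):
    """Return the caffemodel path described for pretrained_model='auto'.

    Hypothetical helper for illustration; it is not a Chainer function.
    """
    env = os.environ if environ is None else environ
    # Documented default root: $HOME/.chainer/dataset
    home = env.get('HOME', os.path.expanduser('~'))
    default_root = os.path.join(home, '.chainer', 'dataset')
    # The environment variable, when set, replaces the default root.
    root = env.get('CHAINER_DATASET_ROOT', default_root)
    return os.path.join(
        root, 'pfnet', 'chainer', 'models', 'ResNet-50-model.caffemodel')


path = resolve_caffemodel_path()
print(path, os.path.exists(path))
```

Checking the printed path before constructing the model can help explain a failure to load in auto mode, since the caffemodel is not downloaded automatically.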
- Variables
available_layers (list of str) – The list of available layer names used by the forward and extract methods.
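The two downsample_fb settings differ only in which convolution of a downsampling block carries stride 2. The sketch below checks, with plain arithmetic, that both placements yield the same spatial size. It assumes the usual bottleneck layout (a 1x1 convolution without padding followed by a 3x3 convolution with padding 1); these details and the function names are illustrative assumptions, not taken from the Chainer source.

```python
def conv_out_size(size, ksize, stride, pad):
    """Spatial output size of a convolution (floor rounding)."""
    return (size + 2 * pad - ksize) // stride + 1


def bottleneck_out_size(size, downsample_fb):
    """Output size after the 1x1 and 3x3 convolutions of a downsampling block.

    Assumed layout: 1x1 conv (pad 0) followed by 3x3 conv (pad 1).
    """
    # downsample_fb=True puts stride 2 on the 3x3 convolution,
    # downsample_fb=False puts it on the 1x1 convolution.
    stride_1x1, stride_3x3 = (1, 2) if downsample_fb else (2, 1)
    size = conv_out_size(size, ksize=1, stride=stride_1x1, pad=0)
    size = conv_out_size(size, ksize=3, stride=stride_3x3, pad=1)
    return size


print(bottleneck_out_size(56, downsample_fb=False))
print(bottleneck_out_size(56, downsample_fb=True))
```

Both calls give the same output size, so the option changes which positions the convolutions read rather than the shape of the feature maps: a strided 1x1 convolution skips every other input position, while a strided 3x3 convolution still covers all of them.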
Methods
- add_param(name, shape=None, dtype=<class 'numpy.float32'>, initializer=None)
Registers a parameter to the link.
- Parameters
name (str) – Name of the parameter. This name is also used as the attribute name.
shape (int or tuple of ints) – Shape of the parameter array. If it is omitted, the parameter variable is left uninitialized.
dtype – Data type of the parameter array.
initializer – If it is not None, the data is initialized with the given initializer. If it is an array, the data is directly initialized by it. If it is callable, it is used as a weight initializer. Note that in these cases, the dtype argument is ignored.
- add_persistent(name, value)
Registers a persistent value to the link.
The registered value is saved and loaded on serialization and deserialization. The value is set to an attribute of the link.
- Parameters
name (str) – Name of the persistent value. This name is also used for the attribute name.
value – Value to be registered.
- addgrads(link)
Accumulates gradient values from the given link.
This method adds each gradient array of the given link to the corresponding gradient array of this link. The accumulation is done even across the host and different devices.
- Parameters
link (Link) – Source link object.
- children()
Returns a generator of all child links.
- Returns
A generator object that generates all child links.
- cleargrads()
Clears all gradient arrays.
This method should be called before the backward computation at every iteration of the optimization.
- classmethod convert_caffemodel_to_npz(path_caffemodel, path_npz, n_layers=50)
Converts a pre-trained caffemodel to a chainer model.
- copy(mode='share')
Copies the link hierarchy to a new one.
The whole hierarchy rooted at this link is copied. There are three copy modes. Please see the documentation for the argument mode below.
The name of the link is reset on the copy, since the copied instance does not belong to the original parent chain (even if one exists).
- Parameters
mode (str) – It should be one of init, copy, or share. init means that the parameter variables under the returned link object are re-initialized by calling their initialize() method, so that all the parameters may have different initial values from the original link. copy means that the link object is deeply copied, so that its parameters are not re-initialized but are also deeply copied. Thus, all parameters have the same initial values but can be changed independently. share means that the link is shallowly copied, so that its parameters’ arrays are shared with the original one. Thus, their values are changed synchronously. The default mode is share.
- Returns
Copied link object.
- Return type
Link
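The difference between the three modes can be sketched with a toy class built on the standard copy module. ToyLink is a hypothetical stand-in that holds one list of weights; it mimics only the sharing semantics described above and is not how Chainer implements copy.

```python
import copy
import random


class ToyLink:
    """Hypothetical stand-in for a link that holds one parameter array."""

    def __init__(self, weights):
        self.weights = weights

    def copy(self, mode='share'):
        if mode not in ('init', 'copy', 'share'):
            raise ValueError('mode must be init, copy, or share')
        if mode == 'share':
            # Shallow copy: a new object that refers to the same weights.
            return copy.copy(self)
        # Deep copy: a new object with its own, independent weights.
        new = copy.deepcopy(self)
        if mode == 'init':
            # Re-initialize, so values differ from the original.
            new.weights = [random.gauss(0.0, 1.0) for _ in new.weights]
        return new


original = ToyLink([0.1, 0.2, 0.3])
shared = original.copy('share')
copied = original.copy('copy')

original.weights[0] = 9.0
print(shared.weights[0], copied.weights[0])
```

After the assignment, the shared copy sees the new value while the deep copy keeps the old one, which mirrors the "changed synchronously" versus "changed independently" wording above.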
- copyparams(link, copy_persistent=True)
Copies all parameters from given link.
This method copies data arrays of all parameters in the hierarchy. The copy is even done across the host and devices. Note that this method does not copy the gradient arrays.
From v5.0.0: this method also copies the persistent values (e.g. the moving statistics of BatchNormalization). If the persistent value is an ndarray, the elements are copied. Otherwise, it is copied using copy.deepcopy(). The old behavior (not copying persistent values) can be reproduced with copy_persistent=False.
- count_params()
Counts the total number of parameters.
This method counts the total number of scalar values included in all the Parameters held by this link and its descendants.
If the link contains uninitialized parameters, this method raises a warning.
- Returns
The total size of parameters (int)
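Counting "scalar values" amounts to summing the number of elements of every parameter array. The sketch below does this for a set of shapes; the function name count_scalars and the example shapes (a 7x7 convolution from 3 to 64 channels with a bias) are illustrative assumptions rather than values read from the model.

```python
from functools import reduce
from operator import mul


def count_scalars(shapes):
    """Sum the number of elements over a collection of array shapes."""
    return sum(reduce(mul, shape, 1) for shape in shapes)


# Hypothetical parameter shapes of a single convolutional link.
shapes = {'W': (64, 3, 7, 7), 'b': (64,)}
print(count_scalars(shapes.values()))
```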
- delete_hook(name)
Unregisters the link hook.
- Parameters
name (str) – The name of the link hook to be unregistered.
- disable_update()
Disables update rules of all parameters under the link hierarchy.
This method sets the enabled flag of the update rule of each parameter variable to False.
- enable_update()
Enables update rules of all parameters under the link hierarchy.
This method sets the enabled flag of the update rule of each parameter variable to True.
- extract(self, images, layers=['pool5'], size=(224, 224))
Extracts all the feature maps of given images.
The difference from directly executing forward is that it accepts images directly as input and automatically transforms them into a proper variable. That is, it can also be seen as a shortcut method that implicitly calls the prepare and forward functions.
Unlike the predict method, this method does not override the chainer.config.train and chainer.config.enable_backprop configuration. If you want to extract features without updating model parameters, you need to manually set the configuration when calling this method as follows:
    # model is an instance of ResNetLayers (50 or 101 or 152 layers)
    with chainer.using_config('train', False):
        with chainer.using_config('enable_backprop', False):
            feature = model.extract([image])
- Parameters
images (iterable of PIL.Image or numpy.ndarray) – Input images.
layers (list of str) – The list of layer names you want to extract.
size (pair of ints) – The resolution of the resized images used as input to the CNN. If this argument is None, the given images are not resized, but they must all have the same resolution.
- Returns
A dictionary in which each key is a layer name and each value is the corresponding feature map variable.
- Return type
Dictionary of chainer.Variable
- forward(self, x, layers=['prob'])
Computes all the feature maps specified by layers.
- Parameters
x (Variable) – Input variable. It should be prepared by the prepare function.
layers (list of str) – The list of layer names you want to extract.
- Returns
A dictionary in which each key is a layer name and each value is the corresponding feature map variable.
- Return type
Dictionary of chainer.Variable
- from_chainerx()
Converts parameter variables and persistent values from ChainerX to NumPy/CuPy devices without any copy.
- init_scope()
Creates an initialization scope.
This method returns a context manager object that enables registration of parameters (and links for Chain) by an assignment. A Parameter object can be automatically registered by assigning it to an attribute under this context manager.
Example
In most cases, the parameter registration is done in the initializer method. Using the init_scope method, we can simply assign a Parameter object to register it to the link.
    import chainer

    class MyLink(chainer.Link):
        def __init__(self):
            super().__init__()
            # Assignments inside this scope register the parameters.
            with self.init_scope():
                self.W = chainer.Parameter(0, (10, 5))
                self.b = chainer.Parameter(0, (5,))
- links(skipself=False)
Returns a generator of all links under the hierarchy.
- Parameters
skipself (bool) – If True, then the generator skips this link and starts with the first child link.
- Returns
A generator object that generates all links.
- namedlinks(skipself=False)
Returns a generator of all (path, link) pairs under the hierarchy.
- Parameters
skipself (bool) – If True, then the generator skips this link and starts with the first child link.
- Returns
A generator object that generates all (path, link) pairs.
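To make the (path, link) traversal and the effect of skipself concrete, here is a toy hierarchy written in plain Python. ToyChain is a hypothetical class, and the slash-separated path format and the visiting order are assumptions made for illustration; check the real output of namedlinks before relying on either.

```python
class ToyChain:
    """Hypothetical stand-in for a chain with named child links."""

    def __init__(self, **children):
        self.children = children

    def namedlinks(self, skipself=False, _prefix=''):
        # Yield this link first unless the caller asked to skip it.
        if not skipself:
            yield (_prefix or '/'), self
        # Then walk each child, extending the path with its name.
        for name, child in self.children.items():
            yield from child.namedlinks(_prefix=_prefix + '/' + name)


model = ToyChain(conv1=ToyChain(), res2=ToyChain(a=ToyChain(), b1=ToyChain()))
print([path for path, _ in model.namedlinks()])
print([path for path, _ in model.namedlinks(skipself=True)])
```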
- namedparams(include_uninit=True)
Returns a generator of all (path, param) pairs under the hierarchy.
- Parameters
include_uninit (bool) – If True, it also generates uninitialized parameters.
- Returns
A generator object that generates all (path, parameter) pairs. The paths are relative from this link.
- params(include_uninit=True)
Returns a generator of all parameters under the link hierarchy.
- Parameters
include_uninit (bool) – If True, it also generates uninitialized parameters.
- Returns
A generator object that generates all parameters.
- predict(images, oversample=True)
Computes all the probabilities of given images.
- Parameters
images (iterable of PIL.Image or numpy.ndarray) – Input images. When you specify a color image as a numpy.ndarray, make sure that the color order is RGB.
oversample (bool) – If True, it averages results across center, corners, and mirrors. Otherwise, it uses only the center.
- Returns
Output that contains the class probabilities of given images.
- Return type
Variable
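Averaging "across center, corners, and mirrors" corresponds to the common ten-crop scheme: five crop positions, each also flipped horizontally. The sketch below lists those views and averages per-view probabilities. The 256 and 224 sizes, the function names, and the use of plain lists are assumptions for illustration and are not taken from the predict implementation.

```python
def oversample_views(height, width, crop_h, crop_w):
    """List (top, left, mirrored) for four corners plus center, with mirrors."""
    tops = (0, height - crop_h)
    lefts = (0, width - crop_w)
    origins = [(top, left) for top in tops for left in lefts]  # four corners
    origins.append(((height - crop_h) // 2, (width - crop_w) // 2))  # center
    # Every crop position is used twice: as is, and mirrored.
    return [(top, left, mirrored)
            for mirrored in (False, True)
            for (top, left) in origins]


def average(prob_rows):
    """Element-wise mean over the per-view probability rows."""
    n = len(prob_rows)
    return [sum(column) / n for column in zip(*prob_rows)]


views = oversample_views(256, 256, 224, 224)
print(len(views))
print(average([[0.25, 0.75], [0.75, 0.25]]))
```

With oversample=False only the center view would be used, which is cheaper but gives up the smoothing that averaging over several views provides.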
- register_persistent(name)
Registers an attribute of a given name as a persistent value.
This is a convenient method to register an existing attribute as a persistent value. If name has been already registered as a parameter, this method removes it from the list of parameter names and re-registers it as a persistent value.
- Parameters
name (str) – Name of the attribute to be registered.
- repeat(n_repeat, mode='init')
Repeats this link multiple times to make a Sequential.
This method returns a Sequential object that contains the same Link repeated multiple times. The mode argument specifies how this link is copied for each repetition.
Example
You can repeat the same link multiple times to create a longer Sequential block like this:
    import chainer
    import chainer.functions as F
    import chainer.links as L

    class ConvBNReLU(chainer.Chain):
        def __init__(self):
            super(ConvBNReLU, self).__init__()
            with self.init_scope():
                self.conv = L.Convolution2D(
                    None, 64, 3, 1, 1, nobias=True)
                self.bn = L.BatchNormalization(64)

        def forward(self, x):
            return F.relu(self.bn(self.conv(x)))

    net = ConvBNReLU().repeat(16, mode='init')
The net object contains 16 blocks, each of which is a ConvBNReLU. Because the mode was init, each block is re-initialized with different parameters. If you give copy to this argument, each block has the same parameter values but a different object ID from the others. If it is share, the blocks are identical not only in their parameters but also in their object IDs because they are shallow-copied, so that when a parameter of one block is changed, the parameters in all the others change as well.
- Parameters
n_repeat (int) – Number of times to repeat.
mode (str) – It should be one of init, copy, or share. init means that the parameters of each repeated element in the returned Sequential will be re-initialized, so that all elements have different initial parameters. copy means that the parameters will not be re-initialized but the object itself will be deep-copied, so that all elements have the same initial parameters but can be changed independently. share means that all the elements that make up the resulting Sequential object are the same object because they are shallow-copied, so that all parameters of the elements are shared with each other.
- serialize(serializer)
Serializes the link object.
- Parameters
serializer (AbstractSerializer) – Serializer object.
- to_chainerx()
Converts parameter variables and persistent values to ChainerX without any copy.
This method does not handle non-registered attributes. If some of such attributes must be copied to ChainerX, the link implementation must override this method to do so.
Returns: self
- to_cpu()
Copies parameter variables and persistent values to CPU.
This method does not handle non-registered attributes. If some of such attributes must be copied to CPU, the link implementation must override Link.to_device() to do so.
Returns: self
- to_device(device)
Copies parameter variables and persistent values to the specified device.
This method does not handle non-registered attributes. If some of such attributes must be copied to the device, the link implementation must override this method to do so.
- Parameters
device – Target device specifier. See get_device() for available values.
Returns: self
- to_gpu(device=None)
Copies parameter variables and persistent values to GPU.
This method does not handle non-registered attributes. If some of such attributes must be copied to GPU, the link implementation must override Link.to_device() to do so.
- Parameters
device – Target device specifier. If omitted, the current device is used.
Returns: self
- zerograds()
Initializes all gradient arrays by zero.
Deprecated since version v1.15: Use the more efficient cleargrads() instead.
Attributes
- available_layers
- device
- functions
- local_link_hooks
Ordered dictionary of registered link hooks.
Contrary to chainer.thread_local.link_hooks, which registers its elements to all functions, link hooks in this property are specific to this link.
- update_enabled
True if at least one parameter has an update rule enabled.
- within_init_scope
True if the current code is inside of an initialization scope.
See init_scope() for the details of the initialization scope.