chainer.functions.spatial_pyramid_pooling_2d

chainer.functions.spatial_pyramid_pooling_2d(x, pyramid_height, pooling=None)

Spatial pyramid pooling function.

It outputs a fixed-length vector regardless of input feature map size.

It applies pooling operations to the 4D input array x with different kernel sizes and padding sizes, flattens every dimension except the first of each pooling result, and finally concatenates the results along the second dimension.
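The flatten-and-concatenate step described above can be sketched with NumPy alone. This is an illustration, not the library's implementation: the helper name `flatten_and_concat` is hypothetical, the pooled results are stand-in zero arrays, and the sketch assumes that level i yields a 2^i x 2^i grid of bins per channel.

```python
import numpy as np


def flatten_and_concat(pooled_levels):
    """Flatten each pooled result to (batchsize, -1, 1, 1) and join on axis 1.

    Hypothetical helper for illustration only.
    """
    flat = [y.reshape(y.shape[0], -1, 1, 1) for y in pooled_levels]
    return np.concatenate(flat, axis=1)


batchsize, channels, pyramid_height = 2, 3, 3

# Stand-ins for the per-level pooling results (assumed 2**i x 2**i bins).
pooled_levels = [
    np.zeros((batchsize, channels, 2 ** i, 2 ** i), dtype=np.float32)
    for i in range(pyramid_height)
]

out = flatten_and_concat(pooled_levels)
print(out.shape)  # (2, 63, 1, 1), since 3 * (1 + 4 + 16) = 63
```

The second dimension depends only on the channel count and the number of levels, which is why the output length is fixed regardless of the input feature map size.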

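At the i-th pyramid level, the kernel size $(k_h^{(i)}, k_w^{(i)})$ and padding size $(p_h^{(i)}, p_w^{(i)})$ of the pooling operation are calculated as below:

$$k_h^{(i)} = \lceil b_h / 2^i \rceil, \quad k_w^{(i)} = \lceil b_w / 2^i \rceil, \quad p_h^{(i)} = (2^i k_h^{(i)} - b_h) / 2, \quad p_w^{(i)} = (2^i k_w^{(i)} - b_w) / 2,$$

where $\lceil \cdot \rceil$ denotes the ceiling function, and $b_h, b_w$ are the height and width of the input variable x, respectively. Note that the pyramid level index $i$ is zero-based.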
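As a rough worked example of these formulas, the following pure-Python sketch lists the kernel and padding sizes for each pyramid level. The function name `spp_pooling_params` is hypothetical, and rounding the padding down with integer division is an assumption, since the formula above does not say how a non-integer result is handled.

```python
def spp_pooling_params(b_h, b_w, pyramid_height):
    """Return [((k_h, k_w), (p_h, p_w)), ...] for levels 0..pyramid_height-1.

    Hypothetical helper for illustration only.
    """
    params = []
    for i in range(pyramid_height):
        bins = 2 ** i
        # Ceiling division written with integers: ceil(b / bins).
        k_h = -(-b_h // bins)
        k_w = -(-b_w // bins)
        # Assumption: padding is rounded down when (bins * k - b) is odd.
        p_h = (bins * k_h - b_h) // 2
        p_w = (bins * k_w - b_w) // 2
        params.append(((k_h, k_w), (p_h, p_w)))
    return params


for level, (ksize, pad) in enumerate(spp_pooling_params(13, 13, 3)):
    print(level, ksize, pad)
# 0 (13, 13) (0, 0)
# 1 (7, 7) (0, 0)
# 2 (4, 4) (1, 1)
```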

See details in the paper: Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition.

Parameters
  • x (Variable) – Input variable. The shape of x should be (batchsize, number of channels, height, width).

  • pyramid_height (int) – Number of pyramid levels.

  • pooling (str) – Pooling operation to apply. Currently, only 'max' is supported, which performs a 2D max pooling operation.

Returns

Output variable. The shape of the output variable will be $(batchsize, c \sum_{h=0}^{H-1} 2^{2h}, 1, 1)$, where $c$ is the number of channels of the input variable x and $H$ is the number of pyramid levels.

Return type

Variable
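The length of the second output dimension can be checked by hand with a short sketch. The helper name `spp_output_length` is hypothetical; it simply evaluates the sum from the output shape given under Returns.

```python
def spp_output_length(channels, pyramid_height):
    """Evaluate c * sum_{h=0}^{H-1} 2**(2h). Hypothetical helper."""
    return channels * sum(2 ** (2 * h) for h in range(pyramid_height))


# 256 channels and 3 pyramid levels: 256 * (1 + 4 + 16)
print(spp_output_length(256, 3))  # 5376
```

Because the sum is a geometric series, the same value can also be written as c * (4**H - 1) / 3.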