roi_pool

paddle.fluid.layers.nn.roi_pool(input, rois, pooled_height=1, pooled_width=1, spatial_scale=1.0, rois_num=None, name=None)

This operator implements the RoI pooling layer. Region of interest pooling (RoI pooling) applies max pooling to input regions of nonuniform size to produce fixed-size feature maps (e.g. 7*7).

The operator has three steps:

  1. Dividing each region proposal into a grid of pooled_height by pooled_width equal-sized sections;

  2. Finding the largest value in each section;

  3. Copying these max values to the output buffer.
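The three steps above can be sketched in plain NumPy for a single RoI. This is an illustrative sketch, not the operator's actual kernel: the function name roi_max_pool is hypothetical, and the conventions it uses (round-to-nearest scaling of the coordinates, inclusive bottom-right corner, floor/ceil bin edges, empty bins left at 0) are assumptions borrowed from the common Fast R-CNN style of RoI pooling.

```python
import math
import numpy as np

def roi_max_pool(feature, roi, pooled_height=1, pooled_width=1, spatial_scale=1.0):
    """Hypothetical helper: max-pool one RoI from a (C, H, W) feature map.

    roi is (x1, y1, x2, y2). Rounding and inclusive-corner handling are
    assumptions made for this sketch.
    """
    channels, height, width = feature.shape
    # Scale the RoI to feature-map coordinates and round half up (assumption).
    x1, y1, x2, y2 = (int(math.floor(v * spatial_scale + 0.5)) for v in roi)
    # Treat (x2, y2) as inclusive and force a size of at least 1x1.
    roi_h = max(y2 - y1 + 1, 1)
    roi_w = max(x2 - x1 + 1, 1)
    # Step 1: split the RoI into pooled_height x pooled_width sections.
    bin_h = roi_h / pooled_height
    bin_w = roi_w / pooled_width
    out = np.zeros((channels, pooled_height, pooled_width), dtype=feature.dtype)
    for ph in range(pooled_height):
        h0 = min(max(int(math.floor(ph * bin_h)) + y1, 0), height)
        h1 = min(max(int(math.ceil((ph + 1) * bin_h)) + y1, 0), height)
        for pw in range(pooled_width):
            w0 = min(max(int(math.floor(pw * bin_w)) + x1, 0), width)
            w1 = min(max(int(math.ceil((pw + 1) * bin_w)) + x1, 0), width)
            if h1 > h0 and w1 > w0:
                # Steps 2 and 3: take the per-channel max and store it.
                out[:, ph, pw] = feature[:, h0:h1, w0:w1].max(axis=(1, 2))
    return out

feature = np.arange(1, 17, dtype=np.float32).reshape(1, 4, 4)
print(roi_max_pool(feature, (1.0, 1.0, 2.0, 2.0)))  # max of the window {6, 7, 10, 11}
```

With pooled_height=2 and pooled_width=2, the same function returns one maximum per section instead of a single value for the whole RoI.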

For more information, please refer to https://stackoverflow.com/questions/43430056/what-is-roi-layer-in-fast-rcnn

Parameters
  • input (Variable) – Input feature, 4D-Tensor with the shape of [N,C,H,W], where N is the batch size, C is the number of input channels, H is the height, and W is the width. The data type is float32 or float64.

  • rois (Variable) – ROIs (Regions of Interest) to pool over. 2D-LoDTensor with the shape of [num_rois,4], the lod level is 1. Given as [[x1, y1, x2, y2], …], where (x1, y1) is the top-left coordinate and (x2, y2) is the bottom-right coordinate.

  • pooled_height (int, optional) – The pooled output height, data type is int32. Default: 1

  • pooled_width (int, optional) – The pooled output width, data type is int32. Default: 1

  • spatial_scale (float, optional) – Multiplicative spatial scale factor to translate ROI coords from their input scale to the scale used when pooling. Default: 1.0

  • rois_num (Tensor, optional) – The number of RoIs in each image. Default: None

  • name (str, optional) – For detailed information, please refer to Name. Usually there is no need to set name; it is None by default.
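The roles of spatial_scale and rois_num can be illustrated with a small NumPy sketch. All values here are hypothetical: the RoIs are assumed to be given in image pixels, the feature map is assumed to have a stride of 16 (so spatial_scale = 1/16), and round-to-nearest is assumed for the scaled coordinates. The sketch shows only the bookkeeping, not the operator itself.

```python
import numpy as np

# Hypothetical RoIs in image coordinates, [x1, y1, x2, y2] per row.
rois_image = np.array([[32., 32., 95., 95.],
                       [0., 16., 63., 127.],
                       [16., 0., 47., 31.]], dtype=np.float32)
# Hypothetical rois_num: the first image owns 2 RoIs, the second owns 1.
rois_num = np.array([2, 1], dtype=np.int32)
# Assumed stride of 16 between the image and the feature map.
spatial_scale = 1.0 / 16.0

# Translate RoI coordinates to the feature-map scale (round half up, assumption).
rois_feature = np.floor(rois_image * spatial_scale + 0.5).astype(np.int64)
# Expand rois_num into one batch index per RoI.
batch_index = np.repeat(np.arange(len(rois_num)), rois_num)

print(rois_feature)
print(batch_index)
```

Each row of rois_feature then addresses a window in the feature map of the image selected by the matching entry of batch_index.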

Returns

The pooled feature, 4D-Tensor with the shape of [num_rois, C, pooled_height, pooled_width].

Return type

Variable

Examples:

import numpy as np
import paddle
import paddle.fluid as fluid

# This example uses the static-graph API.
paddle.enable_static()

DATATYPE = 'float32'

place = fluid.CPUPlace()
# place = fluid.CUDAPlace(0)

# One single-channel 4x4 feature map holding the values 1..16.
input_data = np.arange(1, 17).reshape(1, 1, 4, 4).astype(DATATYPE)
# Two RoIs, both belonging to the only image in the batch (LoD [[2]]).
roi_data = fluid.create_lod_tensor(
    np.array([[1., 1., 2., 2.], [1.5, 1.5, 3., 3.]]).astype(DATATYPE), [[2]], place)
rois_num_data = np.array([2]).astype('int32')

x = fluid.data(name='input', shape=[None, 1, 4, 4], dtype=DATATYPE)
rois = fluid.data(name='roi', shape=[None, 4], dtype=DATATYPE)
rois_num = fluid.data(name='rois_num', shape=[None], dtype='int32')

# Pool every RoI down to a single 1x1 value per channel.
pool_out = fluid.layers.roi_pool(
    input=x,
    rois=rois,
    pooled_height=1,
    pooled_width=1,
    spatial_scale=1.0,
    rois_num=rois_num)

exe = fluid.Executor(place)
out, = exe.run(
    feed={'input': input_data, 'roi': roi_data, 'rois_num': rois_num_data},
    fetch_list=[pool_out.name])
print(out)   # array([[[[11.]]], [[[16.]]]], dtype=float32)
print(np.array(out).shape)  # (2, 1, 1, 1)