roi_pool
- paddle.fluid.layers.nn.roi_pool(input, rois, pooled_height=1, pooled_width=1, spatial_scale=1.0, rois_num=None, name=None)
-
This operator implements the roi_pooling layer. Region of interest pooling (also known as RoI pooling) performs max pooling on inputs of nonuniform sizes to obtain fixed-size feature maps (e.g. 7*7).
The operator has three steps:
1. Dividing each region proposal into pooled_height * pooled_width equal-sized sections;
2. Finding the largest value in each section;
3. Copying these max values to the output buffer.
For more information, please refer to https://stackoverflow.com/questions/43430056/what-is-roi-layer-in-fast-rcnn
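The three steps above can be sketched in plain NumPy. This is an illustrative reference, not Paddle's kernel: the function name `roi_pool_reference`, the explicit `batch_ids` argument, the round-half-up coordinate rounding, and the floor/ceil bin boundaries are assumptions that follow the common Fast R-CNN convention.

```python
import numpy as np

def roi_pool_reference(feature, rois, batch_ids,
                       pooled_height=1, pooled_width=1, spatial_scale=1.0):
    """Hypothetical NumPy sketch of RoI max pooling (not the Paddle kernel)."""
    _, channels, height, width = feature.shape
    num_rois = rois.shape[0]
    out = np.zeros((num_rois, channels, pooled_height, pooled_width),
                   dtype=feature.dtype)
    for n in range(num_rois):
        # Scale the box to feature-map coordinates and round half up (assumed).
        x1, y1, x2, y2 = [int(np.floor(v * spatial_scale + 0.5)) for v in rois[n]]
        roi_h = max(y2 - y1 + 1, 1)
        roi_w = max(x2 - x1 + 1, 1)
        bin_h = roi_h / pooled_height
        bin_w = roi_w / pooled_width
        for ph in range(pooled_height):
            # Step 1: section boundaries, clipped to the feature map.
            hstart = min(max(int(np.floor(ph * bin_h)) + y1, 0), height)
            hend = min(max(int(np.ceil((ph + 1) * bin_h)) + y1, 0), height)
            for pw in range(pooled_width):
                wstart = min(max(int(np.floor(pw * bin_w)) + x1, 0), width)
                wend = min(max(int(np.ceil((pw + 1) * bin_w)) + x1, 0), width)
                if hend <= hstart or wend <= wstart:
                    continue  # empty section: output stays 0
                # Steps 2 and 3: take the per-channel max and store it.
                region = feature[batch_ids[n], :, hstart:hend, wstart:wend]
                out[n, :, ph, pw] = region.max(axis=(1, 2))
    return out

# Same data as the Examples section: one 4x4 image, two RoIs in image 0.
feature = np.arange(1, 17, dtype=np.float32).reshape(1, 1, 4, 4)
rois = np.array([[1., 1., 2., 2.], [1.5, 1.5, 3., 3.]], dtype=np.float32)
out = roi_pool_reference(feature, rois, batch_ids=[0, 0])
print(out.shape)    # (2, 1, 1, 1)
print(out.ravel())  # the two pooled values are 11 and 16
```

With `pooled_height=1` and `pooled_width=1`, each RoI collapses to a single maximum per channel, which is why the sketch reproduces the values shown in the Examples section.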
- Parameters
-
input (Variable) – Input feature, 4D-Tensor with the shape of [N,C,H,W], where N is the batch size, C is the number of input channels, H is the height, and W is the width. The data type is float32 or float64.
rois (Variable) – ROIs (Regions of Interest) to pool over. 2D-LoDTensor with the shape of [num_rois,4], the lod level is 1. Given as [[x1, y1, x2, y2], …], where (x1, y1) are the top-left coordinates and (x2, y2) are the bottom-right coordinates.
pooled_height (int, optional) – The pooled output height, data type is int32. Default: 1
pooled_width (int, optional) – The pooled output width, data type is int32. Default: 1
spatial_scale (float, optional) – Multiplicative spatial scale factor to translate ROI coords from their input scale to the scale used when pooling. Default: 1.0
rois_num (Tensor, optional) – The number of RoIs in each image. Default: None
name (str, optional) – For detailed information, please refer to Name. Usually there is no need to set name; it is None by default.
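As an illustration of spatial_scale, the sketch below maps a box from image coordinates to feature-map coordinates. The stride of 16 (hence spatial_scale = 1/16), the helper name `scale_rois`, the sample box, and the round-half-up rounding are all assumptions made for the example, not values taken from this operator's implementation.

```python
import numpy as np

def scale_rois(rois, spatial_scale):
    """Hypothetical helper: scale boxes, then round half up to integer cells."""
    scaled = np.asarray(rois, dtype=np.float64) * spatial_scale
    return np.floor(scaled + 0.5).astype(np.int64)

# Assumed setup: the feature map is 16x smaller than the input image.
spatial_scale = 1.0 / 16
image_rois = [[32.0, 48.0, 159.0, 207.0]]  # [x1, y1, x2, y2] in image pixels

feature_rois = scale_rois(image_rois, spatial_scale)
x1, y1, x2, y2 = feature_rois[0]
print(x1, y1, x2, y2)  # 2 3 10 13
```

If the RoIs are already expressed in feature-map coordinates, as in the Examples section, spatial_scale stays at its default of 1.0.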
- Returns
-
The pooled feature, 4D-Tensor with the shape of [num_rois, C, pooled_height, pooled_width].
- Return type
-
Variable
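The first output dimension counts RoIs across the whole batch, not images. A minimal sketch of that bookkeeping, assuming rois_num lists the per-image counts in batch order; the counts, the channel number, and the 7*7 pooled size below are hypothetical values chosen for illustration:

```python
import numpy as np

# Hypothetical batch of 2 images: 2 RoIs in image 0, 1 RoI in image 1.
rois_num = np.array([2, 1], dtype=np.int32)

# Image index of each RoI row, in the order the rows appear in `rois`.
batch_ids = np.repeat(np.arange(len(rois_num)), rois_num)
print(batch_ids.tolist())  # [0, 0, 1]

# Assumed layer settings, used only to illustrate the output shape.
channels, pooled_height, pooled_width = 256, 7, 7
output_shape = (int(rois_num.sum()), channels, pooled_height, pooled_width)
print(output_shape)  # (3, 256, 7, 7)
```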
Examples:
    import paddle.fluid as fluid
    import numpy as np
    import paddle

    paddle.enable_static()

    DATATYPE = 'float32'

    place = fluid.CPUPlace()
    # place = fluid.CUDAPlace(0)

    # One 4x4 single-channel image holding the values 1..16.
    input_data = np.array([i for i in range(1, 17)]).reshape(1, 1, 4, 4).astype(DATATYPE)
    # Two RoIs, both belonging to the first (and only) image.
    roi_data = fluid.create_lod_tensor(
        np.array([[1., 1., 2., 2.], [1.5, 1.5, 3., 3.]]).astype(DATATYPE), [[2]], place)
    rois_num_data = np.array([2]).astype('int32')

    x = fluid.data(name='input', shape=[None, 1, 4, 4], dtype=DATATYPE)
    rois = fluid.data(name='roi', shape=[None, 4], dtype=DATATYPE)
    rois_num = fluid.data(name='rois_num', shape=[None], dtype='int32')

    pool_out = fluid.layers.roi_pool(
        input=x,
        rois=rois,
        pooled_height=1,
        pooled_width=1,
        spatial_scale=1.0,
        rois_num=rois_num)

    exe = fluid.Executor(place)
    out, = exe.run(feed={'input': input_data, 'roi': roi_data, 'rois_num': rois_num_data},
                   fetch_list=[pool_out.name])
    print(out)                  # array([[[[11.]]], [[[16.]]]], dtype=float32)
    print(np.array(out).shape)  # (2, 1, 1, 1)