cntk.ROIPooling

cntk.ROIPooling(
   $operand as cntk.variable,
   $rois as cntk.variable,
   $pooling-type as String,
   $roi-output-shape as cntk.shape,
   $spatial-scale as Number,
   $name as String
) as cntk.function

Summary

The ROI-pooling operation computes a new matrix by selecting the maximum value (max pooling) in the pooling input for each region of interest (ROI). The regions of interest are given as the second input to the operator, as the top-left and bottom-right corners of each region in absolute pixels of the original image. The pooling input is computed per ROI by projecting the coordinates onto the input feature map (the first input to the operator) and considering all overlapping positions. The projection uses the spatial scale, which is the ratio of the input feature map size to the input image size. The spatial scale can be computed by multiplying all strides that occur before the ROI pooling and taking the inverse; for example, a network that has four pooling layers with stride two has a spatial scale of 1/16. The width and height of the output are determined by $roi-output-shape; the output depth (number of filters) is the same as the input depth.
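
The computation described above can be sketched in plain Python. This is only an illustration of the idea, not the CNTK implementation: the function names `spatial_scale` and `roi_max_pool` are hypothetical, the sketch handles a single channel (the same pooling would apply to each channel independently), and the rounding and bin-splitting conventions are assumptions that may differ from what CNTK does internally.

```python
def spatial_scale(strides):
    """Inverse of the product of all strides applied before ROI pooling."""
    product = 1
    for stride in strides:
        product *= stride
    return 1.0 / product


def roi_max_pool(feature_map, roi, out_w, out_h, scale):
    """Max-pool one ROI of a single-channel feature map (a list of rows).

    roi is (x_min, y_min, x_max, y_max) in absolute pixels of the original image.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    x_min, y_min, x_max, y_max = roi

    # Project image coordinates onto the feature map.
    # Assumed convention: truncate to a cell index, then clamp to the map.
    x0 = min(max(int(x_min * scale), 0), cols - 1)
    y0 = min(max(int(y_min * scale), 0), rows - 1)
    x1 = min(max(int(x_max * scale), x0), cols - 1)
    y1 = min(max(int(y_max * scale), y0), rows - 1)
    w, h = x1 - x0 + 1, y1 - y0 + 1

    pooled = []
    for j in range(out_h):
        # Rows covered by output bin j: floor for the start, ceiling for the end,
        # so neighbouring bins may overlap and no bin is ever empty.
        r_start = y0 + (j * h) // out_h
        r_end = y0 + -(-((j + 1) * h) // out_h)
        row = []
        for i in range(out_w):
            c_start = x0 + (i * w) // out_w
            c_end = x0 + -(-((i + 1) * w) // out_w)
            row.append(max(feature_map[r][c]
                           for r in range(r_start, r_end)
                           for c in range(c_start, c_end)))
        pooled.append(row)
    return pooled


# Four stride-2 layers before ROI pooling give a scale of 1/16.
scale = spatial_scale([2, 2, 2, 2])

# A 4x4 feature map, corresponding to a 64x64 input image at scale 1/16.
fmap = [[4 * r + c for c in range(4)] for r in range(4)]

# Pool the whole image into a 2x2 output.
print(roi_max_pool(fmap, (0, 0, 63, 63), 2, 2, scale))  # [[5, 7], [13, 15]]
```

Note that the ROI is given in image pixels while the pooling itself runs over feature-map cells; the spatial scale is the only link between the two coordinate systems.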

Parameters
$operand The operand of the operation: a convolutional feature map that serves as the input volume.
$rois ROI coordinates as absolute pixel coordinates (x_min, y_min, x_max, y_max).
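$pooling-type The type of pooling to apply. The summary describes max pooling, which selects the maximum value in each pooling region.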
$roi-output-shape Dimensions (width, height) of the ROI output, as a BrainScript vector, e.g. (4:4).
$spatial-scale The scale of the operand relative to the original image size. The default is 1/16, which matches, for example, the AlexNet and VGG16 networks.
$name The name of the function instance in the network.