The segmentation approach described in this section is Mask R-CNN (He et al., 2017). Mask R-CNN is an extension of Faster R-CNN that adds a mask head to the detector. The mask head is a small CNN that takes per-region features, pooled from the backbone feature map using the bounding box coordinates of each detected object, and outputs a mask for that object: a binary image in which the pixels belonging to the object are marked as 1 and the rest as 0.

In the object detection section we saw that R-CNN simply cropped proposals, generated externally to the detector, from the input image and classified each crop. Since the proposals typically overlapped, the per-proposal CNN feature extraction was largely redundant and the detector was very slow. Fast R-CNN improved on this by passing the whole input image once through a CNN feature extractor and pooling the features of each proposal from the resulting shared feature map, thereby avoiding feature extraction per proposal. Faster R-CNN then removed the external dependency on proposal generation by introducing a Region Proposal Network (RPN) inside the detector. For the RPN to generate proposals, prior (or anchor) boxes were defined uniformly across the input image and the RPN was trained to predict, for each anchor, an objectness score and the offsets by which the anchor must shift and scale to match the ground-truth bounding box (see the sketch below). The code in this section, together with the visualizations, is useful for understanding both Faster R-CNN and the mask head extension that 'colors' the pixels of the detected objects.
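To make the anchor-regression step concrete, here is a minimal sketch, in plain NumPy with a hypothetical function name, of how predicted offsets are applied to an anchor box using the standard Faster R-CNN box parameterization.

```python
import numpy as np

def decode_anchor(anchor_cxcywh, deltas):
    """Apply RPN-predicted offsets (tx, ty, tw, th) to one anchor box.

    Uses the standard Faster R-CNN parameterization:
        x = tx * wa + xa,   y = ty * ha + ya,
        w = wa * exp(tw),   h = ha * exp(th)
    Boxes are in (center_x, center_y, width, height) format.
    """
    xa, ya, wa, ha = anchor_cxcywh
    tx, ty, tw, th = deltas
    x = tx * wa + xa
    y = ty * ha + ya
    w = wa * np.exp(tw)
    h = ha * np.exp(th)
    return np.array([x, y, w, h])

# Example: an anchor centered at (100, 100) with size 64x64 is shifted
# left/up slightly and widened to better fit a ground-truth box.
print(decode_anchor(np.array([100.0, 100.0, 64.0, 64.0]),
                    np.array([0.1, -0.2, 0.3, 0.0])))
```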

Demo

Notebooks

TensorFlow

The four notebooks in this section use Mask R-CNN and are based on Matterport's original implementation; as such they will not work in TF2. For newer versions see the TF Model Garden or the TPU-optimized repo.

PyTorch

There are two main PyTorch implementations of Mask R-CNN: the Detectron2 library, which is oriented towards research projects and offers more flexibility at the cost of a steeper learning curve, and the model shipped as part of the torchvision library, which is simpler to use at the expense of configurability. Key references: (Ren et al., 2015; He et al., 2017; Chen et al., 2018; Peng et al., 2017)
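As a minimal sketch of how the torchvision model can be used for inference, assuming a recent torchvision version (older releases expect pretrained=True instead of the weights argument):

```python
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

# Load a Mask R-CNN with a ResNet-50 FPN backbone pretrained on COCO.
model = maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

# Dummy input: a list of 3xHxW float tensors with values in [0, 1].
images = [torch.rand(3, 480, 640)]

with torch.no_grad():
    outputs = model(images)

# One dict per image with 'boxes' (N x 4), 'labels' (N), 'scores' (N)
# and 'masks' (N x 1 x H x W soft masks that can be thresholded, e.g. at 0.5,
# to obtain the binary per-object masks described above).
pred = outputs[0]
print(pred["boxes"].shape, pred["masks"].shape)
```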

References

  • Chen, L., Zhu, Y., Papandreou, G., Schroff, F., Adam, H. (2018). Encoder-decoder with atrous separable convolution for semantic image segmentation.
  • He, K., Gkioxari, G., Dollár, P., Girshick, R. (2017). Mask R-CNN.
  • Peng, C., Xiao, T., Li, Z., Jiang, Y., Zhang, X., et al. (2017). MegDet: A Large Mini-Batch Object Detector.
  • Ren, S., He, K., Girshick, R., Sun, J. (2015). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.