Detectron2 Tutorial
This tutorial covers:
  • Running inference with pre-trained detectron2 models
  • Training a detectron2 model on a custom dataset

Installation

!pip install detectron2@git+https://github.com/facebookresearch/detectron2

Setup

from detectron2.utils.logger import setup_logger
setup_logger()

from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog

Running Pre-trained Models

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")

import cv2
im = cv2.imread("input.jpg")  # any test image, loaded as BGR

predictor = DefaultPredictor(cfg)
outputs = predictor(im)

# Visualize predictions
v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
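The `SCORE_THRESH_TEST` setting above controls which detections survive: only predictions scoring above the threshold are returned in `outputs["instances"]`. A minimal NumPy sketch of that filtering, using toy scores rather than real model output:

```python
import numpy as np

# Toy detections standing in for outputs["instances"].scores / .pred_classes
scores = np.array([0.92, 0.48, 0.73, 0.31])
pred_classes = np.array([0, 17, 0, 56])

# Keep only detections above the configured threshold,
# mirroring cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
keep = scores > 0.5
filtered_scores = scores[keep]
filtered_classes = pred_classes[keep]
print(filtered_scores)   # [0.92 0.73]
print(filtered_classes)  # [0 0]
```

Raising the threshold trades recall for precision; 0.5 is a reasonable default for visualization.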

Training on Custom Dataset

Register a custom dataset (balloon segmentation example). The balloon dataset ships VIA-format annotations, which are converted into detectron2's standard dataset-dict format:

import os, json
import cv2
import numpy as np
from detectron2.structures import BoxMode

def get_balloon_dicts(img_dir):
    # Parse the VIA annotation file shipped with the balloon dataset
    json_file = os.path.join(img_dir, "via_region_data.json")
    with open(json_file) as f:
        imgs_anns = json.load(f)

    dataset_dicts = []
    for idx, v in enumerate(imgs_anns.values()):
        filename = os.path.join(img_dir, v["filename"])
        height, width = cv2.imread(filename).shape[:2]

        # Convert each polygon region into a detectron2 annotation
        objs = []
        for anno in v["regions"].values():
            shape = anno["shape_attributes"]
            px, py = shape["all_points_x"], shape["all_points_y"]
            poly = [coord for xy in zip(px, py) for coord in xy]
            objs.append({
                "bbox": [np.min(px), np.min(py), np.max(px), np.max(py)],
                "bbox_mode": BoxMode.XYXY_ABS,
                "segmentation": [poly],
                "category_id": 0,
            })

        dataset_dicts.append({
            "file_name": filename,
            "image_id": idx,
            "height": height,
            "width": width,
            "annotations": objs,
        })
    return dataset_dicts

# Register both splits; the val split is needed for evaluation later
for d in ["train", "val"]:
    DatasetCatalog.register("balloon_" + d, lambda d=d: get_balloon_dicts("balloon/" + d))
    MetadataCatalog.get("balloon_" + d).set(thing_classes=["balloon"])
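Each record produced by `get_balloon_dicts` follows detectron2's dataset-dict convention, where the `XYXY_ABS` box is simply the extent of the polygon. A standalone sketch with a toy triangle annotation (the file name and image size are hypothetical) makes the conversion concrete:

```python
# Toy polygon vertices for one region, as VIA stores them
all_points_x = [120, 180, 150]
all_points_y = [60, 60, 110]

# The XYXY_ABS bbox is the polygon's bounding extent
bbox = [min(all_points_x), min(all_points_y),
        max(all_points_x), max(all_points_y)]

# Flatten vertices into [x0, y0, x1, y1, ...] as detectron2 expects
segmentation = [c for xy in zip(all_points_x, all_points_y) for c in xy]

record = {
    "file_name": "balloon/train/toy.jpg",  # hypothetical path
    "image_id": 0,
    "height": 200,                         # hypothetical image size
    "width": 300,
    "annotations": [{
        "bbox": bbox,                      # [120, 60, 180, 110]
        "bbox_mode": "XYXY_ABS",           # BoxMode.XYXY_ABS in real code
        "segmentation": [segmentation],
        "category_id": 0,                  # the single "balloon" class
    }],
}
print(record["annotations"][0]["bbox"])  # [120, 60, 180, 110]
```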

Training Configuration

from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.DATASETS.TRAIN = ("balloon_train",)
cfg.DATASETS.TEST = ()  # no evaluation during training
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 300
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # balloon class only

import os
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)  # default "./output"
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
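The solver settings translate into a rough epoch count: each iteration consumes `IMS_PER_BATCH` images, so `MAX_ITER` iterations cover `MAX_ITER × IMS_PER_BATCH` image views. A quick sanity check (assuming the balloon train split holds 61 images, as in the original dataset release):

```python
ims_per_batch = 2   # cfg.SOLVER.IMS_PER_BATCH
max_iter = 300      # cfg.SOLVER.MAX_ITER
dataset_size = 61   # balloon train split size (assumption)

images_seen = max_iter * ims_per_batch
epochs = images_seen / dataset_size
print(images_seen)       # 600
print(round(epochs, 1))  # ~9.8 passes over the data
```

About ten epochs is plenty for fine-tuning on a toy dataset; a larger custom dataset would need a correspondingly larger `MAX_ITER`.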

Evaluation

from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader

import os

# Point the predictor at the weights just trained, not the COCO checkpoint
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
predictor = DefaultPredictor(cfg)

evaluator = COCOEvaluator("balloon_val", output_dir="./output")
val_loader = build_detection_test_loader(cfg, "balloon_val")
print(inference_on_dataset(predictor.model, val_loader, evaluator))
On the balloon validation set this setup typically reaches roughly 70-75 AP for both bounding-box detection and instance segmentation.
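COCOEvaluator matches predictions to ground truth by intersection-over-union, and AP averages over IoU thresholds from 0.5 to 0.95. A minimal IoU computation for two `XYXY` boxes makes the matching criterion concrete:

```python
def box_iou(a, b):
    """Intersection-over-union of two [x0, y0, x1, y1] boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# A prediction shifted 10px from a 100x100 ground-truth box
iou = box_iou([0, 0, 100, 100], [10, 10, 110, 110])
print(iou)  # ~0.68: a match at the 0.5 threshold, a miss at 0.75
```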

Other Model Types

Detectron2 supports various tasks:
  • Keypoint Detection: COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x.yaml
  • Panoptic Segmentation: COCO-PanopticSegmentation/panoptic_fpn_R_101_3x.yaml
Key references: (Szegedy et al., 2015; Redmon et al., 2015; Zhou et al., 2014)

References

  • Redmon, J., Divvala, S., Girshick, R., Farhadi, A. (2015). You only look once: Unified, real-time object detection.
  • Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z. (2015). Rethinking the Inception Architecture for Computer Vision.
  • Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A. (2014). Object detectors emerge in deep scene CNNs.