Sensor Models
Camera models, calibration techniques, and probabilistic sensor models including beam and likelihood field approaches.
CNNs
Convolutional neural networks for image classification, including layer types, architectures, and visualization techniques.
Object Detection
Detecting and localizing objects in images using two-stage and single-stage deep learning detectors.
Object Segmentation
Pixel-wise classification for semantic and instance segmentation using Mask RCNN and UNet.
State Estimation
Recursive Bayesian estimation, Kalman filters, and particle filters for robot localization.
Mapping
Occupancy grid mapping and simultaneous localization and mapping (SLAM).
Sensor models
Sensor models provide the mathematical foundation for understanding how robots perceive their environment through cameras, lidar, and other sensors.
Camera models
Camera fundamentals and image processing for robotics.
Pinhole model
Mathematical representation of the pinhole camera model.
Camera calibration
Practical camera calibration using OpenCV.
Beam models
Probabilistic models for range sensors.
Convolutional neural networks
CNNs are the backbone of modern computer vision systems, enabling image classification, feature extraction, and visual understanding.
CNN introduction
Introduction to convolutional neural networks.
CNN layers
Understanding CNN layer types and operations.
CNN architectures
Example architectures: LeNet, AlexNet, VGG, ResNet.
Feature extraction: ResNet
ResNet as a backbone for downstream vision tasks.
Object detection
Object detection covers scene understanding fundamentals, evaluation metrics, and the evolution from two-stage (RCNN family) to single-stage (YOLO family) detectors.
Scene understanding
Detection vs classification, the detection pipeline, region proposals, FCNs, and the COCO dataset.
Detection metrics
Precision, recall, mAP, and IoU for evaluating detectors.
RCNN
Region-based CNN: selective search, CNN features, SVM classification.
Fast RCNN
Shared convolutional features and ROI pooling, with the classifier and box regressor trained jointly in a single stage.
Faster RCNN
Region Proposal Network enabling fully end-to-end two-stage detection.
Faster RCNN from scratch (PyTorch)
A six-notebook series building every Faster RCNN component from scratch in pure PyTorch, from COCO data loading through end-to-end training and inference.
01 · COCO dataloader
Streaming COCO from Hugging Face, collation, and anchor target assignment.
02 · Backbone
ResNet50 backbone with a feature pyramid network (FPN) and lateral connections.
03 · RPN
Region Proposal Network: anchor generation, objectness head, NMS.
04 · ROI head
ROI Align, two-layer MLP head, and sibling classification and regression predictors.
05 · Training
End-to-end training with AMP and gradient checkpointing on COCO streaming data.
06 · Inference
Checkpoint loading, COCO validation inference, proposal and detection visualization.
YOLO from scratch (PyTorch)
A five-notebook series building YOLOv8-style single-stage detection in pure PyTorch, from data loading through inference and evaluation.
YOLO introduction
Single-stage detection design philosophy, anchor-free heads, and the YOLO architecture family.
01 · COCO dataloader
Streaming COCO, grid target assignment, and mosaic augmentation.
02 · Backbone
CSPDarknet backbone with C2f bottleneck blocks.
03 · Neck and head
PANet feature pyramid neck and decoupled detection head.
04 · Loss and training
Task-aligned assignment, distribution focal loss, and training loop.
05 · Inference and evaluation
NMS post-processing, COCO mAP evaluation, and latency benchmarks.
Object segmentation
Instance and semantic segmentation extend object detection to produce pixel-level masks, enabling fine-grained scene understanding.
Mask RCNN
Extending Faster RCNN with a mask head for instance segmentation.
Mask RCNN · TF demo
Running inference with the TensorFlow Mask RCNN implementation.
Mask RCNN · inspect data
Visualizing COCO data loading, augmentation, and anchor generation.
Mask RCNN · inspect model
Layer-by-layer inspection of model activations and outputs.
Mask RCNN · inspect weights
Visualizing learned filter weights and statistics.
Mask RCNN · PyTorch (Detectron2)
Detectron2 Mask RCNN training and evaluation workflow.
Mask RCNN · torchvision inference
Running pretrained Mask RCNN inference with torchvision.
UNet
Encoder-decoder architecture for semantic segmentation.
State estimation
State estimation enables robots to determine their position using probabilistic models that fuse sensor observations over time.
Recursive state estimation
Foundations of recursive Bayesian estimation.
Discrete Bayesian filter
Exact Bayesian filtering over discrete state spaces.
Kalman filters
Linear Gaussian state estimation.
HMM localization
Robot localization using hidden Markov models.
Mapping
Mapping algorithms build spatial representations of the environment that robots use for navigation and planning.
Occupancy mapping
Building occupancy grid maps from sensor data.
SLAM
Simultaneous localization and mapping.

