This assignment has two graded tasks:
  • Drone object detection — 50 points
  • Kalman filter tracking — 50 points

Overview and learning objectives

Multi-Object Tracking (MOT) is a core visual ability that humans rely on to perform motor tasks and coordinate activities in dynamic environments. The AI community has recognized the importance of MOT through a series of benchmark competitions. In this assignment, the target object class is the drone: you will detect drones in video footage and track them using Kalman filters, situating probabilistic reasoning in the physical-security domain. By completing this assignment, you will:
  • Identify and use a drone-specific object detection dataset.
  • Fine-tune or configure a deep learning detector for the drone class.
  • Implement a Kalman filter to track detections across frames.
  • Visualize 2D trajectories superimposed on video.

Test videos

The following two videos are your primary test inputs. Download them locally before starting.
Use yt-dlp to download them:
Step 1: Install ffmpeg and yt-dlp

brew install ffmpeg yt-dlp
Step 2: Download a video

yt-dlp -f "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]" \
  -o "drone_video_1.mp4" \
  "https://www.youtube.com/watch?v=DhmZ6W1UAv4"
Repeat for the second video.
Step 3: Extract frames

mkdir -p frames
ffmpeg -i drone_video_1.mp4 -vf "fps=5" frames/frame_%04d.jpg
Sampling at 5 fps is a reasonable starting point. Adjust based on drone speed.

Task 1: Drone object detection (50 points)

Dataset

Find a dataset that contains labeled drone bounding boxes. Be careful to distinguish between:
  • Datasets that detect objects from drones (aerial imagery) — not what you want.
  • Datasets that detect the drone itself — what you want.
When searching for candidate datasets, prefer those distributed in Parquet or in standard COCO/YOLO annotation formats for ease of loading.
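If you choose a YOLO-format dataset, each label file stores one normalized box per line as `class cx cy w h`. A minimal converter to pixel-space corner coordinates (the function name is illustrative, not part of any required API):

```python
def yolo_to_xyxy(line, img_w, img_h):
    """Convert one YOLO label line ('cls cx cy w h', values normalized
    to [0, 1]) into (class_id, x1, y1, x2, y2) in pixel coordinates."""
    cls, cx, cy, w, h = line.split()
    cx, cy, w, h = (float(v) for v in (cx, cy, w, h))
    x1 = (cx - w / 2) * img_w   # left edge: center minus half width
    y1 = (cy - h / 2) * img_h   # top edge
    x2 = (cx + w / 2) * img_w   # right edge
    y2 = (cy + h / 2) * img_h   # bottom edge
    return int(cls), x1, y1, x2, y2
```

Having boxes in pixel-space corners makes it easy to feed the box center into the tracker later.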

Detector

You must use a deep learning model. A pretrained detector architecture is a recommended starting point; fine-tuning it on a drone dataset is encouraged but not required.

Deliverable

Split each video into frames and run your detector on every frame. Save all frames that contain at least one detection to a folder called detections/. Write your code so that it processes all .mp4 files in a given directory, not just the two test videos.
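One way to structure this deliverable is to separate the "keep frames with detections" logic from the video I/O, so the selection rule is testable on its own. In this sketch, `detect` is a stand-in for your trained model: any callable mapping a BGR frame to a (possibly empty) list of `(x1, y1, x2, y2)` boxes. All names here are illustrative, not a required API.

```python
from pathlib import Path

def select_detected(frames, detect):
    """Yield (index, frame) for every frame with at least one detection."""
    for i, frame in enumerate(frames):
        if detect(frame):                 # non-empty box list -> keep this frame
            yield i, frame

def process_directory(video_dir, detect, out_dir="detections"):
    """Run `detect` over every .mp4 in video_dir and save hit frames as JPEGs."""
    import cv2                            # OpenCV; imported here so select_detected stays dependency-free

    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    for video in sorted(Path(video_dir).glob("*.mp4")):
        cap = cv2.VideoCapture(str(video))

        def frames(cap=cap):              # lazily decode frames one at a time
            while True:
                ok, frame = cap.read()
                if not ok:
                    return
                yield frame

        for i, frame in select_detected(frames(), detect):
            cv2.imwrite(str(out / f"{video.stem}_{i:04d}.jpg"), frame)
        cap.release()
```

Because `process_directory` globs over `*.mp4`, the same code runs unchanged on any directory of videos, which is exactly the generality the grading rubric asks for.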

Task 2: Kalman filter tracking (50 points)

Use the filterpy library to implement a Kalman filter that tracks the drone across frames. Initialize the filter with detections from Task 1. Your state vector should represent at minimum the 2D pixel position of the drone bounding box center (and optionally its velocity). For each track:
  1. Predict the next state using the Kalman filter motion model.
  2. Update the state using the detector output for that frame.
  3. Handle missing detections — the filter must continue predicting even when the detector misses the drone for a small number of consecutive frames.

Deliverable

Produce one output video per input video. Each output video must contain only the frames where the drone is present and must overlay:
  • The detector bounding box.
  • The 2D trajectory as a polyline connecting the tracker-estimated center positions across frames.
Use ffmpeg and OpenCV to compose the output.
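A possible shape for the overlay step, assuming you have per-frame boxes and the tracker's center history: keep the polyline bookkeeping pure-Python (testable) and do the OpenCV drawing and encoding at the edge. Names and color choices are illustrative:

```python
def trail_points(centers, max_len=50):
    """Round the most recent estimated centers to pixel ints,
    ready to draw as a trajectory polyline."""
    return [(int(round(x)), int(round(y))) for x, y in centers[-max_len:]]

def overlay_video(frames_with_tracks, out_path, fps=5):
    """Write an output video overlaying the detector box and trajectory.

    `frames_with_tracks` yields (frame, box, centers), with frame a BGR
    image, box = (x1, y1, x2, y2), and centers the track-center history.
    """
    import cv2                         # OpenCV; imported here so trail_points stays dependency-free
    import numpy as np
    writer = None
    for frame, box, centers in frames_with_tracks:
        if writer is None:             # open the writer lazily at the first frame's size
            h, w = frame.shape[:2]
            writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
        x1, y1, x2, y2 = (int(v) for v in box)
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)       # detector box
        pts = np.array(trail_points(centers), dtype=np.int32)
        if len(pts) >= 2:
            cv2.polylines(frame, [pts], isClosed=False,
                          color=(0, 0, 255), thickness=2)              # 2D trajectory
        writer.write(frame)
    if writer is not None:
        writer.release()
```

Capping the trail with `max_len` keeps the polyline readable on long videos; if the `mp4v` output does not play everywhere, a final ffmpeg re-encode to H.264 is a common fix.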

Evaluation criteria

| Criterion | Description |
| --- | --- |
| Detection quality | Are detections consistent and semantically correct (drone class, not background)? |
| Tracker correctness | Does the Kalman filter correctly predict and update across frames? |
| Trajectory visualization | Is the 2D trajectory clearly superimposed on the output video? |
| Code generality | Does the pipeline process any directory of .mp4 files, not just the test videos? |
| Report clarity | Can you explain your detector choice, filter design, and failure cases? |

Deliverables

  1. A Hugging Face dataset containing the detections/ sample frames (Parquet format).
  2. Output tracking videos (one per test input) uploaded to your personal YouTube channel and embedded in your README.
  3. A README.md in your submission repository covering:
    • Dataset choice and detector configuration.
    • Kalman filter state design and noise parameters.
    • Failure cases and how the tracker handles missed detections.