Computer Vision - aegean.ai

4 modules · 15 lessons · 5+ hours of content

Subscribe to our YouTube channel and explore the complete curriculum below.

1. Course Introduction

How We Understand Scenes

Human Perception and Imaging.

Mathematical Prerequisites

What you need to know before diving into the course material.

2. Convolutional Neural Networks

Convolution and Correlation

A linear operation for extracting spatial features.

CNN Architectures

Looking inside a CNN layer and understanding architectural patterns.

Image Classification

Image classification with data augmentation.

What CNNs Learn

Visualizing the features learned by CNNs.

ResNets

Residual Networks and skip connections.

3. Object Detection

Introduction to Object Detection

Object detection in a physical security application.

Computer Vision Datasets

What types of annotations are used in computer vision?

Region-based Object Detectors

R-CNN, Fast R-CNN, Faster R-CNN.

4. Vision-Language Models

Introduction to Transformers

The transformer architecture and the simple attention mechanism.

The Learnable Attention Mechanism

Implementing the scaled dot-product self-attention mechanism.

Multi-Head Self Attention

Using multiple attention heads to capture different aspects of input sequences.

Course Information

CS681: Deep Learning for Computer Vision, NJIT (Spring 2026)

View Full Syllabus

See the complete course syllabus, including assignments and schedule.

