Computer Vision Course Syllabus

Books

TIF - Foundations of Computer Vision by Antonio Torralba, Phillip Isola and William T. Freeman. Free online. Covers recent deep learning applications, including diffusion models.
BISHOP - Deep Learning: Foundations and Concepts by Christopher M. Bishop and Hugh Bishop. Available to view online from the book’s website.
SZELISKI - Computer Vision: Algorithms and Applications, 2nd Edition by Richard Szeliski. Free to download for personal use. Alternative to TIF for some topics.

Lecture	Topic	Description
1	Introduction	Computer vision for agents with egomotion. Prerequisites review: Python, linear algebra, probability theory, camera fundamentals. Reading: TIF Chapter 1
2	Statistical Learning	End-to-end prediction, featurization, fully connected neural architectures, maximum likelihood optimization. Reading: TIF Chapters 9-10, BISHOP Chapters 4-5
3	Dense Neural Networks	Cross-entropy loss, training and regularization of dense layers. Reading: TIF Chapters 12-13, BISHOP Chapter 6
4	CNNs	Spatial feature hierarchies, image classification, ResNets for real-time perception. Reading: TIF Chapter 24, BISHOP Chapter 10
5	Object Detection	YOLO and Faster R-CNN architectures for identifying and locating objects. Reading: TIF Chapter 50
6	Semantic Segmentation	Pixel-level labeling, panoptic segmentation for full scene understanding. Reading: SZELISKI Chapter 6
7	Vision Transformers	Self-attention for global image dependencies, ViT vs CNN trade-offs. Reading: BISHOP Chapter 12, TIF Chapter 26
8	Object Tracking	Video stream processing, handling occlusion, motion blur, appearance changes. Reading: TIF Chapter 5
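
Preview for Lecture 3: the cross-entropy loss can be sketched in a few lines of plain Python. This is a minimal illustration for a single example with raw class scores (logits), not course code; the function names softmax and cross_entropy are chosen here for illustration.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log-probability assigned to the correct class index."""
    return -math.log(softmax(logits)[target])

# The loss is small when the correct class has the highest score
# and large when it has the lowest.
scores = [2.0, 0.5, -1.0]
print(cross_entropy(scores, target=0))
print(cross_entropy(scores, target=2))
```

In practice a deep learning framework computes this loss over batches; the sketch only shows the arithmetic.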

Lecture	Topic	Description
9	Contrastive Learning	Vision-language pretraining, CLIP for relating images and text. Reading: CLIP paper, TIF Chapter 51
10	From Retrieval to Generation	BLIP-2 and LLaVA for image captioning and visual question answering (VQA).
11	Prompted Vision Models	Meta’s Segment Anything Model (SAM) as a worker receiving multimodal prompts from vision-language model (VLM) planners.
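
Preview for Lecture 9: a CLIP-style objective pulls matching image and text embeddings together and pushes mismatched pairs apart. The sketch below is a simplified, plain-Python version of a symmetric contrastive loss. It assumes that pair i of the two lists is the matching pair and uses a fixed temperature of 0.07, whereas CLIP learns its temperature during training. All function names are illustrative.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def clip_style_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss; image i and text i form the positive pair."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j] = cosine similarity of image i and text j, divided by temperature
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
              for i in imgs]
    # Image-to-text: each row should concentrate on its diagonal entry.
    i2t = -sum(log_softmax(logits[k])[k] for k in range(n)) / n
    # Text-to-image: the same computation on the transposed matrix.
    cols = [list(c) for c in zip(*logits)]
    t2i = -sum(log_softmax(cols[k])[k] for k in range(n)) / n
    return (i2t + t2i) / 2

# Toy 2-D embeddings: aligned pairs give a much lower loss than swapped pairs.
matched = clip_style_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = clip_style_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(matched, swapped)
```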

Lecture	Topic	Description
12	Neural Radiance Fields	NeRF for creating 3D scenes from 2D images, volume rendering concepts. Reading: TIF Chapter 45
13	Diffusion Models	Physics-inspired learning, conditional image generation, DALL-E and Stable Diffusion. Reading: TIF Chapters 32, 34
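
Preview for Lecture 12: the volume rendering step in NeRF composites color samples along a camera ray, weighting each sample by its opacity and by how much light survives the samples in front of it. The sketch below is a minimal plain-Python version for one ray. The name composite_ray and the toy densities and colors are illustrative assumptions, and in a real NeRF the densities and colors come from a neural network.

```python
import math

def composite_ray(densities, colors, deltas):
    """Alpha-composite samples along one ray, ordered front to back."""
    transmittance = 1.0          # fraction of light not yet absorbed
    pixel = [0.0, 0.0, 0.0]
    weights = []
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)   # opacity of this segment
        w = transmittance * alpha                # contribution of this sample
        weights.append(w)
        pixel = [p + w * c for p, c in zip(pixel, color)]
        transmittance *= 1.0 - alpha             # light remaining behind it
    return pixel, weights

# Empty space, then a dense green surface, then a dense blue surface behind it.
pixel, weights = composite_ray(
    densities=[0.0, 50.0, 50.0],
    colors=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    deltas=[0.1, 0.1, 0.1],
)
print(pixel, weights)
```

The green surface dominates the pixel because it absorbs almost all of the light before the ray reaches the blue sample.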