Introduction - aegean.ai

This book is a work in progress. Some chapters are incomplete or in draft form. Content is being actively developed and may change.

Welcome to Engineering AI Agents. This comprehensive resource covers the foundations and advanced topics needed to build intelligent autonomous systems.

What you’ll learn

This book bridges multiple disciplines to provide a unified understanding of AI agents:

Foundations

Statistical learning theory, regression, classification, and optimization fundamentals.

Neural Networks

Backpropagation, normalization, regularization, and training techniques.

Perception

CNNs, sensor models, object detection, and segmentation.

LLMs

NLP foundations, transformers, and large language models.

Logic

Logical reasoning and knowledge-based agents.

VLMs

Vision-language models including CLIP, LLaVA, and BLIP-2.

Planning

Task planning, global planning, and local planning for autonomous navigation.

MDPs & RL

Markov decision processes, Bellman equations, and reinforcement learning.

Robotics Systems

Kinematics, state estimation, SLAM, and systems integration.

Physical AI

Vision-Language-Action agents for embodied intelligence.

How to use this book

The content is organized into two tracks that share the Perception chapters:

AI/ML track: Foundations → Neural Networks → Perception → LLMs → Reasoning → VLMs → Planning → MDPs → RL
Robotics track: Perception → Robotics Systems → Physical AI

Each section contains groups of related lectures that build upon each other. We recommend following the sequence within each track, though experienced readers may jump to specific topics. The book contains Python notebooks and code snippets for hands-on experience. The notebooks are available in the GitHub repository, and results from notebook runs are logged in the Weights & Biases workspace.

Table of contents

Part	Section	Topics
Foundations	Learning & Regression	Learning problem, linear regression, empirical risk, SGD
	Maximum Likelihood	Entropy, marginal MLE, Gaussian MLE, conditional MLE
	Classification	Classification intro, perceptron, logistic regression
	Dimensionality Reduction	PCA, PCA workshop, 3D PCA, low-rank Gaussians
Neural Networks	Backpropagation	DNN intro, backprop intro, backprop DNNs, exercises, Fashion MNIST
	Whitening	Whitening, correlation-covariance matrix
	Normalization	Batch normalization, layer normalization
	Regularization	Regularization techniques
	Hyperparameter Optimization	Bayesian optimization, HPO workshop
	Transfer Learning	Introduction, tutorial
Perception	Sensor Models	Camera models, pinhole model, calibration, beam models, likelihood field
	CNNs	CNN intro, layers, architectures, small datasets, visualization, ResNet features
	Scene Understanding	Introduction, detection metrics
	Faster RCNN Lab	RCNN → Fast RCNN → Faster RCNN, 6-notebook PyTorch series
	YOLO Lab	YOLO introduction, 5-notebook PyTorch series
	UNet Lab	UNet architecture, from-scratch notebook
	Mask RCNN Lab	Mask RCNN, TF demos, PyTorch Detectron2
LLMs	NLP Foundations	NLP pipelines, Word2Vec
	Recurrent Neural Networks	Introduction, simple RNN, LSTM
	Language Models	Language models, RNN language model
	Neural Machine Translation	NMT intro, RNN NMT with attention
	Transformers	Introduction, single-head attention, multi-head attention, MLP, inference
	Speech Agents	Text-to-speech and voice cloning
Reasoning	Logical Reasoning	Propositional logic, logical inference, logical agents, applications
	LLM Reasoning	LLM-based reasoning approaches
VLMs	Vision-Language Models	Overview, CLIP, LLaVA, BLIP-2
Planning	Task Planning	PDDL, BlocksWorld, logistics, manufacturing
	Global Planning	Search, forward search, A*
	Local Planning	Motion planning, behavioral planning, prediction
MDPs	Markov Decision Processes	MDP introduction
	Bellman Equations	Expectation backup, optimality backup, policy improvement, recycling robot
	Dynamic Programming	Policy iteration
RL	Reinforcement Learning	Introduction, model-based algorithms
	Prediction	Monte Carlo, temporal difference, TD vs MC
	Control	Generalized policy iteration, greedy MC, SARSA, gridworld
	Policy-Based	REINFORCE
Robotics Systems	Kinematics & Dynamics	Configuration space, homogeneous coordinates, motion representations, wheeled robots
	State Estimation	Recursive estimation, discrete Bayesian filter, Kalman filters, HMM localization
	SLAM	Occupancy mapping, simultaneous localization and mapping
	Systems Integration	Gazebo simulation, ROS applications, Sim2Real, imitation learning
Physical AI	VLA Models	Vision-Language-Action agents
