# AI for Computer Vision

> Course introduction.

<Frame>
  <img src="https://mintcdn.com/aegeanaiinc/ai8d4zdsUnwq_5J6/courses/cv/images/llama-vision.jpeg?fit=max&auto=format&n=ai8d4zdsUnwq_5J6&q=85&s=b2aa027a757ea26f45bae8426a5d192c" alt="AI for Computer Vision" style={{width: '100%', maxHeight: '400px', objectFit: 'cover', borderRadius: '8px'}} width="5824" height="3264" data-path="courses/cv/images/llama-vision.jpeg" />
</Frame>

## What this course is all about

This course projects the vast field of statistical learning with differentiable deep neural architectures onto the computer vision application space. Beginning with the fundamentals of computer vision, the course offers extensive coverage of essential topics such as object detection and semantic segmentation using Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs).

The course builds on these fundamentals to treat computer vision in multimodal and generative settings, enabling applications such as image captioning, visual question answering, and scene generation using state-of-the-art models like Neural Radiance Fields (NeRF) and diffusion models.

By the end of the course, students are well equipped to design and deploy vision systems capable of complex tasks, from tracking and identifying objects in video streams to generating interactive responses based on visual prompts.

## Topics Covered

<CardGroup cols={2}>
  <Card title="CNNs" icon="layer-group" href="/aiml-common/lectures/cnn/cnn-intro/index">
    Convolutional Neural Networks, CNN layers, example architectures, and visual attention.
  </Card>

  <Card title="Sensor Models" icon="camera" href="/aiml-common/lectures/sensor-models/cameras/index">
    Camera models, beam models, and likelihood field models for perception.
  </Card>

  <Card title="Vision-Language Models" icon="comments" href="/aiml-common/lectures/VLM/index">
    CLIP, LLaVA, BLIP-2, and multimodal reasoning for vision and language tasks.
  </Card>

  <Card title="Expectation–Maximization" icon="wand-magic-sparkles" href="/aiml-common/lectures/mixture-of-gaussians/index">
    EM algorithm, VAE introduction, and VAE architectures for image generation.
  </Card>
</CardGroup>

