Introduction to Camera Models
Camera models are mathematical representations of how a camera captures the 3D world and projects it onto a 2D image plane. There are several camera models, each with its own assumptions and characteristics. Some of the most common include:

- Pinhole Camera Model: This is the simplest camera model. It assumes that light rays pass through a single point (the pinhole) and project onto the image plane. It is characterized by its focal length and the position of the pinhole.
- Perspective Camera Model: This model extends the pinhole camera model by incorporating lens distortion and other optical effects. It is commonly used in computer vision and graphics to simulate realistic camera behavior.
- Orthographic Camera Model: In this model, parallel lines in the 3D world remain parallel in the 2D image. This is useful for technical drawings and architectural visualizations, where accurate measurements are important.
- Spherical Camera Model: This model represents a 360-degree view of the scene by projecting it onto a sphere around the camera center instead of a flat image plane. It is commonly used in virtual reality and panoramic photography.
- Omnidirectional Camera Model: Similar to the spherical camera model, this model captures a wide field of view (FOV) by using multiple lenses or a fisheye lens. It is often used in robotics and surveillance applications.
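
To make the difference between two of the models above concrete, here is a minimal NumPy sketch that projects the same two 3D points with an idealized perspective (pinhole) projection and with an orthographic projection. The function names, the unit focal length, and the sample points are illustrative assumptions, not part of any particular library.

```python
import numpy as np

def perspective_project(points, f=1.0):
    """Idealized pinhole projection: divide X and Y by depth Z, scale by f."""
    pts = np.asarray(points, dtype=float)
    return f * pts[:, :2] / pts[:, 2:3]

def orthographic_project(points):
    """Orthographic projection: drop the depth coordinate Z entirely."""
    pts = np.asarray(points, dtype=float)
    return pts[:, :2].copy()

# Two points with the same X and Y but different depths (assumed sample data).
points = np.array([[1.0, 1.0, 2.0],
                   [1.0, 1.0, 4.0]])

persp = perspective_project(points)   # the farther point lands closer to the center
ortho = orthographic_project(points)  # both points land on the same image position
print(persp)
print(ortho)
```

Under perspective projection the image position depends on depth, which is why distant objects appear smaller. Under orthographic projection depth has no effect, which is why measurements are preserved.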

Camera Model Fundamentals
Pinhole Camera Model
The functions in this section use a so-called pinhole camera model. The view of a scene is obtained by projecting a scene’s 3D point $P_w$ into the image plane using a perspective transformation, which forms the corresponding pixel $p$. Both $P_w$ and $p$ are represented in homogeneous coordinates, i.e. as 3D and 2D homogeneous vectors respectively. The distortion-free projective transformation given by a pinhole camera model is:

$$s \, p = A \begin{bmatrix} R \mid t \end{bmatrix} P_w$$

where:

- $P_w$ is a 3D point expressed with respect to the world coordinate system
- $p$ is a 2D pixel in the image plane
- $A$ is the camera intrinsic matrix
- $R$ and $t$ are the rotation and translation that describe the change of coordinates from world to camera coordinate systems
- $s$ is the projective transformation’s arbitrary scaling
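
The distortion-free projection above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: the helper name `project_points` is hypothetical, and the intrinsic values (focal length 800 px, principal point at (320, 240)) and the identity pose are made-up sample numbers.

```python
import numpy as np

def project_points(points_world, A, R, t):
    """Project Nx3 world points to Nx2 pixels using s * p = A [R|t] P_w."""
    points_world = np.asarray(points_world, dtype=float)
    # Homogeneous world coordinates, shape (N, 4).
    ones = np.ones((points_world.shape[0], 1))
    P_w = np.hstack([points_world, ones])
    # 3x4 joint rotation-translation matrix [R|t].
    Rt = np.hstack([np.asarray(R, dtype=float), np.reshape(t, (3, 1))])
    # Homogeneous pixel coordinates s * p, one row per point.
    sp = (A @ Rt @ P_w.T).T
    # Divide by the scale s (third component) to get pixel coordinates.
    return sp[:, :2] / sp[:, 2:3]

# Assumed sample intrinsics and an identity camera pose.
A = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.zeros(3)

pixels = project_points([[0.0, 0.0, 2.0], [0.5, -0.25, 2.0]], A, R, t)
print(pixels)  # [[320. 240.] [520. 140.]]
```

A point on the optical axis lands on the principal point, and the scale $s$ that is divided out equals the point's depth in the camera frame.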
Camera Intrinsic Matrix
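The camera intrinsic matrix $A$ projects 3D points given in the camera coordinate system, $P_c = (X_c, Y_c, Z_c)^T$, to 2D pixel coordinates:

$$s \, p = A \, P_c$$

The camera intrinsic matrix is composed of the focal lengths $f_x$ and $f_y$, which are expressed in pixel units, and the principal point $(c_x, c_y)$, which is usually close to the image center:

$$A = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$$

and thus:

$$s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix}$$

Coordinate Transformations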
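The joint rotation-translation matrix $[R \mid t]$ is the matrix product of a projective transformation and a homogeneous transformation. The 3-by-4 projective transformation maps 3D points represented in camera coordinates to 2D points in the image plane, represented in normalized camera coordinates $x' = X_c / Z_c$ and $y' = Y_c / Z_c$. The homogeneous transformation is encoded by the extrinsic parameters $R$ and $t$ and represents the change of basis from the world coordinate system $w$ to the camera coordinate system $c$:

$$P_c = \begin{bmatrix} R & t \\ 0 & 1 \end{bmatrix} P_w$$

This gives us the complete transformation:

$$s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}$$

If $Z_c \neq 0$, this is equivalent to:

$$\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} f_x X_c / Z_c + c_x \\ f_y Y_c / Z_c + c_y \end{bmatrix}, \qquad \begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix} = \begin{bmatrix} R \mid t \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}$$

Lens Distortion Model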
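Real lenses introduce distortions, mostly radial distortion and slight tangential distortion. They are described by the following coefficients, where $r^2 = x'^2 + y'^2$ is the squared radius of a point $(x', y')$ in normalized camera coordinates:
- Radial coefficients: $k_1$, $k_2$, $k_3$, $k_4$, $k_5$, $k_6$
- Tangential coefficients: $p_1$, $p_2$
- Thin prism coefficients: $s_1$, $s_2$, $s_3$, $s_4$
- Barrel distortion: the radial factor $1 + k_1 r^2 + k_2 r^4 + k_3 r^6$ is monotonically decreasing in $r$
- Pincushion distortion: the radial factor $1 + k_1 r^2 + k_2 r^4 + k_3 r^6$ is monotonically increasing in $r$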
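
The coefficients listed above can be applied to a normalized image point as in the sketch below. The text here does not spell out the distortion equations, so the formula is an assumption: it follows the rational radial, tangential, and thin prism form documented for OpenCV's calibration module. The function name `distort_normalized` and the sample values are hypothetical.

```python
import numpy as np

def distort_normalized(x, y, k=(0.0,) * 6, p=(0.0, 0.0), s=(0.0,) * 4):
    """Apply radial, tangential and thin prism distortion to normalized (x, y).

    Assumed model (OpenCV-style rational form); all coefficients default to 0,
    in which case the point is returned unchanged.
    """
    k1, k2, k3, k4, k5, k6 = k
    p1, p2 = p
    s1, s2, s3, s4 = s
    r2 = x * x + y * y
    r4 = r2 * r2
    r6 = r4 * r2
    # Radial scaling: ratio of two polynomials in r^2.
    radial = (1 + k1 * r2 + k2 * r4 + k3 * r6) / (1 + k4 * r2 + k5 * r4 + k6 * r6)
    # Tangential and thin prism terms are added to the radially scaled point.
    x_d = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x) + s1 * r2 + s2 * r4
    y_d = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y + s3 * r2 + s4 * r4
    return x_d, y_d

# Same normalized point under barrel (k1 < 0) and pincushion (k1 > 0) distortion.
x_barrel, _ = distort_normalized(0.5, 0.0, k=(-0.2, 0, 0, 0, 0, 0))
x_pincushion, _ = distort_normalized(0.5, 0.0, k=(0.2, 0, 0, 0, 0, 0))

# Map to pixels with assumed intrinsics f_x = 800, c_x = 320.
fx, cx = 800.0, 320.0
u_undistorted = fx * 0.5 + cx
u_barrel = fx * x_barrel + cx
u_pincushion = fx * x_pincushion + cx
print(u_barrel, u_undistorted, u_pincushion)
```

With a negative $k_1$ the point is pulled toward the image center (barrel), and with a positive $k_1$ it is pushed outward (pincushion).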
Coordinate Systems
Right-handed vs Left-handed
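The right-handed and left-handed coordinate systems are two conventions for defining the orientation of axes in 3D space. In a right-handed system, curling the fingers of the right hand from the +X axis toward the +Y axis makes the thumb point along +Z; a left-handed system is its mirror image.
ROS2 Coordinate System
The table below shows the axes that ROS2 RViz2 displays. It is a right-handed coordinate system, and the right-hand rule determines the direction of the axes.

| Axis | Direction | Color |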
|---|---|---|
| X | Forward | Red |
| Y | Left | Green |
| Z | Up | Blue |
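
One way to check handedness numerically is the scalar triple product: a frame is right-handed when $(X \times Y) \cdot Z > 0$. The sketch below applies this to the axes in the table; the helper name `is_right_handed` and the flipped frame are hypothetical, used only for illustration.

```python
import numpy as np

def is_right_handed(x_axis, y_axis, z_axis):
    """Return True when (x cross y) dot z is positive."""
    return float(np.dot(np.cross(x_axis, y_axis), z_axis)) > 0.0

# Axis directions written in the ROS2 frame itself: forward, left, up.
forward = np.array([1.0, 0.0, 0.0])
left = np.array([0.0, 1.0, 0.0])
up = np.array([0.0, 0.0, 1.0])

ros_frame = is_right_handed(forward, left, up)        # X forward, Y left, Z up
flipped_frame = is_right_handed(forward, -left, up)   # hypothetical: Y points right
print(ros_frame, flipped_frame)  # True False
```

Flipping a single axis, for example making Y point right while keeping X forward and Z up, turns the frame into a left-handed one.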
Sensor Coordinate Systems
Each sensor has its own coordinate system, described in the vendor documentation, which may be right-handed or left-handed. RealSense cameras, for example, use a right-handed coordinate system.
RealSense Camera Coordinate Conventions:
- Point of View: Imagine standing behind the camera, looking forward
- ROS2 Coordinate System: (X: Forward, Y: Left, Z: Up)
- Camera Optical Coordinate System: (X: Right, Y: Down, Z: Forward)
- References: REP-0103, REP-0105
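
The two conventions listed above differ only by a fixed relabeling of axes, so converting a point between them is a constant rotation. The following is a minimal sketch assuming both frames share the same origin; the matrix and function names are hypothetical, and a real system would normally also include the translation between the frames (for example from a TF tree).

```python
import numpy as np

# Columns are the optical axes (X right, Y down, Z forward) expressed in the
# ROS2 frame (X forward, Y left, Z up). Assumes a shared origin.
R_ROS_FROM_OPTICAL = np.array([
    [0.0, 0.0, 1.0],    # x_ros =  z_optical (forward)
    [-1.0, 0.0, 0.0],   # y_ros = -x_optical (left is the opposite of right)
    [0.0, -1.0, 0.0],   # z_ros = -y_optical (up is the opposite of down)
])

def optical_to_ros(points_optical):
    """Convert Nx3 points from the camera optical frame to the ROS2 frame."""
    pts = np.asarray(points_optical, dtype=float)
    return pts @ R_ROS_FROM_OPTICAL.T

def ros_to_optical(points_ros):
    """Inverse conversion; the inverse of a rotation is its transpose."""
    pts = np.asarray(points_ros, dtype=float)
    return pts @ R_ROS_FROM_OPTICAL

# A point 1 m right, 2 m down and 3 m in front of the camera.
p_optical = np.array([[1.0, 2.0, 3.0]])
p_ros = optical_to_ros(p_optical)
print(p_ros)                             # [[ 3. -1. -2.]]
print(ros_to_optical(p_ros))             # [[1. 2. 3.]]
print(np.linalg.det(R_ROS_FROM_OPTICAL)) # close to +1: both frames share handedness
```

The determinant of +1 confirms that the matrix is a proper rotation, which is what one expects when both frames are right-handed. A conversion between a right-handed and a left-handed frame would have determinant -1.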

