7.2 - Camera Math

As we discussed in the previous lesson, in WebGL the camera is always located at the origin looking down the -Z axis. The programmer’s job is to create a transformation that moves a scene in front of this stationary camera. You will be able to perform more creative camera work if you understand how this is done. This lesson explains the mathematics behind a camera transformation. Let’s review how a camera is defined.

A Camera Definition

A camera is defined by a position and a local coordinate system. We typically call the position of the camera the “eye” position. The camera’s local coordinate system is defined by three orthogonal axes, u, v, and n. If a camera is located at the origin looking down the -Z axis, then u would align with the x axis, v would align with the y axis, and n would align with the z axis. This is summarized as:

u --> x
v --> y
n --> z

We can specify a camera using 12 values which define one global point and three vectors.

eye = (eye_x, eye_y, eye_z)  // the location of the camera
u = <ux, uy, uz>             // vector pointing to the right of the camera
v = <vx, vy, vz>             // vector pointing up from the camera
n = <nx, ny, nz>             // vector pointing backwards; <-n> is forward

The vectors u, v, and n are directions, not positions. They describe which way is right, up, and backwards as seen from the camera, wherever the eye is located.
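As a rough illustration, the 12 values can be stored in a plain object and checked for consistency. This is only a sketch: the names `camera`, `dot`, and `isValidCameraFrame` are hypothetical and not part of the lesson's library, and the unit-length check is an added assumption (the definition above only requires the axes to be orthogonal).

```javascript
// Hypothetical helper: dot product of two 3-component arrays.
function dot(a, b) {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// A camera described by 12 values: one point (eye) and three direction vectors.
const camera = {
  eye: [0, 0, 5], // the location of the camera
  u: [1, 0, 0],   // points to the right of the camera
  v: [0, 1, 0],   // points up from the camera
  n: [0, 0, 1],   // points backwards; -n is forward
};

// Returns true when u, v, and n are mutually perpendicular and
// (an assumption added for this sketch) each has unit length.
function isValidCameraFrame(cam, eps = 1e-6) {
  const { u, v, n } = cam;
  const unitLength = [u, v, n].every((a) => Math.abs(dot(a, a) - 1) < eps);
  const perpendicular =
    Math.abs(dot(u, v)) < eps &&
    Math.abs(dot(v, n)) < eps &&
    Math.abs(dot(n, u)) < eps;
  return unitLength && perpendicular;
}
```

Calling `isValidCameraFrame(camera)` on the default camera above returns true, because its axes coincide with the global x, y, and z axes.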

Moving a Camera to its Default Location and Orientation

Given a camera definition, if we could develop a transformation that moves the camera to the global origin and aligns the camera’s axes with the global axes, then we could apply this transformation to every model in the scene. This would move the scene in front of the camera!

This task is easily accomplished using two separate transformations:

  • First, move the camera to the origin.
  • Second, rotate the camera to align the camera’s local coordinate system axes with the global axes.

In matrix format, we have the following, where the first operation is on the right side of the chained transforms:

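$$
\text{rotateToAlign} \cdot \text{translateToOrigin} \cdot
\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}
=
\begin{bmatrix} x' \\ y' \\ z' \\ w' \end{bmatrix}
\tag{Eq1}
$$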

The translateToOrigin transform is trivial to create because we know the eye location. The transform is:

$$
\text{translateToOrigin} =
\begin{bmatrix}
1 & 0 & 0 & -eye_x \\
0 & 1 & 0 & -eye_y \\
0 & 0 & 1 & -eye_z \\
0 & 0 & 0 & 1
\end{bmatrix}
\tag{Eq2}
$$
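A minimal sketch of Eq2 in code follows. It assumes the matrix is stored as 16 values in column-major order, so the translation occupies elements 12, 13, and 14. The helper name `transformPoint` and the sample eye position are hypothetical, chosen only for illustration.

```javascript
// Builds the Eq2 matrix as a 16-element column-major array.
function translateToOrigin(eye) {
  return new Float32Array([
    1, 0, 0, 0,                  // column 0
    0, 1, 0, 0,                  // column 1
    0, 0, 1, 0,                  // column 2
    -eye[0], -eye[1], -eye[2], 1 // column 3: the translation
  ]);
}

// Multiplies a column-major 4x4 matrix by a 4-component column vector.
function transformPoint(m, p) {
  const out = [0, 0, 0, 0];
  for (let row = 0; row < 4; row++) {
    for (let col = 0; col < 4; col++) {
      out[row] += m[col * 4 + row] * p[col];
    }
  }
  return out;
}

// Applying the transform to the eye itself moves it to the origin.
const eyePoint = [2, 3, 5, 1];
const moved = transformPoint(translateToOrigin(eyePoint), eyePoint);
console.log(moved); // [0, 0, 0, 1]
```

Every other point in the scene shifts by the same amount, which is what moves the scene relative to the camera.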

The rotateToAlign transformation is equally simple. (We will develop this transform below.) The transformation is:

$$
\text{rotateToAlign} =
\begin{bmatrix}
u_x & u_y & u_z & 0 \\
v_x & v_y & v_z & 0 \\
n_x & n_y & n_z & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\tag{Eq3}
$$
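The effect of Eq3 can be checked numerically. The sketch below assumes column-major storage and unit-length axes. The sample axes (a camera turned about the y axis) and the helper name `applyMatrix` are hypothetical.

```javascript
// Builds the Eq3 matrix in column-major order: u, v, and n become the rows.
function rotateToAlign(u, v, n) {
  return [
    u[0], v[0], n[0], 0, // column 0
    u[1], v[1], n[1], 0, // column 1
    u[2], v[2], n[2], 0, // column 2
    0, 0, 0, 1           // column 3
  ];
}

// Multiplies a column-major 4x4 matrix by a 4-component column vector.
function applyMatrix(m, vec) {
  const out = [0, 0, 0, 0];
  for (let row = 0; row < 4; row++) {
    for (let col = 0; col < 4; col++) {
      out[row] += m[col * 4 + row] * vec[col];
    }
  }
  return out;
}

// Sample right-handed frame (assumed values): unit length, mutually perpendicular.
const uAxis = [0.6, 0, -0.8];
const vAxis = [0, 1, 0];
const nAxis = [0.8, 0, 0.6];

const rotation = rotateToAlign(uAxis, vAxis, nAxis);
const mappedU = applyMatrix(rotation, [...uAxis, 0]); // approximately <1, 0, 0>
const mappedN = applyMatrix(rotation, [...nAxis, 0]); // approximately <0, 0, 1>
```

Each output component is the dot product of one camera axis with the input vector, which is why u lands on the x axis and n lands on the z axis.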

Therefore, a transformation that will move a camera to the origin and align the axes is:

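$$
\begin{bmatrix}
u_x & u_y & u_z & 0 \\
v_x & v_y & v_z & 0 \\
n_x & n_y & n_z & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix}
1 & 0 & 0 & -eye_x \\
0 & 1 & 0 & -eye_y \\
0 & 0 & 1 & -eye_z \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}
=
\begin{bmatrix} x' \\ y' \\ z' \\ w' \end{bmatrix}
\tag{Eq4}
$$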

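Multiplying the two matrices in Eq4 combines them into a single camera matrix. The rotation part is unchanged, and the translation column becomes the negated dot product of each camera axis with the eye position:

$$
\begin{bmatrix}
u_x & u_y & u_z & -(u \cdot eye) \\
v_x & v_y & v_z & -(v \cdot eye) \\
n_x & n_y & n_z & -(n \cdot eye) \\
0 & 0 & 0 & 1
\end{bmatrix}
$$

This is the standard camera transformation for 3D computer graphics that use a right-handed coordinate system.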
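The matrix product in Eq4 can also be carried out in code. The following is a sketch under stated assumptions: matrices are 16-element column-major arrays, and `multiply4x4` and `cameraMatrix` are hypothetical helper names rather than functions from the lesson's library.

```javascript
// Column-major 4x4 product: returns a * b.
function multiply4x4(a, b) {
  const out = new Array(16).fill(0);
  for (let col = 0; col < 4; col++) {
    for (let row = 0; row < 4; row++) {
      for (let k = 0; k < 4; k++) {
        out[col * 4 + row] += a[k * 4 + row] * b[col * 4 + k];
      }
    }
  }
  return out;
}

// Builds rotateToAlign * translateToOrigin for a camera given by eye, u, v, n.
function cameraMatrix(eye, u, v, n) {
  const rotate = [
    u[0], v[0], n[0], 0,
    u[1], v[1], n[1], 0,
    u[2], v[2], n[2], 0,
    0, 0, 0, 1
  ];
  const translate = [
    1, 0, 0, 0,
    0, 1, 0, 0,
    0, 0, 1, 0,
    -eye[0], -eye[1], -eye[2], 1
  ];
  // The translation is applied first, so it sits on the right.
  return multiply4x4(rotate, translate);
}

// Sample camera (assumed values): at (1, 2, 3), turned 90 degrees about the y axis.
const view = cameraMatrix([1, 2, 3], [0, 0, -1], [0, 1, 0], [1, 0, 0]);
// Elements 12, 13, 14 hold -(u . eye), -(v . eye), -(n . eye), here 3, -2, -1.
```

Because the result is a single matrix, it can be computed once per frame and applied to every vertex in the scene.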

Deriving the Rotation Transform

Let’s look closer at the rotation matrix that aligns a camera’s axes with the global axes. Remember that the u axis maps to the global x axis, the v axis maps to the global y axis, and the n axis maps to the global z axis. Also remember that a general rotation about an arbitrary axis can place a non-trivial value in each of the nine upper-left 3-by-3 positions of a transformation matrix. Call these nine unknown values f1 through f9. The desired rotation matrix must then satisfy the following three equations:

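$$
\begin{bmatrix}
f_1 & f_2 & f_3 & 0 \\
f_4 & f_5 & f_6 & 0 \\
f_7 & f_8 & f_9 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix} u_x \\ u_y \\ u_z \\ 0 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
$$
Eq4 - u --> x, or <ux, uy, uz> maps to <1, 0, 0>

$$
\begin{bmatrix}
f_1 & f_2 & f_3 & 0 \\
f_4 & f_5 & f_6 & 0 \\
f_7 & f_8 & f_9 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix} v_x \\ v_y \\ v_z \\ 0 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}
$$
Eq5 - v --> y, or <vx, vy, vz> maps to <0, 1, 0>

$$
\begin{bmatrix}
f_1 & f_2 & f_3 & 0 \\
f_4 & f_5 & f_6 & 0 \\
f_7 & f_8 & f_9 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix} n_x \\ n_y \\ n_z \\ 0 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}
$$
Eq6 - n --> z, or <nx, ny, nz> maps to <0, 0, 1>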

We need one transform that makes all three equations true. Because of the way matrix multiplication works, it is OK to combine these three separate equations into a single equation like this:

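$$
\begin{bmatrix}
f_1 & f_2 & f_3 & 0 \\
f_4 & f_5 & f_6 & 0 \\
f_7 & f_8 & f_9 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix}
u_x & v_x & n_x & 0 \\
u_y & v_y & n_y & 0 \\
u_z & v_z & n_z & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\tag{Eq7}
$$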

Notice that the vectors in the three separate equations became the columns of the single matrix. To solve for the rotation matrix, we need to multiply both sides of the equation by the known matrix’s inverse.

Let

$$
M^{-1} =
\begin{bmatrix}
u_x & v_x & n_x & 0 \\
u_y & v_y & n_y & 0 \\
u_z & v_z & n_z & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}^{-1}
\tag{Eq8}
$$

Then,

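$$
\begin{bmatrix}
f_1 & f_2 & f_3 & 0 \\
f_4 & f_5 & f_6 & 0 \\
f_7 & f_8 & f_9 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix}
u_x & v_x & n_x & 0 \\
u_y & v_y & n_y & 0 \\
u_z & v_z & n_z & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot M^{-1}
=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot M^{-1}
\tag{Eq9}
$$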

This reduces to:

$$
\begin{bmatrix}
f_1 & f_2 & f_3 & 0 \\
f_4 & f_5 & f_6 & 0 \\
f_7 & f_8 & f_9 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
= M^{-1}
\tag{Eq10}
$$

The rotation matrix we need to align a camera’s local coordinate system to the global coordinate system is:

$$
\begin{bmatrix}
u_x & v_x & n_x & 0 \\
u_y & v_y & n_y & 0 \\
u_z & v_z & n_z & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}^{-1}
\tag{Eq11}
$$

It is straightforward to show that if the columns of a matrix are unit-length vectors that are orthogonal to each other, the inverse of the matrix is just its transpose. The columns of our matrix are orthogonal because they define a valid right-handed coordinate system where each axis is at a right angle to the other two axes, and we assume each axis has been normalized to unit length. Therefore, the inverse is trivial to obtain: you interchange the rows and columns.

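$$
\begin{bmatrix}
u_x & v_x & n_x & 0 \\
u_y & v_y & n_y & 0 \\
u_z & v_z & n_z & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}^{-1}
=
\begin{bmatrix}
u_x & u_y & u_z & 0 \\
v_x & v_y & v_z & 0 \\
n_x & n_y & n_z & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\tag{Eq12}
$$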
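The transpose shortcut in Eq12 can be verified numerically. The sketch below works only on the upper-left 3-by-3 block, stores matrices as arrays of rows for readability, and uses hypothetical helper names and sample axes. It also includes a counterexample showing that the shortcut depends on the axes having unit length.

```javascript
// Swaps rows and columns of a 3x3 matrix stored as an array of rows.
function transpose3(m) {
  return [0, 1, 2].map((c) => [m[0][c], m[1][c], m[2][c]]);
}

// Returns the 3x3 product a * b.
function multiply3(a, b) {
  return a.map((row) =>
    [0, 1, 2].map((c) => row[0] * b[0][c] + row[1] * b[1][c] + row[2] * b[2][c])
  );
}

// Sample orthonormal frame (assumed values): rows are u, v, n.
const rowsUVN = [
  [0.6, 0, -0.8],
  [0, 1, 0],
  [0.8, 0, 0.6],
];
const columnsUVN = transpose3(rowsUVN); // u, v, n as columns, as in Eq11

// Transpose times original is (approximately) the identity matrix,
// so the transpose acts as the inverse.
const shouldBeIdentity = multiply3(rowsUVN, columnsUVN);

// Counterexample: stretch u to length 2. The axes are still orthogonal,
// but the transpose is no longer the inverse (top-left entry is about 4).
const stretchedColumns = transpose3([
  [1.2, 0, -1.6],
  [0, 1, 0],
  [0.8, 0, 0.6],
]);
const notIdentity = multiply3(transpose3(stretchedColumns), stretchedColumns);
```

Each entry of the product is the dot product of two axes, which is 1 for an axis with itself and 0 for two different axes only when the axes are both unit length and perpendicular.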

lookAt Implementation

Below is a JavaScript implementation of the lookAt function. It first derives the camera axes n, u, and v from the eye, center, and up parameters, and then fills in the matrix using the math we just discussed. Note that the variables V, center, eye, up, u, v, and n are class objects that were created once when the GlMatrix4x4 object was created. These objects are reused on each call to lookAt.

self.lookAt = function (M, eye_x, eye_y, eye_z, center_x, center_y, center_z, up_dx, up_dy, up_dz) {

  // Local coordinate system for the camera:
  //   u maps to the x-axis
  //   v maps to the y-axis
  //   n maps to the z-axis

  V.set(center, center_x, center_y, center_z);
  V.set(eye, eye_x, eye_y, eye_z);
  V.set(up, up_dx, up_dy, up_dz);

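  // n points from the center back toward the eye; the camera looks along -n.
  V.subtract(n, eye, center);  // n = eye - center
  V.normalize(n);

  // u = up x n points to the right of the camera.
  V.crossProduct(u, up, n);
  V.normalize(u);

  // v = n x u is the camera's true up direction, perpendicular to n and u.
  V.crossProduct(v, n, u);
  V.normalize(v);

  // Translation column of the product rotateToAlign * translateToOrigin.
  let tx = - V.dotProduct(u,eye);
  let ty = - V.dotProduct(v,eye);
  let tz = - V.dotProduct(n,eye);

  // Set the camera matrix (column-major: M[12], M[13], M[14] hold the translation)
  M[0] = u[0];  M[4] = u[1];  M[8]  = u[2];  M[12] = tx;
  M[1] = v[0];  M[5] = v[1];  M[9]  = v[2];  M[13] = ty;
  M[2] = n[0];  M[6] = n[1];  M[10] = n[2];  M[14] = tz;
  M[3] = 0;     M[7] = 0;     M[11] = 0;     M[15] = 1;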
};
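For experimenting outside the lesson's library, here is a self-contained sketch of the same computation using plain arrays. The names `lookAtSketch`, `sub`, `dot3`, `cross`, and `normalize` are hypothetical stand-ins for the library's V object, and the function returns a new matrix on each call instead of reusing preallocated objects.

```javascript
// Small vector helpers on 3-component arrays.
const sub = (a, b) => [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
const dot3 = (a, b) => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
const cross = (a, b) => [
  a[1] * b[2] - a[2] * b[1],
  a[2] * b[0] - a[0] * b[2],
  a[0] * b[1] - a[1] * b[0],
];
const normalize = (a) => {
  const len = Math.hypot(a[0], a[1], a[2]);
  return [a[0] / len, a[1] / len, a[2] / len];
};

// Returns a column-major camera matrix.
// Not handled in this sketch: eye equal to center, or up parallel to the
// line of sight. Both give a zero-length vector and therefore NaN values.
function lookAtSketch(eye, center, up) {
  const n = normalize(sub(eye, center)); // backwards
  const u = normalize(cross(up, n));     // right
  const v = cross(n, u);                 // true up (already unit length)
  return new Float32Array([
    u[0], v[0], n[0], 0,
    u[1], v[1], n[1], 0,
    u[2], v[2], n[2], 0,
    -dot3(u, eye), -dot3(v, eye), -dot3(n, eye), 1,
  ]);
}

// Camera 5 units up the +Z axis, looking at the origin, with +Y as up.
const viewMatrix = lookAtSketch([0, 0, 5], [0, 0, 0], [0, 1, 0]);
// The axes are already aligned, so only the translation is non-trivial:
// viewMatrix[14] is -5, which pushes the scene 5 units down the -Z axis.
```

Allocating a new array per call is simpler to read but creates garbage every frame, which is one reason the lesson's version reuses its scratch objects.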

Literal Rendering of a Camera

To summarize, a camera transformation changes the position and orientation of a scene so that it is in front of a stationary camera that is at the origin looking down the -Z axis. This makes the succeeding stages of the graphics pipeline easier to perform. The WebGL program below demonstrates how a camera actually works. Please experiment with the program.

[Interactive WebGL demo: manipulate the parameters of a virtual camera created by lookAt(). The left canvas shows the scene as seen by a stationary camera. The right canvas shows the scene from the camera's vantage point.]

lookAt(M, eye_x, eye_y, eye_z, center_x, center_y, center_z, up_dx, up_dy, up_dz)

Default values and slider ranges:

  • eye (0.0, 0.0, 5.0), each component from -5.0 to +5.0
  • center (0.0, 0.0, 0.0), each component from -5.0 to +5.0
  • up <0.0, 1.0, 0.0>, each component from -1.0 to +1.0

After experimenting, you will hopefully agree that this version of the demo program is much harder to understand visually than the WebGL program in the previous lesson. This illustrates an important concept.

Designing virtual cameras:

We conceptually design a 3D rendering by placing a virtual camera inside the scene at a specific location and orientation. The fact that the mathematical camera works differently than our conceptual model is fine. We think conceptually in 3D space and the computer does the hard math!

Glossary

orthogonal
Two vectors are orthogonal if the angle between them is 90 degrees.
maps to, -->
A mapping converts an element into another element.
transpose
An operation on a matrix that swaps rows with columns. Each M[i][j] element moves to the M[j][i] position.
orthogonal matrix
A matrix whose columns (or rows) form unit-length vectors that are orthogonal to each other. The inverse of an orthogonal matrix is just its transpose.

Self Assessment

    Q-157: A virtual camera transformation, as defined in this lesson, …
  • moves the scene in front of a stationary camera.
  • Correct. The camera is at the origin, looking down the -Z axis.
  • moves the camera to a desired view of the scene.
  • Incorrect. Conceptually, the camera is inside the scene at a specific location and orientation, but not mathematically.

    Q-158: Given a camera’s local coordinate system defined by vectors <u>, <v>, and <n>, what global axis does <u> map to?

  • global <x> axis
  • Correct.
  • global <y> axis
  • Incorrect. The <v> axis maps to the <y> axis
  • global <z> axis
  • Incorrect. The <n> axis maps to the <z> axis
    Q-159: Given a 4x4 transformation matrix, what must be true about the matrix for its inverse to be equal to its transpose?
  • The upper left 3x3 sub-matrix must define 3 unit-length vectors that are orthogonal to each other.
  • Correct.
  • Nothing. Given any 4x4 matrix, its transpose is always equal to its inverse.
  • Incorrect. Given any 4x4 matrix, its transpose is typically NOT equal to its inverse.
  • The matrix has to contain translation, scaling, and rotation.
  • Incorrect. In fact, the transformation can't contain translation and scaling.
  • The values along the diagonal have to all be 1.0.
  • Incorrect.
    Q-160: When conceptually designing a virtual camera, which is easier to do?
  • Conceptualize the camera inside the scene at a specific location looking towards a specific point.
  • Correct.
  • Conceptualize the scene as being moved in front of a stationary camera.
  • Incorrect. Experiment with the demo WebGL program above again!