Last Updated: April 26, 2010

Once a 3D model has been completed, its co-ordinates need to be
converted to 2 dimensions in order to display the scene on a flat
computer monitor or to print it on paper. This process of converting from 3D to 2D is called
*projection*. The visual appearance of a 3D model depends on the position of the viewer (among other things), so
this must be taken into account when projecting a model. There are
two main types of projection available, parallel and perspective.

*Parallel projections*
are used by
drafters and engineers to create working drawings of an object
which preserve scale and shape. In parallel projection, image
points are found at the intersection of the view plane with a ray
drawn from the object point and having a fixed direction. The
direction of projection is the same for all rays (all rays are
parallel). A parallel projection is described by prescribing a
direction of projection vector $\vec{v}$
and a viewplane. The
object point $P$ is located at $(x,y,z)$ and we need to determine
the image point coordinates $P'(x',y',z')$ . If the projection
vector $ \vec{v}$
has the same direction as the viewplane normal,
then the projection is said to be *orthogonal*,
otherwise the projection is *oblique*.
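As a concrete illustration, consider an orthogonal projection onto the $xy$ plane: the projection direction coincides with the viewplane normal, so the image point simply discards the coordinate measured along that normal. The sketch below is illustrative only; the function name is our own.

```python
# Orthogonal parallel projection onto the view plane z = 0.
# Every projection ray has the same fixed direction (0, 0, -1), so the
# image of the object point P(x, y, z) is simply P'(x, y, 0).

def orthogonal_project(point):
    """Return the image of a 3D point under orthogonal projection onto z = 0."""
    x, y, z = point
    return (x, y, 0.0)

print(orthogonal_project((3.0, 4.0, 7.0)))  # the depth coordinate is discarded
```

Because all rays are parallel, two points that differ only in $z$ project to the same image point, which is why scale and shape are preserved.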

*Perspective projections*
are
used to produce images which look natural. When we view scenes in
everyday life far away items appear small relative to nearer
items. This is called *perspective foreshortening*. A side
effect of perspective foreshortening is that parallel lines appear
to converge on a *vanishing point*. An important feature of
perspective projection is that it preserves straight lines; this
allows us to project only the end-points of 3D lines and then draw
a 2D line between the projected endpoints.

Perspective projection depends on the relative position of the
*eye* and the *viewplane*. In the usual arrangement the
eye lies on the z-axis and the viewplane is the $xy$ plane. To
determine the projection of a 3D point, connect the point and the
eye with a straight line; the point where this line intersects the
viewplane is the projected point.

Perspective projections, while providing a realistic view of an
object, are rather restrictive. They require the eye to lie
on a coordinate axis and the viewplane to coincide with a
coordinate plane. If we wish to view an object from a different
point of view, we must rotate the model of an object. This causes
an awkward mix of modelling (describing the objects to be viewed)
and viewing (rendering a picture of the object). We will develop a
flexible method for viewing that is completely separate from
modelling; this method is called the *synthetic camera*. A
synthetic camera is a way to describe a camera (or eye) positioned
and oriented in 3D space. The system has three principal
ingredients:

- A viewplane in which a window is defined.
- A coordinate system called the viewing coordinate system (VCS), sometimes called the $UVN$ system.
- An eye defined in VCS.

The view plane is defined by a point on the plane called the
*View Reference Point*(VRP) and a normal to the viewplane
called the *View Plane Normal*(VPN). These are defined in the
world coordinate system. The viewing coordinate system is defined
as follows:

- The origin is the VRP.
- One axis of the coordinate system is given by the VPN; this is known as the $n$ axis.
- The second axis is found from the View Up Vector (VUP); this is known as the $v$ axis.
- The third axis $u$ is calculated as $u=n\times v$ .

In order for a rendering application to achieve the required view, the user would need to specify the following parameters.

- The VRP $\big(\vec{r}=(r_{x},r_{y},r_{z})\big)$
- The VPN $\big(\vec{n}=(n_{x},n_{y},n_{z})\big)$
- The Upward Vector VUP $\vec{v}$

To choose a VPN ($\vec{n}$), the user simply selects a point in the area of interest in the scene. The vector $\vec{n}$ is a unit vector, which can be calculated as follows:

The user should select some point in the scene which they would like to appear as the centre of the rendered view; call this point $\vec{scene}$. The vector $\vec{norm}$, a vector lying along $\vec{n}$, can then be calculated:

\[\vec{norm}=\vec{scene}-\vec{VRP}\]

$\vec{n}$ must be a unit vector along $\vec{norm}$;

\[ \vec{n}=\frac{\vec{norm}}{|\vec{norm}|} \]

Finally, the upward vector must be a unit vector perpendicular to $\vec{n}$; let the user enter a vector $\vec{up}$ and allow the computer to calculate an appropriate vector $\vec{v}$.

\[\begin{eqnarray*} \vec{up'}&=&\vec{up}-k\vec{n}\\ \vec{up'}.\vec{n}&=&0\\ (\vec{up}-k\vec{n}).\vec{n}&=&0\\ \vec{up}.\vec{n}-k\,\vec{n}.\vec{n}&=&0\\ \vec{up}.\vec{n}&=&k|\vec{n}|^{2}\\ k&=&\frac{\vec{up}.\vec{n}}{|\vec{n}|^{2}}\\ \textrm{$\vec{n}$ is a unit vector, so;}\quad k&=&\vec{up}.\vec{n}\\ \vec{up'}&=&\vec{up}-(\vec{up}.\vec{n})\vec{n}\\ \textrm{finally;}\quad \vec{v}&=&\frac{\vec{up'}}{|\vec{up'}|}\\ \end{eqnarray*}\]

The vector $\vec{u}$ can now be calculated: $\vec{u}=\vec{n}\times\vec{v}$. With the viewing coordinate system set up, a window in the viewplane can be defined by giving minimum and maximum $u$ and $v$ values. The centre of the window (CW) does not have to be the VRP. The eye can be given any position ($\vec{e}=(e_{u},e_{v},e_{n})$) in the viewing coordinate system. It is usually positioned at some negative value on the n-axis, $\vec{e}=(0,0,-e_{n})$.
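The whole construction can be sketched in a few lines of Python. The helper names below (`viewing_axes` and friends) are our own invention for illustration; the steps follow the derivation exactly: normalize $\vec{scene}-\vec{VRP}$ to get $\vec{n}$, remove the $(\vec{up}.\vec{n})\vec{n}$ component of $\vec{up}$ to get $\vec{v}$, then take $\vec{u}=\vec{n}\times\vec{v}$.

```python
import math

def normalize(a):
    m = math.sqrt(sum(c * c for c in a))
    return tuple(c / m for c in a)

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def viewing_axes(vrp, scene, up):
    """Build the UVN axes: n toward the scene, v by removing the n-component
    of up (Gram-Schmidt), u = n x v."""
    n = normalize(sub(scene, vrp))            # n along norm = scene - VRP
    k = dot(up, n)                            # k = up . n  (n is a unit vector)
    up_prime = sub(up, tuple(k * c for c in n))
    v = normalize(up_prime)
    u = cross(n, v)
    return u, v, n

# Camera at the origin looking down the world z-axis, with y as "up":
u, v, n = viewing_axes((0, 0, 0), (0, 0, 5), (0, 1, 0))
```

Note that any $\vec{up}$ not parallel to $\vec{n}$ will do; the Gram-Schmidt step guarantees $\vec{v}$ is perpendicular to $\vec{n}$.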

The components of the synthetic camera can be changed to provide different views and animation effects;

- Moving the VRP along a given path will provide a sequence of views which will show a 'walk through' or a 'fly-by' of the object, with the viewer always looking at the same location.
- Changing the direction of the $\vec{VPN}$ is equivalent to swiveling your head or 'panning'.
- Changing the direction of the $\vec{v}$ vector, allows the user to see the object rotate in the viewplane, as if the user was tilting his/her head to the left or right.


We have developed a method for specifying the location and orientation of the synthetic camera. In order to draw projections of models in this system we need to be able to represent our real-world coordinates in terms of $ \vec{u}\vec{v}\vec{n}$ .

Converting from one coordinate system to another:

\[\begin{eqnarray*} (x,y)&=&\vec{r}+a\vec{u}+b\vec{v}\\ &=&\vec{r}+\left(\begin{array}{cc}a&b \end{array} \right)\left(\begin{array}{c}\vec{u}\\\vec{v}\end{array}\right)\\ \textrm{in three dimensions;}\\ (x,y,z)&=&\vec{r}+\left(\begin{array}{ccc}a&b&c \end{array} \right) \left(\begin{array}{c}\vec{u}\\\vec{v}\\\vec{n}\end{array} \right)\\ \textrm{let} \quad \mathbf{M} &=& \left(\begin{array}{c}\vec{u}\\\vec{v}\\\vec{n}\end{array} \right)= \left(\begin{array}{ccc} u_{x}&u_{y}&u_{z}\\ v_{x}&v_{y}&v_{z}\\ n_{x}&n_{y}&n_{z} \end{array} \right)\\ \textrm{and subtract $\vec{r}$ from both sides:}\\ (x,y,z)-\vec{r}&=&\left(\begin{array}{ccc}a&b&c \end{array} \right)\mathbf{M}\\ \textrm{multiply across by $\mathbf{M}^{-1}$ and write $(x,y,z)$ as a vector, $\vec{p}$:}\\ \left(\begin{array}{ccc}a&b&c \end{array} \right)&=&(\vec{p}-\vec{r})\mathbf{M}^{-1}\\ \textrm{$\mathbf{M}$ is made up of orthogonal unit vectors, so $\mathbf{M}^{-1}=\mathbf{M}^{T}$:}\\ \left(\begin{array}{ccc}a&b&c \end{array} \right)&=&\vec{p}\mathbf{M}^T-\vec{r}\mathbf{M}^T\\ \textrm{expanding $-\vec{r}\mathbf{M}^T$, we get;}\\ -\vec{r}\mathbf{M}^{T}&=&\left(\begin{array}{ccc}-r_{x}&-r_{y}&-r_{z} \end{array} \right)\left(\begin{array}{ccc} u_{x}&v_{x}&n_{x}\\ u_{y}&v_{y}&n_{y}\\ u_{z}&v_{z}&n_{z} \end{array} \right)\\ \end{eqnarray*}\]Writing the above as a dot product of vectors;

\[ -\vec{r}\mathbf{M}^{T}= \left(\begin{array}{ccc}-\vec{r}.\vec{u}&-\vec{r}.\vec{v}&-\vec{r}.\vec{n} \end{array} \right) \]Combining the above displacement with the matrix multiplication into a homogeneous matrix, we get;

\[\begin{eqnarray*} \left(\begin{array}{ccc}a&b&c \end{array} \right)=\vec{p}\mathbf{M}^{T}-\vec{r}\mathbf{M}^{T}=\vec{p}\left(\begin{array}{cccc} u_{x}&v_{x}&n_{x}&0\\ u_{y}&v_{y}&n_{y}&0\\ u_{z}&v_{z}&n_{z}&0\\ -\vec{r}.\vec{u}&-\vec{r}.\vec{v}&-\vec{r}.\vec{n}&1\end{array} \right) \end{eqnarray*}\]We will refer to the above matrix as $\mathbf{\hat{A}}_{wv}$ (the world to viewing coordinate transformation). We can now write our coordinate transform as:

\[\begin{eqnarray*} \left(\begin{array}{ccccc}a&b&c \end{array} \right)=\vec{p}\mathbf{\hat{A}}_{wv}\\ \end{eqnarray*}\]The above transformation can be reduced to three simpler relations for computation;

\[\begin{eqnarray*} a&=&(\vec{p}-\vec{r}).\vec{u}\\ b&=&(\vec{p}-\vec{r}).\vec{v}\\ c&=&(\vec{p}-\vec{r}).\vec{n} \end{eqnarray*}\]We now have a method for converting world coordinates to the viewing coordinates of the synthetic camera. Transforming all objects from world coordinates to viewing coordinates will simplify the later operations of clipping, projection, etc. We should keep a separate data structure to hold the viewing coordinates of an object; the model itself remains uncorrupted and we can have many different views of the model.
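These three dot products translate directly into code. The sketch below (function name ours) assumes $\vec{u}$, $\vec{v}$, $\vec{n}$ are already an orthonormal set, as constructed earlier:

```python
def to_viewing_coords(p, r, u, v, n):
    """Viewing coordinates (a, b, c) of world point p, given the VRP r and
    orthonormal viewing axes u, v, n:
        a = (p - r).u,  b = (p - r).v,  c = (p - r).n
    """
    d = tuple(pi - ri for pi, ri in zip(p, r))
    dot = lambda x, y: sum(s * t for s, t in zip(x, y))
    return (dot(d, u), dot(d, v), dot(d, n))

# With the viewing axes aligned to the world axes, the transform is a pure translation:
print(to_viewing_coords((4, 6, 8), (1, 2, 3), (1, 0, 0), (0, 1, 0), (0, 0, 1)))
# -> (3, 4, 5)
```

This is exactly what the matrix $\mathbf{\hat{A}}_{wv}$ computes, written out component by component.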

A ray from the eye $\vec{e}$ to a point $\vec{p}$ can be written parametrically as $\vec{r}(t)=\vec{e}(1-t)+\vec{p}t$. This equation is valid for values of $t$ between 0 and 1. We wish to find the coordinates of the ray as it pierces the viewplane; this occurs when $r_{n}(t)=0$. The best way to do this is to find at what 'time' $t'$ the ray strikes the viewplane, so;

\[\begin{eqnarray*} r_{n}(t')=e_{n}(1-t')+p_{n}t'=0 \\ e_{n}-e_{n}t'+p_{n}t'=0\\ e_{n}-t'(e_{n}-p_{n})=0\\ e_{n}=t'(e_{n}-p_{n})\\ t'=\frac{e_{n}}{(e_{n}-p_{n})}\\ \textrm{substituting $t'$ into $r_{u}(t)$ and $r_{v}(t)$ we get;}\\ r_{u}(t')=u'=e_{u}(1-t')+p_{u}t'\\ u'=e_{u}\left(1-\frac{e_{n}}{(e_{n}-p_{n})}\right)+p_{u} \frac{e_{n}}{(e_{n}-p_{n})}\\ \textrm{rearranging gives;}\\ u'=\frac{p_{u}e_{n}-e_{u}p_{n}}{(e_{n}-p_{n})}\\ \textrm{similarly for $v'$;}\\ v'=\frac{p_{v}e_{n}-e_{v}p_{n}}{(e_{n}-p_{n})} \end{eqnarray*}\]This gives us the coordinates of the point $(u,v,n)$ when projected onto the view plane. If the eye is on the n-axis, which is the usual case, then both $e_{u}$ and $e_{v}$ are zero, and $u'$ and $v'$ simplify to;

\[ u'=\frac{p_{u}e_{n}}{(e_{n}-p_{n})} \qquad v'=\frac{p_{v}e_{n}}{(e_{n}-p_{n})} \]

Note that $u'$ and $v'$ do not depend on $t$; this means that every point on the ray projects to the same point on the viewplane. Even points behind the eye ($t<0$) are projected to the same point on the viewplane. These points will be eliminated later.
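In code, the simplified projection is a single scale factor applied to $p_u$ and $p_v$. A minimal sketch (function name ours), assuming the eye sits on the n-axis at $(0,0,e_n)$:

```python
def project(p, e_n):
    """Project viewing-coordinate point p = (p_u, p_v, p_n) onto the
    viewplane n = 0, with the eye at (0, 0, e_n) on the n-axis."""
    pu, pv, pn = p
    s = e_n / (e_n - pn)   # common factor e_n / (e_n - p_n) in u' and v'
    return (pu * s, pv * s)

# Perspective foreshortening: tripling the depth of a point shrinks its projection.
near = project((2.0, 3.0, 5.0), -5.0)   # eye behind the viewplane at n = -5
far  = project((2.0, 3.0, 15.0), -5.0)
```

With the eye at $e_n=-5$, the nearer point projects to $(1.0, 1.5)$ while the farther point projects to $(0.5, 0.75)$, illustrating foreshortening.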

When manipulating 3D entities it is useful to have an additional
quantity which retains a measure of depth of a point. As our
analysis stands we have lost information about the depth of the
points because all points are projected onto the viewplane with a
depth of zero. We would like to have something which preserves the
depth ordering of points; this quantity will be called
*pseudodepth*, and to simplify later calculation we will
define it as;

\[ n'=\frac{p_{n}e_{n}}{(e_{n}-p_{n})} \]

An increase in actual depth $p_{n}$ causes an increase in $n'$, as required. The simplified equations for $u'$, $v'$ and $n'$ can be rewritten as follows:

\[\begin{eqnarray*} u'=\frac{p_{u}e_{n}}{(e_{n}-p_{n})}=\frac{p_{u} e_{n}}{e_{n}(1-\frac{p_{n}}{e_{n}})}&=&\frac{p_{u}}{(1-\frac{p_{n}}{e_{n}})}\\ v'=\frac{p_{v}e_{n}}{(e_{n}-p_{n})}=\frac{p_{v} e_{n}}{e_{n}(1-\frac{p_{n}}{e_{n}})}&=&\frac{p_{v}}{(1-\frac{p_{n}}{e_{n}})}\\ n'=\frac{p_{n}e_{n}}{(e_{n}-p_{n})}=\frac{p_{n} e_{n}}{e_{n}(1-\frac{p_{n}}{e_{n}})}&=&\frac{p_{n}}{(1-\frac{p_{n}}{e_{n}})} \end{eqnarray*}\]
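A quick numerical check (an illustrative sketch, not part of the original notes) that pseudodepth preserves depth ordering when the eye sits at a negative $e_n$ and points lie in front of the viewplane:

```python
def pseudodepth(pn, e_n):
    """Pseudodepth n' = p_n / (1 - p_n / e_n)."""
    return pn / (1.0 - pn / e_n)

# With the eye at e_n = -5, increasing true depth gives increasing pseudodepth.
depths = [pseudodepth(pn, -5.0) for pn in (1.0, 2.0, 4.0, 8.0)]
print(depths == sorted(depths))  # the depth ordering is preserved
```

Although $n'$ is a non-linear function of $p_n$, it is monotonic over the view volume, which is all the later hidden-surface calculations require.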
We can now write a matrix to implement the above transformation;
this is called the *Perspective Transformation*:

\[ \mathbf{\hat{M}}_{p}=\left(\begin{array}{cccc}1&0&0&0\\0&1&0&0\\0&0&1&-\frac{1}{e_{n}}\\0&0&0&1\end{array}\right) \]

The projection $P'$ of a point $P$ can now be written as:

\[ \vec{p}\,'=(p_{u}\;p_{v}\;p_{n}\;1)\mathbf{\hat{M}}_{p} \]

Dividing the result by its fourth (homogeneous) component, $1-\frac{p_{n}}{e_{n}}$, yields the coordinates $u'$, $v'$ and $n'$.
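Applied as a row vector times a 4x4 matrix, followed by division by the homogeneous coordinate $w=1-p_n/e_n$, this must reproduce the closed-form $u'$, $v'$, $n'$ formulas. A sketch (helper names ours) of the perspective transformation under the row-vector convention:

```python
def perspective_matrix(e_n):
    """M_p for the row-vector convention: (p_u, p_v, p_n, 1) @ M_p."""
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, -1.0 / e_n],
            [0.0, 0.0, 0.0, 1.0]]

def row_times_matrix(p4, m):
    return tuple(sum(p4[i] * m[i][j] for i in range(4)) for j in range(4))

def project_point(p, e_n):
    """Transform (p_u, p_v, p_n, 1) by M_p, then divide by the homogeneous w."""
    pu, pv, pn, w = row_times_matrix((p[0], p[1], p[2], 1.0),
                                     perspective_matrix(e_n))
    return (pu / w, pv / w, pn / w)

# Agrees with u' = p_u * e_n / (e_n - p_n), and likewise for v' and n':
res = project_point((2.0, 3.0, 5.0), -5.0)
```

For $p=(2,3,5)$ and $e_n=-5$, the homogeneous component is $w = 1 - 5/(-5) = 2$, so the division gives $(1.0, 1.5, 2.5)$, matching the direct formulas.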

At this stage we have a method for transforming a point from world-coordinates to viewing coordinates and then projecting that point onto the view plane, i.e.

\[ \vec{P}'(p_{u}',p_{v}',p_{n}',1)=\vec{P}_{xyz}\mbox{${\mathbf{\hat{A}}}$}_{wv} \mbox{${\mathbf{\hat{M}}}$}_{p} \]It is now possible to combine the coordinate transformation and projection transformation into one matrix.

\[\begin{eqnarray*} \mathbf{\hat{M}}_{wv}={\mathbf{\hat{A}}}_{wv}{\mathbf{\hat{M}}}_{p}=\left(\begin{array}{cccc} u_{x} & v_{x} & n_{x}&-\frac{n_{x}}{e_{n}} \\ u_{y} & v_{y} & n_{y}&-\frac{n_{y}}{e_{n}} \\ u_{z} & v_{z} & n_{z}&-\frac{n_{z}}{e_{n}} \\ -\vec{r}.\vec{u} &-\vec{r}.\vec{v} &-\vec{r}.\vec{n} &1+\frac{\vec{r}.\vec{n}}{e_{n}} \end{array} \right) \end{eqnarray*}\]

The human brain perceives depth in a scene because we have two eyes separated in space; each eye "sees" a slightly different view, and the brain uses these differences to estimate relative distance. The two views can be artificially generated by setting up a synthetic camera with two "eyes", each offset slightly from the n-axis. Each eye results in a different projection. If each projection is displayed in a different colour, and the user wears appropriately filtered glasses, the 2D display will appear to have depth. Other 3D viewing systems include virtual reality headsets, which have two built-in displays, one for each eye, and LCD shutter goggles, which block the right eye while the left-eye image is displayed on a large screen and vice versa; this cycle must occur 50 times a second if the animation is to be smooth.

We must define precisely the region in space that is to be projected and drawn. This region is called the view-volume. In the general case only a small fraction of the model falls within the field of view of the camera. The part of the model that falls outside of the camera's view must be identified and discarded as soon as possible to avoid unnecessary computation.

The view volume is defined in viewing coordinates. The eye and the
window defined on the view plane, together define a double sided
pyramid extending forever in both directions. To limit the view
volume to a finite size, we can define a front plane $n=F$ and a
back plane $n=B$; these are sometimes known as the *hither* and
*yon* planes. Now the view volume becomes a *frustum*
(truncated pyramid).

We will later develop a clipping algorithm which will clip any part of
the world which lies outside of the view volume. The effect of
clipping to the front plane is to remove objects that lie behind
the eye or too close to it. The effect of clipping to the back
plane is to remove objects that are too far away, and would appear
as indistinguishable spots. We can move the front and back planes
close to each other to produce "cutaway" drawings of complex
objects. Clipping against a volume like a frustum would be a
complex process, but if we apply the perspective transformation
to all our points, the clipping process becomes trivial. The
view volume is defined after the matrix $\mathbf{\hat{M}}_{wv}$
has been
applied to each point in world coordinates. The effect of applying
the perspective transformation is called *pre-warping*. If we
apply pre-warping to the view volume, it gets distorted into a more manageable shape.

We will first examine the effects of pre-warping on key points in the view volume. First we need to calculate the v-coordinate $v_{2}$ of $P_{2}(u_{2},v_{2},n_{2})$, an arbitrary point lying on the line from the eye through $P_{1}(0,w_{t},0)$, where $w_{t}$ represents the top of the window defined on the view plane:

\[\begin{eqnarray*} \textrm{From the equation of a line; } v&=&m(n-n_{1})+v_{1} \textrm{ and};\\ m&=&\frac{w_{t}-e_{v}}{0-e_{n}}\\ \textrm{we get}\\ v_{2}&=&\left(\frac{w_{t}-e_{v}}{-e_{n}}\right)(n_{2}-0)+w_{t}=\left(\frac{w_{t}-e_{v}}{-e_{n}}\right)n_{2}+w_{t}\\ \textrm{so} P_{2}&=&\left(0,\left(\frac{w_{t}-e_{v}}{-e_{n}}\right)n_{2}+w_{t},n_{2}\right)\\ \textrm{if we now apply pre-warping to the $v$\ coordinate of $P_{2}$;}\\ v_{2}'&=&\frac{p_{v}e_{n}-e_{v}p_{n}}{e_{n}-p_{n}}\\ v_{2}'&=&\frac{\left(\left(\frac{w_{t}-e_{v}}{-e_{n}}\right)n_{2}+w_{t}\right)e_{n} -e_{v}n_{2}}{e_{n}- n_{2}}\\ v_{2}'&=&\frac{-w_{t}n_{2}+e_{v}n_{2} +w_{t}e_{n}-e_{v}n_{2}}{e_{n}- n_{2}}\\ v_{2}'&=&\frac{w_{t}(e_{n}-n_{2})}{e_{n}- n_{2}}\\ v_{2}'&=&w_{t}\\ \end{eqnarray*}\]So prewarping $P_{2}$ gives us the point which lies on the plane $v=w_{t}$ .

Therefore the effect of pre-warping is to transform all points on the plane representing the top of the view volume to points on the plane $v=w_{t}$. This plane is parallel to the $un$ plane.

It can be similarly shown that the other three sides of the view volume are transformed to planes parallel to the coordinate planes.

If we take a point on the back plane $P_{3}(u_{3},v_{3},B)$ and apply prewarping to the n-coordinate;

\[ n_{3}'=\frac{p_{n}e_{n}}{(e_{n}-p_{n})}=\frac{Be_{n}}{(e_{n}-B)}=\frac{B}{1-\frac{B}{e_{n}}} \]so we can see that the back plane has been moved to the plane $n=\frac{B}{1-\frac{B}{e_{n}}}$ . This plane is parallel to the original plane.

Similarly the front plane will have been moved to $n=\frac{F}{1-\frac{F}{e_{n}}}$ .

Applying prewarping to the eye gives $n'=\frac{e_{n}e_{n}}{(e_{n}-e_{n})}=\infty$ . This means that the eye has been moved to infinity.

In summary, pre-warping has moved the walls of the frustum shaped view volume to the following planes;

\[\begin{eqnarray*} u&=&w_{l}\\ u&=&w_{r}\\ v&=&w_{t}\\ v&=&w_{b}\\ n&=&\frac{F}{1-\frac{F}{e_{n}}}\\ n&=&\frac{B}{1-\frac{B}{e_{n}}}\\ \end{eqnarray*}\]Note that each of these planes is parallel to a coordinate plane. The frustum shaped view volume has become a parallelepiped.

The final stage of the transformation process is to map the projected points to their final position in the viewport on screen. We will combine this view-volume to viewport mapping with the pre-warping matrix; this will allow us to perform all the calculations necessary to transform a point in world coordinates to pixel coordinates on screen in one matrix multiplication.

The $u$ and $v$ coordinates will be converted to $x$ and $y$ screen coordinates, and to simplify later calculation we will scale the $n$ coordinate (pseudo-depth) to a range between 0 and 1 (the front plane to 0 and the back plane to 1).

First we need to translate the view-volume to the origin; this can be done by applying the following translation matrix;

\[ \mathbf{\hat{T}}_{1}=\left(\begin{array}{cccc}1&0&0&0 \\0&1&0&0 \\0&0&1&0 \\ -w_{l} &-w_{b} & -\frac{e_{n}F}{e_{n}-F} & 1 \end{array} \right) \]Next the view-volume needs to be scaled to the width and height of the viewport. At this stage we will normalize the pseudo-depth to a range of 0 to 1. To scale the n-coordinate, we need to scale by;

\[ \frac{1-0}{\frac{e_{n}B}{e_{n}-B}-\frac{e_{n} F}{e_{n}-F}}=\frac{1}{\frac{e_{n}B(e_{n}-F)-e_{n} F(e_{n}-B)}{(e_{n}-B)(e_{n}-F)}}= \frac{(e_{n}-B)(e_{n}-F)}{e_{n}^{2}(B-F)} \]Therefore the scaling matrix required is;

\[ \mathbf{\hat{S}}=\left(\begin{array}{cccc}\frac{v_{r}-v_{l}}{w_{r}-w_{l}}&0&0&0 \\0&\frac{v_{t}-v_{b}}{w_{t}-w_{b}}&0&0 \\ 0&0&\frac{(e_{n}-B)(e_{n}-F)}{e_{n}^{2}(B-F)}&0 \\ 0&0&0& 1 \end{array} \right) \]Finally we need to translate the scaled view volume to the position of the viewport;

\[ \mathbf{\hat{T}}_{2}=\left(\begin{array}{cccc} 1&0&0&0\\0&1&0&0\\0&0&1&0\\v_{l}&v_{b}&0&1 \end{array} \right) \]Combining the above three transformations gives us the Normalization Matrix;

\[ \mathbf{\hat{N}}=\mathbf{\hat{T}}_{1}\mathbf{\hat{S}}\mathbf{\hat{T}}_{2} \]We can now combine all our transformations into one overall
matrix, which will convert a point from world coordinates to
*Normalized Device Coordinates*(NDC) while retaining a
representation of the relative depth of the points.
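The translate-scale-translate composition can be sketched directly. In this illustrative code (helper names and the flat parameter list are ours) $w_{l},w_{b},w_{r},w_{t}$ denote the window edges on the viewplane, $v_{l},v_{b},v_{r},v_{t}$ the viewport edges on screen, and `n_front`/`n_back` the pre-warped front and back plane depths, using the row-vector convention throughout:

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translate(tx, ty, tz):
    # Row-vector convention: translation components sit in the bottom row.
    return [[1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [tx, ty, tz, 1.0]]

def scale(sx, sy, sz):
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1.0]]

def normalization_matrix(wl, wb, wr, wt, vl, vb, vr, vt, n_front, n_back):
    """N = T1 * S * T2: move the window corner to the origin, scale the window
    to the viewport (and the pre-warped depth range to [0, 1]), then translate
    to the viewport corner."""
    t1 = translate(-wl, -wb, -n_front)
    s = scale((vr - vl) / (wr - wl), (vt - vb) / (wt - wb),
              1.0 / (n_back - n_front))
    t2 = translate(vl, vb, 0.0)
    return matmul(matmul(t1, s), t2)

def apply(p4, m):
    return tuple(sum(p4[i] * m[i][j] for i in range(4)) for j in range(4))

# One window corner on the front plane should land on the viewport corner at
# depth 0, and the opposite corner on the back plane at depth 1.
N = normalization_matrix(-1, -1, 1, 1, 10, 20, 650, 500, 0.5, 4.0)
p_front = apply((-1, -1, 0.5, 1.0), N)
p_back = apply((1, 1, 4.0, 1.0), N)
```

Checking the two corners against the expected viewport positions is a convenient sanity test that the three matrices have been composed in the right order.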

This will give us the view-volume in its final configuration,
called the *Canonical View Volume*.

© Ken Power 1996-2016