Raytracing provides a powerful approach which solves most of these problems. Although expensive, raytracing gives some of the most realistic results of any rendering technique. Instead of building a scene polygon by polygon, raytracing passes a ray from the eye through each pixel into the scene. The path of the ray is traced to see which object it strikes first and at what point (this automatically solves the hidden surface problem). From this point, further rays are traced to each of the light sources in the scene to determine how the point is illuminated, and in particular whether it is in shadow. If the surface that has been struck is reflective and/or transparent, then further rays are cast in the direction of reflection or transmission, and the contribution of each of these rays to the surface shading is calculated. Each of these rays contributes to the eventual colour of the pixel.
Although raytracing can be applied to scenes modelled using polygon meshes, the most effective way to model scenes for raytracing purposes is Constructive Solid Geometry (CSG). CSG is the process of building solid objects from other solids. The three CSG operators are Union, Intersection, and Difference. Each operator acts upon two objects and produces a single object result. By combining multiple levels of CSG operators, complex objects can be produced from simple primitives. The union of two objects results in an object that encloses the space occupied by the two given objects. Intersection results in an object that encloses the space where the two given objects overlap. Difference is an order dependent operator; it results in the first given object minus the space where the second intersected the first.
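As a rough sketch of the idea (the class names here are purely illustrative, not taken from any particular system), the three operators can be expressed as boolean point-membership tests on two child objects:

    class Sphere:
        # generic unit sphere unless a centre/radius is supplied
        def __init__(self, centre=(0.0, 0.0, 0.0), radius=1.0):
            self.centre, self.radius = centre, radius
        def inside(self, p):
            return sum((pi - ci) ** 2 for pi, ci in zip(p, self.centre)) <= self.radius ** 2

    class Union:
        def __init__(self, a, b): self.a, self.b = a, b
        def inside(self, p): return self.a.inside(p) or self.b.inside(p)

    class Intersection:
        def __init__(self, a, b): self.a, self.b = a, b
        def inside(self, p): return self.a.inside(p) and self.b.inside(p)

    class Difference:  # order dependent: the first object minus the second
        def __init__(self, a, b): self.a, self.b = a, b
        def inside(self, p): return self.a.inside(p) and not self.b.inside(p)

    # a hollow shell: a unit sphere minus a slightly smaller sphere
    shell = Difference(Sphere(), Sphere(radius=0.9))

Compound objects built this way can themselves be passed as children to further operators, mirroring the multi-level combination described above.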
A CSG system includes a number of primitive or generic objects from which complex scenes can be built. These usually include a unit sphere (centered on the origin, radius one), a unit cube, a unit cone, a unit cylinder, a plane, etc. These objects can be rotated, scaled (stretched) or twisted (sheared) by applying a transformation matrix, and the transformation can differ along different axes: stretching a sphere along the x-axis produces an ellipsoid, for example. When the primitive has been changed into the desired shape, it can be translated to a different position by adding a displacement vector. The resultant shapes can be combined using the CSG operators into compound objects, which can themselves be transformed, displaced and further combined. This process can create surprisingly complex and realistic models from a few simple building blocks. Each object or compound object can be given a number of properties, depending on the complexity of the raytracing engine; these include pigmentation, reflectivity, transparency and texture.
The position and properties (colour, intensity, shape, etc.) of the light sources in the scene are specified. The objects in the scene are usually specified as a CSG model, and the colour, texture and other surface properties of the scene objects are also defined.
For each square (pixel) on the grid, a ray is built which extends from the eye through the centre of the square and into the scene. This ray is tested against all objects in the scene to determine which objects it intersects, giving a list of intersection points. The list is sorted and the intersection nearest the eye is chosen; this point is used for further processing. The colour returning from this point to the eye is calculated using an illumination model which takes into account the lights in the scene, the orientation of the surface and basic surface properties. This will be the final colour of the pixel. This is a simplified view of the process; on their own, these steps produce images of much the same quality as traditional mesh rendering.
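A minimal sketch of this per-pixel loop might look as follows; build_ray, intersect and shade are assumed helper routines standing in for the steps described above, not part of any particular raytracer.

    def render(scene, eye, width, height):
        image = [[(0, 0, 0)] * width for _ in range(height)]
        for row in range(height):
            for col in range(width):
                ray = build_ray(eye, row, col)        # eye through the pixel centre
                hits = []                             # list of (hit time, object) pairs
                for obj in scene.objects:
                    for t in intersect(ray, obj):
                        hits.append((t, obj))
                if hits:
                    t, obj = min(hits, key=lambda h: h[0])   # intersection nearest the eye
                    image[row][col] = shade(scene, ray, t, obj)
        return image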
To enhance realism a raytracer will cast secondary rays, which add important details such as shadows, transparency and reflections to the scene.
A ray can be given parametrically as r(t) = S + ct, where S is the starting point of the ray (the eye, for primary rays), c is its direction and t is the distance ("time") along the ray.
A transformation matrix, M, has been applied to our generic object and the object has been translated by d. So every point P on our primitive has been transformed to MP + d.
Instead of intersecting r(t) with the transformed object, we apply the reverse transformation to the ray, subtracting d and applying the inverse matrix M^-1 to give r'(t) = M^-1(r(t) - d).
A useful theorem from linear algebra tells us that if a ray r(t) passes through a transformed point MP + d at time t1, then the inverse-transformed ray r'(t) passes through the original point P at the same time t1 (this assumes that the transformation applied to the ray is the reverse of the one applied to the object). So, in order to find the point of intersection of a ray with an object, we reverse the transformation that created the object from a primitive, apply that reversed transformation to the ray, and then find the points of intersection of the transformed ray and the original primitive.
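In code this amounts to transforming the ray's start point and direction before calling the primitive's own hit test. The sketch below assumes each object stores the inverse matrix M^-1 (as obj.M_inv), the displacement d (as obj.d) and a hit routine for its generic primitive; these names are illustrative.

    def vec_sub(a, b):
        return tuple(ai - bi for ai, bi in zip(a, b))

    def mat_vec(m, v):  # 3x3 matrix applied to a 3-vector
        return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))

    def hit_times(ray_start, ray_dir, obj):
        # r'(t) = M^-1 (r(t) - d): subtract the displacement, then apply M^-1
        local_start = mat_vec(obj.M_inv, vec_sub(ray_start, obj.d))
        local_dir = mat_vec(obj.M_inv, ray_dir)       # directions are not displaced
        # hit times found against the generic primitive are valid for the original ray
        return obj.generic_hit_times(local_start, local_dir)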
Substituting the ray into the equation of the unit sphere, this can be written as the familiar quadratic At^2 + Bt + C = 0, the solution of which is t = (-B ± sqrt(B^2 - 4AC)) / 2A.
The solutions to the quadratic give the time(s) of intersection. If there are two real solutions, then the ray has intersected the sphere twice. If only one solution is found, then the ray is a tangent to the sphere. If no real solutions are found, then the ray does not intersect the sphere (if B^2 - 4AC < 0, there are no real solutions). Once the intersection times (hit times) are known, they can be substituted into the original equation of the ray to calculate the co-ordinates of intersection.
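For the generic unit sphere this works out as in the sketch below, with the ray given as start point S and direction c as above:

    import math

    def hit_times_unit_sphere(S, c):
        # substitute S + c*t into |p|^2 = 1 to get the quadratic A t^2 + B t + C = 0
        A = sum(ci * ci for ci in c)
        B = 2.0 * sum(si * ci for si, ci in zip(S, c))
        C = sum(si * si for si in S) - 1.0
        disc = B * B - 4.0 * A * C
        if disc < 0.0:                  # no real solutions: the ray misses the sphere
            return []
        root = math.sqrt(disc)
        return [(-B - root) / (2.0 * A), (-B + root) / (2.0 * A)]

The returned hit times can be substituted back into S + c*t to obtain the intersection co-ordinates.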
If the z component of the ray's direction is zero, then the ray is parallel to the generic plane (the plane z = 0) and there is no intersection.
A generic square is a piece of the generic plane. First determine the co-ordinates of the point where the ray intersects the plane, and then test whether the point's x and y co-ordinates lie outside the region of the square (usually a unit square such as |x| <= 1, |y| <= 1).
To test for intersection with this cylinder, a test must first be made against the infinite cylinder lying along the z-axis, x^2 + y^2 = 1. This is done by substituting the ray into the equation for the cylinder (x and y are replaced by the x and y components of the ray respectively). This gives a quadratic of the form At^2 + Bt + C = 0, which can be solved in the usual way. If the solution(s) give hit time(s) with z components between 0 and 1, then the ray hits the walls of the unit cylinder.
The ray may also strike the unit cylinder by piercing the caps. A test must be made to determine points of intersection with the planes z = 0 and z = 1; if a hit point lies within the radius of the cap (x^2 + y^2 <= 1), then the ray strikes the cylinder at that point.
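A sketch of both tests for a unit cylinder x^2 + y^2 = 1 with 0 <= z <= 1 (those extents are an assumption; the same pattern works for other limits):

    import math

    def hit_times_unit_cylinder(S, c):
        hits = []
        # walls: substitute the x and y components of the ray into x^2 + y^2 = 1
        A = c[0] ** 2 + c[1] ** 2
        B = 2.0 * (S[0] * c[0] + S[1] * c[1])
        C = S[0] ** 2 + S[1] ** 2 - 1.0
        disc = B * B - 4.0 * A * C
        if A != 0.0 and disc >= 0.0:
            for t in ((-B - math.sqrt(disc)) / (2 * A), (-B + math.sqrt(disc)) / (2 * A)):
                z = S[2] + c[2] * t
                if 0.0 <= z <= 1.0:     # the hit lies between the base and the top
                    hits.append(t)
        # caps: pierce the planes z = 0 and z = 1, then check x^2 + y^2 <= 1
        if c[2] != 0.0:
            for z_cap in (0.0, 1.0):
                t = (z_cap - S[2]) / c[2]
                x, y = S[0] + c[0] * t, S[1] + c[1] * t
                if x * x + y * y <= 1.0:
                    hits.append(t)
        return sorted(hits)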
The equation for an infinite double cone is x^2 + y^2 = z^2; again, hit points can be found by substituting x, y and z by the x, y and z components of the ray and solving the resulting quadratic. Like the cylinder, the z co-ordinate of the hit points must lie within the limits of the generic cone. A separate test must also be made to determine whether the base of the cone has been pierced.
As has been seen, computing hit points for 3D objects can be expensive. In a scene with many objects, each ray from the eye must be checked for intersections against every object, yet each ray will usually strike only a small percentage of them. It has been estimated that 95% of the time of a naive raytracer is spent determining intersections. There would be a significant gain in efficiency if objects which are not struck by a given ray could be eliminated quickly, without performing the expensive tests seen above. This can be done using extents. An extent defines a simple region around each object which can be tested quickly for intersections. For primary rays (rays from the eye), each object is projected onto the viewplane and the minimum bounding rectangle (MBR) for each object is calculated before raytracing begins. The MBR is determined by finding the maximum and minimum x and y co-ordinates of the projected image. Each ray is then tested to see if it intersects the MBR (this is done by checking whether the current pixel lies inside the MBR); if it does not, the object cannot possibly be intersected by the ray and is ignored. If the MBR is intersected, a more precise test is applied to determine if and where the ray intersects the object in question.
This process can be improved by using union extents for clusters of small objects. This involves finding an MBR which encloses a number of objects. Using union extents, large numbers of objects can be eliminated with one quick test.
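Since primary rays correspond directly to pixels, the extent test reduces to a point-in-rectangle check. In the sketch below each object is assumed to carry a precomputed obj.mbr tuple (min x, min y, max x, max y) in screen co-ordinates:

    def candidate_objects(px, py, objects):
        # keep only objects whose MBR contains the pixel; only these need the full test
        survivors = []
        for obj in objects:
            min_x, min_y, max_x, max_y = obj.mbr
            if min_x <= px <= max_x and min_y <= py <= max_y:
                survivors.append(obj)
        return survivors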
To determine whether a point is in shadow with respect to a particular light source, a secondary ray is generated. This ray starts at the hit point on the surface in question (at t = 0) and extends to the light source (at t = 1). The ray is then tested for intersections; if any intersections occur between t = 0 and t = 1, then the surface point is in shadow with respect to the objects intersected. This test is carried out for each of the light sources in the scene. Even if the scene contains only one object, a test for shadows must be carried out, as an object can shadow itself. This process is similar to testing the primary ray for intersections, except that only the existence of an intersection in the range 0 < t < 1 matters; the nearest hit need not be found.
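A sketch of this shadow test, parameterised so that t = 0 is the surface point and t = 1 is the light; intersect is an assumed per-object hit routine, and the small epsilon (not mentioned above) simply stops the surface from shadowing itself through rounding error:

    def in_shadow(hit_point, light_pos, objects, intersect):
        direction = tuple(l - p for l, p in zip(light_pos, hit_point))
        for obj in objects:
            for t in intersect(hit_point, direction, obj):
                if 1e-4 < t < 1.0:      # a blocker strictly between the point and the light
                    return True
        return False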
For a point on a surface which is reflective and/or transparent, the light returning to the eye will have five components: the ambient, diffuse and specular (local) components, plus the reflected and transmitted components.
The reflected component is light arriving at the point along the direction of reflection. The equation for this vector is r = d - 2(d·n)n, where d is the direction of the incoming ray and n is the unit surface normal.
Similarly, the refracted component arises from light transmitted through the material along the direction of refraction: when light arrives at a material boundary it is "bent" and continues its travel along the refracted direction. Just as the light returning to the eye is a sum of several contributions, the reflected and transmitted components each arise from their own five components. These are computed by casting secondary rays in the directions of reflection and refraction, finding the surface each one hits first and computing the light components at that point, which in turn generates further secondaries. In figure 9.6, the light arriving at the eye is the sum of three components: reflected, transmitted and local (the local component is not shown).
The local component is made up of ambient, diffuse and specular reflections. Each reflected or transmitted component is in turn made up of three components; for example, the reflected component is itself composed of reflected and transmitted light as well as the local component at the point it strikes. Since each contribution is the sum of three others, this suggests a recursive approach to computing the light intensity. In principle, secondaries may be spawned ad infinitum; in practice, raytracing usually spawns secondaries only two levels deep, a limit which may be locally increased for particularly reflective or transparent surfaces.
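The recursion can be sketched as follows; local_colour, reflect_dir, refract_dir and nearest_hit, together with the surface and hit attributes used, are assumed helpers, and the two-level depth limit matches the rule of thumb above.

    MAX_DEPTH = 2

    def add(c1, c2): return tuple(a + b for a, b in zip(c1, c2))
    def scale(k, c): return tuple(k * x for x in c)

    def shade(scene, point, view_dir, surface, depth=0):
        colour = local_colour(scene, point, view_dir, surface)   # ambient + diffuse + specular
        if depth >= MAX_DEPTH:
            return colour
        if surface.reflectivity > 0.0:
            r = reflect_dir(view_dir, surface.normal)
            hit = nearest_hit(scene, point, r)
            if hit:
                colour = add(colour, scale(surface.reflectivity,
                                           shade(scene, hit.point, r, hit.surface, depth + 1)))
        if surface.transparency > 0.0:
            d = refract_dir(view_dir, surface.normal, surface.refractive_index)
            hit = nearest_hit(scene, point, d)
            if hit:
                colour = add(colour, scale(surface.transparency,
                                           shade(scene, hit.point, d, hit.surface, depth + 1)))
        return colour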
However, some objects are not suitable for sphere bounding, and calculating the minimum bounding sphere can be difficult. A more useful general-purpose bounding volume is the six-sided box. The box is constructed with its sides aligned with the co-ordinate axes, which allows intersections to be calculated efficiently. This box is built like the projection extent: the minimum and maximum x, y and z co-ordinates over all the vertices in the object are determined, and these values form the sides of the box.
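Because the sides are axis-aligned, the ray/box test can use the standard "slab" method: intersect the ray with each pair of parallel sides and keep the overlap of the three intervals. A sketch, with the ray given as start S and direction c:

    def hits_box(S, c, box_min, box_max):
        t_in, t_out = float("-inf"), float("inf")
        for axis in range(3):
            if c[axis] == 0.0:
                # ray parallel to this pair of sides: it must start between them
                if not (box_min[axis] <= S[axis] <= box_max[axis]):
                    return False
            else:
                t1 = (box_min[axis] - S[axis]) / c[axis]
                t2 = (box_max[axis] - S[axis]) / c[axis]
                t_in = max(t_in, min(t1, t2))
                t_out = min(t_out, max(t1, t2))
        # the entry and exit intervals must overlap, ahead of the ray's start
        return t_in <= t_out and t_out >= 0.0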
The ray in the diagram must pass through four intersections before first striking part of the surface of the bowl. It is not possible to simply use the first intersection as with simple objects; in this case the first intersection does not form part of the visible surface. An intersection list must be built for each object.
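One way to sketch this for a difference object (such as the bowl): each child's sorted hit times are paired into the intervals where the ray is inside that child, the second child's intervals are carved out of the first's, and the first remaining boundary is the visible surface. The routine names are illustrative.

    def inside_intervals(hit_times):
        times = sorted(hit_times)
        return list(zip(times[0::2], times[1::2]))    # (entry, exit) pairs

    def difference_first_hit(hits_a, hits_b):
        spans_a = inside_intervals(hits_a)
        spans_b = inside_intervals(hits_b)
        remaining = []
        for a0, a1 in spans_a:
            pieces = [(a0, a1)]
            for b0, b1 in spans_b:                    # carve each b interval out of a
                next_pieces = []
                for s0, s1 in pieces:
                    if b1 <= s0 or b0 >= s1:          # no overlap, keep the piece
                        next_pieces.append((s0, s1))
                    else:
                        if b0 > s0: next_pieces.append((s0, b0))
                        if b1 < s1: next_pieces.append((b1, s1))
                pieces = next_pieces
            remaining.extend(pieces)
        starts = [s0 for s0, s1 in remaining if s1 > s0]
        return min(starts) if starts else None        # earliest visible-surface hit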
Real surfaces do not reflect perfectly; reflections are slightly blurred because of small imperfections in the surface. For the same reason, real materials do not transmit light perfectly. Shadows are never perfectly sharp, because finite (real) light sources cause a penumbra around the shadow (figure 9.12).
One approach to modelling these effects is distributed raytracing. This method typically casts a number of rays (usually 16) for each primary ray. When each ray strikes a surface, it is not reflected or transmitted exactly in the predicted direction; instead the ray is cast in a random direction close to the predicted one. How closely the cast ray follows the predicted ray depends on the surface properties. Similarly, when casting rays to determine shadows, the rays are not cast exactly towards the light source, but to random points close to it. The results of these 16 rays are averaged to determine the colour of the pixel.
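A sketch of the jittering step: each of the (say) 16 secondary rays is nudged randomly away from the ideal direction, with a roughness parameter (an assumption, not from the text) controlling the spread.

    import random

    def jittered_directions(ideal_dir, roughness, count=16):
        rays = []
        for _ in range(count):
            d = [di + random.uniform(-roughness, roughness) for di in ideal_dir]
            length = sum(x * x for x in d) ** 0.5
            rays.append(tuple(x / length for x in d))  # renormalise the perturbed direction
        return rays

The colours returned along these rays are then averaged, exactly as described above.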
Traditional raytraced images typically have perfect focus throughout the scene. The human eye and cameras can only focus over a limited range in any given situation; this range is called the depth of field (DOF), figure 9.13. The depth of field of an image depends on the aperture (width) of the lens. A wide aperture allows more light through, which is necessary in low-light situations, but gives a very small depth of field. The centre of the DOF is controlled by the focal length of the lens. The human eye can change both the aperture and the focal length of its lens by widening the pupil or stretching the lens; SLR cameras also have these features.
To achieve depth of field effects in raytracing, we must specify a focal plane, a plane parallel to the viewplane which will be the centre of the required depth of field. All points on this plane will be in perfect focus. The focal plane is usually given as a distance from the viewplane (the focal length). For each pixel a primary ray is cast; the point at which this ray strikes the focal plane is called the focal point for this ray. An aperture value is also given, which specifies the size of the region around the pixel from which further rays are cast, usually 4-16 additional rays. These rays are built so that they all pass through the focal point. Each of these rays is tested for intersections as normal and a colour is determined. The colours of the distributed rays are averaged and this colour is used for the pixel (figure 9.14).
Notice that the distributed rays converge on the focal point. Before and after the focal point, the rays are spread over an area called the circle of confusion. This circle increases in size as the rays move away from the focal point. The size of the circle represents the area of the scene that is involved in producing the colour for a particular pixel; the larger this area, the more blurred the image becomes, because more of the scene contributes to the colour at that pixel. There is a region in which the circle of confusion is smaller than the resolution (the pixel area); this is the depth of field, and parts of the scene in this region are in focus. Making the aperture smaller increases the depth of field (figure 9.15).
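Putting the depth-of-field procedure into a sketch (trace is an assumed routine that returns the colour seen along a ray, and the viewplane is taken to lie at constant z, which is also an assumption):

    import random

    def dof_pixel_colour(pixel_pos, focal_point, aperture, trace, count=8):
        total = [0.0, 0.0, 0.0]
        for _ in range(count):
            # jitter the ray origin over the aperture region around the pixel
            ox = pixel_pos[0] + random.uniform(-aperture / 2, aperture / 2)
            oy = pixel_pos[1] + random.uniform(-aperture / 2, aperture / 2)
            origin = (ox, oy, pixel_pos[2])
            # aim every jittered ray at the focal point, so the focal plane stays sharp
            direction = tuple(f - o for f, o in zip(focal_point, origin))
            colour = trace(origin, direction)
            total = [t + ci for t, ci in zip(total, colour)]
        return tuple(t / count for t in total)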