CS424 Notes, 7 March 2012
- Projection in Context
- Remember that projection maps eye coordinates to clip coordinates. Projection is the last transformation that you apply to a vertex. The projection map just has to pick out a box -- rectangular or pyramidal -- whose contents will be visible when the scene is rendered. Before projection is applied, the vertex is in a coordinate system in which the eye is at the origin, looking down the negative z-axis. In eye coordinates, the box is given in terms of limits on x, y, and z. In particular, the near and far distances specify z-values. In terms of world coordinates, however, the box can be anywhere. The near and far distances are not z-coordinates, but they still specify distances from the viewer. Similarly, the left and right limits are no longer x-coordinates, but they still specify distances to the left or right of the viewer.
- To get gl_Position in the vertex shader, a vertex is first multiplied by the modelview transformation and then by the projection transformation. This might look like:
gl_Position = projection * modelview * vec4(vertexCoords,1.0);
However, since this multiplication will be done for every vertex, it is probably more efficient to multiply the projection matrix by the modelview matrix in JavaScript and pass the combined matrix to the vertex shader as a single uniform variable. Then gl_Position might be computed as
gl_Position = projmodelview * vec4(vertexCoords,1.0);
However, as mentioned previously, the modelview matrix might still be necessary in the vertex or fragment shader, since it is used in lighting calculations.
- The function mat4.multiply(projection,modelview,projmodelview) can do the multiplication in JavaScript, leaving projection and modelview unmodified and putting the result in projmodelview.
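For concreteness, here is a minimal sketch of the JavaScript side. (The variable projModelViewLoc is a hypothetical uniform location, assumed to have been obtained earlier with gl.getUniformLocation.)
var projmodelview = mat4.create();
mat4.multiply( projection, modelview, projmodelview );  // projection and modelview are left unchanged
gl.uniformMatrix4fv( projModelViewLoc, false, projmodelview );  // send the combined matrix to the shader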
- mat4.frustum
- mat4.frustum(left,right,bottom,top,near,far) creates a projection matrix for a perspective projection in which the view volume is a truncated pyramid. The pyramid extends along the negative z-axis (in eye coordinates) from -near to -far. left and right give the limits on the x-coordinate at the near distance, that is, at the top of the truncated pyramid. Similarly, bottom and top give the limits on the y-coordinate at the near distance.
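A typical call might look like this (the specific numbers are just for illustration; they give a near rectangle with a 3-to-2 aspect ratio):
projection = mat4.frustum( -1.5, 1.5, -1, 1, 2, 12 );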
- near and far should be chosen so that everything that you want to see in the scene is included in that range of distances from the eye. However, you should not just make near tiny and far huge. The reason is that the entire range of values from near to far is mapped onto the depth buffer, which generally only has 16 bits per pixel. That means that the depth buffer can only distinguish 65,536 different depths. Suppose that far - near is 65,536 and that your scene actually occupies only one unit in the z-direction. In that case, every point in your scene is at the same depth, as far as the depth buffer is concerned, and the depth buffer algorithm becomes completely useless! The moral is to use a near/far range that is reasonably well adapted to the actual dimensions of your scene.
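As a rough rule of thumb, assuming a 16-bit depth buffer and the linear depth mapping of an orthographic projection (a perspective projection maps depth nonlinearly, concentrating precision near the near plane, but the moral is the same):
var depthStep = (far - near) / 65536;  // depths closer together than this can land in the same depth-buffer value
With far - near equal to 65,536, the step is a full unit, which is hopeless for a scene that is only one unit deep.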
- The fact that left, right, bottom, and top represent distances at the near distance is
unfortunate. It would be nicer if they represented distances at the center of view, that is, at the
distance from eye to center in mat4.lookAt. However, viewing and
projection are completely separate operations for mat4, and there is no connection between them.
Fortunately, there is a simple conversion that you can use: Suppose that d is the distance from the
eye to the center of view and that you want the visible ranges of x and y values at that distance
to be xmin to xmax and ymin to ymax. Then the values of left,
right, bottom, and top for mat4.frustum can be computed as
left = (near/d)*xmin; right = (near/d)*xmax; bottom = (near/d)*ymin; top = (near/d)*ymax;
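This conversion could be wrapped up in a small helper function. (frustumAtDistance is a hypothetical name, not part of gl-matrix; it is just the formula above in code form.)
function frustumAtDistance( d, xmin, xmax, ymin, ymax, near, far ) {
   var scale = near / d;  // rescale the limits from the center-of-view distance back to the near plane
   return mat4.frustum( scale*xmin, scale*xmax, scale*ymin, scale*ymax, near, far );
}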
- Note that the x and y ranges are usually symmetric about 0. That is, left = -right and bottom = -top. However, there is no requirement that this be true. If it's not, then the view volume is a tilted pyramid in which the axis is not perpendicular to the base. (I find it easier not to think about this possibility.)
- mat4.perspective
- mat4.perspective(fovy,aspect,near,far) is an alternative function for creating a perspective projection matrix. In this method, near and far have the same meanings as in mat4.frustum. The first parameter, fovy, is the "field of view in the y direction". It represents an angle, measured in degrees, between the top plane of the view pyramid and the bottom plane. The value must be between 0 and 180; a typical value is 45. The aspect parameter is the ratio between the horizontal size of the pyramid, from left to right, and the vertical size, measured from bottom to top. The aspect is almost always set to match the aspect ratio of the "viewport" where the image is drawn; for WebGL, this generally means canvas.width/canvas.height.
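A typical call, using the canvas's aspect ratio (the particular fovy, near, and far values here are just illustrations):
projection = mat4.perspective( 45, canvas.width/canvas.height, 1, 50 );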
- Note that mat4.perspective always produces a view in which the x and y ranges are symmetric about 0.
- There is no similar function in standard OpenGL, but one is provided in the common OpenGL utility library GLU. The name of the function in that library is gluPerspective.
- The definition of mat4.perspective in gl-matrix.js is very simple:
mat4.perspective = function (fovy, aspect, near, far, dest) {
    var top = near * Math.tan(fovy * Math.PI / 360.0),
        right = top * aspect;
    return mat4.frustum(-right, right, -top, top, near, far, dest);
};
(The mathematicians in the audience can figure this one out, if they want. The key facts are that tan(fovy/2) is the ratio top/near, so that top = near*tan(fovy/2), and that the factor Math.PI/360 both converts degrees to radians and divides the angle by two.)
- mat4.ortho
- mat4.ortho(left,right,bottom,top,near,far) sets up an orthographic projection in which the view volume is a rectangular solid. The first four parameters give the limits on x and y (in eye coordinates); since this is an orthographic projection, the limits don't depend on the distance from the eye. The last two parameters give the limits on z, relative to the eye. That is, a point is in the view volume if its z-coordinate, in eye coordinates, is between -near and -far.
- The odd thing about this is that there really is no natural concept of eye position for an orthographic projection. There is no point at which all the lines of view converge (or, if there is one, it is a point at infinite distance). The "eye" is really just the origin in the eye coordinate system.
- In fact, there is no requirement that near and far be positive numbers. For example, in mat4.ortho(-5,5,-5,5,-5,5), the view volume extends from -5 to 5 in the x, y, and z directions (in eye coordinates). The surprise here is that the first z-limit, -5, actually refers to the z-coordinate 5 on the positive z-axis, while the second z-limit, 5, actually refers to the z-coordinate -5 on the negative z-axis. Remember that near and far are given in terms of distance from the eye, not in terms of z-coordinates. For symmetric view volumes, the distinction doesn't matter, but for mat4.ortho(-5,5,-5,5,5,10), you need to remember that the z-limits, 5 to 10, correspond to eye-coordinate z-values from -5 to -10. It's easiest to think of them as meaning that things between 5 and 10 units in front of the eye are visible in the scene.
- In the example from last time, the view and
projection are set up with
modelview = mat4.lookAt( [5,5,10], [0,0,0], [0,1,0] );
projection = mat4.ortho(-3,3,-3,3,9,15);
The distance from the eye, [5,5,10], to the view center, [0,0,0], is about 12. (More accurately, about 12.247.) The near and far distances are chosen to be about three units on either side of this distance, so that the view center is about in the center of the view volume and the range of z values is the same size as the range of x and y. (What would happen if you changed "ortho" to "frustum" for the projection matrix?)
- Homogeneous Coordinates
- It's time to start talking about the fourth coordinate!
- To represent affine transforms in 3D, we use 4-by-4 matrices and vectors with four coordinates, (x,y,z,w). To transform (x,y,z), add a 1 in the w-position to get (x,y,z,1). Multiply by the affine transform matrix, and then discard the 1 in the resulting vector.
- However, perspective projection is not an affine transform! When you multiply a vector (x,y,z,1) by a perspective projection matrix, the resulting vector will have a w-coordinate that is not equal to 1. So, how can we interpret the result as a point in 3D?
- The answer is provided by homogeneous coordinates. The short answer is that a vector (x,y,z,w), where w is any non-zero number, represents the point (x/w,y/w,z/w) in three-dimensional space. In homogeneous coordinates, the point (1,2,3) could be represented by (1,2,3,1), (2,4,6,2), (-1,-2,-3,-1), (0.5,1,1.5,0.5), or even (π,π*2,π*3,π). The division by w to give (x/w,y/w,z/w) is called perspective division when it is done in connection with perspective transformations.
- Graphics hardware is perfectly capable of dealing with vectors given in homogeneous coordinates. In fact, homogeneous coordinates are its native language. When you compute gl_Position using a perspective projection matrix, the result will very likely have a w-coordinate that is not equal to one. You do not have to worry about this; the graphics hardware will do the perspective division and get the correct point.
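To make the perspective division concrete, here is a minimal sketch of the transformation and division (ignoring clipping), assuming a 4-by-4 matrix stored in column-major order, which is how gl-matrix stores its matrices:
function transformAndDivide( m, p ) {
   // multiply the homogeneous vector (x,y,z,1) by the matrix m
   var x = p[0], y = p[1], z = p[2];
   var ox = m[0]*x + m[4]*y + m[8]*z  + m[12];
   var oy = m[1]*x + m[5]*y + m[9]*z  + m[13];
   var oz = m[2]*x + m[6]*y + m[10]*z + m[14];
   var ow = m[3]*x + m[7]*y + m[11]*z + m[15];
   // the perspective division recovers an ordinary 3D point
   return [ ox/ow, oy/ow, oz/ow ];
}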
- Vectors of the form (x,y,z,0), in which the w-coordinate is 0, do not represent points in ordinary three-dimensional space. However, such a vector can be interpreted as representing a "point at infinity in the direction of (x,y,z)." In fact, this interpretation is used in standard OpenGL when specifying the position of a light. A light whose position is a "point at infinity" is a directional light, in which all the rays of light come from the same direction and are parallel to each other.
- If we have any extra time, we'll discuss some examples of modeling, viewing, and projecting in 3D.