Section 2.2
Into the Third Dimension
As we move our drawing into the third dimension, we have to deal with the addition of a third coordinate, a z-axis in addition to an x-axis and a y-axis. There's quite a bit more to it than that, however, since our goal is not just to define shapes in 3D. The goal is to make 2D images of 3D scenes that look like pictures of three-dimensional objects. Take a look, for example, at this picture of a set of three coordinate axes:
It's not too hard to see this as a picture of three arrows pointing in different directions in space. If we just drew three straight lines to represent the axes, it would look like nothing more than three lines on a 2D surface. In the above picture, on the other hand, each axis is actually built from a long, thin cylinder and a cone. Furthermore, the shapes are shaded in a way that imitates the way that light would reflect from curved shapes. Without this simulated lighting, the shapes would just look like flat patches of color instead of curved 3D shapes. (This picture was drawn with a short OpenGL program. Take a look at the sample program Axes3D.java if you are interested.)
In this section and in the rest of the chapter, we will be looking at some of the fundamental ideas of working with OpenGL in 3D. Some things will be simplified and maybe even oversimplified as we try to get the big picture. Later chapters will fill in more details.
2.2.1 Coordinate Systems
Our first task is to understand 3D coordinate systems. In 3D, the x-axis and the y-axis lie in a plane called the xy-plane. The z-axis is perpendicular to the xy-plane. But you see that we already have a problem. The origin, (0,0,0), divides the z-axis into two parts. One of these parts is the positive direction of the z-axis, and we have to decide which one. In fact, either choice will work.
In OpenGL, the default coordinate system identifies the xy-plane with the computer screen, with the positive direction of the x-axis pointing to the right and the positive direction of the y-axis pointing upwards. The z-axis is perpendicular to the screen. The positive direction of the z-axis points out of the screen, towards the viewer, and the negative z-axis points into the screen. This is a right-handed coordinate system: If you curl the fingers of your right hand in the direction from the positive x-axis to the positive y-axis, then your thumb points in the direction of the positive z-axis. (In a left-handed coordinate system, you would use your left hand in the same way to select the positive z-axis.)
This is only the default coordinate system. Just as in 2D, you can set up a different coordinate system for use in drawing. However, OpenGL will transform everything into the default coordinate system before drawing it. In fact, there are several different coordinate systems that you should be aware of. These coordinate systems are connected by a series of transforms from one coordinate system to the next.
The coordinates that you actually use for drawing an object are called object coordinates. The object coordinate system is chosen to be convenient for the object that is being drawn. A modeling transformation can then be applied to set the size, orientation, and position of the object in the overall scene (or, in the case of hierarchical modeling, in the object coordinate system of a larger, more complex object).
The coordinates in which you build the complete scene are called world coordinates. These are the coordinates for the overall scene, the imaginary 3D world that you are creating. Once we have this world, we want to produce an image of it.
In the real world, what you see depends on where you are standing and the direction in which you are looking. That is, you can't make a picture of the scene until you know the position of the "viewer" and where the viewer is looking (and, if you think about it, how the viewer's head is tilted). For the purposes of OpenGL, we imagine that the viewer is attached to their own individual coordinate system, which is known as eye coordinates. In this coordinate system, the viewer is at the origin, (0,0,0), looking in the direction of the negative z-axis (and the positive direction of the y-axis is pointing straight up). This is a viewer-centric coordinate system, and it's important because it determines what exactly is seen in the image. In other words, eye coordinates are (almost) the coordinates that you actually want to use for drawing on the screen. The transform from world coordinates to eye coordinates is called the viewing transform.
If this is confusing, think of it this way: We are free to use any coordinate system that we want on the world. Eye coordinates are the natural coordinate system for making a picture of the world as seen by a viewer. If we used a different coordinate system (world coordinates) when building the world, then we have to transform those coordinates to eye coordinates to find out what the viewer actually sees. That transformation is the viewing transform.
Note, by the way, that OpenGL doesn't keep track of separate modeling and viewing transforms. They are combined into a single modelview transform. Although the distinction between modeling transforms and viewing transforms is important conceptually, the distinction is one of convenience only: the same transform can be thought of as a modeling transform, placing objects into the world, or as a viewing transform, placing the viewer into the world. In fact, OpenGL doesn't even use world coordinates internally -- it goes directly from object coordinates to eye coordinates by applying the modelview transformation.
We are not done. The viewer can't see the entire 3D world, only the part that fits into the viewport, the rectangular region of the screen or other display device where the image will be drawn. We say that the scene is clipped by the edges of the viewport. Furthermore, in OpenGL, the viewer can see only a limited range of z-values. Objects with larger or smaller z-values are also clipped away and are not rendered into the image. (This is not, of course, the way that viewing works in the real world, but it's required by the way that OpenGL works internally.) The volume of space that is actually rendered into the image is called the view volume. Things inside the view volume make it into the image; things that are not in the view volume are clipped and cannot be seen. For purposes of drawing, OpenGL applies a coordinate transform that maps the view volume onto a cube. The cube is centered at the origin and extends from −1 to 1 in the x-direction, in the y-direction, and in the z-direction. The coordinate system on this cube is referred to as normalized device coordinates. The transformation from eye coordinates to normalized device coordinates is called the projection transformation. At this point, we haven't quite projected the 3D scene onto a 2D surface, but we can now do so simply by discarding the z-coordinate.
We still aren't done. In the end, when things are actually drawn, there are device coordinates, the 2D coordinate system in which the actual drawing takes place on a physical display device such as the computer screen. Ordinarily, in device coordinates, the pixel is the unit of measure. The drawing region is a rectangle of pixels. This is the rectangle that we have called the viewport. The viewport transformation takes x and y from the normalized device coordinates and scales them to fit the viewport.
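Concretely, if the viewport has been set with glViewport(x,y,width,height), then the viewport transform maps normalized device coordinates, which run from −1 to 1, onto that rectangle of pixels. Here is a minimal sketch of the computation; OpenGL does this internally, so this is not code that you would ever write yourself:

// A sketch of the viewport transformation that OpenGL applies internally.
// (ndcX, ndcY) are normalized device coordinates, each in the range -1 to 1;
// x, y, width, and height are the values that were passed to glViewport.
static double[] viewportTransform(double ndcX, double ndcY,
                                  int x, int y, int width, int height) {
    double deviceX = x + (ndcX + 1) / 2 * width;
    double deviceY = y + (ndcY + 1) / 2 * height;
    return new double[] { deviceX, deviceY };
}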
Let's go through the sequence of transformations one more time. Think of a primitive, such as a line or polygon, that is part of the world and that might appear in the image that we want to make of the world. The primitive goes through the following sequence of operations:
- The points that define the primitive are specified in object coordinates, using methods such as glVertex3f.
- The points are first subjected to the modelview transformation, which is a combination of the modeling transform that places the primitive into the world and the viewing transform that maps the primitive into eye coordinates.
- The projection transformation is then applied to map the view volume that is visible to the viewer onto the normalized device coordinate cube. If the transformed primitive lies outside that cube, it will not be part of the image, and the processing stops. If part of the primitive lies inside and part outside, the part that lies outside is clipped away and discarded, and only the part that remains is processed further.
- Finally, the viewport transform is applied to produce the device coordinates that will actually be used to draw the primitive on the display device. After that, it's just a matter of deciding how to color individual pixels to draw the primitive on the device.
All this still leaves open the question of how you actually work with this complicated series of transformations. Fortunately, you only have to set up the viewport, modelview, and projection transforms; OpenGL will do all the calculations that are required to implement them.
With Jogl, the viewport is automatically set to what is usually the correct value, that is, to use the entire available drawing area as the viewport. This is done just before the reshape method of the GLEventListener is called. It is possible, however, to set up a different viewport by calling gl.glViewport(x,y,width,height), where (x,y) is the lower left corner of the rectangle that you want to use for drawing, width is the width of the rectangle, and height is the height. These values are given in device (pixel) coordinates. Note that in OpenGL device coordinates, the minimal y value is at the bottom, and y increases as you move up; this is the opposite of the convention in Java's Graphics2D. You might use glViewport, for example, to draw two or more views of the same scene in different parts of the drawing area: In your display method, you would use glViewport to set up a viewport in part of the drawing area and draw the first view of the scene. You would then use glViewport again to set up a different viewport in another part of the drawing area and draw the scene again, from a different point of view.
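Here, for example, is a sketch of a display method that shows two views of a scene side-by-side. It assumes that the drawing area is 600 pixels wide and 300 pixels high, and drawFirstView and drawSecondView are hypothetical methods standing in for whatever drawing code you use for the two views:

public void display(GLAutoDrawable drawable) {
    GL gl = drawable.getGL();
    gl.glClear(GL.GL_COLOR_BUFFER_BIT | GL.GL_DEPTH_BUFFER_BIT);
    gl.glViewport(0, 0, 300, 300);     // Left half of the 600-by-300 drawing area.
    drawFirstView(gl);                 // Hypothetical method: draw the first view.
    gl.glViewport(300, 0, 300, 300);   // Right half of the drawing area.
    drawSecondView(gl);                // Hypothetical method: draw the second view.
}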
The modelview transform is a combination of modeling and viewing transforms. Modeling is done by applying the basic transform methods glScalef, glRotatef, and glTranslatef (or their double precision equivalents). These methods can also be used for viewing. However, it can be clumsy to get the exact view that you want using these methods. In the next chapter, we'll look at more convenient ways to set up the view. For now, we will just use the default view, in which the viewer is on the z-axis, looking in the direction of the negative z-axis.
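As a simple sketch, a display method that uses the default view might apply a modeling transform to place each object in the scene, along the following lines. Here drawCube and drawCone are hypothetical stand-ins for your own object-drawing code:

public void display(GLAutoDrawable drawable) {
    GL gl = drawable.getGL();
    gl.glClear(GL.GL_COLOR_BUFFER_BIT | GL.GL_DEPTH_BUFFER_BIT);
    gl.glLoadIdentity();               // Start the modelview transform from scratch (default view).
    gl.glPushMatrix();
    gl.glTranslatef(-0.5f, 0, 0);      // Modeling transform for the first object.
    gl.glRotatef(30, 0, 1, 0);
    drawCube(gl);                      // Hypothetical method: draw the first object.
    gl.glPopMatrix();
    gl.glPushMatrix();
    gl.glTranslatef(0.7f, 0, 0);       // Modeling transform for the second object.
    gl.glScalef(0.5f, 0.5f, 0.5f);
    drawCone(gl);                      // Hypothetical method: draw the second object.
    gl.glPopMatrix();
}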
Finally, to work with the projection transform, you need to know a little more about how OpenGL handles transforms. Internally, transforms are represented as matrices (two-dimensional arrays of numbers), and OpenGL uses the terms projection matrix and modelview matrix instead of projection transform and modelview transform. OpenGL keeps track of these two matrices separately, but it only lets you work on one or the other of these matrices at a time. You select the matrix that you want to work on by setting the value of an OpenGL state variable. To select the projection matrix, use
gl.glMatrixMode(GL.GL_PROJECTION);
To select the modelview matrix, use
gl.glMatrixMode(GL.GL_MODELVIEW);
All operations that affect the transform, such as glLoadIdentity, glScalef, and glPushMatrix, affect only the currently selected matrix. (OpenGL keeps separate stacks of matrices for use with glPushMatrix and glPopMatrix, one for use with the projection matrix and one for use with the modelview matrix.)
The projection matrix is used to establish the view volume, the part of the world that is rendered onto the display. The view volume is expressed in eye coordinates, that is, from the point of view of the viewer. (Remember that the projection transform is the transform from eye coordinates onto the standard cube that is used for normalized device coordinates.) OpenGL has two methods for setting the view volume, glOrtho and glFrustum. These two methods represent two different kinds of projection, orthographic projection and perspective projection. Although glFrustum gives more realistic results, it's harder to understand, and we will put it aside until the next chapter. For now, we consider a simple version of glOrtho. We assume that we want to view objects that lie in a cube centered at the origin. We can use glOrtho to establish this cube as the view volume. If the cube stretches from −s to s in the x, y, and z directions, then the projection can be set up by calling
gl.glOrtho(-s,s,-s,s,-s,s);
In general, though, we want the view volume to have the same aspect ratio as the viewport, so we need to expand the view volume in either the x or the y direction to match the aspect ratio of the viewport. This can be done most easily in the reshape method, where we know the aspect ratio of the viewport. Remember that we must call glMatrixMode to switch to the projection matrix. Then, after setting up the projection, we call glMatrixMode again to switch back to the modelview matrix, so that all further transform operations will affect the modelview transform. Putting this all together, we get the following reshape method:
public void reshape(GLAutoDrawable drawable, int x, int y, int width, int height) {
    GL gl = drawable.getGL();
    double s = 1.5;  // Limits of cube that we want to view go from -s to s.
    gl.glMatrixMode(GL.GL_PROJECTION);
    gl.glLoadIdentity();  // Start with the identity transform.
    if (width > height) {  // Expand x limits to match viewport aspect ratio.
        double ratio = (double)width/height;
        gl.glOrtho(-s*ratio, s*ratio, -s, s, -s, s);
    }
    else {  // Expand y limits to match viewport aspect ratio.
        double ratio = (double)height/width;
        gl.glOrtho(-s, s, -s*ratio, s*ratio, -s, s);
    }
    gl.glMatrixMode(GL.GL_MODELVIEW);
}
This method is used in the sample program Axes3D.java. Still, it would be nice to have a better way to set up the view, and so I've written a helper class that you can use without really understanding how it works. The class is Camera, and you can find it in the source package glutil along with several other utility classes that I've written. A Camera object takes responsibility for setting up both the projection transform and the view transform. By default, it uses a perspective projection, and the region with x, y, and z limits from −5 to 5 is in view. If camera is a Camera, you can change the limits on the view volume by calling camera.setScale(s). This will change the x, y, and z limits to range from −s to s. To use the camera, you should call camera.apply(gl) at the beginning of the display method to set up the projection and view. In this case, there is no need to define the reshape method. Cameras are used in the remaining sample programs in this chapter.
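For example, a display method that uses a camera might look something like the following sketch. It assumes that camera is an instance variable of type Camera, created here with a no-argument constructor (an assumption on my part), and that drawScene is a hypothetical method containing your own drawing code:

private Camera camera = new Camera();   // Assumed: glutil.Camera with a no-argument constructor.

public void display(GLAutoDrawable drawable) {
    GL gl = drawable.getGL();
    gl.glClear(GL.GL_COLOR_BUFFER_BIT | GL.GL_DEPTH_BUFFER_BIT);
    camera.setScale(2);    // View volume limits run from -2 to 2, instead of the default -5 to 5.
    camera.apply(gl);      // Set up the projection and viewing transforms.
    drawScene(gl);         // Hypothetical method: draw the contents of the scene.
}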
Note that even when drawing in 2D, you need to set up a projection transform. Otherwise, you can only use coordinates between −1 and 1. When setting up the projection matrix for 2D drawing, you can use glOrtho to specify the range of x and y coordinates that you want to use. The range of z coordinates is not important, as long as it includes 0. For example:
gl.glMatrixMode(GL.GL_PROJECTION);
gl.glLoadIdentity();
gl.glOrtho(xmin, xmax, ymin, ymax, -1, 1);
gl.glMatrixMode(GL.GL_MODELVIEW);
2.2.2 Essential Settings
Making a realistic picture in 3D requires a lot of computation. Certain features of this computation are turned off by default, since they are not always needed and certainly not for 2D drawing. When working in 3D, there are a few essential features that you will almost always want to enable. Often, you will do this in the init method.
Perhaps the most essential feature for 3D drawing is the depth test. When one object lies behind another, only one of the objects can be seen, and which one is seen depends on which is closer to the viewer, not on the order in which the objects are drawn. Now, OpenGL always draws objects in the order in which they are generated in the code. However, for each pixel that it draws in an object, it first tests whether there is already another object at that pixel that is closer to the viewer than the object that is being drawn. In that case, it leaves the pixel unchanged, since the object that is being drawn is hidden from sight by the object that is already there. This is the depth test. "Depth" here really means distance from the viewer, and it is essentially the z-coordinate of the object, expressed in eye coordinates. In order to implement the depth test, OpenGL uses a depth buffer. This buffer stores one value for each pixel, which represents the eye-coordinate z-value of the object that is drawn at that pixel, if any. (There is a particular value that represents "no object here yet.") By default, the depth test is disabled. If you don't enable it, then things might appear in your picture that should really be hidden behind other objects. The depth test is enabled by calling
gl.glEnable(GL.GL_DEPTH_TEST);
This is usually done in the init method. You can disable the depth test by calling
gl.glDisable(GL.GL_DEPTH_TEST);
and you might even want to do so in some cases. (Note that any feature that can be enabled can also be disabled. In the future, I won't mention this explicitly.)
To use the depth test correctly, it's not enough to enable it. Before you draw anything, the depth buffer must be set up to record the fact that no objects have been drawn yet. This is called clearing the depth buffer. You can do this by calling glClear with a flag that indicates that it's the depth buffer that you want to clear:
gl.glClear(GL.GL_DEPTH_BUFFER_BIT);
This should be done at the beginning of the display method, before drawing anything, at the same time that you clear the color buffer. In fact, the two operations can be combined into one method call, by "or-ing" together the flags for the two buffers:
gl.glClear(GL.GL_COLOR_BUFFER_BIT | GL.GL_DEPTH_BUFFER_BIT);
This might be faster than clearing the two buffers separately, depending on how the clear operations are implemented by the graphics hardware.
There is one issue with the depth test that you should be aware of. What happens when two objects are actually at the same distance from the viewer? Suppose, for example, that you draw one square inside another, lying in the same plane. Does the second square appear or not? You might expect the square that is drawn second to appear, as it would if the depth test were disabled. The truth is stranger. Because of the inevitable inexactness of real-number computations, the computed z-values for the two squares at a given pixel might be different -- and which one is greater might differ from pixel to pixel. Here is a real example in which a black square is drawn inside a white square, which is inside a gray square. The whole picture is rotated a bit to force OpenGL to do some real-number computations. The picture on the left is drawn with the depth test enabled, while it is disabled for the picture on the right:
When the depth test is applied, there is no telling which square will end up on top at a given pixel. A possible solution would be to move the white square a little bit forward and the black square forward a little bit more -- not so much that the change will be visible, but enough to clear up the ambiguity about which square is in front. This will enable the depth test to produce the correct result for each pixel, in spite of small computational errors.
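Here is a minimal sketch of that idea. It assumes that lighting is off, so that glColor3f sets the drawing color, and that drawSquare is a hypothetical method that draws a square of a given size in the xy-plane:

gl.glColor3f(0.5f, 0.5f, 0.5f);
drawSquare(gl, 1.0);              // Hypothetical method: draw the gray square, side 1.0.
gl.glPushMatrix();
gl.glTranslatef(0, 0, 0.001f);    // Move the white square slightly toward the viewer.
gl.glColor3f(1, 1, 1);
drawSquare(gl, 0.8);              // White square, now unambiguously in front of the gray one.
gl.glTranslatef(0, 0, 0.001f);    // Move the black square forward a little bit more.
gl.glColor3f(0, 0, 0);
drawSquare(gl, 0.6);              // Black square, in front of both.
gl.glPopMatrix();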
The depth test ensures that the right object is visible at each pixel, but it's not enough to make a 3D scene look realistic. For that, you usually need to simulate lighting of the scene. Lighting is disabled by default. It can be enabled by calling
gl.glEnable(GL.GL_LIGHTING);
This is possibly the single most significant command in OpenGL. Turning on lighting changes the rendering algorithm in fundamental ways. For one thing, if you turn on lighting and do nothing else, you won't see much of anything! This is because, by default, no lights are turned on, so there is no light to illuminate the objects in the scene. You need to turn on at least one light:
gl.glEnable(GL.GL_LIGHT0);
This command turns on light number zero, which by default is a white light that shines on the scene from the direction of the viewer. It is sufficient to give decent, basic illumination for many scenes. It's possible to add other lights and to set properties of lights such as color and position. However, we will leave that for Chapter 4.
Turning on lighting has another major effect: If lighting is on, then the current drawing color, as set for example with glColor3f, is not used in the rendering process. Instead, the current material is used. Material is more complicated than color, and there are special commands for setting material properties. The default material is a rather ugly light gray. We will consider material properties in Chapter 4.
To get the best effect from lighting, you will also want to use the following command:
gl.glShadeModel(GL.GL_SMOOTH);
This has to do with the way that the interiors of polygons are filled in. When drawing a polygon, OpenGL does many calculations, including lighting calculations, only at the vertices of the polygon. The results of these calculations are then interpolated to the pixels inside the polygon. This can be much faster than doing the full calculation for each pixel. OpenGL computes a color for each vertex, taking lighting into account if lighting is enabled. It then has to decide how to use the color information from the vertices to color the pixels in the polygon. (OpenGL does the same thing for lines, interpolating values calculated for the endpoints to the rest of the line.) The default, which is called flat shading, is to simply copy the color from the first vertex of the polygon to every other pixel. This results in a polygon that is a uniform color, with no shading at all. A better alternative, called smooth shading, smoothly varies the colors from the vertices across the face of the polygon. Setting the shade model to GL_SMOOTH tells OpenGL to use smooth shading. You can return to flat shading by calling
gl.glShadeModel(GL.GL_FLAT);
Putting together all the essential settings that we have talked about, we get the following init method for 3D drawing with basic lighting:
public void init(GLAutoDrawable drawable) {
    GL gl = drawable.getGL();
    gl.glClearColor(0, 0, 0, 1);       // Set background color.
    gl.glEnable(GL.GL_LIGHTING);       // Turn on lighting.
    gl.glEnable(GL.GL_LIGHT0);         // Turn on light number 0.
    gl.glEnable(GL.GL_DEPTH_TEST);     // Turn on the depth test.
    gl.glShadeModel(GL.GL_SMOOTH);     // Use smooth shading.
}
Let's look at a picture that shows some of the effects of these commands. OpenGL has no sphere-drawing command, but we can approximate a sphere with polygons. In this case, a rather small number of polygons is used, giving only a rough approximation of a sphere. Four spheres are shown, rendered with different settings:
In the sphere on the left, only the outlines of the polygons are drawn. This is called a wireframe model. Lighting doesn't work well for lines, so the wireframe model is drawn with lighting turned off. The color of the lines is the current drawing color, which has been set to white using glColor3f.
The second sphere from the left is drawn using filled polygons, but with lighting turned off. Since no lighting or shading calculations are done, the polygons are simply filled with the current drawing color, white. Nothing here looks like a 3D sphere; we just see a flat patch of color.
The third and fourth spheres are drawn with lighting turned on, using the default light gray material color. The difference between the two spheres is that flat shading is used for the third sphere, while smooth shading is used for the fourth. You can see that drawing the sphere using lighting and smooth shading gives the most realistic appearance. The realism could be increased even more by using a larger number of polygons to draw the sphere.
There are several other common, but not quite so essential, settings that you might use for 3D drawing. As noted above, when lighting is turned on, the color of an object is determined by its material properties. However, if you want to avoid the complications of materials and still be able to use different colors, you can turn on a feature that causes OpenGL to use the current drawing color for the material:
gl.glEnable(GL.GL_COLOR_MATERIAL);
This causes the basic material color to be taken from the color set by glColor3f or similar commands. (Materials have other aspects besides this basic color, but setting the basic color is often sufficient.) Examples in this chapter and the next will use this feature.
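With this feature enabled, you can set the material color the same way that you would set the drawing color when lighting is off. For example, here is a sketch of drawing two objects in different colors, where drawSphere and drawCube are hypothetical object-drawing methods:

gl.glEnable(GL.GL_LIGHTING);
gl.glEnable(GL.GL_LIGHT0);
gl.glEnable(GL.GL_COLOR_MATERIAL);   // Use the current drawing color as the material color.
gl.glColor3f(1, 0, 0);               // The next object will be drawn with a red material.
drawSphere(gl);                      // Hypothetical method: draw a lit, red sphere.
gl.glColor3f(0, 0, 1);               // Switch to a blue material.
drawCube(gl);                        // Hypothetical method: draw a lit, blue cube.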
Another useful feature is two-sided lighting. OpenGL distinguishes the two sides of a polygon. One side is the front side, and one is the back side. Which is the front side and which is the back is determined by the order in which the vertices of the polygon are specified when it is drawn. (By default, the order is counterclockwise if you are looking at the front face and is clockwise if you are looking at the back face.) In the default lighting model, the back faces are not properly lit. This is because in many 3D scenes, the back faces of polygons face the insides of objects and will not be visible in the scene; the calculation that is required to light them would be wasted. However, if the back sides of some polygons might be visible in your scene, then you can turn on two-sided lighting, at least when you are drawing those polygons. This forces the usual lighting calculations to be done for the back sides of polygons. To turn on two-sided lighting, use the command:
gl.glLightModeli(GL.GL_LIGHT_MODEL_TWO_SIDE, GL.GL_TRUE);
You can turn it off using the same command with parameter GL.GL_FALSE in place of GL.GL_TRUE.
Finally, I will mention the GL_NORMALIZE option, which can be enabled with
gl.glEnable(GL.GL_NORMALIZE);
This has to do with the "normal vectors" that will be discussed in Section 2.4. (For now, just think of a vector as an arrow that has a length and a direction.) This option should be enabled in two circumstances: if you supply normal vectors that do not have length equal to 1, or if you apply scaling transforms. (Rotation and translation are OK without it.) Correct lighting calculations require normal vectors of length 1, but nothing forces you to supply vectors of proper length. Furthermore, scaling transforms are applied to normal vectors as well as to geometry, and they can increase or decrease the lengths of the vectors. GL_NORMALIZE forces OpenGL to adjust the length of all normal vectors to length 1 before using them in lighting calculations. This adds some significant computational overhead to the rendering process, so this feature is turned off by default. However, if you fail to turn it on when it is needed, the lighting calculations for your image will be incorrect.
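For example, if your scene applies a scaling transform to an object, you might enable GL_NORMALIZE in the init method. Here is a minimal sketch, where drawSphere stands for a hypothetical method that draws an object along with its own normal vectors:

// In the init method: turn on normalization, since a scaling transform is used below.
gl.glEnable(GL.GL_NORMALIZE);

// Later, in the display method:
gl.glPushMatrix();
gl.glScalef(2, 2, 2);    // Scaling also changes the lengths of normal vectors...
drawSphere(gl);          // ...but GL_NORMALIZE restores them to length 1 for lighting.
gl.glPopMatrix();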