CPSC 441, Networking, Fall 2004
Lab 10: Running MPI


This lab introduces you to working with MPI on the computers in the Math/CS lab. You will be running various programs and testing their performance. You will also compile a pre-written MPI program.

This lab will be due, along with further work on MPI, on Monday, December 6 (the beginning of the last week of classes).


Setup and Initial Testing

Before using MPI, you must set up your account as described in the first section of the About MPI handout. Make sure that this one-time setup is done before you proceed.

Before running MPI programs, you must start the MPD daemons. To do this, cd to the directory /classes/f04/cs441/MPI and run the script start_mpd.sh. This will start a daemon on each of the 12 machines in the lab, so that you will have a 12-processor virtual MPI machine to run your programs on.

You can let your virtual machine run during the entire lab. But before you log out, run the command mpdallexit to properly shut down the virtual machine.

To test that your virtual machine is running, use the command

         mpirun -np 12 hello_mpi

(while still in the /classes/f04/cs441/MPI directory). The console should receive and display a message from each of the 12 processes.

If you run the command mpirun -np 12 hello_mpi a second time, you will notice that it starts much faster. Apparently, the virtual machine retains some state between runs, so startup is quicker after the first run.

By the way, the messages from the processes in this program are not necessarily received in numerical order. You might need to run the program a few times to observe this.
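The out-of-order arrival can be imitated without MPI. The following is a minimal Python sketch, not the source of hello_mpi: threads stand in for the 12 processes, a queue stands in for the console, and the names collect_greetings and worker are invented for this illustration.

```python
import queue
import threading

def collect_greetings(nprocs=12):
    """Start nprocs workers and return their messages in arrival order."""
    inbox = queue.Queue()  # stands in for the console that receives messages

    def worker(rank):
        # Each worker sends one message; nothing forces rank order.
        inbox.put(f"Hello from process {rank}")

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(nprocs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Messages come out in the order they arrived, not in rank order.
    return [inbox.get() for _ in range(nprocs)]
```

Every rank shows up exactly once in the returned list, but the order is whatever the scheduler happened to produce. As with the lab program, the messages will often arrive in numerical order anyway, so a few runs may be needed to see a different ordering.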


Speedup in a Simple Program

Change to the subdirectory named primes of /classes/f04/cs441/MPI. This directory contains several programs for counting prime numbers between 2 and some upper limit. The program primes_uniprocessor is a regular non-parallel-processing program. If you run it with no command-line argument, it will count primes between 2 and 1,000,000. You can specify a different upper limit as a command-line argument.
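To make the task concrete, here is a minimal Python sketch of counting primes in a range. It is only an illustration: this handout does not say which algorithm primes_uniprocessor uses, so the trial-division test and the function names below are assumptions.

```python
def is_prime(n):
    """Trial division by odd numbers up to the square root of n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def count_primes(lo, hi):
    """Count the primes p with lo <= p <= hi."""
    return sum(1 for n in range(lo, hi + 1) if is_prime(n))
```

Calling count_primes(2, 1000000) corresponds to running the lab program with no command-line argument, although the Python version will be far slower than a compiled program.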

The Linux "time" command can be applied to any command to find out how much time it uses. If you say

         time primes_uniprocessor

it will run the primes_uniprocessor program, then tell you the "real", "user", and "sys" times used by the program. The "real" time is simply the elapsed time. The sum of the "user" and "sys" times is the CPU time actually used by the program. The "sys" time is the time spent in kernel routines and will be very small for the types of program that we are running; generally, it can be ignored. The elapsed time can be significantly greater than the CPU time if the computer that you are using is working on multiple tasks at the same time.
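If you want to tabulate several runs, the three lines of the report can be turned into numbers. The sketch below assumes the report uses the bash layout, with lines such as "real 0m2.500s"; other shells print the times differently. The function names are invented for this example.

```python
import re

def to_seconds(stamp):
    """Convert a bash-style time stamp such as '1m2.500s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if m is None:
        raise ValueError(f"unrecognized time stamp: {stamp!r}")
    return 60 * int(m.group(1)) + float(m.group(2))

def summarize(report):
    """Return (elapsed, cpu) in seconds from the report printed by 'time'."""
    fields = {}
    for line in report.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("real", "user", "sys"):
            fields[parts[0]] = to_seconds(parts[1])
    # CPU time is user time plus kernel ("sys") time.
    cpu = fields["user"] + fields["sys"]
    return fields["real"], cpu
```

Pass summarize the text of a report and compare the two values it returns: a large gap between elapsed time and CPU time suggests that the machine was busy with other work.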

Try running "time primes_uniprocessor 10000000" to find out how long it takes to count primes between 2 and 10,000,000.

The MPI program named primes1 does the same thing as primes_uniprocessor, but it distributes parts of the task among the processes in an MPI virtual machine. Try running

         time mpirun -np 12 primes1 10000000
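One simple way to distribute such a task is to give each process a contiguous block of the range. The sketch below shows that idea in Python. It is an assumption made for illustration, since this handout does not describe how primes1 actually divides the work, and block_bounds is an invented name.

```python
def block_bounds(lo, hi, nprocs, rank):
    """Return the inclusive subrange of [lo, hi] assigned to one rank."""
    total = hi - lo + 1
    base, extra = divmod(total, nprocs)
    # The first 'extra' ranks take one additional number each,
    # so block sizes differ by at most one.
    start = lo + rank * base + min(rank, extra)
    size = base + (1 if rank < extra else 0)
    return start, start + size - 1
```

With this scheme, process number rank would count the primes in block_bounds(2, 10000000, 12, rank), and the 12 partial counts would be added together to get the final answer.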

Try to coordinate with other people in the class so that your job will not be competing for processing time with other jobs.

Question 1:  How much speedup do you see when you run the multiprocessing version of the program? (Look at the "real" time in the output from the time command! To avoid counting too much startup time, run the command twice in a row, and use the data from the second run.) How far is the actual speedup from the maximum possible speedup (12 times as fast when 12 processes are used)? When you time the mpirun command, what does the "user" time in the time report mean? Why?
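For the arithmetic in Question 1, speedup is conventionally the one-process time divided by the parallel time, and efficiency is the speedup divided by the number of processes. A small Python helper is sketched below; the function names are invented for this example.

```python
def speedup(t_serial, t_parallel):
    """How many times faster the parallel run is than the serial run."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, nprocs):
    """Fraction of the ideal speedup (nprocs) that was actually achieved."""
    return speedup(t_serial, t_parallel) / nprocs
```

With the placeholder values t_serial = 24.0 and t_parallel = 3.0, which are not measurements, the speedup is 8 and the efficiency on 12 processes is 8/12, or about 0.67. Substitute your own "real" times.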


Speedup in a Load-Balanced Program

Change into the mandelbrot subdirectory of the /classes/f04/cs441/MPI directory. This directory contains an MPI program named mandelbrot_mpi and some input files that can be used by this program. This is a GUI program, and since MPI runs it in some funny way, you will have to give the command:

         xhost localhost

before you can run it. Once you have done this, you can run the program with

         mpirun -np 12 mandelbrot_mpi

The mandelbrot program draws pictures of the so-called Mandelbrot set. This is a famous subset of the plane that has infinite detail. It was discovered by Benoit Mandelbrot, who was inspired by it to invent the theory of fractals. Use the "Start Drawing" command in the Control menu to see a picture. The Mandelbrot set is the black region of the picture. Points outside the set are colored to indicate a kind of "computational distance" from the set. We won't worry about the details here. Suffice it to say that it can take a fair amount of computation to determine how to color each point. If you click the mouse on any point, that point will be moved to the center of the picture and the picture will zoom in on that point by a factor of two. The input files for the program contain close-ups of several regions in the set. To see them, use the "Load Parameters" command in the File menu and select one of the files params1.mdb, params2.mdb, ..., params8.mdb.

The mandelbrot program uses a master/slave model of computation. Process 0 is the master. It sends out tasks to the other processes and receives the results. It also handles the GUI part of the program. In this case, each task consists of computing the colors of the pixels in one vertical column of pixels. When process 0 gets the data for one column from one of the slaves, it draws the corresponding column of pixels on the screen.

Note that when you run mandelbrot on a virtual machine with k processes, there are only k-1 processes doing computations. Process 0 does not do any of the computations itself. (So, you need a virtual machine with at least two processes to run the program at all.)
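The master/slave pattern described above can be sketched in Python without MPI. This is an illustration under stated assumptions, not the code of mandelbrot_mpi: threads stand in for processes, workers pull column numbers from a shared queue rather than receiving messages from process 0, and the picture region, the iteration limit, and all function names are invented. The per-point computation is the standard escape-time count, which may differ from what the lab program does.

```python
import queue
import threading

def escape_count(c, max_iter=100):
    """Iterations of z -> z*z + c before |z| exceeds 2 (max_iter if never)."""
    z = 0j
    for i in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return i
    return max_iter

def compute_column(col, width, height):
    """One task: all pixels in one column of the region [-2,1] x [-1.5,1.5]."""
    x = -2.0 + 3.0 * col / (width - 1)
    return [escape_count(complex(x, -1.5 + 3.0 * row / (height - 1)))
            for row in range(height)]

def render(width=40, height=20, nprocs=4):
    """The caller acts as process 0; nprocs - 1 workers do the computing."""
    if nprocs < 2:
        raise ValueError("need at least two processes: one master, one worker")
    tasks = queue.Queue()
    results = queue.Queue()
    for col in range(width):
        tasks.put(col)

    def worker():
        while True:
            try:
                col = tasks.get_nowait()   # take the next unclaimed column
            except queue.Empty:
                return                     # no work left
            results.put((col, compute_column(col, width, height)))

    workers = [threading.Thread(target=worker) for _ in range(nprocs - 1)]
    for w in workers:
        w.start()
    image = [None] * width
    for _ in range(width):
        col, pixels = results.get()        # master receives a finished column
        image[col] = pixels                # and "draws" it
    for w in workers:
        w.join()
    return image
```

Because a worker asks for a new column as soon as it finishes the previous one, a worker that draws cheap columns simply ends up doing more of them. That is the sense in which this design is load-balanced.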

Question 2:  Choose one of the param files and time (with your watch) how long it takes to draw the entire image on a virtual machine with 12 processes. Better, draw it a few times and take the average. Remember that only 11 of the processes are doing computations. Exit from the mandelbrot program and run it on a virtual machine with 2 processes, using mpirun -np 2 mandelbrot_mpi. Draw the same param file and time how long it takes. Remember that this time there is only one process doing computation. Also, try it with a few intermediate numbers of processes. Report on the times and what you observe about the speedup. (Note: Try to coordinate with other people in the class so that your job will not be competing for processing time with other jobs.)

Question 3:  Just for fun, play with the mandelbrot program to make a nice picture. Use the "Save Parameters" command in the File menu to make a param file for your picture. Email your param file to me. (It's just a very simple text file.)


Compiling an MPI Program

Getting back to the prime-counting task, it would be nice to know how much time each prime-counting task takes to do its job. The program primes3.cc in the directory /classes/f04/cs441/MPI/primes can be used to find out. This program is similar to primes1, but each process keeps track of the elapsed time for its task and reports its time to process 0. Process 0 displays the times on the console.
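The bookkeeping that such a program needs can be sketched in a few lines of Python. This is not the code of primes3.cc: the blocks here run one after another in a single process, whereas in primes3 each process times its own task and sends the result to process 0. The trial-division counter and the function names are assumptions made for the example.

```python
import time

def count_primes(lo, hi):
    """Count primes in [lo, hi] by trial division (an assumed method)."""
    count = 0
    for n in range(max(lo, 2), hi + 1):
        d = 2
        while d * d <= n and n % d != 0:
            d += 1
        if d * d > n:          # no divisor found up to sqrt(n)
            count += 1
    return count

def timed_blocks(blocks):
    """Run each (lo, hi) block in turn; return a list of (count, seconds)."""
    report = []
    for lo, hi in blocks:
        start = time.perf_counter()
        count = count_primes(lo, hi)
        report.append((count, time.perf_counter() - start))
    return report
```

Each entry of the returned list pairs the number of primes found in a block with the elapsed time for that block, which is the same information that process 0 displays for each process.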

The primes3.cc program is provided in source code form only. You will have to compile it before you can run it. To compile it, copy it into your own account and then use the command

         mpiCC primes3.cc

This will produce an MPI executable program named a.out (which you could rename to primes3 if you like). To use it to count the primes between 2 and 10,000,000, use the command

         mpirun -np 12 a.out 10000000

Question 4:  Comment on the times taken by the various processes in primes3. Explain why the times show that this program does not do a very good job of load balancing. (Can you suggest a mathematical explanation for the pattern of times?)


David Eck