CPSC 441, Fall 2002
Lab 9: Introduction to PVM


FOR THE LAST FEW LABS OF THE TERM, we will be working with PVM (Parallel Virtual Machine). PVM is a software system for distributed computing. That is, it allows a number of hosts on a network to communicate and to work together on a project. In this lab, you will set up your Linux account so that you will be able to use PVM on the cslab machines, and you will get some initial experience with using PVM. The complete user guide for PVM is available on-line at http://www.netlib.org/pvm3/book/pvm-book.html. I will hand out a photocopy of part of this guide.

Exercise to be turned in at the end of lab. Before you leave lab, show me that you have primes_master running under xpvm. Write up and turn in your reactions to xpvm and the parallel algorithm used in the primes program. What happens in xpvm when you run it? Does it work well? Could it be improved? Turn in your write-up at the end of lab. This will count for 6 homework points, graded mainly on the basis of your being in lab.


Setup

Using PVM requires some configuration of your Linux account. The directory /home/cs441/pvm3 contains some files that you will need.

1. PVM uses the command rsh to start processes on other computers. To make this possible, your account must contain an .rhosts file that lists the computers that are allowed to do remote access. This is considered to be a slight security risk, so you should probably delete this file at the end of the term. To create your .rhosts file, you can just give this command in your home directory:

               cp  /home/cs441/pvm3/for-rhosts  .rhosts

2. You will be using a program called xpvm to control and monitor PVM. This program needs a configuration file named .xpvm_hosts. This file lists the computers that xpvm can use, and it specifies certain options (in particular, the directories in which PVM will look for programs). To get an appropriate file:

               cp  /home/cs441/pvm3/for-xpvm_hosts  .xpvm_hosts

3. Certain other definitions should go into your .bashrc file, so that they will be in effect every time you log in. Add the contents of the file /home/cs441/pvm3/for-bashrc to the end of the .bashrc file in your home directory. After that, you can log out and log back in to make the change take effect. You can modify .bashrc either by editing the .bashrc file or with the command:

               cat  /home/cs441/pvm3/for-bashrc  >>  .bashrc

4. One of the directories that PVM will search when it needs to find a program is $HOME/pvm3, that is, a directory named pvm3 in your home directory. Make this directory by giving the command mkdir pvm3 in your home directory, and then change into it. You should do all your PVM work in this directory.

5. Finally, copy the following two sample files, for use in this lab, into your pvm3 directory: /home/cs441/pvm3/primes_master.cc and /home/cs441/pvm3/primes_slave.cc


Running PVM from the Command Line

A PVM virtual machine can be started from the command line with the command pvmh. (Actually, pvm is the usual command. One of the things that you added to your .bashrc file defines pvmh as an alias for "pvm /home/cs441/pvm3/hostfile".) The "hostfile" specified in this command contains information about the cslab computers that will be part of the virtual machine. Specifically, it tells them where to look for PVM programs. If you just use pvm, PVM will run, but it won't be able to find the programs that you write.

So, enter the command pvmh on the command line. This will start the PVM console, where you can give commands to PVM. A virtual machine is made of several host computers. When you first start the VM, it contains only the computer on which you are working. Use the add command in the PVM console to add other cslab machines to the VM. For example:

               add cslab2

Add a few hosts to the VM. Choose them at random, so we don't have everyone in lab using the same machines. The conf command will list the hosts in your virtual machine. To test the virtual machine, try this command in the PVM console:

               spawn  -10  ->  first

"first" is the name of a program that can be found in /home/cs441/pvm3. This is one of the places where PVM is configured to look for programs. The "spawn -10" command creates 10 processes running on the various hosts in the virtual machine. Each of these processes will run the program named first. This program just outputs a message such as "boo from tid 262148", which you will see on the console along with a lot of cruft. A "tid" is the task identifier, which identifies one process, or task, running on the virtual machine. If you look at the source code, /home/cs441/pvm3/first.cc, you will see that the program simply writes the message to cout. When a task is started with spawn, any output that it sends to cout will appear on the console, even if the task is running on another computer.

The "spawn" command can be used to run programs from any directory known to PVM. But you can also run programs in your own account without using the spawn command. In fact, you can run a PVM program in the usual way, as long as the virtual machine has been started. Leave the PVM console with the command

               quit

This quits the PVM console, but it leaves the virtual machine running. Make sure you are in your pvm3 directory. You should have programs named primes_master.cc and primes_slave.cc in that directory. These are PVM programs, and you have to add some options to g++ to compile them. Your .bashrc file now defines the command pvmcomp to make it easier to compile PVM programs. To compile primes_master and primes_slave, just say:

               pvmcomp  primes_master.cc
               pvmcomp  primes_slave.cc

These programs are really two parts of a single program. When you run primes_master, it will spawn several copies of primes_slave. These tasks will work together to count all the primes between 1 and 1000000. You can specify a different upper limit on the command line. Try running primes_master with command lines such as

               primes_master
               primes_master 10000000
               primes_master 1000

Note that this will only work if: PVM is running, you used pvmh to start PVM, and primes_slave is in the pvm3 directory inside your home directory.

You should not leave the PVM virtual machine running. To shut it down, go back into the PVM console with the pvmh command. Use the command halt in the PVM console to shut down the virtual machine. Do this before you go on to the next part of the lab.


Running PVM with xpvm

The xpvm program provides a graphical user interface to PVM. Since it carries a lot of overhead, it's not something that you use when you want maximal performance, but it can help in debugging and in understanding how PVM works.

Run xpvm with the command xpvm. If PVM was already running, it will just join the existing PVM. Otherwise, the virtual machine will be started, using the hostfile .xpvm_hosts in your home directory. The file that you copied from /home/cs441/pvm3 defines the correct search directories for programs, and it specifies that the cslab machines should be added to xpvm's "Hosts" menu. They are not added immediately to the virtual machine, but you can add them easily using the "Hosts" menu. Add a few computers to your VM now. You will see icons for the hosts in the top half of the xpvm window.

After adding some hosts to the virtual machine, you should use xpvm to run the primes_master program. Do this with the "Spawn" command in the "Tasks" menu. To provide the VM with a reasonable amount of work to do, it's best to run the program for ten million values: Just select "Spawn" from the "Tasks" menu and enter primes_master 10000000 in the command box that appears.

As the command runs, bar graphs in the lower part of the window show what each task is doing. Green indicates computation. White indicates an idle task (possibly not a good thing because you want to keep all the hosts occupied -- a lot of white might indicate that the workload could be distributed better). A red line between bars indicates a message sent from one task to another. Click-and-hold on a red line for more information.

In the default setting, xpvm does not display output from the program. To see the output in a separate window, choose the "Task Output" option from the "Views" menu.

Try running primes_master a few times, on different sets of hosts. Take a look at the source code, /home/cs441/pvm3/primes_master.cc and /home/cs441/pvm3/primes_slave.cc, and try to understand what these programs are doing. Can you see how the programs correspond to what you see when you run primes_master?

The "File" menu in xpvm contains two commands: "Quit PVM" and "Halt PVM." If you use "Quit PVM", you will exit from xpvm, but the virtual machine will still be running. (You can re-attach to it by running xpvm or pvmh again.) If you use "Halt PVM," you will exit from xpvm and PVM will also be halted. Do not leave PVM running after the lab.


Problems Adding Hosts?

If PVM gets an error when it tries to add some host, the most probable cause is that a previous PVM did not shut down properly. When this happens, it leaves a file named /tmp/pvmd.uid in the /tmp directory on the machine where the problem occurred. Here, uid will be the user ID number that identifies you to the system. The actual file name will be something like /tmp/pvmd.501. (You can use the id command to see your uid.) If a host can't be added to PVM, deleting the /tmp/pvmd.uid file on that host might help.


David Eck, November 2002