CPSC 441, Fall 2018
Lab 7: ARP / Mandelbrot

This lab has two unrelated parts. In the first part you will (mostly) investigate the ARP protocol. The second part introduces the idea of distributed processing, using computation of the famous Mandelbrot Set as an example. You should spend some class time on each part.

You can work with a partner on this lab. There is no extra work if you do.

The lab is due next Monday.

Part 1: ARP

Since you will be using Wireshark for today's lab, you should log into your "netXX" account.

Your computer has an "ARP cache" in which it stores information about Ethernet addresses associated with different IP addresses. To view the contents of the ARP cache, use the command

arp -n

The "-n" stops the command from trying to do DNS lookups to find host names for IP addresses.
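If you want to work with the cache contents in a program rather than by eye, the output of arp -n can be split into columns. The following is a minimal sketch, assuming the column layout printed by the Linux net-tools version of arp; the sample text and its addresses are made up for illustration, and the function name parse_arp_table is hypothetical.

```python
import re

# Made-up sample in the column layout printed by the Linux net-tools "arp -n".
# The addresses are placeholders, not real lab machines.
SAMPLE = """\
Address                  HWtype  HWaddress           Flags Mask            Iface
192.0.2.1                ether   00:00:5e:00:53:01   C                     eno1
192.0.2.7                ether   00:00:5e:00:53:07   C                     eno1
"""

MAC_RE = re.compile(r"^(?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}$")


def parse_arp_table(text):
    """Return a dict mapping IP address -> MAC address."""
    table = {}
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Keep only rows whose third column looks like a MAC address.
        if len(fields) >= 3 and MAC_RE.match(fields[2]):
            table[fields[0]] = fields[2].lower()
    return table


if __name__ == "__main__":
    for ip, mac in parse_arp_table(SAMPLE).items():
        print(ip, mac)
```

To try it on the real cache, replace SAMPLE with the text returned by subprocess.run(["arp", "-n"], capture_output=True, text=True).stdout.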

Exercise 1: Find the MAC address of the Ethernet card in the computer csfac1 (IP address 172.21.7.91). This machine will not be in your ARP cache unless you have tried to access it (for example by pinging it). Explain how you found the MAC address.

Exercise 2: What happens to the ARP cache when you try to access a nonexistent IP address on the local network, such as 172.21.7.112? How about if you try to access an address outside the local network, such as 123.45.67.89? Why the difference?
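One ingredient in Exercise 2 is whether an address falls inside the local network at all. The sketch below tests that with Python's ipaddress module. It assumes that the lab network is 172.21.7.0/24; the handout does not state the netmask, so check the real value (for example with ip addr) before trusting the result. The function name is_on_local_network is hypothetical.

```python
import ipaddress

# ASSUMPTION: the lab network is 172.21.7.0/24. The handout does not give
# the netmask, so this value is a guess made for illustration only.
LOCAL_NET = ipaddress.ip_network("172.21.7.0/24")


def is_on_local_network(address, network=LOCAL_NET):
    """Return True if the address belongs to the given network."""
    return ipaddress.ip_address(address) in network


if __name__ == "__main__":
    for addr in ("172.21.7.112", "123.45.67.89"):
        print(addr, is_on_local_network(addr))
```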

Exercise 3: The arp command can be used to make permanent entries into the ARP cache (although only if you are running as root). Read the man page for the arp command and look up the meaning of the "-s" option. Why might you want to use this option? What would happen if you used this option to add an incorrect Ethernet address to the ARP cache?


For the next two exercises, you need Wireshark. Start up Wireshark and start a packet capture on eno1. Set the filter to arp, and click "Apply".

Exercise 4: While the packet capture is running, ping csfac2 (IP 172.21.7.92). This will generate an ARP request and reply, as long as csfac2 was not already in the ARP cache. (If it was, you can try one of the computers csfac3 through csfac7 that is not in the ARP cache.) To see all ARP packets, you can set the display filter to arp. To see only ARP replies in Wireshark, you can set the display filter to arp.opcode == 2. Find the reply that was generated by the ping to csfac2. How did you recognize it? How can you find the Ethernet address for csfac2 just from the information in the Wireshark window? If you set the display filter to arp, you should see a large number of packets. Discuss the number of ARP packets that you see. Why are there so many? Where do they all come from? Why are there so many more ARP packets than there are ARP replies?
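It can help to know what Wireshark is decoding. For Ethernet and IPv4, an ARP packet is a fixed 28-byte structure, and the opcode field is what the filter arp.opcode == 2 tests (1 is a request, 2 is a reply). The sketch below packs and unpacks that structure with Python's struct module. The MAC address and the asking host's IP address (172.21.7.50) are made up for illustration, and the function names are hypothetical.

```python
import socket
import struct

# ARP layout for Ethernet/IPv4, in network byte order (28 bytes in total):
# hardware type, protocol type, hardware length, protocol length, opcode,
# sender MAC, sender IP, target MAC, target IP.
ARP_FORMAT = "!HHBBH6s4s6s4s"
OPCODES = {1: "request", 2: "reply"}


def mac_to_bytes(mac):
    return bytes.fromhex(mac.replace(":", ""))


def bytes_to_mac(raw):
    return ":".join(f"{b:02x}" for b in raw)


def build_arp(opcode, sender_mac, sender_ip, target_mac, target_ip):
    """Pack the fields into the 28-byte ARP wire format."""
    return struct.pack(
        ARP_FORMAT,
        1,       # hardware type 1 = Ethernet
        0x0800,  # protocol type 0x0800 = IPv4
        6, 4,    # MAC addresses are 6 bytes, IPv4 addresses are 4 bytes
        opcode,
        mac_to_bytes(sender_mac), socket.inet_aton(sender_ip),
        mac_to_bytes(target_mac), socket.inet_aton(target_ip),
    )


def parse_arp(packet):
    """Unpack a 28-byte ARP packet into a dict of readable fields."""
    _ht, _pt, _hl, _pl, opcode, sha, spa, tha, tpa = struct.unpack(ARP_FORMAT, packet)
    return {
        "opcode": OPCODES.get(opcode, str(opcode)),
        "sender_mac": bytes_to_mac(sha),
        "sender_ip": socket.inet_ntoa(spa),
        "target_mac": bytes_to_mac(tha),
        "target_ip": socket.inet_ntoa(tpa),
    }


if __name__ == "__main__":
    # Made-up reply: the MAC addresses and the asker's IP are placeholders.
    reply = build_arp(2, "00:00:5e:00:53:02", "172.21.7.92",
                      "00:00:5e:00:53:50", "172.21.7.50")
    print(len(reply), parse_arp(reply))
```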

Exercise 5: There is such a thing as a "gratuitous" ARP request or ARP reply. Set the display filter to arp.isgratuitous. Do you see any gratuitous ARP packets in the packet capture? Do a web search for "gratuitous ARP", and do some reading. You will probably find the Wireshark Wiki page. What are gratuitous ARP requests? Why are they used? Do you have any interesting observations about the gratuitous ARP requests in your packet capture?

Exercise 6: This exercise is not about ARP. Remove the Wireshark display filter so that you see every packet. Look in the Protocol column for protocols that we have not studied. Do a web search for one or more of them, until you find a protocol that you find interesting. Write a paragraph about that protocol and what purpose it serves. If you can figure out what it's doing on our network, you can say something about that as well.

Part 2: A Distributed Processing Example

In distributed processing, a computation is broken up among a number of computers that communicate over a network. It is a type of parallel processing, but the relatively slow speed of network communication makes it different from parallel processing using threads inside a single computer. In this part of the lab, you will work with a fairly simple example of distributed processing. The computation in this case produces an image of the Mandelbrot set.

This example uses a supervisor/worker model of distributed computation. The supervisor program breaks up the computation of a complete image into a number of smaller jobs and distributes those jobs to worker processes running on other computers. The supervisor receives the results of the jobs and combines them to create the complete image. (In fact, the program also uses some worker processes running as threads on the same computer as the supervisor, so it can work even when there are no networked workers.)
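The supervisor/worker structure can be sketched in a few lines. This is not how xMandelbrot is implemented; it is a minimal illustration, assuming that each job is one row of the image and that the workers are local threads rather than networked computers. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def escape_count(c, max_iterations):
    """Count iterations of z = z*z + c before |z| exceeds 2."""
    z = 0j
    for n in range(max_iterations):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iterations


def compute_row(job):
    """Worker: compute the iteration counts for one row of the image."""
    row, width, height, max_iterations = job
    y = 1.25 - 2.5 * row / (height - 1)  # imaginary part runs from 1.25 down to -1.25
    counts = [
        escape_count(complex(-2.0 + 3.0 * col / (width - 1), y), max_iterations)
        for col in range(width)          # real part runs from -2 to 1
    ]
    return row, counts


def compute_image(width, height, max_iterations, workers=4):
    """Supervisor: split the image into row jobs, hand them out, reassemble."""
    jobs = [(row, width, height, max_iterations) for row in range(height)]
    image = [None] * height
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for row, counts in pool.map(compute_row, jobs):
            image[row] = counts          # put each result back in its place
    return image


if __name__ == "__main__":
    for counts in compute_image(61, 21, 50):
        print("".join("#" if n == 50 else "." for n in counts))
```

The sketch shows only the division of labor. Threads in standard Python do not run this kind of pure computation in parallel, so do not expect it to show a speedup by itself.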

The supervisor program is /classes/cs441/xMandelbrot.jar. It is written in Java. You should start up the program and try it out. You can zoom in by dragging a box around part of the image. The interesting parts are along the border of the black area. As you zoom in, you might need to increase the value in the "MaxIterations" menu; increasing the value might fill in some black areas with color.

If you zoom in too far, you will exceed the precision of the Java double type. Try it with the last example in the "Examples" menu, which is just on the verge of doing so; zoom in to see what happens. To get around this limit, the program can use "arbitrary precision arithmetic" to do computations to any number of decimal places. Just check the option "Enable High Precision" in the "Control" menu. Try it on the last example, and note that the high precision computation takes much longer than computing with the built-in double type. By distributing this time-consuming computation, you can get the image a lot faster.
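To see why precision runs out, here is a minimal sketch of the usual escape-time iteration in two versions: one using ordinary floating point (Python's float is a 64-bit double, like Java's double) and one using Python's decimal module as a stand-in for arbitrary precision arithmetic. This is an illustration only; how xMandelbrot does its high precision arithmetic is not stated in this handout, and the function names and the choice of 50 digits are assumptions.

```python
from decimal import Decimal, localcontext


def escape_count_double(cx, cy, max_iterations):
    """Escape-time iteration using built-in 64-bit floating point."""
    x = y = 0.0
    for n in range(max_iterations):
        if x * x + y * y > 4.0:
            return n
        x, y = x * x - y * y + cx, 2.0 * x * y + cy
    return max_iterations


def escape_count_decimal(cx, cy, max_iterations, digits=50):
    """Same iteration, carried out to the given number of significant digits.

    cx and cy are passed as strings so that no precision is lost on input.
    """
    with localcontext() as ctx:
        ctx.prec = digits
        cx, cy = Decimal(cx), Decimal(cy)
        x = y = Decimal(0)
        for n in range(max_iterations):
            if x * x + y * y > 4:
                return n
            x, y = x * x - y * y + cx, 2 * x * y + cy
        return max_iterations


if __name__ == "__main__":
    # Two neighboring "pixels" that are 1e-20 apart collapse into the same
    # number as doubles, but remain distinct with 50 digits.
    print(-0.75 + 1e-20 == -0.75)
    with localcontext() as ctx:
        ctx.prec = 50
        print(Decimal("-0.75") + Decimal("1e-20") == Decimal("-0.75"))
    print(escape_count_double(0.3, 0.5, 200),
          escape_count_decimal("0.3", "0.5", 200))
```

Once neighboring pixels map to the same double, zooming further cannot show any new detail, which is the effect you see in the last example.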

Load the last example from the "Examples" menu, then check "Enable High Precision" in the "Control" menu. You should work with the same example for the rest of the lab.

To use a distributed computation, you need to start worker programs on several computers. The worker program is /classes/cs441/MBNetServe.jar. By default, the program communicates on port 17071, but we might have several people trying to run the program on the same computers. So, you should select a different port number. I will assume that you are using port 12321 in the following discussion, but you should pick some random port number. Start the worker on several computers using commands of the form

ssh -f hostname java -jar /classes/cs441/MBNetServe.jar -quiet -port 12321

For the hostname, use one of cslab0, cslab1, ..., cslab11, or csfac0, csfac1, ..., csfac7. Select at random! Substitute your selected port number for 12321, or leave out the port number if you are using the default port.
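If you would rather not type the command for every host, the following sketch picks hosts at random from the list above and prints the matching ssh commands for you to paste into a Terminal. It only prints the commands; it does not run them. The function name worker_commands and the choice of five workers are assumptions made for illustration.

```python
import random
import shlex

# Host names listed in this handout: cslab0..cslab11 and csfac0..csfac7.
HOSTS = [f"cslab{i}" for i in range(12)] + [f"csfac{i}" for i in range(8)]
JAR = "/classes/cs441/MBNetServe.jar"


def worker_commands(count, port, rng=random):
    """Return one ssh command (as an argument list) per randomly chosen host."""
    hosts = rng.sample(HOSTS, count)  # distinct hosts, chosen at random
    return [
        ["ssh", "-f", host, "java", "-jar", JAR, "-quiet", "-port", str(port)]
        for host in hosts
    ]


if __name__ == "__main__":
    # Replace 12321 with the port number that you selected.
    for command in worker_commands(5, 12321):
        print(shlex.join(command))
```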

Next, you have to tell the supervisor about the workers. In the xMandelbrot program, select the command "Configure Multiprocessing" from the "Control" menu. This will show the Multiprocessing dialog box.

[Figure: the Multiprocessing dialog box, configured with 5 worker computers.]

In the dialog box, click "Enable Networking". Then click "Add Network Host", enter the name and port for one of your worker computers, and click OK. The worker will be added to the list of computers in the Multiprocessing Dialog. Repeat for each worker. Finally, click "Apply Config Now". At that point, the supervisor will try to establish connections with all the workers. You should see their status change from INACTIVE to CONNECTED. (Remember that you always have to click "Apply Config Now" to make a change in the configuration effective.) You should leave the dialog box open while working with the program in the rest of the lab.


Finally, you are ready to try some high precision computations with distributed processing.

Exercise 7: You should be using the last example from the "Examples" menu, with high-precision processing enabled. Run a computation using distributed processing, and time how long it takes. (Don't have a clock that shows seconds? Use the date command in a Terminal to show the current time.) Then uncheck "Enable Networking" in the Multiprocessing dialog, and click "Apply Config Now", so that the worker status changes back to INACTIVE. (Note that you must click "Apply Config Now" to actually turn distributed processing off.) Run the same computation, without the workers, and time it again. How much speedup did you see? Was it what you expected? Ideally, you should try this with different numbers of worker computers to see how speedup depends on the number of workers. (Note: The program doesn't really make it possible to run exactly the same computation. To run one that is almost the same, draw a zoom box around most of the image or select the "Custom" command in the "MaxIterations" menu and change the number of iterations by one.)
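Speedup is usually reported as the time without the workers divided by the time with them, and efficiency as the speedup divided by the number of workers. A minimal sketch of that arithmetic follows. The timings in it are made up, not measurements, and the function names are hypothetical. Keep in mind that the supervisor also runs some worker threads of its own, so the networked workers are not the only ones contributing.

```python
def speedup(serial_seconds, parallel_seconds):
    """How many times faster the distributed run was."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds, parallel_seconds, workers):
    """Speedup per networked worker; 1.0 would be ideal scaling."""
    return speedup(serial_seconds, parallel_seconds) / workers


if __name__ == "__main__":
    # Made-up timings for illustration: 120 s alone, 30 s with 5 workers.
    print(speedup(120.0, 30.0))
    print(efficiency(120.0, 30.0, 5))
```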

Exercise 8: Start up Wireshark again. Shut down the networking entirely in xMandelbrot, then start a packet capture in Wireshark. With the packet capture running, re-enable xMandelbrot networking and do several distributed computations with the program. The goal of this exercise is to figure out as much as you can, just from looking at the packet trace, about the protocol that is used by the program. Set the Wireshark filter to tcp.port == 12321, replacing 12321 with the port that you selected for the Mandelbrot program. The Wireshark window doesn't show the packet data in a convenient text form, but you can see the full content by selecting one of the packets and using the "Follow TCP Stream" command from Wireshark's "Analyze" menu. Outgoing packets are shown in red and incoming packets in blue. (It might also be interesting to look at the "Time Sequence Graph (Stevens)" command from the "TCP Stream Graph" sub-menu of the "Statistics" menu.) In addition, watching how the picture is composed while the computation is in progress might give you some hints about how the computation is broken up into subproblems. Write a few paragraphs about your observations, conclusions, and speculations.