Biodiversity and Information Theory


The term diversity is usually synonymous with "variety" and is simply an indication of the number of different things present. For example, we often speak of a "diversity of opinions." Counting the number of different opinions on a subject gives a rough idea of the "diversity of opinions," but the number of people holding each opinion must also be taken into account to get a true sense of the diversity. A situation in which 99 people hold one opinion and 1 person holds a different opinion is quite different from a situation in which 50 people hold one opinion and the other 50 hold another, even though the number of opinions is 2 in both cases. Clearly, the proportions of the different categories of objects (e.g. opinions or species) play an important role in our intuitive notion of diversity.

To see how the diversity measure changes as the proportions of each category in the sample change, try entering the following data into the diversity applet below. Enter the data for pond 1 first and then clear the table and enter the data for pond 2.

            Pond 1    Pond 2
species A
species B
species C
species D

Diversity Applet

To use this applet, click in a white box in the table at the left. For each species in the sample, enter the number of individuals observed in the corresponding cell of the table. This number should be an integer larger than 0, because you cannot observe -2 or 0.5 individuals. Press the Return or Enter key on your keyboard to go to the next cell in the table. When all of the data has been added to the table, click the "Compute" button and the Shannon Index values will appear on the right. Click the "Clear Data" button to clear all the entries from the table.

Notice what happened to the values of H1 and H1max. When all the species are present in the same proportion in the sample, H1 = H1max. In the second pond sample, where the proportion of species A is much higher than that of any other species, the H1 value is much less than the H1max for that sample. In short, this indicates that the diversity of pond 1 is much higher than that of pond 2, even though both have the same number of species and individuals. To see why this is so, read the sections below on how these diversity measures are calculated.
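The same comparison can be reproduced without the applet. The Python sketch below is only an illustration: the function name shannon_bits and the counts for the two ponds are hypothetical, chosen so that pond 1 is perfectly even and pond 2 is dominated by species A. They are not the values from the original table.

```python
import math

def shannon_bits(counts):
    """Shannon index in bits, computed from counts of individuals per species."""
    total = sum(counts)
    h = 0.0
    for c in counts:
        if c > 0:  # a species with zero individuals contributes nothing
            p = c / total
            h -= p * math.log2(p)
    return h

# Hypothetical samples: 4 species and 100 individuals in each pond.
pond1 = [25, 25, 25, 25]   # all species in equal proportion
pond2 = [97, 1, 1, 1]      # species A dominates

for name, counts in (("pond 1", pond1), ("pond 2", pond2)):
    h1 = shannon_bits(counts)
    h1_max = math.log2(len(counts))  # maximum for this number of species
    print(f"{name}: H1 = {h1:.3f} bits, H1max = {h1_max:.3f} bits")
```

With these assumed counts, pond 1 reaches H1max exactly, while pond 2 falls far below it.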

Diversity Measures

Biologists use the mathematics of information theory to make precise calculations of two quantities that we will call first-order diversity, H1, and divergence from equiprobability, D1.

Definition 1 Assume that there are n possible categories in a data set and that their proportions are p1, ..., pn. Then the measure of diversity, H1, for this system is defined to be

$$H_1 = -\sum_{i=1}^{n} p_i \log_2 p_i$$

This is the same as saying:

$$H_1 = -\left(p_1 \log_2 p_1 + p_2 \log_2 p_2 + \cdots + p_n \log_2 p_n\right)$$

The units of measurement are called bits. (Since log2 0 is not defined, if pi = 0 we adopt the convention that the expression pi log2 pi = 0.)
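As a rough illustration of Definition 1, the following Python sketch computes H1 directly from a list of proportions and applies it to the "diversity of opinions" example from the introduction. The function name first_order_diversity is hypothetical, and the check that the proportions sum to 1 is an added safeguard rather than part of the definition.

```python
import math

def first_order_diversity(proportions):
    """H1 in bits for proportions p1, ..., pn that sum to 1."""
    if not math.isclose(sum(proportions), 1.0, rel_tol=1e-9):
        raise ValueError("proportions must sum to 1")
    h1 = 0.0
    for p in proportions:
        if p > 0:  # convention: a term with p = 0 counts as 0
            h1 -= p * math.log2(p)
    return h1

# The opinion example: two opinions in both cases, very different diversity.
even_split = first_order_diversity([0.50, 0.50])
lopsided = first_order_diversity([0.99, 0.01])
print(f"50/50 split: {even_split:.3f} bits")
print(f"99/1 split:  {lopsided:.3f} bits")
```

The 50/50 split gives exactly 1 bit, while the 99/1 split gives less than a tenth of a bit, which matches the intuition that the second situation is much less diverse.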

Definition 2 In a data set with n categories, H1max(n) is the maximum possible value of H1.
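It may help to see what this maximum is. It is a standard result of information theory, stated here without proof, that H1 is largest when all n categories are equally likely. Substituting p_i = 1/n into the definition of H1 gives:

$$H_{1\max}(n) = -\sum_{i=1}^{n} \frac{1}{n} \log_2 \frac{1}{n} = -\log_2 \frac{1}{n} = \log_2 n$$

For example, a data set with n = 4 categories has H1max(4) = log2 4 = 2 bits.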

Definition 3 The divergence from equiprobability is defined to be:

$$D_1 = H_{1\max}(n) - H_1$$

A low D1 value means H1 is close to H1max, that is, the system is nearly in a state of equiprobability; there is a high degree of diversity present. Conversely, a high D1 value means that H1 is small relative to H1max, that is, the system has diverged substantially from equiprobability and is not very diverse. To take an example, if you had an H1 of 1.5 and an H1max of 2.0, the D1 value would be 0.5. In this case 0.5 is a substantial divergence, since it represents 25% of H1max.
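The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch: the function name divergence and the variable names are hypothetical, and it takes D1 to be the difference H1max - H1, as the numbers in the example imply.

```python
def divergence(h1, h1_max):
    """D1: how far H1 falls below its maximum possible value, in bits."""
    return h1_max - h1

h1, h1_max = 1.5, 2.0          # values from the example in the text
d1 = divergence(h1, h1_max)
share_of_max = d1 / h1_max     # D1 expressed as a fraction of H1max

print(f"D1 = {d1} bits, which is {share_of_max:.0%} of H1max")
# prints: D1 = 0.5 bits, which is 25% of H1max
```

Expressing D1 as a fraction of H1max makes it easier to compare samples that have different numbers of categories.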

This page was created as part of the Mathbeans Project. The Java applets were created by David Eck and modified by Jim Ryan. The Mathbeans Project is funded by National Science Foundation grant DUE-9950473.