The Species-Area Relation

Introduction

One of the most fundamental ecological relationships is that as the area of a region increases, so does the number of different species encountered. While this makes sense and may even seem obvious, the observation seems to have been made first late in the eighteenth century and to have taken hold only slowly in the nineteenth century. During this period, naturalists such as Alfred Russel Wallace and Charles Darwin accompanied sailing expeditions to islands around the globe. In the process of recording and collecting what were new and exotic species to Europeans, certain ecological patterns and trends slowly became apparent. Johann Reinhold Forster, the naturalist on Captain Cook's second voyage to the South Pacific (1772), seems to have been the first to notice this particular point [Quammen 1996]:

Islands only produce a greater or less number of species as their circumference is more or less extensive.

Simply put, the number of species increases with area. A less obvious insight would occur later to others making careful collections of data: the increase in species occurs at a decreasing rate. This species-area relation may be the oldest ecological pattern to be recognized; H. C. Watson described a species-area curve for plant species in Britain in 1859 and de Candolle produced a similar study in 1855 [Williams 1964; Rosenzweig 1995]. Beginning with a small plot in the county of Surrey, Watson identified the plant species present in ever-increasing areas of Great Britain. The general pattern is one of increasing species diversity with increasing area sampled. The first mathematical description of the species-area relationship was proposed by Arrhenius in 1920 and modified by Gleason in 1922.

The association of increased area with an increasing number of species at a declining rate has been tested numerous times. It persists over areas both small and large and with animals as well as plants.

Example 1 Load the data file Darlington.dat into the ScatterPlot applet to the right and click the "Plot the Data" button. Here and in subsequent examples, A is the area of the region and S is the number of species present in the corresponding region.

Darlington [1957] found this pattern with amphibian and reptile species in the Greater and Lesser Antilles. The areas, A, of these islands range from about 1 mi² to nearly 40000 mi². The number of species, S, present on the islands ranges from 3 on the smallest island, Redonda, to 84 on Hispaniola. Note that Hispaniola is only the second largest island in this group; Cuba is the largest. Consequently, the number of species does not always increase with island size, though that is the general trend.

Example 2 The same pattern persists when small areas are surveyed. Arrhenius [1921] observed this effect when counting plant species in a variety of ecological communities. His plots ranged from 1 to 100 square decimeters. You can load Arrhenius.dat into the scatterplot at the right and see the same pattern.

The simple species-area pattern elucidated by Watson and de Candolle has, of course, been shown to depend on a number of other variables besides area. For example, elevation and latitude may change the shape of the species-area curve. So does isolation (mainland versus island). This may be one reason for the large number of amphibian and reptile species that were found on Hispaniola. Likewise, habitat heterogeneity contributes to the rate at which new species are added as area increases. For example, whether you sampled a square meter or a square kilometer at the North Pole, you'd probably find few if any species. However, a similar study in an area with several different types of habitats would yield many new species with each increase in plot area. In fact, some studies suggest that the best explanation for the species-area relationship is increasing habitat diversity [Johnson and Simberloff 1974].

Ecologists have produced hundreds of examples of increasing species diversity with increasing area (see [Rosenzweig 1995]). Yet despite numerous empirical examples, debate continues over the cause(s) of the species-area relation [McGuinness 1984]. The debate has become even more contentious in recent years as species-area curves have been used to address important questions such as:

- What is the minimum protected area that will sustain a particular endangered species?
- Should we protect species or ecosystems?
- What management practices will result in species extinction and at what rate?

Darlington's Observation

Reconsider the data on amphibians and reptiles in the Antilles. Darlington [1957] noticed that there was a pattern in the data when he excluded a few of the islands from consideration.

Table 1. Island size versus number of amphibian and reptile species in the Antilles. Based on [Darlington 1957, Table 17; also see Quammen 1996, 388].

Approx. Area (mi²)	Species (Approximate)	Species (Actual)	Index No. k
4	5	5	0
40	10	9	1
(400)	(20)	-	(2)
4000	40	39-40	3
40000	80	76-84	4

These data may be loaded into the ScatterPlot by using the file Pattern.dat.

Though there was no 400 mi² island in the Antilles, Darlington still included that size in his table and filled in an expected number of 20 amphibian and reptile species for such an island. He did so because he noticed that for each tenfold increase in area, the number of species doubled. Mathematically this means that both the areas and the numbers of species form geometric sequences. If A denotes the area and S the number of species, then the area of the islands can be written as A = 4*10^k, where k is the index number of the island in Table 1, starting with k = 0 for the smallest island. Similarly, the corresponding number of species is S = 5*2^k.

The goal is to "predict" the number of species based on the area as Darlington did. In other words, S should be expressed as a function of A, not as a function of some arbitrary index k. To do this, solve for k in terms of A by using logarithms and then substitute this into the expression for S.

A = 4*10^k if and only if A/4 = 10^k if and only if log(A/4) = k.
If we rewrite S using this expression for k and the identity 2 = 10^{log 2}, we obtain
S = 5*2^k = 5(10^{log 2})^k
= 5(10^{log 2})^{log (A/4)}
= 5(10^{log (A/4)})^{log 2}.
Again, using basic logarithm properties, this simplifies to
S = 5(A/4)^{log 2} = 5*4^{-log 2}*A^{log 2}
Since 5*4^{-log 2} is approximately 3.29, then
S = 3.29A^{log 2}.

Notice how the exponent log 2 of A ensures that a tenfold increase in area produces a doubling of the number of species. If we start with an area of size A and then evaluate S for an area of size 10A we see that the function predicts there to be twice as many species as in an area of size A:

S(10A) = 3.29(10A)^{log 2} = 10^{log 2}*3.29A^{log 2} = 2*S(A).

Of course, we can evaluate the log of 2:

log 2 ≈ 0.301.
So, in this particular case, we are able to express S as a power function of A using
S = 3.29A^{log 2} ≈ 3.29A^0.301
because we spotted a pattern in (an approximation to) the data. Such patterns are seldom so obvious. Nonetheless, the expectation is that species and area are related in the general way that we have found in this example: S is a power function of A of the form
S = cA^z,
where c and z are constants that depend on the particular location and taxonomic group. In particular, whenever a tenfold increase in area produces a doubling of the number of species, then z=log 2. For the general situation, we outline below a simple method of finding z and c.
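
The algebra above can be checked numerically. The following Python sketch is only an illustration (the function and constant names are our own, not part of the original module): it evaluates Darlington's pattern both through the index k and through the power function of A, and then checks that a tenfold increase in area doubles the predicted number of species.

```python
import math

LOG2 = math.log10(2)          # the exponent z = log 2, about 0.301
C = 5 * 4 ** (-LOG2)          # the coefficient 5*4^{-log 2}, about 3.29


def species_from_index(k):
    """Darlington's pattern written with the index k: S = 5*2^k."""
    return 5 * 2 ** k


def species_from_area(area):
    """The same pattern written as a power function of area: S = c*A^z."""
    return C * area ** LOG2


for k in range(5):
    area = 4 * 10 ** k        # areas 4, 40, 400, 4000, 40000 mi^2
    print(k, area, species_from_index(k), round(species_from_area(area), 3))

# Ratio S(10A)/S(A) for an arbitrary area A = 250; the pattern says it is 2.
print(species_from_area(10 * 250) / species_from_area(250))
```

The two columns of species counts agree for every row of Table 1, including the interpolated 400 mi² row.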

Exercises

(a) Suppose Darlington had found that a tenfold increase in area produced a tripling of species. What would the exponent z be in the equation S=cA^z?
(b) Suppose that a ninefold increase in area produced a doubling of species. What would the exponent z be?
(c) For those familiar with natural logarithms, recall that log 2 = ln 2/ln 10. Re-express your answers to parts (a) and (b) using natural logs.
(d) More generally, if an n-fold increase in area produces an m-fold increase in the number of species, express the corresponding exponent z using natural logs.

The Power Function Model

The power curve description of the species-area relation, S = cA^z, was proposed by Arrhenius in 1920 and modified by Gleason [1922]. The exponent z is generally small, in the range of 0.2 or 0.3 as we found above. The constants c and z are determined from the survey data itself. One of the difficulties with this model is giving an appropriate biological interpretation to these constants. We will consider this question later. Note, however, that both c and z may vary from region to region and from one type of organism to another and both constants can dramatically change the shape of the curve.

Because c and z are fitted to the data, some have criticized power curve models because they don't explain anything about the system; they merely describe it. Biologically, why should species-area curves be described by power curves at all? Taking up this criticism, many others (see [McGuinness 1984] for an account) have modified, adapted or even scrapped this basic model in an attempt to build a model with explanatory power. Nonetheless, McGuinness [1984] notes that the basic relation is often viewed as "one of community ecology's few genuine laws." For beginning students, the model provides an introduction to the process of transforming data in order to determine the relationship that seems to exist between two variables that can be measured relatively easily in the field.

Fitting a Power Curve to the Data

Reconsider Darlington's original data set for Antillean amphibians and reptiles. How does one find the power curve S = cA^z that captures the overall trend of the data? How does one find the equation for this curve using the actual data? Which values of c and z best fit these data?

Again, the key is logarithms. Assuming a power model S = cA^z applies, then the constants c and z can be estimated from the data by taking the log of both sides of the equality.
log S = log(cA^z )
so log S = log c + log A^z
and finally log S = z log A + log c.
This linearizes the original relation, that is, this new equation has the form of a line: the constant z is its slope, the constant log c is its intercept, log A is the independent variable, and log S is the dependent variable.

So if a power model describes the data, then when we graph log A on the horizontal axis and log S on the vertical the points should lie nearly along a line. This line has slope z and intercept log c.
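
The log-log transformation and line fit can also be sketched in a few lines of Python. This is a minimal illustration rather than the applet's own procedure: the helper `fit_line` is our own name, and the data are Darlington's approximate pattern values from Table 1 (the interpolated 400 mi² row is left out), not the full Darlington.dat file.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares line y = m*x + b; returns (m, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx
    return m, y_bar - m * x_bar


# Approximate areas (mi^2) and species counts from Table 1.
areas = [4, 40, 4000, 40000]
species = [5, 10, 40, 80]

# Transform both variables with base-10 logs, then fit a line.
log_a = [math.log10(a) for a in areas]
log_s = [math.log10(s) for s in species]

z, log_c = fit_line(log_a, log_s)   # slope is z, intercept is log c
c = 10 ** log_c                     # undo the log to recover c
print(f"z = {z:.3f}, log c = {log_c:.3f}, c = {c:.2f}")
```

Because these pattern values lie exactly on a line in log-log coordinates, the fitted slope equals log 2 and the recovered c is about 3.29, the values derived algebraically in the previous section. Real survey data will scatter around the line instead.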

Try it Now!

Reload Darlington.dat into the ScatterPlot Applet. Carry out a transformation of Darlington's data. Take the logs of both variables and plot the result. Do this by
- typing log(A) into the box labelled "Plot"
- typing log(S) into the box labelled "versus:"
- and then clicking the "Plot the Data" button.
Are the data reasonably linear? Remember that the point representing Hispaniola will probably not be very close to the regression line. Why?
What are the slope and intercept for the regression line? (Read these above the scatterplot.)
What is the correlation coefficient (also called r)? It measures the strength of the linear relationship between the two variables and ranges from -1 to +1. When the value is close to -1 or +1, there is a strong correlation between the two variables, and when the value is close to 0 there is little or no correlation between the two variables. Negative correlation coefficients arise when the regression line has a negative slope.

The equation of the least squares regression line is
y = mx + b = 0.324x + 0.493,
or, using the formulation of the power model,
log S = z log A + log c = 0.324 log A + 0.493.
This means that z = 0.324 and log c = 0.493 or, equivalently,
c = 10^0.493 = 3.113.
Thus, the power model for these data would be
S=cA^z=3.113A^0.324,
where A is measured in square miles.
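
As a rough check on this back-transformation, the Python sketch below converts the regression slope and intercept quoted above into the constants of the power model and evaluates the model at the island sizes listed in Table 1. The function name `predict_species` is our own. Note that the rounded intercept 0.493 gives c of about 3.11; a small difference in the last digit from the value quoted above is presumably a matter of rounding.

```python
slope = 0.324       # z, the slope of the regression line quoted in the text
intercept = 0.493   # log c, the intercept of the regression line quoted in the text

c = 10 ** intercept  # undo the base-10 log to recover c


def predict_species(area_mi2):
    """Power model S = c*A^z with area in square miles."""
    return c * area_mi2 ** slope


print(f"c = {c:.3f}")
for area in (4, 40, 4000, 40000):   # island sizes listed in Table 1
    print(area, round(predict_species(area), 1))
```

Comparing the printed predictions with the species counts in Table 1 gives a feel for how closely the fitted curve follows Darlington's rule of thumb.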

Exercises

Many values of z fall within a narrow range between 0.15 and 0.39 [Preston 1962; MacArthur and Wilson 1967].
- Suppose that z=0.15. Using the general power-curve model S = cA^z, what is the effect on the number of species S if the area is increased from A to 10A?
- Suppose that z=0.39. What is the effect on the number of species S if the area is increased from A to 10A?
- Darlington's observation that for every tenfold increase in area there is a doubling of the number of species often is given as a rule of thumb. Do your two calculations support this claim? Explain.
The nonvolant (flightless) mammal fauna for the Channel Islands was surveyed and displayed the general trend of increased area with an increased number of species at a declining rate. Load the ChannelData into the ScatterPlot Applet and find the power curve that best fits these data.
Table 2: Total numbers of nonvolant mammal species versus area for the islands of the British Channel [Wright 1981; adapted from Table A2].

Island	Area (km²)	Species
Jersey	116.3	9
Guernsey	63.5	5
Alderney	7.9	3
Sark	5.2	2
Herm	1.3	2

The data below give the number of endemic vascular plant species in mainland coastal areas (mi²) of California at or above 33 degrees latitude.

Table 3: Data from [Johnson, Mason, and Raven 1968].

Location	Area (mi²)	Species
Tiburon Peninsula	5.9	370
San Francisco	45	640
Santa Barbara area	110	680
Santa Monica Mountains	320	640
Marin County	529	1060
Santa Cruz Mountains	1386	1200
Monterey County	3324	1400
San Diego County	4260	1450
California Coast	24520	2525

Find the power curve function that best fits the endemic (native) plant species data for coastal areas of California at or above 33 degrees latitude given above.
Now predict the number of species present in a 24210 mi² region.
Johnson, Mason, and Raven [1968] were interested in the effects of latitude and elevation as well as area on the number of species present. The Baja region of California is considerably south (28 degrees latitude) of the regions in Table 3. Its area is 24210 mi² and the number of endemic plant species is 1450. How does this compare to your prediction? Does latitude seem to be an important factor?

Islands occur not just in oceans. There are also "virtual" islands such as mountain tops where the surrounding lowland region represents a physical barrier to montane species. Other islands can be lakes or ponds or even wooded areas surrounded by open tracts of land. In this view, a nature preserve or wildlife refuge acts like an "island" to many of the species which inhabit it.
- Power curve models can be used to estimate the effect on the number of species present when natural areas are encroached upon by human activity (e.g., clear cutting of rain forest). Suppose that 50% of an existing area is deforested. Does the power curve model predict that 50% of the species will be lost? Use a value of z of 0.25 to make your estimate.
- Suppose that 90% of an existing area is deforested. What proportion of species does the power curve model predict will remain?
- How many Amazonian plant and animal species ultimately can be preserved if only 1% of the Amazonian rain forest is maintained in a "natural" state? (Diamond and May [1981] comment, "Such relations are admittedly crude and neglectful of detail, but they provide an informed first guess at the relation between the area of a reserve and the number of species which are eventually likely to be preserved in it.")

Scale Dependence of c

Suppose two researchers (perhaps two students in class who should have talked to each other before starting the project!) in the same location are counting plant species in relatively small areas. One measures area in square meters and the other in square feet. How will their species-area curves differ? Consider their two species-area equations for the same location with area measured using two different scales, A₁ in square meters and A₂ in square feet. Let k be the conversion factor to change A₁-values to A₂-values. There are approximately 3.28 feet in a meter, so each square meter is approximately 3.28² = 10.76 square feet. The conversion factor would be k = 10.76. In other words, A₂ = kA₁. In the exercise below, you will see how the change in scale affects the species-area equation.

Exercise

Load the BrownLake.dat data file into the ScatterPlot. These are plant species data collected by six different groups of Hobart and William Smith students in quadrats of increasing area on 17 October 1996 at Brown Lake, North Stradbroke Island, Queensland, Australia. The areas were measured in square meters.

Use a log A versus log S transformation to find the species-area curve. How good is the fit? (Use the coefficient of determination.)
What are the slope and intercept of the regression line?
What is the equation of the species-area curve with area measured in square meters?
Now let's graph the data as if area had been measured in square feet instead. We will need to use the scaling factor on the areas only: Plot 10.76*A versus S. The graph should be curved again, like the original plot.
Now comes the tricky part: We want to straighten out this new curve. Do so by taking logs of both variables. Plot log(10.76*A) versus log S. What is the slope of the regression line? What is its intercept?
What is the equation of the species-area curve with area measured in square feet?
How were the two species-area curves similar? How were they different?

The message is clear: To compare species-area curves from two or more studies, make sure that the same units of area are used. For example, one cannot conclude that an area with a higher c-value has more species than an area with a lower c-value, even if the values of z are the same for both areas, unless one knows that the same measurement scale was used in both cases. If the studies employed different units with A₂ = kA₁, use the relation c₁ = c₂k^z (equivalently, c₂ = c₁/k^z) to convert the c-value of one model to the units of the other. The exponent z remains the same.
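
The effect of a change of units can be sketched numerically. The Python example below is an illustration under stated assumptions, not a solution to the Brown Lake exercise: it reuses the approximate Table 1 values (areas in mi²), converts the areas to km² with the assumed factor k = 2.59 (1 mi² is about 2.59 km²), and fits the power curve in each unit system. The helper `fit_power_curve` is our own name.

```python
import math


def fit_power_curve(areas, species):
    """Fit S = c*A^z by least squares on (log A, log S); returns (c, z)."""
    xs = [math.log10(a) for a in areas]
    ys = [math.log10(s) for s in species]
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    z = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return 10 ** (y_bar - z * x_bar), z


K = 2.59                                   # assumed conversion: 1 mi^2 is about 2.59 km^2
areas_mi2 = [4, 40, 4000, 40000]           # Table 1 areas
species = [5, 10, 40, 80]                  # Table 1 approximate species counts
areas_km2 = [K * a for a in areas_mi2]     # same islands, different unit

c1, z1 = fit_power_curve(areas_mi2, species)
c2, z2 = fit_power_curve(areas_km2, species)

print(f"mi^2: c = {c1:.3f}, z = {z1:.3f}")
print(f"km^2: c = {c2:.3f}, z = {z2:.3f}")
print(f"c2 * K**z = {c2 * K ** z2:.3f}")   # should reproduce c1
```

The two fits share the same exponent z, while the c-values differ by exactly the factor k^z, which is the relation c₁ = c₂k^z.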

Caution: Competing Models and Methods

It is important to note that there are competing models of the species-area relation being used by biologists and ecologists. One whose history is nearly as long as the power function model is the so-called "exponential model" proposed by Gleason [1922]. In this model the number of species is a linear function of the logarithm of the area: S = k + z log A, where k and z are constants. This model has been shown to fit many data sets. For example, Diamond and Mayr [1976] used it when modeling the species-area relation for birds in the Solomon Islands.

Even when the power function model S=cA^z is used, there are different methods for finding the constants c and z. We have described an indirect method which involves transforming the variables using logarithms and then fitting a least-squares line to the resulting data. The advantages of this method are that it is relatively easy to carry out and that historically it has been the method many researchers used when reporting their results. A more direct though more complicated method is to use nonlinear least-squares regression to fit a power function model to the data. If species-area data were perfectly modeled by a power function, both methods would give the same result. But real-world data seldom fit any model exactly, as our analysis of Darlington's data above shows.
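
To make the difference between the two methods concrete, here is a hedged Python sketch that applies both to the Channel Islands data of Table 2. The nonlinear fit shown is one simple way to do it, not necessarily what any particular statistics package does: for a fixed z the best c has a closed form, so the code searches over z alone with a golden-section search. The search interval 0 to 1 for z and all function names are our own assumptions.

```python
import math

# Channel Islands data from Table 2: area (km^2) and nonvolant mammal species.
AREAS = [116.3, 63.5, 7.9, 5.2, 1.3]
SPECIES = [9, 5, 3, 2, 2]


def sse(c, z):
    """Sum of squared errors of S = c*A^z on the original (untransformed) scale."""
    return sum((s - c * a ** z) ** 2 for a, s in zip(AREAS, SPECIES))


def fit_loglog():
    """Indirect method: least-squares line through (log A, log S)."""
    xs = [math.log10(a) for a in AREAS]
    ys = [math.log10(s) for s in SPECIES]
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    z = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return 10 ** (y_bar - z * x_bar), z


def best_c(z):
    """For a fixed z, the c that minimizes the SSE has a closed form."""
    num = sum(s * a ** z for a, s in zip(AREAS, SPECIES))
    den = sum(a ** (2 * z) for a in AREAS)
    return num / den


def fit_nonlinear(lo=0.0, hi=1.0, tol=1e-8):
    """Direct method: golden-section search over z, with c set by best_c(z).

    Assumes the SSE has a single minimum for z between lo and hi.
    """
    def f(z):
        return sse(best_c(z), z)

    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    x1 = b - inv_phi * (b - a)
    x2 = a + inv_phi * (b - a)
    f1, f2 = f(x1), f(x2)
    while b - a > tol:
        if f1 < f2:                       # minimum lies in [a, x2]
            b, x2, f2 = x2, x1, f1
            x1 = b - inv_phi * (b - a)
            f1 = f(x1)
        else:                             # minimum lies in [x1, b]
            a, x1, f1 = x1, x2, f2
            x2 = a + inv_phi * (b - a)
            f2 = f(x2)
    z = (a + b) / 2
    return best_c(z), z


c_log, z_log = fit_loglog()
c_nl, z_nl = fit_nonlinear()
print(f"log-log fit:   c = {c_log:.3f}, z = {z_log:.3f}, SSE = {sse(c_log, z_log):.3f}")
print(f"nonlinear fit: c = {c_nl:.3f}, z = {z_nl:.3f}, SSE = {sse(c_nl, z_nl):.3f}")
```

The two methods generally return different constants because they minimize different quantities: the log-log fit minimizes squared errors in log S, while the nonlinear fit minimizes squared errors in S itself and so gives more weight to the islands with the most species.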

Resources

Mitchell, Kevin and James Ryan. The species-area relation. UMAP Journal Vol. 19, No. 2, 1998, 139-170. Also published as UMAP Module 768 and UMAP Modules: Tools for Teaching 1998. 23-54.

Quammen, David. 1996. The Song of the Dodo. Touchstone, New York.

Rosenzweig, M. 1995. Species Diversity in Space and Time. Cambridge University Press.

Williams, C. 1964. Patterns in the Balance of Nature. Academic Press, London.

Kevin Mitchell mitchell@hws.edu
Hobart and William Smith Colleges 
Copyright © 1997-2001
Last updated: 17 July 2001
