The Species-Area Relation
Introduction
One of the most fundamental ecological relationships is
that as the area of a region increases, so does the number of different species
encountered. While this makes sense and may even seem obvious, the observation
seems to have first been made late in the eighteenth century and to have slowly taken hold
in the nineteenth century. During this period, naturalists such as Alfred Wallace
and Charles Darwin accompanied sailing expeditions to islands around the globe.
In the process of recording and collecting what were new and exotic species
to Europeans, certain ecological patterns and trends slowly became apparent.
Johann Reinhold Forster, the naturalist on Captain Cook's second voyage to the
South Pacific (1772), seems to have been the first to notice this
particular point [Quammen 1996]:
"Islands only produce a greater or less number of species as their
circumference is more or less extensive."
Simply put, the number of species increases with area. A less obvious
insight would occur later to others making careful collections of data:
the increase in species occurs at a decreasing rate.
This species-area relation may be the oldest ecological
pattern to be recognized; H. C. Watson described a species-area curve
for plant species in Britain in 1859 and deCandolle produced a
similar study in 1855 [Williams 1964 and Rosenzweig 1995].
Beginning with a small plot in county Surrey, Watson identified the
plant species present in ever increasing areas of Great Britain. The
general pattern is one of increasing species diversity with increasing
area sampled. The first mathematical description of the species-area
relationship was proposed by Arrhenius in 1920 and modified by
Gleason in 1922.
The association of increased area with an increasing number of species
at a declining rate has been tested numerous times. It persists over areas
both small and large and with animals as well as plants.
Example 1 Load the data file Darlington.dat into the
ScatterPlot applet to the right and click the "Plot the Data" button.
Here and in subsequent examples, A is the
area of the region and S is the number of species present in the
corresponding region.
Darlington [1957] found this pattern with
amphibian and reptile species in the Greater and Lesser Antilles.
The areas, A, of these islands range from
about 1 mi² to nearly 40,000 mi². The number of
species, S, present on the islands ranges from 3 on the smallest island,
Redonda, to 84 on Hispaniola. Note that Hispaniola has the most species even though it is only the second largest island
in this group; Cuba is the largest. So the number of species
does not always increase with island size, though that is the general
trend.
Example 2 The same pattern persists when small areas are surveyed.
Arrhenius [1921] observed this effect when counting plant
species in a variety of ecological communities.
His plots ranged from 1 to 100 square decimeters. You can load Arrhenius.dat
into the scatterplot at the right and see the same pattern.
The simple species-area pattern elucidated by Watson and
deCandolle has, of course, been shown to depend on a number of
other variables besides area. For example, elevation and latitude
may change the shape of the species-area curve. So does isolation (mainland
versus island). This may be one reason for the large number of amphibian and reptile
species that were found on Hispaniola. Likewise, habitat
heterogeneity contributes to the rate at which new species are added
as area increases. For example, whether you sampled a square meter
or a square kilometer at the North Pole, you'd probably find few if
any species. However, a similar study in an area with several
different types of habitats would yield many new species with each
increase in plot area. In fact, some
studies suggest that the best explanation for the species-area
relationship is increasing habitat diversity [Johnson and Simberloff
1974].
Ecologists have produced hundreds of examples of increasing
species diversity with increasing area (see [Rosenzweig
1995]). Yet despite numerous empirical examples, debate continues
over the cause(s) of the species-area relation [McGuinness
1984]. The debate has become even more contentious in recent
years as species-area curves have been used to address important
questions such as:
- What is the minimum protected area that will sustain a
particular endangered species?
- Should we protect species or ecosystems?
- What management practices will result in species extinction and
at what rate?
Darlington's Observation
Reconsider the
data on amphibians and reptiles in the Antilles.
Darlington [1957] noticed that there was
a pattern in the data when he excluded a few of the islands from consideration.
Table 1. Island size versus number of amphibian and
reptile species in the Antilles.
Based on [Darlington 1957, Table 17; also see Quammen 1996, 388].
Approx. area (mi²) | Species (approximate) | Species (actual) | Index no. k |
4 | 5 | 5 | 0 |
40 | 10 | 9 | 1 |
(400) | (20) | -- | (2) |
4000 | 40 | 39-40 | 3 |
40000 | 80 | 76-84 | 4 |
These data may be loaded into the ScatterPlot by using the file Pattern.dat.
Though there was no 400 mi² island in the Antilles,
Darlington still included that size in his table and filled in an
expected number of 20 amphibian and
reptile species for such an island.
He did so because he noticed that for each tenfold increase in area,
the number of species doubled.
Mathematically this means that both
the area and the number of species form geometric sequences.
If A denotes
the area and S the number of species, then
the area of the islands can be written as
$A = 4 \cdot 10^k$,
where k is the index
number of the island in Table 1, starting with k = 0 for the smallest island.
Similarly, the corresponding number of species is $S = 5 \cdot 2^k$.
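As a quick check, the two geometric sequences can be generated side by side. The following is a minimal Python sketch; the variable names are illustrative and are not part of the original module.

```python
# Darlington's pattern from Table 1: each step multiplies the area by 10
# and the number of species by 2.
rows = []
for k in range(5):
    area = 4 * 10 ** k       # A = 4 * 10^k, in square miles
    species = 5 * 2 ** k     # S = 5 * 2^k
    rows.append((k, area, species))

for k, area, species in rows:
    print(f"k={k}  A={area:>6}  S={species:>3}")
```

The row for k = 2 reproduces the interpolated 400 mi² island with 20 species that Darlington added to his table.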
The goal is to "predict" the number of species based on the area as Darlington
did. In other words,
S should be expressed as a function of A, not as a function of some
arbitrary index k. To do this, solve for k in terms of A by using logarithms
and then substitute this into the expression for S.
$A = 4 \cdot 10^k$ if and only if $A/4 = 10^k$ if and only if
$\log(A/4) = k$,
where log denotes the base-10 logarithm.
If we rewrite S using this expression for k
and the identity
$2 = 10^{\log 2}$,
we obtain
$S = 5 \cdot 2^k = 5(10^{\log 2})^k$
$= 5(10^{\log 2})^{\log(A/4)}$
$= 5(10^{\log(A/4)})^{\log 2}$.
Again, using basic logarithm properties, this
simplifies to
$S = 5(A/4)^{\log 2} = 5 \cdot 4^{-\log 2} \cdot A^{\log 2}$.
Since $5 \cdot 4^{-\log 2}$ is approximately 3.29,
we have
$S = 3.29A^{\log 2}$.
Notice how the exponent log 2 of A ensures that a tenfold
increase in area produces a doubling of the number of species. If we start with an area of
size A and then evaluate S for an area of size 10A we
see that the function predicts there to be twice as many species as in an area of size
A:
$S(10A) = 3.29(10A)^{\log 2}$
$= 10^{\log 2} \cdot 3.29A^{\log 2} = 2 \cdot S(A)$.
Of course, we can evaluate the log of 2:
$\log 2 \approx 0.301$.
So, in this particular case, we are able to express S as a power
function of A using
$S = 3.29A^{\log 2} \approx 3.29A^{0.301}$
because
we spotted a pattern in (an approximation to) the data. Such patterns
are seldom so obvious. Nonetheless, the expectation is that species
and area are related in the general way that we have found in this
example: S is a power function of A of the form
$S = cA^z$,
where c and z are constants that depend on the particular
location and taxonomic group.
In particular, whenever a tenfold
increase in area produces a doubling of the number of species,
then $z = \log 2$. For the general situation, we outline below
a simple method of finding z and c.
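The power function derived from Darlington's pattern can be checked numerically. The sketch below is only an illustration in Python (the names `Z`, `C`, and `species` are assumptions, not part of the original module): it rebuilds the constants from the derivation, reproduces the approximate species counts of Table 1, and confirms that a tenfold increase in area doubles the prediction.

```python
import math

Z = math.log10(2)        # exponent log 2, about 0.301
C = 5 * 4 ** (-Z)        # coefficient 5 * 4^(-log 2), about 3.29

def species(area):
    """Predicted species count S = C * A**Z for an area in square miles."""
    return C * area ** Z

# Reproduce the "approximate" column of Table 1.
for area in (4, 40, 400, 4000, 40000):
    print(area, round(species(area), 2))

# A tenfold increase in area doubles the prediction for any starting area
# (123.0 is an arbitrary test value).
ratio = species(10 * 123.0) / species(123.0)
print(ratio)
```

The ratio equals 2 up to floating-point rounding because 10 raised to the power log 2 is exactly 2.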
Exercises
- Suppose Darlington had found
that a tenfold increase in area produced a tripling
of species. What would the exponent z be in the equation $S = cA^z$?
- Suppose that a ninefold increase in area produced a doubling
of species. What would the exponent z be?
- For those familiar with natural logarithms, recall that
$\log 2 = \ln 2 / \ln 10$.
Re-express your answers to the two exercises above using
natural logs.
- More generally, if an n-fold increase in area produces an m-fold increase
in the number of species, express the corresponding exponent z using
natural logs.
The Power Function Model
The power curve description of the species-area relation,
$S = cA^z$,
was proposed by Arrhenius in 1920 and modified by
Gleason [1922]. The exponent z is generally small, in the range of
0.2 or 0.3 as we found above. The constants c and z are determined from the survey
data itself. One of
the difficulties with this model is giving an appropriate biological
interpretation to these constants. We will consider this question later.
Note, however, that both c and z may vary from
region to region and from one type of organism to another and both constants
can dramatically change the shape of the curve.
Because c and z are fitted to the data, some have criticized power curve
models because they don't explain anything about the system; they merely describe
it. Biologically, why should species-area curves be described by power curves
at all? Taking up this criticism, many others (see [McGuinness 1984] for an account)
have modified, adapted or even scrapped this basic
model in an attempt to build a model with explanatory power. Nonetheless,
McGuinness [1984] notes that the
basic relation is often viewed as "one of community ecology's few genuine laws."
For beginning students, the model provides an introduction to the
process of transforming data in order to determine the relationship that
would seem to exist between two variables that can be measured relatively easily
in the field.
Fitting a Power Curve to the Data
Reconsider Darlington's original data set for Antillean amphibians and reptiles.
How does one find
the power curve $S = cA^z$ that
captures the overall trend of the data? How does one find
the equation for this curve using the actual data? Which
values of c and z best fit these data?
Again, the key is logarithms.
Assuming a power model $S = cA^z$ applies,
the constants c and z can be estimated from the data by
taking the log of both sides of the equality:
$\log S = \log(cA^z)$,
so
$\log S = \log c + \log A^z$,
and finally
$\log S = z \log A + \log c$.
This linearizes the original relation, that is,
this new equation has the form of a line:
the constant z is its
slope, the constant log c is its intercept,
log A is the independent variable, and log S is the dependent variable.
So if a power model describes the data, then when we
graph log A on the horizontal axis and log S on the vertical
the points should lie nearly along a line. This line has slope z and
intercept log c.
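The log-log fitting procedure can be sketched in a few lines of Python. This is an illustrative sketch only: it uses the approximate values from Table 1 (including Darlington's interpolated 400 mi² row) rather than the contents of Darlington.dat, and the helper `fit_line` is a hypothetical name for an ordinary least-squares fit.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Approximate values from Table 1 (area in square miles, species count).
areas = [4, 40, 400, 4000, 40000]
species = [5, 10, 20, 40, 80]

# Transform both variables, then fit a line to (log A, log S).
log_a = [math.log10(a) for a in areas]
log_s = [math.log10(s) for s in species]
z, log_c = fit_line(log_a, log_s)
c = 10 ** log_c          # back-transform the intercept

print(f"z = {z:.3f}, c = {c:.2f}")
```

Because these approximate values follow the doubling pattern exactly, the fit recovers z = log 2 (about 0.301) and c of about 3.29, the same constants obtained earlier by algebra. Real survey data will not lie exactly on a line.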
Try it Now!
- Reload Darlington.dat into the ScatterPlot Applet.
Carry out a transformation of Darlington's data. Take the
logs of both variables and plot the result. Do this by
- typing log(A) into the box labelled "Plot"
- typing log(S) into the box labelled "versus:"
- and then clicking the "Plot the Data" button.
- Are the data reasonably linear? Remember that
the point representing Hispaniola will probably not be very close to the regression line. Why?
- What are the slope and intercept for the regression line? (Read these above the scatterplot.)
- What is the correlation coefficient (also called r)? It measures the strength of
the linear relationship between the two variables and ranges from -1 to +1. When the value
is close to -1 or +1, there is a strong correlation between the two variables, and
when the value
is close to 0 there is little or no correlation between the two variables. Negative correlation
coefficients arise when the regression line has a negative slope.
The equation of the least squares regression line
is
$y = mx + b = 0.324x + 0.493$,
or, using the formulation of the power model,
$\log S = z \log A + \log c = 0.324 \log A + 0.493$.
This means that z = 0.324 and log c = 0.493 or, equivalently,
$c = 10^{0.493} \approx 3.113$.
Thus, the power model for these data would be
$S = cA^z = 3.113A^{0.324}$,
where A is measured in square miles.
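The step from the regression line back to the power model can be sketched as follows. This Python fragment is an illustration only; it starts from the rounded slope and intercept quoted in the text, and the function name `predicted_species` is an assumption. With the rounded intercept, c comes out near 3.11; the slightly larger value quoted in the text presumably reflects an intercept carried to more decimal places.

```python
import math

z = 0.324         # slope reported for the regression line
log_c = 0.493     # intercept reported for the regression line
c = 10 ** log_c   # back-transform the intercept to get c

def predicted_species(area_sq_miles):
    """Species count predicted by S = c * A**z, with A in square miles."""
    return c * area_sq_miles ** z

for area in (1, 100, 40000):
    s = predicted_species(area)
    # The power model and the regression line are the same relation:
    # log S = z * log A + log c.
    assert abs(math.log10(s) - (z * math.log10(area) + log_c)) < 1e-9
    print(area, round(s, 1))
```

Setting A = 1 returns c itself, which is one way to read c: it is the number of species the model predicts for one unit of area.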
Exercises
- Many values of z
fall within a narrow range between 0.15 and 0.39 [Preston 1962; MacArthur and
Wilson 1967].
- Suppose that z = 0.15. Using the general power-curve model $S = cA^z$,
what is the effect on the number of species S
if the area is increased from A to 10A?
- Suppose that z=0.39. What is the effect on the number of species S
if the area is increased from A to 10A?
- Darlington's observation that for every tenfold increase in
area there is a doubling of the number of species often is given as
a rule of thumb.
Do your two calculations support this claim? Explain.
- A survey of the nonvolant (flightless) mammal fauna of the
Channel Islands showed the general
trend: the number of species increases with area,
but at a declining rate. Load the ChannelData into the ScatterPlot Applet and
find the power curve that best fits these data.
Table 2: Total numbers of nonvolant mammal species versus area
for the Channel Islands
[Wright 1981; adapted from Table A2].
Island | Area (km²) | Species |
Jersey | 116.3 | 9 |
Guernsey | 63.5 | 5 |
Alderney | 7.9 | 3 |
Sark | 5.2 | 2 |
Herm | 1.3 | 2 |
- The data below give the number of endemic vascular
plant species in mainland coastal areas
of California at or above 33 degrees latitude, with areas in mi².
Table 3: Data from [Johnson, Mason, and Raven 1968].
Location | Area (mi²) | Species |
Tiburon Peninsula | 5.9 | 370 |
San Francisco | 45 | 640 |
Santa Barbara area | 110 | 680 |
Santa Monica Mountains | 320 | 640 |
Marin County | 529 | 1060 |
Santa Cruz Mountains | 1386 | 1200 |
Monterey County | 3324 | 1400 |
San Diego County | 4260 | 1450 |
California Coast | 24520 | 2525 |
- Find the power curve function that
best fits the endemic (native) plant species data for coastal areas of California
at or above 33 degrees latitude given above.
- Now predict the number of species present in a
24210 mi² region.
- Johnson, Mason, and Raven [1968] were interested in the effects of
latitude and elevation as well as area on the number of species present.
The Baja region of California lies considerably farther south (28 degrees latitude)
than the regions in Table 3. Its area is 24210 mi² and the number
of endemic plant species is 1450. How does this compare to your prediction?
Does latitude seem to be an important factor?
- Islands occur not just in
oceans. There are also "virtual" islands such as mountain tops where
the surrounding lowland region represents a physical barrier to montane species.
Other islands can be lakes or ponds or even wooded areas surrounded by open tracts of
land. In this view, a nature preserve or wildlife refuge acts like an
"island" to many of the species which inhabit it.
- Power curve models can be used to estimate the effect on the
number of species present when natural areas are encroached upon
by human activity (e.g., clear cutting of rain forest). Suppose that
50% of an existing area is deforested. Does the power curve model predict that
50% of the species will be lost? Use a value of z of 0.25 to make your estimate.
- Suppose that
90% of an existing area is deforested. What proportion of species does the power
curve model predict will remain?
- How many Amazonian plant and animal species ultimately can
be preserved if only 1% of the Amazonian rain forest is maintained in
a "natural" state?
(Diamond and May [1981] comment, "Such relations are admittedly
crude and neglectful of detail, but they provide an informed first guess
at the relation between the area of a reserve and the number of species
which are eventually likely to be preserved in it.")
Scale Dependence of c
Suppose two researchers (perhaps two students in class who should have talked to each other
before starting the project!) in the same location
are counting plant species in relatively small areas. One
measures area in square meters and the other in square feet. How will their species-area
curves differ?
Consider their two
species-area equations for the same location with area measured using
two different scales, $A_1$ in square meters and $A_2$ in square feet.
Let k be the conversion
factor to change $A_1$-values to $A_2$-values. There
are approximately 3.28 feet in a meter, so each square meter is approximately
$3.28^2 \approx 10.76$ square feet. The conversion factor would be k = 10.76. In
other words, $A_2 = kA_1$. In the exercise below,
you will see how the change in scale affects the species-area equation.
Exercise
Load the BrownLake.dat data file into the ScatterPlot.
These are plant species data collected by six different groups of
Hobart and William Smith students
in quadrats of increasing area on 17 October 1996 at Brown Lake,
North Stradbroke Island, Queensland, Australia. The areas were measured in
square meters.
- Use a log A versus log S transformation to find the species-area curve.
How good is the fit? (Use the coefficient of determination.)
- What are the slope and intercept of the regression line?
- What is the equation of the species-area curve with area measured in square meters?
- Now let's graph the data as if area had been measured in square feet instead.
We will need to use the scaling factor on the areas only: Plot 10.76*A versus
S. The graph should be curved again, like the original plot.
- Now comes the tricky part: We want to straighten out this new curve. Do so by taking
logs of both variables. Plot log(10.76*A) versus log S.
What is the slope of the regression line? What is its intercept?
- What is the equation of the species-area curve with area measured in square feet?
- How were the two species-area curves similar? How were they different?
The message is clear: To compare species-area curves from two or more
studies, make sure that the same units of area are used. For example,
one cannot conclude
that an area with a higher c-value has more species than an area with a
lower c-value, even if the values of z are the same for both areas,
unless one knows that the same measurement scale was used in both cases.
If the studies
employed different units, use the relation $c_1 = c_2 k^z$, where $A_2 = kA_1$,
to convert between the
c-value of the first model and that of the second. The exponent z
remains the same.
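The effect of a change of units can be illustrated with synthetic data. The Python sketch below is an assumption-laden illustration, not the Brown Lake data: the values `Z_TRUE` and `C_M2` are hypothetical, the species counts are generated from an exact power law, and `fit_line` is an illustrative least-squares helper. It fits the same counts against area in square meters and in square feet and compares the results.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

K = 3.28 ** 2               # square feet per square meter, about 10.76
Z_TRUE, C_M2 = 0.25, 6.0    # hypothetical model constants (area in m^2)

areas_m2 = [1, 2, 4, 8, 16, 32]
species = [C_M2 * a ** Z_TRUE for a in areas_m2]   # synthetic, noise-free
areas_ft2 = [K * a for a in areas_m2]              # same plots, in ft^2

log_s = [math.log10(s) for s in species]
z1, b1 = fit_line([math.log10(a) for a in areas_m2], log_s)
z2, b2 = fit_line([math.log10(a) for a in areas_ft2], log_s)
c1, c2 = 10 ** b1, 10 ** b2

print(f"square meters: z = {z1:.3f}, c = {c1:.3f}")
print(f"square feet:   z = {z2:.3f}, c = {c2:.3f}")
print(f"c2 * k^z = {c2 * K ** z1:.3f}")
```

The two fits share the same exponent, while the coefficients differ by the factor k^z: multiplying every area by k shifts log A by the constant log k, which moves the intercept of the regression line but not its slope.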
Caution: Competing Models and Methods
It is important to note that there are competing models of the species-area
relation being used by biologists and ecologists. One whose history is nearly as
long as the power function model is
the so-called "exponential model" proposed by Gleason [1922]. In this model the
number of species is a linear function of the logarithm of the area: $S = k + z \log A$,
where k and z are constants.
This model has been shown to fit many data sets; for example, Diamond and Mayr [1976]
used it when modeling the species-area relation for birds in the Solomon Islands.
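Fitting the exponential model requires transforming only the area. The following Python sketch is an illustration under stated assumptions: it uses the approximate values of Table 1 as sample data, and `fit_line` is a hypothetical helper that also returns the coefficient of determination. It fits both the exponential model (S against log A) and the power model (log S against log A).

```python
import math

def fit_line(xs, ys):
    """Least-squares slope, intercept, and r^2 for y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy ** 2 / (sxx * syy)

# Approximate values from Table 1 (area in square miles, species count).
areas = [4, 40, 400, 4000, 40000]
species = [5, 10, 20, 40, 80]
log_a = [math.log10(a) for a in areas]

# Exponential model: S = k + z * log A (only the area is transformed).
z_exp, k_exp, r2_exp = fit_line(log_a, species)

# Power model: log S = z * log A + log c (both variables are transformed).
z_pow, log_c, r2_pow = fit_line(log_a, [math.log10(s) for s in species])

print(f"exponential: S = {k_exp:.2f} + {z_exp:.2f} log A, r^2 = {r2_exp:.3f}")
print(f"power:       z = {z_pow:.3f}, c = {10 ** log_c:.2f}, r^2 = {r2_pow:.3f}")
```

These sample values were constructed to follow a power law, so the power model fits them exactly and the exponential model does not. The two r² values are computed on different response scales (S versus log S), so they give only a rough comparison; for other data sets the exponential model may fit as well or better.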
Even when the power function model $S = cA^z$ is used, there are different methods for finding
the constants c and z. We have described an indirect method which involves transforming
the variables using logarithms and then fitting a least-squares line to the resulting data.
The advantages of this method are that it is relatively easy to carry out and historically it
has been the method that many researchers used when reporting their results.
A more direct though more complicated method is to use
nonlinear least-squares regression to fit a power function model to the data.
If species-area data were perfectly modeled by a power function, both methods would give the
same result. But real-world data seldom fit any model exactly, as our analysis of
Darlington's data above shows.
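The difference between the two methods can be seen in a small experiment. The Python sketch below is one possible illustration, not the procedure of any particular statistics package. As sample data it uses the "actual" species counts of Table 1, with the midpoints 39.5 and 80 standing in for the ranges 39-40 and 76-84 (an assumption made only for illustration). The direct fit searches over z with a golden-section search and, for each z, uses the closed-form best c; all function names are hypothetical.

```python
import math

def fit_loglog(areas, species):
    """Indirect method: least-squares line through (log A, log S)."""
    xs = [math.log10(a) for a in areas]
    ys = [math.log10(s) for s in species]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    z = sxy / sxx
    return 10 ** (mean_y - z * mean_x), z

def sse(c, z, areas, species):
    """Sum of squared errors of S = c * A**z on the original scale."""
    return sum((s - c * a ** z) ** 2 for a, s in zip(areas, species))

def best_c(z, areas, species):
    """For a fixed z, the c that minimizes the sum of squared errors."""
    num = sum(s * a ** z for a, s in zip(areas, species))
    den = sum(a ** (2 * z) for a in areas)
    return num / den

def fit_direct(areas, species, lo=0.0, hi=1.0, iters=100):
    """Direct method: golden-section search over z (assumes one minimum)."""
    g = (math.sqrt(5) - 1) / 2
    f = lambda z: sse(best_c(z, areas, species), z, areas, species)
    for _ in range(iters):
        x1, x2 = hi - g * (hi - lo), lo + g * (hi - lo)
        if f(x1) < f(x2):
            hi = x2      # minimum lies in [lo, x2]
        else:
            lo = x1      # minimum lies in [x1, hi]
    z = (lo + hi) / 2
    return best_c(z, areas, species), z

# Table 1 "actual" counts; range midpoints are an illustrative assumption.
areas = [4, 40, 4000, 40000]
species = [5, 9, 39.5, 80]

c_log, z_log = fit_loglog(areas, species)
c_dir, z_dir = fit_direct(areas, species)
sse_log = sse(c_log, z_log, areas, species)
sse_dir = sse(c_dir, z_dir, areas, species)
print(f"log-log fit: c = {c_log:.3f}, z = {z_log:.3f}, SSE = {sse_log:.3f}")
print(f"direct fit:  c = {c_dir:.3f}, z = {z_dir:.3f}, SSE = {sse_dir:.3f}")
```

The direct fit minimizes squared errors in S itself, so large islands carry more weight, while the log-log fit minimizes squared errors in log S and treats proportional errors alike. The two estimates therefore differ slightly whenever the data do not follow a power law exactly.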
Resources
Mitchell, Kevin and James Ryan. The species-area relation.
UMAP Journal Vol. 19, No. 2, 1998, 139-170.
Also published as UMAP Module 768 and
UMAP Modules: Tools for Teaching 1998. 23-54.
Quammen, David. 1996. The Song of the Dodo. Touchstone, New York.
Rosenzweig, M. 1995. Species Diversity in Space and Time. Cambridge University Press.
Williams, C. 1964. Patterns in the Balance of Nature. Academic Press, London.
Kevin Mitchell mitchell@hws.edu
Hobart and William Smith Colleges
Copyright © 1997-2001
Last updated: 17 July 2001