An evaluation of Michael’s analysis of Morton and Gould
By David DeGusta (firstname.lastname@example.org) and Jason Lewis (email@example.com)
Michael’s analysis [1,2] of Morton and Gould was the first and, to our knowledge,
only direct examination of the validity of Gould’s claims other than our own . As such,
we provide here a detailed examination of Michael’s work, particularly his published
paper . We also examined Michael’s unpublished undergraduate thesis , but the
thesis generally does not add significantly to what is presented in Michael’s publication
(though, interestingly, the thesis generally accepts Gould’s critique of Morton). Our
evaluation is intended to accompany our own analysis of Morton and Gould ,
particularly our Supplemental Text S1: Additional Historical Background.
Michael  focuses on Morton’s Catalogue of Skulls – he only summarizes Crania
Americana  and does not mention Crania Aegyptiaca . Michael’s article  consists of
three main components: his remeasurement of 201 crania from the Morton Collection,
his recalculation of cranial capacity averages by population using Morton’s 1849 data,
and his response to several of Gould’s criticisms  of Morton’s work.
Michael’s Remeasurements of Morton Collection Crania
For the 201 Morton crania that Michael remeasured, Michael reports only group
averages using Morton’s ethnographic divisions . Michael states that, “Means
calculated for groups and subgroups from remeasured values were also consistent with
but lower than means calculated from Morton’s data (table 3),”. However, Michael’s
“table 3” contains seven errors in the listed mean cranial capacities. Both the Morton
“measurement mean” and the Michael “remeasurement mean” are obviously wrong for
both the Mexican and Peruvian groups (since the grand mean for the American group is
smaller than the means of all the subgroups in both cases, which is impossible).
Comparing Michael’s publication  with his thesis  reveals that the maximum values
were erroneously used for these groups. The correct values for the Mexican sample are 82
and 79 for Morton and Michael, respectively, and for the Peruvian sample 75 and 74 .
A similar mistake was made for the Polynesian group, erroneously listed in Michael’s
“table 3”  as having Morton and Michael means of 84 and 82 but with the correct
values actually being 83 and 79 . For the “Caucasian” group, the grand mean for the
Michael measurements in his “table 3” is closer to 82, not 84, and the former value is
indeed that given in his thesis . Regardless, the significance of the mean values in table
3 is unclear, as it is difficult to see, and Michael [1, 2] does not specify, how sample
averages test the question of whether individual skulls were mismeasured.
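The impossibility noted above follows from the fact that a grand mean is a sample-size-weighted average of its subgroup means, so it must lie between the smallest and the largest of them. A minimal Python sketch of that check, assuming made-up subgroup sizes and means (not values from Morton or Michael) and hypothetical helper names:

```python
def grand_mean(subgroups):
    """Pooled (grand) mean from a list of (sample_size, subgroup_mean) pairs."""
    total_n = sum(n for n, _ in subgroups)
    return sum(n * mean for n, mean in subgroups) / total_n


def grand_mean_is_possible(reported_grand_mean, subgroup_means):
    """A grand mean must fall within the range spanned by its subgroup means."""
    return min(subgroup_means) <= reported_grand_mean <= max(subgroup_means)


# Hypothetical (n, mean in cubic inches) pairs, for illustration only.
example = [(10, 80.0), (30, 76.0), (5, 84.0)]
example_means = [mean for _, mean in example]

computed = grand_mean(example)
computed_ok = grand_mean_is_possible(computed, example_means)

# A reported grand mean below every subgroup mean signals a tabulation error.
impossible_ok = grand_mean_is_possible(74.0, example_means)
```

Any table in which the grand mean falls outside the range of its subgroup means fails this check regardless of the sample sizes involved, which is why the error is detectable without access to the underlying measurements.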
Michael [1, 2] does not provide any comparisons of his individual capacity
measurements with those of Morton, other than to state that, “Over 95% of Morton’s
measurements were within 4 in³ of the [re-]measurements; fewer than 7% were smaller,”
 and that, “his [Morton’s] specific cranial measurements contain few errors,” .
Michael’s measurements were consistently about 2 in³ smaller than Morton’s, because
Michael used molded acrylic balls while Morton used shot . As such, the fact that
Michael’s measurements were still greater than Morton’s for about a dozen crania
indicates a number of erroneous (low) measurements by Morton, assuming Michael’s
measurements were accurate.
However, the hypothesis at issue is not simply whether Morton made errors in his
cranial capacity measurements, but whether those errors were patterned by race (i.e.,
whether they conformed to Morton’s presumed bias). The key data are therefore the
population affinities of the crania where Michael’s measurements differ notably from
Morton’s. This information is not available in either Michael’s publication  or thesis
, nor are his measurements of individual crania given (he provides only sample
averages). Attempts to locate Michael failed, and his thesis advisor is deceased, so these
data are presumably lost. Thus while we have Morton’s measurements of individual
crania made in 1849 (or earlier) we do not have Michael’s corresponding individual
measurements from 1986.
Regardless, Gould [7,8] does not claim that Morton’s shot-based measurements
 -- the ones used for comparison by Michael  -- were biased or in error. In fact,
Gould argues the opposite: “I will assume, as Morton contends, that measurements with
shot were objective and invariably repeatable to within 1 in³,” . As such, while the
general correspondence between Morton’s original shot measurements and Michael’s
remeasurements has sometimes been taken as a refutation of Gould [9-11], such a
correspondence is, in fact, exactly what Gould predicted. Gould’s suggestion of a
“plausible scenario” involving Morton’s mismeasuring  was strictly and explicitly
confined to Morton’s seed-based measurements reported in Crania Americana ,
measurements not dealt with by Michael . Gould’s support for this scenario came from
the changes in means between the seed-based and shot-based measurements reported by
Morton, as described in our main text, an argument not dealt with by Michael .
Michael’s Recalculations of Morton’s Means
In addition to remeasuring crania, Michael  recalculated the group means
listed in Morton’s 1849 summary table  using Morton’s own data. However, Michael’s
recalculated group means are unlikely to match Morton’s exactly, “because the [Morton]
1849 table does not represent all the measured crania Morton listed, it is impossible to
determine exactly which crania he used in his calculations,” . Even so, Michael found
that, “75% of the ‘family’ or ‘race’ means reported by Morton in 1849 were found to be
within 2 in³ of the subgroup means recalculated from the same data,” . However,
Michael  does not discuss the 25% of the means that differ by more than 2 in³. Again,
Gould’s claim concerns the patterning and direction of Morton’s errors by
population , not simply their absolute frequency. Relative to Michael’s recalculations
, the means of Morton  that err by more than 2 in³ are as follows: Anglo-American
mean 5 in³ too high, German mean 3 in³ too high, Semitic mean 4 in³ too high, Mexican
mean 3 in³ too low, and “Mongolian” mean 3 in³ too low. The patterning
here is obvious: the three overestimated means in Morton  are all part of his Caucasian
group, while his two underestimated means are for non-Caucasian groups (Michael noted
this in his thesis , but did not consider its implications for Morton’s bias). So to the
extent that there is a pattern, it conforms exactly with Gould’s prediction. Again, though,
we emphasize that these comparisons are problematic and largely uninformative because
of the differences in the samples used by Morton and Michael, as Michael  stipulates.
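The direction of these five discrepancies can be tallied in a few lines. The sketch below is purely descriptive: it uses only the deviations listed above (Morton's 1849 mean minus Michael's recalculated mean, in cubic inches), the Caucasian/non-Caucasian flag follows the description in the text, and the function name is hypothetical. It inherits the caveat about incongruent samples and is not a statistical test.

```python
# (group, Morton mean minus Michael's recalculated mean in cubic inches,
#  whether the text places the group within Morton's Caucasian group)
deviations = [
    ("Anglo-American", +5, True),
    ("German", +3, True),
    ("Semitic", +4, True),
    ("Mexican", -3, False),
    ("Mongolian", -3, False),
]


def direction_counts(rows):
    """Count over- and underestimates separately for each category."""
    counts = {
        ("caucasian", "high"): 0,
        ("caucasian", "low"): 0,
        ("non-caucasian", "high"): 0,
        ("non-caucasian", "low"): 0,
    }
    for _, delta, is_caucasian in rows:
        category = "caucasian" if is_caucasian else "non-caucasian"
        direction = "high" if delta > 0 else "low"
        counts[(category, direction)] += 1
    return counts


counts = direction_counts(deviations)
```

With only five means involved, the tally shows the direction of the pattern and nothing more.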
Michael states that, “Morton incorrectly determined the means for ‘groups’ and
‘families’ by averaging the means of their constituent ‘races,’” . In fact, Gould’s
criticism was just the opposite, that Morton inappropriately took straight, ungrouped
means: “As a primary reason for rejecting Morton’s ungrouped means ...,” . Oddly,
Michael elsewhere states that, “The only error of Morton’s 1849 table that may indicate
bias is his unfair comparison of samples. His samples are unequal in size and sexual
distribution, and Gould (1978:505-506) has convincingly argued that Morton had some
knowledge that sample size could affect means,” . In fact, Morton’s 1849 table
represents grouped means, as noted by Michael himself, thus largely avoiding the issues of
sample sizes raised by Gould . Gould’s criticism of unequal sample sizes, with generally
small Peruvian crania numerically dominating Morton’s “American Group” sample ,
was primarily aimed at Crania Americana . In addition, Gould argued for taking grouped
means at a lower level of “taxonomic” resolution (i.e., among specific tribes of Native
Americans, rather than just grouped means of “Barbarous Nations” and “Toltecs”) than
was done by Morton.
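The distinction between ungrouped (pooled) and grouped means matters most when subgroup sample sizes are very unequal. A minimal Python sketch, assuming two invented subgroups (the sizes and means are illustrative and are not Morton's data) and hypothetical function names:

```python
def pooled_mean(subgroups):
    """Ungrouped mean: every cranium counts equally, so large samples dominate."""
    total_n = sum(n for n, _ in subgroups)
    return sum(n * mean for n, mean in subgroups) / total_n


def grouped_mean(subgroups):
    """Grouped mean: every subgroup counts equally, whatever its sample size."""
    means = [mean for _, mean in subgroups]
    return sum(means) / len(means)


# Hypothetical (n, mean in cubic inches): one large sample of small crania
# and one small sample of larger crania.
subgroups = [(150, 75.0), (10, 85.0)]

pooled = pooled_mean(subgroups)    # 75.625, pulled toward the large sample
grouped = grouped_mean(subgroups)  # 80.0, midway between the subgroup means
```

The choice of the level at which to group (for example, individual tribes versus two broad divisions) changes the grouped mean in the same way, which is the point of Gould's argument for a finer level of resolution.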
Michael states that, “It is possible that Morton unconsciously adjusted his results
by limiting the size of some samples, but because his other errors do not indicate bias it
appears equally possible that the inequality of sample sizes was a result of his ignorance of
statistics,” . First, Morton’s other errors, judging from Michael’s own work, are biased
with regard to population (as discussed above). Second, it is difficult to imagine that limiting
sample sizes to influence overall means could be done unconsciously given the relatively
large samples involved. Third, it requires no knowledge of statistics to understand that
arbitrarily excluding some specimens from an analysis has the potential to impact the
resulting mean, and Morton was surely aware of this given his experience in calculating
means for expanding samples (i.e., Morton 1839 versus Morton 1849). Finally, Michael
 neglects to mention a more plausible reason: Morton’s 1849 summary table  was
clearly intended as a “progress report” of sorts, given that Morton was then engaged in a
more interpretive work based on his 1849 catalog. Morton’s 1849 table summarizes
means based on 623 specimens, while only circa 25 more cranial capacities are reported
in his catalog that arguably merit inclusion. These exclusions are typically of one or two
specimens per population group, rarely altering the overall mean to any noticeable
degree. As such, most crania measured by Morton 1849  were included in the table,
and the omissions appear largely inadvertent and/or inconsequential.
Michael’s Response to Gould’s Criticisms of Morton
Michael  considers Morton’s  erroneous reporting of his “American Group”
(Native American) mean as 82 in³ rather than 80 in³, which Gould  interprets as
evidence that Morton was attempting to place Native Americans above Africans. Michael
states, “If there had been a similar error reducing the ‘American’ mean, however, it could
also have been interpreted to indicate bias toward ‘Caucasians.’ This hypothetical bias is
subtly different from that spelled out by Gould but just as reasonable,” . First,
Morton’s error overstated, not understated, the Native American mean – strictly
hypothetical errors are not informative regarding Morton’s actual bias. Second, the bias
Gould  hypothesizes for Morton concerns the chain of being: whites above Native
Americans above blacks. A mistake in favor of whites, then, would have been as
consistent with Gould’s hypothesis as a mistake that elevates Native Americans above
blacks.
Michael claims that, “Gould’s statistical analysis would support his suspicion of
systematic mismeasurement only if Morton had the bias he attributes to him. Since I have
found no indication of that bias, and given the accuracy of Morton’s shot data, it seems
unlikely that Morton systematically mismeasured crania in 1839,” . First, Gould’s 
analysis was designed to detect Morton’s bias, rather than depending on the assumption
of such a bias. Second, as detailed above, Michael’s results  actually do contain
indications of Morton’s bias (though again, the differences between his samples and
Morton’s render the results highly questionable). Third, as noted above, Gould 
explicitly assumes that Morton’s shot data are accurate (though it actually contains some
mistakes, as detailed in Table 1 of our main text).
Michael  claims to have found a Morton error that is not in line with his
assumed racial bias: “the 1849 [Morton] table and my recalculations give the ‘Malay
group’ and Malayan group mean as 84 in³ even though the recalculated sample is larger
than Morton’s; simply averaging the recalculated means for the Malayan and Polynesian
subgroups produces a value 1 in³ lower,” . In fact, Morton  derived his overall
“Malay Group” mean by doing just that: averaging the means of the two subgroups,
“Malayan” and “Polynesian” at 86 and 83 in³ respectively, to yield the reported 85 in³
(actually, 84.5 in³). That Michael  gets an average of 84.0 in³ (so a mere 0.5 in³
different than Morton’s) when averaging his recalculated means of those two subgroups is
due to Michael’s inclusion of four crania not included in Morton’s means. This seems
unlikely to qualify as a notable error opposite the inferred direction of Morton’s bias.
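The arithmetic behind this comparison is short enough to spell out. The sketch below uses only figures quoted in the paragraph above; the variable names are ours.

```python
# Subgroup means reported by Morton (cubic inches), as quoted above.
morton_malayan = 86
morton_polynesian = 83

# Morton's "Malay Group" mean is the simple average of the two subgroup means.
morton_unrounded = (morton_malayan + morton_polynesian) / 2
morton_reported = 85  # the value printed in Morton's table

# Average of Michael's recalculated subgroup means, as quoted above.
michael_average = 84.0

gap_from_reported = morton_reported - michael_average
gap_from_unrounded = morton_unrounded - michael_average
```

The 1 in³ difference Michael describes is measured against Morton's rounded figure; against the unrounded average the difference is half that.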
Michael  then faults Gould  for rearranging the samples in Gould’s
reanalysis of Morton : “His [Gould’s] ‘African peoples,’ for example, do not include
Morton’s ‘Australians’ or ‘Hottentots,’” . In fact, Morton’s reported mean for his
“African Group” also clearly excludes Australians and “Hottentots” .
Michael’s final argument is that, “Although Gould is mistaken in many of his
assumptions about Morton and his work, he is correct in asserting that these tables are
scientifically unsound. He fails, however, to mention the overriding reason for rejecting
them, namely, Morton’s acceptance of the existence of race .... If race does not really
exist, then Morton’s samples are meaningless, and this criticism overshadows Gould’s
criticisms ...” . Michael  is certainly correct to question the biological validity of
human races, and Morton’s broader groupings (e.g., “American Group,” “Negro
Group”) are certainly of little to no biological significance . However, Morton’s
samples are potentially meaningful regardless of the invalidity of the race concept as
applied to humans: they document global variation and, ironically, are just the type of
data needed to demonstrate that discrete racial categories are inaccurate descriptions of
the pattern of modern human variation. Furthermore, Morton was increasingly careful to
record the specific population affinities of most crania in his collection [4,5,6]. Morton
did not merely label a cranium as “Indian,” for example, but typically provided a specific
tribal and geographic context, along with whatever information was known about the
group and/or individual . Morton’s data were, in a loose sense, the 19th-century version
of Howells’ craniometric data . As for Gould , his aim was twofold: first, to show
that, even using Morton’s categories (races), there was little if any difference in average
cranial capacity between those categories (other than that due to sex and stature); and
second, to show that Morton manipulated his data and analysis to support an a priori
notion of racial rankings in intelligence. For both aims, Morton’s use of racial groupings
is actually necessary for Gould’s analysis rather than precluding it.
In a similar vein, Michael charges that, “His [Morton’s] failure to define ‘race’
makes his work statistically meaningless,” . Morton did, in fact, define what he meant
by race, and did so in multiple publications [4,5], in quite clear terms: “It is necessary to
explain here what is meant by the word race ...,” .
Michael states that, “Contrary to Gould’s interpretation, I conclude that Morton’s
research was conducted with integrity,” and that, “there is no clear evidence that he
doctored these tables for any reason,” . While we come to largely similar conclusions
(see main text), Michael’s analysis does not support his conclusions. He only deals with
one of Morton’s tables . His remeasurements are compared to Morton’s shot data,
which were explicitly assumed by Gould  to be accurate. Michael’s remeasurements
are reported erroneously, lack specifics on individual comparisons, and are missing the
key data on the population affinity of potentially mis-measured specimens. Michael’s
recalculations  of the Morton means  are of questionable value given the
incongruent samples but, overlooking that, reveal the racial pattern of errors expected by
Gould , contra Michael . Michael’s defense of Morton against Gould’s claims
overlooks the most relevant charges made by Gould. While Michael is correct in pointing
out the biological invalidity of the race concept as applied to humans, this does not render
Morton’s data invalid, or trump Gould’s criticisms of Morton. Overall, Michael’s main
contribution seems to have been his identification of Morton’s handwritten correction of
the erroneous “American Group” mean in a copy of Crania Americana .
Brace  and Cook  have faulted Gould for not citing Michael’s work  in
his revised edition of The Mismeasure of Man . Given the flaws in Michael, it is just as
problematic to take Michael’s critique at face value as it is to take Gould’s analysis at face
value.
References
1. Michael JS (1988) A new look at Morton's craniological research. Current
Anthropology 29: 349-354.
2. Michael JS (1986) An Analysis of Samuel G. Morton’s Catalog of the Skulls of Man
and the Inferior Animals, Third Ed., Based on a Remeasurement of a Random Sample of
the Morton Collection of Human Crania. Unpublished Undergraduate Honor’s Thesis,
Department of Geology, Macalester College, St. Paul, Minnesota.
3. Lewis JE, DeGusta D, Meyer MR, Monge JM, Mann AW, Holloway RL (2011) The
mismeasure of science: Stephen Jay Gould versus Samuel George Morton on skulls and
bias. PLoS Biology 9(6): e1001071. doi:10.1371/.
4. Morton SG (1849) Catalogue of Skulls of Man and the Inferior Animals, Third
Edition. Philadelphia: Merrihew and Thomson Printers.
5. Morton SG (1839) Crania Americana; or, A Comparative View of the Skulls of
Various Aboriginal Nations of North and South America: to Which is Prefixed an Essay
on the Varieties of the Human Species. Philadelphia: J. Dobson.
6. Morton SG (1844) Crania Aegyptiaca; or, Observations on Egyptian Ethnography,
Derived from Anatomy, History, and the Monuments. Philadelphia: John Penington.
7. Gould SJ (1978) Morton's ranking of races by cranial capacity: Unconscious
manipulation of data may be a scientific norm. Science 200: 503-509.
8. Gould SJ (1981) The Mismeasure of Man. New York: W. W. Norton and Company.
9. Brace CL (2005) "Race" Is a Four-Letter Word: The Genesis of the Concept. New
York: Oxford University Press.
10. Cook DC (2006) The old physical anthropology and the New World: A look at the
accomplishments of an antiquated paradigm. In: Buikstra JE, Beck LA, editors.
Bioarchaeology: The Contextual Analysis of Human Remains. Amsterdam: Elsevier. pp.
11. Buikstra JE (2009) Introduction to the 2009 Reprint Edition of Crania Americana.
Davenport, Iowa: Gustav’s Library. pp. i-xxxvi.
12. Howells WW (1995) Who’s who in skulls: Ethnic identification of crania from
measurements. Papers of the Peabody Museum of Archaeology and Ethnology, Harvard
University 82: 1-108.
13. Gould SJ (1996) The Mismeasure of Man, Revised Edition. New York: W.W. Norton