by Jean Philippe Rushton (1997)
The Altruistic Personality
The most important and largest study of the problem of generality versus specificity in behavior concerned altruism. This is the classic “Character Education Inquiry” carried out by Hartshorne and May in the 1920s and published in three books (Hartshorne & May, 1928; Hartshorne, May, & Maller, 1929; Hartshorne, May, & Shuttleworth, 1930). These investigators gave 11,000 elementary and high school students some 33 different behavioral tests of altruism (referred to as the “service” tests), self-control, and honesty in home, classroom, church, play, and athletic contexts. Concurrently, ratings of the children’s reputations with teachers and classmates were obtained. Altogether, more than 170,000 observations were collected. Scores on the various tests were correlated to discover whether behavior is specific to situations or consistent across them.
This study is still regarded as a landmark that has not been surpassed by later work. It will be discussed in some detail because it is the largest examination of the question ever undertaken, it raises most of the major points of interest, and it has been seriously misinterpreted by many investigators. The various tests administered to the children are summarized in Table 2.1.
First, the results based on the measures of altruism showed that any one behavioral test of altruism correlated, on the average, only 0.20 with any other test. But when the five behavioral measures were aggregated into a battery, they correlated a much higher 0.61 with the measures of the child’s altruistic reputation among his or her teachers and classmates. Furthermore, the teachers’ and peers’ perceptions of the students’ altruism were in close agreement (r = 0.80). These latter results indicate a considerable degree of consistency in altruistic behavior. In this regard, Hartshorne et al. (1929:107) wrote:
The correlation between the total service score and the total reputation scores is .61 … Although this seems low, it should be borne in mind that the correlations between test scores and ratings for intelligence seldom run higher than .50.
Similar results were obtained for the measures of honesty and self-control. Any one behavioral test correlated, on average, only 0.20 with any other test. If, however, the measures were aggregated into batteries, then much higher relationships were found either with other combined behavioral measures, with teachers’ ratings of the children, or with the children’s moral knowledge scores. Often, these correlations were on the order of 0.50 to 0.60. For example, the battery of tests measuring cheating by copying correlated 0.52 with another battery of tests measuring other types of classroom cheating. Thus, depending on whether the focus is on the relationship between individual measures or on the relationship between averaged groups of behaviors, the notions of situational specificity and situational consistency are both supported. Which of these two conclusions is more accurate?
Hartshorne and colleagues focused on the small correlations of 0.20 and 0.30. Consequently, they argued (1928: 411) for a doctrine of specificity:
Neither deceit nor its opposite, “honesty” are unified character traits, but rather specific functions of life situations. Most children will deceive in certain situations and not in others. Lying, cheating, and stealing as measured by the test situations used in these studies are only very loosely related.
Their conclusions and data have often been cited in the subsequent literature as supporting situational specificity. For example, Mischel’s (1968) influential review argued for specificity on the ground that contexts are important and that people have different methods of dealing with different situations.
Unfortunately, Hartshorne and May (1928-30), P. E. Vernon (1964), Mischel (1968), and many others, including me (Rushton, 1976), seriously overinterpreted the results as implying that there was not enough cross-situational consistency to make the concept of traits very useful. This, however, turned out to be wrong. By focusing on correlations of 0.20 and 0.30 between any two measures, a misleading impression is created. A more accurate picture is obtained by examining the predictability achieved from a number of measures. This is because the randomness in any one measure (error and specificity variance) is averaged out over several measures, leaving a clearer view of what a person’s true behavior is like. Correlations of 0.50 and 0.60 based on aggregated measures support the view that there is cross-situational consistency in altruistic and honest behavior.
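The statistical logic of aggregation can be illustrated with a small simulation (a hypothetical sketch, not Hartshorne and May’s data): each child is given a stable latent disposition, each behavioral test adds a large situational and error component, and summing tests into a battery averages that noise away.

```python
import random
import statistics

def corr(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

random.seed(1)
n_children, battery_size = 5000, 5

# Latent disposition (e.g., honesty) for each simulated child.
trait = [random.gauss(0, 1) for _ in range(n_children)]

def behavioral_test():
    # One test = disposition + heavy situational noise, scaled so
    # that any two single tests correlate about 0.20.
    return [t + random.gauss(0, 2) for t in trait]

def battery():
    # A battery = the sum of five independent tests per child.
    tests = (behavioral_test() for _ in range(battery_size))
    return [sum(scores) for scores in zip(*tests)]

r_single = corr(behavioral_test(), behavioral_test())
r_battery = corr(battery(), battery())
print(round(r_single, 2))   # near 0.20
print(round(r_battery, 2))  # near 0.56
```

The battery-level figure agrees with the Spearman-Brown prophecy for five items with pairwise r = 0.20: 5(.2)/(1 + 4(.2)) ≈ 0.56, close to the 0.50-0.60 aggregated correlations reported above.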
Further evidence for this conclusion is found in Hartshorne and May’s data. Examination of the relationships between the battery of altruism tests and batteries concerned with honesty, self-control, persistence, and moral knowledge suggested a factor of general moral character (see, e.g., Hartshorne et al., 1930: 230, Table 32). Maller (1934) was one of the first to note this. Using Spearman’s tetrad difference technique, Maller isolated a common factor in the intercorrelations of the character tests of honesty, altruism, self-control, and persistence. Subsequently, Burton (1963) reanalyzed the Hartshorne and May data and found a general factor that accounted for 35-40 percent of the common variance.
As Eysenck (1970), among others, has repeatedly pointed out, failures to take account of the necessity to average across a number of exemplars in order to see consistency led to the widespread and erroneous view that moral behavior is almost completely situation specific. This, in turn, led students of moral development to neglect research aimed at discovering the origins of general moral “traits”. The fact that, judging from the aggregated correlational data, moral traits do exist, and, moreover, appear to develop early in life, poses a considerable challenge to developmental research.
One traditionally important source of data has been the judgments and ratings of people made by their teachers and peers. In recent years, judges’ ratings have been much maligned on the ground that they are little more than “erroneous constructions of the perceiver.” This pervasive view has led to a disenchantment with the use of ratings. The main empirical reason cited for rejecting rating methods is that judges’ ratings only correlate, on the average, 0.20 to 0.30. However, it is questionable whether correlations between two judges’ ratings are stable and representative. The validity of judgments increases as the number of judges becomes larger.
Galton (1908) provided an early demonstration from a cattle exhibition where 800 visitors judged the weight of an ox. He found that the individual estimates were distributed in such a way that 50 percent fell between plus or minus three percent of the middlemost value, which was itself within one percent of the real value. Galton likened the results to the votes given in a democracy where, with the reservation that the voters be versed in the issues, the vox populi was correct. Shortly thereafter, K. Gordon (1924) had subjects order a series of objects by weight. When the number of subjects making the judgment increased from 1 to 5 to 50, the corresponding validities increased from 0.41 to 0.68 to 0.94.
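Gordon’s result follows from the same averaging logic. A toy simulation makes the point (the error variance here is chosen to roughly mimic her validities; it is not her actual data): each judge’s estimate is the true weight plus idiosyncratic error, and pooling judges averages the error away.

```python
import random
import statistics

def corr(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

random.seed(2)
n_objects = 1000
true_weight = [random.gauss(50, 10) for _ in range(n_objects)]

def one_judge():
    # Each judge's estimate = truth + idiosyncratic error.
    return [w + random.gauss(0, 22) for w in true_weight]

validity = {}
for k in (1, 5, 50):
    pooled = [statistics.mean(e) for e in zip(*(one_judge() for _ in range(k)))]
    validity[k] = corr(true_weight, pooled)
    print(k, round(validity[k], 2))  # roughly 0.4, 0.7, 0.95
```

With these noise settings a single judge achieves a validity near 0.4, five judges near 0.7, and fifty judges above 0.9, paralleling Gordon’s 0.41, 0.68, and 0.94.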
The question of cross-situational consistency becomes a question about longitudinal consistency when the time dimension is introduced. To what extent, over both time and situation, do a person’s behaviors stem from enduring traits of character? When studies measure individual differences by aggregating over many different assessments, longitudinal stability is usually found. But when single measurements or other less reliable techniques are used, longitudinal stability is less marked.
Intelligence is the trait with the strongest stability over time. The ordering of an individual relative to his or her age cohort over the teenage and adult years shows typical correlations of 0.62 to 0.94 over 7 to 40 years (Brody, 1992). The trend is for the correlations to decline as the period of time between administrations of the test increases. But the correlations can be increased by further aggregation. For example, the combined score from tests administered at ages 10, 11, and 12 correlate 0.96 with a combined score from tests administered at ages 17 and 18 (Pinneau, 1961). This latter finding suggests that there was initially no change at all in an individual’s score relative to his or her cohorts over the high school years.
Intelligence in infancy, however, is either slightly less stable or somewhat less easy to measure. The correlation between a composite of tests taken from 12 to 24 months and the composite at ages 17 and 18 is around 0.50 (Pinneau, 1961). Newer techniques based on infant habituation and recognition memory (the infant’s response to a novel or familiar stimulus) made in the first year of life predict later IQ assessed between 1 and 8 years of age with a sample-size-weighted average correlation of between 0.36 and 0.45 (McCall & Carriger, 1993).
The stability of personality has been demonstrated over several 30-year longitudinal studies. To summarize these, Costa and McCrae (1994:21) quote William James (1890/1981), saying that once adulthood has been reached, personality is “set like plaster.” At younger ages, personality stability was demonstrated by Jack Block (1971, 1981) in work where the principle of aggregation was strictly adhered to. For about 170 individuals data were first obtained in the 1930s when the subjects were in their early teens. Further data were gathered when the subjects were in their late teens, in their mid-30s, and in their mid-40s. The archival data so generated were enormously wide-ranging and often not in a form permitting of direct quantification. Block systematized the data by employing clinical psychologists to study individual dossiers and to rate the subject’s personality using the Q-sort procedure — a set of descriptive statements such as “is anxious,” which can be sorted into piles that indicate how representative the statement is of the subject. To ensure independence, the materials for each subject were carefully segregated by age level, and no psychologist rated the materials for the same subject at more than one time period. The assessments by the different raters (usually three for each dossier) were found to agree with one another to a significant degree, and they were averaged to form an overall description of the subject at that age.
Block (1971, 1981) found personality stability across the ages tested. Even the simple correlations between Q-sort items over the 30 years between adolescence and the mid-40s provided evidence for stability. Correlations indicating stability were, for example, for the male sample: “genuinely values intellectual and cognitive matters,” 0.58; “is self-defeating,” 0.46; and “has fluctuating moods,” 0.40; for the female sample, “is an interesting, arresting person,” 0.44; “aesthetically reactive,” 0.41; and “is cheerful,” 0.36. When the whole range of variables for each individual was correlated over 30 years, the mean correlation was 0.31. When typologies were created, the relationships became even more substantial.
Using self-reports instead of judgments made by others, Conley (1984) analyzed test-retest data from 10 to 40 years for major dimensions of personality such as extraversion, neuroticism, and impulsivity. The correlations in different studies ranged from 0.26 to 0.84 for periods extending from 10 to 40 years, with an average of about 0.45 for the 40-year period. Overall the personality traits were only slightly less consistent over time than were measures of intelligence (0.67, in this study).
Longitudinal stability has been cross-validated using different procedures. Thus, one method is used to assess personality at Time 1 (e.g., ratings made by others) and a quite different method at Time 2 (e.g., behavioral observations). Olweus (1979), for example, reported correlations of 0.81 over a 1-year time period between teacher ratings of the aggressive behavior of children and frequency count observations of the actual aggressive behavior. Conley (1985) reported correlations of about 0.35 between ratings made by a person’s acquaintances as they were about to get married and self-reports made some 20 years later.
In a 22-year study of the development of aggression, Eron (1987) found that children rated as aggressive by their peers when they were 8 years old were rated as aggressive by a different set of peers 10 years later and were three times more likely to have acquired a police record by the time they were 19 than those not so rated. By age 30, these children were more likely to have engaged in a syndrome of antisocial behavior including criminal convictions, traffic violations, child and spouse abuse, and physical aggressiveness outside the family. Moreover, the stability of aggression was found to exist across three generations, from grandparents to children to grandchildren. The 22-year stability of aggressive behavior was 0.50 for men and 0.35 for women.
Also in the 22-year data, early ratings of prosocial behavior were positively related to later prosocial behavior and negatively related to later antisocial behavior. Children rated as concerned about interpersonal relations at age 8 showed higher occupational and educational attainment, lower aggression, greater social success, and better mental health, whereas aggression at age 8 predicted social failure, psychopathology, aggression, and low educational and occupational success. In all of these analyses, social class was held constant. Eron’s (1987) data suggested that aggression and prosocial behavior are at two ends of a continuum (see Figure 2.3).
The general conclusion is that once people reach the age of 30 there is little change in the major dimensions of personality. McCrae and Costa (1990; Costa & McCrae, 1992) reviewed six longitudinal studies published between 1978 and 1992, including two of their own. The six had quite different samples and rationales but came to the same conclusions. Basic tendencies typically stabilized somewhere between 21 and 30. Retest measures for both self-reports and ratings made by others are typically about 0.70. Moreover, anything these dimensions affect stabilizes as well, such as self-concept, skills, interests, and coping strategies.
Although a great deal of effort has gone into refining paper and pencil and other techniques for measuring attitudes, personality, and intelligence, relatively little attention has been given to the adequacy of measurements on the behavioral end of the relationship. Whereas the person end of the person-behavior relationship has often been measured by multi-item scales, the behavior to be predicted has often comprised a single act.
Fishbein and Ajzen (1974) proposed that multiple-act criteria be used on the behavioral side. Using a variety of attitude scales to measure religious attitudes and a multiple-item religious behavior scale, they found that attitudes were related to multiple-act criteria but had no consistent relationship to single-act criteria. Whereas the various attitude scales had a mean correlation with single behaviors ranging from 0.14 to 0.19, their correlations with aggregated behavioral measures ranged from 0.70 to 0.90.
In a similar paper to Fishbein and Ajzen’s, Jaccard (1974) carried out an investigation to determine whether the dominance scales of the California Psychological Inventory and the Personality Research Form would predict self-reported dominance behaviors better in the aggregate than they would at the single-item level. The results were in accord with the aggregation expectations. Whereas both personality scales had a mean correlation of 0.20 with individual behaviors, the aggregated correlations were 0.58 and 0.64.
Comparable observations were made by Eaton (1983), who assessed activity level in three- and four-year-olds using single versus multiple actometers attached to the children’s wrists as the criterion and teachers’ and parents’ ratings of the children’s activity level as the predictors. The ratings predicted activity scores from single actometers relatively weakly (0.33) while predicting those aggregated across multiple actometers comparatively well (0.69).
One Problem with Experimental Studies
Failures to aggregate dependent variables in experimental situations may produce conclusions about the relative modifiability of behavior that may be incorrect. For example, with respect to social development, it is considered well established that observational learning from demonstrators has powerful effects on social behavior (Bandura, 1969, 1986). These findings have prompted governmental concern about possible inadvertent learning from television. Concerning intellectual development, it is equally well known that intervention programs designed to boost children’s intelligence, some of them employing observational learning, have achieved only modest success (Brody, 1992; Locurto, 1991).
The apparent difference in the relative malleability of social and intellectual development has been explained in various ways. One leading interpretation is that intellectual development is controlled by variables that are “structural” and, therefore, minimally susceptible to learning, whereas social development is controlled by variables that are “motivational” and, therefore, more susceptible to learning. An analysis of the dependent variables used in the two types of studies, however, suggests an interpretation based on the aggregation principle.
In observational learning studies, a single dependent variable is typically used to measure the behavior; for example, the number of punches delivered to a Bobo doll in the case of aggression (Bandura, 1969) or the number of tokens donated to a charity in the case of altruism (Rushton, 1980). In intellectual training studies, however, multiple-item dependent variables such as standardized intelligence tests are typically used. Throughout this discussion it has been stressed that the low reliability of nonaggregated measures can mask strong underlying relationships between variables. In the case of learning studies, it can have essentially the opposite effect. It is always easier to produce a change in some trait as a consequence of learning when a single, less stable measure of the trait is taken than when more stable, multiple measures are taken. This fact may explain why social learning studies of altruism have generally been more successful than training studies of intellectual development.
The degree to which various tests are correlated with g, or are “g-loaded,” can be determined by factor analysis, a statistical procedure for grouping items. […] Most conventional tests of mental ability are highly g-loaded, although they usually measure some admixture of other factors in addition to g, such as verbal, spatial, and memory abilities, as well as acquired information of a scholastic nature (Brody, 1992). Test scores with the g factor statistically removed have virtually no predictive power for scholastic performance. Hence, it is the g factor that is the “active ingredient.” The predictive validity of g applies also to performance in nearly all types of jobs. Occupations differ in their complexity and g demands as much as do mental tests; the more complex the job, the better cognitive ability predicts performance on it (e.g., managers and professionals 0.42 to 0.87, sales clerks and vehicle operators 0.27 to 0.37; see Hunter, 1986, Table 1; Hunter & Hunter, 1984).
Chapter 3 : Behavioral Genetics
The Heritability of Behavior
Anthropometric and Physiological Traits
Height, weight, and other physical attributes provide a point of comparison to behavioral data. Not surprisingly, they are usually highly heritable, with genes accounting for 50 to 90 percent of the variance. These results are found in studies of both twins and adoptees (e.g., Table 3.1). The genes also account for large portions of the variance in physiological processes such as rate of breathing, blood pressure, perspiration, pulse rate, and EEG-measured brain activity.
Obesity was studied in a sample of 540 42-year-old Danish adoptees selected so that the age and sex distribution was the same in each of four weight categories: thin, medium, overweight, and obese (Stunkard et al., 1986). Biological and adoptive parents were contacted and their current weight assessed. The weight of the adoptees was predicted from that of their biological parents but not at all from that of the adoptive parents with whom they had been raised. The relation between biological parents and adoptees was present across the whole range of body fatness — from very thin to very fat. Thus, genetic influences play an important role in determining human fatness, whereas the family environment alone has no apparent effect. This latter result, of course, is at odds with popular views. Subsequent evidence shows significant genetic transmission of obesity in black as well as in white families (Ness, Laskarzewski, & Price, 1991).
Testosterone is a hormone mediating many bio-behavioral variables in both men and women. Its heritability was examined in 75 pairs of MZ twins and 88 pairs of DZ twins by Meikle, Bishop, Stringham, & West (1987). They found that genes regulated 25 to 76 percent of the plasma content of testosterone, estradiol, estrone, 3 alpha-androstanediol glucuronide, free testosterone, luteinizing hormone, follicle-stimulating hormone, and other factors affecting testosterone metabolism.
Several investigators have found activity level to be heritable from infancy onward (Matheny, 1983). In one study, activity in 54 identical and 39 fraternal twins aged 3 to 12 years was assessed with behaviors like “gets up and down” while “watching television” and “during meals” (Willerman, 1973). The correlation for identical twins was 0.88 and for fraternal twins was 0.59, yielding a heritability of 58 percent. An investigation of 181 identical and 84 fraternal twins from 1 to 5 years of age using parent ratings found correlations for a factor of zestfulness of 0.78 for identical and 0.54 for fraternal twins, yielding a heritability of 48 percent (Cohen, Dibble, & Grawe, 1977). Data from a Swedish sample aged 59 years and including 424 twins reared together and 315 twins reared apart showed the heritability for activity level in this older sample to be 25 percent (Plomin, Pedersen, McClearn, Nesselroade, & Bergeman, 1988).
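Twin heritabilities like those just quoted are commonly derived from Falconer’s formula, which doubles the difference between the identical- and fraternal-twin correlations. A minimal sketch, applied to the correlations reported above:

```python
def falconer_h2(r_mz, r_dz):
    """Falconer's estimate of heritability from twin correlations:
    h^2 = 2 * (r_MZ - r_DZ)."""
    return 2 * (r_mz - r_dz)

# Willerman (1973), activity level: MZ r = .88, DZ r = .59
print(round(falconer_h2(0.88, 0.59), 2))  # 0.58

# Cohen, Dibble, & Grawe (1977), zestfulness: MZ r = .78, DZ r = .54
print(round(falconer_h2(0.78, 0.54), 2))  # 0.48
```

The two results reproduce the 58 percent and 48 percent heritabilities cited in the text; the formula assumes additive genetic effects and equal environments for the two kinds of twins.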
Altruism and Aggression
Several twin studies have been conducted on altruism and aggression. Loehlin and Nichols (1976) carried out cluster analyses of self-ratings made by 850 adolescent pairs on various traits. Clusters labeled kind, argumentative, and family quarrel showed the monozygotic twins to be about twice as much alike as the dizygotic twins, with heritabilities from 20 to 42 percent. Matthews, Batson, Horn, and Rosenman (1981) analyzed adult twin responses to a self-report measure of empathy and estimated a heritability of 72 percent. In the Minnesota adoption study of twins raised apart, summarized in Table 3.1, the correlations for 44 pairs of identical twins reared apart are 0.46 for aggression and 0.53 for traditionalism, a measure of following rules and authority (Tellegen et al., 1988).
In a study of 573 identical and fraternal adult twin pairs reared together, all of the twins completed separate questionnaires measuring altruistic and aggressive tendencies. The questionnaires included a 20-item self-report altruism scale, a 33-item empathy scale, a 16-item nurturance scale, and many items measuring aggressive dispositions. As shown in Table 3.2, 50 percent of the variance on each scale was associated with genetic effects, virtually 0 percent with the twins’ common environment, and the remaining 50 percent with each twin’s specific environment. When the estimates were corrected for unreliability of measurement, the genetic contribution increased to 60 percent (Rushton, Fulker, Neale, Nias, & Eysenck, 1986).
At 14 months of age, empathy was assessed in 200 pairs of twins by the child’s response to feigned injury by the experimenter and the mother (Emde et al., 1992). Ratings were based on the strength of concern expressed in the child’s face, the level of emotional arousal expressed in the child’s body, and prosocial intervention by the child (e.g., comforting by patting the victim or bringing the victim a toy). About 36 percent of the variance was estimated to be genetic.
Although social, political and religious attitudes are often thought to be environmentally determined, a twin study by Eaves and Eysenck (1974) found that radicalism-conservatism had a heritability of 54 percent, tough-mindedness had a heritability of 54 percent, and the tendency to voice extreme views had a heritability of 37 percent. In a review of this and two other British studies of conservatism, Eaves and Young (1981) found for 894 pairs of identical twins an average correlation of 0.67 and for 523 fraternal twins an average correlation of 0.52, yielding an average heritability of 30 percent.
In a cross-national study, 3,810 Australian twin pairs reared together reported their response to 50 items of conservatism such as death penalty, divorce, and jazz (Martin et al., 1986). The heritabilities ranged from 8 percent to 51 percent (see Table 4.4, next chapter). Overall correlations of 0.63 and 0.46 were found for identical and fraternal twins, respectively, yielding a heritability of 34 percent. Correcting for the high assortative mating that occurs on political attitudes raised the overall heritability to about 50 percent. Martin et al. (1986) also replicated the analyses by Eaves and Eysenck (1974) on the heritability of radicalism and tough-mindedness.
Religious attitudes also show genetic influence. Although Loehlin and Nichols (1976) found no genetic influences on belief in God or involvement in organized religious activities in their study of 850 high school twins, when religiosity items were aggregated with other items, such as present religious preference, then a genetic contribution of about 20 percent became observable (Loehlin & Nichols, 1976, Table 4-3, Cluster 15). Using a more complete assessment battery, including five well-established scales of religious attitudes, interests and values, and estimates of heritability from twins reared apart as well as together, the Minnesota study estimated the genetic contribution to the variance in their instruments to be about 50 percent (Table 3.1; also Waller, Kojetin, Bouchard, Lykken, & Tellegen, 1990).
The earliest twin study of criminality was published in 1929 in Germany by Johannes Lange. Translated into English in 1931, Crime as Destiny reported on the careers of a number of criminal twins, some of them identical, others fraternal, shortly after the distinction between the two kinds had become generally accepted. Lange compared the concordance rates for 13 monozygotic and 17 dizygotic pairs of twins in which at least 1 had been convicted of a criminal offense. Ten of the 13 monozygotic pairs (77 percent) were concordant, whereas only 2 of the 17 dizygotic pairs (12 percent) were concordant. A summary of Lange’s (1931) study and of the literature up to the 1960s was provided by Eysenck and Gudjonsson (1989). For 135 monozygotic twins the concordance rate was 67 percent and for 135 dizygotic twins, 30 percent.
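Pairwise concordance is simply the share of affected pairs in which the co-twin is also affected, so Lange’s percentages follow directly from his counts. A small sketch:

```python
def concordance(both_affected, total_pairs):
    """Pairwise concordance: proportion of twin pairs with at least
    one affected member in which both members are affected."""
    return both_affected / total_pairs

# Lange (1929): 10 of 13 MZ pairs and 2 of 17 DZ pairs concordant
print(round(100 * concordance(10, 13)))  # 77
print(round(100 * concordance(2, 17)))   # 12
```

The much higher identical-twin concordance (77 versus 12 percent) is what carries the genetic inference, subject to the usual equal-environments assumption.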
Among subsequent studies is an investigation of the total population of 3,586 male twin pairs born on the Danish Islands from 1881 to 1910, recording serious offenses only. For this nonselected sample, identical and fraternal twin concordances are 42 percent versus 21 percent for crimes against persons and 40 percent versus 16 percent for crimes against property (Christiansen, 1977). Three small studies carried out in Japan showed similar concordance rates to those in the West (see Eysenck & Gudjonsson, 1989: 97-99).
Replicating the concordance ratios based on official statistics are those from studies based on self-reports. Sending questionnaires by mail to 265 adolescent twin pairs, Rowe (1986) sampled the eighth through twelfth grades in almost all the school districts of Ohio. The results showed that identical twins were roughly twice as much alike in their criminal behavior as fraternal twins, the heritability being about 50 percent.
Converging with the twin work are the results from several American, Danish, and Swedish adoption studies. Children who were adopted in infancy were at greater risk for criminal convictions if their biological parents had been so convicted than if their adoptive parents had been. For example, in the Danish study, based on 14,427 adoptees, for 2,492 adopted sons who had neither adoptive nor biological criminal parents, 14 percent had at least one criminal conviction. For 204 adopted sons whose adoptive (but not biological) parents were criminals, 15 percent had at least one conviction. If biological (but not adoptive) parents were criminal, 20 percent (of 1,226) adopted sons had criminal records; if both biological and adoptive parents were criminal, 25 percent (of 143) adopted sons were criminals. In addition, it was found that siblings raised apart showed 20 percent concordance and that half-siblings showed 13 percent concordance while pairs of unrelated children reared together in the same adoptive families showed 9 percent concordance (Mednick, Gabrielli, & Hutchings, 1984).
The largest heritability study of emotional reactivity, or the speed of arousal to fear and anger, was carried out by Floderus-Myrhed, Pedersen, and Rasmuson (1980). They administered the Eysenck Personality Inventory to 12,898 adolescent twin pairs of the Swedish Twin Registry. The heritability for neuroticism was 50 percent for men and 58 percent for women. Another large twin study, carried out in Australia, involving 2,903 twin pairs, found identical and fraternal twin correlations of 0.50 and 0.23 for neuroticism (Martin & Jardine, 1986). The opposite side of the neuroticism continuum, emotional stability, as measured by the California Psychological Inventory’s Sense of Well-Being scale, is also found to have a significant heritability, both in adolescence and 12 years later (Dworkin et al., 1976).
The studies of twins raised apart substantiate the genetic contribution to a neuroticism “superfactor.” In the Minnesota study (Table 3.1), the correlation for the 44 MZA twins is 0.61 for the trait of stress reaction, 0.48 for alienation, and 0.49 for harm avoidance (Tellegen et al., 1988). In a Swedish study of 59-year-olds the correlation for emotionality in 90 pairs of identical twins reared apart is 0.30 (Plomin et al., 1988). Other adoption studies also confirm that the familial resemblance for neuroticism is genetically based. In a review of three adoption studies, the average correlation for nonadoptive relatives was about 0.15 and the average correlation for adoptive relatives was nearly zero, suggesting a heritability estimate of about 0.30 (Henderson, 1982).
It is the g factor that is the most heritable component of intelligence tests. In Bouchard et al.’s study (Table 3.1) the g factor, the first principal component extracted from several mental ability tests, had the highest heritability (78 percent). Similarly in Pedersen et al.’s (1992) study, the first principal component had a heritability of 80 percent whereas the specific abilities averaged around 50 percent.
Remarkably, the strength of the heritability varies directly as a result of a test’s g loading. Jensen (1983) found a correlation of 0.81 between the g loadings of the 11 subtests of the Wechsler Intelligence Scale for Children and heritability strength assessed by genetic dominance based on inbreeding depression scores from cousin marriages in Japan. Inbreeding depression is defined as a lowered mean of the trait relative to the mean in a non-inbred population and is especially interesting because it indicates genetic dominance, which arises when a trait confers evolutionary fitness.
Jensen took the figures on inbreeding depression from a study by Schull and Neel (1965) who calculated them from 1,854 7- to 10-year-old Japanese children. Since about 50 percent of the sample involved cousin marriages, it was possible to assess the inbreeding depression on each subtest, expressed as the percentage decrement in the score per 10 percent increase in degree of inbreeding. These were calculated after statistically controlling for child’s age, birth rank, month of examination, and eight different parental variables, mostly pertaining to SES. The complement of inbreeding depression was found by Nagoshi and Johnson (1986) who observed “hybrid vigor” in offspring of Caucasoid-Mongoloid matings in Hawaii.
Subsequently, Jensen (1987a) reported rank order correlations of 0.55 and 0.62 between estimates of genetic influence from two twin studies and the g loadings of the Wechsler Adult Intelligence Scale subtests, and P. A. Vernon (1989) found a correlation of 0.60 between the heritabilities of a variety of speed of decision time tasks and their relationship with the g loadings from a psychometric test of general intelligence. More detailed analyses showed that the relationships among the speed and IQ measures are mediated entirely by hereditary factors. Thus, there are common biological mechanisms underlying the association between reaction time and information-processing speed and mental ability (Baker, Vernon, & Ho, 1991).
Heritabilities for mental ability have been examined within black and Oriental populations. A study by Scarr-Salapatek (1971) suggested the heritability might be lower for black children than for white children. Subsequently, Osborne (1978, 1980) reported heritabilities of greater than 50 percent both for 123 black and for 304 white adolescent twin pairs. Japanese data for 543 monozygotic and 134 dizygotic twins tested for intelligence at the age of 12 gave correlations of 0.78 and 0.49 respectively, indicating a heritability of 58 percent (R. Lynn & Hattori, 1990).
Years of education, occupational status, and other indices of socioeconomic status correlate with intelligence at greater than 0.50 (Jensen, 1980a). All of these have also been shown to be heritable. For example, a study of 1,900 pairs of 50-year-old male twins yielded MZ and DZ twin correlations of 0.42 and 0.21, respectively, for occupational status, and 0.54 and 0.30 for income (Fulker & Eysenck, 1979; Taubman, 1976). An adoption study of occupational status yielded a correlation of 0.20 between biological fathers and their adult adopted-away sons (2,467 pairs; Teasdale, 1979). A study of 99 pairs of adopted-apart siblings yielded a correlation of 0.22 (Teasdale & Owen, 1981). All of these are consistent with a heritability of about 40 percent for occupational status. Years of schooling also shows substantial genetic influence; for example, MZ and DZ twin correlations are typically about 0.75 and 0.50 respectively, suggesting that heritability is about 50 percent (e.g., Taubman, 1976).
Longevity and Health
Work on the genetics of longevity and senescence was pioneered by Kallman and Sander (1948, 1949). These authors carried out a survey in New York of over 1,000 pairs of twins aged 60 years or older and found that intra-pair differences for longevity, disease, and general adjustment to the aging process were consistently smaller for identical twins than for fraternal twins. For example, the average intra-pair difference in life span was 37 months for identical twins and 78 months for fraternal twins. In an adoption study of all 1,003 nonfamilial adoptions formally granted in Denmark between 1924 and 1947, age of death in the adult adoptees was predicted better by knowledge of the age of death in the biological parent than by knowledge of the age of death in the adopting parent (Sorensen, Nielsen, Andersen, & Teasdale, 1988).
Many individual difference variables associated with health are heritable. Genetic influences have been found for blood pressure, obesity, resting metabolic rate, behavior patterns such as smoking, alcohol use, and physical exercise, as well as susceptibility to infectious diseases. There is also a genetic component of from 30 to 50 percent for hospitalized illnesses in the pediatric age group including pediatric deaths (Scriver, 1984).
Numerous studies have shown substantial genetic influences on reading disabilities, mental retardation, schizophrenia, affective disorders, alcoholism, and anxiety disorders. In a now classic early study, adopted-away offspring of hospitalized chronic schizophrenic women were interviewed at the average age of 36 and compared to matched adoptees whose birth parents had no known psychopathology (Heston, 1966). Of 47 adoptees whose biological parents were schizophrenic, 5 had been hospitalized for schizophrenia. None of the adoptees in the control group was schizophrenic. Studies in Denmark confirmed this finding and also found evidence for genetic influence when researchers started with schizophrenic adoptees and then searched for their adoptive and biological relatives (Rosenthal, 1972; Kety, Rosenthal, Wender, & Schulsinger, 1976). A major review of the genetics of schizophrenia has been presented by Gottesman (1991).
Alcoholism also runs in families such that about 25 percent of the male relatives of alcoholics are themselves alcoholics, as compared with less than 5 percent of the males in the general population. In a Swedish study of middle-aged twins who had been reared apart, twin correlations for total alcohol consumed per month were 0.71 for 120 pairs of identical twins reared apart and 0.31 for 290 pairs of fraternal twins reared apart (Pedersen, Friberg, Floderus-Myrhed, McClearn, & Plomin, 1984). A Swedish adoption study of males found that 22 percent of the adopted-away sons of biological fathers who abused alcohol were alcoholic (Cloninger, Bohman, & Sigvardsson, 1981).
A questionnaire study of twins found genetic influence on strength of sex drive, which in turn predicted age of first sexual intercourse, frequency of intercourse, number of sexual partners, and type of position preferred (Eysenck, 1976; Martin, Eaves, & Eysenck, 1977). Divorce, or at least the factors leading to it, is also heritable. Based on a survey of more than 1,500 twin pairs, their parents, and their spouses’ parents, McGue and Lykken (1992) calculated a 52 percent heritability. They suggested the propensity was mediated through other heritable traits relating to sexual behavior, personality, and personal values.
Perhaps the most frequently cited study of the genetics of sexual orientation is that of Kallman (1952), in which he reported a concordance rate of 100 percent among homosexual MZ twins. Bailey and Pillard (1991) estimated the genetic component to male homosexuality to be about 50 percent. They recruited subjects through ads in gay publications and received usable questionnaire responses from 170 twin or adoptive brothers. Fifty-two percent of the identical twins, 22 percent of the fraternal twins, and 11 percent of the adoptive brothers were found to be homosexual. The distribution of sexual orientation among identical co-twins of homosexuals was bimodal, implying that homosexuality is taxonomically distinct from heterosexuality.
Subsequently, Bailey, Pillard, Neale, and Agyei (1993) carried out a twin study of lesbians and found that here, too, genes accounted for about half the variance in sexual preferences. Of the relatives whose sexual orientation could be confidently rated, 34 (48 percent) of 71 monozygotic co-twins, 6 (16 percent) of 37 dizygotic cotwins, and 2 (6 percent) of 35 adoptive sisters were homosexual.
In one large study, Floderus-Myrhed et al. (1980) gave the Eysenck Personality Inventory to 12,898 adolescent twin pairs of the Swedish Twin Registry. The heritability for extraversion, highly related to sociability, was 54 percent for men and 66 percent for women. Another large study of extraversion, involving 2,903 Australian twin pairs, found identical and fraternal twin correlations of 0.52 and 0.17, with a resultant heritability of 70 percent (Martin & Jardine, 1986). In a Swedish study of middle-aged twins reared apart, the correlation for sociability in 90 pairs of identical twins was 0.20 (Plomin et al., 1988).
Sociability and the related construct of shyness show up at an early age. In a study of 200 pairs of twins, Emde et al. (1992) found both sociability and shyness to be heritable at 14 months. Ratings of videotapes made of reactions to arrival at the home and the laboratory and other novel situations, such as being offered a toy, along with ratings made by both parents showed heritabilities ranging from 27 to 56 percent.
Values and Vocational Interests
Loehlin and Nichols’s (1976) study of 850 twin pairs raised together provided evidence for the heritability of both values and vocational interests. Values such as the desire to be well-adjusted, popular, and kind, or having scientific, artistic, and leadership goals were found to be genetically influenced. So were a range of career preferences including those for sales, blue-collar management, teaching, banking, literature, military, social service, and sports.
As shown in Table 3.1, Bouchard et al. (1990) reported that, on measures of vocational interest, the correlations for their 40 identical twins raised apart are about 0.40. Additional analyses from the Minnesota Study of Twins Reared Apart suggest the genetic contribution to work values is pervasive. One comparison of reared-apart twins found a 40 percent heritability for preference for job outcomes such as achievement, comfort, status, safety, and autonomy (Keller, Bouchard, Arvey, Segal, & Dawis, 1992). Another study of MZAs indicated a 30 percent heritability for job satisfaction (Arvey, Bouchard, Segal, & Abraham, 1989).
Genetic Similarity Theory
The Paradox of Altruism
The resolution of the paradox of altruism is one of the triumphs that led to the new synthesis called sociobiology. By a process known as kin selection, individuals can maximize their inclusive fitness rather than only their individual fitness by increasing the production of successful offspring by both themselves and their genetic relatives (Hamilton, 1964). According to this view, the unit of analysis for evolutionary selection is not the individual organism but its genes. Genes are what survive and are passed on, and some of the same genes will be found not only in direct offspring but in siblings, cousins, nephews/nieces, and grandchildren. If an animal sacrifices its life for its siblings’ offspring, it ensures the survival of common genes because, by common descent, it shares 50 percent of its genes with each sibling and 25 percent with each sibling’s offspring.
Thus, the percentage of shared genes helps determine the amount of altruism displayed. Social ants are particularly altruistic because of a special feature of their reproductive system that gives them 75 percent of their genes in common with their sisters. Ground squirrels emit more warning calls when placed near relatives than when placed near nonrelatives; “helpers” at the nest tend to be related to one member of the breeding pair; and when social groups of monkeys split, close relatives remain together. When the sting of the honey bee is torn from its body, the individual dies, but the bee’s genes, shared in the colony of relatives, survive.
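The underlying logic is Hamilton’s (1964) rule: an altruistic act is selected for when the benefit to the recipient, weighted by genetic relatedness, exceeds the cost to the actor (rb > c). A minimal sketch with illustrative values of our own choosing:

```python
def altruism_favored(r: float, b: float, c: float) -> bool:
    """Hamilton's rule: an altruistic act is favored by kin selection
    when relatedness (r) times the benefit to the recipient (b)
    exceeds the cost to the altruist (c)."""
    return r * b > c

# Helping a full sibling (r = 0.5) is favored once the benefit exceeds
# twice the cost; for a sibling's offspring (r = 0.25), the benefit must
# exceed four times the cost.
print(altruism_favored(0.5, 3.0, 1.0))   # True
print(altruism_favored(0.25, 3.0, 1.0))  # False
```

This is why the 75 percent relatedness of social-insect sisters, noted above, makes altruism so much easier to favor in those species.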
Thus, from an evolutionary perspective, altruism is a means of helping genes to propagate. By being most altruistic to those with whom we share genes we help copies of our own genes to replicate. This makes “altruism” ultimately “selfish” in purpose. Promulgated in the context of animal behavior this idea became known as “kin-selection” and provided a conceptual breakthrough by redefining the unit of analysis away from the individual organism to his or her genes, for it is these that survive and are passed on.
Kin Recognition in Humans
Building on the work of Hamilton (1964), Dawkins (1976), Thiessen and Gregg (1980), and others, the kin-selection theory of altruism was extended to the human case. Rushton et al. (1984) proposed that, if a gene can better ensure its own survival by acting so as to bring about the reproduction of family members with whom it shares copies, then it can also do so by benefiting any organism in which copies of itself are to be found. This would be an alternative way for genes to propagate themselves. Rather than merely protecting kin at the expense of strangers, if organisms could identify genetically similar organisms, they could exhibit altruism toward these “strangers” as well as toward kin. Kin recognition would be just one form of genetic similarity detection.
The implication of genetic similarity theory is that the more genes are shared by organisms, the more readily reciprocal altruism and cooperation should develop because this eliminates the need for strict reciprocity. In order to pursue a strategy of directing altruism toward similar genes, the organism must be able to detect genetic similarity in others. As described in the previous section, four such mechanisms by which this could occur have been considered in the literature.
Humans are capable of learning to distinguish kin from non-kin at an early age. Infants can distinguish their mothers from other women by voice alone at 24 hours of age, know the smell of their mother’s breast before they are six days of age, and recognize a photograph of their mother when they are 2 weeks old. Mothers are also able to identify their infants by smell alone after a single exposure at 6 hours of age, and to recognize their infant’s cry within 48 hours of birth (see Wells, 1987, for review).
Human kin preferences also follow lines of genetic similarity. For example, among the Ye’Kwana Indians of South America, the words “brother” and “sister” cover four different categories ranging from individuals who share 50 percent of their genes (identical by descent) to individuals who share only 12.5 percent of their genes. Hames (1979) has shown that the amount of time the Ye’Kwana spend interacting with their biological relatives increases with their degree of relatedness, even though their kinship terminology does not reflect this correspondence.
Anthropological data also show that in societies where certainty of paternity is relatively low, males direct material resources to their sisters’ offspring (to whom their relatedness is certain) rather than to their wives’ offspring (Kurland, 1979). An analysis of the contents of 1,000 probated wills reveals that after husbands and wives, kin received about 55 percent of the total amount bequeathed whereas non-kin received only about 7 percent; offspring received more than nephews and nieces (Smith, Kish, & Crawford, 1987).
Paternity uncertainty also exerts predictable influence. Grandparents spend 35 to 42 percent more time with their daughters’ children than with their sons’ children (Smith, 1981). Following a bereavement they grieve more for their daughters’ children than for their sons’ children (Littlefield & Rushton, 1986). Family members feel only 87 percent as close to the father’s side of the family as they do to the mother’s side (Russell & Wells, 1987). Finally, mothers of newborn children and their relatives spend more time commenting on resemblances between the baby and the putative father than on resemblances between the baby and the mother (Daly & Wilson, 1982).
When the level of genetic similarity within a family is low, the consequences can be serious. Children who are unrelated to a parent are at risk; a disproportionate number of battered babies are stepchildren (Lightcap, Kurland, & Burgess, 1982). Children of preschool age are 40 times more likely to be assaulted if they are stepchildren than if they are biological children (Daly & Wilson, 1988). Also, unrelated people living together are more likely to kill each other than are related people living together. Converging evidence shows that adoptions are more likely to be successful when the parents perceive the child as similar to them (Jaffee & Fanshel, 1970).
Heritability Predicts Spousal Similarity
If people choose each other on the basis of shared genes, it should be possible to demonstrate that interpersonal relationships are influenced more by genetic similarity than by similarity attributable to a similar environment. A strong test of the theory is to observe that positive assortative mating is greater on the more heritable of a set of homogeneous items. This prediction follows because more heritable items better reflect the underlying genotype.
Items with stronger estimates of genetic influence have been found to show greater matching between marriage partners on anthropometric, attitudinal, cognitive, and personality variables. Thus, Rushton and Nicholson (1988) examined studies using 15 subtests from the Hawaii Family Study of Cognition and 11 subtests from the Wechsler Adult Intelligence Scale. With the Hawaii battery, genetic estimates from Koreans in Korea correlated positively with those from Americans of Japanese and European ancestry (mean r = 0.54, p < 0.01). With the Wechsler scale, estimates of genetic influence correlated across three samples with a mean r = 0.82.
Consider the data in Table 4.2 showing heritabilities predicting the similarity of marriage partners. Note, though, that many of the estimates of genetic influence in this table are based on midparent-offspring regressions calculated from data on intact families, and thereby combine genetic and shared-family environmental effects. The latter source of variance, however, is surprisingly small (Plomin & Daniels, 1987) and has not been found to add systematic bias.
Reported in Table 4.2 is a study by Russell, Wells, and Rushton (1985) who used a within-subjects design to examine data from three studies reporting independent estimates of genetic influence and assortative mating. Positive correlations were found between the two sets of measures (r = 0.36, p < 0.05, for 36 anthropometric variables; r = 0.13, p < 0.10, for 5 perceptual judgment variables; and r = 0.44, p < 0.01, for 11 personality variables). In the case of the personality measures, test-retest reliabilities over a three-year period were available and were not found to influence the results.
Another test of the hypothesis reported in Table 4.2 was made by Rushton and Russell (1985) using two separate estimates of the heritabilities for 54 personality traits. Independently and when combined into an aggregate, they predicted similarity between spouses (rs = 0.44 and 0.55, ps < 0.001). Rushton and Russell (1985) reviewed other reports of similar correlations, including Kamin’s (1978) calculation of r = 0.79 (p < 0.001) for 15 cognitive tests and DeFries et al.’s (1978) calculation of r = 0.62 (p < 0.001) for 13 anthropometric variables. Cattell (1982) too had noted that between-spouse correlations tended to be lower for the less heritable, more specific cognitive abilities (tests of vocabulary and arithmetic) than for the more heritable general abilities (g, from Progressive Matrices).
Also shown in Table 4.2 are analyses carried out using a between-subjects design. Rushton and Nicholson (1988) analyzed data from studies using 15 subtests from the Hawaii Family Study of Cognition (HFSC) and 11 subtests from the Wechsler Adult Intelligence Scale (WAIS); positive correlations were calculated within and between samples. For example, in the HFSC, parent-offspring regressions (corrected for reliability) using data from Americans of European ancestry in Hawaii, Americans of Japanese ancestry in Hawaii, and Koreans in Korea correlated positively with between-spouse similarity scores taken from the same samples and with those taken from two other samples: Americans of mixed ancestry in California and a group in Colorado. The overall mean r was 0.38 for the 15 tests. Aggregating across the numerous estimates to form the most reliable composite gave a substantially better prediction of mate similarity from the estimate of genetic influence (r = 0.74, p < 0.001). Similar results were found with the WAIS. Three estimates of genetic influence correlated positively with similarities between spouses based on different samples, and in the aggregate they predicted the composite of spouse similarity scores with r = 0.52 (p < 0.05).
Parenthetically, it is worth noting that statistically controlling for the effects of g in both the HFSC and the WAIS analyses led to substantially lower correlations between estimates of genetic influence and assortative mating, thus offering support for the view that marital assortment in intelligence occurs primarily with the g factor. The g factor tends to be the most heritable component of cognitive performance measures (chap. 3).
Race, Brain Size, and Intelligence
Intelligence Test Scores
Research on the academic accomplishments of Mongoloids in the United States continues to grow. Caplan, Choy, and Whitmore (1992) gathered survey and test score data on 536 school-age children of Indochinese refugees in five urban areas around the United States. Unlike some of the previously studied populations of “boat people,” these refugees had had limited exposure to Western culture, knew virtually no English when they arrived, and often had a history of physical and emotional trauma. Often they came with nothing more than the clothes they wore. All the children attended schools in low-income metropolitan areas. The results showed that whether measured by school grades or nationally normed standardized tests, the children were above average overall, “spectacularly” so in mathematics. […]
An early study of the intelligence of “pure” African Negroids was carried out in South Africa by Fick (1929). He administered the American Army Beta Test, a nonverbal test designed for those who could not speak English, to 10- to 14-year-old white, black African, and mixed-race (mainly Negroid-Caucasoid hybrid) schoolchildren. In relation to the white mean of 100, based on more than 10,000 children, largely urban black African children obtained a mean IQ of 65, while urban mixed-race children obtained a mean IQ of 84. Thus South African mixed races obtained a mean IQ virtually identical to that of African-Americans. […]
Since R. Lynn’s review, Owen (1992) has published another South African study. He gave Raven’s Standard Progressive Matrices to four groups of high school students. The results showed clear racial mean differences with 1,065 whites = 45.27 (SD = 6.34); 1,063 East Indians = 41.99 (SD = 8.24); 778 mixed races = 36.69 (SD = 8.89); and 1,093 pure Negroids = 27.65 (SD = 10.72). Thus, Negroids are from 1.5 to 2.7 standard deviations below the two Caucasoid populations and about 1 standard deviation lower than the mixed races. The four groups showed little difference in test reliabilities, the rank order of item difficulties, item discrimination values, and the loadings of items on the first principal component. Owen (1992: 149) concluded: “Consequently, from a psychometric point of view, the [test] is not culturally biased.”
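The standard-deviation gaps quoted here can be recovered from Owen’s group means and SDs. A sketch using an equally weighted pooled SD (an assumption on our part; scaling by a single group’s own SD instead gives somewhat different figures, which is why such comparisons are usually quoted as a range):

```python
def pooled_d(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference using an equally weighted pooled SD."""
    pooled_sd = ((sd_a ** 2 + sd_b ** 2) / 2) ** 0.5
    return (mean_a - mean_b) / pooled_sd

# Owen's (1992) Standard Progressive Matrices means and SDs, from above
print(round(pooled_d(45.27, 6.34, 27.65, 10.72), 2))  # whites vs. Negroids: 2.0
print(round(pooled_d(41.99, 8.24, 27.65, 10.72), 2))  # East Indians vs. Negroids: 1.5
```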
R. Lynn also summarized the results of studies of the intelligence of Amerindians. The mean IQs have invariably been found to be somewhat below those of Caucasoids. The largest study is that of Coleman et al. (1966), which obtained a mean of 94, but a number of studies have reported means in the 70 to 90 range. The median of the 15 studies listed is 89, which Lynn took as a reasonable approximation, indicating that the Amerindian mean IQ falls somewhere between that of Caucasoids and Negroid-Caucasoid hybrids. The same intermediate position is occupied by Amerindians in performance on the Scholastic Aptitude Test (Wainer, 1988).
In addition, all the studies of Amerindians have found that they have higher visuospatial than verbal IQs. The studies listed are those where the Amerindians speak English as their first language, so this pattern of results is unlikely to be solely due to the difficulty of taking the verbal tests in an unfamiliar language. The verbal-visuospatial disparity is also picked up in the Scholastic Aptitude Test, where Amerindians invariably score higher on the mathematical test than on the verbal (Wainer, 1988).
Thus, Jensen (1985) examined 11 large-scale studies, each comprising anywhere from 6 to 13 diverse tests administered to large black and white samples aged 6 to 16½, with a total sample size of 40,000, and showed that a significant and substantial correlation was found in each between the test’s g loadings and the mean black-white difference on the same tests. In a follow up, Jensen (1987b; Naglieri & Jensen, 1987) matched 86 black and 86 white 10- to 11-year-olds for age, school, sex, and socioeconomic status and tested them with the Wechsler Intelligence Scale for Children-Revised and the Kaufman Assessment Battery for Children for a total of 24 subtests. The results showed that the black-white differences on the various tests correlated r = 0.78 with the test’s g loading.
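The procedure in these analyses, Jensen’s method of correlated vectors, amounts to correlating, across subtests, each subtest’s g loading with the standardized group difference on it. A sketch with made-up illustrative numbers (not Jensen’s data):

```python
def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length vectors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Hypothetical subtest g loadings and standardized group differences
g_loadings = [0.80, 0.72, 0.65, 0.55, 0.40]
differences = [1.10, 0.95, 0.90, 0.70, 0.50]
r = pearson_r(g_loadings, differences)  # high when the vectors align
```

A high correlation between the two vectors, as in Jensen’s reported r = 0.78, is what Spearman’s hypothesis predicts.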
To further examine the racial difference in reaction times and their relationship to g, P. A. Vernon and Jensen (1984) gave a battery of eight tasks to 50 black and 50 white college students who were also tested on the Armed Services Vocational Aptitude Battery (ASVAB). Despite markedly different content, the reaction time measures correlated significantly at about 0.50 with the ASVAB in both the black and the white samples. Blacks had significantly slower reaction time scores than whites, as well as lower scores on the ASVAB. The greater the complexity of the reaction time task, the stronger the relationship of its response times, measured in milliseconds, to the g factor extracted from the ASVAB, and the greater the magnitude of the black-white difference. […]
Meanwhile, Jensen (1993; Jensen & Whang, 1993) used decision time tasks similar to R. Lynn’s to extend his test of Spearman’s hypothesis. Thus, Jensen (1993) gave 585 white and 235 black 9- to 11-year-old children from middle-class suburban schools in California a battery of 12 reaction time tasks based on the simple, choice, and oddman procedures. The response time loadings on psychometric g were estimated by their correlations with scores on Raven’s Progressive Matrices. In another procedure, the chronometric tasks assessed speed of retrieval of easy number facts such as addition, subtraction, or multiplication of single-digit numbers. These have typically been learned before the children are 9 years old, and all children in the study were able to perform them correctly.
In both studies, Spearman’s hypothesis was borne out as strongly as in the previous studies using conventional psychometric tests. Blacks scored lower than whites on the Raven’s Matrices and were slower than whites in decision time. In addition, the size of the black-white difference on the decision time variables was directly related to the variables’ loadings on psychometric g. Moreover, when the response time was separated into a cognitive decision component and a physical movement component, blacks were found to be slower than whites on the cognitive part and faster than whites on the physical part.
Using the same procedures as in the study just described, Jensen and Whang (1993), also in California, compared 167 9- to 11-year-old Chinese American children with the 585 white children. On Raven’s Matrices there was a 0.32 standard deviation advantage to the Oriental children (about 5 IQ points), although they were lower in socioeconomic status. Also, compared to the white American children, the Chinese American children were faster in the cognitive aspects of information processing (decision time) but slower in the motor aspects of response execution (movement time).
Chapter 7 : Speed of Maturation, Personality, and Social Organization
Speed of Maturation
Table 7.1 summarizes the racial differences on several measures of life span development. In the United States, black babies have a shorter gestation period than white babies. By week 39, 51 percent of black children have been born while the figure for white children is 33 percent; by week 40, the figures are 70 and 55 percent respectively (Niswander & Gordon, 1972). Similar results have been obtained in Paris. Collating data over several years, Papiernik, Cohen, Richard, de Oca, and Feingold (1986) found that French women of European ancestry had longer pregnancies than those of mixed black-white ancestry from the French Antilles, or black African women with no European admixture. These differences persisted after adjustments for socioeconomic status.
Other observations, made within equivalent gestational age groups established by ultrasonography, find that black babies are physiologically more mature than white babies as measured by pulmonary function, amniotic fluid, fetal birth weight between 24 and 36 weeks of gestation, and weight-specific neonatal mortality (reviewed in Papiernik et al., 1986). I am unaware of data on gestation time for Mongoloids.
Black precocity continues throughout life. Revised forms of Bayley’s Scales of Mental and Motor Development administered in 12 metropolitan areas of the United States to 1,409 representative infants aged 1-15 months showed black babies scored consistently above whites on the Motor Scale (Bayley, 1965). This difference was not limited to any one class of behavior, but included: coordination (arm and hand); muscular strength and tonus (holds head steady, balances head when carried, sits alone steadily, and stands alone); and locomotion (turns from side to back, raises self to sitting, makes stepping movements, walks with help, and walks alone).
Similar results have been found for children up to about age 3 elsewhere in the United States, in Jamaica, and in sub-Saharan Africa (Curti, Marshall, Steggerda, & Henderson, 1935; Knobloch & Pasamanik, 1953; Williams & Scott, 1953; Walters, 1967). In a review critical of the literature, Warren (1972) nonetheless reported evidence for African motor precocity in 10 out of 12 studies. For example, Geber (1958:186) had examined 308 children in Uganda and reported an “all-round advance of development over European standards which was greater the younger the child.” Freedman (1974, 1979) found similar results in studies of newborns in Nigeria using the Cambridge Neonatal Scales (Brazelton & Freedman, 1971).
Mongoloid children are motorically delayed relative to Caucasoids. In a series of studies carried out on second- through fifth-generation Chinese-Americans in San Francisco, on third- and fourth-generation Japanese-Americans in Hawaii, and on Navajo Amerindians in New Mexico and Arizona, consistent differences were found between these groups and second- to fourth-generation European-Americans using the Cambridge Neonatal Scales (Freedman, 1974, 1979; Freedman & Freedman, 1969). One measure involved pressing the baby’s nose with a cloth, forcing it to breathe with its mouth. Whereas the average Chinese baby fails to exhibit a coordinated “defense reaction,” most Caucasian babies turn away or swipe at the cloth with the hands, a response reported in Western pediatric textbooks as the normal one.
On other measures including “automatic walk,” “head turning,” and “walking alone,” Mongoloid children are more delayed than Caucasoid children. Mongoloid samples, including the Navajo Amerindians, typically do not walk until 13 months, compared to the Caucasian 12 months and Negro 11 months (Freedman, 1979). In a standardization of the Denver Developmental Screening Test in Japan, Ueda (1978) found slower rates of motoric maturation in Japanese as compared with Caucasoid norms derived from the United States, with tests made from birth to 2 months in coordination and head lifting, from 3 to 5 months in muscular strength and rolling over, at 6 to 13 months in locomotion, and at 15 to 20 months in removing garments.
Eveleth and Tanner (1990) discuss race differences in terms of skeletal maturity, dental maturity, and pubertal maturity. Problems include poorly standardized methods, inadequate sampling, and many age/race/method interactions. Nonetheless, when many null and idiosyncratic findings are averaged out the data suggest that African-descended people have a faster tempo than others.
With skeletal maturity, the clearest evidence comes from the genetically timed age at which bone centers first become visible. Africans and African-Americans, even those with low incomes, mature faster up to 7 years. Mongoloids are reported to be more delayed at early ages than Caucasoids but later catch up, although there are some contradictory data. Subsequent skeletal growth varies widely and is best predicted by nutrition and socioeconomic status.
With dental development, the clearest pattern comes from examining the first phase of permanent tooth eruption. For beginning the first phase, a composite of first molar and first and second incisors in both upper and lower jaws showed an average for 8 sex-combined African series of 5.8 years compared to 6.1 years each for 20 European and 8 east Asian series (Eveleth & Tanner, 1990, Appendix 80, after excluding east Indians and Amerindian samples from the category “Asiatics”). For completion of the first phase, Africans averaged 7.6, Europeans 7.7, and east Asians 7.8 years. (The significance of this pattern will be discussed in chapter 10, where the predictive value of age of first molar for traits like brain size has been shown in other primate species.) No clear racial pattern emerged with the onset of deciduous teeth nor with the second phase of permanent tooth eruption.
In speed of sexual maturation, the older literature and ethnographic record suggested that Africans were the fastest to mature and Orientals the slowest, with Caucasians intermediate (e.g., French Army Surgeon, 1898/1972). Despite some complexities this remains the general finding. For example, in the United States, blacks are more precocious than whites as indexed by age at menarche, first sexual experience, and first pregnancy (Malina, 1979). A national probability sample of American youth found that by age 12, 19 percent of black girls had reached the highest stages of breast and pubic hair development, compared to 5 percent of white girls (Harlan, Harlan, & Grillo, 1980). The same survey, however, found white and black boys to be similar (Harlan, Grillo, Cornoni-Huntley, & Leaverton, 1979).
Subsequently, Westney, Jenkins, Butts, and Williams (1984) found that 60 percent of 11-year-old black boys had reached the stage of accelerated penis growth in contrast to the white norm of 50 percent of 12.5-year-olds. This genital stage significantly predicted onset of sexual interest, with over 2 percent of black boys experiencing intercourse by age 11. While some surveys find that Oriental girls enter puberty as early as whites (Eveleth & Tanner, 1990), others suggest that in both physical development and onset of interest in sex, the Japanese, on the average, lag one to two years behind their American counterparts (Asayama, 1975).
Black babies in the United States show a greater mortality rate than white babies. In 1950, a black infant was 1.6 times as likely to die as a white infant. By 1988, the relative risk had increased to 2.1. Controlling for some maternal risk factors associated with infant mortality or premature birth, such as age, parity, marital status, and education, does not eliminate the gap between blacks and whites within those risk groups. For instance, in the general population, black infants with normal birth weights have almost twice the mortality of their white counterparts.
One recent study examined infants whose parents were both college graduates in the belief that such a sample would eliminate obvious inequalities in access to medical care. The researchers compared 865,128 white and 42,230 black children and found that the mortality rate among black infants was 10.2 per 1,000 live births as against 5.4 per 1,000 among white infants (Schoendorf, Hogue, Kleinman, & Rowley, 1992).
The reason for the disparity appears to be that black women give birth to greater numbers of low birth weight babies. When the statistics are adjusted to compensate for the birth weight of the babies, the death rates for the two groups become virtually identical: newborns who were not underweight, born to black and white college-educated parents, had an equal chance of surviving the first year. Thus, in contrast to black infants in the general population, black infants born to college-educated parents have higher mortality rates than comparable white infants only because of their higher rates of low birth weight.
Indices of social breakdown can also be gleaned from figures on those confined to mental institutions or who are otherwise behaviorally unstable. Most of the data to be reviewed come from the United States. In 1970, 240 blacks per 100,000 population were confined to mental institutions, compared with 162 whites per 100,000 population (Staples, 1985). Blacks also use community mental health centers at a rate almost twice their proportion in the general population. The rate of drug and alcohol abuse is much greater among the black population, based on their overrepresentation among patients receiving treatment services. Moreover, it is estimated that over one-third of young black males in the inner city have serious drug problems (Jaynes & Williams, 1989).
Kessler and Neighbors (1986) have demonstrated, using cross-validation on eight different surveys encompassing more than 20,000 respondents, that the effect of race on psychological disorders is independent of class. They observed an interaction between race and class such that the true effect of race was suppressed and the true effect of social class was magnified in models that failed to take the interaction into consideration. Again, in contrast, Orientals are underrepresented in the incidence of mental health problems (P. E. Vernon, 1982).
Blacks have deeper voices than whites. In one study, Hudson and Holbrook (1982) gave a reading task to 100 black men and 100 black women volunteers ranging in age from 18 to 29 years. The fundamental vocal frequencies were measured and compared to white norms. The frequency for black men was 110 Hz, lower than the 117 Hz for white men, and the frequency for black women was 193 Hz, lower than the frequency of 217 Hz for white women.
Differences in bone density between blacks and whites have been noted at a variety of ages and skeletal sites and remain even after adjusting for body mass (Pollitzer & Anderson, 1989). Racial differences in bone begin even before birth. Divergence in the length and weight of the bones of the black and white fetus is followed by greater weight of the skeleton of black infants compared with white infants. Blacks have not only greater skeletal calcium content, but also greater total body potassium and muscle mass. These findings are important for osteoporosis and fractures, especially in elderly people.
Body structure differences likely account for the differential success of blacks at sporting events. Blacks are disproportionately successful in sports involving running and jumping but not at all successful at sports such as swimming. For example, in the 1992 Olympic Games in Barcelona, blacks won every men’s running race. On the other hand, no black swimmer has ever qualified for the U.S. Olympic swim team. The bone density differences mentioned above may be a handicap for swimming.
The physique and physiology of blacks may give them a genetic advantage in running and jumping, as discussed in Runner’s World by longtime editor Amby Burfoot (1992). For example, blacks have less body fat, narrower hips, thicker thighs, longer legs, and lighter calves. From a biomechanical perspective, this is a useful package. Narrow hips allow for efficient, straight-ahead running. Strong quadriceps muscles provide horsepower, and light calves reduce resistance.
With respect to physiology, West Africans are found to have significantly more fast-twitch fibers and anaerobic enzymes than whites. Fast-twitch muscle fibers are thought to confer an advantage in explosive, short duration power events such as sprinting. East and South African blacks, by contrast, have muscles that provide great endurance by producing little lactic acid and other products of muscle fatigue.
A number of direct performance studies have shown a distinct black superiority in simple physical tasks such as running and jumping. Often, the subjects in these studies were very young children who had no special training. Blacks also have a significantly faster patellar tendon reflex time (the familiar knee-jerk response) than white students. Reflex time is obviously an important variable for sports that require lightning reflexes. It would be interesting to know if the measures on which blacks performed best were the ones on which Orientals performed poorest, and vice versa. Do reflex times and percentage of fast-twitch muscle show a racial gradient, and is it one opposite to that of cognitive decision time? Is this ultimately a physiological tradeoff?
Sexual Potency, Hormones, and AIDS
The average woman produces one egg every 28 days in the middle of the menstrual cycle. Some women, however, have shorter cycles than others and some produce two eggs in a cycle. Both events translate into greater fecundity because of the greater opportunities they provide for a conception. Occasionally double ovulation results in the birth of dizygotic (two-egg) twins.
The races differ in the rate at which they double ovulate. Among Mongoloids, the frequency of dizygotic twins per 1,000 births is less than 4, among Caucasoids the rate is 8 per 1,000, and among Negroids the figure is greater than 16 per 1,000, with some African populations having twin frequencies of more than 57 per 1,000 (Bulmer, 1970). Recent reviews of twinning rates in the United States (Allen, 1988) and Japan (Imaizumi, 1992) confirm the racial differences. Note that the frequency of monozygotic twinning is nearly constant at about 4 per 1,000 in all groups. Monozygotic twinning is the result of a single fertilized egg splitting into two identical parts.
The frequency of three-egg triplets and four-egg quadruplets shows a comparable racial ordering. For triplets, the rate per million births among Mongoloids is 10, among Caucasoids 100, and among Negroids 1,700; and for quadruplets, per million births, among Mongoloids 0, among Caucasoids 1, and among Negroids, 60 (Allen, 1987; Nylander, 1975). Data from racially mixed matings show that multiple births are largely determined by the race of the mother, independently of the race of the father, as found in Mongoloid-Caucasoid crosses in Hawaii, and Caucasoid-Negroid crosses in Brazil (Bulmer, 1970).
Intercourse Frequency and Attitudes
Racial differences exist in frequency of sexual intercourse. Examining Hofmann’s (1984) review of the extent of premarital coitus among young people around the world, Rushton and Bogaert (1987) categorized the 27 countries by primary racial composition and averaged the figures. The results showed that African adolescents are more sexually active than Europeans, who are more sexually active than Asians (see Table 8.3). While some variation occurs from country to country, consistency is found within groups. As is typical of such surveys, young men report a greater degree of sexual experience than young women (Symons, 1979). It is clear from Table 8.3, however, that the population differences are replicable across sex, with the men of the more restrained group having less experience than the women of the less restrained.
A confirmatory study was carried out in Los Angeles which held the setting constant and fully sampled the ethnic mix. Of 594 adolescents and young adults, 20 percent were classified as Oriental, 33 percent as white, 21 percent as Hispanic, and 19 percent as black. The average age at first intercourse was 16.4 for Orientals and 14.4 for blacks, with whites and Hispanics intermediate, and the percentage sexually active was 32 percent for Orientals and 81 percent for blacks, with whites and Hispanics intermediate (Moore & Erickson, 1985).
A Youth Risk Behavior Survey with a reading level for 12-year-olds was developed by the Centers for Disease Control in the United States to examine health-risk behaviors including sexual behaviors. In 1990, a representative sample of 11,631 students in grades 9-12 (ages 14 to 17) from across the United States anonymously completed the questionnaire during a 40-minute class period. Students were asked whether they had ever had sexual intercourse, with how many people they had had sexual intercourse, and with how many people they had had sexual intercourse during the past 3 months. They were also asked about their use of condoms and other methods of preventing pregnancy (Centers for Disease Control, 1992a).
Of all students in grades 9-12, 54 percent reported ever having had sexual intercourse and 39 percent reported having had sexual intercourse during the 3 months preceding the survey. Male students were significantly more likely than female students to ever have had sexual intercourse (61 percent and 48 percent, respectively) and to have had sexual intercourse during the 3 months preceding the survey (43 percent and 36 percent, respectively). Black students were significantly more likely than white students to ever have had sexual intercourse (72 percent and 52 percent, respectively), to have had sexual intercourse during the 3 months preceding the survey (54 percent and 38 percent, respectively), and to have had four or more sex partners in their lifetime (38 percent and 16 percent, respectively). Four percent of all students reported having had a sexually transmitted disease. Black students (8 percent vs. 3 percent) were significantly more likely to report having had a sexually transmitted disease than white students (Centers for Disease Control, 1992a, 1992b).
The rate of premarital intercourse is matched by that following marriage. Rushton and Bogaert (1987) inspected a section on cross-cultural intercourse frequency in a review by Ford and Beach (1951) and categorized the tribal peoples listed into three main groups. The Oceanic and Amerindian people tended to have a lower rate of sexual intercourse per week average (1-4) than U.S. whites (2-4) and Africans (3-10). Subsequent surveys tend to support the same conclusion. For married couples in their twenties, the average frequency of intercourse per week for the Japanese approximates 2 (Asayama, 1975), for American whites 4, and for American blacks 5 (Fisher, 1980).
Sex surveys are also beginning in the People’s Republic of China where, according to Time Magazine (May 14, 1990), a new era of relative permissiveness is beginning. According to the Time account, in one survey, 500 volunteer social workers have interviewed 23,000 people in 15 provinces using a 240-question survey. Results from a smaller survey of about 2,000 men and women from urban centers throughout China have now been published (Bo & Wenxiu, 1992). The results show much restraint relative to that in the West. For example, over 50 percent of the men and women reported never having discussed sex with others, and over 20 percent of spouses had never talked about sex with each other. This compared to less than 5 percent of respondents in England (Eysenck, 1976).
Over 50 percent thought that masturbation (and even loss of semen) was debilitating. Only 19 percent of males who admitted masturbating had engaged in the practice before the age of 17 years, and no female masturbators reported that they had done so before that age, while over 90 percent of the women stated that they had commenced after the age of 20. One reason for the later average age of first masturbation is a later puberty. Of the males, about 50 percent reported that they had experienced their first seminal emission above the age of 17.
The frequency of reported intercourse may also be slightly lower in urban China than in the urban West. For married couples aged 20-30 the average is about 12 times a month or 3 times a week (Bo & Wenxiu, 1992, Table 7). Only 5 percent of the males and 3 percent of the females reported frequencies of one or more sexual outlets a day. The incidence of reported extramarital intercourse is also lower in China. About 29 percent of the males and 23 percent of the females admitted that they had been or were engaged in it. In the United States, one set of figures indicates 45 percent and 34 percent, respectively (Playboy Magazine, 1983).
Not all surveys find racial differences in the predicted direction. Tanfer and Cubbins (1992) found that 20- to 29-year-old single black women cohabiting with a sexual partner reported only 4.3 occasions of intercourse in the previous four weeks as compared with 6.9 among cohabiting white women (p < .05). The authors suggested that these black women’s partners had other sexual partners as well and were less available than the white women’s partners. Another possible reason was that more of the black sample were pregnant (Tanfer & Cubbins, 1992, Table 3).
Concomitant racial differences are found in sexual attitudes. In Ford and Beach’s (1951) survey, the Asian groups were the most likely to endorse beliefs concerning the weakening effects of intercourse. A review by P. E. Vernon (1982) led him to conclude that both the Chinese and the Japanese were not only less experienced in premarital sex, but were also less permissive and less concerned with sexual display than Caucasians. Thus, Connor (1975, 1976) found that three generations of Japanese Americans, as well as Japanese students in Japan, reported less interest in sex than Caucasian samples. Abramson and Imari-Marquez (1982) observed that each of three generations of Japanese Americans showed more sex guilt than matched Caucasian Americans. In studies carried out in Britain and Japan using a sex fantasy questionnaire, Iwawaki and Wilson (1983) found that British men reported twice as many fantasies as Japanese men, and British women admitted to four times as much sex fantasy as Japanese women.
In contrast, African-descended people are more permissive than Caucasians. Reiss (1967) observed this with several hundred black and white university students in the United States on scales measuring premarital sexual attitudes (e.g., approving of or feeling guilt about petting and intercourse in casual and romantic relationships); the results were replicated with other samples and measuring instruments (Heltsley & Broderick, 1969; Sutker & Gilliard, 1970). Johnson (1978) also compared black and white premarital sexual attitudes and behavior and included a Swedish sample who were expected to be (and were) more permissive than American whites. The black sample (particularly males) was found to have had intercourse earlier, with a greater number of casual partners, and with fewer feelings of distaste, than either white sample.
Chapter 9 : Genes Plus Environment
Genetic Weights Predict Racial Differences
To my knowledge, Jensen (1973, chapter 4) was the first to apply the idea of differential heritability to race differences. Jensen deduced diametrically opposite predictions from genetic and environmental perspectives. He reasoned that if racial differences in cognitive performance are genetically based, then black-white differences should be greatest on those tests with higher heritabilities. But if racial differences are caused by the environment, then black-white differences should be greatest on those tests more environmentally influenced, and so would have lower heritability.
Jensen (1973) tested these predictions by calculating “environmentability” for various tests through the degree to which sibling correlations departed from the pure genetic expectation of 0.50. These showed an inverse relation with the magnitude of the black-white differences. That is, the most environmentally influenced tests were the ones that least differentiated between blacks and whites. Then, Jensen (1973) cited an unpublished study by Nichols (1972) who estimated the heritability of 13 tests from 543 siblings and found that the correlation between these heritabilities and the black-white difference scores on the same tests was 0.67. In other words, the more heritable the test, the more it discriminated between the races.
The genetic hypothesis is indirectly supported by studies using a test’s g loading rather than its heritability. As described in chapters 2, 3, and 6, the higher a test’s g loading, the more predictive of intelligent behavior it tends to be, the more heritable it is, and the more it differentiates between the races. Thus, Jensen (1985, 1987b) examined 12 large-scale studies, each comprising anywhere from 6 to 13 tests administered to over 40,000 elementary and high school students and found that the test’s g loading consistently predicted the magnitude of the black-white difference.
Prompted by Jensen’s approaches, I showed a direct genetic effect on the black-white differences using inbreeding depression scores, a measure of genetic dominance (Rushton, 1989e). As described by Jensen (1983), inbreeding depression is an effect for which there is no really satisfactory explanation other than a genetic one. It depends on the presence of dominant genes that enhance fitness in the Darwinian sense.
Inbreeding depression scores had been calculated by Schull and Neel (1965) in a study of 1,854 Japanese cousins, aged 7 to 10, tested in 1958 and 1960, and shown by Jensen (1983) to be related to the g factor scores for 11 subtests of the Wechsler Intelligence Scale for Children. […]
Set out in Table 9.1 is a summary of the data used in the studies by Jensen (1985, 1987b) and Rushton (1989e). As mentioned, the g factor loadings are indirect estimates of genetic penetrance and the inbreeding depression scores direct estimates. I have calculated a weighted average for the five sets of black-white differences (in σ units, based on raw scores from a total N = 4,848) as well as a weighted average for the 10 sets of g loadings. Also in Table 9.1 are the reliabilities of the tests.
Figure 9.1 shows the regression of black-white differences on the g factor loadings and on the inbreeding depression scores. Clearly, as the g loading and inbreeding depression scores increase, so do the magnitudes of the black-white differences. The racial differences are significantly predicted by the genetic penetrance of each of the subtests. The genetic contribution to racial differences in mental ability is robust across populations, languages, time periods, and measurement specifics.
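The computation underlying this "method of correlated vectors" is simply a Pearson correlation between two vectors: one of subtest g loadings (or inbreeding depression scores) and one of standardized group differences on the same subtests. A minimal sketch follows; the numbers are hypothetical illustrations chosen for clarity, not the actual values from Table 9.1 or from Jensen (1985, 1987b).

```python
# Method-of-correlated-vectors sketch: correlate a vector of subtest
# g loadings with a vector of standardized score differences.
# All numbers below are hypothetical, not the published data.
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length vectors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical g loadings and group differences (sigma units) for 6 subtests
g_loadings = [0.45, 0.55, 0.60, 0.70, 0.75, 0.85]
differences = [0.30, 0.50, 0.55, 0.70, 0.65, 0.90]

r = pearson_r(g_loadings, differences)  # positive r: higher loading, larger gap
```

A large positive r is what the hereditarian prediction described above requires; an r near zero or negative would instead support the environmental prediction that the least heritable (most environmentally influenced) tests show the largest differences.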
Between versus Within Family Effects
Other adoption and twin designs show that the environmental variables influencing behavior are primarily those that occur within families rather than between them (Plomin & Daniels, 1987). This is one of the more important discoveries made using behavior genetic procedures; it appears to hold even for variables such as altruism, obesity, and law abidingness, which parents are thought to strongly socialize. One implication of this finding is that because the variables usually proposed to explain racial differences, such as social class, religious beliefs, cultural practices, father absence, and parenting styles account for so little variance within a race, they are unlikely to account for the differences among races.
Using similar reasoning, Jensen (1980b) described how data from siblings could be used to determine whether relationships between variables are caused by factors “extrinsic” to the family, such as social class. Such factors serve to make family members similar to one another and different from people in other families. Strong social class effects can be presumed operative, therefore, if the covariance structures that emerge from between-family data disappear when using “intrinsic” within-family data. If the covariance structures remain constant regardless of whether they are calculated on the basis of within-family or between-family data, then social class effects must be less operative, and genetic and within-family effects more operative. Research shows that the general factor of intelligence, g, is constant across all three major racial groups from both within-family and between-family analyses (Jensen, 1987a; Nagoshi, Phillips, & Johnson, 1987). The implication is that the differences in g found between races are primarily due to within-family effects, such as genetics, rather than to between-family effects such as socioeconomic background.
Additional evidence for the within-family, intrinsic nature of g comes from data on the head size-IQ correlation (chap. 2). Jensen and Johnson (in press) showed a significant positive correlation between head size and IQ in both black and white, male and female samples of 4- and 7-year-olds. In all cases, the sibling with the larger head perimeter tends to be the more intelligent sibling.
Race versus Social Class
One challenge for purely environmental theories is to explain upward and downward within-family mobility. For example, Weinrich (1977) reviewed data showing that those adolescents moving from one SES level to another showed the sexual patterns of their to-be-acquired class, not the class they were raised in by their parents. More recent research confirms the importance of within-family variation, with some siblings more often adopting the syndrome of early sexuality, delinquency, and low educational attainment than others (Rowe, Rodgers, Meseck-Bushey, & St. John, 1989).
Within-family social mobility has been known for some time in the IQ literature. In one study, Waller (1971) obtained the IQ scores of 130 fathers and their 172 adult sons, all of whom had been routinely tested during their high school years in Minnesota. The IQs ranged from below 80 to above 130 and were related to social class. Children with lower IQs than their fathers went down in social class as adults, and those with higher IQs went up (r = 0.37 between difference in father-son social class and difference in father-son IQ). Such intergenerational social mobility has subsequently been confirmed (Mascie-Taylor & Gibson, 1978).
Socioeconomic effects often appear to confound those of race because, as will be discussed in chapter 13, lower socioeconomic groups more often engage in r-strategies than do higher socioeconomic groups. Dizygotic twinning (the r-strategy) is greater among lower than upper socioeconomic women in both European and African samples, as are differences in family size, intelligence, law abidingness, health, longevity, and sexuality. The question then arises as to whether social class or race is more predictive of behavior. […]
In the study just referred to on regression effects, Jensen (1974) found that black children from high socioeconomic status homes scored lower on IQ tests than white children from low socioeconomic homes. The study examined virtually all the white (N = 1,489) and black (N = 1,123) children enrolled in regular classes of the fourth, fifth, and sixth grades of the Berkeley elementary school district in California. The black children’s parents were high-level administrators, supervisors, college teachers, and professionals; the white children’s parents were manual and unskilled workers. The racial differences showed up on both the verbal and the nonverbal parts of the nationally standardized Thorndike-Lorge Intelligence Test.
In a similar study of the Scholastic Aptitude Test, the results from 1984 showed that the median scores of black college applicants from families earning over $50,000 were lower than those of whites from families earning less than $6,000. The scores were monotonically related to income within both races (R. A. Gordon, 1987a). Race was more powerful than income in determining test scores.
Although it is well known that test scores are correlated with socioeconomic status within racial groups, this does not, in fact, explain black-white ability differences. The pattern of black-white differences is different in factorial composition from the pattern of social class differences within the black and the white groups (Jensen & Reynolds, 1982). For example, the SES differences tend to be largest on tests of verbal ability rather than on tests of spatial visualization. This is just the opposite of the pattern of black-white differences on verbal and spatial tests.
To examine race versus social class differences in sexual behavior, Rushton and Bogaert (1988) contrasted noncollege-educated whites with college-educated blacks. Table 8.4 shows the results. Noncollege-educated whites were more restrained than college-educated blacks on such measures as speed of occurrence of premarital, marital, and extramarital experiences, number of partners, frequency of intercourse, speed and incidence of pregnancy, and length of the menstrual cycle, although they were not as restrained as the college-educated whites. The black sample, consisting of university students from 1938 to 1963, was atypical in the direction of being religiously devout and of high socioeconomic status.
The race/social class findings of Rushton and Bogaert (1988) depicted in Table 8.4 were independently replicated with additional samples by M. S. Weinberg and Williams (1988). These authors reanalyzed evidence from three independent sources: the original Kinsey data, which formed the basis of Rushton and Bogaert’s studies; a 1970 National Opinion Research Center poll of sexual attitudes; and a study carried out in San Francisco. All three reanalyses showed the predicted racial effects on sexuality while holding education and social class constant. Moreover, with dizygotic twinning, while both race and social class are predictive, race is the source of the larger portion of variance (Rushton, 1987b).
In other domains, too, race has been found to have strong effects independent of class. With psychological illness, Kessler and Neighbors (1986) used cross-validation on eight different surveys encompassing more than 20,000 respondents to demonstrate an interaction between race and class such that the true effect of race was suppressed and the true effect of social class was magnified in models that failed to take the interaction into consideration.
Chapter 11 : Out of Africa
During the 1.5 million years that spanned the emergence of H. erectus and H. sapiens, the stone implements were crude. Hand axes, choppers, and cleavers were not sufficiently differentiated in shape to imply distinctive function. Wear marks on the tools show they were variously used to cut meat, bone, hides, wood, and nonwoody parts of plants. Moreover, there is no evidence that the tools were ever mounted on other materials for increased leverage and there were no tools made of bone, no ropes to make nets, and no fishhooks. The stone tools remained unchanged for thousands of years. In fact, minimalists have held that there is no good evidence of hunting skills until around 100,000 years ago and even then humans would have been relatively ineffective hunters. Calvin (1990), however, has suggested that some of the stone axes used by erectus populations may have been effective throwing instruments aimed at animal herds as they watered.
It is only in the northern Eurasian land mass, particularly the Arctic, where little plant food was available, that big-game hunting clearly became the dominant food source. And humans did not reach the Arctic until around 30,000 years ago. Neanderthal tools found in Europe were similar to earlier human tools found in Africa, being simple hand-held axes not mounted on separate parts such as handles. There were no standardized bone tools and no bows and arrows. Shelters were apparently crude; all that remains of them are postholes and simple stone piles. There is no evidence of art, sewing, boating, or trade, and no variation of tools over time and space, suggesting little in the way of innovation.
One hundred thousand years ago in Africa, at the time when modern-looking Africans had evolved, the stone implements suddenly became more specialized. Carefully prepared stone cores enabled numerous thin blades, about two inches (5 cm) long, to be struck off and turned into knives, spear barbs, scrapers, borers, and cutters. This blade technology allowed many more flakes to be struck off than previously, and the stone workers relied more heavily on nonlocal rocks, choosing to bring in fine-grained rocks of many types from miles away.
Although the anatomically modern Africans had somewhat superior tools to those of their predecessors, their culture is still characterized as Middle Stone Age. They continued to lack standardized bone tools, bows and arrows, art, and cultural variation. These Africans can barely be considered big-game hunters, because their weapons were still spears for thrusting rather than bows and arrows.
Evidence for a more abrupt change doesn’t occur until the last Ice Age in Europe (France and Spain) around 35,000 years ago. Anatomically modern people, known as Cro-Magnons, appeared on the scene with dramatically more specialized tools. Standardized bone and antler tools appear for the first time, including needles used for sewing, as do compound tools of several parts tied or glued together, such as spear points set in shafts or ax heads hafted to handles. Rope, used in nets or snares, accounts for the frequent bones of foxes, weasels, and rabbits at Cro-Magnon sites.
Sophisticated weapons for killing dangerous animals at a distance now appear also — weapons such as barbed harpoons, darts, spear throwers, and bows and arrows. European caves are full of bones of bison, elk, reindeer, horse, and ibex. By this time South African caves also yield bones of buffalo and pigs.
Several types of evidence testify to the effectiveness of late Cro-Magnon people as big-game hunters. Their sites are more numerous than those of Neanderthals or Middle Stone Age Africans, implying more success at obtaining food. Moreover, numerous species of big animals that had survived many previous ice ages became extinct toward the end of the last ice age, suggesting that they were exterminated by the human hunters’ new skills. Likely victims include Europe’s woolly rhino and giant deer, and southern Africa’s giant buffalo and giant Cape horse. Once watercraft capable of crossing the 60 miles from eastern Indonesia to Australia, and tailored clothing enabling the crossing of the Bering Strait, became available, the giant kangaroos of Australia and the mammoths of North America were exterminated as well.
The occupation of Northeast Asia about 30,000 years ago depended on many advances: tailored clothing, as evidenced by eyed needles, cave paintings of parkas, and grave ornaments marking outlines of shirts and trousers; warm furs, indicated by fox and wolf skeletons minus the paws (removed in skinning and found in a separate pile); elaborate houses (partially dug into the ground for insulation and marked by postholes, pavements, and walls of mammoth bones) with intricate fireplaces; and stone lamps to hold animal fat and light the long Arctic nights.
Whereas Neanderthals obtained their raw materials within a few miles of home, Cro-Magnons and their contemporaries throughout Eurasia practiced long-distance trade, not only for raw material for tools but also for ornaments. Tools of obsidian, jasper, and flint have been found hundreds of miles from where those stones were quarried. Baltic amber reached southeast Europe, while Mediterranean shells and the teeth from sharks were carried to inland parts of France, Spain, and the Ukraine. Burial displays reflect great variation, with skeletons wearing necklaces, bracelets, and head bands of shell beads and bear and lion teeth.
The artwork of anatomically modern humans also shows a clear discontinuity with what went before. Well known are the rock paintings, with polychrome depictions of now-extinct animals, and the relief carvings and clay sculptures deep within caves in France and Spain, hinting at shamanistic rituals. On the Eurasian plains are “Venus” figurines of women with enormous breasts and buttocks, made from a mixture of clay and bone powder. Ivory carvings of eagles, mammoths, and arctic waterbirds, as well as female figurines, have been found in Siberia and dated to 35,000 years ago.
Analysis of amino acids in ostrich eggshells, once used as food and as containers, also bolsters the case that the first modern humans originated in Africa. Changes in amino acids take place in eggshells at a steady rate and, once anchored to radiocarbon dating, enable dating back to 200,000 years ago, and up to 1 million years in colder climates (Gibbons, 1992). The eggshells turn up at camp sites in South Africa between 105,000 and 125,000 years ago, before the earliest dates on other continents. Not long afterward, ostrich shells appear in the Middle East along with anatomically modern human remains.
[…] Why would Mongoloids have ended up the most K-selected? I agree with those who have proposed that colonizing temperate and cold environments leads to increased cognitive demands to solve the problems of gathering food and gaining shelter and general survival in cold winters (e.g., Calvin, 1990; R. Lynn, 1987, 1991a).
From time to time populations move into new niches, which entails increased cognitive demands for survival. When this occurs, populations respond by evolving larger brains in relation to body size. Larger brains have the capacity for greater intelligence and enable the populations to deal with the cognitive demands of the new niche. The Caucasoid and Mongoloid peoples who evolved in Eurasia were subjected to pressures for improved intelligence to deal with the problems of survival in the cold northern latitudes. Most of the last 80,000 years has been colder than today. During the main Wurm glaciation of approximately 24,000-10,000 B.P., winter temperatures in Europe and northeast Asia fell by 5-15°C. The terrain became cold grasslands and tundra with only a few trees in sheltered river valleys; the environment was broadly similar to that of present-day Alaska.
Obtaining food and keeping warm in these conditions posed a problem. Unlike in the tropics and subtropics, plant foods were seasonal and not available for many months during the winter and spring. People therefore became wholly reliant on hunting large herbivores such as mammoth, horse, and reindeer to secure their food supply. Even among near-contemporary hunter-gatherers, the proportions of food obtained by hunting and by gathering vary according to latitude. Peoples in tropical and subtropical latitudes were largely gatherers, while peoples in temperate environments relied more on hunting. Peoples in arctic and subarctic environments relied almost exclusively on hunting, together with fishing, and did so of necessity because plant foods were unavailable for much of the time.
Hunting in the open grasslands of northern Eurasia was also more difficult than hunting in the woodlands of the tropics and subtropics, where there is plenty of cover for hunters to hide in. The only way of hunting animals in open grasslands is to make use of natural traps into which the animals can be driven. One of the most common traps was the narrow ravine, where some of the beasts would stumble and could be speared by members of the group waiting in ambush. In addition, the herbivores could be surrounded and driven over cliffs, into bogs, or into the loops of rivers.
For effective hunting of large herbivores people would have needed to manufacture a variety of tools from stone, wood, and bone for making spearheads and for cutting. When these peoples had killed a large herbivore they would have to skin and butcher it into pieces of a size that could be carried back to the base camp. For this it was necessary to manufacture a variety of sophisticated cutting and skinning tools.
Another set of problems in the northern latitudes would have centered on keeping warm. People had to solve the problems of making fires, clothes, and shelters. It would have been much harder to make fires in Eurasia than in Africa, where spontaneous bush fires would have been frequent. In Eurasia during the glaciations there would have been no spontaneous bush fires. People would have had to make fires by friction or percussion in a terrain where there was little wood. Probably dry grass had to be stored in caves for use as tinder and the main fuel would have been dung, animal fat, and bones. In addition, clothing and shelters were unnecessary in sub-Saharan Africa but were made in Europe during the main Wurm glaciation. Needles were manufactured from bone for sewing together animal skins, and shelters were constructed from large bones and skins. Torrence (1983) has demonstrated an association between latitude and the number and complexity of tools used by contemporary hunter-gatherers.
Thus, the cognitive demands of manufacturing sophisticated tools and making fires, clothing, and shelters (as well as regulating the storage of food; Miller, 1991) would have selected for higher average intelligence levels than in the less cognitively demanding environment in sub-Saharan Africa. Those individuals who could not solve these problems of survival would have died out, leaving those with alleles for higher intelligence as the survivors.
In the data set out in chapter 6, general, verbal, and visuospatial abilities are all higher in Caucasoids compared with Negroids. The magnitude of the Caucasoid advantage was about the same for all three abilities, namely, about 30 IQ points for the comparison with Africans and about 15 IQ points for the comparison with African Americans and African Caribbeans. It is likely that all three abilities came under selection pressure for enhancement in Eurasia to about the same extent.
The intelligence of the Mongoloids is held to have evolved somewhat differently. While the Mongoloid peoples have only slightly higher general intelligence than the Caucasoids, they have markedly higher visuospatial abilities and, indeed, somewhat weaker verbal abilities. R. Lynn (1987, 1991a) attributed the evolution of this pattern of abilities to the even colder winters that Mongoloids experienced relative to Caucasoids. Evolving in Siberia where, in the main Wurm glaciation, temperatures were some 5-15°C colder than today, the people of northeast Asia would have found themselves between the encroaching ice from the Himalayas in the south and from the Arctic region in the north. In response to this extreme cold, Mongoloids evolved distinctive adaptations to reduce heat loss, including the flattened face, shortened limbs, epicanthic fold, and narrow eyes that afford protection against the cold and the glare of the sunlight on the snow. Under these adverse conditions natural selection increased general intelligence and produced a trade-off in favor of visuospatial over verbal abilities, because of the crucial role of strong visuospatial abilities in making sophisticated tools and weapons and in the planning and execution of group hunting strategies.
R. Lynn (1991a) also provided a scenario for the evolution of intelligence in southeast Asians and Amerindians. Although southeast Asians had some exposure to cold winters before they migrated southward, and so were selected for somewhat enhanced intelligence, this exposure was less than that experienced by the northern Caucasoids and Mongoloids. Hence, their intelligence levels were raised above those of Negroids but not to as high a level as those of the Caucasoids and Mongoloids. Amerindians, for their part, are descendants of an archaic Mongoloid people that entered the Americas prior to the main Wurm glaciation of approximately 24,000-10,000 years ago, the glaciation that produced the “classical” Mongoloid features with their highly elevated cognitive abilities. Thus, the first Wurm glaciation at 40,000 B.P. set in place the archaic Mongoloid cognitive profile of relatively strong visuospatial and weak verbal abilities, and subsequent selection pressure, such as the main Wurm glaciation, raised the whole profile in the Mongoloids, leaving that of the Amerindians at a lower level.
Once proto-Mongoloids had crossed the Bering Strait and made their way down into the Americas, they would have found life easier than their ancestors had been accustomed to in northeast Asia. They would have encountered a number of herbivorous mammals, such as mammoth, horse, antelope, and bison, that were quite unused to being hunted by man. With no experience of human predation, these animals would have been easy game for the skilled hunters who had evolved over many thousands of years in the more difficult environment of northeast Asia. As they moved southward, proto-Mongoloids would have found that plant foods were readily available. Thus, survival would have been easier and selection for further increases in cognitive abilities would have relaxed.
Challenges and Rejoinders
Aggregation and Brain Size
Many nineteenth-century scientists including Broca, Darwin, Galton, Lombroso, and Morton concluded that there were racial differences in brain size (chap. 5). With some exceptions, for example, American anthropologists Boas and Mead, this view was probably dominant until World War II (Pearl, 1934). As discussed in chapter 6, following the war, the literature on brain size and race underwent vigorous critiques. Thus, Tobias (1970) cited 14 potentially confounding variables that he argued made the data on black-white differences in brain weight measured at autopsy highly problematic; and Gould (1978) alleged that many of the data on racial differences in endocranial volume were due to “unconscious … finagling” and “juggling” of figures. Together, these authors claimed to have dismantled the “myth” of racial group differences in brain size.
As discussed in chapter 6, however, when the autopsy data debunked by Tobias (1970) were aggregated, racial group differences were found, with Mongoloids and Caucasoids having heavier brains than Negroids (1,368 g and 1,378 g vs. 1,316 g, respectively). When Tobias’s figures for “excess neurons” were averaged, Negroids had 8.55 billion, Caucasoids had 8.65 billion, and Mongoloids had 8.99 billion. Similarly, re-aggregating Gould’s (1978, 1981) “corrected” analyses of nineteenth-century endocranial data showed that about 1 in³ (16 cm³) of cranial capacity differentiated the races such that Mongoloids > Caucasoids > Negroids.
These re-assemblages did not convince all the critics. Cain and Vanderwolf (1990) countered that the averaging method I had used for Tobias’s data was inappropriate because, for example, the midpoint of a range of means had been used. This procedure, they suggested, could yield misleading results unless the distribution was symmetrical. They did not say why it was reasonable to assume that the distributions were skewed.
Cain and Vanderwolf (1990) and M. Lynn (1989b) also objected to the inclusion of the data from the ancient Caucasians in the category “Caucasoid” in my aggregation of Gould’s data, because of their small bodies and dried skulls. But even if one accepted this position and excluded the ancient Caucasians from the analysis, a 4 in³ difference in internally measured cranial capacity would be left between Mongoloids and Caucasoids on the one hand and Negroids on the other (see Table 6.1). Even if this magnitude is somewhat overestimated, the residual cannot be ignored. Moreover, if body size is controlled, the rank ordering is indeed Mongoloids > Caucasoids > Negroids, because Mongoloids are often smaller in body size than Caucasoids.
Critics also brought “new” data to bear on the debate from a monograph by Herskovits (1930), who had collected external head measurements of American blacks and other populations. From Herskovits’s table, Zuckerman and Brody (1988: 1027) separated out a sample of 46,975 Swedes with a smaller cranial capacity than the American blacks and argued that if this kind of overlap was possible, then it was meaningless to make comparisons across races. This position was subsequently cited by other critics (e.g., Cain & Vanderwolf, 1990; Weizmann et al., 1990).
As discussed in chapter 6, Herskovits (1930) actually collated head size data for 36 male populations made by several investigators (Table 6.2). By choosing among the samples, any racial ranking can be artificially created. It is more appropriate to use the principle of aggregation and combine samples. When Herskovits’s (1930) data were aggregated, as we have seen, statistically significant differences in brain size were found, with Mongoloids (in this case North American Indians) and Caucasoids averaging larger than Negroids.
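The principle of aggregation amounts to pooling the sample means, weighted by sample size, rather than comparing hand-picked individual studies. A minimal sketch of the arithmetic, using hypothetical sample sizes and means rather than Herskovits’s actual figures:

```python
def aggregate_means(samples):
    """Pool study means, weighted by sample size: sum(n*m) / sum(n)."""
    total_n = sum(n for n, _ in samples)
    return sum(n * m for n, m in samples) / total_n

# Hypothetical (sample size, mean cranial capacity in cm^3) pairs for one
# population group; illustrative numbers only, not Herskovits's (1930) data.
studies = [(100, 1450.0), (50, 1420.0), (150, 1440.0)]
print(aggregate_means(studies))  # -> 1440.0
```

Selecting single rows from such a table amounts to comparing lone (n, mean) pairs, which is why cherry-picked orderings can differ from the pooled result.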
Other tabulations provided by critics to support the null hypothesis turn out on closer examination to support the racial hypothesis. Thus, Cain and Vanderwolf (1990: 782) set out 20 data points including a 1923 series of Caucasoid crania and a 1986 Negroid series (Table 12.1). Their purpose was “to illustrate that by drawing from other studies one can arrive at different conclusions than Rushton did” and to show that Negroid crania are “sometimes” greater than Caucasoid crania. They concluded: “Depending on the studies one chooses to cite, one can arrive at a variety of orderings of brain size or cranial capacity.”
Yet, Cain and Vanderwolf’s data show that Negroid adults average the smallest brains. I converted the cubic-centimeter data in Table 12.1 to grams using equation (5) from chapter 6 and aggregated the data across the sexes and measures (Rushton, 1990c). Mongoloids averaged 1,297 g, Caucasoids averaged 1,304 g, and Negroids averaged 1,199 g, a difference of about 100 g between Negroids and the other two populations. In a reply, Vanderwolf and Cain (1991) acknowledged that “some” of the data are “trustworthy” and in the direction claimed.
A similar sort of table was created by Groves (1991), who entered 21 Mongoloid populations (16 male, 5 female), 18 Caucasoid populations (13 male, 5 female), and 12 Negroid populations (9 male, 3 female). Among those having the largest cranial capacity were the Mokapu, a Mongoloid tribe in Hawaii, and the Xhosa, a tribe from Africa. Groves focused discussion on these outliers and ignored the remainder of his own table. I have plotted the data from his table in Figure 12.1, which, in aggregation, clearly shows the racial pattern. For males, the Mongoloids, Caucasoids, and Negroids average 1,487 cm³, 1,458 cm³, and 1,408 cm³, respectively, and for females they average 1,325 cm³, 1,312 cm³, and 1,254 cm³, respectively. An unweighted sex-combined average of these figures yields 1,406 cm³, 1,385 cm³, and 1,331 cm³, respectively.
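The sex-combined figures just given are simple unweighted means of the male and female averages, so the arithmetic can be checked directly from the numbers quoted in the text:

```python
# Male and female averages (cm^3) quoted from Groves's (1991) table.
males   = {"Mongoloid": 1487, "Caucasoid": 1458, "Negroid": 1408}
females = {"Mongoloid": 1325, "Caucasoid": 1312, "Negroid": 1254}

# Unweighted sex-combined average: (male + female) / 2 for each group.
combined = {k: (males[k] + females[k]) / 2 for k in males}
print(combined)  # {'Mongoloid': 1406.0, 'Caucasoid': 1385.0, 'Negroid': 1331.0}
```

Being unweighted, the average deliberately ignores the different numbers of male and female populations entered for each group.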
Aggregation and Crime
Some claim that crime statistics only reflect police prejudice and biases in the criminal justice system. Some have gone so far as to claim that when the self-reports of adolescents are used, no racial differences in crime exist. Others sidestep crime statistics and focus on those surveys failing to show racial differences in antisocial personality disorder, psychopathy, and psychotic tendency (Zuckerman & Brody, 1988: 1030).
It is true that self-report measures typically show less racial disproportionality than arrest data. This is because they emphasize lesser, even trivial, offenses that almost all males have engaged in at least once (e.g., “Have you ever been in a fight?”), or because they include items of marginal relation to crime (e.g., “Would being in debt worry you?”). It is also because few of the questionnaires assess the frequency of activities.
Self-report measures do, however, show the same general pattern of group differences (age, sex, socioeconomic status, and race) as do official statistics. J. Q. Wilson and Herrnstein (1985) review the literature. One study, involving a U.S. national sample of 1,726 youth aged 11 to 17, found clear evidence that African Americans engage in more crime than European Americans, particularly in predatory crimes, with the difference most apparent among high-rate offenders (Elliott & Ageton, 1980). Other studies showed that black males scored higher (that is, were less “normal”) than white males on personality tests such as the Minnesota Multiphasic Personality Inventory, particularly on the Psychopathic Deviate (Pd) scale, which is predictive of criminal behavior in both racial groups.
Because crime figures vary enormously over type, region, generation, and subpopulation, Roberts and Gabor (1990) proposed that they could only be explained using “situational” and “interaction” factors. Thus Roberts and Gabor (1990) pointed out that whereas arrest data from the U.S. Federal Bureau of Investigation indicated that 47 percent of violent crimes were committed by blacks, another data set, based on the reports of victims and compiled by the U.S. Department of Justice, indicated that only 24 percent of such crimes were committed by blacks. The figures were also shown to change with time and geographic area. Thus, the involvement of blacks in crime had increased over the last thirty years, and in one year the homicide rate for blacks was 16.7 per 100,000 in the state of Delaware whereas in Missouri it was 65 per 100,000.
Roberts and Gabor (1990) also pointed out that American blacks had a higher homicide rate than their more racially pure counterparts in Africa. They cited rates of 0.01 per 100,000 in Mali and 8 per 100,000 in Tanzania. Moreover, homicide rates in the Far East varied considerably, from 39 per 100,000 residents in the Philippines to 1.3 per 100,000 in Hong Kong.
As discussed in chapter 7, however, I tested the generalizability of racial differences in crime by aggregating the international crime statistics reported to INTERPOL for 1983-1984 and 1985-1986, which provided data on nearly 100 countries in 14 crime categories. For both 1984 and 1986 African and Caribbean countries reported twice the amount of violent crime (rape, murder, and serious assault) as did European countries, and about three times as much as did countries from the Pacific Rim (Table 7.3).
In reply, Gabor and Roberts (1990: 338) argued that the international statistics are “an unstandardized data base which is highly sensitive to the differential legal definitions, reporting, and recording practices of the countries around the world.” In many countries, they pointed out, politically motivated killings are included in the homicide data. Rape, they went on to say, is notoriously underreported and highly sensitive to public attitudes, victim services available, the status of women, and the practices of the police and courts prevailing in a given society. Gabor and Roberts (1990) did not say why, despite all the sources of error they had enumerated, so clear a racial pattern could nonetheless be calculated.
Many critics of the work on crime and race point out that African Americans are typically the victims of crime. For example, Roberts and Gabor (1990) showed that in the United States, black men are 20 times more likely than white men to be shot, cut, or stabbed, and black women are 18 times more likely to be raped than white women. Black people are also more likely than white people to be the victims of burglary, motor vehicle theft, assaults, robbery, and many other offenses.
To this argument two points may be made. First, as J. Q. Wilson and Herrnstein (1985: 463) cogently remarked:
To believe that blacks do not commit such offenses at greater rates than whites, one would have to believe that the higher rates of victimization are caused by whites entering black neighborhoods in order to break into homes and hold up citizens. While that is possible, it seems unlikely.
Second, there is an asymmetry to interracial crime. The problem of interracial violence is overwhelmingly one of black assaults on whites. While more than 97 percent of white criminals victimize white people, up to 67 percent of black criminals also victimize white people. According to U.S. Department of Justice statistics for 1987, 200 million whites committed 87,029 violent assaults on blacks while nearly 30 million blacks committed 786,660 violent attacks on whites. This averages out to 1 out of every 38 blacks violently assaulting a white in one year, and only 1 out of every 2,298 whites assaulting a black. The black criminal’s preference for white victims is at least 60 times that of the white criminal’s preference for black victims. Levin (1992) has discussed some of the social implications of racial discrepancies in crime production.
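The per-capita figures quoted above all follow from the same four raw counts, so the 1-in-38, 1-in-2,298, and roughly 60-fold numbers can be recomputed directly:

```python
# Counts quoted above from U.S. Department of Justice statistics for 1987.
white_pop, white_on_black = 200_000_000, 87_029
black_pop, black_on_white = 30_000_000, 786_660

print(round(black_pop / black_on_white))  # -> 38   ("1 out of every 38")
print(round(white_pop / white_on_black))  # -> 2298 ("1 out of every 2,298")

# Ratio of the two per-capita interracial offense rates.
ratio = (black_on_white / black_pop) / (white_on_black / white_pop)
print(round(ratio))                       # -> 60
```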
Is the Genetic Evidence Flawed?
… A 100 percent denial of genetic influence continues to be promoted, most forcefully by Lewontin (1991; Lewontin et al., 1984). One argument is that because development is so complicated and genetic × environment interactions are so ubiquitous, it is impossible to disentangle causality and apportion variance separately to genes and environment (Hirsch, 1991; Wahlsten, 1990). […]
In general response to the complexity discussion, Bouchard (1984: 182) made a forceful point: If context and interaction effects are so ubiquitous and genetic effects so complicated, how can it be that monozygotic twins reared apart grow to be so similar in so many ways? Siblings raised away from each other grow to be significantly similar to each other, with their degree of similarity predicted by the number of genes they share. This implies the presence of genetically based stabilizing systems driving development into a common channel (see Table 3.1 and Figures 3.3 to 3.5).
Is r-K Theory Correct?
Several writers have claimed that my theoretical account ignored ecological processes and assumptions that are central to the r- and K-selection concept (Anderson, 1991; Lerner, 1992; Miller, 1993; Weizmann et al., 1990, 1991). One reason for widespread confusion even among ecologists has centered on the climatic conditions most likely to produce r-selection. For example, Barash (1982: 306) wrote, in his textbook Sociobiology and Behavior:
Although the distinction between r- and K-selection was first made explicit by MacArthur and Wilson (1967), it was actually suggested nearly 20 years previously by the great evolutionary geneticist Theodosius Dobzhansky (1950). He noted that, in general, inhabitants of the temperate and arctic zones suffered mortality that was largely independent of their population density, occurring because of large-scale environmental fluctuations, such as drought, storms, or a sudden influx of large numbers of predators. In such conditions, mortality was relatively independent of individual characteristics, so parents ensured their reproductive success by generating a large number of offspring (that is, r-selection). By contrast, Dobzhansky emphasized that tropical species competed most intensely with one another rather than with the environment. The relatively benign habitat was virtually filled with organisms, so the difference between success and failure lay in producing not a large number of offspring but rather a smaller number of well-endowed descendants (that is, K-selection).
Barash, however, is incorrect. Predictability is the ecological prerequisite for K-selection, and it can occur in either a stable environment or a predictably variable one. What has apparently been misunderstood is that the subtropical savannahs where humans evolved, with their sudden droughts and devastating viral, bacterial, and parasitic diseases, are less predictable for long-lived species than are temperate and especially Arctic environments. Although the Arctic climate varies greatly over one year, it is harsh but highly predictable over many years (Rushton & Ankney, 1993).
Many critics have made the classic mistake (many ecologists do also) of confusing variable with unpredictable. Weizmann et al. (1990: 2) claimed that, because of their longer ancestry in stable tropical climates, blacks should be more K-selected than other human groups. Miller (1993) suggested the converse, that arctic animals, with their variable winter cycles, would be r-selected. But, of course, they are not. Long-lived arctic mammals like polar bears, caribou, muskox, seals, and walruses are highly K-selected, as are Arctic people. The reason is that the Arctic environment is not only highly variable but, more importantly, highly predictable as well. (More generally, data show that plants, lizards, and mammals become more K-selected with increasing elevation and latitude [Zammuto & Millar, 1985].)
Annual food shortage in the Arctic is predictable: people knew that it would be difficult to find food for 4 to 6 months every year, and this selected for K-traits. If an individual had the traits necessary to plan ahead well, that individual’s genes survived. Contrast this with tropical savannahs, where disease epidemics and prolonged droughts were (and are) unpredictable. Under such conditions an individual who produced many descendants during favorable conditions would be most likely to have some that survived the unpredictable catastrophes. Conversely, if an Arctic-dwelling person put maximal effort into mating and reproduction, he or she likely would not survive the year; their offspring certainly would not.
Additional criticisms have been made of my (Rushton, 1985a, 1988b) version of r-K theory (originally termed “differential K theory” to emphasize that all human beings are K-selected relative to other animals). Some have insisted that r-K theory is applicable only at the level of the species or, at best, to well-defined local populations, but not to variation within species (Anderson, 1991; Lerner, 1992; Weizmann et al., 1990, 1991). This criticism ignores both the origins of the theory (MacArthur & Wilson, 1967) and the within-species studies of plants, insects, fish, and nonhuman mammals (chap. 10). Other complaints, that predictions about altruism, law-abidingness, and sexuality are arbitrary and do not derive from r-K theory, rest on an incomplete understanding of what the original codifiers of the theory wrote (see chap. 10 for references and page numbers).
Are Environmental Explanations Sufficient?
Environmental r-K Theory
Draper and Harpending (1982, 1988) proposed that father absence was a critical determinant of later reproductive strategy. Due to learned perceptions about the predictability of the environment, low-income and father-absent families were said to adopt an opportunistically oriented r-strategy of high “mating effort” whereas high-income and father-present families adopted a future-oriented K-strategy of “parenting effort.” The more predictable an environment is learned to be, the more a K-strategy would be adopted. Draper and Harpending reviewed the correlates of the “mating effort” strategy and its culmination in the father-absent child: poor school performance, anti-authoritarianism, aggressiveness, sexual precocity, and criminality. They concluded that “father-present societies are those where most males act like dads and father-absent societies are those where most males act like cads” (1988: 349).
Building on the earlier work by Draper and Harpending, environmental theories of the development of reproductive strategies have been proposed by Belsky, Steinberg, and Draper (1991) and Chisholm (1993). Thus, two diverging pathways (Figure 12.2) were succinctly described by Belsky et al. (1991: 647):
One is characterized, in childhood, by a stressful rearing environment and the development of insecure attachments to parents and subsequent behavior problems; in adolescence by early pubertal development and precocious sexuality; and in adulthood, by unstable pair bonds and limited investment in child rearing, whereas the other is characterized by the opposite.
These predictions have been confirmed in several longitudinal studies. In one, over 900 New Zealand girls were assessed with a diverse battery of psychological, medical, and sociological measures every 2 years from age 3 to age 15 (Moffitt, Caspi, Belsky, & Silva, 1992). Family conflict and father absence in childhood predicted an earlier age of menarche, independent of body weight. In longitudinal studies in the United States, Jessor, Donovan, and Costa (1991) predicted onset of sexual intercourse among adolescents from knowledge of whether they had low scores in academic achievement and religiosity and high scores on measures of deviance and “problem behavior.” Multiple correlations reached predictive levels greater than 0.50, accounting for approximately 30 percent of the variance over a 9-year interval.
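The link between a multiple correlation and the variance it accounts for is simply R squared; a multiple R of about 0.55, consistent with "greater than 0.50," yields roughly the 30 percent figure cited:

```python
# Variance explained by a (multiple) correlation R is R squared.
def variance_explained(r):
    return r ** 2

print(variance_explained(0.50))            # -> 0.25
print(round(variance_explained(0.55), 2))  # -> 0.3
```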
Additional aspects of sexuality are predictive. Age of menarche is related to adult orgastic capacity and sexual activity in both women (Raboch & Bartak, 1981) and men (Raboch & Mellan, 1979). In a review of the literature on early menarche, Surbey (1990) found a significant positive correlation between mothers’ and daughters’ menarcheal ages, and found early menarche to be associated with a cluster of social and sexual behaviors related to a woman losing her mate or never living with the father of her child. Promiscuity, high school dropout, and other problem behaviors were also more likely.
In a longitudinal Swedish study of 1,400 individuals, Magnusson (1992) found that early-maturing girls had cheated, played truant, gotten drunk, and tried marijuana more often than late-maturing girls. Conflicts with parents and teachers were more common, and the early-maturing girls were less interested in school and future education. The early-maturing girls entered pair relationships earlier, married earlier, had children earlier, and entered the labor market earlier. […]
It is always easy to hypothesize poorly defined causal factors to explain race differences for which there is actually no scientific evidence. Jensen (1973) labeled these “X-factors,” that is, factors that can account for anything, but cannot be proved or disproved. Most analyses of racial differences are superficial and diffuse. If understanding in this area is to advance, it is essential that hypotheses be made with greater clarity and with the capacity for generating differential predictions.
The mechanism of paternal provisioning has been postulated by E. M. Miller (1993, 1994) as a more exact specification of the evolutionary process by which the races became differentiated. Miller focused on a continuum of male paternal effort, ranging from none to maximum. He proposed that, in warm climates, females can typically gather enough food to support themselves and their children. In cold climates, however, hunting is required, and females typically do not hunt. Thus, males in cold climates were selected to devote more of their effort to provisioning and less to seeking matings. For males, then, during the hunter-gatherer period of human evolution, the optimal combination of mating effort and paternal investment varied with the severity of the winters. In Africa, a strong sex drive, aggression, dominance seeking, impulsivity, low anxiety, sociability, extraversion, and a morphology and muscle enzymes suitable for fighting led to mating success, whereas in northeast Asia, altruism, empathy, behavioral restraint, and a long life assisted success in provisioning.
Chapter 13 : Conclusions and Discussion
Generalizing the r-K Formulation
If one generalizes the information from the macro-scale characteristics outlined in Table 10.1 and Figure 10.3 to the within-race human variation, several falsifiable predictions can be derived. A summary of the variables expected to intercorrelate is shown in Table 13.1, along with positive and negative evidence and identification of those variables not yet examined.
From Table 13.1 it can be seen that while many variables remain to be investigated, of those that have been, most are in the expected direction. There are some anomalies. Although it is predicted that the higher a person’s socioeconomic status, the later he or she would enter puberty, the opposite appears to be true (Malina, 1979). Another contradictory finding occurs with body size. Because large body size is indicative of a K-strategy, Mongoloids should be larger than Caucasoids or Negroids, and yet the opposite is true. Large body size should dispose to law abidingness, and yet the evidence here too is in the opposite direction. Perhaps the most striking aspect of Table 13.1, however, is the infrequency of such lapses. Additional relationships among the variables can be considered. While some of the ideas are speculative, they may be worthy of further investigation.
Double ovulation and the production of two-egg twins have been related to several r-K traits. Mothers of dizygotic twins can be considered to represent the r-strategy. Their characteristics have been contrasted with mothers of singletons representing the K-strategy (Rushton, 1987b). Predictably, the mothers of dizygotic twins are found to have, on average, a lower age of menarche, a shorter menstrual cycle, a higher number of marriages, a higher rate of coitus, more illegitimate children, a closer spacing of births, a greater fecundity, more wasted pregnancies, a larger family, an earlier menopause, and an earlier mortality. Further, twins typically have a shorter gestation, a lower birth weight, a greater incidence of infant mortality, and a lowered IQ.
Other family structure variables such as marital breakup and single parenting are related to r-characteristics such as child abuse, lower intelligence, educational dropout, sexual precocity, and juvenile delinquency (Draper & Harpending, 1988; J. Q. Wilson & Herrnstein, 1985). To quote again the distinction made by Draper and Harpending (1988: 349): “Father-present societies are those where most males act like dads and father absent societies are those where most males act like cads.”
The developmental transition in the lives of most young people from virginity to nonvirginity takes place within a network of individual, social, and behavioral factors that go beyond mere covariation. In two longitudinal studies, Jessor et al. (1991) found that early onset of sexual intercourse was predicted from knowledge of whether adolescents had low scores in academic achievement and religiosity, and high scores on measures of deviance and “problem behavior.” Multiple correlations reached predictive levels greater than 0.50, accounting for approximately 30 percent of the variance over a 9-year interval.
Personality and sexuality have been related. Eysenck (1976) found that, compared to introverts, extroverts typically have intercourse earlier, more frequently, and with more different partners. These findings were replicated by Barnes, Malamuth, and Check (1984). More historically, in Civilization and Its Discontents, Freud (1930) explained the existence of a positive correlation between restrained sexuality and the production of culture through the psychodynamics of repression and sublimation. The perspective being outlined here explains it in terms of genetically correlated traits. Energy can be allocated to reproductive effort either directly through sexual behavior or indirectly through the ability to produce complex social institutions and thereby compete when resources are scarce.
Sexual behavior varies by social class. Weinrich (1977) examined over 20 studies from the world literature and concluded that the lower the socioeconomic status, the earlier the age of first coitus, the greater the likelihood of premarital coitus and coitus with prostitutes, the shorter the time before engaging in extramarital affairs, and the less stable the marriage bond. Weinrich (1977) also found that the higher the socioeconomic status, the more likely the individual was to engage in sexual activities beyond those directly leading to conception, including fellatio, cunnilingus, petting, affection, and coitus during menstruation. Moreover, although lower socioeconomic status adolescents apparently knew as much about birth control devices as upper socioeconomic status adolescents, they used them less frequently.
Of interest are social class differences in the production of two-egg twins. Monozygotic twinning is nearly constant at about 3 1/2 per 1,000 in all groups. Dizygotic twinning, however, is greater among lower than among upper social-class women in both European and African samples (Golding, 1986; Nylander, 1981).
Altruism and Law Abidingness
Because they are lower in altruism and disrupt rather than maintain social organization, criminals are considered to represent the r-strategy. Ellis (1987) found that criminals have the following r-strategy traits: large numbers of siblings (or half-siblings); families in which parents no longer live together; shorter gestation periods (more premature births); more rapid sexual maturation; greater copulatory frequency outside of bonded relationships (or at least a preference for such); less stable bonding; lower parental investment in offspring (as evidenced by higher rates of child abandonment, neglect, and abuse); and a shorter life expectancy.
Antisocial and other problem behaviors like alcohol and drug abuse are linked to early onset of sexual intercourse (Jessor et al., 1991). Among adolescents, 36 to 49 percent of the variance in nonsexual forms of deviance in siblings of either sex could be explained by the amount of sexual behavior engaged in by the other (Rowe et al., 1989).
Conscientiousness at work, as well as more obvious criminal behavior, has also been linked to temperament and intelligence (Elander, West, & French, 1993). Some evidence has suggested that introverts are more punctual, absent less often, and stay longer at a job, whereas extroverts spend more time talking to their workmates, drinking coffee, and generally seeking diversion from routine. Accidents by bus drivers have also been found to be predicted by intelligence and extraversion (Shaw & Sichel, 1970).
Health and Longevity
As shown in Figure 10.3, humans are the only primates with a post-reproductive phase. One explanation for menopause is that since the human body becomes weaker with age, women eventually reach a point where continued childbearing would endanger their lives. While there is no equivalent pressure on men, sperm production declines with age. By 45 years of age, a man is producing only 50 percent of the sperm he was producing at 18 years of age, and most older men have difficulty attracting fertile females. Thus, in the evolutionary past, older people better aided copies of their genes by caring for grandchildren and the extended family than by producing additional offspring themselves. With increasing K, grandparents will have to remain healthier and live longer to be able to do this effectively since both their own offspring and their children’s offspring will be delaying reproduction to later ages. In both developed and developing countries early maternal death is associated with short spacing of births and total number of children.
Lower socioeconomic classes have higher death rates than upper socioeconomic classes, and these differences have increased in the past several decades. The Black Report and other studies record a growing disparity in death rates between occupational classes in England and Wales (Black et al., 1982; Whitehead, 1988; Marmot et al., 1991). For example, in 1930 people in the lowest social class had a 23 percent higher chance of dying at every age than people in the highest social class. By 1970, this excess risk had grown to 61 percent. A decade later it had jumped to 150 percent (Black et al., 1982). This increasing disparity presents a paradox, especially given that a national health service has long existed in Britain to minimize inequalities in health-related services.
Similar gaps have been noted in France and Hungary during the past two decades (Black et al., 1982). The inverse relation between mortality and socioeconomic status has also been increasing in the United States. One large study showed that over the 26-year period between 1960 and 1986, health inequalities according to educational level increased for whites and blacks by over 20 percent with respect to women and by over 100 percent with respect to men (Pappas et al., 1993).
The increasing correlation of health and social class is explainable from an r-K perspective when it is appreciated that removing environmental barriers to health increases the variance accounted for by genetic factors (Scriver, 1984). In a parallel way, increasing equality of educational opportunity leads to an increase in the heritability of educational attainment (Heath et al., 1985). Generally, removing environmental impediments makes individual-difference variance more dependent on innate characteristics. This implies that, in the 1990s at least, and on average, more genes coding for good health and longevity exist in persons of the upper classes than in persons of the lower classes (Rushton, 1987a).
Extroverts may be less K than introverts, for they are described as “active,” “impulsive,” and “changeable,” while introverts are “careful,” “thoughtful,” and “reliable” (Eysenck & Eysenck, 1975). With respect to academic success, some evidence suggests that while extraverted children may perform better in school up until puberty, after this introverts gain a progressive advantage (Anthony, 1977; Eysenck & Cookson, 1969). Jensen (1980a) reported a tendency for introverts to perform faster than extroverts on reaction-time measures of intelligence. Finally, there is evidence that extroverts are less conditionable and more criminal than introverts (Eysenck & Gudjonsson, 1989). An underlying dimension of “behavioral restraint” may be involved (Gray, 1987).
Masters (1989) suggested an r-K integration of Cloninger’s (1986) three-dimensional system of personality based on neurotransmitter functions. According to Cloninger, harm avoidant vs. risk-taking is associated with serotonergic transmission, novelty-seeking vs. stereotyping rests on dopaminergic transmission, and reward dependence vs. social independence rests on noradrenergic transmission. Masters hypothesized that r-strategists are those with risk-taking, novelty-seeking, and reward-dependent personalities while K-strategists are those with harm-avoidant, conventional, socially-independent personalities.
Masters went on to connect r-K strategies with preferences in assortative mating (chapter 4). K-strategists were said to prefer others who were genetically similar, in part, because they are not risk takers, whereas r-strategists do not necessarily prefer similarity, in part, because they seek novelty. Thus, spousal similarity will be less in r-strategists. Masters used r-K theory to explain why interethnic dating is more frequent among poorer r-groups (e.g., in Hawaii) than in wealthier K-groups.
Brain Size and Cognitive Ability
The published research on this topic has now been summarized by Rushton and Ankney (1996). The well-established relationship has been most clearly shown using Magnetic Resonance Imaging (MRI), which creates, in vivo, a three-dimensional image of the brain. An overall correlation of 0.44 was found between MRI-measured brain size and IQ in 8 separate studies with a total sample size of 381 nonclinical adults, which is roughly equivalent to the strength of the relationship between socioeconomic status of origin and IQ. From 7 MRI studies of clinical adults (N = 312) the overall correlation was 0.24; from 15 studies using external head measurements with adults (N = 6,437) the overall correlation was 0.15, and from 17 studies using external head measurements with children and adolescents (N = 45,056) the overall correlation was 0.21. The relation between brain size and IQ appears early in life, for head perimeter at birth correlates with IQ scores at age 7, as determined in the National Collaborative Perinatal Project study of thousands of white and black children (described on pp. 37-41).
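The overall correlations just cited are pooled estimates across studies of very different sizes, so each study's contribution is weighted by its sample size. As a minimal sketch of that pooling arithmetic, the function below computes a sample-size-weighted mean correlation; the (r, N) pairs shown are hypothetical values chosen purely for illustration, not the actual studies summarized by Rushton and Ankney (1996):

```python
# Sketch of sample-size-weighted pooling of correlations, the standard
# way a set of per-study r values is combined into one overall estimate.

def weighted_mean_r(studies):
    """Pool (r, n) pairs into a sample-size-weighted mean correlation."""
    total_n = sum(n for _, n in studies)
    return sum(r * n for r, n in studies) / total_n

# Hypothetical (r, N) pairs for illustration only.
hypothetical = [(0.40, 50), (0.48, 40), (0.42, 60)]
print(round(weighted_mean_r(hypothetical), 3))
```

Weighting by N means a single large study of head perimeter moves the pooled estimate far more than a small MRI study, which is why the text reports the total sample size alongside each overall correlation.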
A functional relation between brain size and cognitive ability is implied in two studies by Jensen showing the head size/IQ relation within as well as among families. A tendency for a sibling with a larger head to have a higher IQ than a sibling with a smaller head is of special interest because it controls for many of the sources of variance that distinguish families such as cultural background and socioeconomic status. Jensen (1994) examined 82 pairs of monozygotic and 61 pairs of dizygotic adolescent twins and extracted the general factor, or psychometric g, from their IQ tests and found it correlated with head size across individuals (r = 0.30), within twin pairs (r = 0.25), and between twin pairs (r = 0.32). Jensen and Johnson (1994) examined the head size/IQ relation in several hundred pairs of siblings from the National Collaborative Perinatal Project and found that at 7 years of age (although not at 4 years) a significant correlation existed within families (r = 0.11) as well as between families (r = 0.20).
It is understandable that correlations between IQ and overall brain size will be modest. First, much of the brain is not involved in producing what we call intelligence; thus, variation in size/mass of that tissue will lower the magnitude of the correlation. Second, IQ, of course, is not a perfect measure of intelligence and, thus, variance in IQ scores is an imperfect measure of variation in intelligence. Although brain size accounts for only a small percentage of variation in cognitive ability, it is important to note, following Hunter and Schmidt (1990), that small correlations can have large effects. For example, although the MRI-established brain size/IQ correlation is only about 0.40, when squared it shows that 16 percent of the variance is explained, and it shows that, from regression predictions, for every 1 standard deviation increase in brain size, IQ will increase, on average, by 0.40 standard deviations.
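The arithmetic in that last sentence can be made explicit. In standardized (z-score) units, the slope of the regression of IQ on brain size equals the correlation itself, so r² gives the proportion of variance explained and r gives the predicted standardized change. A minimal check, using the 0.40 figure from the text:

```python
# Variance explained and standardized regression prediction for the
# MRI-established brain size/IQ correlation of about 0.40.
r = 0.40

# Proportion of IQ variance explained by brain size.
variance_explained = r ** 2

# In z-score units the regression slope equals r, so a brain size
# one standard deviation above the mean predicts an IQ z-score of r.
predicted_iq_z = r * 1.0

print(round(variance_explained, 2), predicted_iq_z)
```

This is the Hunter and Schmidt point in miniature: a correlation of 0.40 looks "small," yet it implies that a 1-SD advantage in the predictor shifts the expected outcome by 0.40 SD, a substantial effect at the population level.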
The relevant question thus becomes: “Are heritabilities for blacks lower than those for whites?” Evidence for this proposition comes from the already discussed study of cranial capacity in black and white twins (Rushton & Osborne, 1995). A somewhat higher range of heritabilities (depending on corrections for age and body size) was found for whites than for blacks (47 to 56 percent vs. 12 to 31 percent), and a somewhat lower range of environmentalities was found for whites than for blacks (44 to 53 percent vs. 69 to 88 percent). Environmental factors could be having a more detrimental effect on brain development among blacks than among whites. These differences, however, did not reach conventional levels of statistical significance, but they point the way to how twin studies may inform about differentially harmful environments.
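The heritability and environmentality ranges quoted here are complementary: in the simple variance decomposition assumed, the two components partition all the variance, so each heritability figure pairs with an environmentality figure summing to 1. A quick arithmetic check on the figures in the text:

```python
# Check that each quoted heritability/environmentality pair sums to 1.0,
# as expected when the two components exhaust the phenotypic variance
# (low heritability pairs with high environmentality, and vice versa).
pairs = {
    "whites_low":  (0.47, 0.53),
    "whites_high": (0.56, 0.44),
    "blacks_low":  (0.12, 0.88),
    "blacks_high": (0.31, 0.69),
}
for label, (h2, e2) in pairs.items():
    assert abs(h2 + e2 - 1.0) < 1e-9, label
```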
Support for generalizing within-group heritabilities to between-group differences comes from new work showing that the origins and structure of intellectual and social variables are virtually identical for blacks, whites, and Asians within the United States. Rowe, Vazsonyi, and Flannery (1994) used diverse but representative data sources to compare the similarity of correlation matrices for developmental and outcome variables for the various groups. The matrices were as similar to each other as were matrices computed from random halves within the same ethnic group. Ree and Carretta (1995) found a near identity of structure of intellect for ethnic groupings with the normative sample of the Armed Services Vocational Aptitude Battery (ASVAB) used to select applicants for all military enlistments in order to assign them to first jobs. Carretta and Ree (1995) found the same result with the more specialized Air Force Officer Qualifying Test (AFOQT), a multiple-aptitude battery given to applicants. Thus, average group differences result from different levels of the same processes, be they genetic or environmental.