New England Antiquities Research Association


Human Lymphocyte Antigens: Apparent Afro-Asiatic, Southern Asian, & European HLAs in Indigenous American Populations
by James L. Guthrie [1]




Section 1


In our pursuit of cultural and physical human diffusion around the globe, NEARA encourages research exploring "hard" scientific evidence.  Over the years, Jim Guthrie has published numerous articles in the NEARA Journal on many subjects.  Now, his long interest in microbiology has culminated in a comprehensive article on human lymphocyte antigens and their dispersal into indigenous American populations, published in Pre-Columbiana, Volume 2, Numbers 2 & 3, December 2000 & June 2001.  Pre-Columbiana, like the NEARA Journal, has a limited circulation, and the NEARA editorial team felt that this work is so important that it must reach as wide an audience as possible, scientist and layman alike.  In collaboration with Pre-Columbiana editor Stephen Jett and with permission, we are pleased to make this ground-breaking research available through the internet.

The NEARA editors



Studies have shown that the number of human lymphocyte antigen (HLA) alleles characteristic of indigenous American populations is relatively small, and that some isolated South American tribes possess only a few types that are common throughout the Americas.  But other groups, especially those near sites of former Mesoamerican and Andean urban societies, exhibit HLA alleles that are rare in America but common in certain Afro-Asiatic, South Asian, and European populations.  These unexpected genes account, on the average, for 6-7% of the American HLA total, but range as high as 24%.


The atypical genes are postulated to have been acquired by assimilation of foreign populations at various times after initial colonization of the hemisphere but prior to the sixteenth-century influx of Europeans and Africans, because they suggest gene flow from places some scholars claim to have been in ancient contact with the Americas, such as North Africa and Southeast Asia.  The occurrence of parallel anomalies in blood groups such as Rhesus, Kell, and Duffy, as well as in serum proteins such as transferrin and immunoglobulin, supports this interpretation to some degree, but the small number and poor distribution of samples in all systems, including HLA, preclude conclusive results.  Other explanations are considered possible but less likely.

Key Words:  Genetics, Americas, Migrations, Pre-Columbian.




Human lymphocyte antigens (HLAs) are part of the histocompatibility system, whose main function is to help the immune system distinguish the body's own cells from foreign material.  They are proteins on white blood cells that play a role in tissue and organ transplantation similar to that of the more familiar blood groups in transfusion.  HLA distributions differ among world populations to such a degree that careful typing and matching must be done at transplant centers in order to minimize adverse reactions.  This diversity gives population geneticists a powerful tool for tracing ancient migrations, and, at present, HLA distributions are more informative in this regard than is any other genetic system except DNA.

The purpose of this paper is to point out that certain indigenous American populations have HLA alleles that are rare in America but common in parts of the world not usually associated with American Indian origins, and that many of the unexpected HLAs are characteristic of populations sometimes claimed, on the basis of other kinds of evidence, to have had ancient contacts with Americans.  In other words, there seems to be genetic support for the idea of ancient interhemispheric mobility.  I propose that the “non-Indian” HLAs were introduced from the outside at various times between the initial colonizations of the hemisphere and the late fifteenth century A.D.  I also consider other possible explanations, but they seem less likely to me.  The percentage of apparently foreign HLAs averages only 7% in the populations tested so far, and this observation does not contradict the supposition that the founding American populations were overwhelmingly Asian.

Many competent scholars have published evidence for intermittent contact between Native Americans and others over several millennia.  Among the proposals are contacts of Indonesia and Japan with the Pacific coast of America, circumpolar migrations, trans-Atlantic incursions from Africa and the Near East, and numerous interactions with China, India, and Pacific Oceania.[2]  These publications are little known among academics, being largely ignored for reasons recently analyzed by Alice Kehoe (1998:190-207).

The prevailing view of anthropologists is that Native Americans were almost entirely cut off from developments elsewhere after the submersion of Beringia.  Persistence of this belief is due in part to the fact that contrary evidence from diverse areas of scholarship has not been integrated and argued in a sober and convincing manner, so that few professional academicians are aware of more than a fraction of the pertinent literature.  They find it easy to dismiss as isolated curiosities the parts that they know about.  Full synthesis of the evidence for early transoceanic contacts would be an enormous job, one that seems unlikely to be done in the near future. 

Another problem has been the adversarial nature of discussions, tempting participants to take extreme positions.  Advocates of early contact sometimes make claims that go far beyond the evidence, while critics tend to base their rebuttals on preconceptions and outdated information.  It is rare to find informed evaluation of new, unexpected data about possible early voyaging.

The HLA material presented here is based on the compilations of L. Luca Cavalli-Sforza, Paolo Menozzi, and Alberto Piazza in The History and Geography of Human Genes (1994).  I think that the distributions of HLA types, combined with supporting data from other genetic systems, provide strong evidence that some American populations have assimilated significant numbers of foreigners in places where early contacts have been claimed on the basis of other considerations.  I am not aware that anyone has discussed the potential of HLA data in evaluating these proposals except Cavalli-Sforza et al., in passing.

Cavalli-Sforza and his colleagues published The History and Geography of Human Genes to summarize decades of work on distributions of genetic polymorphisms, including those of HLAs.  Their Appendix 2 lists worldwide frequencies of the 29 HLA families for which there are enough data to construct useful distribution maps, but there are many more types that are less well mapped at the present time.[3]  This discussion is limited to those at the A and B loci tabulated by Cavalli-Sforza et al. (hereafter, “CS”).  For my purposes, I will use the term “HLA type” to mean the family designation—e.g., A*1.

The objective of the CS study was to collect and compare, using mathematical models, gene-frequency distributions of world populations as a way of illuminating their population structures.  Although data were gleaned from nearly a thousand publications, the samples are not evenly distributed and some regions are poorly represented, causing conclusions to be less reliable than would have been possible with a more complete database.  The set of genes widely enough documented for their aims comprised 128 alleles from 491 individual populations and 116 aggregates (pp. 393-468); however, there were sufficient HLA data from only 132 individual populations and 78 aggregates, including 30 indigenous American groups.  Only 12 type-A and 17 type-B HLAs were sufficiently well sampled for useful worldwide comparisons.  The problems and caveats associated with the use of whatever data are available from the literature are thoroughly discussed by CS and should be kept in mind by anyone who uses their tables and maps for further analysis.

The set of HLA data has large gaps.  For example, there are no data from Egypt and few from Northeast Asia.  In North America, interpretation is severely limited by the absence of HLA samples from several important linguistic groups.  Cherokee is the only eastern Amerind population in the set, and Navajo is the only representative of the set of populations commonly known as Na-Dene.  Na-Dene is a linguistic term, and some find the category to be without much linguistic basis (a phantom), objecting to its use in classifying populations.  I use it in this report as it is used in much of the cited literature to designate a genetically distinct set of populations of northwestern North America (northern Na-Dene) plus Navajo and Apache (southern Na-Dene).

In South America, there are no samples from Arawakan groups and there is only one sample each from Panoans and Tupians.  In addition to the HLA data presented by CS, I have used results published by Belich et al. (1992) for the Guaraní (Tupian) and Caingáng (Ge) people.  Specific linguistic classifications used in this report are those of Greenberg (1987) but as far as I am aware do not involve any disputed terminology.  Designations are intended mainly for identification.  As Greenberg, Cavalli-Sforza, and others have established, genetic and linguistic distributions are strongly associated, and the reasons for this have been explained in detail by Cavalli-Sforza, Piazza, Menozzi, and Mountain (1989) and by Greenberg (1995), who cited the statistical analysis by Penny, Watson, and Steel (1993).

In general, an individual with several HL antigens will be better able to resist disease than one with fewer.  Many populations have all, or nearly all, of the 29 HLA types listed by CS; however, this is not true of indigenous Americans, who typically lived at population densities too low to maintain HLA diversity.  Genes are apt to be lost whenever a small group splits from a parent population (the founder effect), so two long-separated tribes in the same geographic region may have different sets of a few HLA types at exaggerated frequencies.  CS point out (p. 335) that in the total absence of cross-migration, drift eventually might lead to the survival of only one HLA allele in each population.  Gene loss is minimized in large, stable populations, especially if missing or low-level genes are reintroduced from the outside.  CS think there was gene loss during migrations to America but that later gene flow reintroduced some missing genes (p. 130).
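The loss of alleles through drift described above can be illustrated with a toy simulation.  The sketch below is not taken from CS; it assumes a simple neutral Wright-Fisher model (constant population size, random mating, no selection, mutation, or migration), and the function name, population sizes, and generation counts are hypothetical values chosen only for illustration.

```python
import random


def surviving_alleles(pop_size, n_alleles, generations, seed=0):
    """Count distinct alleles left after neutral drift (toy Wright-Fisher model).

    Assumptions (not from CS): diploid population of constant size,
    random mating, no selection, no mutation, no migration.
    """
    rng = random.Random(seed)
    # 2 * pop_size gene copies, with the alleles spread as evenly as possible.
    copies = [i % n_alleles for i in range(2 * pop_size)]
    for _ in range(generations):
        # Every gene copy of the next generation is drawn at random,
        # with replacement, from the current generation.
        copies = rng.choices(copies, k=len(copies))
        if len(set(copies)) == 1:
            break  # one allele has become fixed; nothing more can change
    return len(set(copies))


# Hypothetical scenarios, both starting with 29 types (the number CS tabulate).
isolated_band = surviving_alleles(pop_size=25, n_alleles=29, generations=5000)
large_population = surviving_alleles(pop_size=5000, n_alleles=29, generations=20)
print("small isolated group:", isolated_band)
print("large population:", large_population)
```

Under these assumptions the small isolated group drifts to a single surviving type, which is the outcome CS describe for populations with no cross-migration, while a large population typically retains most of its types over a short span.  Real HLA loci are also subject to selection, which this sketch ignores.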

American populations listed in CS have from seven to 26 HLA alleles.  Those near the ancient urban societies of Mesoamerica and the Andes have the most, and some marginal tribes of South America have the fewest.  The degree of HLA diversity in a population may be a measure of its former size and cosmopolitan nature.  According to Parham and Ohta (1996:71), the large number of alleles in modern urban populations is “predominantly the result of admixture bringing together alleles that evolved under natural selection in previously separated populations.”

Some American HLA types remain undetected for technical reasons, and some with restricted world distributions were omitted from the CS tabulations (A*24, A*68, B*39, B*48, B*51, B*52).  For these reasons, HLA totals given average 91% rather than 100%.  Of those listed, four types account for 94% of the American HLA-A total and six types account for 93% of the HLA-B total.  I will refer to these ten as the “American” alleles.  Some South American tribes apparently have only these alleles, whereas those near former urban centers tend to have significant percentages of HLAs that now are most common in the Near East, India, Africa, Northwest Europe, or Southeast Asia (including Pacific Oceania).  I will refer to these as the “non-Indian” or atypical alleles.

Table 1 ranks the American samples according to the number of alleles reported.  Except for Greenland Eskimos, who seem to have acquired some European genes, those with 16 to 26 alleles (first column) are from present or former urban populations.  Those with ten or fewer types (third column) are exclusively South American samples from comparatively isolated places. Locations of the South American populations are shown in Map 1. Tribal/group designations used throughout this report are normally those of the original authors, even though a few names may no longer be in use. 



Table 1.  Number of HLA families tabulated by CS for American indigenes.[a]

Quechua            24    Ticuna      13    Trio         10
Cherokee           22    Navajo      13    Guaraní      10
Greenland Eskimo   21    C Amerind   12    Parakana     10
Eastern Maya       21    Warao       12    Mataco       9
Mapuche            19    NW Eskimo   12    Cayapó       9
Araucano           18    Aymara      12    Oyampí       9
Papago             17    Atacama     11    Yanomama     9
Pima               16    Yupik       11    Yupa         9
Zuni               16    E Eskimo    11    Bari         8
                                           Makiritaré   8

[a] Caingáng and Guaraní from Belich et al. (1992).



Cherokee origins are controversial.  Cherokees are genetically diverse and may, at least in part, have migrated to the U.S. Southeast from Mesoamerica.  According to CS's genetic-distance table (p. 327), they are closest to, and not statistically different from, the Nahua, and Spuhler (1979) linked them genetically with Indians of the American Southwest.[4]  Overall, they are more like Amerindians of the Southwest than like neighboring Algonquians (CS, p. 327).  The HLA data suggest to me that they came from Mesoamerica or perhaps even Venezuela.  But without HLA data from other eastern Amerindians, it is impossible to tell whether Cherokees are more diverse than the others.



American and Other HLAs


Of the 29 HLA types that CS listed, nine occur widely in the Americas and all but one of these have their highest frequencies there.  They are likely to have been present in the earliest populations.  A tenth (B*27) has its highest frequencies in the Navajo and Eskimo samples.  These ten “American” alleles have interesting distributions outside of America that are clues to American origins and migration paths, but that topic will not be treated here except for brief comments in the appendix.  The focus of this report is on the possible significance of the 18 “non-Indian” HLAs.  A 29th allele, B*41, so far seems to be absent in America.


Table 2 lists the American samples in decreasing order of their “non-Indian” content.  The numbers show percentages of these alleles in the total tabulated for that sample.  For example, 24% of the Nahua total comes from 16 HLA alleles that seem “out of place” in America.  The Nahua are at one extreme of the spectrum, while six South American tribes are at the other, showing none of the atypical alleles.  Except for the northwest Canadian Eskimos and the Cherokees, populations with 10% or more are clustered either in the Andean region or in southwestern North America.



Table 2.  “Non-Indian” HLA content of American samples, by percent.[a]

Nahua                  24.0
Cherokee               17.4
Mapuche                17
Atacama                15.8
NW Canadian Eskimo     14.9
Araucano               13.4
Papago                 12.4
Eastern Maya           11.5
Quechua                11.5
C Amerind composite    10.9
Pima                   9.8
Caingáng               9.4
E Canadian Eskimo      7.6
Navajo                 7.3
Greenland Eskimo       4.4
Oyampí                 4.0
Aymara                 3.7
Zuni                   2.4
Warao                  2.2
Guaraní                2.1
Inupik                 2.1
Ticuna                 1.7
Yupik                  1.1
Trio                   0.7
Parakana               0.5
Yupa                   0.1

[a] Those with exclusively “American” HLAs are: Bari, Cayapó, Emerillon, Mataco, Makiritaré, and Yanomama.
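As an arithmetic check on Table 2, the short sketch below recomputes the average "non-Indian" share.  It assumes that the 6-7% average quoted in the abstract is a simple unweighted mean over all 32 samples (the 26 listed plus the six with none); whether the original figure was weighted by sample size is not stated.  The variable names are illustrative only.

```python
# Percentages of "non-Indian" HLAs transcribed from Table 2.
non_indian_pct = {
    "Nahua": 24.0, "Cherokee": 17.4, "Mapuche": 17.0, "Atacama": 15.8,
    "NW Canadian Eskimo": 14.9, "Araucano": 13.4, "Papago": 12.4,
    "Eastern Maya": 11.5, "Quechua": 11.5, "C Amerind composite": 10.9,
    "Pima": 9.8, "Caingáng": 9.4, "E Canadian Eskimo": 7.6, "Navajo": 7.3,
    "Greenland Eskimo": 4.4, "Oyampí": 4.0, "Aymara": 3.7, "Zuni": 2.4,
    "Warao": 2.2, "Guaraní": 2.1, "Inupik": 2.1, "Ticuna": 1.7,
    "Yupik": 1.1, "Trio": 0.7, "Parakana": 0.5, "Yupa": 0.1,
}
# Samples named in the table note as having only "American" HLAs.
american_only = ["Bari", "Cayapó", "Emerillon", "Mataco", "Makiritaré", "Yanomama"]

# Assumption: every sample counts equally, regardless of its size.
values = list(non_indian_pct.values()) + [0.0] * len(american_only)
mean_pct = sum(values) / len(values)
at_least_10 = sorted(name for name, pct in non_indian_pct.items() if pct >= 10.0)

print(f"{len(values)} samples, unweighted mean {mean_pct:.1f}%")
print("10% or more:", ", ".join(at_least_10))
```

On this assumption the mean comes out at about 6.5%, within the 6-7% range given in the abstract, and ten samples reach 10% or more.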



The 18 atypical alleles account for only 6% of the American HLA-A total and 7% of the HLA-B total.  Seven of these are termed “Eurasian” by CS (p. 130) and four are said to be characteristic of Southeast Asia or Pacific Oceania (p. 369). CS make virtually no comment on these in their chapter on the Americas, because they are concerned with main effects rather than with anomalies.  However, both the Eurasian types (A*1, A*3, A*32, B*7, B*8, B*12, B*14) and the Oceanic types (A*10, A*11, B*13, B*22) have distributions in America that seem to reflect pre-Columbian contacts of the kinds that have been advocated on the basis of other kinds of evidence.  There are other possible explanations that must not be ignored, however.  One is that the first Native Americans possessed all of the HLA alleles and that some were lost, leaving a random pattern with no useful information about prehistory.  Another is that reintroduction of missing genes took place recently through unrecorded post-Columbian contacts.[5]  This is a serious problem and requires careful case-by-case study before conclusions are possible.


Until about 30 years ago some largely “unmixed” American populations could be studied, but CS used a few samples thought to have as much as 10% of recent admixture.  They say (p. 341) that “In general, we have tried to avoid using populations in which admixture of some magnitude was suspected, but it was impossible to avoid mixed populations entirely without introducing an unwarranted bias.”  Some anomalies may be explainable as recent admixtures, but I think that most are not.  The apparently foreign HLA alleles are usually less characteristic of Spain, Portugal, or West Africa than of places alleged to have had earlier contact, such as Pacific Oceania, North Africa, or Southwest Asia, and in many instances other “marker” genes of modern European and West African populations are absent.  Also, veteran linguistic scholars such as Key (1999), Foster (1999), and Stubbs (1998) are advancing reasons to think that elements common to Afro-Asiatic and Austronesian languages were present in certain Mexican and South American Indian languages long before post-1492 contact.  Nevertheless, the possibility of modern admixture and other explanations must be kept in mind.

The CS volume is masterful.  Except for Mourant, Kopec, and Domaniewska-Sobczak's pioneering work of 1976 (hereafter, “Mourant”), there has been nothing like it.  However, its very scope required selectivity and minimal discussion of several interesting distributions and relationships, and the authors expressed their hope that others will use the database for further investigations.  In the discussion of HLA data, for example, there is no interpretation of the world distributions of B*21 or A*33, which (combined) contribute a fifth of the “non-Indian” data in America.

The apparently Arabian or North African B*21 reaches frequencies of about 10% in three samples of Uto-Aztecan speakers, yet CS say only that it peaks in the Caucasian-occupied portions of Africa (p. 187), that it averages 1.5% in America with a “maximum above 10% in the extreme southwest of the United States” (p. 334), and that it is absent from South America (p. 369)—the last despite their data showing traces in the Mapuche, Araucano, and Yupa samples.

A*33 seems to trace movement of a Near-Eastern population to Southeast Asia and South America,[6] but CS do not mention this.  They say only that A*33 peaks in the Middle East and in Southeast Asia (pp. 288, 247) and that it is one of 13 alleles occurring in America at frequencies “significantly different from zero” (p. 334).  Yet, the key to the synthetic maps summarizing the statistical treatment (p. 338) shows A*33 contributing 70-80% to the second principal component, with its strongest effect in eastern North America and Panama.  Interpretation is left to others.  Like B*21, A*33 is said to be absent from South America, despite a 5% level in the Quechua sample and a 3% level in the Aymara sample.  These erroneous statements are probably based on the computer-smoothed distribution maps from which the actual data have disappeared.  As recognized by CS, it will be profitable for others to examine parts of the data in more detail than they were able to do.

The authors seem to accept conventional ideas about the peopling of the Americas, avoiding much mention of possible post-initial-settlement gene flow from the outside.  Negroid and Caucasoid genes are assumed to be recent acquisitions.  This is understandable in discussions of eastern Venezuela, the Guianas, and eastern North America, where centuries of admixture are well documented, but readers who know the evidence for more ancient contacts may find the commentary inadequate.  For example, one synthetic map (p. 339) shows apparent African admixture near the northeast coast of South America and in the southwestern United States.  The South American admixture is considered “likely to have taken place,” but there is no comment on the Southwestern phenomenon, for which the conventional model provides no explanation.  Acknowledgement that several early explorers reported seeing “Negroes” near South American coasts and in Central America would have been helpful.[7]  The only departure from orthodox views appears on page 340, where Polynesian influence is invoked to explain an anomalous area on a color map of the Americas “in the middle of the Andes near Bolivia and Peru” caused by atypical data in several genetic systems for the Atacama, Mapuche, Araucano, and Aymara samples south of Lake Titicaca.[8]  The use of genetic data to reconstruct population movements is at an early stage.  Events that shaped present distributions are complex and controversial, but it is my view that reconstructions that ignore possible pre-Columbian gene flow between Native Americans and others from overseas will prove defective and ultimately will be revised.



Classification of Atypical HLAs


To organize this presentation, I have grouped the “foreign” HLAs into three somewhat arbitrary categories: those now concentrated in North Africa or Southwest Asia (Afro-Asiatic), those now prevalent in Northwest Europe (“European”), and those with their highest frequencies in South and Southeast Asia and the Pacific islands (“southern Asian”).  These assignments are not completely clear-cut, because of overlapping distributions.  For example, CS term A*1 “European” but it appears to me to have been dispersed from North Africa or Southwest Asia, so I have assigned it to the “Afro-Asiatic” set.  The same may be true of A*33, about which CS say little.  I attribute its present distribution to Asian intrusion into both Africa and America and have grouped it with the southern Asian HLAs.  These problems are purely organizational and do not affect the central demonstration that unexpected genes are present in America.


Table 3 lists all 18 “non-Indian” alleles in decreasing order of their contribution to the American total.  The most important is Afro-Asiatic B*21, which contributes 10.4% of the atypical HLAs found in American samples.  Six alleles account for more than half of the total.  The nine Afro-Asiatic types together contribute 47%; the five southern Asian types, 28%; and the four European types, 25%.  These percentages are only approximate as they stand and would doubtless change with more complete sampling or with changes in classification.


Table 3.  "Non-Indian" HLA alleles, in order of importance.

Allele   % Contribution   Designation
B*21     10.4             Afro-Asiatic
A*33     9.6              Southern Asian
B*7      9.1              European
A*30     8.0              Afro-Asiatic
A*32     7.9              Afro-Asiatic
B*14     7.0              Afro-Asiatic
B*12     6.9              European
A*1      6.5              Afro-Asiatic
B*22     6.4              Southern Asian
A*11     5.4              Southern Asian
A*3      4.7              European
B*8      4.5              European
A*10     4.0              Southern Asian
B*17     2.6              Afro-Asiatic
B*13     2.4              Southern Asian
A*29     2.0              Afro-Asiatic
B*18     1.9              Afro-Asiatic
B*37     0.6              Afro-Asiatic
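The category totals quoted in the text (47%, 28%, and 25%) and the statement that six alleles account for more than half of the total can be checked directly against Table 3.  The sketch below is only a tally of the tabulated values; the 9.6% Southern Asian row is taken to be A*33, an inference from the text, which assigns A*33 to the southern Asian set and lists no other southern Asian allele that is missing from the table.

```python
from collections import defaultdict

# (allele, % contribution, designation) transcribed from Table 3.
# The 9.6% "Southern Asian" row is assumed to be A*33 (inferred from the text).
table3 = [
    ("B*21", 10.4, "Afro-Asiatic"), ("A*33", 9.6, "Southern Asian"),
    ("B*7", 9.1, "European"), ("A*30", 8.0, "Afro-Asiatic"),
    ("A*32", 7.9, "Afro-Asiatic"), ("B*14", 7.0, "Afro-Asiatic"),
    ("B*12", 6.9, "European"), ("A*1", 6.5, "Afro-Asiatic"),
    ("B*22", 6.4, "Southern Asian"), ("A*11", 5.4, "Southern Asian"),
    ("A*3", 4.7, "European"), ("B*8", 4.5, "European"),
    ("A*10", 4.0, "Southern Asian"), ("B*17", 2.6, "Afro-Asiatic"),
    ("B*13", 2.4, "Southern Asian"), ("A*29", 2.0, "Afro-Asiatic"),
    ("B*18", 1.9, "Afro-Asiatic"), ("B*37", 0.6, "Afro-Asiatic"),
]

# Sum the contributions within each regional designation.
by_group = defaultdict(float)
for _allele, pct, group in table3:
    by_group[group] += pct

# Combined share of the six largest contributors.
top_six = sum(sorted((pct for _a, pct, _g in table3), reverse=True)[:6])

for group, total in by_group.items():
    print(f"{group}: {total:.1f}%")
print(f"six largest contributors: {top_six:.1f}%")
```

The tallies agree with the rounded figures in the text, and B*21 and A*33 together contribute 20.0%, the "fifth" mentioned earlier.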



The atypical HLAs of the eastern Canadian and Greenland Eskimos are predominantly European, and those of the Nahua and Eastern Mayans are heavily southern Asian.  Those with mainly Afro-Asiatic types are the Mapuche, Atacama, Papago  (Tohono O’odham), Pima, Oyampí, Warao, and a composite sample of Central Americans (“Central Amerinds”).[9]  Atypical HLAs of the Cherokee and Eastern Mayans are mainly of the Afro-Asiatic and southern Asian types, while those of the Araucano and northwest Canadian Eskimos are largely European and southern Asian.  The Quechua display the greatest diversity, with significant levels from all three sets (see Table 11, in the appendix).

In the following discussion, I review each category and provide comments on individual alleles as well as two tables.  One table shows the major American occurrences, with minor ones indicated in footnotes.  The other table shows the highest worldwide frequencies CS listed, including American data.  The number of world entries has been arbitrarily limited to about ten Old World populations for each allele, which is sufficient in most cases to give a good idea of its distribution.  The reader should consult the CS volume for more detail. Supplementary Tables 11 and 12 are in the appendix.  Table 11 is an expansion of Table 3, subdivided by regional designation, and Table 12 ranks American samples in each category.

CS represent some important populations by multiple samples—e.g., Bantus and Lapps.  For these I have used the average values that CS tabulated unless a single high frequency seemed especially noteworthy.  In a few cases, extreme genetic drift seems to have caused exaggerated frequencies of some alleles in small populations, such as those of Sardinia and of Easter Island.  Also, space has been saved in Tables 5 and 9 by using average frequencies of A*1, B*7, B*8, and B*12 for three European composite samples in which values are tightly bunched.  The composites are designated as “British” (Ireland, Scotland, Wales, England), “Germanic” (Germany, Netherlands, Austria, Switzerland), and “Scandinavian” (Sweden, Norway, Denmark).

Many references cited in the endnotes discuss evidence for early interhemispheric travel, but they represent only a small sample.   Some topics have been discussed in far too many publications to be listed here, but the better ones may be found easily through the annotations of Sorenson and Raish (1996), which contain much more information than is included here.  Readers familiar with this literature can form an opinion about the degree of fit between the distributions of atypical HLAs and the main proposals made by students of early voyaging.  Older references are included, to show how early some perceptive postulates were made.  It is important to realize that some scholars who compiled striking trait comparisons strongly opposed prehistoric contact as an explanation.

It is unfortunate that ideology has been allowed to color discussions of early voyaging and that participants are often assumed to have a political agenda.  It has become fashionable to find “subtle racism” or “inherent racism” in any presentation of evidence for early contacts between Native Americans and others.  This school of thought seemingly attaches less importance to evaluation of the evidence than to the supposed ideological consequences of interpretations.  One view may be considered comfortable and therefore correct, while another is “disturbing” and therefore false.  Members of this school take offense at claims of early contact, saying that such claims deny Native Americans the capacity to have developed their own civilizations without “help.”  This position is so devoid of logic that investigators trained in scientific disciplines are incredulous when they first encounter it.   Nevertheless, it is a stance that has become entrenched through repeated assertion.

David Kelley has commented on this phenomenon a number of times.  His remarks in a Smithsonian publication (1995) may be the most insightful analysis to date.  He pointed out how few anthropologists are aware of contributions by competent scholars from fields other than anthropology, and advised: “To gain some idea of the problems of appraising intercontinental relationships, one should study the comprehensive annotated bibliography of Sorenson and Raish (1990), the only bibliography I have ever sat down and read right through.”  A second edition of Sorenson and Raish appeared in 1996, and both editions have been invaluable resources for me.



Support From Other Genetic Systems


Following each discussion of HLA alleles are comments about supporting data from other systems.  For example, it is noted that five American populations with seemingly Afro-Asiatic HLAs also have genes in the immunoglobulin, Duffy, Rhesus, and Kell systems that are rare outside of Africa.  Other examples are taken from the transferrin, acid phosphatase, and adenylate-kinase compilations of Mourant and of CS.  To my knowledge, these things have not been pointed out before in the context of ancient contacts between Native Americans and others.  They may be only curiosities, but I would like to see a rigorous examination by specialists who are open to the idea of outside influences.  A few more detailed comments on certain genes are appended, but a comprehensive review of the atypical genes is beyond the scope of this report.

Recent results from studies of mitochondrial DNA (mtDNA) are not considered here except for a brief summary in the appendix.  Because only females transmit mtDNA, only the most prevalent types survive for long.  Work so far has identified three basic female lineages with close relatives in northeast Asia (now designated A, C, D), one associated with southern Asia (B), and a possible fifth (X, proposed to be called E) that seems European.  Results from both nuclear and mtDNA studies are stimulating new and controversial hypotheses too rapidly for confident inclusion in this report.   

An exhaustive study of all American data would go far toward establishing a more reliable classification of indigenous groups, their order of entry, and migration paths.  CS’s brief sketch (pp. 340-342) is an excellent start.  My own studies have convinced me that remnants of the earliest colonists are concentrated in southernmost South America and that later arrivals, coming by sea, occupied the coasts first and then moved inland, especially up the South American rivers and the Mississippi.

Using genetic information by itself to prove assimilation of transoceanic voyagers is difficult, even when data from several different systems are in agreement.  Because human distributions have changed with time, arguments based on the present situation are not convincing unless combined with other kinds of evidence, as I have tried to do in some of the endnotes.  I am not competent to make a critical assessment of all of the cited material, although much of it has been scrutinized by specialists and found to be acceptable.  Often there is not enough information to judge the validity of a given finding except in the general sense that it is consistent with other evidence considered important by the more experienced scholars.



Apparent Afro-Asiatic HLAs in America           


Table 4 shows the occurrence, in American populations, of nine HLA types that currently display most of their highest frequencies in Southwest Asia, North Africa, or northern India.  They account for 47% of the “non-Indian” HLA data in Native America, but these occurrences are concentrated in five populations of southwestern North America and three of the Southern Andes.  Table 5 shows the highest world frequencies listed in CS’s Appendix 2.  The following observations regarding HLA distributions are arranged in decreasing order of the relevant alleles' importance in America.  Numbers in parentheses are percentages contributed to the total “non-Indian” HLA content of 32 American samples.



Table 4.  Frequencies (%) of “Afro-Asiatic” HLAs in American samples.


Sample      B*21(a)  A*32     A*30(b)  B*14     A*1(c)   B*17(d)  A*29(e)  B*37     B*18(f)
C Amerind   7.5      6.5      6.5      -        -        -        -        -        -
Papago      12.5     0.7      0.7      -        -        1.6      -        -        -
Pima        9.4      -        -        1.3      -        -        -        0.3      -
Nahua       3.5      0.5      -        0.5      2.0      3.5      n.d.     1.5      2.5
Navajo      2.5      1.5      8.5      -        -        -        -        -        -
Cherokee    -        3.0      5.0      1.0      2.0      1.0      1.0      -        -
East. Maya  0.3      0.7      2.8      1.6      0.9      0.3      -        -        1.6
Mapuche     1.7      9.0      -        7.8      4.7      -        -        -        -
Atacama     -        n.d.     n.d.     10.7     7.1      -        -        -        -
Quechua     -        -        1.0      2.0      1.0      1.0      1.0      -        2.0
Araucano    0.5      n.d.     n.d.     -        0.5      -        n.d.     -        0.5
Grd Eskimo  0.2      -        -        0.2      1.3      0.6      -        -        0.2
Oyampí      -        6.6      -        -        -        -        -        -        -
Parakana    -        0.5      -        -        -        0.5      -        -        -
Warao       -        -        3.4      -        -        0.5      -        -        -
Zuni        -        0.6      -        -        -        -        0.6      -        -

(- = no value listed.)











a Yupa, 1.0%.

b Guaraní, 1.5; Trio, 0.7%.

c Northwest Eskimo, 2.8; Aymara, 1.0; Inupik 1.0; Ticuna, 0.2%.

d Eastern Eskimo, 0.3%.

e Yupik, 1.0; Guaraní, 0.5%.

f Northwest Eskimo, 0.2%.



Table 5.  Highest world frequencies (%) of Afro-Asiatic HLAs

(New World groups/composites underlined).







B*21                    A*32                    A*30
Saudi Arabia    22.2    Samoa(b)        36.4    Samoa(b)        26.0
Tigre           21.3    Tuareg          15.8    San             23.5
Jordan-Palest.  16.0    Honshu(c)       12.0    Central Bantu   21.3
Papago          12.5    Mapuche         9.0     Sardinia        16.3
Tuareg          12.1    Sardinia        8.3     Biaka(d)        12.6
Berber          12.0    Punjab          6.8     Ibo             10.7
Mbuti(a)        10.7    Oyampí          6.6     Basque          8.9
Iraq            9.5     C Amerind       6.5     Navajo          8.5
Pima            9.4     Khoi            6.5     Saudi Arabia    6.6
Turks           8.9     Greece          6.1     C Amerind       6.5
C Amerind       7.5     Brahman         5.9     Greece          5.4
Lebanon         6.8     Yugoslavia      5.8     Cherokee        5.0





North China


a Pygmies of Zaire.

b Samoan outliers; absent elsewhere in Polynesia.

c Chubu sample.  Japanese samples are highly variable; the Kyushu sample displays 8%; overall, Japanese samples average only 2.7%, the Ainu 0.3%.

d Pygmies of Cameroon.



Table 5, continued.


B*14                      A*1                       B*17
Atacama           10.7    British(e)        22±3    Koya (India)      25.5
Portugal          9.5     Jordan-Pales.(f)  19.0    Khoi              24.0
Sardinia          8.2     Berber(g)         18.0    Bantu             23.8
Mapuche           7.8     Scandinavia(h)    16±4    Ibo               17.8
Berber            7.0     Germanic(h)       15±1    Dravidian         14.4
Iraq              6.2     Tunisia           15.0    Sardinia          13.5
Ireland           5.8     Punjab            15.0    Altaic comp.      11.4
Spain             5.5     Central India     14.5    Sherpa            10.9
Tigre             5.3     Czechoslovak.     14.5    Pygmies           10.8
Jordan-Palest.    5.0     Tibet             14.4    Tuva              10.0
Central Bantu     4.4     Belgium           13.9    Basque            9.0
France            4.3     Hungary           13.8    Berber            8.0
Tunisia           4.0     Spain             13.1    Malaysia


e Average of four; see text.  Irish highest (26.4%).

f Also, Lebanon, 12.7; Iraq, 11.6; Iran, 8.8%.

g Also, Tigre, 11.7; Tuareg, 7.2%.

h Average of four; see text.



Table 5, continued.


A*29                    B*37                    B*18
Basque          14.2    Pygmies         5.4     Sardinia        28.2
Pygmies         11.2    Marathi         4.4     Basque          13.7
Spain           7.7     Iran            3.3     Sumatra         11.9
Lebanon         7.4     Sweden          2.1     Greece          11.6
Bantu           6.8     Mande           1.7     Italy           9.3
France          6.7     Nahua           1.5     Bali            8.9
Vietnam         6.0     Portugal        1.4     Berber          8.0
Uzbekistan      5.7     Belgium         1.4     Yugoslavia      7.8
Scotland        5.2     Russia          1.3     Lebanon         7.3
Caingáng        5.2     North China     1.2     Malaysia        7.3
Iraq            5.1     Greece          1.2     Melanesia       6.9
Portugal        5.0     Italy           1.2     Austria         6.8
Wales           4.9     England         1.2     Hungary         6.7



B*21 (10.4%).  Old World occurrences of B*21 are concentrated in regions of strong Arab presence or influence.  Frequencies of more than 15% are confined to populations in Saudi Arabia, Ethiopia (Tigre), and Jordan-Palestine, but influence extends across North Africa and into Spain, Portugal, and Italy (5-6%). In America, 84% of occurrences are clustered in four Uto-Aztecan populations (Papago, Pima, Nahua, and a Central Amerind composite).[9] The Papago have the fourth-highest frequency in the world, comparable to that of Tuaregs and Berbers.[10]  CS’s Central Amerind composite sample is unique in that all of its “non-Indian” HLAs are of the Afro-Asiatic set (B*21, A*30, A*32).  Thus, significant Afro-Asiatic contact with western Mexico and/or the Caribbean almost certainly occurred, probably from Arabia or North Africa.[11]
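
The 84% figure can be checked against Table 4.  The following is a minimal sketch, assuming that the B*21 column of Table 4 together with the Yupa value in note a covers every American occurrence, and that "occurrences" means the sum of the sample frequencies (the text does not state how the figure was derived).  The helper name `share` is hypothetical.

```python
# B*21 frequencies (%) from Table 4, plus the Yupa value from table note a.
# Assumption: "occurrences" is taken to be the sum of sample frequencies.
b21 = {
    "C Amerind": 7.5, "Papago": 12.5, "Pima": 9.4, "Nahua": 3.5,
    "Navajo": 2.5, "East. Maya": 0.3, "Mapuche": 1.7, "Araucano": 0.5,
    "Grd Eskimo": 0.2, "Yupa": 1.0,
}
uto_aztecan = ["Papago", "Pima", "Nahua", "C Amerind"]


def share(freqs, groups):
    """Fraction of the summed frequencies contributed by the named groups."""
    return sum(freqs[g] for g in groups) / sum(freqs.values())


b21_share = share(b21, uto_aztecan)
print(f"{b21_share:.0%} of the summed B*21 frequency lies in the four samples")
```

Under these assumptions the four samples contribute 32.9 of a total of 39.1 percentage points, which rounds to the 84% quoted above.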

Traces of B*21 also appear proximate to the Pacific coast of South America (Mapuche, Araucano) and near Lake Maracaibo of Venezuela’s Caribbean littoral (Yupa) but not in CS’s fifteen other South American samples.  The Mapuche sample has the highest total content of the Afro-Asiatic HLA alleles reported in America (13%; see Table 12).  Much evidence of other kinds has been presented for an early North African presence in parts of South and Central America.[12]

A*30 (8.0%).  The relatively rare A*30 allele is sparsely sampled, and CS have little to say about its peculiar distribution.  I interpret it as a signature of a Caucasoid population that reentered sub-Saharan Africa at an early date, then reached America by voyages to the Caribbean and parts of South America, and also entered from Asia with the ancestral Na-Dene.  Highest Eurasian values are found in Sardinia, Spain (Basques), Saudi Arabia, Greece, and northern China.  But four of the globally highest levels appear in Africa, especially among the San (Bushman) of Botswana.  This fits CS’s finding (pp. 175, 176) that the San have about 50% (Asiatic) Caucasoid genes.  There are no data from the Caucasus or Southwest Asia that might help define A*30’s original range.  CS found no A*30 data from North Africa, but computer interpolation indicated 6%.  A*30 is absent from the Ainu, Eskimo, and Lapps, as well as from populations of Australia, New Guinea, and Pacific Oceania except for one exaggerated frequency found in Samoan outliers, possibly the result of one or a small number of early voyages.

The Navajo have the highest American frequency of A*30, presumably reflecting the “Dene-Caucasian” expansion postulated by Ruhlen and others,[13] which links Athapaskan with Sino-Tibetan and other Dene-Caucasian languages.  If this postulate is correct, A*30 should be present in many indigenous populations of western Canada, but at present no data are available.

Other significant North American occurrences are in the Central Amerind composite, the Cherokee, and the eastern Maya,[14] supporting the proposition that the Cherokee entered North America from the south.  CS stated (p. 334, on the basis of their map on p. 249) that American A*30 peaks in the southeastern United States (where Cherokee is the only sample).  I postulate that A*30 was carried across the Atlantic to the Caribbean and also to the Guaraní and Trio people near the mouths of the Plata and Amazon rivers, then up the rivers to the Quechua.  It appears in only four of 14 South American samples, being strongest (3.4%) in the sample from the Warao of the Venezuelan coast.  The Warao were skilled canoe voyagers (Wilbert 1977) and are Paez-speakers like the Timucua of Florida (Greenberg 1987).  Granberry (1991) also found Timucua to be close to Warao but containing (Arawakan) Maipuran elements as well.  He postulated that the Timucua migrated to the southeast directly by sea from the region of Puerto Hormiga, about 2000-1500 B.C.  It would be valuable to know whether Indians of the southeastern United States other than the Cherokee possess the A*30 allele.

A*32 (7.0%).  The A*32 allele seems to indicate a Mediterranean or specifically Aegean impact in the Caribbean region (including on the Cherokee) as well as on Tupians of the lower Amazon (Oyampí and Parakana).  It seems to connect this set and the Central Amerind composite with northern India, Sardinia, the Tuareg of Algeria, and with populations around the Adriatic Sea in Greece, Yugoslavia, and Italy.[15]  A*32 is absent from other South American samples except the Mapuche.  Carriers of A*32 seem also to have reached parts of Japan, where isolated spikes appear in contrast to the low Japanese average.[16]  An unexpected link between East Asia and India has been discovered recently through studies of HLA*2 subtypes (Narinder Mehra, reported in Anonymous 1997) and also through virology.  Miura, et al. (1994) have identified two varieties of human T-lymphotropic virus type HTLV-I that connect India and Japan, and one of these has also been transmitted to Colombia and the Caribbean.[17]  Migrations that could account for these data could also explain why dolmens and other “megalithic” phenomena, present in India by 1000 B.C., then appeared in Korea, Japan and Colombia, about 600 B.C. (described by Joussaume 1985:267-94; see also map in Scarre 1988:35).

A*32 levels in the Mapuche, Oyampí, and the Central Amerind composite samples are among the top nine in the CS tabulation (7-9%), and the Tupian Oyampí near the mouth of the Amazon River have the second highest American frequency.  If this is not an artifact of sampling, it implies connection of Tupians with the Mapuche by river commerce.[18]  Unfortunately there are no data from the Atacama or Araucano, whose histories are intertwined with that of the Mapuche (Araucano subsumes Mapuche).

Both A*32 and A*30 are found at significant levels in Greece, Sardinia, and in the Central Amerind composite.  They also appear at anomalously high frequencies in Samoan outliers but are not documented elsewhere in Pacific islands.  This may reflect limited exploration of the Pacific by Mediterraneans who otherwise left few traces except (controversial) petroglyphs.  The strong presence of A*32 but not A*30 among the Mapuche and Oyampí indicates a different source, perhaps India, for A*32 in the Amazon region.

B*14 (7.0%).  The B*14 allele appears to link the western Mediterranean with the Atacama, Mapuche, and Quechua.  CS describe it as peaking in the Middle East, Sardinia, and southern Spain, but it actually reaches its highest world frequency (11%) in the Atacama population of Chile.  The Mapuche sample, with 8%, is fourth, after those from Portugal and Sardinia.  Other high frequencies occur in North Africa, Iraq, and Ireland.
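
The rankings quoted here follow directly from the B*14 column of Table 5.  A minimal sketch, using only the values listed in that table (the function name `rank_of` is hypothetical):

```python
# Highest world frequencies (%) of B*14, copied from Table 5.
b14 = {
    "Atacama": 10.7, "Portugal": 9.5, "Sardinia": 8.2, "Mapuche": 7.8,
    "Berber": 7.0, "Iraq": 6.2, "Ireland": 5.8, "Spain": 5.5, "Tigre": 5.3,
    "Jordan-Palest.": 5.0, "Central Bantu": 4.4, "France": 4.3, "Tunisia": 4.0,
}


def rank_of(freqs, name):
    """1-based rank of `name` when samples are sorted by descending frequency."""
    ordered = sorted(freqs, key=freqs.get, reverse=True)
    return ordered.index(name) + 1


print(rank_of(b14, "Atacama"), rank_of(b14, "Mapuche"))
```

The same helper can be applied to any other column of Table 5, for example to confirm the Papago position in the B*21 list.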

The presence of B*14 in the Andes might be attributed to recent Iberian influence, except that if this were the case it should be common throughout South America.  In fact, outside of the Andean cluster it has been reported only in a Caingáng sample near the mouth of the Río de la Plata.  It seems likely to me that B*14 was carried to its present locations by a more ancient population, with roots in the Near East.[19]  It appears also in eastern Maya, Nahua, and Cherokee populations; but unlike A*32, B*14 was not found in the Papago, Zuni, Navajo, Central Amerind composite, Oyampí, or Parakana samples.

Most of the “non-Indian” HLA content of the Atacama and Mapuche samples comes from B*14 and A*1, both of which are important in North Africa.[20]  These two alleles have similar distributions in America and are likely to have been brought by the same people.  The Atacama also have an 8% frequency of the African FY*0 allele of the Duffy system, which is found as well among the Ge-speaking Cayapó of Brazil.[21]

A*1 (6.5%).  CS refer to the A*1 antigen as “typically European” (p. 288); but, like B*14, it is equally important in North Africa and in the Near East.  A*1 reaches high frequencies in the Caucasoid regions of the world, but is rare in East Asia, America, and Oceania.  Both A*1 and B*14 display high frequencies among Berbers, Tunisians, the Tigre, and populations of Jordan, Palestine, Spain, and Ireland, and both reach significant levels in the Andean cluster.  A*1 has apparently not been reported elsewhere in South America.

Inexplicably, CS say that A*1 is absent in America (p. 369), despite its presence in six of their Andean listings as well as in six others in the hemisphere. As in the case of B*14, the highest American levels of A*1 were found among the Atacama and Mapuche (7%, 5%), with lower frequencies from the nearby Quechua, Aymara, and Araucano samples, and also among the Ticuna.  If this were a consequence of recent contact with the Spanish or Portuguese—who carry 12-13% of A*1—it would be difficult to explain the absence of A*1 in Brazil.  I suspect that its Andean distribution is due to an older, unrecognized contact with the Near East.[22]

In North America, the highest level of A*1 (2.8%) came from a sample of northwestern Canadian Eskimos.  The Inupik of Alaska had 1% as well, but the allele was not found among eastern Canadian Eskimos.  One possibility is that A*1 (but not B*14) was brought to northwestern America by western Asians related to the later Scythians, Tocharians, or Celts.[23]  A*1 is probably scattered throughout indigenous populations of the North American northwest who have yet to be tested.  The Greenland Eskimos have traces of several unexpected HLAs, including 1.3% of A*1, that probably came from later contacts with Europeans.

B*17 (2.6%).  The highest frequencies of B*17 occur in Africa and India, in an apparently ancient pattern.  Low levels were found in ten American samples, but more than half was in two Uto-Aztecan populations (Nahua and Papago).  Traces in the Orinoco delta (Warao) and in the lower Amazon (Tupian Parakana) again suggest trans-Atlantic contact.

A*29 (2.0%).  A*29 is spread thinly over Southwest Asia, Africa, and Western Europe, with highest current frequencies among Basques and certain African pygmies.  In America, A*29 has been reported only in southern Brazil (Caingáng 5%, Guaraní 0.5%) and for the Cherokee and Yupa samples (1% each).

B*18 (1.9%).  B*18 appears to be an ancient Caucasoid antigen, linking Basques, Berbers, Sardinians, Greeks, and Southern Europeans.  It also went along the Asian coast, especially to Indonesia, then apparently to Ecuador and Mexico.  The overall distribution suggests involvement of Mediterranean seafarers.  In America, it appears above the 1% level only among the Nahua, Quechua, and eastern Maya, with traces among the Araucano, northwest Canadian Eskimos, and the Greenland Eskimo.  I suggest that it came to the Pacific coast by way of Indonesia.[24]

B*37 (0.6%).  The B*37 allele is scattered worldwide at low frequencies, with highest present levels being found in Africa and western Asia.  In the Americas, B*37 is found only in the Uto-Aztecan Nahua and Pima samples and in that of the Caingáng of Brazil, supporting the postulate of contacts from Africa, India, or Indonesia.  It is somewhat surprising to find that the Nahua sample had the sixth-highest world frequency of this relatively rare antigen.



Afro-Asiatic Genes from other Systems

When a small foreign population is absorbed by a larger one, survival of new low-level genes over time may be a matter of chance.  A few genes (other than those determining HLAs) characteristic of Africa or the Near East seem to have survived in America.  Several interesting examples occur in populations for which we as yet have no HLA data, and some of these are mentioned in the appendix; but a comprehensive treatment of the topic is beyond the scope of this report.

Four genes seldom found outside of Africa are the V antigen of the Rhesus system, allele FY*0 of the Duffy system, the Jsa allele of the Kell system, and immunoglobin Za;bc35.  The Quechua have both the V antigen and Jsa.  The Aymarans and Oyampí also have Jsa, the Atacama have FY*0, and the Papago have Za;bc35. It is possible that all five populations have all four of these markers, but apparently the appropriate tests have not been done.  There are ample data from the long-studied Rhesus system, but the Native American data in CS include only 13 results for FY*0 (two positive), 18 for Za;bc35 (three positive), and 23 for Jsa (nine positive).  American positive Jsa results other than those from the Andes are clustered in the Caribbean region.  They are all from speakers of Arawakan languages except for one Hokan tribe of Honduras (Jicaque) and one Carib tribe of Surinam (Wajana).

Other genes now concentrated in Africa and the Near East are of interest, although less diagnostic.  One of these, the K allele of the Kell blood group, associated with populations of Arabia and North India, is present in the Mapuche, Quechua, Caingáng, Aymara, Trio, Maya, Nahua, Cherokee, and Central Amerind composite.  Two of the highest values of K listed by Mourant are from Gambia and from Ge-speaking tribes of Brazil directly across the Atlantic from one another.  Another Afro-Asiatic marker, transferrin type B1, occurs in the Pima, Caingáng, and Araucano samples.  The Papago also possess a B transferrin, but the subtype was not identified in the reference.  Additional information is found in the appendix.



Apparent Southern Asian HLAs in America


Table 6 shows the occurrence in America of five HLA types that currently have their highest frequencies in Southeast Asia, India, China, and Pacific Oceania.  They account for 28% of the “non-Indian” HLA data in America, and are concentrated in the Nahua, northwest Canadian Eskimo, Cherokee, eastern Maya, Araucano, Caingáng, and Quechua samples.  CS said (p. 130) that A*10, A*11, B*13, and B*22 are rare in America, but 14 of 32 samples include at least one of these.  Table 7 lists the highest world frequencies from CS’s Appendix 2.  The distribution of each is described below, in the same format as was used for the Afro-Asiatic set.



Table 6.  Frequencies (%) of “southern Asian” HLAs in American samples.


Sample              A*33(a)  A*11(b)  A*10(c)  B*13(d)  B*22
Nahua               4.0      7.1      3.0      1.0      7.0
Eastern Maya        7.5      3.6      -        -        1.1
Cherokee            8.0      -        -        -        3.0
Pima                3.6      -        -        -        -
Papago              2.1      0.7      -        -        -
Zuni                0.6      1.2      -        -        0.6
Araucano            n.d.     3.3      6.8      0.5      -
Quechua             5.0      0.1      0.1      1.0      1.0
Caingáng            2.2      -        -        4.0      -
Greenland Eskimo    -        1.0      0.4      -        0.2
E Canad. Eskimo     n.d.     0.3      -        -        3.2

(- = no value listed.)

NW Can. Eskimo






a Aymara, 3.0; Navajo, 1.0; Inupik, 1.0; Warao, 0.5%; Atacama, no data.

b Parakana, 0.5%.

c Mapuche, 0.6%.

d Atacama, 2.4%.



Table 7.  Highest world Frequencies (%) of “southern Asian” HLAs

(New World groups/composites underlined).


A*33                    A*11                    A*10
New Guinea(a)   15.5    Thailand        28.9    Australia       30.6
Ibo             13.2    Dravidian       28.7    Fiji            17.0
Philippines(b)  13.0    Melanesia       28.5    Ainu            16.9
Vietnam         13.0    China           26.7    Uzbekistan      15.9
Punjab          11.0    Vietnam         22.8    Pakistan        15.7
Central Bantu   10.2    Sumatra         22.6    Khoi            14.7
Mande           9.7     Bali            20.2    New Guinea      14.5
Iraq            9.4     Philippines(d)  19.7    Melanesia       14.4
Korea(c)        8.1     Tibet           19.0    Philippines(b)  13.0
Cherokee        8.0     Polynesia       16.1    Polynesia       13.0
Turkey          7.7     New Guinea      16.0    Micronesia      12.9

Eastern Maya






a Highest value.  Oceanian levels are mostly low or untested, but Samoan outliers display 14.4%.

b Negrito component.

c Japanese average, 4% (Ainu, 1%).

d Austric component.



Table 7 (continued).


B*13                    B*22
Australia(e)    20.9    Melanesia       32.3
New Guinea(e)   18.5    Polynesia       29.6
                11.8    New Guinea      26.8
China           11.1    Australia       25.0
Thailand        9.8     NW Eskimo       12.2
Melanesia       9.1     Micronesia      8.8
Samoan outlrs   7.6     Korea           7.6
Lebanon         7.4     Japan           7.5
Bali            7.1     Tibet           7.5
Russia          5.9     North China     7.2
Ireland         5.6     Nahua           7.0
Korea           5.2     Greece          4.8

e Highest values; highly variable.  Averages are: Australia, 8.4%; New Guinea, 6.4%.


A*33 (9.6%).  Unlike the other HLAs in this set, A*33 reaches high levels in parts of Africa, and its distribution supports the idea of early two-way movements between Africa and Asia.  The very ancient peopling of southern Asia from Africa is well established, but evidence for a massive ancient backflow is relatively new.  Backflow has been demonstrated recently through nuclear DNA (Hammer et al. 1997, summarized in Gibbons 1997).  CS (p. 176) also interpreted some of their data as indicating Asian backflow to the African Khoisan people. 

A*33 also seems to go along the Asian coast through the Philippines, Vietnam, Korea, and Japan, and to America.  This trail may have originated in Anatolia or Mesopotamia, judging by the high frequencies in present-day Turkey and Iraq.[25]  A*33 appeared in only 11 of 29 American samples, but levels in the Cherokee and eastern Mayan samples are among the highest in the world.  It is the most important “non-Indian” HLA in the Cherokee, Maya, Quechua, and Aymara samples.  Initial input to the Mexican and Ecuadorian coasts seems likely to me.[26]

B*22 (6.4%).  Nearly all of the American B*22 was found in the northwest Canadian Eskimo and Nahua samples (12%, 7%), with small amounts among other Eskimos, the Quechua, and the Zuni.  The northwestern Eskimo frequency of 12% is exceeded only by those of Melanesia, Polynesia, New Guinea, and Australia.  The close connection between Eskimos and Oceanian peoples is more apparent in the basic “American” HLAs A*9 and B*40 than in B*22, which is absent from the Yupik and Inupik samples and reaches only 3% in eastern Canadian Eskimos (Table 10, appended).  It appears that B*22 was rare in founding populations but came with later contacts on the western coast of North America.  Fuegians and certain other early Americans have been claimed to be “Australoid,”[27] but HLA data from southern South America provide no useful information on this question.

A*11 (5.4%).  The “southern Asian” HLAs other than B*22 seem to have been brought to Mexico and South America in several episodes of transpacific contact.[28]  A*11 is characteristic of China,[29] India,[30] Mainland Southeast Asia, Indonesia, and Pacific islands.  It also appears in ten American groups, but the Nahua, Eastern Maya, Araucano, and Caingáng samples account for 80% of its occurrence in the hemisphere.  The distribution supports the notion of transmission of Indonesian influence to the west coast postulated by various investigators (e.g., Jett 1968), but the 2% frequency among the Caingáng may reflect Indonesian contact by way of South Africa, as advocated by Whitley.[31]

A*10 (4.0%).  HLA A*10 now has its highest frequencies in Australia but is also important throughout the Pacific realm.  In America, 85% of A*10 occurrences are concentrated in the Araucano, Nahua, and Cherokee samples.  A*10 is the most important “non-Indian” HLA among the Araucano (7%), who almost certainly have an Oceanic component.  A relatively low level (85%) of phosphoglucomutase-2 also suggests Oceanic influence on the Araucano.  All other American populations listed in CS’s work have 100% except the Trio, Oyampí, Parakana, and Cayapó of the lower Amazon (90-99%).  Most values under 100% occur in Pacific islands, the lowest listed by CS being 81% from New Guinea.

B*13 (2.4%).  Though concentrated in southern Asia, B*13 occurs widely, indicating an ancient dispersal.  It has been reported in only five American populations (Caingáng, Atacama, Quechua, Nahua, Araucano), peaking in the Caingáng (4%), consistent with the idea of Indonesian contact in southern Brazil.



Southern Asian Genes from other Systems


Gene flow from India or Indonesia is suggested by the presence of the K allele of the Kell blood group in the Aymara, Quechua, Mapuche, Caingáng, Trio, Nahua, Cherokee, eastern Maya, and Central Amerind composite samples; by the predominance of B over A in the ABO system in the Aymara, Caingáng, Trio, and Zuni samples; and by the finding of allele 2 of the adenylate kinase system in the Aymara sample.  These atypical genes appear to be best preserved in the Aymara-speaking population.  It seems almost certain to me that there were inputs from Indonesia: one on the west coast (Ecuador and Mexico) and one on the east coast of Brazil via Madagascar and South Africa.

Several populations have transferrin alleles B or D, which are rare in America but much less so in southern Asia.  Transferrin B2 occurs in Aymara and Quechua samples.  The Quechuans, Yupa, and Warao have TFChi, as do some Chibchan Paez and Carib tribes for whom we are so far without HLA data.  The Quechuans, Maya, and Caingáng have D1, and there is unspecified D in the Nahua, Eastern Maya, and Oyampí samples.  CS do not give subclassifications, but the Mourant volumes list some.  Additional information is provided in the appendix.

In the Rhesus system, three of the four d-negative alleles are now most common in India, Arabia, Indonesia and a few indigenous American populations.  The fourth, cde (r), is too common worldwide to be useful here.  Both Cde (r') and cdE  (r") are common among the Na-Dene and contiguous Americans, but not elsewhere in America.  However, there are occurrences in Mexico that may be the result of foreign contact.

Cde (r'), characteristic of South India, reaches high frequencies among the Chamula, Tarascans, and certain Mayans, with lower frequencies among the Seminoles and various Chibchans.  In populations for whom we have HLA data, Cde was noted among the Aymarans, Quechuans, Nahua, Trio, and Navajo.  CdE (ry) has a similar distribution, being prominent in the Otomí sample, with lower levels among Tarascans, certain Mayans, the Chamula, and the Central Amerind composite.  It is more characteristic of Arabia than of India, and apparently is not present in South America.  However, cdE (r"), characteristic of Indonesia, especially Java, has its highest American frequency among the Ranquel of eastern Argentina and also appears in the Quechua, Araucano, Trio, Zuni, Navajo, and certain Mayan samples.  The rare Rhesus alleles appear in several Mayan populations listed by Mourant, but CS’s eastern Mayan composite had none.  The Quechuans are unique in having two of these alleles as well as three “non-Indian” transferrins.

The pattern of Asian HLAs suggests to me an early southern Asian input to the Quechuans and Mayans from Indonesia, and a later influence on the Nahua, Cherokee, and Araucanians from other Asian sources.  The other genetic systems point to the Aymarans and Quechuans as the primary recipients, with lesser effects on the Trio, Maya, Nahua, and Caingáng.  A problem in crosschecking data from the various systems is differential sampling. For example, there are no transferrin data for the Cherokee. All that can be said with any confidence is that there seem to have been significant contacts with southern Asians on both coasts of the Americas.

Further information regarding the various blood groups may be found in the appendix.  This subject is complex and worthy of analysis by someone with more resources and knowledge than I.  The comments presented here are intended only to call attention to findings that are potentially valuable but rarely discussed.  



Apparent European HLAs in America


Table 8 shows the occurrence in America of four HLA types that at present have their highest frequencies in Europe.  They account for 25% of the “non-Indian” HLA data.  The single highest frequency was found among eastern Canadian Eskimos (10% of B*7), but the four alleles are also important in the northwest Canadian Eskimo, Nahua, Cherokee, Araucano, and Quechua samples.  Table 9 lists the highest world frequencies.



Table 8.  Frequencies (%) of “European” HLAs in American samples.


Sample              B*7     B*12    B*8     A*3
E Can. Eskimo       10.4    -       -       0.2
NW Can. Eskimo      5.7     5.7     3.4     2.7
Greenland Eskimo    1.0     1.3     0.9     0.8
Nahua               4.0     3.5     1.0     2.0
Cherokee            3.0     1.0     1.0     1.0
Papago              1.5     2.4     0.8     -
Pima                0.8     0.2     -       0.9
Araucano            1.0     4.3     3.3     2.4
Mapuche             1.2     1.7     2.3     -
Quechua             4.0     2.0     1.0     1.0
Ticuna              0.8     1.3     -       0.8
Inupik              1.0     -       1.0     -
Eastern Maya        -       1.8     0.4     -
Trio                -       0.7     -       -
Guaraní             -       -       1.5     -
Zuni                -       -       0.6     -
Aymara              -       -       -       3.0
Atacama             -       -       -       2.4
Caingáng            -       -       -       0.4

(- = no value listed.)



Table 9.  Highest world frequencies (%) of “European” HLAs.


B*7                     A*3                     B*12                    B*8
Lapland         23.1    Lapland         28.8    Basque          23.0    British(b)      16±2
Iceland         22.3    Finland         25.3    Easter Is.(e)   20.0    Punjab          12.6
Tuareg(a)       21.8    Iceland         17.5    British(b)      19±1    Scandinavia(b)  12±1
British(b)      17±2    Scandinavia(b)  16±1    France          17.5    Germanic(b)     10±1
Scandinavia(b)  16±1    Yugoslavia      15.8    Spain           16.8    Iraq            9.5
Finland         14.8    Wales(c)        15.4    Portugal        14.7    Hungary         9.3
Germanic(b)     14±1    Germanic(b)     15±1    Scandinavia(b)  14±1    France          9.3
Khoi            13.2    Tigre           15.1    Germanic(b)     13±1    Czechoslov.     8.7
France          12.4    Lebanon(d)      14.2    Tunisia(f)      13.0    Iceland         8.6









a Other North Africans, 3-8%.

b Composite; see text.

c Other British, 10-14%.

d Jordan-Palestine, 11.0; Iran, 10.6; Iraq, 8.3%.

e Negligible elsewhere in Polynesia except New Zealand (3.8%).

f Other North Africans, 5-6%.



B*7 (9.1%).  Eastern Canadian Eskimos have the highest frequency of B*7 in America (10.4%), indicating early travel between Europe and northeastern America, as advocated by several scholars.[32]  Indians of eastern North America very possibly have significant levels of B*7, but the only sample available is from the Cherokee (3%). Highest frequencies worldwide are from northern Europe except for the Algerian Tuareg sample, which is third highest.[33]

B*7 seems to have reached the west coast as well.[34]  Northwest Canadian Eskimos have the second-highest American frequency (5.7%).  Presence in South America is largely on the Pacific coast (Quechua, Mapuche, Araucano), except for a trace among the Ticuna of western Brazil.  Japanese samples cited by CS average 4%, but the aboriginal Ainu have only 1%.  This adds to the evidence that a Caucasoid population (other than Ainu) moved across Asia, impinging on Japan and probably reaching western America.

B*12 (6.9%).  Except for one exaggerated frequency from Easter Island (20%), highest levels are in Western Europe and Tunisia, a pattern suggestive of expansion from Iberia into both Britain and North Africa; Basques now have the highest European value (23%).  Transmission of B*12 to America appears to have been across Asia, because it has its American peak in the northwest Canadian Eskimo sample but is absent from the eastern Canadian Eskimos.  At least 90% of occurrences appear in the West, with more than half in the Eskimo, Araucano, and Nahua populations.[35]  B*12 was found at significant levels in all Japanese samples, being strongest on Honshu (12%).   The Korean sample had 11%.  As in the case of B*7, B*12 seems associated with a Caucasoid migration across Asia to Japan and beyond.[36]  The anomalous Easter Island value probably reflects contact with South America, because other Polynesian frequencies are negligible except for one of 3% in New Zealand.  There is evidence that American Indians reached eastern Polynesia (Heyerdahl 1953), but the extent of their explorations remains unknown.  American cotton seems to have been carried to the Marquesas Islands (Stephens 1971; Langdon 1982) and to Hawaii (Langdon 1982) in pre-Columbian times, and stone mirrors much like distinctive American types have been excavated on Kauai, Hawaiian Islands (Probst 1963).  Serjeantson (1989) has used HLA data in an attempt to reconstruct the trail.

B*8 (4.5%).  The highest frequencies of B*8 were found in British populations, especially the Irish (19%).  Other European samples had high levels, as did those of Punjab and Iraq.  The pattern seems to show expansion of a Southwest Asian population into Europe, with little effect on East Asia other than in Japan (Honshu, 5%).  A connection between India and Japan again seems indicated. As is true of B*12, the highest American frequencies occur in the northwest Canadian Eskimo and Araucano samples, and these, with that of the Nahua, account for approximately half of the American total.  The American B*8 distribution is much like that of B*12, suggesting that they were carried by the same people.  Levels of B*8 in Japan are comparable to those among the Eskimo and Araucanians, except that none was found in the aboriginal Ainu sample.
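
One rough way to put a number on how closely the B*8 distribution follows that of B*12 is a correlation between the two columns of Table 8.  The sketch below is an illustration only and is not part of the original analysis.  It assumes that a blank cell in Table 8 stands for a frequency of zero, and the helper name `pearson` is hypothetical.

```python
import math

# (B*12, B*8) frequencies (%) for the 19 samples of Table 8.
# Assumption: a blank cell in the table is read as 0.0.
table8 = {
    "E Can. Eskimo": (0.0, 0.0), "NW Can. Eskimo": (5.7, 3.4),
    "Greenland Eskimo": (1.3, 0.9), "Nahua": (3.5, 1.0),
    "Cherokee": (1.0, 1.0), "Papago": (2.4, 0.8), "Pima": (0.2, 0.0),
    "Araucano": (4.3, 3.3), "Mapuche": (1.7, 2.3), "Quechua": (2.0, 1.0),
    "Ticuna": (1.3, 0.0), "Inupik": (0.0, 1.0), "Eastern Maya": (1.8, 0.4),
    "Trio": (0.7, 0.0), "Guaraní": (0.0, 1.5), "Zuni": (0.0, 0.6),
    "Aymara": (0.0, 0.0), "Atacama": (0.0, 0.0), "Caingáng": (0.0, 0.0),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


b12 = [pair[0] for pair in table8.values()]
b8 = [pair[1] for pair in table8.values()]
r = pearson(b12, b8)
print(f"Pearson r between B*12 and B*8 across {len(table8)} samples: {r:.2f}")
```

A clearly positive coefficient is consistent with the similarity described above, but with so few samples and so many empty cells it should be read as a rough indication rather than a test.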

A*3 (4.7%).  Like B*7, A*3 reaches high levels in Lapland and Finland but is not as prominent in British populations other than the Welsh.  A high level in Yugoslavia links a population in the Balkan region with the Welsh (Romanian Vlach, Serbian Valhos).  Like B*8, A*3 is well represented in Southwest Asia (Uzbekistan, Lebanon, Jordan, Palestine, Iraq, Iran), from which it may have dispersed.  CS said, erroneously (p. 369), that A*3 is absent among American natives, but it is spread thinly over 12 of their samples at a level of 1.5 ± 0.9%.  It is strongest in the Aymara, northwest Eskimo, Araucano, and Nahua samples, implying considerable antiquity on the west coast.  A*3 is the only HLA allele that has its highest American frequency among Aymara-speakers. 
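
The "1.5 ± 0.9%" summary can be reproduced from the A*3 column of Table 8.  A minimal sketch, assuming that the ± figure is a population standard deviation over the 12 samples that list a value (the text does not say which measure of spread was used):

```python
import statistics

# A*3 frequencies (%) for the 12 samples in Table 8 that list a value.
a3 = {
    "E Can. Eskimo": 0.2, "NW Can. Eskimo": 2.7, "Greenland Eskimo": 0.8,
    "Nahua": 2.0, "Cherokee": 1.0, "Pima": 0.9, "Araucano": 2.4,
    "Quechua": 1.0, "Ticuna": 0.8, "Aymara": 3.0, "Atacama": 2.4,
    "Caingáng": 0.4,
}

values = list(a3.values())
a3_mean = statistics.mean(values)
# Assumption: population (not sample) standard deviation.
a3_spread = statistics.pstdev(values)

print(f"{len(values)} samples: {a3_mean:.1f} ± {a3_spread:.1f}%")
```

With the sample standard deviation (`statistics.stdev`) the spread comes out slightly larger, so the choice of measure is worth stating when figures like this are compared.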



European Genes from other Systems


There may be no clear marker for European ancestry among the “classical” genetic variants, although a high frequency of the Rhesus cde (r) allele might qualify.  CS also considered immunoglobin f;b, adenylate kinase 1, and ABO type B, when found together, to indicate Caucasoid admixture in eastern North America (p. 337).  Occurrences among indigenous Americans of cde or A or B of the ABO system were once attributed to recent admixture and databases were “corrected” accordingly.  It is now understood that some of these unexpected alleles arrived in antiquity.  Mummies from Paracas, Peru, possessed both A and B antigens (Allison et al. 1976, 1978).   The cde allele has its highest modern frequency among Basques, who retain traits of an ancient European population, but cde also remains at high levels in Africa and India.  Well over 40% of the indigenous American samples that Mourant and CS listed have at least some cde, and these include isolated tribes with no evidence of recent European or African admixture.  The A antigen is common among Na-Dene and Eskimo populations, and B appears at significant frequencies among Eskimos, South Andeans, and others such as speakers of Hokan and Chibchan-Paez languages.  Further comments on these distributions appear in the appendix.


Eastern Canadian Eskimos have 13% of immunoglobin f;b, which probably came from Europe where it is now most common, its highest frequencies being in Sardinia, Bulgaria, Greece, and Ireland (79-85%).  However, similar frequencies occur in Elamo-Dravidian groups of India and are nearly as high in Iraq, Lebanon, and Iran, suggesting dispersal to Europe from Southwest Asia.  Curiously, one Carib population showed 13%, and f;b was found in samples from the Aymarans and from six tribes of the Amazon region, suggesting Indonesian or Indian contact; but much wider sampling is needed (see appendix).



The American HLAs


Ten types listed by CS together account for 94% of the HLA-A total and 93% of the HLA-B total in the Americas.  The highest frequencies of nine of these occur in America and the tenth (A*9) displays higher levels only in New Guinea (Table 10, appended).  These ten types were almost certainly possessed by the earliest colonizing groups in the hemisphere, but their distributions indicate multiple migrations.  The clusters might provide the basis for a classification scheme like that devised by Schanfield (1992) on the basis of immunoglobin data.  Limited HLA data now available seem to define three distinctive regions: South Andean, where A*28 and B*15 are concentrated; eastern South America, especially the northeast, where A*31, B*5, and B*35 are prominent; and far northern North America, where most of the A*9, B*27, and B*40 occur.
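The percentages quoted above are shares of gene copies: the summed frequency of a chosen set of alleles divided by the total for that locus. The following Python sketch illustrates only that arithmetic. The function name `allele_share`, the sample counts, and the choice of which alleles to treat as "American" are assumptions made for illustration; the counts are placeholder numbers, not values from CS or from any real sample.

```python
def allele_share(counts, alleles):
    """Return the fraction of all gene copies in `counts` that belong to `alleles`."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no allele counts")
    return sum(n for name, n in counts.items() if name in alleles) / total


# Hypothetical HLA-A allele counts for one sample (placeholder numbers, not real data).
sample_counts = {"A*2": 100, "A*9": 40, "A*28": 30, "A*31": 20, "A*3": 6, "A*33": 4}

# Alleles treated as "American" in this sketch: a subset of those named in the text.
american = {"A*2", "A*9", "A*28", "A*31"}

typical = allele_share(sample_counts, american)
atypical = 1.0 - typical
print(f"typical share: {typical:.1%}, atypical share: {atypical:.1%}")
```

Applied to real tabulations, the same calculation would be repeated per sample and per locus, which is how a continental average and a range across samples could both be reported.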


Distributions of these ten HLAs in other parts of the world are not what would be expected from the premise that Americans stem mainly from Northeast Asians.   Instead, the basic American populations seem most like those of western Asia and of Southeast Asia, paralleling the findings of Steele and Powell (1992) regarding Paleo-Indian skulls.  Their multivariate analysis showed craniofacial features closer to those of southern Asians and Europeans than to those of northern Asians.  The only HLA prominent in both America and northern Asia is A*2, which CS call ubiquitous and which is also strong throughout Europe, South China, and Japan.  Distributions of the ten “American” HLAs are outlined in the appendix.





What has been done in this report is sometimes called data-mining, and it is well known that the dangers of finding spurious patterns increase as data sets become larger and more complex.  After finding patterns, one should apply inferential statistics to distinguish valid patterns from those that arise by accident.  It seems to me that the genetic data base at present is insufficient for rigorous validation of the patterns that various investigators have noted, but that it is nevertheless worthwhile to call attention to patterns that need to be tested.
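The risk described here can be put in rough numbers. The sketch below is a minimal illustration and not a method used in this report. It assumes that each of the 29 HLA varieties is examined in one independent test at a conventional 5% significance level, and it computes the chance of at least one spurious "pattern" together with the stricter per-test threshold given by a Bonferroni correction. The function names and the independence assumption are mine.

```python
def familywise_error(alpha, m):
    """Probability of at least one false positive among m independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(alpha, m):
    """Per-test significance level that keeps the overall false-positive rate at or below alpha."""
    return alpha / m


m = 29        # number of HLA varieties in the tabulation discussed in the text
alpha = 0.05  # assumed conventional significance level

print(f"chance of at least one spurious pattern: {familywise_error(alpha, m):.2f}")
print(f"Bonferroni per-test threshold: {bonferroni_threshold(alpha, m):.4f}")
```

Under these assumptions a spurious result somewhere in the table is more likely than not, which is why patterns found by inspection need independent testing before they are treated as established.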

I am struck by the fact that apparent patterns in HLA data seem to reflect hypothetical early interhemispheric contacts proposed independently by many others on the basis of other kinds of data, and I am skeptical that this result could be entirely accidental.  The following provisional conclusions are based primarily on HLA distributions, although certain of them incorporate the results of other studies.  Some are more solid than others, as indicated in the text.

1.  Approximately ten of the 29 HLA varieties tabulated by Cavalli-Sforza, Menozzi, and Piazza almost certainly were present in founding American populations, but they indicate predominantly southern Asian rather than northern Asian origins for the American population base.

2.  As many as 18 of the HLA alleles that occur in the Western Hemisphere may have been introduced at various times after early colonization.  However, these constitute only 6-7% of the total, varying from zero in some isolated South American tribes to 24% in the Nahua sample.  They are concentrated in areas once occupied by urban societies and are also characteristic of certain parts of the Eastern Hemisphere that are claimed to have been in early contact with America.

3.  Three “non-Indian” alleles (B*21, A*33, B*7) are most important, each contributing about 10% of the atypical HLA total.  Their presence seems to reflect an early Near Eastern influence on the American west coast (A*33), European input to eastern Eskimos (B*7), and an Afro-Asiatic influence in southwestern North America (B*21).  These interpretations are supported by findings of atypical genes from other systems, especially immunoglobin, transferrin, Kell, and Rhesus.

4.  Ninety percent of American B*21 is clustered in speakers of Uto-Aztecan languages and contiguous Navajo.  The Papago frequency is one of the highest in the world, comparable to the frequencies of B*21 among the North African Tuaregs and Berbers.

5.  The distribution of A*33 seems associated with transmission of traits from the Near East through Southeast Asia, and A*33 is the most important “non-Indian” allele in the Cherokee, Eastern Maya, Quechuan, and Aymaran populations.

6.  The European B*7 is likely to have come at an early date to eastern Canada and also across Asia and the Pacific to western Canada, Ecuador, and Mexico.  It is strong among the Quechua, Nahua, and Cherokee, the last being genetically almost indistinguishable from the Nahua.

7.  A North African or Iberian contact with the Atacama and Mapuche is indicated by the distributions of B*14 and A*1.  The Atacama of northern Argentina have the highest recorded frequency of B*14, similar to the B*14 frequencies of the Berber, Iraqi, Iberian, and Irish samples.  The Mapuche have the fourth-highest world level of B*14 and, together with the Atacama, possess 70% of reported American B*14.  These two also have the highest American levels of A*1, a characteristic of North Africans and Southwest Asians but now highest among the Irish.  The Atacama also have the African FY*0 allele of the Duffy blood group, known in only one other American sample (Cayapó of Brazil).

8.  On the North American continent, HLAs characteristic of Afro-Asiatic populations are clustered in the Papago, Pima, Navajo, Cherokee, and composite Central Amerind samples.  Voyages to the Gulf of California (B*21), Caribbean region, and Atlantic coasts of South America (A*30 and A*32) seem likely.  The Central Amerind composite sample of 18 tribes is unique in having “non-Indian” HLAs only from the Afro-Asiatic set (B*21, A*30, and A*32).

9.  Alleles A*30 and A*32 appear near the mouths of the Amazon (Oyampí, Trio, Parakana), the Plata-Paraná (Guaraní), and the Orinoco (Warao), suggesting exploration of rivers by expeditions from the Mediterranean.  The Oyampí, Parakana, and Guaraní are Tupians.  The distribution of the apparently Mediterranean or Indian A*32 seems to connect Tupians with the Mapuche via river traffic.  The A*30 pattern suggests Greek or Arabian influence on America and also on parts of Africa.

10.  Southern Asian HLAs seem to have come in several episodes: B*22 to western Eskimos and the other four (A*10, A*11, A*33, B*13) to the west coasts of Mexico and South America at various times, supporting claims of early Oceanic or Indonesian inputs to Ecuador and Peru, and later Oceanic, Indian, and Chinese influences on Nahua, Mayan, and Araucanian societies.  Genes from systems other than HLA also indicate influence from India, Indonesia, or Arabia on the Aymaran, Quechuan, and Mayan populations as well as on various Mexican and South Andean groups for which we lack HLA data.

11.  Proposals that Indonesians reached Brazil by way of Madagascar and South Africa seem supported by the distributions of A*11, B*13, and genes from other systems found in samples from the Caingáng and Trio of the Brazilian coast.  The Andean Aymara-speakers seem to have the greatest concentration of anomalous genes that might have an Indonesian source.  These traits could have come by river traffic from the east or directly across the Pacific.

12.  European alleles A*3, B*7, B*8, and B*12 seem associated with migrations across Asia that continued on to the western Eskimos, Nahua, Quechua, and Araucano.  All but A*3 are also important in Japan.  B*7 seems also to have come from Northwest Europe to eastern Canada, where the other European HLAs are nearly absent.  Both HLA and immunoglobin distributions reflect partially separate genetic heritages of the eastern and western Eskimos. The dominant atypical HLA alleles are the southern Asian B*22 in the west and the European B*7 in the east. The western Eskimos have the Asian immunoglobin type fa;b, whereas the eastern Eskimos have the European type f;b.

13.  HLA B*7, now concentrated in extreme northwestern Europe, displays unexpectedly high frequencies both in eastern Canada and in parts of North Africa where the amber trade is thought to have introduced a Scandinavian influence.  Similar rock carvings in all three places appear to confirm some degree of intercontinental commerce at about 700 B.C.

14.  Distributions of “American” HLAs A*28 and A*31 suggest genetic backflow from Brazil to Africa.[37] Both are characteristic of eastern South America, where highest world frequencies occur, but anomalously high levels appear sporadically in Africa (especially Tigre, Tuareg, Mande).  More complete sampling in Africa is needed to assure that this is not a sampling artifact.


Section II: Appendix, Addendum & Foot Notes

Section III: References Cited


Originally published in Pre-Columbiana, Volume 2, Number 2 & 3, December 2000 & June 2001




Copyright © 2000 & 2001 by James L Guthrie

