Table of Contents:
Haplogroup W Origins - Subgroup: W1 - W3 - W4 - W5 - W6 - W7 - W9 - The Basics - What do those letters and numbers mean? - What Makes You Haplogroup W? - Things are More Complicated Than You Thought - What Does It All Mean? - Two Clocks Running at Different Speeds - rCRS, RSRS, and all that... - Ancient W - W3a2 - Jewish W - How to Read the Family Tree Diagrams - Haplogroup W Resources - interpretation of your individual results! - Contact - NEW! MTDNA Family Tree
Haplogroup W Gets No Respect!
Although one of the earliest identified haplogroups, it was not to be considered important enough to be listed as one of the 'seven daughters of Eve' and given a snappy, attention-getting female name. We have selected Wilma Flintstone as an extremely appropriate mother-of-all-W's. Due to the low percentage of W's in the world, scientific papers often pay little attention to haplogroup W results. But the data is out there - a W descent tree can be constructed, specific mutations trace the migration of Wilma's descendants from their Central Asian origin to Europe, the Middle East, and India. So let's go, fearlessly, where scientists fear to tread. After all, the certainties proclaimed on the pop science web sites equate to only the 'latest guess'. Anthropology is one field where whatever today's conventional wisdom is, you can bet it will be something different in five years.
The author of these pages is a haplogroup W who coincidentally has a degree in physical anthropology, including training in human genetics, statistics, and so on. These pages and their findings have been reviewed by current experts in the field, but do not represent publishable conclusions. They merely represent my best shot at understanding my matrilineal heritage based on all available information at this moment in time.
Back to Table of Contents
Haplogroup W Origins and Subgroups
Modern Worldwide Distribution of Haplogroup W
In the Old World Haplogroup W occurs primarily in South Asia, the Near East, Central Asia, and Europe, with an average of only 1 to 2% of the population being of this group. Emigration in the last 500 years from Europe and South Asia to the Americas, southern Africa, Australia, and New Zealand brought Haplogroup W to those continents. However these general figures disguise important differences between the subgroups.
The first migration beyond Africa and the Arabian peninsula of the matrilineages still extant in modern humans began around 60,000 years ago. Among these were ancestors of the M and N haplogroups, descended from L3. These first Eurasian descendants of genetic Eve seem to have been boat-building fishermen, and are believed to have expanded quickly along the Indian Ocean coastline. After a quick initial expansion along the coast, all the way to Australia, there was a slower conquest of the interior of Asia.
Despite the simplifications promoted in pop science, these matrilineal lines were not 'the ancestors of modern humans'. As these new immigrants expanded into Eurasia, they ran into other humans there before them. Neanderthals and Denisovans are known for sure, but there were probably others. Hominids had been present throughout Eurasia for over a million years. And the new immigrants interbred with them. By chance, or some reason not understood yet, the only matrilineal lines found via mitochondrial DNA in current humans so far are from these African emigrants. But our total DNA shows our heritage from these other humans who were there before the 'out of Africa' bunch...
Modern Old World Distribution of Haplogroup W
The first member of what is referred to as the W haplogroup, whom we refer to as Wilma, was born between 13,000 and 37,000 years ago, probably in what is now northwest India or northern Pakistan. This was at the end of a period of cold known as the Last Glacial Maximum. Migration of people outside of the W homeland was blocked to the north by extremely arid arctic deserts. As the climate began warming, between 19,000 and 11,000 years ago, nomadic hunters expanded back into these areas, which were becoming grasslands. The existing modern W subgroups emerged in the area between the Caspian and Aral Seas during this period.
Ice Age Climates
As the population expanded, both in numbers and area, the new Haplogroup W subgroups developed. Around 15,000 years ago a W woman had a mutation at position 7864 in the coding region, the first W1. Another W woman had a 194 mutation in HVR2. From her would be descended the other subgroups (W3 through W9). Over a dozen other subgroups have been identified but not given official designations yet.
W1 is defined by the 7864 mutation. The Finnish-Nordic branches of this subgroup account for a large portion of northern European W's. W1 began differentiating into subgroups around 15,000 years ago. The descendant lineages went both into Europe and Scandinavia via Russia, and the Middle East via Iran, but each lineage with very different routes and timings.
W3, the oldest subgroup, appeared around 14,000 years ago and spread later via the horse culture of the steppes into Europe and India. They are found today in both Europe and South Asia.
W4 is defined by the striking HVR2 motif 143-194-196. It appeared in the central Asian steppes around 9,000 years ago. One lineage migrated via Scandinavia into Britain and Ireland. Others are found in central and western Europe, Anatolia, Russia, and India.
W5 is a uniquely European group, although it originated 10,000 years ago, probably in the Russian steppes. Its subgroups differentiated during the period of the Corded Ware culture in Central Europe, beginning 5,000 years ago. Later migrations of Celts, Saxons, and Vikings carried the type to the British Isles.
W6 is identified by the 16192 16223 16292 16325 16519 motif. It originated in Central Asia around 13,000 years ago. The structure of its descent tree and the wide distribution of its members indicate that it spread throughout the Middle East and Europe in the same period and along the same routes as the first agriculturalists.
W7 was identified by the 185 HVR2 mutation and found at present among those of Armenian descent. However additional recent results from Germany with very different coding region results probably indicate the 185 mutation arose independently in Europe. W8 is identified by the 5147 and 8697 mutations, with only two results to date, from Yemen and Britain. W9 is identified by the 14097 mutation and found in Turkey and Britain. There are in addition at present over a dozen subgroups that have not yet received an official designation. Such designations are not usually allocated until at least two full genome results establishing the subgroup are submitted to Genbank.
MTDNA Haplogroup W1
W1 Geographic Distribution
W1 is defined by the 07864 coding region mutation; a third of W's are of this subgroup, making it the largest. If HVR2 results are available, W1 may be tentatively identified by not having the 00194 mutation, which differentiates it from all other W subgroups.
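The rule above can be written down as a small decision function. This is only a sketch: the function name and the idea of passing in sets of mutated positions are my own illustration, and it assumes the person is already known to be a W.

```python
def w1_status(coding_positions=None, hvr2_positions=None):
    """Tentatively place a known haplogroup W result in or out of W1.

    Both arguments are sets of mutated positions (integers), or None
    if that region was not tested. Hypothetical helper, not a lab method.
    """
    if coding_positions is not None:
        # The 7864 coding region mutation defines W1.
        return "W1" if 7864 in coding_positions else "not W1"
    if hvr2_positions is not None:
        # Without coding region data, absence of 194 is only a hint.
        return "possibly W1" if 194 not in hvr2_positions else "not W1"
    return "unknown"


print(w1_status(coding_positions={1243, 7864}))
print(w1_status(hvr2_positions={73, 189, 195}))
print(w1_status(hvr2_positions={73, 189, 194, 195}))
```

Note the asymmetry: a coding region result gives a firm answer, while an HVR2-only result can rule W1 out but can only suggest it.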
W1 emerged about 15,000 years ago in central Asia. There are nine recognized major subgroups, W1a through W1i, plus more than two dozen others not yet given an official designation.
W1 consists of the following major subgroups:
W1a (currently defined by coding region mutations 05495 12669). These mutations actually identify a Finnish subgroup that diversified around 3,300 years ago, coincident with the expansion of Finns into their current territory. However recent results have shown a more complex picture. First, it is found that there was a lineage of W1a's that migrated from Finland via Sweden to Britain. Second, a new ancestral branch has been discovered, with just the 5495 mutation, which is much older - around 15,000 years old. This has currently been identified in Spain and Norway, indicating a Neolithic maritime distribution, from Scandinavia via the North Sea to the Bay of Biscay.
W1b (defined by coding region mutations 04928 and 09612, and HVR2 mutation 00227). This so far remains a purely Finnish group. Prior to the migration of the Finns from their Siberian homeland to their modern territory, about 4,000 years ago, a 10086 mutation split W1b into two lineages. After 3,000 years ago, both expanded into numerous subbranches.
W1c (defined by coding region mutation 14148). This emerged around 9,000 years ago in the steppes of Eurasia. Descendant lineages are:
W1c1, with the 11204 and 12648 mutations, 7,000 years old, found in Norway, Germany, and the British Isles.
A major subgroup with the 119 mutation which emerged 8,000 years ago. Major subgroups are:
W1c with the 5004 mutation, around 2500 years old, found today in Sweden and Ireland.
W1c with the 150 mutation emerging around 8,000 years ago, found today in India.
W1c with the 16292C mutation, about 6,000 years ago, found today in India and Iran.
W1c with the 16193 and 152 mutations, about 4,000 years old, found today in Turkey and Bulgaria.
There are various lineages coming directly from W1c with only single samples, found today in Ireland, Denmark, France, Iran, and Georgia.
There are various lineages coming from W1c+119 with single samples, found today in Italy, Britain, and Norway.
W1d is defined by coding region mutations 08383, 09278, and 14981. There is only one full genome result so far, that of an Israeli Jew of Iraqi origin. The person has numerous additional changes, including loss of 16292 in HVR1, 16260 and 16298 in HVR1, and 00189, 00194, 00200 in HVR2. There are similar HVR1 / HVR2 results from Iran, Iraq, and Cyprus. The type seems therefore to have emerged in the earliest agricultural societies, and spread into the eastern Mediterranean, the Levant, Iraq, and Iran.
W1e is defined by the 08659 and 08887 coding region mutations and emerged around 6,000 years ago and is limited to continental Europe. A twig with a single sample is found in the Czech Republic today. The major branch also has the 16295 mutation in HVR1. The ancestral type with only the 16295 mutation is today found in Finland, with a diversification around 4,500 years ago. A sub-branch emerged 1300 years ago with ancestors with additional 10398 and 16324 mutations in Prussia and Switzerland.
There are further HVR1 results, not certainly W1e, with both the 16295 and 16324 mutation. The geographic distribution (and the associated dates) correlate with the migrations of the Huns under Attila. 16295+16324 is also found among the Székely in Romania, who believe themselves descendants of the Huns. Some traditions say the Huns originated in northern Eurasia, the same area as the Finno-Ugrians. So one theory could be that the original 16295-only mutation arose around 4500 years ago among the peoples in northern Eurasia who would later split into the Finns, Estonians, Magyars, and Huns; but only the 16324 mutation emerged among the Estonians and Huns; and the Huns brought it into central Europe, while the Estonians into maritime western Europe via Scandinavia. Alternatively, these central European 16295+16324's could be another haplogroup. Only full genome results for some of these central European types will resolve the question.
W1f, defined by the 09950 mutation in the coding region. This emerged 7,000 years ago, presumably among the first European Neolithic agriculturalists in the Danube valley. So far only three coding region results have been reported; one branch in Austria has a 16275 mutation in HVR1; the other branch, in the Czech Republic, has a 16249 mutation. No HVR1-only or HVR1+HVR2 results have been reported that might provide more information.
W1g, defined by the 16320 mutation in HVR1. This extends from the Ukraine through Poland, Hungary, the Czech Republic, Germany, and France to Portugal. This arose 4,500 years ago and the timing and the distribution suggest the lineage was brought into Europe by the Yamnaya culture from the steppes between the Caspian and Black Seas.
W1h, defined by the 16145 mutation in HVR1, emerged around 9,000 years ago. The major branch of this tree is found among the Ashkenazi Jews of Eastern Europe. Other single-ancestor lineages are in Italy and the Czech Republic, although these may represent occasions where the 16145 mutation arose independently. Interestingly, HVR1-only results with the -16292 16145 motif are also found among Polish Roma...
W1i, defined by the 5580 mutation, emerged around 4,000 years ago. This is found today in Bulgaria, Hungary, Slovenia, Poland, and a Circassian from Israel. This suggests the lineage was brought into Europe by the Yamnaya culture from the steppes between the Caspian and Black Seas.
There are numerous other W1 lineages without enough results yet to receive a subgroup designation. As is the case with the non-Finnish W1a's, W1b's, and W1e's, these all seem to be north European maritime lineages, perhaps descended from a basic W1 lineage in the eastern Baltic Sea around 4,000 years ago.
One branch sharing the 119 mutation in HVR2 spread to territories on the Baltic and North Atlantic, and Bay of Biscay. Ancient DNA results show W+119 was present in Germany 7300 years ago, in burials of the LBK culture. Finds of the same type continue in the later Schöningen, Salzmünde, and Bernburg cultures through 5000 years ago.
One sharing the 16295 mutation arose around 3,000 years ago and is found today in the Netherlands and the British Isles.
MTDNA Haplogroup W3
Geographic Distribution of Haplogroup W3
W3, together with W4 through W7, were descendants of a woman with a 194 mutation born in northwest India around 15,000 years ago. W3 is identified by the further coding region 1406 mutation, and emerged around 14,000 years ago. W3's seem to have diversified into subgroups among the nomadic cultures of the Asian steppes south of the Aral Sea after the last glacial maximum. From there they spread via Russia into eastern Europe, perhaps with the peoples that brought the horse to Europe. There are two great divisions of W3: W3a, which appeared 13,000 years ago, and W3b, which appeared 10,000 years ago. Some branches in both divisions also migrated into India. Each lineage followed its distinctive route and timing.
The major families and their time of emergence are:
W3a, defined by the 15784 coding region mutation. This emerged in the steppes or northwest India around 13,000 years ago and led to the following subgroups:
W3a1, defined by the 13263 coding region mutation, emerging in the steppes 12,000 years ago. The 13263 mutation also occurs in Haplogroup C. Therefore testing done to place results in a haplogroup, even if only HVR1 and/or HVR2 are tested completely, allows individuals to be identified as W3a1 without a full genome test. W3a1 is found in south Asia and continental Europe, from Russia to Britain and Germany to Italy. The distribution suggests it was brought to these regions by horse nomads from the Eurasian steppes.
W3a1a, defined by the 07151 mutation. W3a1a originated south of the Aral Sea around 10,000 years ago. The major branch seems to have spread via Russia into the Ukraine, Poland, Belarus, the Baltic states, and Finland. However there are branches with representatives in the British Isles and Italy, suggesting migrations of these lineages deeper back in the Neolithic.
W3a1a1, which emerged in Russia around 8,000 years ago. One distinct branch, with the 16291 mutation, appeared around 1,000 years ago and is today found only among Ashkenazi descendants.
Geographic Distribution of Haplogroup W3a1b
W3a1b with the 10245 mutation. This emerged 10,000 years ago, south of the Aral Sea or in northwest India. It migrated into eastern India and Sri Lanka. There is one result from Spain... a signal from colonial times?
W3a1c with the 7269 mutation. This emerged 4,000 years ago. The only results with geographical locations confirmed are from the British Isles.
Geographic Distribution of Haplogroup W3a2
W3a2 with the distinctive 16209 and 16255 HVR1 mutations emerged between the Caspian and Aral Seas around 7,500 years ago. One branch is found today in Turkmenistan, perhaps near the point of origin. The lineage migrated north of the Caspian and Black Seas. One branch moved into the Ukraine and later into the Tatra Mountains of Poland / Slovakia as part of the Lemko ethnic group. Another branch moved south into Romania, and then probably along the Danube, eventually reaching the area of Rouen, France. From here what seems to be a single emigrant ancestor, Marie Marguerie, arrived in Quebec in 1641, ancestoring hundreds of thousands of descendants in the United States and Canada.
Geographic Distribution of Haplogroup W3b
W3b, which emerged with the 12923 coding region mutation and 199 HVR2 mutation 10,000 years ago. Descendant lineages are found today in Lithuania and Poland (W3b1, around 2500 years old), India, Armenia, Hungary, Italy, Bulgaria, Iran, Turkey, and the British Isles. This clearly has a history and distribution as complex as W3a, but is much less common and with fewer results to flesh it out.
MTDNA Haplogroup W4
W4 is defined by the striking HVR2 motif 143-194-196. It appeared in the Asian steppes east of the Aral Sea around 13,000 years ago. W4, together with W3 through W9, were descendants of a woman with a 194 mutation born in northwest India around 15,000 years ago.
The currently identified W4 lineages are:
W4a with the 3531 coding region mutation and reversal of the 16292 mutation in HVR1. This seems to have originated in Eurasia around 8,000 years ago. Two minor lineages with the 119 mutation are found today in India and Romania.
W4a1, which originated in Scandinavia around 3,000 years ago. A major subgroup with the 16286 mutation appeared in Scandinavia 2000 years ago. This subgroup is found in Sweden, Norway, Denmark, and the British Isles. Single results are from Poland and the Netherlands. The timing and distribution of the branches indicates a clear signal related to the expansion of the Vikings in historical times. It is most often reported from Scotland and the northern areas of England, again connecting it to the Viking settlement of Britain.
W4b, with the 7444 mutation, 10,000 years old, reported from the Czech Republic and Montenegro. The Montenegrin branch split off from the Czech line 4,000 years ago.
W4c, reported from Italy and Britain, with the lines splitting 8,000 years ago.
W4d, 11,000 years old, found today in Iran, Italy, and among Iraqi Kurds. The migration history can be looked at in more detail looking at HVR1-HVR2 only results (since those mutations are nearly unique). The ancestral type originated in Turkmenistan around 11,000 years ago and migrated into the Near East at the very dawn of agriculture. The unchanged lineage probably reached northern Italy around 5,000 years ago as part of the Neolithic expansion along the sea routes of the Mediterranean, and then split into two branches - the northern Italian branch with the 16239 mutation, and a branch with the 194 mutation, which moved north into (southern?) Germany. One is reminded here of Otzi the Iceman, the frozen Neolithic hunter found in the Alps on the Austrian-Italian border, but genetically most resembling Sardinians.
W4d Geographic Distribution
W4d Descent Tree
Undesignated W4 lineages with only one result so far are found in Poland, Britain, Italy, Spain, Tibet, Hungary, India, and Germany.
MTDNA Haplogroup W5
W5 Geographic Distribution
W5 emerged 14,000 years ago in the steppes north of the Aral Sea. W5, together with W3 through W9, were descendants of a woman with a 194 mutation born in northwest India around 15,000 years ago. W5 is defined by two coding region mutations - 06528 and 15775. There are two identified major subgroups, W5a and W5b. The geography and timing of their origin and expansion corresponds closely to the Neolithic LBK Linear Pottery culture (9,000-7,000 years ago), followed by subsequent expansion of the Funnel Beaker culture (7,000-5,000 years ago), the Corded Ware culture (4500 years ago) and finally the expansion of the Angles and Saxons in Britain (1500 years ago).
There is one undesignated subgroup with the 3316 mutation with ancestors in Denmark and Iran. These split around 8,000 years ago so this would seem to be evidence for the origin of W5 in the Eurasian steppes before its major expansion in western Europe.
There is another undesignated subgroup with the 1119 mutation, around 4,000 years old, but with ancestors in Denmark, Germany... and among Moroccan Berbers. The Berber result is way out of place, and obvious theories would be Visigoth ancestors, or a North European slave (thousands were taken to Morocco from raids on Britain in the Middle Ages).
W5a emerged in a lineage with the 10097 coding region and 16362 HVR1 mutation around 9,000 years ago, probably in Germany. Due to the 16362 mutation W5a's can be identified by HVR1 results alone (with caution, since 16362 appears in other W subgroups). Full-sequence results for the main sub-branches W5a1 and W5a2 are only from Germany, Britain, and Ireland. There is one undesignated sub-branch in the Ukraine. HVR1-only results that are likely W5a show a wider distribution, but mainly filling in the geographic gaps and extending a bit into adjacent countries. The preponderance of British, Irish, and German results does not reflect reality, but rather the most common ancestries of the British, Irish, and American persons who most commonly have DNA tests.
W5a is represented in ancient DNA by a burial from the Late Neolithic Bell Beaker Kromsdorf site in Germany, 5000 years ago. This is an HVR1-only result and the W5a subgroup cannot be determined.
W5a1, with a 10410 coding region mutation emerged 7,000 years ago in Germany. One lineage is found in Ireland today.
W5a1a, with a 4363 coding region mutation, emerged 4,500 years ago, and spread from Germany into Denmark, Britain and Ireland.
W5a1a1, with a 9275 coding region mutation, emerged 3,000 years ago in Germany and spread to Britain and Ireland.
W5a2 Geographic Distribution
W5a2, with a 150 HVR2 mutation, emerged 6,000 years ago in Germany. It is found today in Germany, Britain, and Ireland, with single examples from Austria and Hungary.
W5b, with the 11696 coding region mutation, emerged 7,000 years ago. Representatives are found today in the Netherlands, Denmark and Ireland. There are two ancient HVR1-only results from Germany and Hungary, both around 7000 years old, which are probably W5b (16093 mutation). However they could also be ancestral to modern W3a1a2's.
MTDNA Haplogroup W6
W6 Geographic Distribution
W6 is identified by the 4093 and 8614 coding region mutations, and the HVR1 16325 mutation. The vast majority also have a 16192 HVR1 mutation. W6, together with W3 through W9, were descendants of a woman with a 194 mutation born in northwest India around 15,000 years ago. W6 appeared in the area between the Black and Caspian Seas, perhaps in what is now Georgia, around 12,000 years ago. W6 graphs as many geographically separate lineages, indicating a wide dispersal a very long time ago. The distribution of modern results, unique among W subgroups, seems to indicate a migration into Anatolia and Iran, and then into Europe and the Middle East. This would correspond to the spread of agriculture by the first Neolithic farmers beginning 10,000 years ago.
The W6 subgroups are:
W6a, with the 8610 coding region mutation, which emerged 5,000 years ago. There are, remarkably, two identical W6a ancient full-sequence results. One is from the Yamnaya culture of Samara, Russia, 5000 years ago, and the other from the Corded Ware culture of Germany, 4500 years ago. This would seem to be a clear signal of the migration of this specific lineage from the Eurasian steppes into Central Europe. Contemporary descendants with identical results are reported from Switzerland and Italy. Descendants with further mutations are reported from Russia, Slovakia, Lithuania, Poland, Switzerland, Britain - and the UAE! Surviving coding-region lineages are found in Lithuania, Russia, Slovakia, Switzerland, and the UAE. W6 is more firmly represented by three full-sequence results.
W6b Geographic Distribution
W6b, with 4646, 6297, and 8605 mutations in the coding region. This emerged in the Middle East 10,000 years ago. Current descendants are found in Dubai, Iran, Palestine, and in Germany and Great Britain.
W6c Geographic Distribution
W6c, with the 8658 and 8002 coding region mutations. This is reported from Iran, Turkey, and Armenia.
MTDNA Haplogroup W7, W8, W9, and Undesignated W+194 Groups
W7 Geographic Distribution
W7 was identified by the 185 HVR2 mutation, originally being linked to a lineage with the 6635 coding region mutation that is today found only among Armenians. Later results of a different branch, with 185 HVR2 mutation and the 7702 coding region mutation, were reported from Germany. These are most likely separate branches arising independently from W+194's.
W+194 with an additional 119 mutation is ancestral to no fewer than two designated (W8, W9) and over a dozen undesignated lineages. These had a common ancestor 11,000 years ago and are very widely dispersed geographically. W9 is identified by the 14097 mutation. Again, only two results reported so far, from Britain and Turkey, with a common ancestor 5,000 years ago. W8 is identified by the 8697 and 5147 mutations. Only two results have been reported, from Yemen and Britain. Their common ancestor was 10,000 years ago. Other lineages in this group are from Spain, Armenia, Germany, India, Italy, Denmark, Finland, Ireland, Portugal, Turkey, Bulgaria, and Iran. Each of these lineages has a separate history but is rarely reported so filling in the gaps would be pure speculation.
What do those letters and numbers mean?
Each cell of your body has a copy of the instructions that are used by the cell to maintain and duplicate itself. These instructions (the DNA) are in a single string of only four chemical compounds that are represented by the letters A, C, T, and G. The length of this string - the code - to define an entire human being - is necessarily very long. The human code has 3.165 billion letters. In letter shorthand it looks like:
and so on for 52.75 million lines.
Half of this code comes from your mother, and half from your father, with two exceptions. One exception (only if you are male) is a section of 51 million letters that came only from your father. These are the instructions that make a male different from a female (the y-DNA). The other exception is a string of separate DNA 16,568 letters long (the mtdna). This is passed from a mother to all of her children, but only her daughters pass it on. So the ydna defines your patrilineal line and your mtdna your matrilineal line. Neither of these has anything to do with all the 'in betweens' in your family tree!
It is possible to 'read' the entire 16,568 letters of mtdna code (a Full Genetic Sequence, or mtFull Sequence, or FGS). Depending on which test you purchased, all, or only a few hundreds or thousands of the letter sequence may have been read. Your mtdna result looks something like this:
HVR1 (the basic test): 16223T, 16292T, 16519C
HVR2 (the extra test): 73G, 189G, 195C, 204C, 207A, 263G, 309.1C, 315.1C
CR (Coding Region, only with the full sequence test): 709A 750G 1243C 1438G 2706G 3505G 4769G 5046A 5460A 7028T 8251A 8860G 8994A 11674T 11719A 11947G 12414C 12705T 14766T 15326G 15884C
HVR1 and HVR2 are the names of sections of that 16,568 mtdna string of letters. The HVR1 section goes from letter 16001 to letter 16569. HVR2 runs from letter 1 to letter 574. The CR runs from letter 575 to 16000. Your mtdna code, in the sections you paid for, has been tested completely. However rather than list hundreds of letters, only the differences from a standard letter sequence are shown (this standard sequence is called the Cambridge Reference Sequence, or CRS). So the result above means that where letter 73 in the CRS was an A, you have a G. Where letter 16223 in the CRS was a C, you have a T. (Normally C's change to T's and vice versa, and A's to G's and vice-versa - but there are, less often, other possibilities.)
Where a number like 309.1 or 309-1 appears, this means that an extra letter was inserted at that position, compared to the CRS code.
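To make the notation concrete, here is a minimal sketch of how such a result could be read by a program. The function names are my own invention, the region boundaries are the ones given above, and it only handles the two forms described so far (a plain letter change such as 16223T, and an insertion such as 309.1C or 309-1C).

```python
import re


def region(position):
    """Return the section of the mtdna a position falls in (1-based)."""
    if 16001 <= position <= 16569:
        return "HVR1"
    if 1 <= position <= 574:
        return "HVR2"
    if 575 <= position <= 16000:
        return "CR"
    raise ValueError(f"position {position} is outside the mtdna sequence")


# Matches "16223T" (letter change) or "309.1C" / "309-1C" (insertion).
_MARKER = re.compile(r"^(\d+)(?:[.-](\d+))?([ACGT])$")


def parse_marker(text):
    """Split one marker into its position, letter, kind, and region."""
    match = _MARKER.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised marker: {text!r}")
    position = int(match.group(1))
    return {
        "position": position,
        "base": match.group(3),
        "kind": "insertion" if match.group(2) else "substitution",
        "region": region(position),
    }


for marker in "16223T, 73G, 309.1C, 1243C".split(","):
    print(parse_marker(marker))
```

Anything else (a marker with no letter, or a position past 16569) raises an error rather than being guessed at.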
One point of confusion - the Cambridge Reference Sequence is not the code of 'Eve'. Your results are not the changes that you have compared to the matrilineal ancestor of us all. They are instead the differences between your code and that of a lady who happened to be sequenced in Cambridge, England, in 1981 (the sequence was corrected in 1999). She was actually a member of haplogroup H, the most common in Europe. To make things even more confusing, in 2013 the standard shifted to comparing results to mitochondrial Eve instead of the CRS. This is called "RSRS", while CRS had been retitled "rCRS" (revised Cambridge Reference Sequence).
The Bottom Line - What Makes You Haplogroup W?
A person is classified as belonging to Haplogroup W if they have a change to their mtdna compared to the Cambridge Reference Sequence at letters 1243, 3505, 8994, 11947, and 15884. In studies so far these five changes are unique to Haplogroup W. There are other mtdna changes that most or all W's share (see below). But these other changes are seen in other haplogroups and so do not unambiguously classify you as a W.
These changes do not involve the HVR1 and HVR2 regions (letters 16001 to 16569 and 1 to 574). However Family Tree DNA (FTDNA) conducts a separate test of 22 locations outside of HVR1 and HVR2 to assign customers to a haplogroup. FTDNA classifies you as a W if you are 1243C (the four other locations are not tested due to the expense). One of the 22 markers tested is 13263 (used to define haplogroup C). Coincidentally, this is also the change that defines haplogroup W subgroup W3a1. So if you have 13263G here, Family Tree DNA reports you as W3.
Members of the W haplogroup, compared to the Cambridge Reference Sequence, do have other common changes in their HVR1 and HVR2 mtdna. While these are not used to define W, they are common to nearly all W's. The list of common W haplogroup changes is:
In HVR1 changes of 16223T, 16292T and 16519C.
In HVR2 073G, 189G, 195C, 204C, 207A, 263G, 309.1C, and 315.1C.
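The five-position rule from this section can be sketched in a few lines. This is an illustration only: the names are hypothetical, and it checks only that a change is reported at each position, not which letter it changed to.

```python
# The five positions the text lists as unique to Haplogroup W.
W_DEFINING_POSITIONS = {1243, 3505, 8994, 11947, 15884}


def positions(markers):
    """Collect the leading number from markers such as '1243C' or '309.1C'."""
    found = set()
    for marker in markers:
        digits = ""
        for ch in marker:
            if not ch.isdigit():
                break
            digits += ch
        if digits:
            found.add(int(digits))
    return found


def is_haplogroup_w(coding_region_markers):
    """True if all five W-defining positions appear in the result."""
    return W_DEFINING_POSITIONS <= positions(coding_region_markers)


# The sample coding region result shown earlier on this page.
sample = ("709A 750G 1243C 1438G 2706G 3505G 4769G 5046A 5460A 7028T "
          "8251A 8860G 8994A 11674T 11719A 11947G 12414C 12705T 14766T "
          "15326G 15884C").split()
print(is_haplogroup_w(sample))
```

The sample result contains all five positions, so it passes; take any one of them away and it no longer does.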
Things are More Complicated Than You Thought
No sooner have you understood the basic ideas, than you find things are more complicated than you thought. Such is life. Time to grow up....
The most common mutations are a change in a letter at a location to its chemically similar partner (a 'transition') - A to G, G to A, C to T, or T to C.
Next most common is the insertion of an extra letter. These are represented in your results by a dot, followed by the number of insertions. For example, 315.1C - meaning an extra C was inserted after 315C; or 315.2C - meaning two extra C's were inserted after 315C.
Deletions can also occur, where a letter in the sequence is missing. These are indicated by a dash, for example 16183-.
Occasionally the test results show 'heteroplasmy' - meaning the mtdna in your cells shows multiple results for the same position. This can happen because a single cell contains hundreds of thousands of mtdna copies, groups of which may have mutated and have different letters at the same location; and your results are based on the average for many millions of cells, some of which may all contain a certain mutation, while others are unchanged. These ambiguous results are indicated by the letters Y (C or T - example 16093Y); R (A or G - example 16034R); M (A or C - example 16183M); W (A or T - example 16189W); N (G or A or T or C - example 16192N). U, S, M, K, V, H, B, D, and X are also used to indicate other combinations of results for the same location.
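The ambiguity letters can be treated as a simple lookup table. The sketch below covers only the five codes spelled out above; the function names are my own, and any other code is deliberately left to fail rather than be guessed.

```python
# Only the ambiguity codes explained in the text above.
AMBIGUITY_CODES = {
    "Y": {"C", "T"},
    "R": {"A", "G"},
    "M": {"A", "C"},
    "W": {"A", "T"},
    "N": {"A", "C", "G", "T"},
}


def possible_bases(marker):
    """Return the set of letters a marker such as '16093Y' may stand for."""
    code = marker[-1]
    if code in "ACGT":
        return {code}
    return AMBIGUITY_CODES[code]  # KeyError for codes not listed above


def is_heteroplasmic(marker):
    """True if the marker reports more than one letter at its position."""
    return len(possible_bases(marker)) > 1


for marker in ["16223T", "16093Y", "16192N"]:
    print(marker, sorted(possible_bases(marker)), is_heteroplasmic(marker))
```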
Back to Table of Contents
What Does It All Mean?
Your mtdna result is unique among all other types of dna results because it alone shows you an unbroken lineage going back thousands of years all the way back to genetic 'Eve'. This is your matrilineal line - your mother's mother's mother's mother's mother's....mother. Full genome results, such as 23andme or Family Finder, can help you locate relatives no farther back than five generations or so due to recombination. If you're a man, Y-dna results trace your patrilineal line back to genetic 'Adam'. Only your mtdna result gives both men and women complete insight into the matrilineage of their family and allows remote common ancestors to be identified.
Combining the information on your place on the tree with the current Old World geographic locations of your relations and rough dating for each branch of the tree via the genetic clock, allows the origins and migrations of your ancestors to be determined.
There is a lot of confusion because each of these migrations is of a single lineage, leading from Wilma, around 17,000 years ago, to you today. Each of these lineages has its own history. W's or W1's or W3's or whatever did not belong to a particular clan or tribe. Rather a particular lineage traces the migrations of a particular woman's descendants, who were part of various tribes, cultures, or peoples over the millennia. This means charts showing the 'migrations of a haplogroup', or showing the percentage of a haplogroup in a particular population, can be misleading because they are combining information on the locations and migrations of various individual lineages, which can have very different routes and histories.
It also has to be understood that the lineages that exist today are only a tiny fraction of all of those that ever existed. Genetic 'Eve' was not the only woman alive 180,000 years ago. There were tens of thousands of others, including not only genetically modern humans, but also Neanderthals, Denisovans, and other archaic humans. We now know that they interbred with modern humans, and are part of our DNA ancestry. But they have left no existing mitochondrial lineages.
You know yourself of male or female lines in your own family that have come to an end when a particular relative has no surviving children. The situation was even more dire in prehistory.
Think about it: for thousands of years, the earth's human population was relatively stable. That meant that each woman, on average, had two surviving children, of which, on average, one was a girl who could pass on her mitochondrial DNA to a new generation. However individual women may have had no children, or no girls, that survived to have children in turn. Those matrilineal lineages were lost.
Today there are thousands of women with the same mitochondrial DNA sequence. Things average out and at least some women with the same mtdna result will probably survive in each generation to pass on the lineage. But prior to the advent of agriculture, many humans lived in small nomadic groups, and the chances of losing a particular lineage were much greater.
Being a hunter-gatherer nomad on foot meant that a woman could not have more than one babe-in-arms at a time. The interval between births was lengthened by nursing children for longer than in later times, and by the use of abortion or infanticide. As a result, it is estimated that in Paleolithic times the average woman gave birth every 27 months.
Every birth carried a 10% chance of the death of the mother in childbirth. The result was that the average woman had 4.7 live births during a 13-year reproductive span (from age 16 to age 29). Half of these were girls; but with 50% infant mortality, deaths prior to reproductive age, and women who could not conceive, only enough children survived to replace their parents.
When a new mutation occurred in a paleolithic woman, there was therefore only a very small chance it would be passed on to enough descendants to ensure that it would survive in the population. As we go back in time, fewer and fewer of the lineages and haplogroups that ever existed survive today. So our tree of descendants from Wilma is woefully incomplete. It had many branches and twigs that did not survive to present times and we cannot know about (except as more ancient DNA results become available).
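To get a feel for how easily a lineage dies out in a stable population, here is a minimal sketch of the arithmetic. It rests on an assumption of mine that is not from any study cited here: that each woman leaves a random (Poisson-distributed) number of surviving daughters, averaging exactly one. The function name is hypothetical.

```python
import math

def extinction_probability(generations, mean_daughters=1.0):
    """Chance that one woman's matrilineal line has died out within the
    given number of generations.

    Assumption (mine): each woman has a Poisson-distributed number of
    surviving daughters with the given mean.
    """
    q = 0.0  # a line cannot be extinct before the first generation
    for _ in range(generations):
        # The line is extinct one generation later if every daughter's own
        # line is extinct; for a Poisson count this is exp(mean * (q - 1)).
        q = math.exp(mean_daughters * (q - 1.0))
    return q
```

Under these assumptions, `extinction_probability(100)` comes out at more than 98%, that is, fewer than one new lineage in fifty would still be around after a hundred generations, even though the population as a whole is holding steady.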
Although a migration route can be guessed at, the timing in many cases is ambiguous because the genetic clock is very rough indeed. In a full genome, there is only one new mutation in a lineage every millennium or so, on the average. But this is an average - there can be many thousands of years between mutations on some lineages. In fact, in the 18,000 years since Wilma, there are between one and 16 mutations depending on the lineage, with the average being five. Confident dating of a lineage requires a large number of descendants in the tree, but currently there may be so few that dating is very uncertain. Recently the availability of ancient DNA results allows actual values to be used at a few places in the tree.
One conclusion from the tree analysis and the ancient DNA evidence is that haplogroup W only arrived in Europe after 5000 BC, among the peoples who brought agriculture and pastoralism to the continent. So while it may be concluded that a given lineage entered central Europe from the Eurasian steppes, it may not be possible to determine if it arrived with the pastoralists that introduced the horse there around 4000 BC; or by later migrations by the Celts, Germans, Slavs, or Magyars.
Back to Table of Contents
Two Clocks Running at Different Speeds
The Coding Region (CR)
Scientists have identified two areas of the human mitochondrial DNA. One was dubbed the 'coding region' - positions 575 to 16000. The genetic material here is involved in the basic processes of cellular life. Therefore most mutations here would be damaging to the life of the cell. They would result in improper functioning of basic biochemical processes, ending with the death of the cell. If this was an egg cell from the mother, it would mean the death of any new life in the womb. Such mutations would not be passed on to descendants.
Nevertheless, there were some locations where mutations could occur and were passed on to the descendants. They were either neutral, or conferred some advantage or minimal disadvantage to the offspring. The changes here provided the long-range genetic 'clock' that allowed the descent of mankind from 'genetic Eve' to be determined. The people of the earth were assigned to 'haplogroups' based on a tree of descent derived from these coding-region mutations being passed on to descendants farther down the tree. Out of the entire coding region, it was estimated that such mutations that were passed on occurred at a rate of around 0.03 changes per location per million years (range of various calculations 0.0126 to 0.0609).
The Highly Variable Regions (HVR)
The other area of the mtdna - called the D-region, the non-coding region, the highly variable segments (HVS), or the highly variable regions (HVR) - went from positions 16001 to 16569 (HVR1), and then from 1 to 574 (HVR2). This area was thought to only assist in aligning the mtdna during replication, and any mutations would be neutral and have no effect on the organism. Mutations that occurred here would not be removed due to bad effects on cell function, and therefore would accumulate at a much faster rate. The rate of mutations that are passed on in the HVR was estimated at around 0.5 changes per location per million years - more than 15 times faster than the coding region (range of various calculations 0.0865 to 1.7957). These could be used, it was believed, to study the descent within haplogroups, in some cases in genealogical time scales. These were the original basis for genealogical genetics.
Two Clocks, Two Speeds
At first one would think there would be far more HVR mutations than CR mutations, but this is not the case. While the retained mutation rate is 15 times lower in the coding region, there are also nearly 15 times more locations there than in the HVR. Therefore, using the averages indicated above, in 10,000 years one would expect to see in the coding region around 4.6 changes (15,425 locations x 0.03 changes per location per million years / 1,000,000 years x 10,000 years ). In the HVR region this would be 5.7 changes (1,141 locations x 0.5 changes per location per million years / 1,000,000 years x 10,000 years). Given the uncertainties in the estimates, these are nearly the same number!
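The arithmetic above is easy to redo with other rates from the quoted ranges. A minimal sketch, using only the location counts and average rates given above (the function and variable names are my own):

```python
def expected_changes(locations, rate_per_location_per_million_years, years):
    """Expected number of retained mutations in a region over a span of years."""
    return locations * rate_per_location_per_million_years / 1_000_000 * years

# Average values quoted in the text
coding = expected_changes(15_425, 0.03, 10_000)  # coding region
hvr = expected_changes(1_141, 0.5, 10_000)       # HVR1 + HVR2

# Average waiting time between changes anywhere in the full sequence
years_per_change = 10_000 / (coding + hvr)
```

Adding the two regions gives a little over ten changes per 10,000 years for the full sequence, which is roughly one change per millennium. Plugging in the low and high ends of the quoted ranges instead of the averages shows how wide the uncertainty really is.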
The difference is that, with 15 times fewer locations, a particular mutation in the HVR is more likely to be flipped 'back' to its original value. Take haplogroup W as an example. The founding ancestor had a 16292T mutation. 10.5% of W's don't show the 16292T mutation - i.e. the location has mutated back to the 16292C of the Cambridge Reference Sequence. Over the many tens of thousands of years since ‘Genetic Eve’, this makes HVR useless for constructing the 'big' descent tree of mankind. But on the time scale within a haplogroup, the HVR can be nearly as reliable a guide as the CR changes. Often the number of coding region mutations from the putative ancestral sequence of the haplogroup is greater than the number of HVR changes. For example, the original study that included a number of complete sequences of the W haplogroup was done among Finnish W1 subjects. Many showed little variation in the HVR1 and HVR2 areas (few or no changes from the 'defining' W1 haplotype) but two to five changes in the coding region.
One might expect the two clocks to be running together - i.e. on the average, the more coding region changes, the older the lineage, and the more HVR mutations. But this does not occur at all due to the low number of average mutations (less than ten within most haplogroups that emerged after the last Ice Age). Within full-sequenced haplogroup W individuals, there is no correlation between the number of coding region changes and the number of HVR region changes.
What this also means is that, depending on your luck, your HVR1 or HVR2 result alone may provide a unique indicator to your ancestry. However it also may not – you may have a ‘vanilla’ result typical for your haplogroup or a mutation that is shared by several subgroups within the haplogroup. In such cases only getting an mtdna full genome sequence will clarify the situation.
Back to Table of Contents
Useful for Genealogy?
The mtdna clock is too slow to tell you, for example, that someone is your fifth cousin because of a difference in your mtdna results. On the average, even if you have the complete mtdna sequenced, there is only one change every millennium or so. If you only had HVR sequenced, that's two millennia. However having the same mtdna as someone else, if you've either had an FGS done or are lucky enough to have a distinctive HVR result, can be a great aid in genealogical research. For in that case it may indicate that you share a common immigrant ancestor. For example, all those with the W3a2 "French W" motif in North America seem to trace their ancestry back to Marie Marguerie, who migrated to Quebec in 1641. Another person, a Hungarian of German heritage, whose ancestors were likely resettled there from Bavaria in the 18th Century, has a full FGS match with an American whose ancestors came from a certain town in Bavaria. This makes it very likely that this is where her ancestors came from in the 1700's, even without documentary evidence.
On a deeper level, your mtdna results can provide details on the prehistoric migrations of your ancestors. Were they among the immigrants to Europe that brought agriculture to the continent 6000 years ago? Or others that brought horses and the wheel 5000 years ago? Or even later Slavs, Norsemen, Magyars, or Ashkenazi Jews, who migrated into various parts of Europe in historical times? Your mtdna result may provide the answer!
Back to Table of Contents
HVR, HVS, and all that
Looking around the net, you'll quickly find that while FTDNA and others call the areas tested HVR1 and HVR2 (highly-variable regions 1 and 2), in other places HVS-I and HVS-II are mentioned (not even getting into HVR3 and HVS-III).
As with a lot of things involving science, it turns out to be messy ('exact science' = oxymoron). The definition of HVS can vary from paper to paper (and sometimes isn't even defined). HVR-1 extends from location 16001 to 16569, whereas HVS-I in older papers was 16090-16365 and in more recent papers was 16037-16518. Similarly, HVR2 is locations 001 to 574, whereas the older version of HVS-II was 68 to 263 and the later version is 74 to 300. The reason for the narrower ranges was to save money when conducting a large number of tests, and also to ignore pesky loci that seemed to never change or changed too readily ('hot spots'). The issue of 'hot spots' - loci to ignore because if you include them it makes your nice network of descent and relationship into a spaghetti-like diagram - is controversial. Which ones to ignore seems to vary from haplogroup to haplogroup. 16519 is widely seen as the most unstable of hot spots, but in haplogroup W it is as solid as a rock. In haplogroup W, location 119 seems to spring up everywhere in the diagram.
The bottom line - when comparing lots of results, you have to go to the 'lowest common denominator' HVS-I / HVS-II ranges, otherwise you won't be able to use a lot of the published data.
So now you know.
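In practice, 'going to the lowest common denominator' means throwing away any result outside the range that every study covered. A minimal sketch follows; taking the overlap of the older and newer HVS ranges quoted above is my own reading of 'lowest common denominator', and the names are hypothetical.

```python
import re

# Overlap of the older and newer ranges quoted above (my assumption):
# HVS-I 16090-16365 vs 16037-16518, HVS-II 68-263 vs 74-300
HVS1_COMMON = (16090, 16365)
HVS2_COMMON = (74, 263)

def position_of(token):
    """Numeric location of a result such as '073G' or '309.1C'."""
    return int(re.match(r"\d+", token).group())

def restrict_to_common(results):
    """Keep only the results that fall inside the common HVS-I / HVS-II ranges."""
    kept = []
    for token in results:
        pos = position_of(token)
        in_hvs1 = HVS1_COMMON[0] <= pos <= HVS1_COMMON[1]
        in_hvs2 = HVS2_COMMON[0] <= pos <= HVS2_COMMON[1]
        if in_hvs1 or in_hvs2:
            kept.append(token)
    return kept

# The common W changes listed earlier on this page
w_common = "16223T 16292T 16519C 073G 189G 195C 204C 207A 263G 309.1C 315.1C".split()
comparable = restrict_to_common(w_common)
```

With these ranges the common W motif loses 16519C, 073G and the two insertions, which is worth remembering before concluding that a published sample 'lacks' them.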
Back to Table of Contents
rCRS, RSRS, and all that...
The Cambridge Reference Sequence (CRS) was the first fully-sequenced mtdna. As such, it was used as the reference sequence, meaning that instead of spelling out all 16,569 letters of each new sequence discovered, it was only necessary to list those locations and letters different from the CRS. After some refinement this reference was called the revised Cambridge Reference Sequence (rCRS).
Unfortunately this caused endless confusion, since the sequences were not being compared to mitochondrial Eve (the matrilineal ancestress of all mankind) but to the sequence for a European Haplogroup H individual. It was sort of like a large family each listing their differences between themselves and a cousin, or nephew, rather than their common grandmother.
After years of study the sequence for mitochondrial Eve could be reconstructed with enough confidence to make that the new standard. This was issued in 2012 and called the Reconstructed Sapiens Reference Sequence (RSRS).
So Wilma's mtdna expressed in terms of CRS or rCRS:
HVR1: 16223T 16292T 16519C
HVR2: 073G 189G 195C 204C 207A 263G 309.1C 315.1C
Coding Region: 709A 750G 1243C 1438G 2706G 3505G 4769G 5046A 5460A 7028T 8251A 8860G 8994A 11674T 11719A 11947G 12414C 12705T 14766T 15326G 15884C
Changed to the following in RSRS:
HVR1: 16129G 16187C 16189T 16230A 16278C 16292T 16311T
HVR2: 146T 152T 189G 204C 207A 247G
Coding Region: 709A 769G 825T 1018G 1243C 2758G 2885T 3107X 3505G 3594C 4104A 4312C 5046A 5460A 7146A 7256C 7521G 8251A 8468C 8655C 8701A 8994A 9540T 10398A 10664C 10688G 10810T 10873T 10915T 11674T 11914G 11947G 12414C 13105A 13276A 13506C 13650C 15884C
It is quite different, but the mitochondrial Eve version clearly shows the changes over time and relates directly to Wilma's Chain of Being - the mutations and migrations that led from mitochondrial Eve to Wilma, which were as follows:
Emerged 151,600 to 233,600 years ago in Africa
Emerged 130,000 to 200,000 years ago in Africa.
Mutations: 146T 182T 4312C 10664C 10915T 11914G 13276A 16230A
Emerged 117,000 to 180,000 years ago in Africa.
Mutations: 152T 2758G 2885T 7146A 8468C
Emerged 90,000 to 140,000 years ago in Africa.
Mutations: 195T 247G 825T 8655C 10688G 10810T 13105A 13506C 15301A 16129G 16187C 16189T
Emerged 80,000 to 130,000 years ago in Africa.
Mutations: 4104A 7521G
Emerged 66,000 to 106,000 years ago in Africa.
Mutations: 182C 3594C 7256C 13650C 16278C
Emerged 57,000 to 87,000 years ago in northeast Africa or the Arabian Peninsula. This is the 'out of Africa' lineage.
Mutations: 769G 1018G 16311T
Emerged 56,000 to 87,000 years ago in Near East. This was the first Eurasian haplogroup. Full genome studies indicate that they interbred with Neanderthals, although no Neanderthal mtdna lineage survived.
Mutations: 8701A 9540T 10398A 10873T 15301G
Emerged 49,000 to 75,000 years ago in Central Asia. N2 is the deduced ancestor of the N2a and W subgroups. No person with this sequence is known to exist today.
Mutations: 189G 709A 5046A 11674T 12414C
Emerged 13,000 to 29,000 years ago in South Asia.
Mutations: 195C 204C 207A 1243C 3505G 5460A 8251A 8994A 11947G 15884C 16292T
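The chain can be checked against Wilma's RSRS list given earlier by simply adding up the steps. Here is a rough sketch, with one simplifying assumption of mine: a second change at a location already changed is treated as a flip back to the mitochondrial Eve value and cancels out (this is what happens at 182, 195 and 15301). The step lists are copied from the chain above, with 10915T written as it appears in the RSRS list; the function name is hypothetical.

```python
import re

# One string per step of the chain, from mitochondrial Eve down to Wilma
CHAIN = [
    "146T 182T 4312C 10664C 10915T 11914G 13276A 16230A",
    "152T 2758G 2885T 7146A 8468C",
    "195T 247G 825T 8655C 10688G 10810T 13105A 13506C 15301A 16129G 16187C 16189T",
    "4104A 7521G",
    "182C 3594C 7256C 13650C 16278C",
    "769G 1018G 16311T",
    "8701A 9540T 10398A 10873T 15301G",
    "189G 709A 5046A 11674T 12414C",
    "195C 204C 207A 1243C 3505G 5460A 8251A 8994A 11947G 15884C 16292T",
]

def accumulate(chain):
    """Add up the steps; a repeat hit on a location is assumed to be a back-mutation."""
    motif = {}
    for step in chain:
        for token in step.split():
            pos = int(re.match(r"\d+", token).group())
            if pos in motif:
                del motif[pos]      # flipped back: no longer differs from Eve
            else:
                motif[pos] = token  # new difference from Eve
    return [motif[pos] for pos in sorted(motif)]

wilma = accumulate(CHAIN)
hvr1 = [t for t in wilma if int(t[:-1]) >= 16001]
hvr2 = [t for t in wilma if int(t[:-1]) <= 574]
```

The HVR1 and HVR2 parts of the result reproduce the RSRS lists given for Wilma above. The coding region part matches too, apart from 3107X, which does not come from any step in the chain.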
Back to Table of Contents
In the early 2000's the prevailing theory was that central and northern Europe were populated by Paleolithic peoples from 'ice age refuges' in Spain and Italy after the glaciers melted. These first people on the land stayed there, and the genetic makeup of Europe was therefore fixed 10,000 years ago. Subsequent invasions of Europe - by agriculturalists, horse nomads, Celts, Germans, Greeks, Italics, Slavs, Hungarians, and so on involved transfer of technology, culture, or ruling classes, and did not make a big imprint on the genetic makeup of Europeans.
However as more and more results of ancient DNA extracted from Paleolithic skeletons were published, it became apparent that their y-dna and mtdna types were quite different from those of modern Europeans. General opinion shifted until it was taken that Neolithic agriculturalists and pastoralists displaced the earlier inhabitants - if not by war, then simply by outbreeding them due to the more intense exploitation of the land and higher population densities of farmers and herders compared to hunter-gatherers. These new arrivals came via multiple routes - via the Danube Valley, by sea in the Mediterranean, and by horse and wagon from southern Russia.
W skeletons, as in the modern population, are rarer. However there are some results. No W has been found in Paleolithic Europe so far. The oldest W's reliably reported are from the Linear Pottery Culture (LBK; 5,500-4,900 calibrated B.C.) site Derenburg Meerenstieg II in Germany. Two of the 22 individuals for which an HVS1 sequence could be obtained were W's with the 16093 16223 16292 motif. Mutation 16093 is found (in combination with other HVR1 mutations) in modern populations as W3a1a (with the 13263G coding region mutation) and as W4 (with 13263A in the coding region and the infamous 194-143-196-192 in HVR2). The basic motif is currently reported from Turkey (2), Netherlands (2), Britain (1), Ireland (2). However 16093 plus some other mutations is much more wide ranging. 7000 years since the LBK is enough time for another HVR1 mutation or more to have occurred.
There is an even older W, which predates LBK, but this is a very doubtful result. This was from Unseburg Germany 6550 BC, but 'multiple sequences' were read from the skeleton (which was an isolated find from the 1930's), 'crystal outside of physiological range' had formed on the exterior, and 'ambiguous results' were obtained. So the idea that W was absent in Europe prior to the arrival of middle eastern agriculturalists is not disproven by this sample.
Ancient DNA results show W+119 was present in Germany 7300 years ago, in burials of the LBK culture. Finds of the same type continue in the later Schöningen, Salzmünde, and Bernburg cultures through 5000 years ago.
Results from the Tomb of the Shroud in Israel (originally purported to be the family of Jesus) date to the first century AD and are believed to be W, but aside from excluding W6, could be any of the known groups.
There are what seem to be W3's with the 16295 and 16304 HVR1 mutations from the Spanish Neolithic, one group 5500 years ago, another 2500 years ago.
More concretely, there are full genome results from Germany for W3a1 4000 years ago, and W3a1a from the Yamnaya culture in Samara, Russia, 5000 years ago. The German result has no modern descendants discovered yet. The Russian result is ancestral to modern results from the Ukraine and Poland, with a genetic clock origination date of 8000 years ago.
W5a is represented in ancient DNA by a burial from the Late Neolithic Bell Beaker Kromsdorf site in Germany, 5000 years ago. There are two results from Germany and one from Hungary, all around 7000 years ago, which are probably W5b (16093 mutation). However they could also be ancestral to modern W3a1a2's.
W6 is more firmly represented by three full-sequence results. Here there are, remarkably, two identical W6a results. One is from the Yamnaya culture of Samara, Russia, 5000 years ago, and the other from the Corded Ware culture of Germany, 4500 years ago. This would seem to be a clear signal of the migration of this specific lineage from the Eurasian steppes into Central Europe. W6a is dated by modern results to 4500 years ago, with contemporary descendants in Russia, Slovakia, Lithuania, Poland, Switzerland, Britain - and the UAE!
There is a full-sequence W6c, also from the Yamnaya culture, 5000 years old. This is the ancestral type, with no modern counterparts. Descendants are found today in Iran, Armenia, and Turkey, perhaps indicating migration of another stream of the Yamnaya culture south.
Results to date:
W but not W6; Germany; Unseburg Germany 6550 BC ; 16223T 16292T - but 'multiple sequences'; 'crystal outside of physiological range'; 'ambiguous results'
W but not W6; Hungary; Starcevo Alsonyek-Bataszek 5500 - 4500 BCE [BAM3]; 16223T 16292T
W but not W6; Germany; Bernburg Benzingerode 3200-2800 BCE [BENZ 15]; 16223T 16292T
W but not W6; Kazakhstan; Steppe nomads Kazakhstan Zevakinskiy 800-600 BC ; 16223T 16292T
W but not W6; Israel; Jewish Israel Tomb of the Shroud Jerusalem [SC17 F] F 0-100 AD; 16223T 16292T
W but not W6; Israel; Jewish Israel Tomb of the Shroud Jerusalem [SC7 M] M 0-100 AD; 16223T 16292T
W but not W6; UK; Anglian England Norton Cleveland Market N57] 400-600 AD; 16223T 16292T
W but not W6; Russia; Ancient Yakut DNA, 1780. Female population from Southern Siberian tribes; 16223T 16292T
W1+119; Germany; LBK_EN Halberstadt-Sonntagsfeld 5298-5247 calBCE; 16223T 16292T 16519C; 073G 119C 189G 195C 204C 207A 263G; 709A 750G 1243C 1438G 2706G 3505G 4769G 5046A 5460A 7028T 7864T 8251A 8860G 8994A 11674T 11719A 11947G 12414C 12705T 14766T 15326G 15884c
W1+119; Germany; LBK_EN Viesenhäuser Hof, Stuttgart-Mühlhausen 5500-4800 BCE; 16223T 16292T 16519C; 073G 119C 189G 195C 204C 207A 263G; 709A 750G 1243C 1438G 2706G 3505G 4769G 5046A 5460A 7028T 7864T 8251A 8860G 8994A 11674T 11719A 11947G 12414C 12705T 14766T 15326G 15884c
W1+119; Germany; Schöningen Salzmünde 4200-3350 BCE [SALZ 19]; 16223T 16292T; 073G 119C 189G 195C 204C 207A 263G;
W1+119; Germany; Bernburg_MN Benzingerode 3101-2919 calBCE; 16223T 16292T 16519C; 073G 119C 189G 195C 204C 207A 263G; 709A 750G 1243C 1438G 2706G 3505G 4769G 5046A 5460A 7028T 7864T 8251A 8860G 8994A 11674T 11719A 11947G 12414C 12705T 14766T 15326G 15884c
W1+119; Germany; Schöningen Salzmünde 4200-3350 BCE [SALZ 20 and 35]; 16223T 16292T; 073G 094A 119C 189G 195C 204C 207A 263G;
W1a or W1i or W3a1a; Israel; Jewish Israel Tomb of the Shroud Jerusalem [SC7 ?] ? 0-100 AD; 16292T
W1a or W1i or W3a1a; Israel; Jewish Israel Tomb of the Shroud Jerusalem [SC7 F] F 0-100 AD; 16292T
W1a or W1i or W3a1a; Israel; Jewish Israel Tomb of the Shroud Jerusalem [SC4] M 0-100 AD; 16292T
W3; Spain; Spain Cami de Can Grau Granollers 3500-3000 BC; 16223T 16292T 16295T 16304C
W3; Spain; Cami de Can Grau Granollers, Barcelona 5.5 kya; 16223T 16292T 16295T 16304C
W3; Spain; Iberian Spain Torrelo Bonerot 700-500 BC; 16223T 16292Y 16295Y 16304Y 16311Y
W3; Spain; Iberian Spain Torrelo Bonerot,Castellon 700-500 BC ; 16223T 16292Y 16295Y 16304Y
W3a1 ; Germany; 3 Unetice_EBA Esperstedt 2118-1961 calBCE ; 16147G 16223T 16292T 16519C; 073G 189G 194T 195C 204C 207A 263G; 709A 750G 1243C 1406C 1438G 2706G 3505G 4769G 5046A 5211T 5460A 6267A 7028T 8251A 8860G 8994A 11674T 11719A 11947G 12414C 12705T 13263G 14025C 14766T 15326G 15784C 15884c
W3a1a ; Russia; Yamnaya Lopatino II, Sok River, Samara 3500-2700 BCE; 16223T 16292T 16519C; 073G 189G 194T 195C 204C 207A 263G; 709A 750G 1243C 1406C 1438G 2706G 3505G 4769G 5046A 5460A 7028T 7151T 8251A 8860G 8994A 11674T 11719A 11947G 12414C 12705T 13263G 14766T 15326G 15784C 15884c 15951G
W5a; Germany; Kromsdorf 2,600-2,500 BC; 16223T 16292T 16362C; 073G 189G 194T 195C 204C 207A 263G;
W5b; Germany; Derenburg Meerenstieg II 5500-4900 BC; 16093C 16223T 16292T
W5b; Germany; Derenburg Meerenstieg II 5500-4900 BC; 16093C 16223T 16292T
W5b; Hungary; Starčevo Lánycsók, Gata-Csotola 5500 - 4500 BCE [LGCS1]; 16093C 16223T 16292T
W6a ; Russia; Yamnaya Lopatino II, Sok River, Samara 3500-2700 BCE; 16192T 16223T 16292T 16325C 16519C; 073G 189G 194T 195C 204C 207A 263G; 709A 750G 1243C 1438G 2706G 3505G 4093G 4769G 5046A 5460A 7028T 8251A 8610C 8614C 8860G 8994A 11674T 11719A 11947G 12414C 12705T 14766T 15326G 15884c
W6a ; Germany; Corded_Ware_LN Esperstedt 2566-2477 calBCE ; 16192T 16223T 16292T 16325C 16519C; 073G 189G 194T 195C 204C 207A 263G; 709A 750G 1243C 1438G 2706G 3505G 4093G 4769G 5046A 5460A 7028T 8251A 8610C 8614C 8860G 8994A 11674T 11719A 11947G 12414C 12705T 14766T 15326G 15884c
W6c; Russia; Yamnaya Lopatino I, Sok River, Samara 3090-2910 calBCE; 16192T 16223T 16292T 16325C 16519C; 073G 189G 194T 195C 204C 207A 263G; 709A 750G 1243C 1438G 2706G 3505G 4093G 4769G 5046A 5460A 7028T 7852A 8002T 8251A 8614C 8658T 8860G 8994A 11674T 11719A 11947G 12414C 12705T 14766T 15326G 15884c
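If you want to work with the list above in a spreadsheet or script, each entry can be split on its semicolons. A minimal sketch; the field names are my own guess at what each column holds (haplogroup; country; culture, site and date; HVR1; HVR2; coding region), and irregular entries such as the Unseburg one, with its extra remarks, will not fit the pattern.

```python
def parse_record(line):
    """Split one semicolon-delimited ancient DNA entry into named fields.

    Field names are assumptions; entries without HVR2 or coding region
    results simply get empty lists for those fields.
    """
    parts = [p.strip() for p in line.split(";")]
    parts += [""] * (6 - len(parts))  # pad short entries
    haplogroup, country, site_and_date, hvr1, hvr2, coding = parts[:6]
    return {
        "haplogroup": haplogroup,
        "country": country,
        "site_and_date": site_and_date,
        "hvr1": hvr1.split(),
        "hvr2": hvr2.split(),
        "coding": coding.split(),
    }

record = parse_record(
    "W5b; Germany; Derenburg Meerenstieg II 5500-4900 BC; 16093C 16223T 16292T"
)
```

Here `record["hvr1"]` holds the three HVR1 results, and the HVR2 and coding region lists are empty because only HVS1 was obtained for that skeleton.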
The question of Jewish Haplogroup W mtdna types has been brought up, sometimes in terms of the old controversy on the origin of Ashkenazi Jews (converted Khazars versus Jews from the diaspora). Of course Haplogroup W is but one haplogroup found among the worldwide Jewish population. What we find are separate lineages with different population histories. So one Ashkenazi lineage may represent a Central Asian origin for that lineage, and another Ashkenazi lineage has a Middle Eastern origin. This supports recent full-genome research that indicates that both theories are correct...
There are the following known Jewish W lineages:
W1d, defined by coding region mutations 08383, 09278, and 14981. There is only one full genome result so far, that of an Israeli Jew of Iraqi origin. The person had numerous additional changes, including loss of 16292 in HVR1, 16260 and 16298 in HVR1, and 00189, 00194, 00200 in HVR2. There are similar HVR1 / HVR2 results from Iran, Iraq, and Cyprus. The type seems therefore to have emerged in the earliest agricultural societies, and spread into the eastern Mediterranean, the Levant, Iraq, and Iran.
W1h, defined by the 16145 mutation in HVR1, emerged around 9,000 years ago. The major branch of this tree is found among the Ashkenazi Jews of Eastern Europe. Other single-ancestor lineages are in Italy and the Czech Republic, although these may represent occasions where the 16145 mutation arose independently. Interestingly, HVR1-only results with the -16292 16145 motif are also found among Polish Roma...
W3a1, with HVR1/2 results only, found in Iraqi Jews (with a 16124 mutation) and a Sephardic Jew from Istanbul (with a 16234 mutation) - perhaps W3a1a's, but not W3a1a1 since they don't have the 16291 mutation.
W3a1a1 emerged in Russia around 8,000 years ago. One distinct branch, with the 16291 mutation, appeared around 1,000 years ago and is today found only among Ashkenazi descendants.
W6, HVR1/2 results only, found in Ethiopian and Iranian Jews (with different HVR1 mutations indicating a split long, long ago).
How to Read the Family Tree Diagrams
The phylogenetic trees are generated by Fluxus Network software. A group of mtdna sequences is entered into the software, which then reconstructs all possible, shortest, least complex phylogenetic trees for those sequences. Often multiple alternate trees result, differing only in detail at a few places. I arbitrarily select the one that seems most reasonable for presentation on this site. In the case of HVR1 or HVR1 + HVR2 trees, the one most closely matching the results of the full sequence tree is used. Let's study one example:
In these trees, the central, ancestral type is indicated (in this case, 'W1', with the basic HVR sequence 16223 16292 16519 / 073 189 195 204 207 263). Each line away from the central type shows the additional mutations leading to the next motif on the tree. For example, DEX586, just to the right of the central type, has a '0119' marked on the line. This means that DEX586 has the motif 16223 16292 16519 / 073 119 189 195 204 207 263 (W1's result with the addition of the 119 mutation).
Moving further on the tree, we see four descendant lineages of DEX586. The one to the upper right, BE3509, has "16295" on the line. So BE3509's motif is 16223 16292 16295 16519 / 073 119 189 195 204 207 263 (DEX586's result with the addition of the 16295 mutation).
mv1, farther up the tree to the right of BE3509, has '16292' on the line. This indicates a change from BE3509 at 16292, which however is already present in the BE3509 motif. What this means is that 16292 has changed back to the CRS value, so would not be reported as a change to CRS, so that the motif for mv1 is 16223 16295 16519 / 073 119 189 195 204 207 263 (BE3509's motif without the 16292).
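The rule for reading a line on the tree boils down to 'flip each location marked on the line': if the location is not yet in the motif it is added, and if it is already there it drops out (back to the CRS value). A small sketch of the walk from W1 out to mv1, with function names of my own choosing:

```python
def apply_line(motif, locations_on_line):
    """Follow one line of the tree: toggle each location marked on it."""
    result = set(motif)
    for loc in locations_on_line:
        result ^= {loc}  # add if absent, remove if already present
    return result

def show(motif):
    """Write a motif in the 'HVR1 / HVR2' style used on this page."""
    hvr1 = sorted(p for p in motif if p >= 16001)
    hvr2 = sorted(p for p in motif if p < 16001)
    return " ".join(str(p) for p in hvr1) + " / " + " ".join(f"{p:03d}" for p in hvr2)

W1 = {16223, 16292, 16519, 73, 189, 195, 204, 207, 263}
DEX586 = apply_line(W1, [119])        # '0119' on the line
BE3509 = apply_line(DEX586, [16295])  # '16295' on the line
mv1 = apply_line(BE3509, [16292])     # '16292' on the line: flips back to CRS
```

`show(mv1)` then gives the motif for mv1: the same as BE3509 but without 16292.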
The labels of the motifs consist of the two-letter ISO country code for the most remote known matrilineal ancestor for that motif; and the four right-most characters of the motif's GenBank, FTDNA, mitosearch, or other identifier. 'UN' indicates unknown ancestry. 'mv' motifs are intermediate types inferred by the Network software, but (so far) not represented by an existing lineage. The larger yellow circles indicate multiple individuals reporting that motif. Multiple countries for such motifs are listed on the chart.
Dates for the emergence of a branch, as calculated using the 'genetic clock' are indicated in terms of thousands of years (kya) before the present. These are the average of a very wide range of possible values (usually plus or minus 50%). The range is so great due to the small number of samples. Those marked '(actual)' are not such estimates, but actual results based on ancient DNA.
Back to Table of Contents
Haplogroup W Resources
Family Tree DNA Members
I can provide a report to those who tested with Family Tree DNA with full sequence results who want to understand their results better. To do this, go to your mtDNA - Results - rCRS results page. There are two tabs at the bottom, 'RSRS Values' and 'rCRS Values'. Make sure you are on the 'rCRS Values' tab. Copy the Mutations section ('HVR1 differences from rCRS, HVR2 differences from rCRS, Coding Region Differences from rCRS') from the web page and paste into an e-mail and send it to me at this address. If you have any information on your matrilineal (mother's mother's mother's mother's...) European or Asian ancestor's place of residence, please mention that.
With the introduction of the v5 chip by 23andme in August 2017, 26 key markers informative of placing 23andme users on the W family tree were removed. Therefore, I no longer provide the analysis service to 23andme users. They can still use James Lick's Hapmap to get a best estimate of their placement on the W family tree. If you wish me to interpret your Hapmap results as to geographic origin, then save the results page as html or print it to pdf and email it to me as an attachment at this address. If you have any information on your matrilineal (mother's mother's mother's mother's...) European or Asian ancestor's place of residence, please mention that.
Family Tree DNA provides a public view of some members' Haplogroup W results at the Haplogroup W and N2a Project. FTDNA members may join the W and N2a Haplogroup discussion group.
The Haplogroup W Facebook page has pretty low participation but is not totally inactive.
Phylotree.org provides the latest mtdna family tree and is the semi-official source for subgroup designations.
Ian Logan provides invaluable resources on current Genbank full mtdna sequences, and provides instructions and assistance on how to submit your full genome results to Genbank.
The Wikipedia Haplogroup W page has relatively limited and out-of-date information.
Mitosearch will allow you to search for hvr-only Haplogroup W matches.
© 2001-2017, Mark Wade