Who Are You: Are Home DNA Genealogy Kits Accurate?

18 12 2019

Today, I’ll tackle the big question about direct-to-consumer genealogy genetic testing kits: you know, like 23andMe, AncestryDNA (from Ancestry.com), and Geno 2.0 (from National Geographic). They purport to tell you where your ancestors came from.

And that question is: are they accurate? Do they tell you where your ancestors came from? Or are they BS? I’ve often talked to people with questions about these tests: “So I’m 30% French! Can I get citizenship and celebrate Bastille Day?” “My Dad is from Thailand, why doesn’t this say I have any Thai ancestors??” Stuff like that. Maybe it’s because I’m the only anthropologist many people seem to know, but I get asked this question a lot, so I figured I’d write something up to help answer questions.

Now, let me be candid: I have some training in population genetics, but it is not my primary field (my Ph.D. is in cultural anthropology). I am 100% happy to have any population geneticists or biological anthropologists weigh in on places where I am incorrect, or add anything they think is important (please let me know in the comments section). That said, I have discussed this with a number of people whose expertise is this type of thing, and verified as much as possible of what is below with them.

So, are the tests for real? Yes and no. They are not hoaxes, or lying; the results they are sending you come from real genetic tests, and they’re giving you real data. BUT TO BE CLEAR, THESE TESTS CANNOT GIVE YOU A COMPLETE MAP OR PERCENTAGE OF YOUR ANCESTORS THAT CAME FROM SPECIFIC COUNTRIES. Specifically, the results they give you are very generally correct (there is some DNA in you present from those groups), but the percentages are likely to be wildly off, and more importantly, there is a huge percentage of your DNA they are missing. The problem is twofold: 1) the interpretation they present of your genetic data and 2) what’s missing from the tests (or what they simply can’t reveal). So it’s not that the tests are false, per se; the problem is generally what the companies are telling you, and what they’re telling you that the results mean. Or rather, what they are leading you to believe these tests mean.

THE INTERPRETATION

So let’s take things one at a time. First off, the tests do not tell you what percentage of your ancestry is from each different country. A lot of folks submit their sample and, based on the advertising they see on TV or in magazines, expect something like a map, that says they are 35% Irish, 27% Korean, 48% Chilean, etc. Let’s be clear: these tests DO NOT SHOW YOU THAT. An acquaintance got back a test that said “15% African” and then literally claimed he was now allowed to make African-American jokes (hint: NO. NOT AT ALL. Even if that’s what the test was really saying). Here are two examples of that type of result:

Ancestry DNA Result Screenshot

Weirdly, some of these results are by country (Senegal), some by region (Africa North) and some by general culture area (Native American). Genetically I understand why, but that’s a very strange way to report them, and they are presented as roughly equivalent groupings. And they all have very specific percentages!

Ancestry DNA Screenshot 2

These results are from a different company, but with similar results: a mixture of types of sources, but with very specific % amounts, some seemingly overlapping to get you to 100%

There are a bunch of reasons for this. One is what the tests are measuring: they test you for certain genetic markers, and then match those markers to those in their database of tested populations. So what they are actually saying is that some of the genetic markers they are testing for (and there are a lot they aren’t using) match the markers in test populations taken in those countries, generally within the last 10-15 years. Now, you can probably see the first problem there: what if your ancestors came to the United States 50 years ago? Or 100, or heaven forbid, 200 years ago? Those populations – the ones in those countries that would actually match your ancestors there – have never been tested, because of course nobody was doing specific DNA testing 100 years ago. These ancestry tests can’t account for recent migrations in those areas, or other populations coming into those areas and mixing genetically with the current populations.

Here’s another: human genetic variation can be geographical, but it’s not politically bordered (thus “broadly Northwestern European” as a result on one of the test examples above). As one geneticist noted, “Present-day patterns of residence are rarely identical to what existed in the past, and social groups have changed over time, in name and composition.”

Let’s hit up another huge problem: the data that they are presenting to you is actually a range of possibilities based on their test, but it is reported as if it is a real, specific percentage. This is what anthropologist Jonathan Marks calls “plausibility genetics.” For example, with 23andMe, a spokesperson noted that their results are based on a sliding confidence scale, ranging from 50-90 percent (it isn’t revealed how they reached these percentages; it’s likely a combination of calculations based on their confidence intervals and the margin of error). Think of how that would play out with any other scientifically-derived information. If you picked a mushroom in the woods, then asked a mycologist if it was poisonous, and they said “sure! You can totally eat that!” and you said “how certain are you?” and they replied “50%!” then you would gag, and hack up the mushroom you’d just eaten (apparently you ate it before asking about the confidence interval).

Ancestryrtange

If you got this set of results, you might think you were 30% Irish (Happy St Patrick’s Day!); but by checking the details, you can see the actual range they’re giving you is “somewhere between 14%-46% [probably]”

I’ve seen one 23andMe result that said “8.2% British and Irish.” Okay, WTF? The confidence varies by 50%, and they’re telling you to within a tenth of a percentage point? And keep in mind, they’re not the only ones. Ancestry’s senior manager of population genomics admitted “While we think you’re, say, 20 percent Irish, it could be as low as 5 (percent) and it could be as high as 35 (percent). That range will vary depending on that particular estimate and also on other populations that make up who you are.” And because of the way the companies’ algorithms work, the way those percentages are reported within the confidence interval can actually vary: for one set of triplets – who have identical genetics – tests reported three completely different percentages of British Isles ancestry: one was 59%, a second 66%, and the third 70%. Now, those numbers aren’t hugely far off from each other, but when you’re making claims that give people a very specific percentage result (66%?), then those differences do matter: there’s over an 18% relative increase between 59 and 70 percent, so these genetically identical siblings were reported as significantly different from each other.
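To make the triplet arithmetic concrete, here is a minimal sketch that takes the lowest and highest triplet results quoted above (59% and 70%) and computes the relative gap between them. The function name is my own, purely for illustration; it has nothing to do with how any testing company calculates anything.

```python
def percent_increase(low: float, high: float) -> float:
    """Relative increase from low to high, expressed as a percentage of low."""
    return (high - low) / low * 100


# Reported British Isles percentages for the three identical triplets (from the text).
triplet_results = [59, 66, 70]

spread = percent_increase(min(triplet_results), max(triplet_results))
print(f"Highest result is {spread:.1f}% larger than the lowest")
```

The gap is 11 percentage points, but measured against the lowest result it is a relative increase of about 18.6%, which is where the "over 18%" figure comes from.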

THE LIMITATIONS OF THE TESTS

Here’s another problem: there’s also a tremendous amount they’re missing. Let’s start with a standard DNA test: to briefly repeat, you have 25% genetic relation to each of your grandparents. Let’s say 3 of them came from Ireland, and the fourth was from the Sudan. Depending on what markers the ancestry tests were using and what genes you inherited from each grandparent, they could show that you have either 0% Irish ancestry or 0% Sudanese ancestry. Either way: BIG PROBLEM.

Things get even worse because a lot of the tests use mitochondrial DNA to delineate your ancestry. It’s only transmitted through the female line, so that means you would only have the mitochondrial DNA marker of your mother, and she would only have it of hers, who would only have it of hers, etc. So right off the bat, you’re missing 50% of the genetic matching just of your parents! You’ll get nothing from your father’s side here. And it gets worse fast: go to your parents’ parents and you’re excluding 75% of the genetic possibilities: that’s right, this “complete” genetic test you’ve got is missing three-quarters of your ancestry just from your grandparents’ generation, let alone any farther back. Of your eight great-grandparents, you’ll only have a mitochondrial DNA match from one of them. Keep in mind, you are equally genetically descended from all of your grandparents, but this test would only show that you were related to one of them.

mtdna_med

A pretty good visual representation of what information you would get with an analysis of nuclear DNA vs. mitochondrial DNA (credit to the University of CA Museum of Paleontology)
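If you want to see how fast the mitochondrial line shrinks, here is a small sketch of the arithmetic. It assumes a simple family tree with no overlap (every ancestor is a distinct person) and that only the strict mother-to-mother line is traced; the function name is hypothetical.

```python
from fractions import Fraction


def mtdna_coverage(generations_back: int) -> Fraction:
    """Fraction of ancestors in a given generation who sit on the strict maternal line.

    Assumes every ancestor is a distinct person (no pedigree collapse).
    """
    ancestors = 2 ** generations_back  # 2 parents, 4 grandparents, 8 great-grandparents...
    return Fraction(1, ancestors)      # exactly one of them is on the maternal line


for gen, label in [(1, "parents"), (2, "grandparents"), (3, "great-grandparents")]:
    missed = 1 - mtdna_coverage(gen)
    print(f"{label}: {2 ** gen} ancestors, {float(missed):.1%} not on the maternal line")
```

That gives 50% missed for parents, 75% for grandparents, and 87.5% for great-grandparents, and the missing share keeps growing with every generation you go back.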

The exclusion of so much data leads to a lot of problems: I’ve had a friend with known Native American ancestry get worried when their test showed 0% DNA matches with Native populations – because the genes the companies tested for simply didn’t come from that part of their ancestry. It also leads to articles like this one, where somebody whose father is Samoan (i.e. 50% genetics should be from him) got results back saying she had 0% Samoan ancestry:  https://e-tangata.co.nz/news/my-dna-results-are-in-im-whiter-than-the-milkman

On a related note, and to further muddy the waters of what you can tell about your ancestry from the test results, we’ve got what demographers call “pedigree collapse.” Basically, think of it this way: there are currently 7.6 billion people in the world. Each of them has two parents, so to find out the total number of ancestors they all have, you’d double that: but if you go back a generation, there were not 15 billion people alive. Now, that includes multiple generations, and a lot of these people are siblings, or share parents, etc., but there is no way that you can keep going back in time to look at your ancestors, and assume they’re all separate people – there simply weren’t enough people alive; I have eight great-grandparents, but their generation did not contain 60 billion people. The explanation for why there weren’t a kajillion people in the past and the ancestor numbers still work is because of pedigree collapse: if you go back a little – even in modern historic time – your ancestors start to overlap in their ancestry quite a bit; everybody shares common ancestors with everybody else. So take it back far enough in the past, and you’ll basically have the same ancestors as most of the people you know. Among other things, the testing companies use this to show how you’re related to some famous people, as we’ll see in a moment.
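Here is a rough sketch of why the ancestor numbers can’t work without pedigree collapse. It uses the 7.6 billion world population figure from above and naively doubles the number of ancestor "slots" each generation; the names are my own, and this is back-of-the-envelope arithmetic, not a demographic model.

```python
WORLD_POPULATION = 7_600_000_000  # figure quoted in the text


def ancestor_slots(generations_back: int) -> int:
    """Number of ancestor positions in one person's tree, assuming no overlap."""
    return 2 ** generations_back


def generations_until_exceeds(population: int) -> int:
    """Smallest number of generations at which one person's slots exceed a population."""
    n = 0
    while ancestor_slots(n) <= population:
        n += 1
    return n


# Eight great-grandparents for every living person, if nobody shared any.
print(f"{ancestor_slots(3) * WORLD_POPULATION:,} great-grandparent slots")

# How far back before ONE person's tree alone needs more people than are alive today.
print(generations_until_exceeds(WORLD_POPULATION), "generations")
```

The first line prints 60,800,000,000 (the "60 billion" above), and the second shows that after just 33 doublings a single person’s tree would need more distinct ancestors than there are people alive today. Since far fewer people lived in the past, the same individuals have to fill many slots at once.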

FINAL PROBLEM: MEANINGLESS FLUFF

Finally, we need to talk about some of the absolutely ridiculous additional findings that these companies give you based on their sample of your genes. There’s an ever-expanding set of bizarre test results and related claims, because the companies selling the tests are constantly coming up with new ways to interpret your data to keep people interested: we’ll just look at three.

Let’s start with an interesting example from Nat Geo’s Geno 2.0 test: they have a “genius section” that tells you which geniuses from the past you are related to, and when you shared a common ancestor. The list includes King Tut, who was apparently a “pharaoh genius,” whatever that means (since he reigned for just nine years and died age 18, Tut must have been a child prodigy).

We can look at an example from my own test, below: hey, good news, I’m related to Abraham Lincoln and the dashing explorer Sir Francis Drake! Wait, I share an ancestor with them from…45,000 YEARS AGO?

Geno Genius Results

BTW, I’m also related to King Tut, Charles Darwin, Genghis Khan, Queen Victoria, Thomas Jefferson, and my favorite Renaissance poet, Francesco Petrarca – all about 30,000 years ago, of course

In other words, you’re telling me that so long ago that people were literally living in caves, I shared an ancestor with Abe Lincoln? So what? Does genius transfer genetically? If so, would it stay in the genome for 45,000 years undiluted? And honestly, 45,000 years ago genius basically meant turning to the person sitting next to you and saying “brrrrr, it sure is cold – what if we banged these two rocks together?” (that’s a poor anthropology joke, of course: human control over fire is thought to have been “discovered” roughly one million years ago).

Many of the tests also tell you what percentage of your DNA is from ancient, now-extinct human species, like the Neanderthals or Denisovans. According to Geno 2.0, I’m 1.4% Neanderthal, slightly higher than the average for modern humans, at 1.3%. And it’s even weirder that some of these testing services will tell you the percentage of DNA you share with the Denisovans, an ancient group of hominids so little understood that essentially nothing is known about them, and all the information we have is from a measly six fragmentary fossils: a part of a finger bone, a single molar, that sort of thing. Look, I’m not saying this information isn’t interesting, because it is, a little bit. But it isn’t meaningful; among other things, almost all humans have some of that DNA, so it doesn’t tell you anything about yourself per se.

One more example, just because it’s so ridiculous: 23andMe claims they can tell you what time you’re likely to get up in the morning, “based on 450 places in your DNA that are associated with being either a morning person or a night person” TO WITHIN FIVE MINUTES.

23andmewakeup

Don’t hit that snooze button! Your DNA says that you get up at precisely 8:55 am!

Among the genetic sources they have for this is a study, published in a real academic journal, that does indeed find a correlation between certain genetic markers and being a self-described morning person. But here’s the thing: the study itself notes that they are missing links saying genetics caused the love of mornings: “we did not find evidence for a causal relationship in our Mendelian randomization analysis.” (study here: https://www.ncbi.nlm.nih.gov/pubmed/26835600). Seriously, ask any geneticist whether you can legitimately predict within five minutes when somebody is going to wake up based on their DNA, and then hand them a hankie to dry the tears of laughter that will be streaming down their face.

 

THE BIG WRAP UP

So the answer to the question “are these tests bogus” is: they’re not, but the data they give you is not particularly useful for really determining your geographic ancestry. I’d say they’re fine to spark your interest in genealogy, or give you an idea of some interesting but not genetically meaningful information (like your % of Neanderthal DNA), but they will not really tell you where all your ancestors are from.

Finally, I should be very clear: we are speaking only of genealogical, ancestry-style consumer genetic tests; not the ones that link you to your relatives, determine paternity, or are used in criminal cases: those are a different beast entirely, and all completely legit.

So go out there, pick up a kit for you, or your aunt, or your dog – but also, know exactly what you’re getting.

Until next time, I’m Scott, your friendly neighborhood anthropologist.

P.S. – here are a couple of other writeups on consumer genetics tests by anthropologists. They’re complementary to each other (and to mine), if you really want the full story:

1) From John Edward Terrell: https://www.sapiens.org/technology/dna-test-ethnicity/

2) From Jonathan Marks (Marks has some of the best writing on the subject): https://drive.google.com/file/d/1vx-OU3LwdNoxBnSLRcr9_4tctA7VuR32/view