What AncestryDNA taught me about DNA, privacy and the complex world of genetic testing
After spitting in a tube, I learned where my DNA comes from and where my personal data might go.
Jackson Ryan, Former Science Editor
Jackson Ryan was CNET's science editor, and a multiple award-winning one at that. Earlier, he'd been a scientist, but he realized he wasn't very happy sitting at a lab bench all day. Science writing, he realized, was the best job in the world -- it let him tell stories about space, the planet, climate change and the people working at the frontiers of human knowledge. He also owns a lot of ugly Christmas sweaters.
The hardest thing about having your DNA sequenced is generating a teaspoon's worth of spit.
They don't tell you this in the marketing materials for your typical at-home DNA test kit, but producing enough saliva to fill a pen-sized tube up to its high spit mark is hard work -- and strangely nerve-wracking, too.
I sneak into an unused meeting room, chewing on air to generate slobber. The kit has two tubes. One, now full of my spit, and a second smaller tube with a chemical mix that stabilizes DNA. After uniting the two tubes, I stick the pale blue spit-mix into a box and mail it off to AncestryDNA, the genetics arm of the world's largest genealogy company, Ancestry.
In 2012 Ancestry launched the AncestryDNA service, which provides paying users the ability to build a timeline of their genes, search for relatives and understand what geographic regions their DNA originates from. Ancestry has sold 14 million kits since launch, and the number continues to grow as curious consumers turn to DNA to unravel their histories.
Over the last two years many DNA kit manufacturers have begun marketing their products as "perfect gifts." In the 2018 Thanksgiving period, AncestryDNA broke its November sales record. Your DNA story has become this year's hottest Christmas gift! Consumer genealogy tests have become big business practically overnight. Why are we so interested in finding out the secrets of our DNA?
"I think the major appeal of DNA testing is to find out something new about us," says Caitlin Curtis, a population geneticist at the University of Queensland. That's certainly true for me, at least. My first thought is what revelations my spit might teach me about myself.
But in the quest for answers, do we truly understand what kind of information we're giving up?
The almost unfathomable complexity of all life on Earth, from bacteria to humans, relies on DNA, but the DNA code itself is made up of just four letters: A, T, C and G.
These letters, known as bases, always pair together the same way -- A with T, C with G. The order in which these letters are arranged is what makes us different and gives us our unique traits. And because we hand parts of our DNA from parent to offspring, it also links us to the past. We just need to be able to "read" it and put all those bases in order.
This is known as DNA sequencing.
The technology to perform this task has improved dramatically over the last two decades, driving the cost of sequencing a human genome down from $10,000 in 2011 to $1,000 in 2017, according to the US National Human Genome Research Institute. Those advances have trickled through to the commercial sector, allowing a myriad of companies, from startups to huge public organizations, to develop their own at-home DNA testing kits.
Kits provide customers with an estimate of their genetic histories, ancestries and even potential health issues they might run into. But going from a saliva sample to a genetic history is a complex process involving overwhelming amounts of data and statistical analyses that often confound more than they clarify.
"There is a general lack of knowledge about how the whole process of ancestry testing works," Curtis says. "People's perceptions of the results might be different from the way a genetic scientist might interpret the results."
I'm pretty well versed in the complexities of molecular biology, but after sending my spittle away I become acutely aware that I have no idea how AncestryDNA's test works. I know it'll give me an "ethnicity estimate" and tell me my "DNA story," but beyond the marketing buzzwords I'm in the dark.
Science, math and data
AncestryDNA uses a database that contains more than 16,000 reference DNA samples from 43 regions around the world.
About 12,000 of these samples come from Ancestry users who opt in and allow the company to use their DNA for research purposes, while the remaining reference samples come from public databases such as the 1000 Genomes Project.
"We find people with long family histories from a certain part of the world and we analyze their DNA, and their DNA becomes, by definition, 100 percent from the region" says Barry Starr, director of scientific communications at AncestryDNA.
The science of it is complex: The procedure splits up a DNA sample into 1,001 different "windows," as Starr calls them. All up, those 1,001 windows look at approximately 700,000 spots in the DNA code. When you take the test, every window is compared to the 1,001 windows in a reference sample, and that occurs for each of AncestryDNA's 43 regions.
If 500 of those windows match, say, a Canadian region, then by AncestryDNA's definition, I am 50 percent Canadian.
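As a rough illustration of that arithmetic, here is a minimal Python sketch. It is not AncestryDNA's actual algorithm: it assumes each of the 1,001 windows has already been matched to a single best-fitting region, and the function name, the input list and the split between regions are all hypothetical.

```python
from collections import Counter


def ethnicity_estimate(window_assignments):
    """Convert per-window region labels into percentage shares.

    window_assignments: one region name per DNA "window"
    (hypothetical input; the matching step itself is not modeled here).
    """
    counts = Counter(window_assignments)
    total = len(window_assignments)
    return {region: 100 * n / total for region, n in counts.items()}


# Hypothetical example: 500 of 1,001 windows match a Canadian region,
# and the remaining 501 match another region.
windows = ["Canada"] * 500 + ["Ireland and Scotland"] * 501
estimate = ethnicity_estimate(windows)

for region, share in sorted(estimate.items()):
    print(f"{region}: about {round(share)} percent")
```

The estimate is simply the share of windows credited to each region, which is why adding or refining reference samples can shift the percentages without the underlying DNA changing.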
"It really is cutting-edge science, and as the field advances we advance with it and so provide updates to consumers when we have made changes based on the progression of the science," says Starr.
CNET rates AncestryDNA as having one of the best kits available, in large part thanks to its huge database. But testing doesn't just rely on database size -- where the data comes from is also important. Almost 75 percent of AncestryDNA's ethnic regions skew toward European descent, so detailed estimates of ethnicity from other regions are difficult to obtain at present. A study published in Nature in 2016 suggested that the scientific inquiry into genomes was also suffering from bias.
With fewer reference samples from both consumers and scientific research available in regions of Africa and Asia, accurate estimates for genetic heritage in those locations are more prone to error.
"Everyone started out in Africa, and a small set of them moved out of Africa and colonized the world," explains Starr. "The genetic diversity within Africa is huge compared to the rest of the world, which means you need larger reference panels."
And the results of different genealogy tests may show marked differences. For instance, 23andMe, a rival genealogy company based in California, has a more extensive catalogue of East Asian regions than AncestryDNA. Providing DNA samples to both companies could lead to completely different ethnicity estimates. It's not that your DNA has changed -- it's that each company uses different databases and algorithms to calculate the estimate.
My DNA story
I'm not exactly sure where I come from.
An educated guess would say this impressively pale skin hails from a region localized entirely within Britain. There could be some Scottish in there. Maybe a hint of Irish, too. I don't think there's lots of room for suspense or intrigue here.
Four weeks after I spat in a tube, my email chimes.
I click through ready to solve this admittedly feeble personal mystery. But there are no shocking revelations. I end up with an ethnicity estimate that puts my DNA origins at 55 percent England, Wales and Northwestern Europe and 44 percent Ireland and Scotland.
However, there's also a zero to 1 percent chance my DNA comes from a region in West Africa that AncestryDNA pegs as relating to "Benin/Togo." Surprising to me, but not unusual, according to Starr.
"A 0-1 percent would say there might be something interesting here, but there might not," he says. A result such as this might "fall out" in the future, as AncestryDNA's databases continue to be refined by additional samples and research programs.
My ethnicity estimate is only one half of the picture, however, because I can also look at my DNA matches, which directly correlate my DNA with that of other users in Ancestry's database. In my case, it throws up two matches that AncestryDNA classes as "second cousins" -- pretty close relatives of mine, according to my genes.
I've never seen these people.
And this is a caveat for the AncestryDNA kit. Your DNA might kick up matches with people you've never seen before, but if you want to fit them into your family tree, you need to subscribe to the other side of the Ancestry business to pore over how you might, potentially, be related to one another.
Digital DNA trail
In January, BuzzFeed News reported that FamilyTreeDNA, another huge provider of at-home DNA kits, had given the FBI access to its database of over a million profiles. The company lets the FBI upload genetic profiles from crime scenes to FamilyTreeDNA's database, which may help the bureau genetically hunt down criminals. However, FamilyTreeDNA didn't notify users that their genetic information might be used this way before giving the FBI access.
"I believe that there is an ethical obligation for these companies to be very upfront, honest and explain in simple terms to people what might happen to their data after they take a test, but that is not always the case," says Curtis.
AncestryDNA's terms and conditions state that it "does not claim any ownership rights in the DNA submitted for testing" but by submitting a sample you effectively "grant AncestryDNA ... a royalty-free, worldwide, sublicensable, transferable license to host, transfer, process, analyze, distribute, and communicate your Genetic Information for the purposes of providing you products and services."
It may be my DNA, but how it's used in the future is something that AncestryDNA decides. However, there is a failsafe. The nuke-it-all option.
"It's your data, you should be able to do with it what you want," Starr says. "If you decide at some point that you don't want us to have it anymore, you can tell us to delete it and you can even tell us to destroy the DNA sample."
DNA as data
"The biggest danger with handing control of your DNA data is the potential for discrimination based on that information," says Curtis.
Now that even our DNA is being digitized and stored in the infinite online filing cabinet of the World Wide Web, we must confront a reality in which our own genetic makeup can be hacked, stolen or used against us.
"There are some parallels to broader conversations around how to govern our personal digital data online – and the possibility for it to be used in unanticipated ways in the future," she continues.
A cautionary tale, it would seem, considering genealogy testing has undergone rapid growth in the last two years. And though the science is getting better, the regulations and potential pitfalls are becoming harder to nail down.
"It's a complicated issue because in some countries there is protection against discrimination, and in some countries there are very few laws about what you can do with genetic data," explains Curtis.
In the US, the Genetic Information Nondiscrimination Act of 2008 prevents health insurers and employers from discriminating against you based on your genetic profile. However, in Australia, insurance companies can discriminate based on the results of a DNA test, increasing premiums or completely excluding coverage for certain diseases.
Cool. Cool cool cool.
Almost none of this research was done before I spat into a tube six or seven weeks ago, and now I realize my nerves weren't about how much spittle I could produce. I jangled because I was diving headfirst into a world I thought I understood, but actually knew hardly anything about.
There were voices gnawing at my subconscious. A devil on one shoulder, an angel on the other. One quietly trying to tell me that it's kind of weird to give a private, multinational company access to the immutable information that can be used to identify me -- and only me.