DeepDNA

Posted on • Originally published at deepdna.ai

How DNA Testing Works: Saliva to Insights

TL;DR: Consumer DNA testing works by extracting DNA from your saliva, amplifying it, then scanning ~600,000–700,000 specific genetic positions on a microarray chip using fluorescent probes. The raw genotype data is computationally analyzed and compared against published research to generate ancestry, health, and trait reports. Accuracy exceeds 99% for common variants but drops for rare mutations.

How DNA Testing Works: From Saliva Sample to Genetic Insights

You spit into a tube, drop it in the mail, and a few weeks later, a report arrives telling you where your ancestors came from, whether you're a carrier for certain conditions, and how your body might respond to caffeine. It feels almost magical — but the science behind it is remarkably concrete.

Disclaimer: This article is for educational purposes. It does not constitute medical advice. Consult a healthcare professional for personalized guidance.

Every consumer DNA test follows the same basic pipeline, whether it comes from 23andMe, AncestryDNA, or any other provider. Understanding how DNA testing works helps you interpret your results with the right level of confidence — and skepticism. Here's what actually happens between the spit and the spreadsheet.

What Happens When You Take a DNA Test?

DNA testing is the process of analyzing specific locations in your genome to identify genetic variants that influence your traits, ancestry, and health risks. Consumer DNA tests use a technology called SNP genotyping to read approximately 600,000 to 700,000 specific positions in your DNA — a tiny but informative fraction of your 3.2 billion base-pair genome.

The process unfolds in five stages: sample collection, DNA extraction, DNA amplification, SNP genotyping on a microarray chip, and computational analysis. Each stage involves distinct biochemistry and quality controls. The entire pipeline typically takes three to eight weeks from the day you mail your sample.

Step 1 — Collecting Your Sample

Most consumer DNA tests start with saliva. You'll receive a collection kit with a tube containing a stabilizing buffer — typically an Oragene-style solution that preserves your DNA at room temperature during shipping.

Here's a fact that surprises most people: the DNA in your saliva doesn't come primarily from the cells lining your cheeks. Research published in BMC Medical Genomics found that approximately 74% of the DNA in saliva originates from white blood cells, not epithelial cells (Hansen et al., 2007). White blood cells produce higher-quality, longer fragments of genomic DNA — which is exactly what genotyping platforms need.

Kit instructions typically tell you not to eat, drink, or brush your teeth for 30 minutes before spitting. Food particles and toothpaste don't contaminate your DNA directly, but they can introduce bacteria or chemicals that interfere with downstream processing.

Step 2 — Extracting DNA from Your Cells

When your sample arrives at the lab, technicians first perform a visual inspection. Samples that are discolored, contain insufficient volume, or show signs of leakage are flagged for reprocessing or a replacement kit.

Accepted samples move to DNA extraction. The lab adds chemical reagents — typically a detergent like SDS (sodium dodecyl sulfate) — to break open the cell membranes. Protease enzymes then digest the proteins wrapped around your DNA. What remains is a soup of nucleic acids, salts, and cellular debris.

The DNA is purified using either ethanol precipitation or silica-column filtration. Both methods exploit DNA's chemical properties: it binds to silica in the presence of certain salts and precipitates in ethanol. The result is a clean pellet or solution of pure genomic DNA.

Studies using Oragene collection kits report a median yield of 110 micrograms of DNA from a 2-milliliter saliva sample (Abraham et al., 2012) — far more than the ~200 nanograms needed for a single genotyping run. The excess provides a safety margin for repeat assays if the first attempt fails quality control.

Step 3 — Amplifying Your DNA

Even 200 nanograms of DNA isn't enough to cover a microarray chip with hundreds of thousands of probe sites. The lab needs billions of copies of your genome fragments, and it uses a process called whole genome amplification (WGA) to produce them.

Whole genome amplification: A laboratory technique that creates millions of copies of an entire genome from a small starting sample, using enzymes called polymerases that replicate DNA strands at multiple random sites simultaneously.

Unlike standard PCR, which targets specific regions, WGA copies the whole genome with comparatively little sequence bias. One widely used method, multiple displacement amplification (MDA), uses a highly processive polymerase (phi29) that can generate 20 to 30 micrograms of product from as little as 10 nanograms of input DNA.
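To put the quantities from the extraction and amplification steps side by side, here is a small arithmetic sketch. It only uses the figures quoted in the text (110 micrograms per sample, about 200 nanograms per run, 20 to 30 micrograms from 10 nanograms); the function names are made up for illustration.

```python
# Figures quoted in the text; 1 microgram = 1,000 nanograms.
NG_PER_UG = 1_000


def runs_from_yield(yield_ug: float, ng_per_run: float) -> int:
    """Number of complete genotyping runs a given DNA yield could supply."""
    return int((yield_ug * NG_PER_UG) // ng_per_run)


def fold_amplification(product_ug: float, input_ng: float) -> float:
    """How many times more DNA comes out of amplification than went in."""
    return (product_ug * NG_PER_UG) / input_ng


print(runs_from_yield(110, 200))    # 550
print(fold_amplification(20, 10))   # 2000.0
print(fold_amplification(30, 10))   # 3000.0
```

In other words, a typical sample holds enough DNA for hundreds of attempts, and amplification multiplies the input mass by a factor in the low thousands.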

The amplified DNA is then fragmented into pieces roughly 300 to 500 base pairs long and labeled with fluorescent tags. These tagged fragments are now ready for the genotyping chip.

How SNP Genotyping Actually Works

This is where the core science happens. Your labeled DNA fragments are applied to a microarray — a glass slide or silicon chip studded with hundreds of thousands of microscopic probes, each designed to bind to a specific location in the human genome.

SNP (single nucleotide polymorphism): A position in the genome where a single DNA letter varies between individuals. For example, at a given position, some people carry an A and others carry a G. SNPs are the most common type of genetic variation, with roughly 4 to 5 million in each person's genome.

Consumer DNA tests use chips manufactured by Illumina, the dominant platform in the industry. 23andMe's current chip (version 5) is based on the Illumina Global Screening Array and tests approximately 630,000 SNPs. AncestryDNA uses a custom Illumina OmniExpress-based chip covering roughly 682,000 positions (ISOGG Wiki).

Here's how the chip reads your DNA:

  1. Hybridization. Your fragmented, labeled DNA washes over the chip. At each probe site, fragments containing the matching sequence bind to the probe through complementary base pairing — the same A-T and C-G rules you learned in biology class.

  2. Extension and labeling. The probe extends by one base, incorporating a fluorescently labeled nucleotide that corresponds to the specific variant (allele) present in your DNA at that position.

  3. Fluorescence scanning. A laser scans the chip and reads the color of the fluorescent signal at each probe. Different alleles produce different colors, allowing the system to determine your genotype — whether you carry two copies of the same allele (homozygous) or one of each (heterozygous).

  4. Intensity analysis. Software measures the intensity ratio of the two possible fluorescent signals at each SNP position. Clean, well-separated clusters indicate confident genotype calls; ambiguous signals are flagged.

The entire genotyping run takes about 24 hours. When it works well, call rates exceed 99% — meaning the system confidently assigns a genotype at more than 99 out of every 100 tested positions.
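The intensity analysis and call rate described above can be illustrated with a toy caller. This is a minimal sketch, not Illumina's algorithm: the fixed thresholds, the minimum-signal cutoff, and the function names are assumptions chosen for illustration, whereas production software clusters real intensity data across many samples.

```python
from typing import List, Optional


def call_genotype(a: float, b: float, min_total: float = 0.2) -> Optional[str]:
    """Call AA, AB, or BB from two channel intensities, or None for a no-call.

    The thresholds below are illustrative assumptions, not vendor values.
    """
    total = a + b
    if total < min_total:      # signal too weak to trust
        return None
    frac_b = b / total         # share of the signal coming from allele B
    if frac_b <= 0.2:
        return "AA"
    if 0.4 <= frac_b <= 0.6:
        return "AB"
    if frac_b >= 0.8:
        return "BB"
    return None                # falls between clusters: ambiguous


def call_rate(calls: List[Optional[str]]) -> float:
    """Fraction of positions that received a confident genotype."""
    return sum(c is not None for c in calls) / len(calls)


# Made-up (allele A, allele B) intensities for five probe sites.
signals = [(0.95, 0.05), (0.50, 0.48), (0.04, 0.90), (0.70, 0.30), (0.05, 0.03)]
calls = [call_genotype(a, b) for a, b in signals]
print(calls)             # ['AA', 'AB', 'BB', None, None]
print(call_rate(calls))  # 0.6
```

A real run would show a call rate above 0.99; the toy data deliberately includes one ambiguous and one weak signal to show how no-calls arise.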

For a deeper look at what DNA analysis can reveal, see our complete guide to DNA analysis.

From Raw Data to Your Report

The raw output from the genotyping chip is a file listing each tested SNP position and your genotype at that position — essentially a table with 600,000+ rows. But a table of genotypes isn't useful on its own. The analysis pipeline transforms this raw data into interpretable results.
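As a rough illustration of what that table looks like in practice, the sketch below parses a small raw-data export. It assumes a tab-separated layout with `#` comment lines, four columns (rsid, chromosome, position, genotype), and `--` for no-calls; check your own provider's file before relying on it. The sample rows, rsIDs, and positions are made up.

```python
import csv
import io
from typing import Dict, Iterable, NamedTuple


class Call(NamedTuple):
    chromosome: str
    position: int
    genotype: str  # e.g. "AG"; "--" marks a no-call in this assumed format


def parse_raw_data(lines: Iterable[str]) -> Dict[str, Call]:
    """Read rsid / chromosome / position / genotype rows into a dict keyed by rsid."""
    # Skip blank lines and '#' comment lines before handing rows to csv.
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    calls = {}
    for rsid, chromosome, position, genotype in csv.reader(data_lines, delimiter="\t"):
        calls[rsid] = Call(chromosome, int(position), genotype)
    return calls


# Made-up rows for illustration; the rsIDs and positions are not real.
SAMPLE = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t10583\tAG\n"
    "rs0000002\t2\t45012\tCC\n"
    "rs0000003\tX\t98110\t--\n"
)

calls = parse_raw_data(io.StringIO(SAMPLE))
no_calls = [rsid for rsid, call in calls.items() if call.genotype == "--"]
print(len(calls), no_calls)  # 3 ['rs0000003']
```

To read a real file, pass an open file handle instead of the `io.StringIO` wrapper.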

Genotype calling is the first computational step. Algorithms like Illumina's GenCall cluster the fluorescence intensities across all samples processed in the same batch and assign genotypes based on where each data point falls within the clusters. Reported accuracy exceeds 99.5% for common variants (Tandy-Connor et al., 2018).

Next comes imputation — a statistical technique that infers the genotypes of SNPs you weren't directly tested for. Because nearby SNPs tend to be inherited together (a phenomenon called linkage disequilibrium), algorithms can predict your likely genotype at untested positions using reference panels like the 1000 Genomes Project. Imputation can expand your effective dataset from ~600,000 directly typed SNPs to several million.
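The intuition behind imputation can be shown with a deliberately tiny example. This is a toy sketch under strong assumptions: it works on single phased haplotypes written as strings, uses a six-entry made-up reference panel, and simply takes a majority vote among matching haplotypes. Real imputation software works on unphased genotypes with probabilistic models and far larger panels.

```python
from collections import Counter
from typing import List, Optional


def impute_allele(reference: List[str], observed: str, missing: str = "?") -> Optional[str]:
    """Guess the allele at the one untyped site (marked '?') in `observed`.

    Keeps the reference haplotypes that agree with every typed site, then
    returns the most common allele they carry at the untyped site.
    """
    idx = observed.index(missing)
    matches = [
        hap for hap in reference
        if all(o == missing or o == h for o, h in zip(observed, hap))
    ]
    if not matches:
        return None  # nothing in the panel looks like this haplotype
    counts = Counter(hap[idx] for hap in matches)
    return counts.most_common(1)[0][0]


# Hypothetical reference panel: three neighbouring SNPs per haplotype.
REFERENCE = ["ACG", "ACG", "ATG", "GTT", "GTT", "GTT"]

print(impute_allele(REFERENCE, "A?G"))  # C
print(impute_allele(REFERENCE, "G?T"))  # T
print(impute_allele(REFERENCE, "A?T"))  # None
```

The first call also shows why imputed genotypes carry uncertainty: two of the three matching haplotypes carry C, so the guess is likely but not guaranteed.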

Finally, the analysis engine compares your genotypes against published genome-wide association studies (GWAS). These studies, often involving hundreds of thousands of participants, have identified statistical associations between specific SNPs and traits ranging from eye color to disease risk. Your report translates these statistical associations into personalized risk estimates, ancestry percentages, and trait predictions.
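One common way to turn many small GWAS associations into a single number is an additive score: count how many copies of each effect allele you carry and weight the count by the published effect size. The sketch below assumes that approach; the rsIDs, effect alleles, and weights are hypothetical, and it is not a description of any specific provider's method.

```python
from typing import Dict, Tuple


def polygenic_score(
    genotypes: Dict[str, str],
    weights: Dict[str, Tuple[str, float]],
) -> Tuple[float, int]:
    """Sum effect size x effect-allele count over SNPs with a usable genotype.

    Returns the score and the number of SNPs that contributed to it.
    """
    score = 0.0
    used = 0
    for rsid, (effect_allele, beta) in weights.items():
        genotype = genotypes.get(rsid)
        if genotype is None or "-" in genotype:  # untested or no-call
            continue
        score += beta * genotype.count(effect_allele)  # 0, 1, or 2 copies
        used += 1
    return score, used


# Hypothetical weights: rsid -> (effect allele, effect size).
WEIGHTS = {"rsA": ("A", 0.10), "rsB": ("T", -0.05), "rsC": ("G", 0.20)}
MY_GENOTYPES = {"rsA": "AA", "rsB": "CT", "rsC": "--"}

score, used = polygenic_score(MY_GENOTYPES, WEIGHTS)
print(f"{score:.2f} from {used} SNPs")  # 0.15 from 2 SNPs
```

A raw score like this means little on its own; it becomes interpretable only when compared against the distribution of scores in a reference population.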

If you already have raw data from a previous test, you can explore what else it can tell you in our guide on what to do with your 23andMe raw data.

SNP Genotyping vs Whole Genome Sequencing: What's the Difference?

Consumer DNA tests and clinical whole genome sequencing (WGS) answer fundamentally different questions, and understanding the distinction matters for interpreting your results correctly.

SNP genotyping chips read predetermined positions — typically 600,000 to 700,000 selected SNPs known to be informative for ancestry, common health traits, and pharmacogenomics. Think of it as checking specific answers on a multiple-choice exam. The cost ranges from $99 to $199.

Whole genome sequencing reads nearly all of your approximately 3.2 billion base pairs. It detects not just SNPs but also insertions, deletions, structural variants, and novel mutations that no chip could anticipate. WGS costs have dropped dramatically, from roughly $100 million per genome in 2001 to roughly $200 to $600 today.

The critical limitation of SNP chips shows up with rare variants. A study published in Genetics in Medicine demonstrated that the positive predictive value of consumer genotyping for some BRCA1 and BRCA2 variants — mutations associated with breast and ovarian cancer risk — can fall below 5% (Tandy-Connor et al., 2018). In other words, more than 95% of positive results for these rare mutations from DTC tests were false positives.
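Why can a highly accurate assay still produce mostly false positives for rare variants? The answer is base rates, and Bayes' rule makes it concrete. The sketch below uses hypothetical sensitivity, specificity, and carrier frequencies chosen for illustration; they are not figures from the cited study.

```python
def positive_predictive_value(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Probability that a positive result is a true positive (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same hypothetical assay (99% sensitive, 99.9% specific), two carrier frequencies.
common = positive_predictive_value(prevalence=0.30, sensitivity=0.99, specificity=0.999)
rare = positive_predictive_value(prevalence=0.0001, sensitivity=0.99, specificity=0.999)

print(f"common variant PPV: {common:.3f}")  # 0.998
print(f"rare variant PPV:   {rare:.3f}")    # 0.090
```

With identical per-test error rates, a variant carried by 1 in 10,000 people yields far more false alarms than true hits, simply because almost everyone tested is a non-carrier.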

This doesn't mean consumer tests are unreliable for common variants. For frequently occurring SNPs — the kind that influence traits like lactose tolerance, caffeine metabolism, or ancestry — accuracy remains excellent. But for clinically significant rare variants, confirmatory testing through a clinical-grade laboratory is essential.

For a practical example of how genetic variants affect drug response, see our pharmacogenomics guide for Europe.

What Can (and Can't) a DNA Test Tell You?

Consumer DNA tests excel at certain things and fall short at others. Being clear about both prevents misplaced anxiety and misplaced confidence.

What consumer DNA tests can do well:

  • Estimate your ancestral composition with reasonable accuracy for well-studied populations
  • Identify carrier status for many single-gene conditions (like cystic fibrosis or sickle cell trait)
  • Report common variants affecting traits like earwax type, cilantro taste perception, or MTHFR gene status
  • Provide basic pharmacogenomics information (how your genes might affect drug metabolism)

What consumer DNA tests cannot do:

  • Diagnose any disease — a genetic variant is a risk factor, not a diagnosis
  • Detect all disease-related mutations — chips test a fixed panel, not the full genome
  • Provide equally accurate results for populations underrepresented in genetic research databases
  • Replace clinical genetic testing ordered by a healthcare provider

The most important principle to remember: genetic risk is not genetic destiny. A variant associated with increased risk for a condition means your statistical probability is higher than average — it doesn't mean you'll develop that condition. Environment, lifestyle, and other genetic factors all play a role.
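A quick calculation shows how a relative risk translates into an absolute one. This is a simplified sketch: it treats the baseline as the risk for non-carriers, ignores every other factor, and uses made-up numbers.

```python
def absolute_risk(baseline_risk: float, relative_risk: float) -> float:
    """Risk for a carrier, given the non-carrier baseline and the relative risk."""
    return baseline_risk * relative_risk


# Hypothetical: 2% baseline lifetime risk, variant with relative risk 1.5.
carrier_risk = absolute_risk(0.02, 1.5)
print(f"{carrier_risk:.1%}")      # 3.0%
print(f"{1 - carrier_risk:.1%}")  # 97.0%
```

A "50% increased risk" sounds alarming, yet in this example 97 out of 100 carriers would never develop the condition.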

Frequently Asked Questions

How long does a DNA test take?

From mailing your sample to receiving results, expect 3 to 8 weeks. The lab processing itself takes 1 to 2 weeks; most of the wait is shipping, queue time, and quality checks. 23andMe estimates 3 to 4 weeks from sample receipt.

Is saliva DNA as good as blood DNA?

Yes. Multiple studies show greater than 97% concordance between SNP genotyping results from saliva and blood samples (Abraham et al., 2012). Saliva is now the standard for consumer genomics precisely because it performs comparably to blood without requiring a needle.
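Concordance here simply means the share of positions where two samples from the same person give the same genotype. A minimal sketch of that calculation follows; the genotypes are made up, and the choice to skip no-calls and to treat "AG" and "GA" as identical is an assumption about how one might define the metric.

```python
from typing import Dict


def concordance(sample_a: Dict[str, str], sample_b: Dict[str, str]) -> float:
    """Fraction of SNPs called in both samples whose genotypes agree."""
    shared = [
        rsid for rsid in sample_a
        if rsid in sample_b and "-" not in sample_a[rsid] and "-" not in sample_b[rsid]
    ]
    if not shared:
        raise ValueError("no SNPs were called in both samples")
    # Allele order is arbitrary, so compare sorted alleles.
    agree = sum(sorted(sample_a[r]) == sorted(sample_b[r]) for r in shared)
    return agree / len(shared)


# Made-up genotypes for one person's saliva and blood samples.
saliva = {"rs1": "AG", "rs2": "CC", "rs3": "TT", "rs4": "--", "rs5": "GG"}
blood = {"rs1": "GA", "rs2": "CC", "rs3": "CT", "rs4": "AA", "rs5": "GG"}

print(concordance(saliva, blood))  # 0.75
```

The toy value is low only because the example has four comparable SNPs; across hundreds of thousands of positions the studies cited above report agreement above 97%.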

How accurate are consumer DNA tests?

For common SNPs, genotyping accuracy exceeds 99.5%. However, accuracy drops substantially for rare variants. If your results flag a clinically significant rare mutation, seek confirmatory testing through a clinical genetics laboratory before making any health decisions.

Can a DNA test be wrong?

Yes, in specific scenarios: rare variant false positives (as discussed above), sample contamination, or insufficient DNA quality. Additionally, ancestry estimates can shift as reference databases expand — your "results" are only as good as the comparison population data available at the time.

What happens to my DNA sample after testing?

This varies by company. Some destroy your sample after processing; others store it for potential future analysis. Most companies offer the option to request sample destruction. If data privacy matters to you — especially in Europe — read our guide on GDPR and genetic data.

Your DNA Data, Your Insights

Understanding how DNA testing works is the first step toward getting real value from your genetic data. The technology is reliable for what it's designed to measure — common variants across well-studied populations — but it has boundaries that matter.

If you've already taken a consumer DNA test and want to explore what your raw data can reveal beyond what your original provider reported, tools like DeepDNA can analyze your existing raw data file for additional health, trait, and pharmacogenomics insights. You can also compare your options in our guide to 23andMe alternatives in Europe.

