DEV Community

Orbit Websites

Uncovering the Ancient Origins of Obesity: Neanderthals' Surprising "Fat Factories" 125,000 Years Ago

Spoiler: This isn’t a history lesson — it’s a hands-on bioinformatics adventure. We’re going to use real genetic data to explore how Neanderthal DNA might influence modern human metabolism, including fat storage. You’ll learn to analyze ancient DNA, compare genomes, and identify obesity-linked variants — all in Python.

Let’s dig into the science behind the headlines.


🧬 Why This Matters

Recent studies (like those from the Max Planck Institute) suggest that some modern humans carry Neanderthal gene variants linked to increased fat storage. The idea is that these “thrifty genes” may have helped Neanderthals survive Ice Age winters, while today they may contribute to obesity.

We’ll use public genomic datasets to:

  1. Download Neanderthal and modern human genomes
  2. Identify key SNPs (genetic variants)
  3. Cross-reference with known obesity-related genes
  4. Visualize the results

No prior genomics experience? No problem.


🛠️ Tools We’ll Use

  • Python 3 (with pandas, requests, matplotlib)
  • UCSC Genome Browser API (for genomic data)
  • dbSNP & GWAS Catalog (for disease-linked variants)
  • Jupyter Notebook (recommended)

Install dependencies:

pip install pandas requests matplotlib

Step 1: Fetch Neanderthal Genome Data

We’ll use the publicly available Altai Neanderthal genome (sequenced in 2013). The code below asks the UCSC Genome Browser’s DAS server for a stretch of raw sequence; the server replies with XML rather than ready-made SNP calls.

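import requests
import pandas as pd

# Fetch raw sequence for a region from the UCSC DAS server.
# "neandertal1" is the assembly name assumed by this tutorial; confirm that
# UCSC actually serves a database with that name before trusting the result.
def fetch_genome_data(chromosome, start, end, genome="neandertal1"):
    url = f"http://genome.ucsc.edu/cgi-bin/das/{genome}/dna"
    params = {'segment': f'chr{chromosome}:{start},{end}'}
    try:
        # A timeout keeps a slow server from hanging the notebook
        response = requests.get(url, params=params, timeout=30)
    except requests.RequestException as exc:
        print(f"Request failed: {exc}")
        return None

    if response.status_code == 200:
        return response.text  # XML text, not a bare sequence
    print(f"Error: {response.status_code}")
    return None

# Example: region near PPARG, a key gene in fat regulation (chromosome 3, position ~12,400,000)
data = fetch_genome_data(3, 12400000, 12401000)
if data is not None:
    print(data[:500])  # Preview first 500 chars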

🔍 Note: The UCSC DAS server may be slow. For this tutorial, we’ll simulate data if needed.
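
Here is a minimal sketch of that fallback. It assumes the DAS reply wraps the bases in a `<DNA>` element, which is how DAS sequence responses are usually laid out; the sample XML below is hand-written for illustration, not a real server response. The helper names (`extract_sequence`, `simulate_sequence`, `sequence_or_simulated`) are made up for this example.

```python
import random
import xml.etree.ElementTree as ET

def extract_sequence(das_xml):
    """Return the bases inside the first <DNA> element, or None if absent."""
    try:
        root = ET.fromstring(das_xml)
    except (ET.ParseError, ValueError):
        return None
    dna = root.find(".//DNA")
    if dna is None or dna.text is None:
        return None
    # The sequence is wrapped across lines; drop all whitespace.
    return "".join(dna.text.split()).upper()

def simulate_sequence(length, seed=0):
    """Return a reproducible random DNA string (placeholder data, not real)."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))

def sequence_or_simulated(das_xml, length, seed=0):
    """Prefer the parsed server reply; otherwise fall back to simulated bases."""
    if das_xml:
        seq = extract_sequence(das_xml)
        if seq:
            return seq, "server"
    return simulate_sequence(length, seed), "simulated"

# Hand-written sample in the assumed layout (NOT a real UCSC response)
sample = """<?xml version="1.0"?>
<DASDNA>
  <SEQUENCE id="chr3" start="1" stop="8" version="1.00">
    <DNA length="8">
acgt
acgt
    </DNA>
  </SEQUENCE>
</DASDNA>"""

print(sequence_or_simulated(sample, 8))    # ('ACGTACGT', 'server')
print(sequence_or_simulated(None, 8)[1])   # simulated
```

Returning the source label alongside the sequence makes it harder to mistake simulated bases for real Neanderthal data later in the notebook.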


Step 2: Load Modern Human Variants (dbSNP)

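We’ll compare Neanderthal DNA to known human SNPs. Rather than querying dbSNP directly, we’ll build a small simulated table by hand.

# Simulate a small dataset of obesity-linked SNPs.
# Positions, gene labels, and effects are illustrative placeholders,
# not verified dbSNP or GWAS Catalog annotations.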
obesity_snps = pd.DataFrame({
    'rsID': ['rs1801282', 'rs3856806', 'rs4684847'],
    'gene': ['PPARG', 'PPARG', 'BSX'],
    'chromosome': [3, 3, 1],
    'position': [12405680, 12407980, 20112345],
    'effect_allele': ['C', 'T', 'A'],
    'effect': ['increased fat storage', 'insulin resistance', 'appetite regulation']
})

print(obesity_snps)

Output:

        rsID   gene  chromosome  position  effect_allele                 effect
0  rs1801282  PPARG           3  12405680              C  increased fat storage
1  rs3856806  PPARG           3  12407980              T     insulin resistance
2  rs4684847    BSX           1  20112345              A    appetite regulation
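
Note that none of these simulated positions fall inside the 1,000-base window fetched in Step 1. One way to line the two up is to build a window around each SNP, using the same `chrN:start,end` segment format as `fetch_genome_data`. This is only a sketch: `das_segment` and the 500-base flank are choices made for this example.

```python
def das_segment(chromosome, position, flank=500):
    """Build a 'chrN:start,end' segment centred on a 1-based position."""
    start = max(1, position - flank)  # never go below the first base
    end = position + flank
    return f"chr{chromosome}:{start},{end}"

# (rsID, chromosome, position) taken from the simulated table above
snps = [
    ("rs1801282", 3, 12405680),
    ("rs3856806", 3, 12407980),
    ("rs4684847", 1, 20112345),
]

for rsid, chrom, pos in snps:
    print(rsid, das_segment(chrom, pos))
# rs1801282 chr3:12405180,12406180
# rs3856806 chr3:12407480,12408480
# rs4684847 chr1:20111845,20112845
```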

Step 3: Simulate Neanderthal Genotype Matching

We don’t have direct SNP calls from the API, so let’s simulate what we’d do with real alignment data.
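
With real data, the simplest version of that step is a coordinate lookup: given a fetched region and its start coordinate, read the base at the SNP position. The sketch below assumes 1-based coordinates and a single consensus sequence, so it can only ever report a homozygous call. Genotyping ancient DNA for real also has to deal with low coverage and post-mortem damage, which this ignores. `base_at` and `homozygous_call` are hypothetical helper names, and the sequence is a placeholder.

```python
def base_at(sequence, region_start, position):
    """Return the base at a 1-based genomic position inside a fetched region."""
    offset = position - region_start
    if offset < 0 or offset >= len(sequence):
        raise ValueError(f"position {position} is outside the fetched region")
    return sequence[offset].upper()

def homozygous_call(base):
    """Format a single consensus base as a diploid genotype string."""
    return f"{base}/{base}"

# Placeholder region: 500 A's, one C at the SNP site, 500 A's (not real data)
region_start = 12405180
sequence = "a" * 500 + "c" + "a" * 500

call = homozygous_call(base_at(sequence, region_start, 12405680))
print(call)  # C/C
```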

Assume we’ve aligned Neanderthal reads and found:

# Simulated Neanderthal genotype calls
neanderthal_calls = {
    'rs1801282': 'C/C',  # Homozygous for 'C' — the risk allele
    'rs3856806': 'T/T',
    'rs4684847': 'A/A'
}

# Add to our dataframe
obesity_snps['neanderthal_genotype'] = obesity_snps['rsID'].map(neanderthal_calls)
print(obesity_snps)

Now we see:

        rsID   gene  chromosome  position  effect_allele                 effect  neanderthal_genotype
0  rs1801282  PPARG           3  12405680              C  increased fat storage                   C/C
1  rs3856806  PPARG           3  12407980              T     insulin resistance                   T/T
2  rs4684847    BSX           1  20112345              A    appetite regulation                   A/A
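
A natural next check is how many copies of each effect allele the simulated genotype carries. Here is a minimal sketch in plain Python, assuming genotypes are written as two alleles separated by `/` as in the simulated calls; `effect_allele_count` is a name invented for this example.

```python
def effect_allele_count(genotype, effect_allele):
    """Count copies (0, 1, or 2) of the effect allele in an 'X/Y' genotype."""
    alleles = genotype.split("/")
    return sum(1 for allele in alleles if allele == effect_allele)

# (rsID, effect_allele, neanderthal_genotype) from the simulated table
rows = [
    ("rs1801282", "C", "C/C"),
    ("rs3856806", "T", "T/T"),
    ("rs4684847", "A", "A/A"),
]

for rsid, allele, genotype in rows:
    print(rsid, effect_allele_count(genotype, allele))
# rs1801282 2
# rs3856806 2
# rs4684847 2
```

Every row comes out as 2 here only because the simulated calls were written to match the effect alleles. It says nothing about what a real Neanderthal genome carries.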
