This is a submission for the Gemma 4 Challenge: Build with Gemma 4
What I Built
AlphaFold Gap-Filler is the missing piece. It takes an uncharacterized protein and produces a complete, specific, actionable drug development hypothesis in under 30 minutes. The pipeline runs automatically and from scratch, with no prior knowledge of the protein.
Demo
Code
How I Used Gemma 4
The long-context window of Gemma 4 (gemma-4-e2b-it) is what makes this possible. I assemble all five evidence streams (evolutionary homologs from BLAST, genomic neighborhood from Ensembl, interaction network from STRING DB, structural confidence from AlphaFold, and 50+ paper abstracts from PubMed) into a single inference call. Gemma 4 holds all of this in context at once and reasons across it. To my knowledge, no existing tool combines these sources in one call, and that is the contribution.
Project workflow:
Stage 1 — Gemma 4 Function Prediction: Five evidence streams are assembled and fed to Gemma 4 in a single call. The model produces a structured hypothesis with molecular function, biological process, disease relevance, confidence scoring per dimension, the top 3 supporting evidence pieces, a specific falsifiable experiment, and alternative hypotheses.
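As a rough illustration of Stage 1, the sketch below shows one way the five evidence streams could be concatenated into a single prompt and the structured reply checked afterwards. The dictionary keys, the field names in `HYPOTHESIS_FIELDS`, and both helper functions are hypothetical names chosen for this example; they are not taken from the project's code, and the actual model call is left out.

```python
import json

# Section order mirrors the five evidence streams named in the text.
# The dictionary keys are hypothetical names used only in this sketch.
EVIDENCE_SECTIONS = [
    ("blast_homologs", "Evolutionary homologs (BLAST)"),
    ("ensembl_neighborhood", "Genomic neighborhood (Ensembl)"),
    ("string_network", "Interaction network (STRING DB)"),
    ("alphafold_confidence", "Structural confidence (AlphaFold)"),
    ("pubmed_abstracts", "Literature abstracts (PubMed)"),
]

# Hypothetical output keys, chosen to match the items listed for Stage 1.
HYPOTHESIS_FIELDS = [
    "molecular_function",
    "biological_process",
    "disease_relevance",
    "confidence",
    "top_evidence",
    "falsifiable_experiment",
    "alternative_hypotheses",
]


def build_prompt(protein_id: str, evidence: dict[str, str]) -> str:
    """Concatenate all five evidence streams into one prompt string."""
    missing = [key for key, _ in EVIDENCE_SECTIONS if key not in evidence]
    if missing:
        raise ValueError(f"missing evidence streams: {missing}")
    parts = [f"Protein under study: {protein_id}", ""]
    for key, title in EVIDENCE_SECTIONS:
        parts.append(f"## {title}")
        parts.append(evidence[key].strip())
        parts.append("")
    parts.append(
        "Respond with a single JSON object containing exactly these keys: "
        + ", ".join(HYPOTHESIS_FIELDS)
    )
    return "\n".join(parts)


def parse_hypothesis(raw_reply: str) -> dict:
    """Parse the model reply and check that every expected key is present."""
    hypothesis = json.loads(raw_reply)
    absent = [field for field in HYPOTHESIS_FIELDS if field not in hypothesis]
    if absent:
        raise ValueError(f"reply is missing fields: {absent}")
    return hypothesis
```

Failing fast on a missing stream or a missing reply field keeps a partial hypothesis from silently flowing into the later stages.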
Stage 2 — ChEMBL Drug Repurposing: Drug target names are extracted from the hypothesis and used to query ChEMBL's database of 17,500+ drugs and clinical candidates. Existing compounds that target related proteins are retrieved along with their approval phase, SMILES string, and Lipinski drug-likeness score.
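The Lipinski drug-likeness score mentioned in Stage 2 can be sketched as a count of rule-of-five violations. This is a minimal example that assumes the four descriptors have already been obtained for each compound; the `Descriptors` class, the function names, and the one-violation tolerance are illustrative choices, not the project's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Physicochemical descriptors for one compound (assumed precomputed)."""
    mol_weight: float      # molecular weight in daltons
    logp: float            # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five criteria are violated."""
    checks = [
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def is_drug_like(d: Descriptors, max_violations: int = 1) -> bool:
    """Treat a compound as drug-like if it has at most `max_violations`."""
    return lipinski_violations(d) <= max_violations


# Illustrative, approximate values for a small aspirin-like molecule.
small_molecule = Descriptors(mol_weight=180.2, logp=1.2,
                             h_bond_donors=1, h_bond_acceptors=4)
print(lipinski_violations(small_molecule), is_drug_like(small_molecule))
```

Allowing a single violation is a common convention; a stricter filter would pass `max_violations=0`.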
Stage 3 — Boltz-2 Binding Affinity: Candidate drugs and the protein sequence are fed into Boltz-2, the first open-source AI model approaching free-energy perturbation accuracy at 1000x lower compute cost. Predicted IC50 values rank candidates by binding potency.
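The ranking step in Stage 3 might look like the sketch below. It assumes each prediction is reported as log10 of the IC50 in micromolar, which should be checked against the Boltz-2 output files actually in use; the function names and the example drug labels are hypothetical.

```python
def log_ic50_to_nanomolar(log10_ic50_um: float) -> float:
    """Convert log10(IC50 in micromolar) to IC50 in nanomolar."""
    return (10 ** log10_ic50_um) * 1000.0


def rank_by_potency(predictions: dict[str, float]) -> list[tuple[str, float]]:
    """Return (drug, IC50 in nM) pairs sorted from most to least potent.

    A lower IC50 means less compound is needed for half-maximal inhibition,
    so the list is sorted in ascending order.
    """
    converted = [
        (drug, log_ic50_to_nanomolar(value))
        for drug, value in predictions.items()
    ]
    return sorted(converted, key=lambda pair: pair[1])


# Made-up prediction values, used only to show the ordering.
example_predictions = {"drug_a": 0.0, "drug_b": -2.0, "drug_c": 1.0}
for drug, ic50_nm in rank_by_potency(example_predictions):
    print(f"{drug}: {ic50_nm:.1f} nM")
```

Converting to nanomolar before sorting does not change the order, but it gives the ranked table units that are easier to compare with literature values.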
Stage 4 — OpenTargets Disease Validation: The protein and its interaction partners are queried against OpenTargets' genetic evidence database. GWAS hits, disease associations, and genetic scores are retrieved to validate or challenge the hypothesis with independent genomic evidence.
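A sketch of the Stage 4 lookup follows. The GraphQL query is written against the public Open Targets Platform schema as I understand it, so the field names should be verified against the current API documentation before use. The helper names and the 0.5 score cutoff are assumptions made for this example, and the HTTP request itself is omitted.

```python
# Query shape assumed from the public Open Targets Platform GraphQL schema.
ASSOCIATED_DISEASES_QUERY = """
query targetDiseases($ensemblId: String!) {
  target(ensemblId: $ensemblId) {
    approvedSymbol
    associatedDiseases {
      rows {
        disease { id name }
        score
      }
    }
  }
}
"""


def build_payload(ensembl_id: str) -> dict:
    """Build the JSON body for a GraphQL POST request."""
    return {
        "query": ASSOCIATED_DISEASES_QUERY,
        "variables": {"ensemblId": ensembl_id},
    }


def strong_associations(response: dict,
                        min_score: float = 0.5) -> list[tuple[str, float]]:
    """Return (disease name, score) pairs at or above `min_score`, best first."""
    target = (response.get("data") or {}).get("target") or {}
    rows = (target.get("associatedDiseases") or {}).get("rows") or []
    hits = [
        (row["disease"]["name"], row["score"])
        for row in rows
        if row["score"] >= min_score
    ]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Mock response with made-up diseases and scores, for illustration only.
mock_response = {"data": {"target": {
    "approvedSymbol": "EXAMPLE1",
    "associatedDiseases": {"rows": [
        {"disease": {"id": "X_1", "name": "disease A"}, "score": 0.72},
        {"disease": {"id": "X_2", "name": "disease B"}, "score": 0.31},
        {"disease": {"id": "X_3", "name": "disease C"}, "score": 0.88},
    ]},
}}}
print(strong_associations(mock_response))
```

Associations that clear the cutoff can then be compared with the disease relevance predicted in Stage 1, so that agreement supports the hypothesis and disagreement flags it for review.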
Stage 5 — Gemma 4 Drug Strategy: A final Gemma 4 call synthesizes all data into a complete preclinical roadmap. The roadmap covers specific assay types, expected positive and negative results, cost estimates, animal model selection, biomarker strategy, and Phase I clinical trial design, all generated automatically.