Gemma 4 Finds What Llama 3.1 Missed
Experiment #3 | April 2, 2026
Experiment Card
Same 16 papers. Different model. Does Google's Gemma 4 (26B) find things that Llama 3.1 (8B) missed?
Model Comparison
| Metric | Llama 3.1 8B (EXP-002) | Gemma 4 26B (EXP-003) |
|---|---|---|
| Top hypothesis score | 90/100 | 95/100 |
| Top insight | cGAS inhibition (single target) | Copper → mitochondria → cGAS cascade (multi-target) |
| Depth of reasoning | Individual mechanisms | Interconnected pathogenic networks |
| Cross-paper synthesis | Found copper debate | "Papers build on each other" (more nuanced) |
| Research gaps | Gut microbiota, CRISPR in humans | Same + translational models, long-term CRISPR safety |
| Papers analyzed | 16 | 16 (same) |
| Characters read | 1,935,627 | 1,935,627 (same) |
Gemma 4 Hypotheses
Multi-Target Cascade
Copper → Mito → cGAS
Copper dysregulation (cuproptosis) triggers mitochondrial failure, which releases mtDNA and activates cGAS-STING neuroinflammation. The model describes this as a self-perpetuating cycle shared across Huntington's (HD), Alzheimer's (AD), and Parkinson's (PD) disease.
Metabolic Restoration
Insulin Sensitizers
Restoring metabolic balance stabilizes mitochondria, preventing mtDNA release and dampening the inflammatory switch before irreversible structural atrophy begins.
DNA Repair Fidelity
LIG1 Enhancement
Enhancing DNA repair enzyme fidelity (LIG1) reduces oxidative damage accumulation, protecting mitochondrial integrity and reducing DAMP-fueled neuroinflammation.
The Key Insight
Gemma 4 didn't just find individual targets. It found the connections between them. Where Llama 3.1 saw "cGAS pathway" and "mitochondrial dysfunction" as separate findings, Gemma 4 traced the causal chain: copper dysregulation → mitochondrial failure → mtDNA release → cGAS-STING activation → chronic neuroinflammation → protein aggregation. One self-perpetuating cycle, not isolated mechanisms.
This is what a larger, more capable model adds: systems-level reasoning across multiple papers.
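One way to make the difference between "isolated findings" and "a causal chain" concrete is to write the chain down as a small directed graph. The sketch below is illustrative only: the stage names are copied from the chain described above, while the helper names (`build_edges`, `downstream`) are hypothetical and not part of the experiment's code. The write-up calls the chain self-perpetuating but does not say which step feeds back into which, so no feedback edge is included here.

```python
# Stages of the cascade, in the order the write-up lists them.
CASCADE = [
    "copper dysregulation",
    "mitochondrial failure",
    "mtDNA release",
    "cGAS-STING activation",
    "chronic neuroinflammation",
    "protein aggregation",
]


def build_edges(stages):
    """Map each stage to the stage it is said to trigger next."""
    return {src: dst for src, dst in zip(stages, stages[1:])}


def downstream(edges, start):
    """Follow edges from `start` and return every stage reached, in order."""
    path = []
    seen = {start}
    node = start
    while node in edges:
        node = edges[node]
        if node in seen:
            # Stop if a feedback edge (not specified in the source) is ever added.
            break
        seen.add(node)
        path.append(node)
    return path


EDGES = build_edges(CASCADE)

# Everything the hypothesis places downstream of mitochondrial failure.
print(downstream(EDGES, "mitochondrial failure"))
```

Read this way, a finding such as "mitochondrial dysfunction" on its own is a single node, whereas the cascade hypothesis is a claim about the edges between nodes. Those edges are the model's synthesis and remain unvalidated.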
Limitations
- One run per model. A different prompt or temperature could produce different results.
- Not reviewed by HD domain experts.
- The "cascade" hypothesis is the model's synthesis, not an established scientific finding.
- Only open-access papers analyzed.
- This comparison suggests that the larger model reasons more deeply, but it does not validate the hypotheses themselves.
What This Means for HD Research Hub
Model size matters
On this workload, the 26B model produced systems-level insights that the 8B model missed, which we consider worth the slower inference. The result supports applying for API credits (Gemini Pro, Claude) to test even larger models.
Gemma 4 is our new default
Apache 2.0, 256K context, runs locally on Mac. Better quality than Llama 3.1 on our workload. No API costs.
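As a rough sanity check on how the 256K context relates to the corpus size, the arithmetic can be sketched as follows. The character and paper counts come from the comparison table. Two things are assumptions rather than facts from the experiment: that "256K" means 256,000 tokens, and that text averages about 4 characters per token (a common rule of thumb for English, not a measured value for this tokenizer). The sketch says nothing about how the papers were actually fed to the model.

```python
import math

CORPUS_CHARS = 1_935_627   # "Characters read" from the comparison table
NUM_PAPERS = 16            # "Papers analyzed" from the comparison table
CONTEXT_TOKENS = 256_000   # "256K context", read as 256,000 tokens (assumption)
CHARS_PER_TOKEN = 4.0      # rough heuristic for English text (assumption)


def estimate_tokens(chars, chars_per_token=CHARS_PER_TOKEN):
    """Estimate a token count from a character count, rounding up."""
    return math.ceil(chars / chars_per_token)


corpus_tokens = estimate_tokens(CORPUS_CHARS)
per_paper_tokens = estimate_tokens(CORPUS_CHARS / NUM_PAPERS)
windows_needed = math.ceil(corpus_tokens / CONTEXT_TOKENS)

print(f"Estimated corpus size: {corpus_tokens:,} tokens")
print(f"Estimated average paper: {per_paper_tokens:,} tokens")
print(f"Context windows needed for the whole corpus: {windows_needed}")
```

Under these assumptions the full corpus is roughly 480K tokens, about twice the window, while an average paper fits comfortably. A measured tokens-per-character ratio from the model's own tokenizer would replace the heuristic.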