Gemma 4 Finds What Llama 3.1 Missed

Experiment #3 | April 2, 2026

Experiment Card

ID: EXP-003-GEMMA4
Date: 2026-04-02
Type: Multi-Model Comparison (same papers as EXP-002)
Status: Complete

Infrastructure
Model: Gemma 4 (26B)
Context: 256K available
Hardware: Mac M2 MBP (100% GPU)
Cost: $0 (local)

The Question

Same 16 papers. Different model. Does Google's Gemma 4 (26B) find things that Llama 3.1 (8B) missed?

Model Comparison

| Metric | Llama 3.1 8B (EXP-002) | Gemma 4 26B (EXP-003) |
|---|---|---|
| Top hypothesis score | 90/100 | 95/100 |
| Top insight | cGAS inhibition (single target) | Copper → mitochondria → cGAS cascade (multi-target) |
| Depth of reasoning | Individual mechanisms | Interconnected pathogenic networks |
| Cross-paper synthesis | Found copper debate | "Papers build on each other" (more nuanced) |
| Research gaps | Gut microbiota, CRISPR in humans | Same + translational models, long-term CRISPR safety |
| Papers analyzed | 16 | 16 (same) |
| Characters read | 1,935,627 | 1,935,627 (same) |
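
As a rough sanity check on scale, the character count can be converted into an approximate token count and compared with the 256K context listed on the experiment card. The sketch below assumes about 4 characters per token, a common rule of thumb for English text and not a measured figure for Gemma 4's tokenizer. This write-up does not say how the papers were fed to the model, so the result is only an estimate of whether the full corpus could fit in one pass.

```python
# Rough context-budget estimate for the 16-paper corpus.
# CHARS_PER_TOKEN is an ASSUMED heuristic, not a measured value
# for Gemma 4's tokenizer.
TOTAL_CHARS = 1_935_627    # "Characters read" from the comparison table
NUM_PAPERS = 16            # "Papers analyzed" from the comparison table
CONTEXT_TOKENS = 256_000   # "256K available" from the experiment card
CHARS_PER_TOKEN = 4.0      # assumption: rule of thumb for English text


def estimate_tokens(chars: int, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Convert a character count to an approximate token count."""
    return round(chars / chars_per_token)


total_tokens = estimate_tokens(TOTAL_CHARS)
avg_tokens_per_paper = estimate_tokens(TOTAL_CHARS // NUM_PAPERS)
fits_in_one_pass = total_tokens <= CONTEXT_TOKENS

print(f"Estimated total tokens: {total_tokens:,}")
print(f"Estimated tokens per paper: {avg_tokens_per_paper:,}")
print(f"Fits in a single context window: {fits_in_one_pass}")
```

Under this assumption the whole corpus comes to more tokens than a single 256K window holds, while an average paper fits comfortably. A different characters-per-token ratio would shift the totals, so treat the numbers as order-of-magnitude only.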

Gemma 4 Hypotheses

Multi-Target Cascade

Copper → Mito → cGAS

Confidence: 95/100

Copper dysregulation (cuproptosis) triggers mitochondrial failure, which releases mtDNA, activating cGAS-STING neuroinflammation. A self-perpetuating cycle across HD, AD, PD.

Metabolic Restoration

Insulin Sensitizers

Confidence: 92/100

Restoring metabolic balance stabilizes mitochondria, preventing mtDNA release and dampening the inflammatory switch before irreversible structural atrophy begins.

DNA Repair Fidelity

LIG1 Enhancement

Confidence: 88/100

Enhancing DNA repair enzyme fidelity (LIG1) reduces oxidative damage accumulation, protecting mitochondrial integrity and reducing DAMP-fueled neuroinflammation.

The Key Insight

Gemma 4 didn't just find individual targets. It found the connections between them. Where Llama 3.1 saw "cGAS pathway" and "mitochondrial dysfunction" as separate findings, Gemma 4 traced the causal chain: copper dysregulation → mitochondrial failure → mtDNA release → cGAS-STING activation → chronic neuroinflammation → protein aggregation. One self-perpetuating cycle, not isolated mechanisms.
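
One way to make the difference between a single-target and a multi-target view concrete is to write the chain down as data and ask what lies downstream of a given intervention point. The sketch below is illustrative only: it encodes the steps exactly as listed above, in order, and the helper name `downstream_of` is hypothetical. The text calls the process self-perpetuating but does not say which step feeds back into which, so no feedback link is modeled here.

```python
# The causal chain as listed in the text, in order.
# This is the model's synthesis, not an established finding.
CASCADE = [
    "copper dysregulation",
    "mitochondrial failure",
    "mtDNA release",
    "cGAS-STING activation",
    "chronic neuroinflammation",
    "protein aggregation",
]


def downstream_of(step: str, chain: list[str] = CASCADE) -> list[str]:
    """Return every step that follows `step` in the linear chain."""
    return chain[chain.index(step) + 1:]


# A single-target view (cGAS) only covers the tail of the chain.
print(downstream_of("cGAS-STING activation"))

# An upstream target (copper) has every other step downstream of it.
print(downstream_of("copper dysregulation"))
```

Read this way, a cGAS-only hypothesis addresses the last two steps, while the cascade hypothesis places the earlier copper and mitochondrial steps upstream of the same target.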

This is what a larger, more capable model adds: systems-level reasoning across multiple papers.

Limitations

  • Single model, single run. A different prompt or temperature could produce different results.
  • Not reviewed by HD domain experts.
  • The "cascade" hypothesis is the model's synthesis, not an established scientific finding.
  • Only open-access papers analyzed.
  • This comparison suggests that the larger model reasons more deeply on this workload, but it does not validate the hypotheses themselves.

What This Means for HD Research Hub

Model size matters

In this experiment, the 26B model produced systems-level insights that the 8B model missed, which makes the slower inference worthwhile. It also justifies applying for API credits (Gemini Pro, Claude) to test even larger models.

Gemma 4 is our new default

Apache 2.0, 256K context, runs locally on Mac. Better quality than Llama 3.1 on our workload. No API costs.
