Gemma 4 Finds What Llama 3.1 Missed

Experiment #3 | April 2, 2026

Experiment Card

EXP-003-GEMMA4

Date: 2026-04-02

Type: Multi-Model Comparison (same papers as EXP-002)

Status: Complete

Infrastructure

Model: Gemma 4 (26B)

Context: 256K available

Hardware: Mac M2 MBP (100% GPU)

Cost: $0 (local)

The Question

Same 16 papers. Different model. Does Google's Gemma 4 (26B) find things that Llama 3.1 (8B) missed?

Model Comparison

Metric	Llama 3.1 8B (EXP-002)	Gemma 4 26B (EXP-003)
Top hypothesis score	90/100	95/100
Top insight	cGAS inhibition (single target)	Copper → mitochondria → cGAS cascade (multi-target)
Depth of reasoning	Individual mechanisms	Interconnected pathogenic networks
Cross-paper synthesis	Found copper debate	"Papers build on each other" (more nuanced)
Research gaps	Gut microbiota, CRISPR in humans	Same + translational models, long-term CRISPR safety
Papers analyzed	16	16 (same)
Characters read	1,935,627	1,935,627 (same)

Gemma 4 Hypotheses

Multi-Target Cascade

Copper → Mito → cGAS

Confidence: 95/100

Copper dysregulation (cuproptosis) triggers mitochondrial failure, which releases mtDNA, activating cGAS-STING neuroinflammation. A self-perpetuating cycle across HD, AD, PD.

Metabolic Restoration

Insulin Sensitizers

Confidence: 92/100

Restoring metabolic balance stabilizes mitochondria, preventing mtDNA release and dampening the inflammatory switch before irreversible structural atrophy begins.

DNA Repair Fidelity

LIG1 Enhancement

Confidence: 88/100

Enhancing DNA repair enzyme fidelity (LIG1) reduces oxidative damage accumulation, protecting mitochondrial integrity and reducing DAMP-fueled neuroinflammation.

The Key Insight

Gemma 4 didn't just find individual targets. It found the connections between them. Where Llama 3.1 saw "cGAS pathway" and "mitochondrial dysfunction" as separate findings, Gemma 4 traced the causal chain: copper dysregulation → mitochondrial failure → mtDNA release → cGAS-STING activation → chronic neuroinflammation → protein aggregation. One self-perpetuating cycle, not isolated mechanisms.
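The difference between isolated findings and a traced chain can be made concrete with a small data-structure sketch. It is only an illustration: the step names are copied from the chain above, while the helper names (`cascade_edges`, `links_covered`) and the mapping of Llama 3.1's "cGAS pathway" and "mitochondrial dysfunction" onto two of those steps are assumptions made for this example, not part of either model's output.

```python
# Ordered steps of the cascade, as listed in the text above.
CASCADE = [
    "copper dysregulation",
    "mitochondrial failure",
    "mtDNA release",
    "cGAS-STING activation",
    "chronic neuroinflammation",
    "protein aggregation",
]


def cascade_edges(steps):
    """Return the directed (cause, effect) pairs implied by an ordered chain."""
    return list(zip(steps, steps[1:]))


def links_covered(findings, steps):
    """Return the causal links whose two endpoints both appear in `findings`."""
    found = set(findings)
    return [(a, b) for a, b in cascade_edges(steps) if a in found and b in found]


# Assumed mapping of the two separate Llama 3.1 findings onto cascade steps.
isolated_findings = ["cGAS-STING activation", "mitochondrial failure"]

print(len(cascade_edges(CASCADE)))                # number of causal links in the chain
print(links_covered(isolated_findings, CASCADE))  # links connected by the isolated findings
```

Under these assumptions the two isolated findings connect none of the chain's five links, because the step between them (mtDNA release) is missing. That intermediate step is what the traced chain supplies.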

This is what a larger, more capable model adds: systems-level reasoning across multiple papers.

Limitations

Single model, single run. A different prompt or temperature could produce different results.
Not reviewed by HD domain experts.
The "cascade" hypothesis is the model's synthesis, not an established scientific finding.
Only open-access papers analyzed.
This comparison suggests that the larger model reasons more deeply, but it does not validate the hypotheses themselves.
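One way to address the single-run limitation would be to repeat each model's run several times and compare the score gap against run-to-run spread. The sketch below is a minimal illustration under that assumption: the function names are hypothetical, the "gap larger than combined standard deviation" rule is a crude heuristic rather than a formal statistical test, and the example scores are placeholders, not results from this experiment.

```python
from statistics import mean, stdev


def summarize_runs(scores):
    """Summarize top-hypothesis scores (0-100) from repeated runs of one model."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate spread")
    return {"n": len(scores), "mean": mean(scores), "stdev": stdev(scores)}


def gap_exceeds_noise(summary_a, summary_b):
    """Crude check: is the mean gap larger than the combined run-to-run spread?"""
    gap = abs(summary_a["mean"] - summary_b["mean"])
    return gap > summary_a["stdev"] + summary_b["stdev"]


# Hypothetical placeholder scores for illustration only (not measured data).
small_model = summarize_runs([90, 92, 88])
large_model = summarize_runs([95, 96, 94])
print(small_model, large_model, gap_exceeds_noise(small_model, large_model))
```

If the gap does not exceed the spread, a difference such as 90 versus 95 from single runs could plausibly be noise from prompt or temperature variation.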

What This Means for HD Research Hub

Model size matters

26B produces systems-level insights that 8B misses. Worth the slower inference. Justifies applying for API credits (Gemini Pro, Claude) for even larger models.

Gemma 4 is our new default

Apache 2.0, 256K context, runs locally on Mac. Better quality than Llama 3.1 on our workload. No API costs.
