Gemma 4 Architecture Explorer · hackable edition

Node categories: Embedding · Normalization · Attention · MLP / FFN · Structural · Output · Hack it
Architecture pipeline:
- Input Tokens: SentencePiece tokenizer, 262K vocabulary
- Token Embedding: 262144 × d_model, tied with the output head
- Per-Layer Embeddings: E2B/E4B only, THE trick (sketch below)
- Decoder Block (×N, interleaved 5:1 local:global; sketch of the pattern below):
  - Pre-Attention RMSNorm
  - Local Sliding Attn: window 512, 5 of every 6 layers
  - Global Attn: p-RoPE, every 6th layer
  - MLP (GeGLU): gated, 8× expand (sketch below)
  - MoE Side Block: 26B-A4B only (sketch below)
- Final RMSNorm
- Output Head: tied to tok_emb (sketch below)
- Vision Encoder: SigLIP-v2, variable aspect ratio

Hack recipes:
- llama.cpp / GGUF: universal runtime
- MLX (Apple Silicon): M-series native
- Unsloth LoRA: fast fine-tuning
- QAT / Quantization: int4 ships
- Context Extension: p-RoPE scaling (sketch below)
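Per-Layer Embeddings deserve a closer look, since the diagram flags them as the E2B/E4B trick. Below is a minimal sketch of the idea, assuming it works roughly like Gemma 3n's PLE: each layer owns a compact embedding table indexed by token id, projected up and added to that layer's hidden state. The class name, the d_ple width, and the additive mixing are illustrative assumptions, not the shipped implementation.

```python
import torch
import torch.nn as nn

class PerLayerEmbedding(nn.Module):
    """One compact embedding table per decoder layer (hypothetical PLE)."""

    def __init__(self, vocab: int, n_layers: int, d_ple: int, d_model: int):
        super().__init__()
        # All per-layer tables packed into one Embedding: n_layers * vocab rows.
        self.tables = nn.Embedding(n_layers * vocab, d_ple)
        self.proj = nn.Linear(d_ple, d_model, bias=False)
        self.vocab = vocab

    def forward(self, token_ids: torch.Tensor, layer_idx: int) -> torch.Tensor:
        # Offset ids into the slice of the packed table owned by this layer.
        ple = self.tables(token_ids + layer_idx * self.vocab)
        return self.proj(ple)  # (batch, seq, d_model)

# Toy sizes for a runnable demo; the real vocabulary is 262144.
ple = PerLayerEmbedding(vocab=1000, n_layers=6, d_ple=64, d_model=512)
ids = torch.randint(0, 1000, (2, 8))
h = torch.randn(2, 8, 512)
h = h + ple(ids, layer_idx=3)  # mixed into the hidden state at each layer
```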
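The 5:1 interleave and the 512-token window are concrete enough to sketch. Assuming standard causal attention, and reading "every 6th layer" as layers 5, 11, 17, and so on, a boolean-mask version looks like this; the helper names are illustrative.

```python
import torch

WINDOW = 512   # local sliding-attention window
PATTERN = 6    # 5 local layers, then 1 global layer

def is_global_layer(layer_idx: int) -> bool:
    # Layers 5, 11, 17, ... are global; the other 5 of every 6 are local.
    return (layer_idx + 1) % PATTERN == 0

def attention_mask(seq_len: int, local: bool) -> torch.Tensor:
    # True = query i may attend to key j. Causal, optionally window-clipped.
    i = torch.arange(seq_len).unsqueeze(1)  # query positions
    j = torch.arange(seq_len).unsqueeze(0)  # key positions
    causal = j <= i
    return causal & (i - j < WINDOW) if local else causal

print([is_global_layer(l) for l in range(6)])  # five False, then True
mask = attention_mask(2048, local=True)
print(int(mask[-1].sum()))                     # 512: last row sees one window
```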
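The GeGLU block is standard enough to show directly: a GELU-gated branch multiplied elementwise with a linear branch, then projected back down. The 8× figure comes from the diagram; whether shipped checkpoints count that per branch or combined is not stated, so the width here is an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GeGLU(nn.Module):
    def __init__(self, d_model: int, expand: int = 8):
        super().__init__()
        d_ff = expand * d_model
        self.gate = nn.Linear(d_model, d_ff, bias=False)  # gated branch
        self.up = nn.Linear(d_model, d_ff, bias=False)    # linear branch
        self.down = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # GeGLU: GELU(x W_gate) elementwise-times (x W_up), projected down.
        return self.down(F.gelu(self.gate(x), approximate="tanh") * self.up(x))

y = GeGLU(d_model=512)(torch.randn(2, 8, 512))
```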
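The diagram only says the 26B-A4B variant carries an MoE side block, so everything structural below is an assumption: a plain top-2-of-8 token router bolted on beside the dense MLP, written for clarity rather than speed.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoESide(nn.Module):
    def __init__(self, d_model: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model, bias=False),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model, bias=False))
            for _ in range(n_experts))
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Route each token to its top-k experts, mix by softmaxed router score.
        w, idx = self.router(x).topk(self.k, dim=-1)  # (..., k)
        w = F.softmax(w, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                sel = idx[..., slot] == e             # tokens routed to expert e
                if sel.any():
                    out[sel] += w[..., slot][sel].unsqueeze(-1) * expert(x[sel])
        return out

y = MoESide(d_model=256)(torch.randn(2, 8, 256))
```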
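Weight tying is worth making concrete: the output head reuses the token-embedding matrix as the unembedding, so the 262144 × d_model table is paid for once. The RMSNorm shown uses a Gemma-style zero-initialized scale with a (1 + weight) multiplier; the epsilon and the float32 upcast are assumptions.

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    def __init__(self, d: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.zeros(d))  # zero-init, Gemma style
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = x.float().pow(2).mean(-1, keepdim=True).add(self.eps).rsqrt()
        return (x.float() * rms * (1.0 + self.weight.float())).type_as(x)

vocab, d_model = 1000, 512         # toy sizes; real vocab is 262144
tok_emb = nn.Embedding(vocab, d_model)
norm = RMSNorm(d_model)

h = torch.randn(2, 8, d_model)     # final hidden states
logits = norm(h) @ tok_emb.weight.T  # tied: no separate lm_head matrix
```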
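For the context-extension recipe, the diagram names only "p-RoPE scaling". Reading that as position interpolation on rotary embeddings is an assumption; under that reading, dividing positions by a scale factor squeezes a longer sequence into the rotation range the model was trained on.

```python
import torch

def rope_angles(seq_len: int, head_dim: int, base: float = 10000.0,
                scale: float = 1.0) -> torch.Tensor:
    # scale > 1 stretches the usable context by the same factor.
    pos = torch.arange(seq_len, dtype=torch.float32) / scale
    inv_freq = base ** (-torch.arange(0, head_dim, 2, dtype=torch.float32)
                        / head_dim)
    return torch.outer(pos, inv_freq)  # (seq_len, head_dim // 2)

def apply_rope(x: torch.Tensor, angles: torch.Tensor) -> torch.Tensor:
    # x: (..., seq_len, head_dim); rotate consecutive (even, odd) pairs.
    x1, x2 = x[..., 0::2], x[..., 1::2]
    cos, sin = angles.cos(), angles.sin()
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

q = torch.randn(1, 4, 8192, 64)                      # (batch, heads, seq, dim)
q = apply_rope(q, rope_angles(8192, 64, scale=4.0))  # 4x context stretch
```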