Embedding
Input Tokens
Token Embedding
Per-Layer Embeddings
Normalization
Pre-Attention RMSNorm
Final RMSNorm
Attention
Local Sliding Attn
Global Attn
MLP / FFN
MLP (GeGLU)
Structural
Decoder Block
MoE Side Block
Vision Encoder
Output
Output Head
Hack it
llama.cpp / GGUF
MLX (Apple Silicon)
Unsloth LoRA
QAT / Quantization
Context Extension
Click a node in the diagram
or sidebar to explore it