3D Attention Volume — Scan through the model's layers
Each horizontal plane is one layer's attention matrix for the selected head. Z axis = layers. Drag to orbit, scroll to zoom.
Thinking 3D — Watch the model process your prompt
Layer-by-layer animation showing how predictions form. Each plane is a layer's confidence heatmap. Attention beams show token connections. Predictions appear on the right.
Residual Stream Divergence
L2 norm of the difference between the two prompts' residual streams at each layer and token position. Shows where the model first registers that the prompts differ.
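A minimal sketch of this metric, assuming the residual streams have been cached as arrays of shape `(n_layers, n_tokens, d_model)` (a hypothetical layout; adapt to your model's cache):

```python
import numpy as np

def residual_divergence(resid_a, resid_b):
    """Per-layer, per-token L2 norm of the residual-stream difference.

    resid_a, resid_b: arrays of shape (n_layers, n_tokens, d_model),
    one per prompt (hypothetical layout). Returns an
    (n_layers, n_tokens) heatmap: large values mark where the two
    prompts' internal states diverge.
    """
    return np.linalg.norm(resid_a - resid_b, axis=-1)
```

The earliest layer with a nonzero column in the returned heatmap is where the model first "notices" the difference.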
Logit Lens Diff
Green = both prompts predict the same next token. Red = predictions disagree. Intensity = Jensen-Shannon divergence between the two predicted distributions.
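A sketch of the intensity metric, assuming `p` and `q` are the two next-token probability distributions produced by the logit lens at one layer:

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two probability vectors.

    p, q: next-token distributions (softmax of logit-lens logits)
    for prompts A and B. Symmetric and bounded in [0, ln 2], so it
    maps cleanly onto a color intensity. eps guards against log(0).
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))  # KL divergence
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Identical distributions score 0; fully disjoint ones score ln 2, which makes normalization for the colormap trivial.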
Attention Divergence by Head
Frobenius norm of the attention-pattern difference for each head. Click a cell to compare the two patterns side-by-side.
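A sketch of the per-head score, assuming attention patterns are cached with shape `(n_layers, n_heads, n_tokens, n_tokens)` (a hypothetical layout):

```python
import numpy as np

def attention_divergence(attn_a, attn_b):
    """Frobenius norm of the attention-pattern difference, per head.

    attn_a, attn_b: arrays of shape (n_layers, n_heads, n_tokens,
    n_tokens), one per prompt (hypothetical layout). Returns an
    (n_layers, n_heads) grid -- one value per clickable cell.
    """
    diff = attn_a - attn_b
    # Frobenius norm over the (query, key) axes of each head's matrix.
    return np.sqrt(np.sum(diff ** 2, axis=(-2, -1)))
```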
Prompt A
Prompt B
|A - B| Difference
Evolution Comparison
How the last-token prediction evolves through layers for each prompt.
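A logit-lens-style sketch of this view: project the last token's residual stream through the unembedding at every layer and take the top prediction. All names here (`resid_last`, `W_U`) are hypothetical; adapt to your model's cache and unembedding matrix.

```python
import numpy as np

def prediction_evolution(resid_last, W_U, vocab):
    """Top predicted token for the final position at each layer.

    resid_last: (n_layers, d_model) residual stream at the last token
    after each layer; W_U: (d_model, n_vocab) unembedding matrix;
    vocab: list of token strings. Returns one token per layer,
    tracing how the prediction forms through the stack.
    """
    logits = resid_last @ W_U          # (n_layers, n_vocab)
    top = logits.argmax(axis=-1)       # best token id per layer
    return [vocab[i] for i in top]
```

Running this once per prompt yields the two per-layer prediction tracks that the comparison lays side by side.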