Select a feature from the sidebar
Clan activation: Mean activation of this feature across windows from each of the 7 sperm whale clans. A clan-specific feature fires strongly for one clan and stays near zero for the others.
How strongly this feature fires for each clan's coda sequences
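The per-clan mean described above can be sketched as follows. This is a minimal illustration, not the dashboard's actual code: the function name, the array layout (one activation value and one integer clan label per window), and the choice to report 0 for a clan with no windows are all assumptions.

```python
import numpy as np

def mean_activation_by_clan(activations, clan_ids, n_clans):
    """Mean feature activation per clan.

    activations: (n_windows,) activation of one feature per window (assumed layout)
    clan_ids:    (n_windows,) integer clan label in [0, n_clans) per window
    """
    activations = np.asarray(activations, dtype=float)
    clan_ids = np.asarray(clan_ids, dtype=int)
    # Sum of activations and number of windows for each clan.
    sums = np.bincount(clan_ids, weights=activations, minlength=n_clans)
    counts = np.bincount(clan_ids, minlength=n_clans)
    # Clans with no windows get 0 instead of a division-by-zero NaN.
    return np.divide(sums, counts, out=np.zeros(n_clans), where=counts > 0)
```

A clan-specific feature would show one large entry in the returned vector and values close to zero elsewhere.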
Top activating windows: The 20 coda sequences that activate this feature most strongly. Each colored token is a coda type. Brightness encodes relevance: vivid tokens contribute most to this feature's activation, faded tokens contribute less. Click any token to see which other features respond to it.
Coda sequences where this feature fires most strongly. Brighter tokens matter more.
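Selecting the top windows amounts to a top-k sort over per-window activations. The sketch below assumes each window has already been reduced to a single activation score; how that score is derived from per-token activations is not stated here, and the function name is hypothetical.

```python
import numpy as np

def top_activating_windows(window_scores, k=20):
    """Indices of the k windows with the highest score, strongest first.

    window_scores: (n_windows,) one activation score per window (assumed input)
    """
    window_scores = np.asarray(window_scores, dtype=float)
    # argsort is ascending, so reverse it and keep the first k entries.
    return np.argsort(window_scores)[::-1][:k]
```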
Logit lens: What this feature does to next-token prediction. "Promotes" means the feature pushes the model toward predicting these coda types next. "Suppresses" means it pushes away from them. Computed by projecting the feature's decoder direction through the output head.
Which coda types this feature promotes or suppresses in next-token prediction
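The projection described above can be sketched as a single matrix product. This assumes the output head is a plain linear map from the model's hidden dimension to coda types, with no normalization layer in between; if one exists, the result is only an approximation. The names `decoder_direction` and `unembed` are illustrative, not taken from the tool.

```python
import numpy as np

def logit_lens(decoder_direction, unembed, k=3):
    """Project one feature's decoder direction through a linear output head.

    decoder_direction: (d_model,) the feature's decoder vector
    unembed:           (d_model, n_coda_types) output-head weights (assumed linear)
    Returns the per-type logit effect plus the k most promoted and
    k most suppressed coda-type indices.
    """
    decoder_direction = np.asarray(decoder_direction, dtype=float)
    unembed = np.asarray(unembed, dtype=float)
    effect = decoder_direction @ unembed      # (n_coda_types,)
    order = np.argsort(effect)                # ascending by logit effect
    promoted = order[::-1][:k]                # largest positive push first
    suppressed = order[:k]                    # most negative push first
    return effect, promoted, suppressed
```

Positive entries in `effect` correspond to promoted coda types and negative entries to suppressed ones.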
Co-firing features: Features that tend to activate alongside this one. High co-firing rate means these features frequently fire on the same windows, suggesting they encode related or complementary information.
Other features that tend to activate on the same windows
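One way to compute a co-firing rate is shown below. The exact definition used by the tool is not given, so this sketch assumes a conditional rate: of the windows where the selected feature is active, the fraction in which each other feature is also active. The activation threshold and function name are likewise assumptions.

```python
import numpy as np

def cofiring_rates(acts, feature, threshold=0.0):
    """Fraction of the selected feature's active windows shared with each feature.

    acts:    (n_windows, n_features) window-level activations (assumed layout)
    feature: column index of the selected feature
    """
    acts = np.asarray(acts, dtype=float)
    active = acts > threshold                 # boolean firing mask
    target = active[:, feature]               # windows where the feature fires
    n_active = target.sum()
    if n_active == 0:
        return np.zeros(acts.shape[1])
    # Count how often each feature fires within those windows.
    rates = active[target].sum(axis=0) / n_active
    rates[feature] = 0.0                      # drop the trivial self-overlap
    return rates
```

Other definitions, such as a symmetric Jaccard overlap, would rank partner features differently.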
Activation distribution: Histogram of this feature's nonzero activation values. A tight distribution means the feature fires at a consistent strength. A wide distribution means it fires at varying intensities, possibly encoding graded information.
Distribution of nonzero activation magnitudes
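A histogram of this kind can be sketched with `numpy.histogram`. The bin count and function name are assumptions; the only step taken from the description above is that zero activations are dropped before binning.

```python
import numpy as np

def nonzero_activation_histogram(activations, bins=30):
    """Histogram of one feature's nonzero activation values.

    activations: (n_windows,) activation values, zeros included
    Returns (counts, bin_edges) as produced by numpy.histogram.
    """
    activations = np.asarray(activations, dtype=float)
    nonzero = activations[activations != 0]   # keep only windows where it fires
    counts, edges = np.histogram(nonzero, bins=bins)
    return counts, edges
```

Counts concentrated in a few adjacent bins indicate a consistent firing strength, while counts spread over many bins indicate graded intensities.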