Advanced algorithms for revealing the mathematical structures hidden inside neural networks
The weights of a neural network are high-dimensional and opaque. Current interpretability methods largely ask: which inputs matter? We are interested in a different question: what structures has the network learned, and how are they organised?
If higher-categorical signatures can characterise the geometry of learned representations, then we have a new lens for model auditing and debugging: understanding not just that a model fails, but what relational structure it has (mis)learned.
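The categorical machinery itself is the subject of the programme and is not reproduced here. As a deliberately crude stand-in, the sketch below summarises the relational geometry of one layer by how it arranges a fixed set of probe inputs, and compares two models with linear CKA, which (up to centring) compares their Gram matrices. Everything in the snippet, from the array names to the random "activations", is an illustrative assumption rather than part of the demo above.

```python
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear CKA between two activation matrices (n_samples x n_features)."""
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(Y.T @ X, "fro") ** 2
    return float(cross / (np.linalg.norm(X.T @ X, "fro") *
                          np.linalg.norm(Y.T @ Y, "fro")))

# Hypothetical activations: the same 512 probe inputs pushed through a chosen
# layer of two different models (widths may differ; CKA does not care).
rng = np.random.default_rng(0)
acts_model_a = rng.normal(size=(512, 64))
acts_model_b = rng.normal(size=(512, 128))

print(f"relational similarity (linear CKA): "
      f"{linear_cka(acts_model_a, acts_model_b):.3f}")
```

The same comparison, run between one model's activations on source-domain and target-domain inputs, is one crude pre-deployment probe of the transfer question raised in the next paragraph; the categorical signatures are intended to be a far richer version of this kind of structural comparison.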
This matters for transfer learning and domain adaptation: if we can identify which abstract structures a model has acquired, we can predict where it will generalise and where it will break, before deployment.
It matters for model compression: structure that is redundant or degenerate under a categorical lens is structure that can be pruned without behavioural loss (a toy sketch of this idea appears below).
And it matters for architecture search: if certain categorical signatures reliably yield useful representational geometry, we can design networks that learn them by construction rather than by accident.
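Returning to the compression point: the sketch below is a toy stand-in for what "redundant under a categorical lens" might cash out as operationally. It plants a near-duplicate hidden unit in a small ReLU network, detects it by activation correlation on probe inputs (an assumed proxy, not the categorical criterion), folds its outgoing weights into its twin, and checks that the network's outputs barely move. Every name and number is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer ReLU network with a deliberately duplicated hidden unit.
W1 = rng.normal(size=(16, 10))                      # hidden x input
W1[7] = W1[3] + 1e-3 * rng.normal(size=10)          # unit 7 ~ unit 3 (redundant)
W2 = rng.normal(size=(4, 16))                       # output x hidden

def forward(X, W1, W2):
    return np.maximum(X @ W1.T, 0.0) @ W2.T

X = rng.normal(size=(256, 10))                      # probe inputs
H = np.maximum(X @ W1.T, 0.0)                       # hidden activations

# Find the pair of hidden units whose activation patterns are most collinear.
C = np.corrcoef(H.T)
np.fill_diagonal(C, 0.0)
i, j = np.unravel_index(np.argmax(C), C.shape)
print(f"most redundant pair: units {i} and {j} (corr={C[i, j]:.4f})")

# Merge: fold unit j's outgoing weights into unit i, then drop unit j.
scale = H[:, j].mean() / (H[:, i].mean() + 1e-12)   # crude magnitude match
W2_merged = W2.copy()
W2_merged[:, i] += scale * W2_merged[:, j]
keep = [k for k in range(W1.shape[0]) if k != j]
W1_pruned, W2_pruned = W1[keep], W2_merged[:, keep]

# Behavioural check: outputs should barely move after pruning.
drift = np.abs(forward(X, W1, W2) - forward(X, W1_pruned, W2_pruned)).max()
print(f"max output drift after pruning one unit: {drift:.5f}")
```

The redundancy criterion here (pairwise activation correlation) is deliberately naive; the bet of the programme is that a categorical signature would expose redundancy and degeneracy that such pairwise statistics miss.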
The demo above is a provocation: even toy signatures yield non-trivial geometry. The ongoing research programme is to scale this toward a structural vocabulary for what neural networks know and how they know it.