Behavioural reverse-engineering with neural networks
In this toy, we demonstrate how a warm-started neural network tunes the internal parameters of a Braitenberg vehicle to match a target observed behaviour.
Simple Machines
Braitenberg vehicles are simple robots with sensors (for light, temperature, and the like) connected to the motors of two wheels; they were introduced in 1984 by the cyberneticist Valentino Braitenberg, who intended them to demonstrate how complex behaviours (loftily named "love", "fear", "curiosity", and so on) can arise from simple rules.
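The classic wiring can be stated in a few lines. Below is a minimal sketch, assuming a point light source, a 1/(1+d²) intensity falloff, and differential-drive kinematics; the constants (sensor offset, axle width, time step) are illustrative, not from any particular implementation:

```python
import math

def braitenberg_step(x, y, heading, light_x, light_y, crossed=True,
                     dt=0.1, base_speed=0.2, axle=0.1):
    """One update of a two-wheeled vehicle with two forward light sensors.

    Crossed wiring (each sensor drives the opposite motor) turns the
    vehicle toward the light -- Braitenberg's "aggression"; uncrossed
    wiring turns it away -- "fear".
    """
    sensors = []
    for side in (+1, -1):  # +1 = left sensor, -1 = right sensor
        sx = x + 0.05 * math.cos(heading + side * 0.5)
        sy = y + 0.05 * math.sin(heading + side * 0.5)
        d2 = (sx - light_x) ** 2 + (sy - light_y) ** 2
        sensors.append(1.0 / (1.0 + d2))  # brighter when closer
    left, right = sensors

    if crossed:
        left_motor, right_motor = base_speed + right, base_speed + left
    else:
        left_motor, right_motor = base_speed + left, base_speed + right

    # Differential-drive kinematics: average drives forward motion,
    # difference drives rotation.
    v = (left_motor + right_motor) / 2.0
    omega = (right_motor - left_motor) / axle
    return (x + v * math.cos(heading) * dt,
            y + v * math.sin(heading) * dt,
            heading + omega * dt)
```

The entire "personality" of the vehicle lives in the handful of numbers and the wiring flag above, which is precisely what makes it an attractive target for reverse-engineering.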
In this sense, Braitenberg vehicles prefigure the philosophy of Boids, Multi-Agent Systems, and studies relating cellular automata to fundamental physics. All of these fields, broadly speaking, make the same point: complex and lifelike behaviours can emerge from systems with simple internal mechanisms. The upshot of each is the converse proposition, that if we can find the simple mechanism that produces an observed complex behaviour, then we have a model that explains the behaviour. Let's provisionally call such approaches simple machines.
Part of the ongoing intellectual shock of LLMs is that such approaches have fallen out of favour: after all, there is no need to search for simple mechanisms if the aim is to produce complex behaviour, when instead one can use a complex mechanism tuned with a lot of data. The crucial advantage that neural approaches hold over simple machines is that neural networks benefit from feedback: backpropagation is a feedback process in which a stream of data shapes the network to have desired input-output behaviours, as a rock placed in a river is smoothed and shaped over time.
So, the question is whether the playing field can be levelled: can simple machines also benefit from data-driven feedback mechanisms? If so, then the classical weakness of simple machines — namely the difficulty of reverse-engineering controlling parameters for observed behaviours — may be bypassed.
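What such a feedback loop might look like for a simple machine can be sketched concretely. The toy below is an assumption-laden illustration, not the demo's actual method: a one-dimensional "machine" whose whole control surface is two numbers (a gain and a stimulus position), tuned by finite-difference gradient descent so that its trajectory matches an observed target trajectory:

```python
def rollout(params, x0=1.0, steps=20):
    """Trajectory of a toy 1-D 'simple machine': at each step it moves
    a fraction `gain` of the way toward a stimulus at `light`. These
    two numbers are its entire control surface."""
    gain, light = params
    x, traj = x0, []
    for _ in range(steps):
        x = x + gain * (light - x)
        traj.append(x)
    return traj

def loss(params, target):
    """Behaviour-matching objective: squared distance between the
    machine's trajectory and the observed target trajectory."""
    return sum((a - b) ** 2 for a, b in zip(rollout(params), target))

def tune(target, params=(0.6, 0.0), lr=0.005, eps=1e-4, iters=4000):
    """Data-driven feedback for a simple machine: finite-difference
    gradient descent on its two interpretable parameters."""
    params = list(params)
    for _ in range(iters):
        base = loss(params, target)
        grad = []
        for i in range(len(params)):
            bumped = list(params)
            bumped[i] += eps
            grad.append((loss(bumped, target) - base) / eps)
        params = [p - lr * g for p, g in zip(params, grad)]
        params[0] = min(max(params[0], 0.05), 0.95)  # keep the toy dynamics stable
    return params
```

Given a target trajectory produced by hidden parameters, `tune` recovers parameters close to the hidden ones; the point is that the feedback loop never touches the machine's internals directly, only its observed behaviour.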
Obviously yes (but subtly uncertain)
The gut-reaction answer is that this is just the problem setting of reinforcement learning, and so there is nothing new about this problem. But there are some mild distinctions that may matter.
First, feedback for simple machines is about behaviour matching in domains with no a priori metrics: any metric must be derived from learnt representations. Whereas most settings in reinforcement learning start from a given reward signal, with representation learning a secondary concern, the priorities here are reversed.
Second, there may be considerable causal or computational distance between the control surface of a simple machine and its behaviour: whereas in reinforcement learning the learner directly produces an agent's actions, in the setting of simple machines the learner tweaks something "behind the scenes", hoping that the resulting actions are classified as behaviours similar to the target, modulo some learned representation.
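This "modulo some learned representation" can be made concrete in a toy. In the sketch below (all names and constants are illustrative assumptions), a handcrafted summary statistic stands in for a learned embedding: the loss compares trajectories only through the embedding, never action by action, and the tuned parameter sits one simulation removed from the behaviour being matched:

```python
def rollout(gain, x0=1.0, steps=30):
    """A hidden control surface: one behind-the-scenes parameter
    governing relaxation toward a stimulus at the origin."""
    x, traj = x0, []
    for _ in range(steps):
        x = x + gain * (0.0 - x)
        traj.append(x)
    return traj

def embed(traj):
    """Stand-in for a learned representation: summary statistics that
    describe the behaviour while discarding the raw action sequence."""
    n = len(traj)
    mean = sum(traj) / n
    var = sum((v - mean) ** 2 for v in traj) / n
    return (mean, var, traj[-1] - traj[0])

def match_behaviour(target_code, gain=0.7, lr=0.05, eps=1e-4, iters=500):
    """Tune the hidden parameter so the machine's behaviour, as seen
    through the representation, matches the target's code."""
    for _ in range(iters):
        def f(g):
            return sum((a - b) ** 2
                       for a, b in zip(embed(rollout(g)), target_code))
        grad = (f(gain + eps) - f(gain)) / eps  # finite-difference feedback
        gain = min(max(gain - lr * grad, 0.1), 0.9)  # keep dynamics stable
    return gain
```

The learner here never sees the target's trajectory at all, only its code under `embed`; in the full setting that embedding would itself be learned, which is exactly where the representation-learning priority of the previous point bites.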
There is good evidence that such a setting is still tractable. For instance, it is well known that RL suffices for StarCraft II, and, more down to earth, neural cellular automata can learn to match target emoji images: all this is evidence that representation dependence, coordination, and mediating computational mechanisms are not fundamental obstacles. But there may be an underserved niche to explore that does not cleanly fall within established disciplinary boundaries.
Why does this matter?
Simple machines are cheap. A Braitenberg vehicle runs on a microcontroller; a comparable neural policy typically demands a GPU. If we can tune simple controllers to match complex behaviours, we unlock deployment at the edge: drones, warehouse robots, IoT actuators, all without the latency, power draw, or cloud dependency of neural inference.
Simple machines are fast to simulate. A digital twin governed by simple rules can be rolled forward millions of times for planning, scenario analysis, or reinforcement learning's inner loop. This matters for logistics and fleet coordination, where you need to simulate many agents interacting over long horizons, and for supply chain stress-testing, where you want to explore disruption scenarios faster than real time.
Simple machines are auditable. Regulated industries such as medical devices, autonomous vehicles, and aerospace face certification regimes that neural networks struggle to pass. A controller with twelve interpretable parameters is legible to a regulator in a way that a million-weight policy is not.
And simple machines compose. Unlike monolithic neural policies, Braitenberg-style controllers can be mixed, matched, and hierarchically combined. This modularity enables configurable behaviour libraries: off-the-shelf locomotion, obstacle avoidance, or target-seeking modules that customers can tune without retraining.
Besides being fun to watch, the demo above is a sketch proof of concept: neural, gradient-based feedback can shape simple machines to match complicated behaviours. The ongoing research question is how this scales.