7 Comments
galen

Why is point (4) probably infeasible in basically all cases?

Konrad Kording

Because we know that most model classes can approximate basically anything.

Will Dorrell

Thanks for posting this. I was confused about one aspect: why should it be worrying that a task-optimised RNN recovers the same set of phenomena?

Couldn't it just be that the RNN learns to be exactly the mathematical model we initially wrote down?

I think figures 4-6 of Sorscher, Mel, et al. 2023, A unified theory for the computational and mechanistic origins of grid cells, point to some evidence that this is what happens?

Konrad Kording

I think there is a confusion about the logic. The mathematical definition of attractors is not a process model; it is merely a property of the system. So I do not see how it could be a mechanism. And the RNN could be of the form of the neuroscience model that I gave, or pretty much any other form, since RNNs of pretty much any shape can approximate pretty much anything. Hence we cannot know the real mechanism.
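
To make the identifiability worry concrete in its most trivial form, here is a toy sketch (in Python/numpy; all sizes and weights are arbitrary assumptions): two RNNs with visibly different weight matrices produce exactly the same output for every input, simply because the neurons have been relabelled. The degeneracy argued for above is far broader than this relabelling symmetry, but even the toy case shows that input-output measurements alone do not pin down the circuit.

```python
import numpy as np

# Toy illustration: two RNNs with different weight matrices, identical behaviour.
# Sizes and weight scales are arbitrary assumptions for the sketch.
rng = np.random.default_rng(0)
N, T = 8, 50

# Original RNN: h_{t+1} = tanh(W h_t + U x_t), observed output y_t = C h_t.
W = rng.normal(scale=0.5, size=(N, N))
U = rng.normal(scale=0.5, size=(N, 1))
C = rng.normal(scale=0.5, size=(1, N))

# A "different" circuit obtained by permuting (relabelling) the neurons.
# Because tanh acts elementwise, it commutes with the permutation.
P = np.eye(N)[rng.permutation(N)]
W2, U2, C2 = P @ W @ P.T, P @ U, C @ P.T

def run(W, U, C, inputs):
    h = np.zeros(N)
    outputs = []
    for x_t in inputs:
        h = np.tanh(W @ h + U @ x_t)
        outputs.append(C @ h)
    return np.array(outputs)

inputs = rng.normal(size=(T, 1))
# Maximum difference between the two models' outputs is at numerical precision.
print(np.max(np.abs(run(W, U, C, inputs) - run(W2, U2, C2, inputs))))
```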

I think the Sorscher paper is really great. But I do not see how it advances the finding to a mechanism? Can you maybe unpack this?

Will Dorrell

Thanks for the response. I think this has partly answered my question, but perhaps I can explain what I understand and you can tell me where I mess up (sorry for the long post!):

By mathematical model I mean the more zoomed-in neuroscience one you refer to, with particular connectivity and velocity neurons etc. I see that you're making a (very reasonable) distinction between this and the more abstract model. So, my bad, I'm just referring to the neural model, not the abstract one. We all agree the neural model is a mechanistic model of a circuit?

And I agree, an RNN can presumably solve the task in many ways that aren't the neural model. I think I now understand your main point, which is a good one: a variety of RNNs might well show us other ways to solve the problem that would look the same to all experiments we've run so far, so we should run experiments that actually try to measure the local connectivity etc.

I guess the one place I still disagree is how much I should worry about this. There will always be theories that fit all the observed data but that I'm happy to discount. It could be that supernatural deity X is moving particles in such a way that physical theory Y looks reasonable, except in one small portion of the ocean where X happens to enjoy screwing with us. Discounting that theory even though we haven't tested every patch of ocean seems okay. Similarly, just because the alternative theory takes the shape of an RNN, I don't think we should necessarily care. Some RNNs have unreasonably large weights, or very large numbers of neurons, or concurrently implement Minecraft and track an angle; it seems reasonable to discount these.

Then which RNNs are reasonable? I guess the surprising thing is that training task-optimised neural networks in relatively assumption-free ways (e.g. not too many neurons, a bit of weight regularisation) seems to produce networks that solve the task in basically the same way as the neural model that was written down in yesteryear. I point to the Sorscher, Mel paper because it seems the best example of evidence of this (localised connectivity and shifted velocity neurons), though I'm sure there are others. That's evidence that at least one 'reasonable RNN' implements the neural model.

Then we just need to see if there are others. I agree the concept of 'reasonable RNN' is loose, but there are definitely limits to what an RNN the same size as the fly ring attractor can do, and it seems reasonable to regularise the weights and only think about a network after training to convergence. Under these conditions, I would not be too surprised if we're already quite close to identifying the reasonable RNN model, up to unavoidable indeterminacies (e.g. neuron swapping).
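
For concreteness, here is a minimal sketch (in Python/PyTorch) of the kind of task-optimised, lightly regularised training setup being described: a small RNN trained to integrate angular velocity, with an L2 penalty on the weights. The task, network size, and hyperparameters are illustrative assumptions rather than the setup of Sorscher et al.; whether ring-attractor-like structure emerges in the trained weights is the empirical question, and the connectivity analysis is only gestured at in the final line.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
N_NEURONS, SEQ_LEN, BATCH = 64, 100, 32  # "not too many neurons"

class AngularIntegratorRNN(nn.Module):
    """Small RNN that reads angular velocity and reports the integrated heading."""
    def __init__(self, n):
        super().__init__()
        self.rnn = nn.RNN(input_size=1, hidden_size=n, batch_first=True)
        self.readout = nn.Linear(n, 2)  # predict (cos theta, sin theta)

    def forward(self, velocity):
        states, _ = self.rnn(velocity)
        return self.readout(states)

def make_batch():
    # Random angular-velocity sequences; the target is the integrated heading.
    vel = 0.1 * torch.randn(BATCH, SEQ_LEN, 1)
    theta = torch.cumsum(vel, dim=1)
    return vel, torch.cat([torch.cos(theta), torch.sin(theta)], dim=-1)

model = AngularIntegratorRNN(N_NEURONS)
# "a bit of weight regularisation": an L2 penalty via weight_decay.
opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

for step in range(2000):  # train towards convergence
    vel, target = make_batch()
    loss = nn.functional.mse_loss(model(vel), target)
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, one would sort units by preferred heading and inspect the
# recurrent weights for localised, shifted connectivity (the hand-written model).
W_rec = model.rnn.weight_hh_l0.detach()
```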

Konrad Kording

I think the point you make is a very deep one: our measurements are compatible with a simple model, so why should we worry about all the more complicated models? E.g. in physics there could always be, at the microscopic level, an entity (say, a god) that moves particles around so that quantum mechanics appears to hold. Why would we need to worry about this?

But I think that the two domains, physics and neuroscience, are, deep down, quite different. In physics, the more complicated model is really not very appealing: after all, we have already done all the possible experiments, and the more complex model would be inconsequential. But in neuroscience, the more complex model can be tested experimentally (e.g. through perturbation studies, recording more neurons, etc.). As such we can move beyond what has already been measured, and we should expect that any serious effort put into falsifying the simplest model we have will do so. It is also consequential: if we want to, say, cure a disease by perturbations, a wrong model will wrongly predict what happens.

But I think in practice the distinction is even more problematic. Because there are so many alternative models in the RNN space, in particular when we are not recording all neurons, the simplest model will make no meaningful out-of-domain predictions. In many ways it would be entirely useless.

To recap, the model simplicity argument in a way really does not work in neuroscience. It does not work conceptually. It is not useful. And it does not even make meaningful predictions.

Will Dorrell

I can see the point you're making, but personally don't find it convincing.

For example, the alternative physical theory suggested above is consequential and experimentally testable: I just need to go to the proposed patch of ocean and perform basic physics tests. This might matter; maybe I want to build an oil well there, in which case it matters which physical rules are operating. Yet despite having testable, consequential differences, it would still be ludicrous to waste effort testing this hypothesis.

I think this situation is exactly analogous to some neuroscience, especially discussions of the fly ring attractor. I have never been presented with an alternative model that fits all the measured data in the way attractor models do. The model makes out-of-domain predictions: it was posited in the 90s, and its predictions have since been verified, both in the central complex and, more recently, in the entorhinal cortex (see fig. 3 of Vollan et al. 2025).

Don't get me wrong, there are places where I agree with you much more: if you are building exploratory RNN models of some otherwise arbitrary neural recordings, then the right way to proceed seems much closer to your approach (with analyses like those in Galgali et al. 2023). But in the case you specifically target (attractor models), I remain convinced that the alternative hypotheses you point to, even if they are RNNs, are largely of the 'god moving particles' type. I suspect this point will hold more broadly too. Of course, building viable alternative models is a great use of time, but I don't think it has happened yet, and in this case I wouldn't bet on one existing.
