You’re right that fMRI measures blood flow rather than direct neural activity, and the authors acknowledge that limitation. But the study doesn’t treat fMRI as a direct window into brain function. Instead, it proposes a predictive attention mechanism (PAM) that learns to selectively weight signals from different brain areas for the task of reconstructing perceived images from those signals.
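In code terms, the selective-weighting idea is roughly this (a toy sketch of my own, not the paper's implementation; all names and dimensions are made up): learn a relevance score per brain region, then pool the regions' features by those weights before reconstruction.

    # Toy sketch of region-wise attention (not the paper's actual PAM code).
    import torch
    import torch.nn as nn

    class RegionAttention(nn.Module):
        def __init__(self, feat_dim):
            super().__init__()
            self.score = nn.Linear(feat_dim, 1)      # one relevance score per region

        def forward(self, x):                        # x: (batch, regions, feat_dim)
            w = torch.softmax(self.score(x), dim=1)  # weights sum to 1 across regions
            return (w * x).sum(dim=1)                # weighted pool -> (batch, feat_dim)

    # e.g. 5 regions with 64 features each
    pooled = RegionAttention(64)(torch.randn(2, 5, 64))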
The “thermal imager” analogy might make sense in a different context, but here the model is explicitly designed to deal with those signal differences, and it works across both recording modalities (human fMRI and macaque intracranial recordings). If you’re curious, the paper is available here:
[0] https://www.biorxiv.org/content/10.1101/2024.06.04.596589v2....
The paper [0] doesn’t pretend otherwise. It trains a model (PAM) to learn which brain regions carry useful info for reconstructing images, and applies this to both fMRI data from humans and intracranial recordings from macaques. The two signal types are handled separately.
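To make "handled separately" concrete, here's a hedged sketch of the kind of architecture that implies (my assumption, not the paper's actual code): one encoder per modality, since fMRI voxels and intracranial channels have different shapes and statistics, feeding a shared reconstruction head.

    # Assumed architecture sketch, not taken from the paper.
    import torch
    import torch.nn as nn

    class TwoModalityDecoder(nn.Module):
        def __init__(self, fmri_dim, ephys_dim, hidden, out_dim):
            super().__init__()
            self.enc = nn.ModuleDict({
                "fmri": nn.Linear(fmri_dim, hidden),    # human fMRI voxels
                "ephys": nn.Linear(ephys_dim, hidden),  # macaque intracranial channels
            })
            self.head = nn.Linear(hidden, out_dim)      # shared reconstruction head

        def forward(self, x, modality):
            return self.head(torch.relu(self.enc[modality](x)))

    model = TwoModalityDecoder(fmri_dim=1000, ephys_dim=96, hidden=256, out_dim=512)
    features = model(torch.randn(8, 1000), "fmri")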
If you want an analogy, it’s less like tapping power lines and more like trying to figure out which YouTube video someone is watching by measuring heat on the back of their laptop every few seconds. There’s a pattern in there, but pulling it out takes work.
[0] https://www.biorxiv.org/content/10.1101/2024.06.04.596589v2....
The "fully process" part is part of the story though -- e.g. perhaps some reactions use the dorsal stream based on peripheral vision while ventral stream is still waiting on a saccade and focus to get higher resolution foveal signals. But though these different pathways in the brain operate at different speeds, they're both still very much in the brain.
https://assets.nautil.us/10086_6412121cbb2dc2cb9e460cfee7046...
https://nautil.us/the-strange-brain-of-the-worlds-greatest-s...
(the path starts at the back of the head, in V1, where visual input arriving via the optic nerve first reaches the cortex)
But yeah, reflexes are processed in the central nervous system (CNS), typically in the spinal cord or brainstem, not necessarily the brain.
Am I understanding this right? It seems that by reading areas of the brain, a machine can effectively act as a rendering engine, with per-pixel knowledge of colour, brightness, etc., based on an image the person is seeing? And AI is being used to help because this method is lossy?
This seems huge. Is there other terminology around this I can kagi to understand more?
AI is the method. They put somebody in a brain scanner and flash images on a screen in front of them. Then they train a neural network on the correlations between their brain activity and the known images.
To test it, you display unknown images on the screen and have the neural network predict the image from the brain activity.
Not onto known images -- onto the latent spaces of existing image networks. The recognition network gets a very approximate representation, which it then maps onto those latent spaces (which may or may not be equivalent), and the image network fills in the blanks.
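Roughly this two-stage pipeline, to make it concrete (a toy sketch; the names, shapes, and stand-in generator are mine, and real systems use a frozen pretrained generator such as a diffusion decoder): only the brain-to-latent mapping is trained, and the image network supplies all the visual detail.

    # Toy sketch of brain -> latent -> generated image (not any real system's code).
    import torch
    import torch.nn as nn

    brain_dim, latent_dim = 1000, 512

    # Stage 1: learned mapping from brain activity to the generator's latent space.
    to_latent = nn.Linear(brain_dim, latent_dim)

    # Stage 2: stand-in for a frozen pretrained image generator; in practice this
    # network "fills in the blanks" with detail the brain signal never contained.
    generator = nn.Sequential(nn.Linear(latent_dim, 3 * 64 * 64), nn.Tanh())

    brain = torch.randn(1, brain_dim)               # one trial of recorded activity
    image = generator(to_latent(brain)).view(1, 3, 64, 64)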
When you're using single-subject, well-framed images like this, they're obviously very predictable. If you showed something unexpected, like a teddy bear with blue skin, the network would probably just show you a normal-ish teddy bear. It also gets screwy when it doesn't have a well-correlated input, which is how you get those weird distortions. And it will be way off for anything that requires precision, like the actual outlines of an object, because the network is creating all that detail from nothing.
At least the stuff using a Utah array (a square grid of electrodes implanted in the cortex) is not transferable between subjects, and the fMRI stuff might not be either. These models can't see enough detail to know what is actually happening -- they only get glimpses of a small section of the process (Utah array) or very vague, indirect correlates of it (fMRI). They're all heavily overfit.
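The transferability point is easy to test with a protocol like this (a toy sketch on random stand-in data, not results from any paper): fit a decoder on subject A and score it on subject B; an overfit model's cross-subject score collapses.

    # Toy cross-subject transfer check with fabricated stand-in data.
    import numpy as np
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(0)
    # 200 trials x 1000 "voxels" per subject; 50-dim image-feature targets.
    X_a, Y_a = rng.normal(size=(200, 1000)), rng.normal(size=(200, 50))
    X_b, Y_b = rng.normal(size=(200, 1000)), rng.normal(size=(200, 50))

    decoder = Ridge(alpha=1.0).fit(X_a, Y_a)                      # fit on subject A only
    print("subject A (train) R^2:", decoder.score(X_a, Y_a))      # high: overfit
    print("subject B (transfer) R^2:", decoder.score(X_b, Y_b))   # collapses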
There are startups working on less invasive (e.g. headset-based) brain-computer interfaces (BCIs).
More importantly, these techniques operate on the V1, V4, and inferior temporal cortex areas of the brain. These areas fire in response to retinal stimulation regardless of what's happening in the rest of your brain. V1 in particular receives retinal input nearly directly, via a single relay in the thalamus. While deeper areas may be sympathetically activated by hallucinations etc., they aren't really tied to your conception of things. In general, if you want to read someone's thoughts, you'd look elsewhere in the brain.