0:06

Welcome to our first guest lecture in Computational Neuroscience.

We're honored to have with us today Fred Rieke, my colleague in the department

of physiology and biophysics at the University of Washington.

Fred was an undergraduate and graduate student at the University of California

in Berkeley. His graduate work [UNKNOWN] focused on

theoretical studies of signal processing in the nervous system.

Fred then went on to a post-doc with Eric Schwartz at the University of Chicago,

working on mechanisms of synaptic transmission in the retina.

He then did a second post-doc with Denis Baylor at Stanford, where he worked on

light transduction in photoreceptors. So, Fred is truly remarkable in having

made a transition from truly elegant theoretical work, which led to the

publication of what's been a highly influential book, Spikes, to truly elegant

experimental work. The creativity and excellence of Fred's

work has led to his recognition as an investigator at the Howard Hughes Medical

Institute. In his work, Fred combines a mastery of

technique with a beautiful clarity of thought.

And we're delighted to give you this opportunity to hear from him a little bit

about his research. >> Thanks for the introduction.

My lecture today will be about vision in starlight and the mechanisms that let us

see under these conditions. We know, from a long history of

behavioral measurements, that our ability to see under these conditions is limited

more by the division of light into discrete photons,

the physical nature of light itself, than it is by biological noise and

inefficiencies. As we'll see in a minute, that raises

some general computational issues. First, let me introduce the retina

itself, which is where the visual process begins.

2:44

So, the situation we're thinking about is that a few rods out of a pool of a

thousand absorb photons, while all of the rods are generating noise.

We want to know how to pool signals across those rods so as to reliably extract

the signals from those rods that absorbed photons and reject the noise from the

remaining rods. You would normally think of averaging as a good strategy for

extracting weak signals, but under these circumstances averaging is a

disaster. That's because the signal is not

uniformly spread across the array of detectors.

It's a little bit like a situation where you're in a football stadium.

There are a thousand people yelling at you, and you care about only a few of them.

Under those circumstances, a good strategy for extracting signals from

those few people would not be to stick a microphone at the 50-yard

line and average across everybody, across all those

sources of sound in the stadium. Instead, you would need to go seat by seat

and make a selection, is this likely to be a person I care about, based on

some prior information about them. Say the people you care about tend to

be a little bit louder than average. So you go seat by seat, make a selection

about which people to retain and which to reject, and then average the

resulting signals. So, we're looking for something analogous here.

That is, we're looking for some kind of threshold, which selectively retains

signals from those rods likely to be absorbing photons

and rejects noise from the remaining rods.

The behavioral consequences of getting this right are fairly extreme; that's

because the signal is so sparse. If we can reject noise from the 99.9% of

rods that are just generating noise and selectively retain signals from the on

the order of 0.1% of rods that absorb photons,

We stand to win considerably. Okay.
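The advantage of seat-by-seat selection over blind averaging can be sketched in a small simulation. This is not from the lecture; the rod count, noise level, threshold, and single-photon amplitude are illustrative assumed values:

```python
import numpy as np

rng = np.random.default_rng(0)

n_rods = 1000        # pool of rods converging on a downstream cell
n_active = 1         # rods that absorb a photon (~0.1%, a sparse signal)
signal_amp = 1.0     # single-photon response amplitude (assumed units)
noise_sd = 0.25      # continuous noise in every rod (assumed value)
n_trials = 2000

def pooled_response(threshold=None):
    """Pooled response on one signal trial and one blank trial."""
    r_sig = rng.normal(0.0, noise_sd, n_rods)
    r_sig[:n_active] += signal_amp               # a few rods absorb photons
    r_blank = rng.normal(0.0, noise_sd, n_rods)  # no photons anywhere
    if threshold is not None:                    # select before pooling
        r_sig = np.where(r_sig > threshold, r_sig, 0.0)
        r_blank = np.where(r_blank > threshold, r_blank, 0.0)
    return r_sig.sum(), r_blank.sum()

def dprime(threshold=None):
    """Discriminability of signal vs blank trials from the pooled output."""
    s, b = np.array([pooled_response(threshold) for _ in range(n_trials)]).T
    return (s.mean() - b.mean()) / np.sqrt(0.5 * (s.var() + b.var()))

d_average = dprime(threshold=None)  # plain averaging across all rods
d_select = dprime(threshold=0.75)   # threshold rejects pure-noise rods
```

Averaging buries the one-rod signal under the summed noise of a thousand rods, while thresholding each rod before pooling recovers most of the discriminability.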

So, what we're looking for, then, is some kind of thresholding non-linearity which

retains signals from those rods that absorb photons and rejects noise from the

remaining rods. Let me mention that this is a general issue,

one that comes up in many other cases in the nervous system:

cases in which you have convergence of many inputs onto a downstream cell, and a

small subset of those inputs are active while all of the inputs are generating

noise. There are a number of conceptual and

technical advantages for studying this issue in the context of photon detection

in the retina. One of those is that we have access to

the signal and noise properties of the rod photoreceptors, so we can measure

the responses of the rods to single photons.

We can measure the noise in the rod responses, and we can summarize those

measurements by constructing distributions that capture the probability that a

rod generates a given amplitude of response. We can plot that as probability

versus amplitude: in black, for those rods that failed to

absorb a photon and are just generating noise,

and in red, for those rods that absorbed a photon and are generating a

single-photon response. Those give us the basis for making

theoretical predictions about how to make a selection between signal and noise.

In particular, we might think that this thresholding non-linearity should come

in, slice out and eliminate responses from those rods that are generating

noise, and retain responses from those rods that

generate single-photon responses. It's nice because we have a theoretical

basis for what an appropriate readout might be, for the rod array under these

conditions. We also know a great deal from the

anatomy about where such a thresholding non-linearity might be implemented.

In particular, the rod signals traverse the retina through a specialized circuit.

The first cell in that circuit is known as a rod bipolar cell.

And rod bipolar cells receive input from multiple rods.

So, that means that they have already combined signals from multiple rods.

If they do so in a linear fashion, in other words if they equally weight inputs

from rods that are generating noise and rods that are generating signal,

then you've already begun to average the rod responses and you've begun to

inextricably mix signal and noise. So, the last opportunity,

where we have access to the responses of individual rod photoreceptors,

and where we have the full capability of selecting those rods that

absorbed a photon and are generating a single-photon response,

versus those rods that failed to absorb a photon and are generating only noise,

is here at the synapse between the rods and the rod bipolar cells.

So, anatomically we have a good prediction about where such a

thresholding non-linearity might occur. It should occur at the synapse between

rods and rod bipolar cells. Indeed if we record from rod bipolar

cells, we see evidence for such a thresholding non-linearity.

We can now take the measured distribution of rod signal and noise, and ask what the

appropriate non-linearity is to predict the bipolar responses.

I summarize that here. So, again, this is plotting

probability versus amplitude: the distribution of responses from rods that are

generating noise, and the distribution of responses from

rods that absorbed photons. On the same scale, I've plotted the

estimated non-linearity at the synapse between the rods and the rod bipolar

cells; that's here in blue.

I've plotted the gain of the non-linearity versus the amplitude.

So, you can think of this non-linearity as: everything to the left gets eliminated.

All of this noise, and a good chunk of the single-photon response distribution,

gets eliminated, and only those responses that are to the

right of the transition of the non-linearity between a gain of 0 and a gain of 1

are retained. So, we see evidence for a non-linear

threshold between rods and rod bipolar cells, which is roughly what we predicted.
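As a rough sketch of what such a gain-of-0-or-1 non-linearity does to the two response distributions, we can treat them as Gaussians. All parameters here, including the threshold position theta, are assumed illustrative values in units of the mean single-photon response, not the measured distributions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed Gaussian stand-ins for the measured rod response distributions,
# in units of the mean single-photon response.
noise = rng.normal(0.0, 0.25, 100_000)   # rods that absorbed no photon
signal = rng.normal(1.0, 0.35, 100_000)  # rods that absorbed one photon

def synaptic_nonlinearity(r, theta=0.85):
    """Gain 0 below theta, gain 1 above it: a sketch of the thresholding
    non-linearity at the rod-to-rod-bipolar synapse. theta is an assumed
    position, well up into the single-photon response distribution."""
    return np.where(r > theta, r, 0.0)

# Fraction of each distribution that survives the threshold:
noise_pass = np.mean(synaptic_nonlinearity(noise) > 0)    # noise let through
signal_pass = np.mean(synaptic_nonlinearity(signal) > 0)  # photons retained
```

With a threshold this far up into the single-photon distribution, nearly all of the noise is rejected, at the cost of discarding a substantial fraction of the genuine single-photon responses.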

The interesting thing here is that we would not have predicted the location of

this non-linearity: in particular, it is located well up into

the single-photon response distribution. Naively, we might look at these

distributions and say,

well, I should really put the threshold right here, at the crossing point between the

noise distribution and the signal distribution.

That would be choosing a location for this thresholding non-linearity which

makes a decision based on the likelihood of a given amplitude of response.

If a given amplitude of response is more

likely to have arisen from the distribution of single-photon responses,

we would retain it. If it's more likely to have arisen

from the noise distribution, we would eliminate it.

Instead, this thresholding non-linearity seems to be pushed off to the right.

In other words, we're eliminating many single-photon responses, which seems like

exactly the opposite of what we'd like to do to build a system that operates at

low light levels. However, this particular way of plotting

the data is somewhat misleading. What we've not accounted for here is

the prior probability that a rod absorbs a photon.

You can think about that as follows: there is some area under the noise curve,

which represents the probability that the rod is generating noise,

and there is some area under the single-photon response distribution curve, which

is the probability that the rod absorbs a photon.

11:39

However, we really want to think about these distributions as they would appear

near visual threshold, when something like one in ten thousand rods absorbs a

photon. And that's what I've plotted over here.

So, now the areas underneath the noise distribution and the signal distribution

have been scaled to represent this prior probability that something like 1 in

10,000 rods absorbs a photon. So, the area under the signal

distribution here is 10,000 times smaller than the area under the noise

distribution. That shifts the crossing point of the

signal and noise distributions far out to the right.

And it shifts it to a point that's very close to the location of the transition

of the non-linearity between a gain of 0 and a gain of 1.

In other words, you simply apply a rule like maximum likelihood: you get a

given amplitude of response from the rod, and you associate that response with

noise if that amplitude is more likely to have originated from the

prior-weighted noise distribution, and with signal

if it is more likely to have originated from the prior-weighted signal

distribution. That simple rule can predict the position of this non-linearity,

and predict in particular that you should throw away many single-photon responses.

You should do that because the cost of accepting those amplitudes, down in here,

is to allow lots of noise to come through the system,

much more noise than you would like to allow through.
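We can check this shift numerically. Treating the two distributions as Gaussians with assumed illustrative parameters (these are stand-ins, not the measured distributions), the amplitude at which the prior-weighted densities cross moves far to the right when the prior drops from 1/2 to 1 in 10,000:

```python
import numpy as np

# Assumed Gaussian stand-ins for the rod response distributions,
# in units of the mean single-photon response.
mu_noise, sd_noise = 0.0, 0.25    # rods that absorbed no photon
mu_signal, sd_signal = 1.0, 0.35  # rods that absorbed one photon

def gauss(x, mu, sd):
    """Gaussian probability density."""
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def crossing(prior_signal):
    """Amplitude where the prior-weighted signal density first overtakes
    the prior-weighted noise density, found on a fine grid."""
    x = np.linspace(0.0, 3.0, 30001)
    diff = (prior_signal * gauss(x, mu_signal, sd_signal)
            - (1 - prior_signal) * gauss(x, mu_noise, sd_noise))
    return x[np.argmax(diff > 0)]

naive = crossing(0.5)       # equal priors: threshold near the midpoint
starlight = crossing(1e-4)  # ~1 in 10,000 rods absorbs a photon
```

With equal priors the crossing sits below the midpoint between the two means, but with a 1-in-10,000 prior it moves out beyond the mean single-photon amplitude, consistent with a non-linearity located well up into the single-photon response distribution.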

So, the basic bottom line here is that this is a nice example in which the prior

probability has an important impact on how we think about signal detection

theory, and about appropriate strategies

for extracting sparse signals from many noisy inputs.