Missing LFP channels, and scale factor

I have been working with the LFP data from the Allen Institute Neuropixels Visual Behaviour dataset. I have two questions about working with the LFP data.

1) Scale factor

First, the tutorial on LFP analysis (here) does not seem to scale the data at all, but I heard that the Allen Institute Neuropixels data has a 0.195 byte/volt scale factor.

2) Missing channels (channels described in the metadata that have no LFP data)

Second, some channels seem to be missing from the recordings. For instance, in the first recording, probeE has 73 channels in CA1 according to the metadata, but only 18 of them exist in the LFP data.


# get the metadata about the probes and channels from the first recording, and its LFP data
session = cache.get_session_data(sessions.index.values[0])
probe_id = session.probes[session.probes.description == 'probeE'].index.values[0]
lfp = session.get_lfp(probe_id)

lfp.shape

returns:
(12028993, 95)

But although Neuropixels probes have 384 channels (and one reference channel), there are only 95 channels in total in the LFP data from that recording. Likewise, if I check the number of channels in a brain region, for instance CA1, as listed by the structure annotations in the session object that stores the metadata, there are far fewer channels in the LFP object.

# check the number of channels from this probe in CA1
ca1_chans = session.channels.structure_acronym[(session.channels.probe_id == probe_id) & (session.channels.structure_acronym == 'CA1')]
ca1_chans

returns:
id
850258856    CA1
850258862    CA1
850258868    CA1
850258874    CA1
850258880    CA1
            ...
850258988    CA1
850258992    CA1
850258994    CA1
850258998    CA1
850259000    CA1
Name: structure_acronym, Length: 73, dtype: object

So in the metadata there are 73 channels from this region.

# now we check for matching channel identifiers
ca1_idx = np.isin(lfp.channel.values, ca1_chans.index.values)
ca1_idx = lfp.channel.values[ca1_idx]
ca1_idx.shape

returns:

(18,)

So although the metadata indicates that probeE has 73 channels in CA1, the LFP data has only 18 of those 73. Have I simply chosen a bad probe? Please help me understand why there is missing LFP data. I checked the next session and it was a similar story: only 19 channels in the LFP data. This was in session…
session = cache.get_session_data(sessions.index.values[1])

So I am asking about sessions 715093703 and 719161530.

Hi there! If you load the data via the AllenSDK, it will be returned in units of volts. The scaling factor is only needed if you’re directly reading the raw data.
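For anyone who does read the raw data directly, a minimal sketch of the conversion is below. It assumes the raw samples are stored as int16 and that the 0.195 figure from the question is in microvolts per bit; both are assumptions to check against the documentation for your files. `raw_to_volts` is a hypothetical helper, not part of the AllenSDK.

```python
import numpy as np

# Assumption: 0.195 microvolts per bit (the figure quoted in the question).
UV_PER_BIT = 0.195

def raw_to_volts(raw, uv_per_bit=UV_PER_BIT):
    """Convert raw integer samples to volts (hypothetical helper)."""
    # Cast to float first so the multiplication cannot overflow int16.
    return raw.astype(np.float64) * uv_per_bit * 1e-6

# Made-up samples with shape (time, channels).
raw = np.array([[0, 100], [-200, 1000]], dtype=np.int16)
volts = raw_to_volts(raw)
print(volts)
```

Data returned by `session.get_lfp(...)` should not be passed through this, since it is already in volts.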

Regarding the missing channels, we only save every 4th LFP channel in the NWB files (and the data is also downsampled from 2500 Hz to 1250 Hz). This is because there’s a lot of redundant information on adjacent LFP channels, and excluding channels makes the resulting files much smaller. We also exclude any channels that are outside the brain. So going from 384 channels to 96 or fewer in total is expected.

Metadata about all of the probe channels is still stored in the file, because spike waveforms are saved for the entire span of the probe.
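As a rough illustration of those counts, the sketch below keeps every 4th of 384 channel indices and then drops the ones flagged as outside the brain. The starting offset and the `in_brain` mask are made up for illustration; the actual selection used for the NWB files may differ.

```python
import numpy as np

n_channels = 384
step = 4  # every 4th channel is saved

all_idx = np.arange(n_channels)
kept = all_idx[::step]  # assumption: the selection starts at channel 0
print(kept.size)

# Hypothetical mask: pretend the top 4 channels sit outside the brain.
in_brain = all_idx < 380
kept_in_brain = kept[in_brain[kept]]
print(kept_in_brain.size)
```

With a loaded session, the metadata rows for the channels that do have LFP data can probably be selected with something like `session.channels[session.channels.index.isin(lfp.channel.values)]`, assuming the channels table is indexed by channel id as in the question's code.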

Awesome, thanks very much for clearing that up! Do you recommend also subtracting the median value for each channel to remove noise? Also, if I’m computing CSD along the probe, would the distance between channels be 60 micrometers (20 x (4-1)), correct?

Angus

The data pre-processing steps include subtracting the median value of the channels outside the brain to remove common noise, so you don’t need to do that again.

The distance between the channels is 40 microns, since there are 20 microns between rows, and taking every 4th channel results in skipping a row.
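The spacing arithmetic can be sketched as follows, assuming two channels per row (the layout implied by every 4th channel skipping a row), together with a simple second-spatial-difference CSD estimate. `csd_second_difference` is a hypothetical helper: it omits the conductivity factor and treats the saved channels as a single evenly spaced column.

```python
import numpy as np

ROW_PITCH_UM = 20      # from the reply: 20 microns between rows
CHANNELS_PER_ROW = 2   # assumption: two recording sites per row
STEP = 4               # every 4th channel is saved

# Every 4th channel is STEP / CHANNELS_PER_ROW = 2 rows away.
spacing_um = ROW_PITCH_UM * STEP // CHANNELS_PER_ROW
print(spacing_um)

def csd_second_difference(lfp, spacing_m):
    """Negative second spatial difference along the channel axis (axis 1).

    lfp: array of shape (time, channels), in volts.
    Returns an array of shape (time, channels - 2); conductivity is omitted.
    """
    return -(lfp[:, 2:] - 2 * lfp[:, 1:-1] + lfp[:, :-2]) / spacing_m**2

# Made-up data: a potential that is quadratic in depth has constant curvature.
depth_m = np.arange(10) * spacing_um * 1e-6
lfp = np.tile(depth_m**2, (5, 1))
csd = csd_second_difference(lfp, spacing_um * 1e-6)
print(csd.shape)
```

Using 60 microns instead of 40 here would shrink every CSD value by a factor of (40/60)^2, so the spacing matters for the magnitude but not for the sign or location of sinks and sources.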