Mean waveforms of Neuropixels Visual Coding Dataset

Hello!
I’ve downloaded some of the session files for the Neuropixels dataset. Many thanks for making this data available!
Question: If the RMS measurements for the waveform of a cell are generated by randomly sampling 1000 times from the pool of waveforms for an individual putative cell, why are there fewer than 1000 waveforms for each cell? For example, in session 715093703, the cell with id 950931751 has 383 waveforms that can be aligned and averaged, which is far fewer than 1000.
Hopefully I’ve understood the dataset correctly so far!
Eric

Hi Eric,

The mean waveform for each cell is a matrix with dimensions of channels x time samples. It contains about 2 ms of data from all of the channels of the probe, averaged across 1000 individual waveforms. So the 383 refers to the number of channels on the Neuropixels probe, not the number of waveforms used for averaging.
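
To make the shapes concrete, here is a minimal sketch using synthetic data. It is not the code used to build the dataset. The 1000 waveforms and 383 channels come from the discussion above; the 60 time samples (roughly 2 ms at an assumed 30 kHz sampling rate) and the random values are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

n_waveforms = 1000  # spikes sampled per unit (from the thread)
n_channels = 383    # channels in the mean waveform (from the thread)
n_samples = 60      # assumption: ~2 ms at an assumed 30 kHz sampling rate

# Each individual waveform is itself a (channels x samples) snippet.
snippets = rng.standard_normal(
    size=(n_waveforms, n_channels, n_samples), dtype=np.float32
)

# Averaging over the waveform axis removes it, leaving channels x samples.
mean_waveform = snippets.mean(axis=0)

print(mean_waveform.shape)  # (383, 60)
```

The waveform count disappears in the average, so the first dimension you see is the channel count.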

The waveform is stored as an xarray DataArray, to allow you to access the data using semantic indexing. For example, if you want to pull out the waveform for only the peak channel, you can use:

peak_channel = session.units.loc[unit_id].peak_channel_id
wv = session.mean_waveforms[unit_id].loc[{'channel_id' : peak_channel}]

However, it’s also useful to look across a wider range of channels, because there’s information about the cell type in the direction and speed of the waveform propagation. See this paper for more details.
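
As a rough illustration of what "direction and speed of propagation" could mean in practice, here is a sketch that finds the trough time on each channel near the peak and fits a line through those times. The function names, the half_width parameter, and the synthetic waveform are all hypothetical, and this is not necessarily the method used in the paper. Converting the slope into a physical speed would also need the channel spacing and sampling rate, which are not shown here.

```python
import numpy as np

def trough_latencies(mean_waveform):
    """Return the sample index of the most negative value on each channel."""
    return np.argmin(mean_waveform, axis=1)

def propagation_slope(mean_waveform, peak_channel, half_width=5):
    """Fit trough latency against channel index near the peak channel.

    Returns the slope in samples per channel; its sign gives the direction.
    """
    lo = max(peak_channel - half_width, 0)
    hi = min(peak_channel + half_width + 1, mean_waveform.shape[0])
    channels = np.arange(lo, hi)
    latencies = trough_latencies(mean_waveform[lo:hi])
    slope, _intercept = np.polyfit(channels, latencies, 1)
    return slope

# Synthetic waveform (assumption): a Gaussian trough that arrives one sample
# later on each successive channel and is largest on channel 10.
n_channels, n_samples = 21, 60
t = np.arange(n_samples)
waveform = np.zeros((n_channels, n_samples))
for ch in range(n_channels):
    center = 20 + ch
    amplitude = np.exp(-((ch - 10) / 4.0) ** 2)
    waveform[ch] = -amplitude * np.exp(-((t - center) / 2.0) ** 2)

peak_channel = int(np.argmin(waveform.min(axis=1)))
slope = propagation_slope(waveform, peak_channel)
print(peak_channel, slope)
```

With real data you would pass the values of the unit's mean waveform (channels x time samples) instead of the synthetic array.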

Hi Josh,
Thanks for the feedback. I appreciate the input, and it’s good to know that I hadn’t understood the dataset as clearly as I thought!
Thank you,
Eric