Interpreting df/f trace data and finding stimulus onset

matthewchen37 · June 8, 2020, 10:47pm

I’m using the AllenSDK as part of my research and specifically want to use the Df/f trace data. When I plot out Df/f vs time (using get_dff_traces() and saving the timestamps and the trace data as in the tutorial https://allensdk.readthedocs.io/en/latest/_static/examples/nb/brain_observatory.html)
I get a plot for a cell like this:

[Image: dff_traces_cell_specimen_id_517406566]

Some data about this specific experiment and cell, taken from its metadata file in case you’re wondering:

‘age_days’: 108,
‘cre_line’: ‘Cux2-CreERT2/wt’,
‘device’: ‘Nikon A1R-MP multiphoton microscope’,
‘device_name’: ‘CAM2P.2’,
‘excitation_lambda’: ‘910 nanometers’,
‘experiment_container_id’: 511507650,
‘fov’: ‘400x400 microns (512 x 512 pixels)’,
‘genotype’: ‘Cux2-CreERT2/wt;Camk2a-tTA/wt;Ai93(TITL-GCaMP6f)/Ai93(TITL-GCaMP6f)’,
‘imaging_depth_um’: 175,
‘indicator’: ‘GCaMP6f’,
‘ophys_experiment_id’: 501794235,
‘pipeline_version’: ‘3.0’,
‘session_start_time’: datetime.datetime(2016, 2, 8, 14, 9, 15),
‘session_type’: ‘three_session_B’,
‘sex’: ‘male’,
‘specimen_name’: ‘Cux2-CreERT2;Camk2a-tTA;Ai93-222424’,
‘targeted_structure’: ‘VISp’

I understand that this plot graphs out all of session B, but is there any way I can slice the data to only show the df/f for static gratings? The stimulus epoch table that I get from this experiment does not line up with the timestamps from the df/f data (the epoch table goes from 747 to 113623 while the dff_trace timestamps go from around 0 to 4000).

stimulus epoch table:
stimulus start end
0 static_gratings 747 15198
1 natural_scenes 16102 30550
2 spontaneous 30700 39579
3 natural_scenes 39580 54028
4 static_gratings 54931 69378
5 natural_movie_one 70281 79310
6 natural_scenes 80213 96091
7 static_gratings 97370 113623

Also, is there a way to determine at what specific time a stimulus is given relative to the timestamps from the df/f trace, specifically for static gratings? My research mentor recommended that I get approximately 3 seconds of df/f before the “static grating stimulus onset” and 3 seconds after, in a process called “epoching”.

I apologize if any of this sounds confusing. I’ve just been learning about this over the past week and any clarifications would be helpful!

saskiad · June 10, 2020, 11:02pm

The stimulus epoch table and the stimulus table provide indexes into the trace itself. So when the start & end for static_gratings is 747 and 15198, you can plot this epoch with

plt.plot(dff_trace[cell_id, 747:15198])

The imaging was acquired at 30Hz, so to add 3 seconds before and after, you would add 90 frames (3 s × 30 frames/s) to either end.
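
For reference, here is a minimal sketch of the slicing and epoching described above, run on a synthetic trace rather than real data. It assumes the start and end values are frame indices into the trace and a frame rate of about 30 Hz, so 3 seconds is 3 × 30 = 90 frames. The function name epoch_around_onsets and the example onsets are hypothetical; with real data the trace would be one row of the dF/F array and the onsets would come from the start column of the static gratings stimulus table.

```python
import numpy as np

FRAME_RATE_HZ = 30  # approximate acquisition rate quoted in the thread


def epoch_around_onsets(trace, onset_frames, pre_s=3.0, post_s=3.0,
                        frame_rate=FRAME_RATE_HZ):
    """Cut a fixed window around each onset frame of a 1-D trace.

    Onsets whose window would run past either end of the trace are skipped.
    Returns an array of shape (n_kept_onsets, pre_frames + post_frames).
    """
    pre = int(round(pre_s * frame_rate))
    post = int(round(post_s * frame_rate))
    windows = []
    for onset in onset_frames:
        lo, hi = onset - pre, onset + post
        if lo < 0 or hi > len(trace):
            continue  # incomplete window, drop it
        windows.append(trace[lo:hi])
    return np.array(windows)


# Synthetic stand-in for one cell's dF/F trace (not real data).
rng = np.random.default_rng(0)
trace = rng.normal(size=20000)

# Whole first static_gratings epoch (frames 747 to 15198) padded by 3 s.
pad = 3 * FRAME_RATE_HZ
padded_epoch = trace[max(0, 747 - pad):min(len(trace), 15198 + pad)]

# Per-trial epochs around hypothetical onset frames.
onsets = [747, 755, 762, 19990]  # the last one is too close to the end
epochs = epoch_around_onsets(trace, onsets)
```

Averaging epochs over axis 0 would then give a mean response aligned to stimulus onset, with the onset at column 90.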

matthewchen37 · June 11, 2020, 5:43pm

Thank you so much for the reply! In that case, if I want to access mean df/f for each of the 50 trials like this graph from the Stimulus documentation:

Are there any functions that can plot this graph? Or extract the pre-calculated mean df/f data? Or would I have to calculate the mean df/f myself?

EDIT: I’ve been trying to plot this graph using plot_sg_traces from: https://alleninstitute.github.io/AllenSDK/allensdk.brain_observatory.brain_observatory_plotting.html#allensdk.brain_observatory.brain_observatory_plotting.plot_sg_traces

but seem to be getting an error whenever I run the code? Here is the relevant code:

sg = StaticGratings(expDataSet)
stim_table = sg.stim_table
print(stim_table)
observ_plot.plot_sg_traces(sg, "sg_traces")

where expDataSet is the data set from the experiment and observ_plot is allensdk.brain_observatory.brain_observatory_plotting

This is my error message:

TypeError: only size-1 arrays can be converted to Python scalars

The above exception was the direct cause of the following exception:

Traceback (most recent call last):

  File "C:\Users\Matthew Chen\Anaconda3\lib\site-packages\pandas\core\nanops.py", line 69, in _f
    return f(*args, **kwargs)

  File "C:\Users\Matthew Chen\Anaconda3\lib\site-packages\pandas\core\nanops.py", line 125, in f
    result = alt(values, axis=axis, skipna=skipna, **kwds)

  File "C:\Users\Matthew Chen\Anaconda3\lib\site-packages\pandas\core\nanops.py", line 762, in nanvar
    avg = _ensure_numeric(values.sum(axis=axis, dtype=np.float64)) / count

  File "C:\Users\Matthew Chen\Anaconda3\lib\site-packages\numpy\core\_methods.py", line 38, in _sum
    return umr_sum(a, axis, dtype, out, keepdims, initial, where)

ValueError: setting an array element with a sequence.

EDIT #2: Also, what does the variable in sg.interlength mean? Is this a time interval?

saskiad · July 2, 2020, 12:21am

Hi - sorry for the delay!

The df/f per trial and the mean df/f per trial are both pre-calculated for you and should be easy to access. Within the analysis object (sg) there are two dataframes called “sweep_response” and “mean_sweep_response”.

sg = StaticGratings(expDataSet)
sweep_response = sg.sweep_response
mean_sweep_response = sg.mean_sweep_response

The sweep response contains the DF/F for each trial for each cell in a session. It includes ~1 second prior to the trial start, and ~1 second after it ends. The columns are different cells, with the cell index as the column key as a string. The rows are individual trials, with the same index as the stim table.

The mean sweep response is structured the same way, in terms of columns and rows, but contains the mean DFF during each trial as described in the white paper.
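
To make the layout concrete, here is a small sketch on a synthetic dataframe built to mimic the structure described above: one row per trial, one column per cell keyed by a string index, and an array of dF/F samples in each entry. The frame counts (pre_frames, trial_frames, post_frames) and the helper trial_mean are hypothetical, and the window used here is an assumption, not necessarily the one the SDK uses for mean_sweep_response.

```python
import numpy as np
import pandas as pd

# Hypothetical window sizes in frames (about 1 s before and after at 30 Hz).
pre_frames, trial_frames, post_frames = 30, 7, 30
n_trials = 4
rng = np.random.default_rng(1)


def fake_column(n):
    """Build an object column holding one synthetic sweep array per trial."""
    col = np.empty(n, dtype=object)
    for i in range(n):
        col[i] = rng.normal(size=pre_frames + trial_frames + post_frames)
    return col


# Synthetic stand-in for sg.sweep_response: cell columns "0" and "1",
# plus the running speed column "dx".
sweep_response = pd.DataFrame({
    "0": fake_column(n_trials),
    "1": fake_column(n_trials),
    "dx": fake_column(n_trials),
})


def trial_mean(sweep):
    """Mean over the assumed stimulus window of one sweep."""
    return float(np.mean(sweep[pre_frames:pre_frames + trial_frames]))


# Reduce every cell/trial entry to a single number, skipping running speed.
cell_columns = [c for c in sweep_response.columns if c != "dx"]
mean_response = sweep_response[cell_columns].apply(lambda col: col.map(trial_mean))
```

The result has the same rows and cell columns as the input, with one scalar per trial and cell, which is the shape described for the mean sweep response.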

I’ll need to dig into the error that you’ve encountered above, but hopefully this helps you do what you’re trying to do.

matthewchen37 · July 9, 2020, 11:04pm

Hi, thank you for the reply!

When looking at the sweep response, for experiment 511507650 I see 154 columns meaning 154 cells, but when I view the experiment online, https://observatory.brain-map.org/visualcoding/search/cell_list?experiment_container_id=511507650&sort_field=p_sg&sort_dir=asc, filtering for the experiment, the cell list shows 305 cells. Why are there different cell numbers for the same experiment?

saskiad · July 9, 2020, 11:22pm

When you look at the experiment online, you see all of the cells that were identified for any of the three sessions that make up that experiment container. If you recall, for each field of view, we imaged three different times using different stimuli. We match cells across the sessions, and some cells are found in all three sessions, some in any combination of 2 of the 3 sessions, and some just in one session. You should notice on the website that for some cells, for some stimuli, it says NA which means that cell wasn’t identified for the session that included that stimulus. The sweep response dataframe only includes cells that were identified during the session of that particular stimulus.
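
As a toy illustration of why the two counts differ, the container-level list behaves like a union of the per-session cell sets. The IDs below are made up and the set names are hypothetical.

```python
# Made-up cell IDs for the three sessions of one experiment container.
session_a = {101, 102, 103, 104}
session_b = {102, 103, 105}
session_c = {103, 104, 105, 106}

# The website lists every cell found in any session of the container.
container_cells = session_a | session_b | session_c

# Cells matched in all three sessions.
in_all_three = session_a & session_b & session_c

# A session B dataframe would only have columns for session B cells.
missing_from_b = container_cells - session_b
```

Here the container has 6 cells in total while session B alone has 3, in the same way that the website shows 305 cells and the session B sweep response shows 154.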

(Also, just in case you hadn’t noticed this yet, one column in the sweep response is the running speed, and is called “dx”).

matthewchen37 · July 20, 2020, 8:50pm

That makes sense, thanks for the response! In regards to the running speed, I noticed there were negative values included when I do data_set.get_running_speed(). Though, I thought that speed was a scalar quantity so speed should always be positive? I noticed that the speed is measured via a wheel/disk the mouse is placed on. Is this true for all stimuli? If so, do the positive and negative values correspond to counterclockwise and clockwise directions in centimeters/sec based on the rotational speed of the wheel?

saskiad · July 21, 2020, 5:32pm

Yeah, the wheel rotation as the mouse runs forward results in positive speeds. If the mouse pushes the wheel backwards, the speeds will be negative. Usually that happens when a mouse is rocking the wheel back and forth a bit. I don’t think I’ve seen a mouse actually run backwards in these data.
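
If only the magnitude matters for an analysis, the signed values can be handled along these lines. This is a sketch on a made-up array; whether to take the absolute value or clip negative values to zero is an analysis choice, and the NaN entry is included on the assumption that real running traces may have missing samples.

```python
import numpy as np

# Made-up signed running speed in cm/s (positive = forward wheel rotation).
speed = np.array([0.0, 2.5, -0.4, 0.3, np.nan, 12.0, -0.1])

# Drop missing samples before summarising.
valid = speed[~np.isnan(speed)]

# Fraction of samples where the wheel moved backwards.
backward_fraction = np.mean(valid < 0)

# Option 1: treat speed as a magnitude.
magnitude = np.abs(valid)

# Option 2: treat small backward rocking as standing still.
forward_only = np.clip(valid, 0, None)
```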

RodrigoSandon · July 31, 2020, 5:33pm

Hello Dr. de Vries,

I have a question about the number of experiments currently available in the Brain Observatory (456) versus the number mentioned in your paper “A large-scale standardized physiological survey reveals functional organization of the mouse visual cortex” (432). Were more experiments added after the paper was published, or were some experiments omitted? If some experiments were omitted from the paper, why?

Thank you,

Rodrigo

saskiad · August 3, 2020, 4:55pm

Hi Rodrigo,

It’s a little of both. New data was added between the initial paper submission and the final accepted version. Most of the new data was incorporated into the paper, but we chose not to include any of the GCaMP6s data and only analyze the GCaMP6f data in the paper.
