Different id names in the Allen Brain Observatory

Jballbe · February 16, 2026, 12:54pm

Hi!

I am new to the Allen Brain Observatory, and I am trying to understand how the data are organized and how I can access them with the AllenSDK. I followed some of the instructions from this tutorial, and I got confused by the different ids available. For example, I used

# Download cells for a set of experiments and convert to DataFrame
cells = boc.get_cell_specimens()
cells = pd.DataFrame.from_records(cells)

and in the cells dataframe there are “cell_specimen_id”, “experiment_container_id” and “specimen_id”.

On the other hand, when I use

pvalb_ecs = boc.get_experiment_containers(cre_lines=['Pvalb-IRES-Cre'])
pvalb_exp_table = pd.DataFrame(pvalb_ecs)

in the dataframe pvalb_exp_table there are “id”, “donor_name” and “specimen_name”.

Can you tell me, or point me to a tutorial that explains, what the difference between these ids is? In particular, I would like to get a list of cell ids that I could group per Cre line and animal.

Thank you very much for your help!

Best,

Julien

saskiad · February 17, 2026, 7:39pm

Hi Julien,

These fields are indeed confusing. I don’t know that there is convenient documentation I can point you to, but I will try to unpack them here.

In the cell specimens table (the output of boc.get_cell_specimens()):

“cell_specimen_id” is a unique ID for the individual cell. This ID will also be in the data object for the session(s) that cell is imaged in.
“experiment_container_id” is a unique ID for the experiment container, which consists of three sessions imaged from the same group of cells (see this diagram: Brain Observatory — Allen SDK dev documentation). Each of these sessions also has a unique ID, and I’ll show that in a moment.
“specimen_id” is a unique ID for the subject (i.e. mouse) that the data was collected from. This is redundant with other fields.

When you use boc.get_experiment_containers(), you are getting a list of experiment containers that meet the given criteria.

“id” here is the experiment_container_id. (see comment below)
“donor_name” is the id for the subject. (This is the field that I use for identifying data by animal)
“specimen_name” combines the genotype and donor_name for a subject. This is how subjects are tracked in our internal management systems, but we often reduce this to the donor_name for simplicity.

If you were to use boc.get_ophys_experiments() to get a list of individual sessions, you will find that each entry has an “id” field, which is the id for that individual session, as well as an “experiment_container_id” field, which is the id of the experiment container that it belongs to. The fact that “id” can be different things for different queries has to do with the relational database that organizes all these fields, and (to me) is very confusing.
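One way to keep the two meanings of “id” apart is to rename the field as soon as each query returns. This is only a minimal sketch: label_id is a hypothetical helper, “ophys_session_id” is a name picked for illustration, and the toy records use made-up id values in place of real query results.

```python
def label_id(records, explicit_name):
    """Return copies of the records with the generic "id" key renamed."""
    labeled = []
    for record in records:
        record = dict(record)  # copy so the caller's data is left untouched
        record[explicit_name] = record.pop("id")
        labeled.append(record)
    return labeled


# Hypothetical toy records (made-up values), shaped like the lists of dicts
# returned by boc.get_experiment_containers() and boc.get_ophys_experiments().
containers = [{"id": 1, "donor_name": "donor_A"}]
sessions = [
    {"id": 10, "experiment_container_id": 1},
    {"id": 11, "experiment_container_id": 1},
]

# After renaming, "id" no longer means two different things.
containers = label_id(containers, "experiment_container_id")
sessions = label_id(sessions, "ophys_session_id")
print(containers)
print(sessions)
```

With the fields renamed, the container and session tables can be joined on experiment_container_id without ambiguity.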

To get a list of cells based on Cre-line and animal, I would either:

1. Use the cell specimens table and filter by your Cre line and the specimen_id field.

or

2. Select containers (or individual sessions) by Cre line, then by donor_name, and then get the cell_specimen_ids from the data object. This only gives you the specimens within that given session.
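The first option could look roughly like the sketch below. It assumes the records are the list of dicts returned by boc.get_cell_specimens() and that the “tld1_name” field holds the Cre line; group_cells is a hypothetical helper and the toy records contain made-up values.

```python
from collections import defaultdict


def group_cells(cell_records):
    """Map (Cre line, specimen_id) to the list of cell_specimen_ids."""
    groups = defaultdict(list)
    for record in cell_records:
        # Assumption: "tld1_name" is the Cre line of the animal.
        key = (record["tld1_name"], record["specimen_id"])
        groups[key].append(record["cell_specimen_id"])
    return dict(groups)


# Hypothetical toy records standing in for boc.get_cell_specimens().
toy_cells = [
    {"cell_specimen_id": 1, "specimen_id": 100, "tld1_name": "Pvalb-IRES-Cre"},
    {"cell_specimen_id": 2, "specimen_id": 100, "tld1_name": "Pvalb-IRES-Cre"},
    {"cell_specimen_id": 3, "specimen_id": 200, "tld1_name": "Pvalb-IRES-Cre"},
    {"cell_specimen_id": 4, "specimen_id": 300, "tld1_name": "Cux2-CreERT2"},
]

cells_by_group = group_cells(toy_cells)
for (cre_line, specimen_id), cell_ids in cells_by_group.items():
    print(cre_line, specimen_id, cell_ids)
```

The same grouping can be done with a pandas groupby on the cells dataframe; the plain-Python version is shown only to keep the sketch self-contained.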

Let me know if you have more questions!
Saskia

Jballbe · February 19, 2026, 8:46am

Hi Saskia,

Thank you very much for your explanations, those are very helpful!

Have a good day,

Julien

Jballbe · April 17, 2026, 11:31am

Dear Saskia,

I hope you are doing well. I have some more questions about the data. To get the number of cells per container I ran

all_cells = boc.get_cell_specimens()
all_cells_df = pd.DataFrame(all_cells)
area_cre_line_specimen_counter = all_cells_df.groupby(["area", "specimen_id", "tld1_name", "experiment_container_id"]).size().reset_index()
area_cre_line_specimen_counter.columns = ["area", "specimen_id", "tld1_name", "Container", "N_cells"]

and in summary_stats, notably for container 670396939 (Pvalb in VISp), I get 13 for N_cells.

However, when I try to extract the cell traces for this container:

experiment_container_id = 670396939

###For each container there are 3 imaging sessions, each one having its own data file, so we need to specify which data file/session we want

#For a given container_id, we can get the list of imaging sessions with the function boc.get_ophys_experiments
sessions = boc.get_ophys_experiments(experiment_container_ids =[experiment_container_id])
sessions_df = pd.DataFrame(sessions)

session_id_natural_scene = boc.get_ophys_experiments(experiment_container_ids = [experiment_container_id], stimuli = ["natural_scenes"])[0]['id']


###GET DATA
#get_ophys_experiment_data returns the data object giving us access to the NWB file  for a SINGLE IMAGING SESSION
data_set = boc.get_ophys_experiment_data(session_id_natural_scene)

#From there we can access different traces or data (ROI, maximum projection, DF/F traces...)
#the function get_dff_traces returns 2 objects: the time_steps, and the dff traces
ts, dff = data_set.get_dff_traces()

I only get 7 traces.
I therefore understand that, for a given container, sessions A, B and C (corresponding to different experimental protocols) do not all record the same cells. Am I right?

Thanks for your help!

Best,

Julien

saskiad · April 20, 2026, 5:01pm

Hi Julien,

Sorry for the delay - just getting back from being out for a short bit. This sounds completely right: there were only 7 cells identified in this specific session, but 13 identified across all three sessions. When I have a minute later today I’ll confirm just to be sure.
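A quick way to check this kind of mismatch is to compare the cell ids of each session as sets. The sketch below assumes you have already collected the cell_specimen_ids per session (for example with data_set.get_cell_specimen_ids() on each session's data object); compare_sessions is a hypothetical helper and the toy ids are made up, so the counts are not those of the container discussed here.

```python
def compare_sessions(cells_by_session):
    """Count cells per session, across the container, and in every session."""
    id_sets = {sid: set(ids) for sid, ids in cells_by_session.items()}
    all_cells = set().union(*id_sets.values())
    shared = set.intersection(*id_sets.values()) if id_sets else set()
    return {
        "per_session": {sid: len(ids) for sid, ids in id_sets.items()},
        "container_total": len(all_cells),
        "in_every_session": len(shared),
    }


# Hypothetical toy ids for the three sessions of one container.
toy_sessions = {
    "session_A": [1, 2, 3, 4],
    "session_B": [3, 4, 5],
    "session_C": [4, 5, 6, 7],
}

summary = compare_sessions(toy_sessions)
print(summary)
```

The container total is the size of the union, so it can be larger than the count in any single session, which is the pattern described above.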

Saskia

Jballbe · May 12, 2026, 8:00am

Dear Saskia,

I am very sorry for not answering earlier. Thank you for your response, indeed after digging into the Allen tutorials, it is never indicated that the same cells are recorded across sessions.

I have another quick question: I downloaded the running speed for experiment_id 674679940, and some values are negative. Is that an artefact, or does it represent movement in the opposite direction?

Thank you for your help!

Best,

Julien

saskiad · May 12, 2026, 6:45pm

Hi Julien,

The running speed comes from the running disc that the mouse is on. Usually the negative values are when the mouse was largely stationary and would rock or wiggle the disc back and forth underneath it. You often see this at the end of a running bout when the mouse stops - kind of a bit of hysteresis.
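To see how much of a trace is affected, you can count the negative samples before deciding what to do with them. This is a minimal sketch: summarize_running_speed is a hypothetical helper, the toy values are made up, and treating NaN as a missing sample is an assumption about how gaps are encoded. With the SDK, the speed array would be the first value returned by data_set.get_running_speed().

```python
import math


def summarize_running_speed(speed):
    """Count valid and negative samples in a running-speed trace."""
    # Assumption: NaN marks a missing sample and is ignored.
    valid = [v for v in speed if not math.isnan(v)]
    negative = [v for v in valid if v < 0]
    return {
        "n_samples": len(speed),
        "n_valid": len(valid),
        "n_negative": len(negative),
        "fraction_negative": len(negative) / len(valid) if valid else 0.0,
        "most_negative": min(negative) if negative else None,
    }


# Hypothetical toy trace: a running bout followed by small back-and-forth motion.
toy_speed = [0.0, 12.5, 20.1, 3.2, -0.8, 0.4, -0.3, float("nan")]

stats = summarize_running_speed(toy_speed)
print(stats)
```

Whether to keep, clip, or threshold the negative values is an analysis choice; the summary only shows how common and how large they are.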

Saskia

Jballbe · May 13, 2026, 9:13am

Hi Saskia,

Thank you very much for the clarification!

Have a good day!

Best,

Julien
