I recently started exploring the Allen Brain Cell Atlas for the adult mouse brain and have been following the guides containing Python examples demonstrating how to access the data.
At the same time, I’m using gene expression data from the Mouse Brain ISH Atlases. Both are registered to CCFv3 according to the documentation.
However, I’m running into an issue identifying regions of interest consistently across the two datasets.
When working with ISH data, the examples often use the AllenSDK to access the reference space. For example, I can download a structure tree of regions of interest from the CCFv3 2020 reference space like so:
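import os
from allensdk.core.reference_space_cache import ReferenceSpaceCache

output_dir = '.'  # placeholder: any writable directory for the cached downloads
reference_space_key = os.path.join('annotation', 'ccf_2020')
resolution = 10  # voxel size in microns
rspc = ReferenceSpaceCache(resolution, reference_space_key, manifest=os.path.join(output_dir, 'manifest.json'))
# ID 1 is the adult mouse structure graph
tree = rspc.get_structure_tree(structure_graph_id=1)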
From this tree, I can pull out a unique ID for each region of interest, such as “Entorhinal area, lateral part, layer 1,” which has an ID of 1121. I can further verify this ID by going to the Allen Brain Explorer and entering that unique anatomical ontology ID.
The metadata in the Allen Brain Cell Atlas, though, seems to use a parcellation index rather than the anatomical ontology IDs used in other datasets. For example, “Entorhinal area, lateral part, layer 1” has a parcellation index of 1109.
As another example, the anatomical ontology ID for AUDv1 is 951, while its parcellation index is 949. The two numbers are close, but looking up 949 as an anatomical ontology ID returns a different structure.
Since both atlases are registered to CCFv3, I’m assuming the structure acronyms are consistent and that I can use them to map between the datasets.
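To make the idea concrete, here is a minimal sketch of the acronym-based join I have in mind. It runs on toy rows built from the two examples above. The key names `acronym` and `parcellation_index` on the cell atlas side are my own placeholders, not confirmed column names, and `ENTl1` is the acronym I am assuming for “Entorhinal area, lateral part, layer 1”. On the AllenSDK side, I believe the dicts returned by `tree.nodes()` carry `acronym` and `id` keys.

```python
def map_index_to_structure_id(structures, parcellation_terms):
    """Join two region tables on acronym.

    structures: dicts with 'acronym' and 'id' (anatomical ontology ID).
    parcellation_terms: dicts with 'acronym' and 'parcellation_index'
        (hypothetical key names for the cell atlas metadata).
    Returns (mapping, unmatched), where mapping goes from parcellation
    index to ontology ID and unmatched lists acronyms with no partner.
    """
    id_by_acronym = {s['acronym']: s['id'] for s in structures}
    mapping = {}
    unmatched = []
    for term in parcellation_terms:
        acronym = term['acronym']
        if acronym in id_by_acronym:
            mapping[term['parcellation_index']] = id_by_acronym[acronym]
        else:
            unmatched.append(acronym)
    return mapping, unmatched


# Toy rows based on the two examples in this post.
structures = [
    {'acronym': 'ENTl1', 'id': 1121},  # acronym assumed
    {'acronym': 'AUDv1', 'id': 951},
]
parcellation_terms = [
    {'acronym': 'ENTl1', 'parcellation_index': 1109},
    {'acronym': 'AUDv1', 'parcellation_index': 949},
    {'acronym': 'XYZ', 'parcellation_index': 5},  # made-up row to show a miss
]

mapping, unmatched = map_index_to_structure_id(structures, parcellation_terms)
print(mapping)
print(unmatched)
```

Any acronym that ends up in `unmatched` would be a sign that the two ontologies do not line up as I am assuming.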
To be certain, are the acronyms the same across the gene expression datasets and the cell atlas?
Also, is there a lookup table or other reference that maps parcellation indices to the anatomical ontology IDs used in the gene expression datasets?
Any insight would be appreciated.