How to use "Mouse Whole Cortex and Hippocampus 10x" to annotate a spatial data set in Seurat?

Hi Allen Brain Map Community,

Long time listener, first time caller.

I am a novice to coding and data analysis. I am working my way through the Seurat tutorial on analyzing spatial transcriptomic data (Analysis, visualization, and integration of spatial datasets with Seurat • Seurat). In one section of this tutorial, they subset out the cortical regions of the sample data set and integrate the subsetted data with the Allen Brain Map data from 2016 (Adult mouse cortical cell taxonomy revealed by single cell transcriptomics | Nature Neuroscience). The 2016 data serve as a reference for assigning cell type annotations to the mouse cortical cells in the sample spatial data set.

My data and my biological questions are more heavily focused on the hippocampus. For this reason, I would like to use the newer 2021 whole cortex and hippocampus atlas from the Allen Institute to annotate my spatial data. These data are not neatly packaged into an .RDS file that I can use the way the Seurat tutorial does.

How would I use these data to integrate with and annotate my spatial data set? Do I need to wrangle the whole 70 GB of data, or is there a smaller matrix or subset of the data that I can use to accomplish what I've outlined?

Hopefully, this was clear enough! Thank you in advance.

Hi @ivingan01. Sorry for the delay in getting back to you, and thank you for your interest in these data. At the moment there is not a separate cortex and hippocampus taxonomy or data set, nor is there a way to download only a subset of the data. However, if you follow the steps listed in this other Community Forum post, you can start from the ~5GB hdf5 file instead of the ~68GB csv file and do the subsetting yourself. The analysis we'd suggest would be to:

  1. Download and subset the data as described above
  2. Additionally exclude all cells from cell types largely found outside of hippocampus. You can visualize this generally in the web app and with much more context in the associated publication. To a first approximation, all non-neuronal types are in hippocampus, glutamatergic types 318-364 are in hippocampus + subiculum (see Figure 6), and GABAergic cells are more complicated, but there are several types enriched in hippocampus (see Figure 2).
  3. Use the resulting data set and associated clusters as your reference (to replace the 2016 data set) and follow the spatial data tutorial mentioned above.
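
As a rough illustration of the cell selection in step 2, here is a minimal Python sketch that reads a metadata table and keeps the cells matching the first-approximation rules above. Several things in it are assumptions rather than confirmed details of the download: the column names (`sample_name`, `class_label`, `cluster_label`), the class names ("Non-Neuronal", "Glutamatergic", "GABAergic"), and the idea that each cluster label starts with its type number followed by an underscore. The toy rows are invented for the example, and the GABAergic types to keep (`gaba_keep`) have to be chosen by hand from Figure 2. Please check all of these against the actual metadata file before relying on it.

```python
import csv
import io

# From the reply above: glutamatergic types 318-364 are in hippocampus + subiculum.
HIPPOCAMPAL_GLUT_TYPES = range(318, 365)


def cluster_number(cluster_label):
    """Return the leading integer of a label such as '320_toy', or None.

    Assumes (unverified) that labels look like '<number>_<name>'.
    """
    prefix = cluster_label.split("_", 1)[0]
    return int(prefix) if prefix.isdigit() else None


def keep_cell(row, gaba_keep=frozenset()):
    """Apply the first-approximation rules from step 2 to one metadata row."""
    cell_class = row["class_label"]  # assumed column name
    if cell_class == "Non-Neuronal":
        return True  # all non-neuronal types are found in hippocampus
    if cell_class == "Glutamatergic":
        number = cluster_number(row["cluster_label"])
        return number is not None and number in HIPPOCAMPAL_GLUT_TYPES
    if cell_class == "GABAergic":
        # No simple rule here: keep only the types you picked by hand.
        return row["cluster_label"] in gaba_keep
    return False


def select_reference_cells(metadata_file, gaba_keep=frozenset()):
    """Return the sample names of the cells to keep in the reference."""
    reader = csv.DictReader(metadata_file)
    return [row["sample_name"] for row in reader if keep_cell(row, gaba_keep)]


# Invented toy metadata, only to show the behaviour of the filter.
toy_metadata = io.StringIO(
    "sample_name,class_label,cluster_label\n"
    "cell_a,Glutamatergic,320_toy\n"
    "cell_b,Glutamatergic,100_toy\n"
    "cell_c,Non-Neuronal,370_toy\n"
    "cell_d,GABAergic,50_toy\n"
    "cell_e,GABAergic,60_toy\n"
)
selected = select_reference_cells(toy_metadata, gaba_keep={"50_toy"})
print(selected)
```

With a real file you would pass `open("metadata.csv", newline="")` (a hypothetical path) instead of the toy table, then use the returned sample names to subset the expression matrix before building the Seurat reference object in step 3.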

If any of this is not clear, feel free to post.
(If others in the community have other suggestions, please post as well!)

Update: the hippocampal data from “A high-resolution transcriptomic and spatial atlas of cell types in the whole mouse brain” (bioRxiv link) is available on AWS (NOT here)! Please see this link on ABC Atlas access instead.