Hi,
When using MapMyCells, instead of mapping to the whole brain, is it possible to restrict the mapping to one or several specific brain region(s)? For example, I know my sample is from LHA, then map to the cells only in HY might give us a much better accuracy. Is it possible to achieve this by changing the code in MapMyCells jupyter notebook?
Kind Regards
Xinyu/Sophie
Hi @sophiechenhf
What you describe is not possible in the online tool but, as you guessed, is definitely possible if you clone the backend python code and run it yourself.
The process is a little less direct than what you describe. You cannot specify an anatomical region (like HY). However, you can tell the MapMyCells code to consider only a subset of cell types in the taxonomy and map only to those. So what you need to do is download the data (or really just the metadata) from the Yao et al. 2023 Whole Mouse Brain analysis using the abc_atlas_access tool, determine which cell types are confined to HY, and tell MapMyCells to map only to those cell types.
This notebook describes that process. It actually does it in two ways. The first way is somewhat time consuming. You download the Yao et al. 2023 transcriptomic data, limit it to a specific anatomical region (we chose the isocortex for no particular reason), create a new set of marker genes based only on cells sampled from that region, and map to an isocortex-only taxonomy using those marker genes.
This is, as I said, costly (in terms of compute time). We did it for pedagogical purposes.
If you don’t want to go to the trouble of selecting a new set of marker genes, and just want to restrict your taxonomy to HY-specific cell types while using the marker genes already computed by Yao et al., section 8 of the notebook shows you how to do this (you will still want to consult sections 1-3 to see how to download the Yao et al. data).
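To make the "determine which cell types are confined to HY" step concrete, here is a rough sketch of one way to do it. It is not the notebook's code. The rows below are invented toy data, the column names ("cluster", "region") are hypothetical rather than the real metadata schema, and the 90% cutoff for "confined" is an arbitrary assumption you would want to tune.

```python
from collections import Counter, defaultdict

# Toy stand-in for the cell metadata table. The rows and the column
# names ("cluster", "region") are invented for illustration only.
cells = [
    {"cell": "c1", "cluster": "A", "region": "HY"},
    {"cell": "c2", "cluster": "A", "region": "HY"},
    {"cell": "c3", "cluster": "A", "region": "HY"},
    {"cell": "c4", "cluster": "B", "region": "HY"},
    {"cell": "c5", "cluster": "B", "region": "MB"},
    {"cell": "c6", "cluster": "C", "region": "Isocortex"},
    {"cell": "c7", "cluster": "C", "region": "Isocortex"},
]


def types_confined_to(cells, region, min_fraction=0.9):
    """Return cell types with at least min_fraction of their cells in region."""
    # Count, for every cell type, how many of its cells come from each region.
    per_type = defaultdict(Counter)
    for row in cells:
        per_type[row["cluster"]][row["region"]] += 1

    keep = []
    for cluster, counts in sorted(per_type.items()):
        total = sum(counts.values())
        if counts[region] / total >= min_fraction:
            keep.append(cluster)
    return keep


# Cell types we would consider "confined to HY" under the assumed cutoff.
hy_types = types_confined_to(cells, "HY", min_fraction=0.9)
print(hy_types)
```

The resulting list is the subset of cell types you would then restrict the mapping to. How to hand that subset to the MapMyCells code is what section 8 of the notebook covers.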
Please let us know if anything is unclear/does not work.
Cheers,
Scott
Hi Scott,
This is very helpful! We will try it out! Thanks a lot!
Hi Scott,
I have a follow-up question. In the example notebook, the entire WMB-10Xv3 dataset was downloaded and isocortex cells were extracted from the metadata. Why not just download the Isocortex-1 and Isocortex-2 datasets? Is it because other datasets like MB might also contain cells with feature_matrix_label = isocortex?
Kind Regards
Xinyu/Sophie
Cell [6] of the notebook downloads the full set of metadata. There is little cost to this (it is large for CSV files, but small compared to .h5ad files), and we ultimately need most of the files to help us associate cells with cell types.
.h5ad files are downloaded in cell [10] of the notebook, and that should be limited to downloading only the .h5ad files that contain the cells we are analyzing for purposes of the notebook.
Is this not your experience? (If the full 10Xv3 dataset were downloaded, it would take up ~80 GB of hard drive space.)
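If it helps, one way to check which .h5ad files you actually need is to tally the cells you care about by the file they live in, using only the (cheap) metadata. This is a rough sketch on invented toy rows, not the notebook's code. The column names feature_matrix_label and region_of_interest_acronym and the label values are assumptions here, so check them against the metadata you downloaded.

```python
from collections import Counter

# Toy stand-in for the cell metadata. Rows, column names, and label
# values are illustrative assumptions, not real data.
cells = [
    {"cell_label": "c1", "feature_matrix_label": "WMB-10Xv3-HY",
     "region_of_interest_acronym": "HY"},
    {"cell_label": "c2", "feature_matrix_label": "WMB-10Xv3-HY",
     "region_of_interest_acronym": "HY"},
    {"cell_label": "c3", "feature_matrix_label": "WMB-10Xv3-MB",
     "region_of_interest_acronym": "HY"},
    {"cell_label": "c4", "feature_matrix_label": "WMB-10Xv3-MB",
     "region_of_interest_acronym": "MB"},
]


def matrices_needed(cells, region):
    """Count cells from `region` in each expression matrix file."""
    return Counter(
        row["feature_matrix_label"]
        for row in cells
        if row["region_of_interest_acronym"] == region
    )


# Only the files that appear here would need to be downloaded.
needed = matrices_needed(cells, "HY")
for label in sorted(needed):
    print(label, needed[label])
```

In this toy table the HY cells are spread over two files, which is the situation you asked about. Whether that happens in the real data is something the same tally on the real metadata would show.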
Got it! That makes sense!