Regional cell taxonomy

scanchi · October 9, 2024, 5:59pm

Hello All,

Thank you for providing the community with both the resources and tools for taxonomy assignment. Our lab is interested in developing a region-specific reference for our analysis purposes. I have been using the command-line version of MapMyCells, and @danielsf has been very helpful in solving roadblocks thus far. During the process, I realized that the original taxonomy assigned to the region of interest had many subclasses with very few cells assigned (n < 20). Closer examination of these subclasses revealed that they did not originally belong to the region of interest, i.e. the hippocampus in this case.

Were these assigned due to the process of computational approximation, or are these cells really part of the hippocampal formation? It was suggested that I pool cells across the taxonomy for better marker signal detection. Before doing that, I want to confirm whether these subclasses are biologically correct or whether they should be filtered out. Thank you for your help!

jeremyinseattle · October 14, 2024, 5:10pm

Hi @scanchi,

I’d suggest removing any cluster that seems unreasonable and then grabbing all (or more) of the cells from clusters that seem reasonable. In this case the key question is how to define “reasonable”, which ultimately comes down to a neuroscience question. As a very rough rule of thumb (which you should confirm for your use case): non-neuronal types, and to a lesser extent GABAergic types, are more likely to span regions than other neuronal types and could be retained. When in doubt, I’d recommend keeping the cluster. If the mapping is working correctly, then even if your reference includes a cell type from the wrong region, no “high quality” cells should map to it.
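As a rough illustration of that triage, here is a minimal Python sketch that only flags sparsely populated subclasses for manual review and leaves the final keep-or-drop decision to you. The field names `class` and `subclass`, the `min_cells` cutoff, and the `broad_classes` allow-list are all hypothetical and would need to be adapted to your own metadata.

```python
from collections import Counter


def flag_sparse_subclasses(cells, min_cells=20, broad_classes=frozenset()):
    """Return {subclass: n_cells} for subclasses that deserve a manual look.

    cells         : iterable of dicts with hypothetical keys "class" and "subclass"
    min_cells     : subclasses with fewer cells than this are flagged
    broad_classes : classes assumed to span regions (e.g. non-neuronal types);
                    their subclasses are never flagged
    """
    counts = Counter((c["class"], c["subclass"]) for c in cells)
    flagged = {}
    for (cls, sub), n in counts.items():
        if n < min_cells and cls not in broad_classes:
            flagged[sub] = n
    return flagged
```

Anything the function returns is a candidate for removal, not an automatic removal: the neuroscience judgment about whether a flagged subclass plausibly occurs in the region still has to be made by hand.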

Best,
Jeremy

scanchi · October 15, 2024, 4:30pm

Thanks Jeremy! This does sound like a reasonable approach, and it is what I was leaning towards as well. Does this also imply that when we run a dataset against WMB using the MapMyCells tool, we are likely to see a portion of the results map to cell types that are likely inaccurate? Presumably we can still gauge confidence in an assignment based on the number of cells called, and perhaps on mean expression values.

danielsf · October 15, 2024, 5:51pm

MapMyCells/the cell_type_mapper code reports some confidence metrics to help you judge the quality of each cell’s mapping. They are:

avg_correlation is the average correlation coefficient between the cell and its chosen cell type (class, subclass, cluster, etc.), i.e. how well correlated the cell is with its chosen type in the space of marker genes.
bootstrapping_probability is the fraction of the bootstrapping iterations that actually chose the assigned cell type, i.e. how frequently the mapping result stayed the same when we randomly subsampled the marker genes.
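One way to use these two metrics is to split cells into confident and uncertain assignments before counting cells per type. The sketch below assumes each cell's result has already been loaded into a dict with keys named `avg_correlation` and `bootstrapping_probability`; the real output files may name these per taxonomy level, and both thresholds are illustrative placeholders rather than recommended values.

```python
def split_by_confidence(rows, min_prob=0.5, min_corr=0.5):
    """Split mapping results into (confident, uncertain) lists.

    rows     : iterable of dicts with assumed keys "avg_correlation"
               and "bootstrapping_probability"
    min_prob : illustrative cutoff on bootstrapping_probability
    min_corr : illustrative cutoff on avg_correlation
    """
    confident, uncertain = [], []
    for row in rows:
        passes = (
            float(row["bootstrapping_probability"]) >= min_prob
            and float(row["avg_correlation"]) >= min_corr
        )
        (confident if passes else uncertain).append(row)
    return confident, uncertain
```

Cells that land in the uncertain list are the ones most likely to account for sparsely populated, out-of-region types.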

This Jupyter notebook does a deep dive into the contents of the files output by MapMyCells when run on real data, and shows how the quality metrics correlate with the actual quality of the mapping.

Note: since you are running the code locally, I would recommend setting type_assignment.bootstrap_factor=0.5. I have found that this gives the bootstrapping_probability metric more informative values than the default type_assignment.bootstrap_factor=0.9.
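To make the role of the bootstrap factor more concrete, here is a toy Python sketch of the idea described above: on each iteration a random fraction of the marker genes is kept, the cell is correlated against each candidate type, and the fraction of iterations won by the final choice is reported. This is not the cell_type_mapper implementation; the function names, the use of Pearson correlation against per-type mean profiles, and the reading of `bootstrap_factor` as the fraction of marker genes kept per iteration are assumptions made for illustration.

```python
import random
from collections import Counter


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def toy_bootstrap(cell, centroids, bootstrap_factor=0.5, n_iter=100, seed=0):
    """Return (assigned_type, fraction of iterations that chose it).

    cell      : expression values over the marker genes
    centroids : {type_name: mean expression over the same marker genes}
    """
    rng = random.Random(seed)
    n_genes = len(cell)
    # Keep at least two genes so a correlation can be computed.
    k = min(n_genes, max(2, int(bootstrap_factor * n_genes)))
    votes = Counter()
    for _ in range(n_iter):
        idx = rng.sample(range(n_genes), k)
        sub = [cell[i] for i in idx]
        best = max(
            centroids,
            key=lambda t: pearson(sub, [centroids[t][i] for i in idx]),
        )
        votes[best] += 1
    assigned, n_votes = votes.most_common(1)[0]
    return assigned, n_votes / n_iter
```

In this toy setup, a smaller factor makes each iteration see a more different subset of genes, so ambiguous cells are more likely to flip between types and end up with a probability below 1.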
