Is there a way to get the full post-clustering gene expression by cell type that is displayed here for the mouse MOp (Brain Knowledge Platform)?
Or do I need to take the raw data provided here (Main page) and cluster it myself?
If so, any tips for making sure my clusters match the BICCN MOp clusters?
Basically I am hoping to get the full list of marker genes for each cell subtype in the Allen MOp taxonomy
Can the new MapMyCells service help with this? Or Azimuth?
Thanks for your question! You will need to get the cell x gene data from NeMO, as you mention, but the marker genes and annotations you’re looking for are available in this repo, although spread across a few different tables:
Azimuth does host the MOp taxonomy, so it could be useful in your work, but MapMyCells currently does not. MapMyCells does host the whole mouse brain taxonomy, which you may also find useful if you’re looking for broader cell type annotations.
Hope this helps!
Thanks so much Ray! Would you say one approach may be as follows then:
Download the cell x gene data from NeMO, then for each cell in the cell x gene data (tagged via its cell barcode), use the ‘cell to cell set assignments’ to find its Accession ID in the taxonomy, then use the ‘Taxonomy nomenclature table’ to find its cell type alias?
Also, sorry, not sure if you know this, but was all of this data used in determining the final published taxonomy? From a preliminary search, only a subset of the cells seems to be included, possibly filtered through some QC mechanism.
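For anyone following along, the lookup chain described in this post (cell barcode, then Accession ID, then cell type alias) could be sketched roughly as below. This is only an illustration: the column names (`cell_barcode`, `accession_id`, `cell_type_alias`), the barcodes, the IDs, and the aliases are all made up, and the real tables in the repo may be laid out differently. Cells that are missing from the assignment table, for example because they were filtered out before the taxonomy was built, simply come back as `None`.

```python
import csv
import io

# Toy stand-ins for the two tables. Column names and values are hypothetical,
# not the actual headers or contents of the published tables.
ASSIGNMENTS_CSV = """cell_barcode,accession_id
BARCODE_1,ACC_1
BARCODE_2,ACC_2
"""

NOMENCLATURE_CSV = """accession_id,cell_type_alias
ACC_1,type_A
ACC_2,type_B
"""


def load_lookup(text, key_col, value_col):
    """Build a dict mapping key_col -> value_col from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return {row[key_col]: row[value_col] for row in reader}


def annotate(barcodes, barcode_to_accession, accession_to_alias):
    """Map each barcode to its cell type alias, or None if it has no assignment."""
    labels = {}
    for barcode in barcodes:
        accession = barcode_to_accession.get(barcode)
        labels[barcode] = accession_to_alias.get(accession) if accession else None
    return labels


barcode_to_accession = load_lookup(ASSIGNMENTS_CSV, "cell_barcode", "accession_id")
accession_to_alias = load_lookup(NOMENCLATURE_CSV, "accession_id", "cell_type_alias")

# Barcodes as they would appear in the cell x gene matrix; the last one is
# deliberately absent from the assignment table.
matrix_barcodes = ["BARCODE_1", "BARCODE_2", "BARCODE_3"]
labels = annotate(matrix_barcodes, barcode_to_accession, accession_to_alias)
unassigned = [b for b, alias in labels.items() if alias is None]
```

Counting the entries in `unassigned` is a quick way to see how many cells in the raw matrix did not make it into the assignment table.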
Okay, I think I understand how to do it after some more thinking and reading through the datasets; I will let you know if I have any issues. Thank you for all the help!
Sorry Ray, I am having one issue, and any insight would be appreciated. It seems most of the final BICCN taxonomy comes from the Broad data.
However, the Broad data h5 file is missing ‘features/feature_type’ which Seurat Read10X_h5 requires to read in the matrix:
Do you know how I might be able to overcome this?
The same issue occurs if I download the data from here and use the Read10X function on the barcodes, features, and matrix files.
I solved it by using the features.tsv from another 10X data set; it has the same genes in the correct order, but also all the columns Read10X needs, so the matrix now loads.
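If anyone else tries the same workaround, it may be worth confirming that the borrowed features.tsv really lists the same genes in the same order before swapping it in, since a mismatch would silently mislabel rows of the matrix. A minimal sketch of that check is below. It assumes the gene ID is the first tab-separated column in both files; the file contents are toy placeholders, and the function names are my own, not part of Seurat or any 10X tooling.

```python
def gene_ids(features_text):
    """Return the first tab-separated column (assumed gene ID) of each non-empty line."""
    return [line.split("\t")[0] for line in features_text.splitlines() if line.strip()]


def safe_to_swap(original_text, donor_text):
    """True only if the donor file lists exactly the same gene IDs in the same order."""
    return gene_ids(original_text) == gene_ids(donor_text)


# Toy contents: the original file lacks the extra columns, the donor has them.
ORIGINAL = "GENE_ID_1\nGENE_ID_2\nGENE_ID_3\n"
DONOR = (
    "GENE_ID_1\tname_1\tGene Expression\n"
    "GENE_ID_2\tname_2\tGene Expression\n"
    "GENE_ID_3\tname_3\tGene Expression\n"
)
# Same genes as the donor but in a different order, so it should be rejected.
SHUFFLED = (
    "GENE_ID_2\tname_2\tGene Expression\n"
    "GENE_ID_1\tname_1\tGene Expression\n"
    "GENE_ID_3\tname_3\tGene Expression\n"
)

ok_donor = safe_to_swap(ORIGINAL, DONOR)
ok_shuffled = safe_to_swap(ORIGINAL, SHUFFLED)
```

For real files, the text would be read from the (possibly gzipped) features.tsv of each data set before calling `safe_to_swap`.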
So sorry for the late reply! I’m glad you were able to solve this. Please follow up if you run into any other issues.