Scrattch.hicat tutorial

enh008 · March 17, 2021, 9:43pm

Hello,

I followed the scrattch.hicat tutorial found here (https://taxonomy.shinyapps.io/scrattch_tutorial/#section-overview) with my own scRNA-seq dataset (instead of the Tasic et al. data), and the results say that my data has 6 clusters, denoted by 6 different numbers (not 1-6).

I looked up these numbers in the primary cell type ID column of the Tasic et al. annotations file and identified the corresponding primary cell types. I thought that perhaps these cluster numbers correspond to the Allen Brain cell types, and that these are the 6 cell types in my data. Then I looked up the same numbers in the Yao et al. MOp mouse cortex dataset annotations file, which gave 6 different cell types.

What does the user gain from these 6 cluster numbers if they do not correspond to an Allen Brain cell type? How does the user get to “Vip Lmo1” or “Sst Cbln4” style nomenclature for these 6 clusters? I do not see any output that gives the top 2 marker genes to be used for this naming convention.

If possible, I would like to classify my cell types the same way the Allen Brain does, so that they are comparable with the classified cell types published by Allen Brain. I obtained my 6 clusters with the same pipeline the Allen Brain folks use (scrattch.hicat), but I need help figuring out how to name these 6 clusters so that the nomenclature is compatible with “Pvalb Tpbg”, etc.

Thank you,
Eden Hornung

yzizhen · April 2, 2021, 11:06pm

The clustering pipeline does not automatically produce any cluster labels. This is true for almost all existing tools. Our labels were generated by manual curation of markers, based on differential gene expression analysis. If you just want to map your cells to our reference, the easiest approach is probably to use Seurat's label transfer function. On the other hand, your dataset seems to include only a subset of the clusters present in our taxonomy, and Seurat sometimes has issues when there is a big difference in cell type composition between the query and reference datasets. If you run into that, you can try the “map_sampling” function in the scrattch.hicat package, with the option method="mean". It requires both the training and test data matrices to be logCPM-transformed, plus clustering labels for the training dataset and a set of marker genes.
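To make the idea behind correlation-based mapping concrete, here is a minimal sketch in plain Python. It is not the scrattch.hicat implementation and does not reproduce `map_sampling`. It only illustrates the general pattern: transform counts to logCPM, average the marker genes within each reference cluster, and assign each query cell to the cluster whose mean profile it correlates with best. The function names, the cluster labels, the count values, and the use of log2(CPM + 1) with Pearson correlation are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def log_cpm(counts):
    """Scale one cell's raw counts to counts per million, then take log2(x + 1)."""
    total = sum(counts.values())
    return {gene: math.log2(1e6 * c / total + 1) for gene, c in counts.items()}


def cluster_means(cells, labels, markers):
    """Average the log-CPM value of every marker gene within each reference cluster."""
    grouped = defaultdict(list)
    for cell, label in zip(cells, labels):
        grouped[label].append([cell[g] for g in markers])
    return {
        label: [sum(col) / len(col) for col in zip(*rows)]
        for label, rows in grouped.items()
    }


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def map_by_correlation(query_cell, means, markers):
    """Return (best cluster label, correlation) for one log-CPM query cell."""
    profile = [query_cell[g] for g in markers]
    scores = {label: pearson(profile, mean) for label, mean in means.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy raw counts, made up for illustration; one dict per cell.
markers = ["Pvalb", "Sst", "Vip", "Gad1"]
reference_counts = [
    {"Pvalb": 90, "Sst": 2, "Vip": 1, "Gad1": 40},
    {"Pvalb": 80, "Sst": 1, "Vip": 3, "Gad1": 35},
    {"Pvalb": 2, "Sst": 95, "Vip": 1, "Gad1": 42},
    {"Pvalb": 1, "Sst": 85, "Vip": 2, "Gad1": 38},
]
# Hypothetical reference cluster labels for the four cells above.
reference_labels = ["Pvalb-like", "Pvalb-like", "Sst-like", "Sst-like"]
query_counts = [
    {"Pvalb": 3, "Sst": 70, "Vip": 2, "Gad1": 30},
    {"Pvalb": 60, "Sst": 4, "Vip": 1, "Gad1": 25},
]

reference = [log_cpm(c) for c in reference_counts]
means = cluster_means(reference, reference_labels, markers)
assigned = [map_by_correlation(log_cpm(c), means, markers) for c in query_counts]
for label, cor in assigned:
    print(label, round(cor, 3))
```

With real data, the reference cells, their cluster labels, and the marker genes would come from the published taxonomy, and the query cells from your own dataset. A low best correlation for a query cell suggests it has no good counterpart among the reference clusters.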

https://github.com/AllenInstitute/scrattch.hicat/blob/master/R/annotate.R

# Function call map
# function_1()
#   called_by_function_1() called_function_file.R
#
#
# map_by_cor()
#   get_cl_means() util.R
#
# map_cl_summary()
#   map_by_cor() annotate.R
#
# predict_annotate_cor()
#   map_by_cor() annotate.R
#   compare_annotate() annotate.R
# 
# map_sampling()
#   map_by_cor() annotate.R
#
# map_cv()
#   map_by_cor() annotate.R


Cluster membership can be found in Supplementary Table 10 of the 2018 Tasic et al. paper.
