Metadata of mouse whole cortex and hippocampus 10x data set

Michael · October 14, 2020, 3:25am

I recently downloaded the “mouse whole cortex and hippocampus 10x” data set. As a first step, I loaded the “gene expression matrix” with python and created a tSNE plot. Individual data points were colored according to the “class_label” information (e.g. Glutamatergic) in the “table of cell metadata” .csv file. To my surprise, class labels did not highlight groups of clusters in the tSNE plot as expected, but rather were randomly spread out across the tSNE plot. Next, I verified whether the clusters in the tSNE plot were actually clusters of similar cell types by coloring the individual data points with regard to the transcript count of several canonical gene markers. As expected, cells expressing the same canonical gene marker (e.g. Vip) were clustering together. Therefore, it seems that I failed to retrieve the appropriate information on an individual cell from the “table of cell metadata”. Please correct me if I am wrong, but the identifier in the “sample_name” column is what I should use to retrieve metadata from the “table of cell metadata” .csv file for a particular cell/row in the “gene expression matrix”, right?
Thank you for your help!

cvanv · October 15, 2020, 4:25am

Hi @Michael,

You are correct that the “sample_name” column is the identifier used in the matrix. Have you checked whether those identifiers match in the files you downloaded? The order of the samples in the matrix does not exactly match the order of the samples in the metadata file. Could this cause the mismatch you observe?
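A minimal sketch of such a check in Python (the language mentioned in the original question), using only the standard library. It assumes that the identifier is the first column of both files and that the files are named "matrix.csv" and "metadata.csv"; the helper names `read_ids` and `compare_ids` are made up for illustration, so adjust all of these to your download.

```python
import csv

def read_ids(path, column=0):
    """Return the values of one CSV column in file order, skipping the header row.

    The column position is an assumption: check where the identifier
    actually sits in your copy of the file.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader)  # skip the header row
        return [row[column] for row in reader if row]

def compare_ids(matrix_ids, metadata_ids):
    """Summarise whether two identifier lists share members and order."""
    matrix_set, metadata_set = set(matrix_ids), set(metadata_ids)
    return {
        "same_set": matrix_set == metadata_set,
        "same_order": matrix_ids == metadata_ids,
        "only_in_matrix": len(matrix_set - metadata_set),
        "only_in_metadata": len(metadata_set - matrix_set),
    }

# Example usage (hypothetical file names):
# report = compare_ids(read_ids("matrix.csv"), read_ids("metadata.csv"))
# print(report)
```

If `same_set` is true but `same_order` is false, the files describe the same cells in a different order, and any coloring done by row position will look shuffled.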

Best,
Cindy

dvera · September 8, 2021, 6:29pm

I’m having the same issue. Expression patterns on Allen cell browser (10X mouse cortex & hippocampus) do not match those in public data sets for download. Other data sets seem to be fine, including Smart-seq and Human Cortex.

A clear example is to compare Pvalb expression patterns in the UCSC cell atlas (a browser session derived from Allen Brain Atlas public files):

https://cells.ucsc.edu/?ds=allen-celltypes+mouse-cortex+mouse-cortex-2020&gene=Pvalb

with Pvalb expression in the Transcriptomics Explorer:

https://celltypes.brain-map.org/rnaseq/mouse_ctx-hip_10x?selectedVisualization=Scatter+Plot&colorByFeature=Gene+Expression&colorByFeatureValue=Pvalb

cvanv · September 8, 2021, 7:11pm

Hi @dvera, yes, this is the same problem as described before. Since the order of the samples in the count matrix does not match the order of the samples in the metadata, have you checked whether cells are assigned correctly?

dvera · September 8, 2021, 7:12pm

To the best of my knowledge, the metadata is matched to the expression matrix based on the cell names, not by index/row#.
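For reference, the difference between the two approaches can be sketched in Python with toy data. The cell names and the helper `align_by_name` are invented for illustration, and the label values are placeholders rather than rows from the real files; only the column names “sample_name” and “class_label” come from this thread.

```python
def align_by_name(matrix_ids, metadata_rows, key="sample_name", field="class_label"):
    """Return one label per matrix row, looked up by cell name, not by position.

    Cells without a metadata entry get None.
    """
    lookup = {row[key]: row[field] for row in metadata_rows}
    return [lookup.get(cell_id) for cell_id in matrix_ids]

# Toy data: the matrix lists the cells in a different order than the metadata.
matrix_ids = ["cell_B", "cell_A", "cell_C"]
metadata_rows = [
    {"sample_name": "cell_A", "class_label": "Glutamatergic"},
    {"sample_name": "cell_B", "class_label": "GABAergic"},
    {"sample_name": "cell_C", "class_label": "Non-Neuronal"},
]

by_name = align_by_name(matrix_ids, metadata_rows)
by_position = [row["class_label"] for row in metadata_rows]

print(by_name)      # labels in matrix row order
print(by_position)  # labels in metadata file order
```

Here `by_name` follows the matrix order while `by_position` mislabels the first two cells. With real files, `metadata_rows` could come from `csv.DictReader`.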

dvera · September 8, 2021, 7:13pm

Also note that this problem is specific to this particular dataset (mouse 10x cortex/hippocampus). The same methods for assigning metadata to cells work in all the other data sets, suggesting there is a problem with the public dataset itself for mouse 10x cortex/hippocampus.

cvanv · September 8, 2021, 8:16pm

OK, I had to download the data to verify, but I can’t reproduce the problem. I used the following code to check the expression of Pvalb in the UMAP space. Could you give this a try and see if it works?

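library(data.table)  # fread
library(dplyr)       # select
library(ggplot2)     # plotting

# Expression matrix: the first column holds the cell identifier.
mat <- fread("matrix.csv")
colnames(mat)[1] <- "sample_name"

# meta <- fread("metadata.csv")  # not needed for this check
# 2D embedding coordinates (the first three columns are renamed below).
umap.2d <- fread("tsne.csv")

rd.dat <- as.data.frame(umap.2d)
colnames(rd.dat)[1:3] <- c("sample_name", "Dim1", "Dim2")
sub.mat <- select(mat, c("sample_name", "Pvalb"))
# Look up expression by sample_name, not by row position.
rd.dat$expr <- sub.mat$Pvalb[match(rd.dat$sample_name, sub.mat$sample_name)]
# Sort so that high-expressing cells are drawn last and stay visible.
rd.dat <- rd.dat[order(rd.dat$expr), ]

p <- ggplot(rd.dat, aes(Dim1, Dim2)) +
  geom_point(aes(color = expr), size = 0.15) +
  scale_color_gradient(low = "gray80", high = "red") +
  theme_void() + theme(legend.position = "none") +
  coord_fixed(ratio = 1) +
  ggtitle("Pvalb")
print(p)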
