Metadata of mouse whole cortex and hippocampus 10x data set

I recently downloaded the “mouse whole cortex and hippocampus 10x” data set. As a first step, I loaded the “gene expression matrix” with Python and created a t-SNE plot. I colored the individual data points by the “class_label” information (e.g. Glutamatergic) from the “table of cell metadata” .csv file. To my surprise, the class labels did not highlight groups of clusters as expected, but were spread seemingly at random across the plot.

Next, I checked whether the clusters in the t-SNE plot actually correspond to similar cell types by coloring the data points by the transcript count of several canonical marker genes. As expected, cells expressing the same marker gene (e.g. Vip) clustered together. So it seems that I failed to retrieve the correct metadata for each individual cell from the “table of cell metadata”.

Please correct me if I am wrong, but the identifier in the “sample_name” column is what I should use to look up the metadata in the “table of cell metadata” .csv file for a particular cell/row in the “gene expression matrix”, right?
Thank you for your help!

Hi @Michael,

You are correct that the “sample_name” column is the identifier used in the matrix. Have you checked whether those identifiers match between the files you downloaded? The samples in the matrix are not in exactly the same order as the samples in the metadata file. If the labels were assigned by row position rather than by matching on “sample_name”, could this cause the mismatch you observe?
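
To illustrate the difference between matching by row position and matching by “sample_name”, here is a minimal sketch in plain Python. It does not read the real files: the cell names, the labels and the helper `align_labels` are made up for illustration, and only the column names “sample_name” and “class_label” come from the metadata table.

```python
# Toy stand-ins for the real files; all values below are invented.
# Row order of the expression matrix, as given by its sample_name column.
matrix_sample_names = ["cell_C", "cell_A", "cell_B"]

# Rows of the metadata table, which are in a different order.
metadata_rows = [
    {"sample_name": "cell_A", "class_label": "Glutamatergic"},
    {"sample_name": "cell_B", "class_label": "GABAergic"},
    {"sample_name": "cell_C", "class_label": "Non-Neuronal"},
]


def align_labels(matrix_names, rows, key="sample_name", column="class_label"):
    """Return one label per matrix row, looked up by identifier, not by position."""
    lookup = {row[key]: row[column] for row in rows}
    # Fail loudly if the two files do not contain the same identifiers.
    missing = [name for name in matrix_names if name not in lookup]
    if missing:
        raise KeyError(
            f"{len(missing)} sample names not found in metadata, e.g. {missing[:5]}"
        )
    return [lookup[name] for name in matrix_names]


# Taking labels in metadata order pairs them with the wrong matrix rows.
labels_by_position = [row["class_label"] for row in metadata_rows]

# Looking labels up by sample_name gives one correct label per matrix row.
labels_by_name = align_labels(matrix_sample_names, metadata_rows)
```

If the metadata is loaded into a pandas DataFrame, something like `metadata.set_index("sample_name").loc[matrix_sample_names]` should give the same reordering, assuming every name in the matrix is present in the metadata.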

Best,
Cindy