I am looking to utilise the MERFISH data to map cells from another single cell resource onto the MERFISH slices. This is related to the data in the Yao, 2023 paper.
Looking at the data I cannot seem to find a file that contains both the gene expression data and the cell co-ordinates.
The h5ad files have the gene expression data and cell_metadata includes the co-ordinates; however, they cannot be joined together, as there is no common identifier in both files.
Have you compared the cell_label column in the cell_metadata.csv file with the index of the obs dataframe in the h5ad files? These should share common values (modulo a few cells that are in the h5ad files that failed to pass QC and thus are not listed in cell_metadata.csv). That is, generally speaking, how you unify the metadata files with the h5ad files.
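In case it helps, here is a minimal sketch of that join using pandas. The toy tables stand in for the real files, and the column names other than cell_label (n_genes, x, y) are placeholders I made up for illustration, so check them against what is actually in your copies.

```python
import pandas as pd


def attach_coordinates(obs: pd.DataFrame, cell_metadata: pd.DataFrame) -> pd.DataFrame:
    """Join an obs table (indexed by cell label) to metadata keyed on cell_label."""
    meta = cell_metadata.set_index("cell_label")
    # Inner join on the index: cells missing from the metadata
    # (e.g. those that failed QC) are dropped from the result.
    return obs.join(meta, how="inner")


# Toy stand-in for the obs dataframe of an h5ad file (hypothetical column).
obs = pd.DataFrame(
    {"n_genes": [10, 12, 7]},
    index=pd.Index(["c1", "c2", "c3"], name="cell_label"),
)

# Toy stand-in for cell_metadata.csv (x and y are placeholder column names).
cell_metadata = pd.DataFrame(
    {"cell_label": ["c1", "c3"], "x": [0.1, 0.5], "y": [1.0, 2.0]}
)

joined = attach_coordinates(obs, cell_metadata)

# Count cells in obs that have no row in the metadata.
n_unmatched = int((~obs.index.isin(cell_metadata["cell_label"])).sum())

print(joined)
print("cells without metadata:", n_unmatched)
```

With the real data you would presumably get obs from the AnnData object (for example adata = anndata.read_h5ad(path), then adata.obs) and read the metadata with pd.read_csv. Subsetting the expression matrix to the matched cells should then be something like adata[joined.index], though I have not run this against the release itself.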
More formally, this notebook (and the ones linked to it) shows how to access a self-consistent version of the released MERFISH data and visualize it in CCF coordinate space, both by cell type and by gene expression (see especially this section of the notebook).
I suppose I should clarify my use of “a few” above, for anyone who finds this thread. There are 395366 cells (out of 3.9 million) that failed QC and thus appear in the h5ad files but do not appear in cell_metadata.csv.