Generating input for MapMyCells from Spatial Data

skouneli · December 5, 2023, 7:12pm

Hi! I am trying to use 10X Genomics Spatial Gene Expression data as an input for the MapMyCells celltype mapping tool. I am having an issue with getting the input in the correct format.

These are the steps I am following but it still appears my .h5ad file is not in the correct format. Any recommendations? Thanks!

sceasy::convertFormat(seurat_spatial_object, from="seurat", to="anndata", assay="Spatial",
outFile='CCIvSham.h5ad')

my_too_large_adata = read_h5ad('~/filename.h5ad')

minimal_adata = my_too_large_adata$X

ad <- AnnData(
X = minimal_adata,
obs = data.frame(group = rownames(minimal_adata), row.names = rownames(minimal_adata)),
var = data.frame(type = colnames(minimal_adata), row.names = colnames(minimal_adata))
)
write_h5ad(ad, '~file.h5ad',
compression='gzip')
ad
AnnData object with n_obs × n_vars = 34217 × 32272
obs: 'group'
var: 'type'

When I upload this final .h5ad file it fails at recognizing the input file due to incorrect formatting.

SvenOtto · December 5, 2023, 8:25pm

@skouneli thank you for your interest in MapMyCells.

It appears you’re doing the following:

You start off with a Seurat object.
You convert it to an h5ad file.
You reduce the size of said h5ad file.

Could you confirm that you’ve transposed your Seurat object, if necessary, so that rows are cells and columns are genes? (The opposite tends to be true for default Seurat objects).
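One rough way to check the orientation is to compare the row and column labels against a list of known gene symbols. The sketch below is only an illustration: `guess_orientation` and the `reference_genes` argument are hypothetical names, and the reference list would have to come from your own annotation.

```python
def guess_orientation(row_names, col_names, reference_genes):
    """Guess which axis holds genes by overlap with a reference gene list.

    Returns "cells_by_genes" (rows are cells, columns are genes),
    "genes_by_cells" (the transpose), or "unknown" if neither axis
    matches the reference better than the other.
    """
    reference = {gene.upper() for gene in reference_genes}

    def matching_fraction(names):
        names = list(names)
        if not names:
            return 0.0
        hits = sum(1 for name in names if str(name).upper() in reference)
        return hits / len(names)

    row_fraction = matching_fraction(row_names)
    col_fraction = matching_fraction(col_names)
    if col_fraction > row_fraction:
        return "cells_by_genes"
    if row_fraction > col_fraction:
        return "genes_by_cells"
    return "unknown"
```

If the result is "genes_by_cells", the matrix would need to be transposed before it is written to the h5ad file.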

Related documentation can be found here: “Creating h5ad input files in R & Python > 1.) If your data is stored as a csv file with sample names as columns and gene names in the first row:”

Alternatively, have you tried the manual copy-over of gene and cell identifiers described in “Reducing size of h5ad files in R & Python”? That step is missing from your write-up, and it may resolve the issue.

danielsf · December 5, 2023, 8:28pm

Also: if it fails again, can you post your run ID in this chat? The run ID should be a semi-random string like “1701448506122-6fa0c2b5-88b5-45ea-b02f-4071ea7bfe87”; it will show up in the “your run has failed” message on the MapMyCells website. Having that information will allow us to get the detailed error message and provide a more precise diagnosis of what is happening. Thanks.

skouneli · December 6, 2023, 12:30pm

The RunID is 1701802295018-6aa01aea-f605-4e28-a148-2afc22b1a979

And it looks like my column names are genes and my rows are the spot/“cell” identifiers (see picture)

Thanks for your help!

danielsf · December 6, 2023, 5:24pm

I downloaded your input file. I was unable to open it with the anndata library. I get

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/allen/aibs/technology/danielsf/miniconda3_230814/envs/cell_type_mapper/lib/python3.9/site-packages/anndata/_io/h5ad.py", line 197, in read_h5ad
    return read_h5ad_backed(filename, mode)
  File "/allen/aibs/technology/danielsf/miniconda3_230814/envs/cell_type_mapper/lib/python3.9/site-packages/anndata/_io/h5ad.py", line 128, in read_h5ad_backed
    f = h5py.File(filename, mode)
  File "/allen/aibs/technology/danielsf/miniconda3_230814/envs/cell_type_mapper/lib/python3.9/site-packages/h5py/_hl/files.py", line 567, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
  File "/allen/aibs/technology/danielsf/miniconda3_230814/envs/cell_type_mapper/lib/python3.9/site-packages/h5py/_hl/files.py", line 231, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 106, in h5py.h5f.open
OSError: Unable to open file (truncated file: eof = 101677282, sblock->base_addr = 0, stored_eof = 369387362)

which, based on cursory googling, is an indication that the file is somehow corrupted.

Possible causes I can think of:

Maybe the buffer had not been fully flushed when you saved the file to disk (the anndata library doesn’t always flush and close file objects when you think it ought to).
The file upload might have terminated prematurely.
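The two numbers in the OSError above say that the file is about 102 MB on disk (eof = 101677282) while its header expects about 369 MB (stored_eof = 369387362). The same comparison can be sketched with only the Python standard library. The byte offsets below follow my reading of the HDF5 file format specification's superblock layout and should be treated as an assumption, as should the requirement that the superblock starts at byte 0; `stored_eof` and `is_truncated` are hypothetical helper names.

```python
import os

HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def stored_eof(path):
    """Return the end-of-file address recorded in the HDF5 superblock.

    Assumes the superblock starts at byte 0 (no user block).
    """
    with open(path, "rb") as handle:
        head = handle.read(128)
    if head[:8] != HDF5_SIGNATURE:
        raise ValueError("no HDF5 signature at byte 0")
    version = head[8]
    if version in (0, 1):
        size_of_offsets = head[13]
        base = 24 if version == 0 else 28  # version 1 has 4 extra bytes
    elif version in (2, 3):
        size_of_offsets = head[9]
        base = 12
    else:
        raise ValueError(f"unsupported superblock version {version}")
    # From `base`: base address, one more address, then the EOF address.
    start = base + 2 * size_of_offsets
    return int.from_bytes(head[start:start + size_of_offsets], "little")


def is_truncated(path):
    """True if the file on disk is shorter than its header says it should be."""
    return os.path.getsize(path) < stored_eof(path)
```

Running `is_truncated` on the local copy before uploading would show whether the file was already incomplete when it was saved.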

Can you please:

1. Close out of your R/Python session, start a new session, and verify that you can open this file on your local machine.
2. Assuming that (1) works fine, try the upload again in case something went wrong during the file upload the first time?

skouneli · December 6, 2023, 7:07pm

Thanks for all the recommendations. I did 1 and 2 and I am still getting an error saying that the input file is not in the correct format (Run ID: 1701889390195-a3a109e1-3a09-4f6d-839d-c151566963c2)

A user above suggested manually copying over but I am not sure how to do that with my spatial Seurat object.

Thanks for any help/insights.

danielsf · December 6, 2023, 8:45pm

Same error on our end. h5py and anndata believe that your file is corrupted.

How large is the file on your system? The file that got uploaded is 166 MB
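To compare the local file with what arrived on the server, the size and a checksum of the local copy can be computed with the Python standard library. This is a minimal sketch; `file_fingerprint` is a hypothetical helper name, and whether the 166 MB figure above is in units of 10^6 or 2^20 bytes is not stated, so both are worth checking.

```python
import hashlib


def file_fingerprint(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex) for the file at `path`."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so large h5ad files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return size, digest.hexdigest()


def describe_size(size_in_bytes):
    """Format a byte count in both decimal (MB) and binary (MiB) megabytes."""
    return f"{size_in_bytes / 1e6:.1f} MB ({size_in_bytes / 2**20:.1f} MiB)"
```

A local size that is clearly larger than the uploaded size would point to the upload as the step that cuts the file short.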
