File Upload Error (Failed) in MapMyCells

mschro · May 7, 2024, 5:14pm

Good afternoon,

I am trying to upload an .h5ad file of size 3.6 GB with 79,263 cells and 25,262 genes. I converted the gene names to Ensembl IDs and the data type to a CSR matrix of NumPy float32 values, same as the example mouse .h5ad files. The error I’m getting is “Upload data failed” with “File Upload Error: Failed” but no further explanation. I checked all the input requirements carefully and am quite sure I’m meeting them. Any idea what else could be causing the error?

Many thanks,
Margaret

danielsf · May 7, 2024, 5:26pm

Hi @mschro

There is a subtlety to modern browsers we did not appreciate when writing our documentation. Currently, the file upload limit is 2 GB, not 4 GB. Sorry for the confusion. Let us know if you need help splitting your file into two chunks to get it down under the limit.
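For anyone who wants to do the split themselves, here is a minimal sketch, assuming the `anndata` Python package is installed and the file fits in memory. The helper names `chunk_bounds` and `split_h5ad` and the output naming scheme are hypothetical, chosen for illustration; they are not part of MapMyCells.

```python
import math


def chunk_bounds(n_cells, n_chunks):
    """Return (start, stop) row ranges covering n_cells in at most n_chunks pieces."""
    if n_cells < 1 or n_chunks < 1:
        raise ValueError("n_cells and n_chunks must both be at least 1")
    size = math.ceil(n_cells / n_chunks)
    return [(start, min(start + size, n_cells))
            for start in range(0, n_cells, size)]


def split_h5ad(in_path, out_prefix, n_chunks=2):
    """Write consecutive blocks of cells (rows) to separate .h5ad files."""
    # Third-party dependency, imported here so chunk_bounds runs without it.
    import anndata

    adata = anndata.read_h5ad(in_path)  # loads the whole file into memory
    out_paths = []
    for i, (start, stop) in enumerate(chunk_bounds(adata.n_obs, n_chunks)):
        out_path = f"{out_prefix}_part{i}.h5ad"  # hypothetical naming scheme
        # Slicing rows keeps every gene; .copy() turns the view into a real object.
        adata[start:stop].copy().write_h5ad(out_path, compression="gzip")
        out_paths.append(out_path)
    return out_paths
```

Calling `split_h5ad('path/to/input.h5ad', 'path/to/output')` would produce two files, each holding roughly half of the cells and all of the genes. Whether each piece lands under the upload limit still has to be checked on disk.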

Again: apologies for the confusion.

Cheers,

Scott Daniel

mschro · May 7, 2024, 6:37pm

Thanks for the quick response, Scott! One more question: have you found it makes a difference in the mapping if the number of genes is downsampled such that only the X (5,000 or 10,000) most highly variable are used? I am considering shrinking the gene list rather than cutting down on the number of cells mapped at once.

danielsf · May 7, 2024, 8:01pm

MapMyCells compares your gene list to a list of blessed marker genes and only uses genes that occur in both lists. We haven’t publicized the list of marker genes, so I would not recommend cutting genes from your dataset, just in case the genes you cut are marker genes.

I just looked over your original message. That amount of data does not seem very large to me. It feels like you should be able to get it into the 2GB limit.

If you are using python, have you tried writing to h5ad with compression, i.e.

my_anndata_object.write_h5ad(
    'path/to/output.h5ad',
    compression='gzip',
    compression_opts=4)

Also, if your data is raw counts (i.e. just integers), you can try saving it as a matrix of unsigned 16-bit integers (np.uint16 in Python), provided no count exceeds 65,535. That’s a quick 25% savings: a CSR matrix stores a column index alongside every nonzero value, so only the half of the data that holds the values shrinks.
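The 25% figure can be checked with a little arithmetic, assuming the CSR matrix uses 32-bit column indices and ignoring the comparatively small row-pointer array. The sketch below works through that estimate and shows one cautious way to do the cast with NumPy and SciPy. The function names are hypothetical, and the checks reflect an assumption that the matrix holds raw, non-negative integer counts.

```python
def csr_bytes_per_nonzero(value_bytes, index_bytes=4):
    """Approximate storage per nonzero entry: one value plus one column index."""
    return value_bytes + index_bytes


def uint16_savings(index_bytes=4):
    """Fractional size reduction from storing values as uint16 instead of float32."""
    before = csr_bytes_per_nonzero(4, index_bytes)  # float32 value + index
    after = csr_bytes_per_nonzero(2, index_bytes)   # uint16 value + index
    return 1 - after / before


def to_uint16_counts(matrix):
    """Return a CSR copy of `matrix` with uint16 values, refusing lossy casts."""
    # Third-party dependencies, imported here so the helpers above run without them.
    import numpy as np
    from scipy import sparse

    csr = sparse.csr_matrix(matrix)
    values = csr.data
    if values.size:
        if values.min() < 0 or values.max() > np.iinfo(np.uint16).max:
            raise ValueError("values fall outside the uint16 range 0..65535")
        if not np.array_equal(values, np.round(values)):
            raise ValueError("values are not whole numbers; not raw counts?")
    return csr.astype(np.uint16)
```

With an AnnData object this would be used as `my_anndata_object.X = to_uint16_counts(my_anndata_object.X)` before calling `write_h5ad`. The range and integer checks are there because casting normalized or very large values to uint16 would silently corrupt them.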

mschro · May 7, 2024, 8:28pm

Good to know about the marker genes. I’ll keep all the genes then.

I have not tried with compression, but I did switch to uint16 and now I’m able to process my data in manageable batches of <40K cells each (it’s working quite nicely!).

Thanks for all the help!
