H5 to h5ad conversion help


I am quite new at this, and I only have .h5 files, so your instructions for making h5ad files do not work for me. Instead I tried to make an h5ad file with the Seurat and SeuratDisk (remotes::install_github("mojaveazure/seurat-disk")) R packages and got this output from h5ls:

               group         name       otype  dclass      dim
0                  /            X   H5I_GROUP                 
1                 /X         data H5I_DATASET   FLOAT 52146466
2                 /X      indices H5I_DATASET INTEGER 52146466
3                 /X       indptr H5I_DATASET INTEGER    61183
4                  /          obs   H5I_GROUP                 
5               /obs __categories   H5I_GROUP                 
6  /obs/__categories   orig.ident H5I_DATASET  STRING        1
7               /obs       _index H5I_DATASET  STRING    61182
8               /obs   nCount_RNA H5I_DATASET   FLOAT    61182
9               /obs nFeature_RNA H5I_DATASET INTEGER    61182
10              /obs   orig.ident H5I_DATASET INTEGER    61182
11                 /         obsm   H5I_GROUP                 
12                 /          raw   H5I_GROUP                 
13              /raw            X   H5I_GROUP                 
14            /raw/X         data H5I_DATASET   FLOAT 52146466
15            /raw/X      indices H5I_DATASET INTEGER 52146466
16            /raw/X       indptr H5I_DATASET INTEGER    61183
17              /raw          var   H5I_GROUP                 
18          /raw/var       _index H5I_DATASET  STRING    15764
19                 /          var   H5I_GROUP                 
20              /var       _index H5I_DATASET  STRING    15764
21              /var     features H5I_DATASET  STRING    15764
22                 /         varm   H5I_GROUP

When I tried to run your code on my data, I got as far as this:

counts  <- h5read("X.h5ad", "/X/indices")
genes   <- h5read("X.h5ad","/obs/nFeature_RNA")
samples <- h5read("X.h5ad","/obs/_index")
colnames(counts) <- as.character(genes)
Error in `colnames<-`(`*tmp*`, value = c("715", "491", "1069", "514",  : 
  attempt to set 'colnames' on an object with less than two dimensions

So then I tried to run the mapping with that file anyway, but it failed.

Here is the run ID
Run ID: 1709982686767-adc6c8dc-5caf-42fb-b996-474d7cc3ce5b

I can attach the log if you need it, but the error is:

Mapping Failed. Use log files for troubleshooting MapMyCells issues. Post them [in the community forums](https://community.brain-map.org/c/how-to/mapmycells/20) for further assistance. Mapping algorithm failed because of application errors. Please confirm that your input data is in cell (rows) by gene (columns) format.

I am not really sure how to proceed. Could you give some instructions on how to make an h5ad file from an h5 file?
Also, there is an error message that the species cannot be found for some of the genes, but that is because my data is not from human or mouse. Is it not possible to use data from other species?

I hope that is enough information. I will keep trying in the meantime. Thanks.

Update. I made a different .h5ad file in a different way but I still got an error message when I tried to upload. I am not sure what is wrong with it this time though…

Here is the run ID

Run ID: 1710169576028-ee17415c-6b3f-4c77-b2a1-4dbde5bde60d

Hi @Pycnopodia, I suspect your problem is one of two things: (1) improper conversion from h5 to h5ad or (2) improper conversion between species.

For #1, h5 files come in many flavors, but it looks like the info you need is stored in yours. It also looks like you’ve tried a couple of methods, but here are a couple more that might help: Gene expression matrix.cvs is too large to load it - #17 by jeremyinseattle. Once you have your gene expression matrix (either sparse or dense) read into a matrix, as well as gene names and sample names read into separate variables, you can follow the steps here to write out the h5ad file: File Requirements and Limits - brain-map.org. It should be okay to skip the step where you give the count matrix row names and column names (I think).
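To make the "matrix plus gene names plus sample names" step concrete, here is a minimal sketch in R. It assumes the file uses the compressed sparse row (CSR) layout, which the h5ls listing suggests: /X/indptr has 61183 entries, one more than the 61182 cells in /obs/_index. Under that assumption, /X/indices is only the vector of column positions, which is why setting colnames on it fails, and /obs/nFeature_RNA is a per-cell count rather than a list of gene names (those appear to be in /var/_index). The toy vectors below stand in for the h5read() results so the example runs without the file; the variable names are illustrative.

```r
library(Matrix)

# Toy stand-ins for what rhdf5::h5read() would return from the real file, e.g.
#   x_data    <- h5read("X.h5ad", "/X/data")
#   x_indices <- h5read("X.h5ad", "/X/indices")
#   x_indptr  <- h5read("X.h5ad", "/X/indptr")
#   samples   <- h5read("X.h5ad", "/obs/_index")
#   genes     <- h5read("X.h5ad", "/var/_index")
x_data    <- c(5, 1, 2, 7)   # non-zero expression values
x_indices <- c(0, 2, 1, 2)   # 0-based gene (column) index of each value
x_indptr  <- c(0, 2, 3, 4)   # row k owns values indptr[k] .. indptr[k + 1] - 1
samples   <- c("cell1", "cell2", "cell3")
genes     <- c("geneA", "geneB", "geneC")

# Assumption check: CSR with cells as rows has one pointer per cell, plus one.
stopifnot(length(x_indptr) == length(samples) + 1)

# Rebuild the cell (rows) by gene (columns) sparse matrix.
# index1 = FALSE because the stored indices are 0-based.
counts <- sparseMatrix(
  j        = as.integer(x_indices),
  p        = as.integer(x_indptr),
  x        = as.numeric(x_data),
  dims     = c(length(samples), length(genes)),
  index1   = FALSE,
  dimnames = list(samples, genes)
)

print(dim(counts))
```

If the stopifnot() check fails on real data, the matrix may be stored gene-by-cell or in a column-compressed layout instead, and the dims and pointer arguments would need to be swapped accordingly.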

For #2, if you are using a species other than mouse (if mapping to whole mouse brain) or human (if mapping to human MTG), you’ll need to convert the provided gene names to the corresponding mouse or human ortholog. We provide a convenient R package to do this conversion here: GeneOrthology/README.md at main · AllenInstitute/GeneOrthology · GitHub.

Please post again if this doesn’t solve your issue.

I’ve looked at the error logs and can confirm that the reason the code failed is Jeremy’s #2 above. Currently, MapMyCells only knows about human and mouse marker genes (depending on the taxonomy you are mapping to). Genes from any other species will cause the code to fail with the message it gave you (which you should be able to see in validation_log.txt).


Ok thanks. I didn’t realise that would be a problem. I will try the GeneOrthology package soon, but I have a meeting today so I will look at it tomorrow or later in the week. At a glance it seems possible, since my data is from dog. Can I keep using the h5ad file I already have, or do I have to start from the beginning?

I would suggest replacing the dog gene names in the h5ad file with the corresponding human/mouse gene names and trying it. If you still get an error, then revisit whether updates to the h5ad format are needed.
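As a rough sketch of that renaming step, the R snippet below swaps column names using a lookup table. Everything here is hypothetical: the ortholog_table contents, the placeholder gene names, and the helper rename_to_orthologs() are made up for illustration, and in practice the table would come from an ortholog resource such as the GeneOrthology package mentioned above. Genes with no ortholog are dropped, and when two source genes map to the same target only the first is kept, which is just one possible policy.

```r
# Hypothetical lookup table with placeholder names (not real gene symbols).
ortholog_table <- data.frame(
  dog   = c("dogGene1", "dogGene2", "dogGene4"),
  mouse = c("Gene1", "Gene2", "Gene2"),
  stringsAsFactors = FALSE
)

# Illustrative helper: rename the columns (genes) of a cell-by-gene matrix.
rename_to_orthologs <- function(counts, from, to) {
  new_names <- to[match(colnames(counts), from)]
  # Keep genes that have an ortholog, and only the first of any duplicates.
  keep <- !is.na(new_names) & !duplicated(new_names)
  out <- counts[, keep, drop = FALSE]
  colnames(out) <- new_names[keep]
  out
}

# Toy cell (rows) by gene (columns) matrix standing in for the real counts.
counts <- matrix(
  1:8, nrow = 2,
  dimnames = list(c("cell1", "cell2"),
                  c("dogGene1", "dogGene2", "dogGene3", "dogGene4"))
)

mapped <- rename_to_orthologs(counts, ortholog_table$dog, ortholog_table$mouse)
print(colnames(mapped))
```

The same indexing works on a sparse matrix from the Matrix package, so the existing count data can be reused and only the gene names need to change before the h5ad file is written out again.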