Mapping failed due to application errors

Hi!

We’ve been trying to map sci-RNA-seq data (generated via the Parse Biosciences platform), but have been running into issues at the mapping stage. The tool says “Mapping Failed. Mapping algorithm failed because of application errors.”
Here is some additional info:

  • Operating system and web browser: Apple MacOS 15.0.1 and Chrome
  • Run ID: 1740693429486-3616e145-8ed8-481f-a16b-a686b78d654e
  • Information on the data: This is human single-cell combinatorial-indexing RNA-seq data generated using the Parse platform. The Parse pipeline generates an h5ad file. However, since that file uses gene names, I converted those to ENSEMBL IDs in R (using the gene names csv file). The final AnnData object has n_obs x n_vars = 4034 x 62710, with a total size of 34.4MB.
  • Instructions on how to reproduce the issue: I uploaded the h5ad to the site and used the 10x Human MTG SEA-AD as the reference taxonomy with Deep Generative Mapping (I have also tried 10x Whole Human Brain with Hierarchical mapping). The job proceeds through ‘input file validation’ and then errors.

Any suggestions/assistance would be greatly appreciated! Thanks.

Hi @ubhaskar

The specific error message produced by your job is

Validation error: e=AttributeError("'Categorical' object has no attribute 'index'"), type(e)=<class 'AttributeError'>, fname='run.py', lineno=145
Traceback (most recent call last):
  File "/apps/run.py", line 145, in run
    runner.run()
  File "/usr/local/lib/python3.10/site-packages/cell_type_mapper/cli/validate_h5ad.py", line 228, in run
    result_path, has_warnings = validate_h5ad(
  File "/usr/local/lib/python3.10/site-packages/cell_type_mapper/validation/validate_h5ad.py", line 105, in validate_h5ad
    result = _validate_h5ad(
  File "/usr/local/lib/python3.10/site-packages/cell_type_mapper/validation/validate_h5ad.py", line 188, in _validate_h5ad
    was_transposed) = _transpose_file_if_necessary(
  File "/usr/local/lib/python3.10/site-packages/cell_type_mapper/validation/validate_h5ad.py", line 494, in _transpose_file_if_necessary
    var_species = detect_species(var.index.values)
AttributeError: 'Categorical' object has no attribute 'index'

Which means that the var element in your h5ad isn’t a dataframe, but a Categorical (I did not even know it was possible to create an anndata file whose obs and var were not dataframes).
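To make the traceback concrete, here is a small sketch using only pandas (the gene IDs are made-up stand-ins, not taken from your file). It shows that a Categorical has no .index attribute while a dataframe does, which is the difference the validator trips over:

```python
import pandas as pd

# Illustrative gene IDs only; any list of strings behaves the same way
gene_ids = ["ENSG00000141510", "ENSG00000012048"]

as_categorical = pd.Categorical(gene_ids)
as_dataframe = pd.DataFrame(index=pd.Index(gene_ids, name="gene_id"))

# A dataframe exposes its row labels through .index
print(list(as_dataframe.index.values))

# A Categorical does not, so accessing .index raises AttributeError
try:
    as_categorical.index
    message = None
except AttributeError as err:
    message = str(err)
print(message)
```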

Can you recreate your h5ad file with var as a full dataframe whose index is whatever column has the ENSEMBL IDs in it? This would look something like

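import anndata
import pandas

# Load the original file; per the traceback, its var is a Categorical
src = anndata.read_h5ad('/original/h5ad/file.h5ad')
old_var = src.var

# Wrap the gene IDs in a dataframe and make them its index
new_var = pandas.DataFrame({'gene_id': old_var}).set_index('gene_id')

# Build a new AnnData from the dataframe var plus the original obs and X
dst = anndata.AnnData(
    var=new_var,
    obs=src.obs,
    X=src.X
)
dst.write_h5ad('/path/to/fixed/file.h5ad')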

(Make the same change to obs if obs also turns out not to be a dataframe.)
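Before re-uploading, it may be worth checking locally that obs and var really are dataframes. The following is a minimal sketch assuming only pandas. The helper name axis_problems is hypothetical (it is not part of anndata or cell_type_mapper), and the "ENSG" prefix test is a rough heuristic for human ENSEMBL IDs, not the validator's actual species detection:

```python
import pandas as pd


def axis_problems(name, axis, id_prefix=None):
    """Return a list of readable problems found in an obs/var-like object.

    Hypothetical helper for illustration only.
    """
    problems = []
    # The validator expects a dataframe; anything else is reported and we stop
    if not isinstance(axis, pd.DataFrame):
        problems.append(f"{name} is a {type(axis).__name__}, not a DataFrame")
        return problems
    if not axis.index.is_unique:
        problems.append(f"{name} index has duplicate values")
    if id_prefix is not None:
        # Count index labels that do not look like the expected identifier
        n_bad = sum(not str(v).startswith(id_prefix) for v in axis.index)
        if n_bad:
            problems.append(
                f"{n_bad} {name} index values do not start with {id_prefix!r}"
            )
    return problems


# Toy stand-ins for var before and after the fix (IDs are illustrative)
broken_var = pd.Categorical(["ENSG00000141510", "ENSG00000012048"])
fixed_var = pd.DataFrame({"gene_id": broken_var}).set_index("gene_id")

print(axis_problems("var", broken_var, id_prefix="ENSG"))
print(axis_problems("var", fixed_var, id_prefix="ENSG"))
```

With a real file, the same function could be called on the var and obs attributes of the object returned by anndata.read_h5ad, leaving id_prefix unset for obs.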

Sorry. I just realized my example code is in python and you explicitly called out R in your post. I do not know R. Ping back here if you are unclear how to proceed and I will tag in someone who does.

For what it’s worth: I just tested the DeepGenerative mapping algorithm on data I knew it could handle, and there is a different failure caused by a mismatch between versions of scvi_tools and anndata that snuck in under our radar. We are discussing the best fix. I will post here when the fix has been deployed.

Thanks so much, @danielsf for the quick response.

I think changing both the var and obs elements to dataframe objects fixes the issue. But you are right: with the DeepGenerative algorithm, I did run into the scvi_tools/anndata error that you mentioned.

I ended up using Hierarchical mapping instead, and the mapping now runs to completion.

Thanks!

@ubhaskar
For what it’s worth: the DeepGenerative mapping algorithm should be fixed now

Thank you! I’ll check that out too.