Run ID: 1734750680278-c5899a74-16a9-4fb4-9b16-6f2fd6068687
I am unable to see or download the log file even when I allow pop ups in my browser settings.
I am unable to understand what went wrong with the validation of my h5ad file.
Any help appreciated.
Sorry for the late reply. Lots of end-of-the-year vacations prevented anyone from seeing your post.
I cannot find an error message in our AWS logs. Furthermore, your job appears to have taken only 30 seconds from start to failure, which makes me think that uploading the file to MapMyCells is the step that failed. Can you just try again and see what happens?
Is it possible you had the .h5ad file open in another terminal session/notebook while you were trying to upload it? Sometimes that poses a problem, as anndata tries to keep the file open as long as any session representing that file is live. That can create odd access patterns on your computer.
@hchintalapudi if you are still having trouble with this:
May I ask what the name of the file you tried to upload for mapping was? Specifically, did the word ‘validated’ occur in the file name?
Hi,
Thank you for your response.
I am still unable to run it successfully, and I am also unable to download the logs despite allowing pop-ups in Chrome.
The file name is “combined_adata_for_MapMyCells.h5ad”
Run ID: 1736792289608-b09a67d3-c604-4f9b-be8f-e9cffc4e50a9
May I download your data file from S3 (I can see it in our pipeline bucket)? I would like to inspect its format to see if there is something unique about it that is preventing it from running.
Please do, thank you!
Your data failed to map because MapMyCells was not able to uniquely identify the genes in your h5ad file. MapMyCells works in ENSEMBL IDs. If a user submits data that identifies genes with gene symbols, it does its best to map those symbols to ENSEMBL IDs. Your data contains both symbols and ENSEMBL IDs. Unfortunately, some of the symbols map to ENSEMBL IDs that are already listed in your dataset (or, in some cases, MapMyCells has mapped two symbols to the same ENSEMBL ID). This confusion, in which the data that actually goes into the mapping step contains two entries for one ENSEMBL ID, is causing MapMyCells to crash.
I ran the “map to ENSEMBL IDs” step on my local machine and will attach the error message here so you can see which gene symbols/IDs are causing a problem. The safest solution is for you to recreate the file identifying every gene with an ENSEMBL ID, making sure that each ENSEMBL ID is listed only once. Alternatively, you can go through the list of degenerate genes and decide which to keep and which to remove from your file.
Note: MapMyCells only cares about the values in the index of your h5ad file (i.e. var.index.values). It does not care about the other columns in var.
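If it helps to see the failure mode concretely, here is a minimal sketch of how two entries in the gene index can end up on the same ENSEMBL ID. This is not MapMyCells' actual code. The function name find_id_collisions, the toy symbol-to-ID table, and the made-up identifiers are all hypothetical, and treating anything that starts with "ENS" as an ENSEMBL ID is a simplifying assumption.

```python
from collections import defaultdict

# Assumption: ENSEMBL gene IDs start with "ENS" (e.g. ENSG..., ENSMUSG...).
# This is a heuristic for illustration, not the rule MapMyCells applies.
ENSEMBL_PREFIX = "ENS"


def find_id_collisions(identifiers, symbol_to_ensembl):
    """Return {ensembl_id: [original identifiers]} for every ENSEMBL ID
    that is reached by more than one entry in `identifiers`."""
    sources = defaultdict(list)
    for name in identifiers:
        if name.startswith(ENSEMBL_PREFIX):
            # Already an ENSEMBL ID; keep it as-is.
            ensembl_id = name
        else:
            # Treat it as a gene symbol and look it up.
            ensembl_id = symbol_to_ensembl.get(name)
            if ensembl_id is None:
                continue  # unmapped symbol; cannot collide with anything
        sources[ensembl_id].append(name)
    # Keep only the IDs that more than one identifier resolved to.
    return {eid: names for eid, names in sources.items() if len(names) > 1}


# Toy data with made-up symbols and IDs (hypothetical, for illustration only).
toy_mapping = {
    "GeneA": "ENSG00000000001",
    "GeneA_alias": "ENSG00000000001",  # two symbols, one ENSEMBL ID
    "GeneB": "ENSG00000000002",
}
toy_index = [
    "GeneA",
    "GeneA_alias",
    "ENSG00000000002",  # ID listed directly...
    "GeneB",            # ...and again via its symbol
    "ENSG00000000003",
]

collisions = find_id_collisions(toy_index, toy_mapping)
for ensembl_id, names in collisions.items():
    print(ensembl_id, "<-", names)
```

On a real file you could pass the gene index in place of toy_index, for example adata.var.index after loading the file with anndata.read_h5ad. The symbol-to-ID table would have to come from a gene reference you trust. Every ID the function reports needs to be reduced to a single entry before the file is uploaded again.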
Why couldn’t you download the validation log?
I’m not 100% sure. There are a lot of degenerate genes. My working hypothesis is that the error log may have exceeded some file size limit we were unaware of in our infrastructure. I took your data and downsampled it to 5 pairs of degenerate genes. The mapping failed and I was able to download the validation log for this smaller dataset. I will consult with our other engineers and see if I am right/what we can do about this.
For now, I hope the attached error log gets you unstuck. Thank you for your patience.
repeated_genes.txt (358.3 KB)