Hello,
I created a query dataset from the SEA-AD snRNA-seq MTG data, starting from the raw FASTQs on Sage Synapse (donors with a clinical consensus diagnosis of Alzheimer's disease or Control). I ran Cell Ranger (with introns included) and performed QC. The reference I am using is the MTG final_nuclei reference provided in the AWS registry by Gabitto et al.
A large portion of the query cells won't match the reference.
(Top) Query + reference. (Bottom) Reference only (same latent space).
I'm using this for my scVI model:
scvi.model.SCVI.setup_anndata(adata, batch_key="libraryBatch", layer="counts",
categorical_covariate_keys=["individualID", "sex"],
continuous_covariate_keys=["age_numeric"])
vae = scvi.model.SCVI(adata, n_latent=30)
And this for my scANVI model:
lvae = scvi.model.SCANVI.from_scvi_model(vae, adata=adata,
unlabeled_category="Unknown",
labels_key="subclass_label")
lvae.train(max_epochs=100, early_stopping=True, n_samples_per_label=100)
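For anyone reproducing this: the labels_key / unlabeled_category arguments above rely on a label column in which reference cells carry their subclass and query cells carry the placeholder "Unknown". Below is a minimal sketch of that convention in plain Python. It assumes the reference cells come first in the concatenated object, and build_scanvi_labels is a hypothetical helper for illustration, not part of scvi-tools.

```python
from typing import List, Sequence

# Placeholder for unlabeled cells; must match the unlabeled_category
# string passed to SCANVI.from_scvi_model ("Unknown" in this post).
UNLABELED = "Unknown"


def build_scanvi_labels(
    ref_labels: Sequence[str],
    n_query: int,
    unlabeled: str = UNLABELED,
) -> List[str]:
    """Hypothetical helper: labels for reference cells followed by query cells.

    Reference cells keep their subclass label; every query cell receives
    the unlabeled placeholder so scANVI treats it as unannotated.
    """
    if n_query < 0:
        raise ValueError("n_query must be non-negative")
    if unlabeled in set(ref_labels):
        # A real class sharing the placeholder name would be silently
        # treated as unlabeled, so fail early instead.
        raise ValueError(f"{unlabeled!r} is already used as a reference label")
    return list(ref_labels) + [unlabeled] * n_query


# Toy example: three reference cells followed by two query cells.
labels = build_scanvi_labels(["Pvalb", "Sst", "Astrocyte"], n_query=2)
print(labels)
```

In practice the resulting list would be written to the subclass_label column of the concatenated object's obs table before calling setup_anndata (again assuming reference-first ordering).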
I have added and removed various other categorical covariates to try to fix this, but the result is the same. The cluster at the top left of the query + reference UMAP also gets assigned a different cell type after subtle changes to the categorical covariates.
Any help/suggestions would be appreciated.

