I am the Scientific Project Manager for the SEA-AD Consortium. I am posting a community question received via email related to working with SEA-AD data:
I have used some of the SEA-AD Allen Institute datasets in an analysis related to power in differential expression for scRNA-seq data (with the hope of publishing it down the line).
I have a question about the datasets provided at https://cellxgene.cziscience.com/collections/1ca90a2d-2943-483d-b678-b809bf464c30. Your paper (https://doi.org/10.1101/2023.05.08.539485) states that the data come from 84 donors, so I expected each cell type to have 84 samples. However, when I download the files from that link, I get a range of sample numbers (usually 89, but often 88 or 87, with only one cell type having 84 samples).
Would you have any idea why this could be? I downloaded each dataset from the link above (in .rds format), loaded it with “readRDS()”, and converted it to an SCE object with the Seurat “as.SingleCellExperiment()” function. I then counted the unique Donor IDs in each dataset, which is how I produced the “numSamples” column below (please disregard the other columns). Could you please guide me on this if possible?
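
For reference, the counting step described above can be sketched roughly as follows. This is a minimal sketch on toy data, not the exact script used: the `donor_id` column name, the toy donor IDs, the file path in the comments, and the helper names `count_donors` and `extra_donors` are all assumptions for illustration.

```r
# Toy stand-in for the per-cell metadata of one downloaded dataset.
# With a real file, the metadata might be obtained along these lines
# (hypothetical path; assumes Seurat and SummarizedExperiment are installed):
#   obj  <- readRDS("path/to/dataset.rds")
#   sce  <- Seurat::as.SingleCellExperiment(obj)
#   meta <- as.data.frame(SummarizedExperiment::colData(sce))
meta <- data.frame(
  donor_id = c("D1", "D1", "D2", "D3", "D3", "R1"),  # assumed column name
  stringsAsFactors = FALSE
)

# Number of distinct donor IDs among the cells of one dataset.
count_donors <- function(meta, donor_col = "donor_id") {
  length(unique(as.character(meta[[donor_col]])))
}

# Donor IDs that appear in the dataset but not in an expected list.
extra_donors <- function(meta, expected, donor_col = "donor_id") {
  setdiff(unique(as.character(meta[[donor_col]])), expected)
}

# Hypothetical stand-in for the list of 84 study donors.
expected <- c("D1", "D2", "D3")

num_samples <- count_donors(meta)
unexpected  <- extra_donors(meta, expected)
print(num_samples)
print(unexpected)
```

Comparing the donor IDs in each dataset against a list of the expected donors, as `extra_donors` does here, would show which IDs account for counts above 84.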