Hello, @jeremyinseattle and @tylermo!
I’m reaching out with a few questions about handling the Whole Cortex and Hippocampus 10X dataset in Seurat:
- Complete Dataset in Seurat: Has there been any success in loading the entire dataset into a Seurat object since its release? If so, is this object accessible for use? I ask because it was alluded to earlier here that it might be possible to do that using Python with more memory. I’m under the impression that working with this dataset becomes significantly easier once it’s in a Seurat object format. Please let me know if my understanding is incorrect.
- Label Transfer: For cell type annotation in our recent cortical snRNAseq data, I’ve been effectively using MapMyCells. To complement this, I’m exploring Seurat’s TransferData function to transfer labels onto our dataset AND get prediction confidence scores. This approach would mirror MapMyCells’ functionality but offer more flexibility with parameter adjustments if needed. For this, I need both our data and the whole cortex and hippocampus dataset in Seurat object form, hence my initial query above.
As per @jeremyinseattle’s suggestion, I understand that subsetting the data might be a practical solution, but I have some reservations:
- Best Practices for Subsetting: I’m seeking guidance on the most effective subsetting techniques. A key concern is the potential exclusion of rare cell types or crucial data elements, and I wonder how significant this issue might be. Are there established criteria or methods to ensure that a randomly chosen subset accurately mirrors the entire dataset? It might perhaps depend on the question at hand, but I’d love to hear your views on this. Additionally, I’m interested in determining the optimal number of cells for a subset that balances practicality with comprehensiveness. Is there a recommended ‘sanity check’ or verification process to confirm the validity and reliability of findings derived from subsetted data?
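To make the subsetting question concrete, here is a minimal sketch of one approach I have been considering: instead of a purely random subset, cap the number of cells kept per cell type, so that abundant types are downsampled while rare types are kept whole. This is plain Python on a list of per-cell labels. The function names, the cap of 500, and the toy label counts are my own hypothetical choices, not anything taken from the dataset or from Seurat.

```python
import random
from collections import Counter, defaultdict

def capped_subsample(labels, max_per_label, seed=0):
    """Return sorted cell indices, keeping at most max_per_label cells per label."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    keep = []
    for idxs in by_label.values():
        if len(idxs) <= max_per_label:
            keep.extend(idxs)  # rare type: keep every cell
        else:
            keep.extend(rng.sample(idxs, max_per_label))  # abundant type: downsample
    return sorted(keep)

def label_coverage(labels, keep):
    """Map each label to (cells kept, cells in full dataset) as a sanity check."""
    full = Counter(labels)
    sub = Counter(labels[i] for i in keep)
    return {label: (sub.get(label, 0), n) for label, n in full.items()}

# Toy labels standing in for the real per-cell annotations (counts are made up).
labels = ["L2/3 IT"] * 5000 + ["Pvalb"] * 800 + ["Meis2"] * 12

keep = capped_subsample(labels, max_per_label=500)
coverage = label_coverage(labels, keep)

# Any label with zero kept cells would signal that a cell type was lost.
missing = [label for label, (kept, _) in coverage.items() if kept == 0]
```

The coverage table is the kind of sanity check I mean: it shows per cell type how many cells survived, and `missing` should be empty. What I don't know is whether a per-type cap like this is the recommended practice, or how to choose the cap.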
I’d really appreciate your insights!
Thank you!
Best,
Sai