A9 all nuclei consolidated dataset

Rajesh · June 30, 2026, 8:40am

Dear team,
I noticed that the SEA-AD consortium recently released an updated version of the consolidated A9 dataset (https://sea-ad-single-cell-profiling.s3.amazonaws.com/PFC/RNAseq/SEAAD_DFC_RNAseq_all-nuclei.2026-06-22.h5ad). The cell counts differ across donors—particularly H20.33.001, H20.33.005, H20.33.008, H20.33.016, H20.33.019, H20.33.026, and H20.33.027—compared with the 2024 dataset. In all cases except H20.33.016, the new dataset appears to include all previously available cells plus additional cells that were not present before. I also noticed that the updated dataset is missing three useful metadata columns: “Class Confidence,” “Sub Class Confidence,” and “Supertype Confidence.” Could you help me understand the differences between these datasets and advise which version is recommended for use?

Thanks in advance.

Best regards,
Rajesh

kyle.travaglini · June 30, 2026, 9:28pm

Hello @Rajesh ,

The new objects should be used. They differ because we re-analyzed the MTG and DFC datasets in the context of the additional allo- and neo-cortical regions. I can share the manuscript that describes the changes when it comes out, but the codebase is here: AllenInstitute/SEA-AD_Multiregion_2026 on GitHub. As noted in the Changelog, the cell type taxonomy is the same except for an expansion to include types found outside of MTG. Because we re-mapped the data, small numbers of cells will switch types (mostly those on the border between two groups), and some will shift between pass and fail at the quality control cutoffs.

While compiling the dataset, we realized that a small fraction of libraries (12 of ~900) appeared swapped after looking at the whole genome sequencing and the variants in RNA reads. They are listed in SEA-AD_Multiregion_2026/00_data_curation/00_single_nucleus_multiome/06_finalize_data_assets/mixup_investigation_02-14-2025.csv on the main branch of the AllenInstitute/SEA-AD_Multiregion_2026 repository on GitHub. The new MTG and DFC objects have swapped these libraries to their correct donors/brain regions. The others were caught prior to release.

Finally, we decided to remove Class/Subclass/Supertype confidence from the main objects as we have found they are not particularly well calibrated (they do tend to be lower in mis-called types in reference benchmarking, but cell type abundance in the reference has a big influence on the final number). They are still available for those who want to find them here: AWS S3 Explorer and AWS S3 Explorer in the iterative_scANVI output files.

Best, Kyle

Rajesh · July 1, 2026, 7:20am

Dear Kyle,

Thanks for the prompt and detailed response.

I have a related query: Is the 2024 consolidated ATAC MTG object still good to use? Will you be releasing a consolidated PFC ATAC object as well?

Thanks.

Best regards,

Rajesh

Rajesh · July 1, 2026, 7:46am

Dear Kyle,

Thanks for the response. I have a related query: I see that the iterative_scANVI file is still labelled “2024” even though it is within the 2026 folder and the time stamp shows that it has been recently modified. Should I assume that this file has been updated for the corrected library information?

Thanks.

Best regards,

Rajesh

kyle.travaglini · July 1, 2026, 8:24pm

Hi @Rajesh ,

We are working on an updated ATACseq object for MTG and will release data for the others (timeline is TBD, but it's a high priority for us). The fragment files for all regions are on the ADKP if you would like to access the data sooner. As with the RNAseq objects, the MTG ATACseq object will be changed a little bit, but mostly at the margins.

The iterative_scANVI files indeed use the older version of the data before swap corrections. The safest thing to do is match on the cell ids, which for SEA-AD are [Barcode]-[library_prep]-[ar_id]. Library_prep is our ID for each sequencing library and ar_id is our ID for the subsequent genome alignment.
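For anyone wanting to try the cell id matching described above, here is a minimal sketch in pandas. It is not the SEA-AD pipeline: the function names, the toy cell ids, and the "Donor ID" column are made up for illustration, and it assumes both tables are indexed by the cell id and that library_prep and ar_id contain no hyphens.

```python
import pandas as pd


def split_cell_id(cell_id):
    """Split '[Barcode]-[library_prep]-[ar_id]' into its three parts.

    Splits from the right so a hyphen inside the barcode is tolerated.
    Assumption: library_prep and ar_id themselves contain no hyphens.
    """
    barcode, library_prep, ar_id = cell_id.rsplit("-", 2)
    return barcode, library_prep, ar_id


def attach_scores(new_obs, old_scores, columns):
    """Copy `columns` from old_scores onto new_obs, aligned on the cell id index.

    Cells absent from old_scores get NaN; cells only in old_scores are dropped.
    """
    merged = new_obs.copy()
    for col in columns:
        merged[col] = old_scores[col].reindex(merged.index)
    n_shared = len(new_obs.index.intersection(old_scores.index))
    return merged, n_shared


# Toy data with made-up cell ids (hypothetical, not real SEA-AD ids).
new_obs = pd.DataFrame(
    {"Donor ID": ["H20.33.001", "H20.33.001", "H20.33.005"]},
    index=["AAAC-LIB01-1001", "AAAG-LIB01-1001", "AAAT-LIB02-1002"],
)
old_scores = pd.DataFrame(
    {"Class Confidence": [0.98, 0.75, 0.60]},
    index=["AAAC-LIB01-1001", "AAAG-LIB01-1001", "CCCC-LIB03-1003"],
)

merged, n_shared = attach_scores(new_obs, old_scores, ["Class Confidence"])
print(n_shared)
print(merged)
```

With real data, new_obs would presumably be the obs table of the 2026 h5ad object and old_scores a table built from the iterative_scANVI output, provided both are indexed by the full cell id. Cells that are new in the 2026 object simply end up with missing confidence values.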

Best, Kyle

kyle.travaglini · July 2, 2026, 6:50pm

The manuscript describing the updated dataset is here: https://www.biorxiv.org/content/10.64898/2026.07.01.734821v1.full.pdf

Rajesh · July 3, 2026, 5:13am

Thanks for the help.
