Isoforms in SMART-Seq v4 DFC M1C Dataset

JDiSp · October 4, 2025, 1:56am

Hello, thank you for creating and maintaining this excellent resource. I am following up on this thread (insert). I would like to analyze these data for a specific set of isoforms (PPARG1 and PPARG2). I thought that the counts files (here) might contain transcript-level counts, but I believe that they are gene-level counts, since I only find counts for a single “PPARG” annotation. Is this due to the annotation features file that was used in the summarizeOverlaps quantification? If so, is it advisable to work from the BAM files and re-run summarizeOverlaps?

A few other clarifying questions on the human_cortex_SS4_Open_GRU_raw-file-manifest-2025-7-30.tsv:

The Tar Filename field entries do not specify R1 or R2 for paired-end reads, yet the SMART-seq methods section of the paper cited in the preceding thread (Jorstad, 2024) mentions “After clipping, the paired-end reads were mapped using Spliced Transcripts Alignment to a Reference (STAR)”.
Could you please provide a key to each element of the filenames separated by underscores, e.g. in F1S4_191008_307_A01.fastq.tar? Does every unique filename correspond to a single cell? I’m wondering why each prefix like F1S4_191008_307 has 8 files, A01:H01.

Thank you for your patience as this is my first time working with SS data. I appreciate any guidance you can provide.

Susan.Sunkin · October 14, 2025, 3:59am

Hi,

Thank you for your question. I asked our technical experts how to address your questions. Here is the response:

Isoforms. Our published expression matrix files are all at the gene level. For isoform analysis, the BAM files will be needed to generate a new isoform-level expression matrix.
Our single-cell SMARTerV4 data is FACS-sorted into 8-well strips. Example: F1S4_191008_307_A01
1. F1S4 = Indicates the FACS machine used and the SMARTerV4 (S4) method.
2. 191008 = the date of sort.
3. 307 = Number of sorted cell.
4. A01 = Sort well.
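
A minimal sketch of how the key above could be applied when reading the Tar Filename column of the manifest. The regular expression is inferred from the single example F1S4_191008_307_A01.fastq.tar, so the field widths are an assumption, and the names `sorter`, `sort_date`, `sort_number`, `well`, `parse_name` and `group_by_strip` are hypothetical labels chosen for illustration, not part of any Allen Institute tool.

```python
import re
from collections import defaultdict

# Pattern assumed from one example filename; other files may deviate from it.
NAME_RE = re.compile(
    r"^(?P<sorter>[^_]+)"         # e.g. F1S4: FACS machine + SMARTerV4 (S4) method
    r"_(?P<sort_date>\d{6})"      # e.g. 191008: date of sort (kept as a string)
    r"_(?P<sort_number>\d+)"      # e.g. 307: third field of the key
    r"_(?P<well>[A-H]\d{2})"      # e.g. A01: sort well within the 8-well strip
    r"(?:\.fastq\.tar)?$"
)


def parse_name(filename):
    """Split a filename into the four underscore-separated fields."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected filename layout: {filename!r}")
    return match.groupdict()


def group_by_strip(filenames):
    """Map each strip prefix (first three fields) to its sorted list of wells."""
    strips = defaultdict(list)
    for name in filenames:
        fields = parse_name(name)
        prefix = "_".join(
            (fields["sorter"], fields["sort_date"], fields["sort_number"])
        )
        strips[prefix].append(fields["well"])
    return {prefix: sorted(wells) for prefix, wells in strips.items()}


if __name__ == "__main__":
    # Eight files sharing one prefix, wells A01 through H01.
    names = [f"F1S4_191008_307_{row}01.fastq.tar" for row in "ABCDEFGH"]
    for prefix, wells in group_by_strip(names).items():
        print(prefix, len(wells), wells)
```

Grouping by the first three fields gives a quick check that each prefix carries the eight wells of one strip. Filenames that do not match the assumed layout raise an error instead of being silently skipped.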

Thank you

JDiSp · October 14, 2025, 1:58pm

Got it! Thank you so much!
