Gene length used for RPKM calculation


I'm interested in the RNA-Seq data provided by the BrainSpan developing human brain atlas.
I already looked for the answer in the “Transcriptome_Profiling.pdf” file, and I hope I didn't overlook a similar entry in this forum.

I wonder whether the specific gene length used to calculate gene expression in RPKM is available somewhere. That would be the sequence length of the “annotation entry” used in the “mrfQuantifier” step:
“then this count is normalized by sequence length of the annotation entry (per kilobase)”
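
To make clear which length I mean, here is a minimal sketch of the standard RPKM definition (reads per kilobase per million mapped reads). I am assuming the pipeline follows this textbook formula; the function name and the numbers are made up for illustration and are not taken from the BrainSpan documentation.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Textbook RPKM: reads per kilobase of gene length per million mapped reads.

    Assumption: gene_length_bp is the length of the annotation entry
    (for example, a composite gene model), which is the value I am asking about.
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # 1e9 = 1e3 (bp -> kb) * 1e6 (reads -> millions of reads)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example values, not real BrainSpan numbers.
print(rpkm(read_count=500, gene_length_bp=2000, total_mapped_reads=20_000_000))
```

With the raw counts and the library size I could go back and forth between counts and RPKM, but only if I know the gene length that was used.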

As far as I understand, this length is calculated from the composite gene model, which is generated in the “mergeTranscripts” step.

I could reproduce this step myself, as it only requires annotation files; however, I'm not sure what exactly was used for the file “transcript.interval”:

mergeTranscripts knownIsoforms.txt transcript.interval compositeModel > geneComposite.interval
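
For illustration, this is roughly how I would compute the length myself. It is only a sketch, assuming the composite model is the union of the exons of all isoforms of a gene; I have not checked that mergeTranscripts does exactly this. The function name, the coordinates, and the 1-based inclusive convention are my own assumptions.

```python
def composite_length(exons):
    """Length in bp of the union of exon intervals.

    Assumption: exons are (start, end) pairs, 1-based and inclusive,
    pooled from all isoforms of one gene on the same chromosome and strand.
    """
    merged = []
    for start, end in sorted(exons):
        if merged and start <= merged[-1][1] + 1:
            # Overlapping or directly adjacent: extend the previous block.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return sum(end - start + 1 for start, end in merged)


# Two hypothetical isoforms of the same gene.
isoform_a = [(100, 200), (300, 400)]
isoform_b = [(150, 250), (300, 350)]
print(composite_length(isoform_a + isoform_b))  # union is 100-250 and 300-400
```

The result depends entirely on which transcripts go into “transcript.interval”, which is why I am asking about that file.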

Thanks in advance

Hi @Fiona,

These RNA-seq data were processed and the methods summarized by our collaborators at Yale. More recently these data were published as part of the PsychENCODE consortium, although the details about RPKM conversion are identical to what we have in our help documentation. If other community members have a better answer, please post, but otherwise I would suggest contacting the corresponding authors of this paper from Yale directly.