Hello,
I am analysing the RNA-Seq data from the BrainSpan Atlas of the Developing Human Brain. I have seen that this dataset contains RNA-Seq RPKM values for 52376 genes.
I have read the documentation on how the RNA was extracted from the brain tissues, and there is a section called ‘mRNA Library Preparation and Sequencing’ which states that mRNA was purified from the total RNA. I have also read that the GENCODE gene annotations were used when aligning the reads to the reference genome.
If only mRNA was sequenced, why does the BrainSpan RNA-Seq dataset have RPKM values for 52376 genes? There are only ~20,000 protein-coding genes, and the most recent GENCODE release has annotations for 19955 protein-coding genes.
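To see which kinds of genes might make up the difference, I could tally the `gene_type` attribute of the gene records in the annotation. Below is a minimal Python sketch. It assumes the annotation is in the standard GENCODE GTF layout, where each gene record carries a `gene_type` attribute. The rows in `SAMPLE_GTF` are toy values I made up to show the layout; they are not BrainSpan or GENCODE data, and the function name `count_gene_types` is only illustrative.

```python
import re
from collections import Counter

# Toy rows in GENCODE GTF layout (made-up identifiers, NOT real annotation data).
SAMPLE_GTF = """\
##description: toy example
chr1\ttoy\tgene\t100\t900\t.\t+\t.\tgene_id "G1"; gene_type "protein_coding"; gene_name "A";
chr1\ttoy\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; gene_type "protein_coding"; gene_name "A";
chr1\ttoy\tgene\t1500\t2500\t.\t-\t.\tgene_id "G2"; gene_type "lncRNA"; gene_name "B";
chr2\ttoy\tgene\t300\t1200\t.\t+\t.\tgene_id "G3"; gene_type "protein_coding"; gene_name "C";
chr2\ttoy\tgene\t4000\t4600\t.\t-\t.\tgene_id "G4"; gene_type "processed_pseudogene"; gene_name "D";
"""

# Pulls the value out of an attribute such as: gene_type "protein_coding";
GENE_TYPE = re.compile(r'gene_type "([^"]+)"')


def count_gene_types(lines):
    """Count gene-level records per gene_type in an iterable of GTF lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        # A GTF record has 9 tab-separated columns; column 3 is the feature type.
        if len(fields) < 9 or fields[2] != "gene":
            continue  # ignore transcripts, exons, etc. so each gene counts once
        match = GENE_TYPE.search(fields[8])
        if match:
            counts[match.group(1)] += 1
    return counts


counts = count_gene_types(SAMPLE_GTF.splitlines())
for gene_type, n in counts.most_common():
    print(gene_type, n)
```

For a real annotation file, the same function should work if it is given an open file handle instead of the sample string (for a gzipped GTF, one opened with `gzip.open(path, "rt")`, where `path` is wherever the file was saved).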
Thanks.