Transcriptomics (RNA-seq/microarray) data normalization - FAQ

Hi @fanikonst,
This is a great question, and one that has confused many people. Broadly, in the Allen Human Brain Atlas, the fold change value represents the average log2(intensity) of all samples in the Target Structure minus the average log2(intensity) of all samples in the Contrast Structure. This is confusing because the two key parts of that definition, log2(intensity) values and all individual samples, are not the defaults. By default, z-score values are shown and samples are aggregated by structure. To change these settings, you’ll need to toggle the “Resolution” and “Color Map” options at the bottom right of the heatmap. Note that when you click on “Download this data”, you will download whatever values are currently shown in the heatmap (so if you are looking at z-scores, you will download z-scores). You can download all the data on the download page.
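To make the fold change definition concrete, here is a minimal sketch in Python. It assumes you have already downloaded log2(intensity) values for individual samples; the function name, variable names, and numbers are made up for illustration and are not part of the Atlas API or output format.

```python
import numpy as np

def log2_fold_change(target_log2, contrast_log2):
    """Mean log2(intensity) of target samples minus that of contrast samples.

    Both inputs are assumed to hold one log2(intensity) value per sample
    for a single gene/probe (hypothetical layout, not the Atlas file format).
    """
    target = np.asarray(target_log2, dtype=float)
    contrast = np.asarray(contrast_log2, dtype=float)
    return target.mean() - contrast.mean()

# Made-up example values for one probe.
target_samples = [8.0, 9.0, 10.0]    # samples in the Target Structure
contrast_samples = [6.0, 7.0, 8.0]   # samples in the Contrast Structure

fc = log2_fold_change(target_samples, contrast_samples)
print(fc)  # 9.0 - 7.0 = 2.0 on the log2 scale
```

If you run the same calculation on downloaded z-scores instead of log2(intensity) values, the result will not match the fold change shown in the search results.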

I’m not sure what statistic is used to calculate the P-value, but it is likely calculated on raw data rather than z-scores (does anyone else know?). Finally, the z-scores shown in the app are calculated as z-score(log2(fpkm_normalized)), where the population for the z-score is all samples for the given gene (as above). For atlases with RNA-seq data, we use log2(fpkm_normalized + 1) instead.
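As a rough illustration of the z-score step, here is a small Python sketch for a single gene. The function name and example values are hypothetical, and I am assuming the population standard deviation (NumPy's default, ddof=0); the app may use a different convention.

```python
import numpy as np

def zscore_log2(values, pseudocount=0.0):
    """z-score of log2(values + pseudocount) across all samples for one gene.

    pseudocount=1.0 corresponds to the log2(fpkm_normalized + 1) variant.
    Assumption: population standard deviation (ddof=0).
    """
    logged = np.log2(np.asarray(values, dtype=float) + pseudocount)
    return (logged - logged.mean()) / logged.std()

# Made-up normalized FPKM values for one gene across four samples.
fpkm = [0.0, 1.0, 3.0, 7.0]
z = zscore_log2(fpkm, pseudocount=1.0)
print(z)
```

Because the mean and standard deviation are taken over all samples for that gene, the resulting values are centered at 0 with unit spread per gene, which is why z-scores cannot be used directly to reproduce fold changes.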

Here is an example differential search where the downloaded data would produce the expected fold changes. If you have any more questions, please post again!
