(Further) Normalization of Gene Data for Comparison


We are trying to compare expression levels of specific genes in specific regions of the brain. Our aim is to graph this in some sort of pie chart comparison, but the data are z-scores, and we obviously cannot chart negative numbers on a pie chart. Any tips on how we can further normalize this data so that we get an accurate representation of how our genes' expression levels differ across our areas of interest throughout the brain? We have already gone through all of the z-score/microarray-related forums we could find, but we’re not sure they address this problem specifically.


Hi @ksalazar. Thanks for your interest in these data! I’m not sure a pie chart is the best visualization for this, as the expression across the brain isn’t really a fraction of the whole in any simple sense. We have typically used a barplot, along these lines (panel c). You can make such plots with the atlasplot R library and hbadata. That said, if you want to make pie charts, you have a few options:

  1. Use the intensity data rather than the z-scores. You can access this by changing the colormap to “log2 intensity” in the heatmap view, or by downloading all of the data from the download page. This is also the data that atlasplot uses.
  2. Set all z-score values that are less than 0 to 0. With microarrays, there is no absolute 0, so every region will have some background level of expression. Clipping at 0 is an imperfect way of addressing this issue.
  3. Use the intensity data and omit all brain regions whose expression falls below a background level. That background value differs for each probe but could be calculated or approximated.
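
As a rough illustration of options 2 and 3, here is a minimal Python sketch (not atlasplot code). The region names, numbers, and background cutoff are made up for the example, and the function names are hypothetical. Converting log2 intensities back to a linear scale before taking fractions is my own assumption about what a pie chart would need, not something the atlas prescribes.

```python
from typing import Dict


def clip_negative_z(z_scores: Dict[str, float]) -> Dict[str, float]:
    """Option 2: replace every z-score below 0 with 0."""
    return {region: max(z, 0.0) for region, z in z_scores.items()}


def drop_below_background(log2_intensity: Dict[str, float],
                          background: float) -> Dict[str, float]:
    """Option 3: keep only regions whose log2 intensity exceeds the cutoff.

    The cutoff is probe-specific; here it is simply passed in by the caller.
    """
    return {r: v for r, v in log2_intensity.items() if v > background}


def log2_to_linear(log2_intensity: Dict[str, float]) -> Dict[str, float]:
    """Assumption: undo the log2 transform so slice sizes reflect intensity."""
    return {r: 2.0 ** v for r, v in log2_intensity.items()}


def to_fractions(values: Dict[str, float]) -> Dict[str, float]:
    """Scale non-negative values so they sum to 1 (pie chart slice sizes)."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("no positive values left to chart")
    return {r: v / total for r, v in values.items()}


# Made-up example values, for illustration only.
z = {"region_a": 1.5, "region_b": -0.5, "region_c": 0.5}
slices_from_z = to_fractions(clip_negative_z(z))

intensity = {"region_a": 8.0, "region_b": 3.0, "region_c": 6.0}
kept = drop_below_background(intensity, background=4.0)  # hypothetical cutoff
slices_from_intensity = to_fractions(log2_to_linear(kept))
```

Either dictionary of fractions can then be handed to whatever plotting tool you prefer. Note that with clipping, any region at or below the mean simply disappears from the chart, which is part of why a barplot is usually the better fit.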

I would not recommend attempting to further normalize the data for this use case. Hopefully this helps, but if you have any additional questions, please reply!