Z-score for human-microarray and mouse ish-data

Hi Austin,

The Ivy GAP downloadable data are FPKM-normalized, not z-scores, so a value of 0 means that no RNA for that gene was detected. Here is a good summary of different RNA-seq normalization strategies, but in short, FPKM (fragments per kilobase of transcript per million mapped fragments) accounts for both the total number of RNA fragments measured in a sample and the gene length.
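
To make the FPKM definition concrete, here is a minimal sketch of the calculation for a single gene. The function name and the numbers are made up for illustration and are not taken from the Ivy GAP pipeline.

```python
def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """FPKM = fragments / (gene length in kb * mapped fragments in millions)."""
    length_kb = gene_length_bp / 1_000                   # corrects for gene length
    total_millions = total_mapped_fragments / 1_000_000  # corrects for sequencing depth
    return fragment_count / (length_kb * total_millions)


# Hypothetical gene: 500 fragments, 2 kb long, in a sample with 20 million mapped fragments
example = fpkm(fragment_count=500, gene_length_bp=2_000, total_mapped_fragments=20_000_000)
print(example)

# A gene with no detected fragments has an FPKM of exactly 0
print(fpkm(0, 2_000, 20_000_000))
```

Because both corrections are simple divisions, an FPKM of 0 can only come from a fragment count of 0.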

In the heatmap view (e.g., here) I would suggest treating the 0 z-scores either as “NA” or as whatever the smallest non-zero value is. It appears that z is set to 0 whenever the log2 intensity value is 0, which does not make sense, since the z-score for an undetected gene should always be negative. More generally, I would recommend using read counts or log-normalized FPKM values rather than z-scores whenever dealing with RNA-seq data programmatically.
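
As a rough sketch of both suggestions, the snippet below log-transforms FPKM values and marks z-scores as NA wherever the underlying FPKM is 0. The function names, the pseudocount of 1, and the example values are my own assumptions for illustration, not something defined by the Ivy GAP download.

```python
import math


def log2_fpkm(fpkm_values, pseudocount=1.0):
    """Log-normalize FPKM values; the pseudocount (assumed to be 1) keeps 0 finite."""
    return [math.log2(v + pseudocount) for v in fpkm_values]


def mask_undetected_z(z_scores, fpkm_values):
    """Replace the z-score with None (NA) wherever the matching FPKM is 0."""
    return [None if f == 0 else z for z, f in zip(z_scores, fpkm_values)]


# Hypothetical values for three samples of one gene
fpkm_values = [0.0, 1.0, 3.0]
z_scores = [0.0, -0.5, 1.2]

print(log2_fpkm(fpkm_values))
print(mask_undetected_z(z_scores, fpkm_values))
```

With a pseudocount of 1, an FPKM of 0 maps to a log2 value of 0, so undetected genes stay easy to identify after the transform.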

Best,
Jeremy
