Reproducing r-score correlations in Allen Human Brain Atlas

Hi, @jenniferhu04. Deciding which probe(s) best represent each gene is something that we have spent a lot of time considering. You have a few choices.

  1. Use the probe with the highest average expression level. Usually this probe best represents the underlying gene expression.
  2. We wrote a paper comparing gene expression from the Allen Human Brain Atlas using microarray and RNA-seq. Additional File 8 shows statistics for each probe, and probes with the lowest q-value best represent true gene expression values as measured with RNA-seq.
  3. If you want to aggregate probes using the average or another metric (for example, taking the most highly expressed probe per gene, as mentioned in #1 above), you can do that easily with the collapseRows R function from the WGCNA package.
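
To make option #1 concrete, here is a minimal sketch of picking the probe with the highest average expression for each gene. It is plain Python rather than R and is not collapseRows itself; the function name, probe IDs, gene names, and values are all made up for illustration.

```python
from statistics import mean


def pick_top_probe_per_gene(expression, probe_to_gene):
    """Keep, for each gene, the probe with the highest mean expression.

    expression:    {probe_id: [value per sample, ...]}  (hypothetical layout)
    probe_to_gene: {probe_id: gene_symbol}              (hypothetical layout)
    Returns {gene_symbol: (probe_id, values)}.
    """
    best = {}
    for probe_id, values in expression.items():
        gene = probe_to_gene[probe_id]
        avg = mean(values)
        # Replace the stored probe only if this one has a higher mean.
        if gene not in best or avg > best[gene][0]:
            best[gene] = (avg, probe_id, values)
    return {gene: (probe_id, values) for gene, (_, probe_id, values) in best.items()}


# Toy data, invented for this example.
expression = {
    "probe_A1": [2.0, 2.5, 3.0],
    "probe_A2": [5.0, 5.5, 6.0],
    "probe_B1": [1.0, 4.0, 2.0],
}
probe_to_gene = {
    "probe_A1": "GENE_A",
    "probe_A2": "GENE_A",
    "probe_B1": "GENE_B",
}

gene_matrix = pick_top_probe_per_gene(expression, probe_to_gene)
```

In this toy case GENE_A ends up represented by probe_A2, since its mean is higher than probe_A1's. Swapping the mean for another score (for instance, a per-probe q-value as in option #2) would only change the comparison inside the loop.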

After calculating your gene expression matrix using one of the three options above, I would suggest computing your R score between genes from that matrix.
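
As a rough sketch of that last step, the snippet below computes a score for every pair of genes in a small gene-by-sample matrix. I am assuming here that "R score" means the Pearson correlation coefficient, which this thread does not spell out; the function names and the toy values are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def gene_correlations(gene_matrix):
    """Return {(gene1, gene2): r} for every unordered pair of genes."""
    genes = sorted(gene_matrix)
    return {
        (g1, g2): pearson_r(gene_matrix[g1], gene_matrix[g2])
        for i, g1 in enumerate(genes)
        for g2 in genes[i + 1:]
    }


# Toy gene-by-sample matrix (one row per gene after probe selection).
gene_matrix = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [4.0, 3.0, 2.0, 1.0],
}

r_scores = gene_correlations(gene_matrix)
```

With real data the samples must be in the same order for every gene, otherwise the correlations are meaningless.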