Hi! I’m trying to identify all genes expressed in the auditory cortex using the human brain atlas. What is the simplest way to do this using the online GUI?
Hi @ajdefreese,
I would suggest using the differential search. Here is a quick query comparing STG vs. the rest of the brain. If you want to redefine what you mean by “auditory cortex” or change the comparison region to specific brain structures (such as neocortex) rather than the whole brain, you can adjust what is in the “Target Structure(s)” and “Contrast Structure(s)” boxes, respectively. Sorry for the delay on this response.
-Jeremy
Thank you so much for your help! I’ve been trying to replicate these queries with the downloaded data, but I cannot get matching p-values and fold changes despite several attempts. Can you provide insight into exactly how these are calculated so I can try to replicate them in my code?
Thank you in advance for your help!
Hi @ajdefreese,
There are a few Community Forum threads on this topic already. A good one to start with is “Transcriptomics (RNA-seq/microarray) data normalization - FAQ”. The short answer is that the mismatch likely comes from the way data is summarized or normalized on the website relative to the raw data download. If you are comfortable working with the downloaded data, I’d recommend using those values and associated statistics in your analysis.
Best,
Jeremy
Thank you again for your help and for pointing me to the forum discussions. I realize I’m still a bit confused, and I wanted to clarify before I move forward.
When you mention “downloaded data,” are you referring to:
- The complete normalized microarray dataset across all brains (from here: https://human.brain-map.org/static/download/#:~:text=Complete%20normalized%20microarray%20datasets,white%20paper%2C%20Microarray%20Data%20Normalization),
or
- The summarized results that can be downloaded after running a differential search on the website (example: https://human.brain-map.org/microarray/search/show?domain1=4005&domain2=4133&selected_donors=9861,10021,12876,14380,15496,15697&search_type=differential)?
I’ve been working with the first dataset, but when I try to replicate the fold-changes and p-values from the second dataset (website GUI), I get very different results. My understanding was that the first dataset was already normalized and should therefore match what I see in the GUI, but maybe I’m misunderstanding how the two sources relate.
Ultimately, my goal is to identify genes that are more highly expressed in STG vs. the rest of the gray matter. As a next step, I’d also like to use those results for a UMAP visualization. But before I do that, I’d like to be sure I understand how to replicate the differential statistics shown on the website.
Your guidance so far (and the links to other discussions) has been very helpful. I just wanted to check whether I’m interpreting things correctly, or whether I should be working directly with the summarized data instead of the complete dataset.
Thank you again for your time and help!
I mean #1, not #2 above. The data downloaded in #2 depends on the specific visualization shown: if your color map is showing Z-scores, you’ll download something different than if it is showing log2 intensity, and if your Resolution is “Structures”, you’ll download something different than if it is “Samples”.
See more details at this post: “Reproducing r-score correlations in Allen Human Brain Atlas - #2 by jeremyinseattle”.
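To make the point about export settings concrete, here is a minimal Python sketch showing how the same underlying values give different numbers depending on the color map (log2 intensity vs. Z-score) and the Resolution (“Samples” vs. “Structures”). The expression values are made up, and the Z-score here is an assumption (per probe, across the listed samples, using the population standard deviation); this thread does not confirm how the website computes it.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical log2 intensities for one probe: (structure acronym, value).
samples = [("STG", 8.0), ("STG", 9.0), ("MTG", 6.0), ("MTG", 7.0)]

def z_scores(values):
    """Z-score each value against the mean and SD of the whole list (assumed scheme)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]

def by_structure(pairs):
    """Average sample-level values within each structure."""
    groups = defaultdict(list)
    for name, value in pairs:
        groups[name].append(value)
    return {name: mean(vals) for name, vals in groups.items()}

names = [name for name, _ in samples]
log2_values = [value for _, value in samples]
z_values = z_scores(log2_values)

# Four different "downloads" built from the same four samples.
sample_level_log2 = log2_values
sample_level_z = z_values
structure_level_log2 = by_structure(samples)
structure_level_z = by_structure(list(zip(names, z_values)))

print(sample_level_log2, structure_level_log2)
print(sample_level_z, structure_level_z)
```

Statistics computed on any one of these four tables will generally not match statistics computed on another, which is one plausible reason a GUI export and the complete normalized dataset disagree.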
Unfortunately, I’m not sure exactly how the fold change and p-values differ between methods. The complete set of documentation for the Human Brain Atlas is here: “Documentation: Human Brain Atlas”.
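For anyone computing their own statistics from the complete normalized dataset, below is a minimal Python sketch of one common approach for a single probe: a fold change taken from the difference in group means (assuming the values are on a log2 scale) and a Welch t-statistic with a permutation p-value. This is not the website's confirmed method, which this thread leaves open; the sample values, function names, and the choice of a permutation test are all illustrative assumptions.

```python
import math
import random
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t-statistic for two independent groups with unequal variances."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def fold_change(target, contrast):
    """Fold change from log2-scale values: 2 ** (difference of group means)."""
    return 2 ** (mean(target) - mean(contrast))

def permutation_p(target, contrast, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the Welch t-statistic."""
    rng = random.Random(seed)
    observed = abs(welch_t(target, contrast))
    pooled = list(target) + list(contrast)
    k = len(target)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(welch_t(pooled[:k], pooled[k:])) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical log2 expression of one probe in target vs. contrast samples.
stg = [9.1, 8.8, 9.4, 9.0]
other = [7.9, 8.2, 8.0, 7.7, 8.1, 8.3]

fc = fold_change(stg, other)
t = welch_t(stg, other)
p = permutation_p(stg, other)
print(f"fold change={fc:.2f}, t={t:.2f}, permutation p={p:.4f}")
```

If SciPy is available, `scipy.stats.ttest_ind(stg, other, equal_var=False)` gives a parametric Welch p-value instead of the permutation estimate. Either way, results should only be compared with the GUI once the same donors, samples, and structures are used on both sides.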