Hello Juan - that should be correct - if comment is blank it was directly annotated.
I too am interested in the precise process which was used to convert from the colorimetric pixel values in an ISH slice image to the corresponding heatmap pixel values in that slice’s expression data image. I can’t seem to find this described in any of the documentation, including supplementary document 2 of “Genome-wide atlas of gene expression in the adult mouse brain” by Lein et al (Nature, 2007), which has a section entitled “Automated Gene Expression Detection”.
One thing I couldn’t find was a mathematical statement of the heatmap used in the Allen brain map expression images.
I’ve correlated ISH and expression images (precisely, image 100873513 which is slice 99 from experiment 100041580 in the developing mouse atlas*) pixel by pixel to try to determine the mapping.
Here’s a plot in RGB space showing:
800 randomly selected pixels in an ISH image which correlate with pixels in the Allen expression image which have been deemed to be expressing the gene in question (Emx2) (purple outlines to points.)
800 randomly selected pixels from the ISH image which correlate with NON expressing pixels from the expression image. (Green outlines to points.)
In each case, the fill of the circle is the RGB colour value from the ISH image pixel. It’s a little hard to get a feel for it in 2D, but I can only upload one view of the graph.
So, the general colour scheme is: yellower: non expressing; purpler/darker: expressing.
Why is there so much overlap in the points?
In the graph, I’ve included no information about the quantity of expression (because I don’t know the heatmap for the expression map).
thanks for reading!
* (Full size) images were obtained with:
Thanks for this great question. A description of the data analysis methods is published in IEEE by Ng et al. (2007). There’s a paywall on this. If you don’t have access, note before you buy it that you won’t find the mathematical statements in there; it’s a more detailed description of the whitepapers that you can download from the Documentation tab of the Allen Mouse Brain Atlas portal. But hopefully one of the authors who worked on this project will weigh in here!
If someone could just answer this question, that alone would be helpful:
Once the segmentation algorithm has identified expressing vs non-expressing pixels, the expressing pixels are colorized based on the luminosity of the ISH image (0.21 R + 0.72 G + 0.07 B) over some small local neighborhood. On the web application it is then colorized using the jet colormap. As such this is not a “quantity” of expression as you put it. Rather, we wanted to provide a view of the color intensity in the expression mask.
Segmentation takes into account the intensity of the local background, which may contribute to the overlap that you are observing.
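To make the luminosity step concrete, here is a minimal Python sketch of the weighted grayscale value and a local average over expressing pixels. The window radius, the function names and the choice to average only pixels flagged in the mask are assumptions for illustration; the thread does not state the actual neighborhood size, or whether the value is inverted before colorizing.

```python
def luminosity(rgb):
    """Weighted grayscale value using the weights quoted above."""
    r, g, b = rgb
    return 0.21 * r + 0.72 * g + 0.07 * b


def local_mean_luminosity(image, mask, row, col, radius=1):
    """Mean luminosity of expressing pixels in a square window.

    image: 2-D list of (R, G, B) tuples.
    mask:  2-D list of booleans, True where a pixel was segmented as expressing.
    radius: hypothetical half-width of the window (not given in the thread).
    """
    values = []
    for r in range(max(0, row - radius), min(len(image), row + radius + 1)):
        for c in range(max(0, col - radius), min(len(image[0]), col + radius + 1)):
            if mask[r][c]:
                values.append(luminosity(image[r][c]))
    # No expressing pixel in the window: return 0.0 as a placeholder value.
    return sum(values) / len(values) if values else 0.0
```

The three weights sum to 1, so a neutral grey pixel (v, v, v) maps to v. The formula is a weighted brightness, not a measure of how green a pixel is.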
Thank you very much for responding, I do appreciate it. I didn’t write back immediately, as I wanted to make sure I had a clear set of questions for you.
I’m not sure I understand everything that you wrote, so I’ll add questions & commentary to quotes from your post:
“Once the segmentation algorithm has identified expressing vs non-expressing pixels…”
Ok, so this algorithm must be using the luminosity and/or some color information in the original ISH image to determine the expressing/non-expressing state for that pixel. I understand from the documentation that this algorithm is masking away non-tissue areas of the image and may be accounting for occlusions, tissue tears, etc. So does the algorithm use color information at this stage and if so, how?
“…the expressing pixels are colorized based on the luminosity of the ISH image (0.21 R + 0.72 G + 0.07 B) over some small local neighborhood”
Ok, so that sounds like for a given pixel that is determined to be expressing, you then go back to the ISH image, and perhaps position a 2-D Gaussian hill over the pixel and sample the ISH color information around this hill to gather a number for the expression image pixel. I don’t understand (0.21 R + 0.72 G + 0.07 B). That color axis in 3D color space is more or less green and none of the ISH images seem to have greenish color in them. When I plot expressing and non expressing ISH pixel colors, they seem to be arranged along a line in color space which is roughly grey. Here are the points, plotted as in my first post, along with the 0.21R + 0.72G + 0.07B line in thick green.
Ok, so the expression map is then plotted with the jet colormap, that is helpful. I notice that in the Brain Explorer application, you can actually find some numbers associated with the jet colormap. Are these accessible via the API? That is, is the range which the jet map represents available via the API? Do these numbers have any physical meaning, or are they arbitrary?
Finally, do you know if the luminosity of the stain in the ISH image is roughly linear with the number density of mRNA strands that code for the given protein, or is it some other relationship, such as exponential?
“So does the algorithm use color information at this stage and if so, how?”
It’s been a long while and people have come and gone. My best recollection is that the color information gets converted to a single “intensity” value during the segmentation process.
“Are these accessible via the API? That is, is the range which the jet map represents available via the API?”
Yes, they are. Have a look at these places first and follow up with questions
“Do these numbers have any physical meaning, or are they arbitrary?”
They are again a local average of the grayscale value, using that formula, over “expression pixels” (see the whitepaper PDF).
“, do you know if the luminosity of the stain in the ISH image is roughly linear with the number density of mRNA strands that code for the given protein, or is it some other relationship, such as exponential?”
No, I do not. ISH is semi-quantitative (!), with the quantitative part being good spatial resolution, but the color aspect is not so much.
For the in situ hybridization (ISH) assay presented in the Allen Mouse Brain Atlas, colorimetric ISH signal intensity is not linear with mRNA quantity in a cell. No quantitative assessment of the amount of expression of a given transcript that goes much beyond qualitative terms such as ‘lots’, ‘some’ or ‘little’ should be made based on the appearance of the purple precipitate. Refer to the detailed technical documentation on the ISH process for an explanation of why. Additional experimental data would be required to establish the quantitative nature of transcript level, on a probe-by-probe basis.
Thanks for all the additional information. I believe I have enough information to help me to generate some expression heat maps from the ISH images now.
What I’ve decided to do is to take the ISH color information, and apply a transform in color space so that the “purplish” colours end up being aligned with one of the color space axes (it happens to be the one that was originally ‘blue’). I then make an elliptical tube around this axis, and all color pixels whose (transformed) color lands them inside the tube get a non-zero “expression” value taken from the projection of the point onto the axis. If the pixel is on this axis, but is nonetheless too ‘light’, then it gets expression=0 assigned to it.
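For anyone wanting to try something similar, here is a rough Python sketch of the tube idea. It is not my exact code and certainly not the Allen pipeline: the origin colour, the axis direction, the two semi-axes of the ellipse and the lightness cut-off (`min_depth`) are all made-up parameters that would have to be fitted to real ISH pixels.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)


def make_basis(axis):
    """Orthonormal basis (u, v, w) with w along the stain axis."""
    w = normalize(axis)
    # Pick a helper vector that is not parallel to w.
    helper = (1.0, 0.0, 0.0) if abs(w[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = normalize(cross(w, helper))
    v = cross(w, u)
    return u, v, w


def expression_value(rgb, origin, axis, semi_u, semi_v, min_depth):
    """Projection onto the stain axis, or 0.0 if outside the tube or too light.

    origin: colour treated as unstained background (assumed).
    axis:   direction from background toward dark stain (assumed).
    """
    u, v, w = make_basis(axis)
    d = tuple(c - o for c, o in zip(rgb, origin))
    pu, pv, pw = dot(d, u), dot(d, v), dot(d, w)
    inside_tube = (pu / semi_u) ** 2 + (pv / semi_v) ** 2 <= 1.0
    if not inside_tube or pw < min_depth:
        return 0.0
    return pw
```

With a grey axis, for example `expression_value((100, 100, 100), (200, 200, 200), (-1, -1, -1), 20, 20, 50)`, the pixel lies on the axis and gets the distance from the background colour as its value, while a colour far off the axis or too close to the background gets 0.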
It would still be great if the precise algorithm that was applied to turn ISH images into expression maps was published at some time. Is the code that was used open source or available by personal communication?
I tried the API for the genes Fam84b, Fam184a, and Rasa2 and was wondering: do the returned energy and density values synthesize both coronal and sagittal experiments for each gene, since there is no place in the API for me to specify an experiment ID?
Can you post your API query so I can see what you are seeing?
It’s just the example from above:
Each “structure unionize” record is for a specific experiment and structure.
http://api.brain-map.org/api/v2/data/query.xml?criteria=model::StructureUnionize,rma::criteria,section_data_set(genes[acronym$in'Fam84b','Fam184a','Rasa2']),rma::include,structure,section_data_set(plane_of_section),rma::options[only$eq'structures.id,structures.acronym,structures.graph_order,genes.acronym,plane_of_sections.name,data_sets.id
This modification of the query may be helpful to see if the experiment was sagittal or coronal
This is an example xml for the first record
<structure-unionize>
  <expression-density>0.00384803</expression-density>
  <expression-energy>0.454006</expression-energy>
  <id>453370663</id>
  <section-data-set-id>75081395</section-data-set-id>
  <structure-id>17437</structure-id>
  <sum-expressing-pixel-intensity>5350170.0</sum-expressing-pixel-intensity>
  <sum-expressing-pixels>45346.6</sum-expressing-pixels>
  <sum-pixel-intensity>247297000.0</sum-pixel-intensity>
  <sum-pixels>11784400.0</sum-pixels>
  <voxel-energy-cv>0.839586</voxel-energy-cv>
  <voxel-energy-mean>0.454002</voxel-energy-mean>
  <section-data-set>
    <id type="integer">75081395</id>
    <plane-of-section>
      <name>coronal</name>
    </plane-of-section>
  </section-data-set>
  <structure>
    <acronym>r8B</acronym>
    <graph-order type="integer">2118</graph-order>
    <id type="integer">17437</id>
  </structure>
</structure-unionize>
Does this help?
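As a quick arithmetic check on the record above, the sketch below parses a trimmed copy of it with Python’s standard library and recomputes two ratios. The interpretation (density as expressing pixels over all pixels, energy as summed expressing-pixel intensity over all pixels) is inferred from these numbers agreeing for this one record and should be confirmed against the whitepaper; the helper names are made up.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example record; values are taken verbatim from it.
RECORD = """<structure-unionize>
  <expression-density>0.00384803</expression-density>
  <expression-energy>0.454006</expression-energy>
  <sum-expressing-pixel-intensity>5350170.0</sum-expressing-pixel-intensity>
  <sum-expressing-pixels>45346.6</sum-expressing-pixels>
  <sum-pixels>11784400.0</sum-pixels>
</structure-unionize>"""


def field(node, tag):
    """Read a child element's text as a float."""
    return float(node.findtext(tag))


def recompute(node):
    """Recompute density and energy from the raw sums in the record."""
    sum_pixels = field(node, "sum-pixels")
    density = field(node, "sum-expressing-pixels") / sum_pixels
    energy = field(node, "sum-expressing-pixel-intensity") / sum_pixels
    return density, energy


node = ET.fromstring(RECORD)
density, energy = recompute(node)
print("density:", density, "reported:", field(node, "expression-density"))
print("energy: ", energy, "reported:", field(node, "expression-energy"))
```

Both recomputed values agree with the reported ones to within rounding of the published sums.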
Hello, what does voxel_energy mean and how could I convert it to expression density?
More information about term definition and calculating gene expression based on image density measurements is in the documentation for the Allen Mouse Brain Atlas, downloadable as a PDF here:
You can also learn more about the API in the Help section:
Hope this helps!
A very informative thread about how ISH intensities are colorized!
I am curious if there is a way to map the expression mask RGBs back to an intensity value? I understand a major caveat here is that probe to intensity signal is itself non-linear. Is the mapping from intensity to jet linear once a feature is detected in the mask?
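One way to attempt the reverse mapping is a nearest-colour lookup against a jet table, sketched below in plain Python. Everything here is an assumption: the piecewise-linear formula is a common approximation of jet, not necessarily the exact table the web application uses, and the recovered number is only a position along the colormap in [0, 1]. Whether that position is linear in the underlying intensity is exactly the open question.

```python
def clamp01(x):
    return max(0.0, min(1.0, x))


def jet(t):
    """Common piecewise-linear approximation of the jet colormap, t in [0, 1]."""
    r = clamp01(1.5 - abs(4.0 * t - 3.0))
    g = clamp01(1.5 - abs(4.0 * t - 2.0))
    b = clamp01(1.5 - abs(4.0 * t - 1.0))
    return (r, g, b)


# 256-entry lookup table of (position, colour) pairs.
LUT = [(i / 255.0, jet(i / 255.0)) for i in range(256)]


def jet_to_scalar(rgb255):
    """Return the colormap position whose colour is closest to an 8-bit RGB pixel."""
    target = tuple(c / 255.0 for c in rgb255)

    def dist2(entry):
        return sum((a - b) ** 2 for a, b in zip(entry[1], target))

    return min(LUT, key=dist2)[0]
```

Pixels outside the mask (for example black background) would still return the nearest table entry, so they need to be excluded before the lookup.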
My understanding is that the ISH data expressed as “expression energy” is not quantitative with respect to the actual amount of mRNA, because the relationship between color intensity and mRNA quantity is not linear.
However, the data should be somehow linear within a certain range with the same probe. This would allow for relative quantitative comparisons of the gene expression in different brain areas within a certain range of expression. Is there any estimation of the range of expression energy values within which the signal can be considered fairly linear?
I found that for some genes different probes have been used at different developmental stages of the mouse brain. In the case of the gene Cdh5, one probe was used for embryonic samples ( RP_080912_03_B02 ) and another for P56 ( RP_070219_02_C05 ).
In this case the expression energy is quite different between the embryonic and postnatal samples.
If a different probe was used, it is not possible to compare expression levels between stages.
Is there any reason why two different probes were used?
Has this been a common practice, and should it be checked for each gene?
Thank you for your help
I would like to know what the probe name used for the ISH experiments means in the Developing mouse brain atlas.
Some ISH experiments done for the same gene at different developmental stages show different probe names. However, in some cases the sequence of the primers is the same but the probe number is not. If the primers used for the PCR were the same for generating 2 probes, the sequence of the probe should be the same, but in some cases they have a different number.
Thus, how do I know if the same probe was used for different experiments? What could be the differences between 2 probes with different probe number but generated with the same pair of primers?
Thank you for your help
The probe name, with the embedded date, indicates when the probe was made (“lot control”). The templateID indicates the probe design, so the sequence for a given templateID will always be the same. Hundreds of different probes could have been generated using the same templateID at different points in time, for different projects.
Thank you tylermo for the response about the probe names. It’s very helpful.
I was wondering about the status of the annotation of the P28 Development mouse brain atlas.
The Reference atlas does not show P28; however, sagittal P28 experiments are shown for a number of genes. A query using RMA with the P28 section_IDs and structure unionize retrieves expression values. Do these values correspond to the annotation used for other postnatal stages? If so, what level of annotation should be considered for P28?