Quantification of gene expression by in situ hybridization: finding and using the raw values

Seb,

Once the segmentation algorithm has identified expressing vs non-expressing pixels, the expressing pixels are assigned an intensity based on the luminosity of the ISH image (0.21 R + 0.72 G + 0.07 B) over some small local neighborhood. On the web application this intensity is then colorized using the jet colormap. As such, this is not a “quantity” of expression as you put it. Rather, we wanted to provide a view of the color intensity in the expression mask.

Segmentation takes into account the intensity of the local background, which may contribute to the overlap that you are observing.

Lydia
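
To make the description above concrete, here is a minimal Python sketch of the luminosity weighting followed by a local average over expressing pixels. The square window, the `radius` parameter, and the function names are assumptions for illustration only; the actual neighborhood used by the pipeline is not stated in this thread.

```python
def luminosity(rgb):
    """Weighted sum quoted in the thread: 0.21 R + 0.72 G + 0.07 B."""
    r, g, b = rgb
    return 0.21 * r + 0.72 * g + 0.07 * b


def local_mean_intensity(image, mask, row, col, radius=1):
    """Mean luminosity of the expressing pixels around (row, col).

    image: 2-D list of (R, G, B) tuples
    mask:  2-D list of booleans, True where a pixel is "expressing"
    radius: half-width of the square window (an assumed shape and size)
    """
    values = []
    for r in range(max(0, row - radius), min(len(image), row + radius + 1)):
        for c in range(max(0, col - radius), min(len(image[0]), col + radius + 1)):
            if mask[r][c]:
                values.append(luminosity(image[r][c]))
    # No expressing pixels in the window: report zero intensity.
    return sum(values) / len(values) if values else 0.0
```

Only pixels flagged in the mask contribute to the average, so background pixels inside the window do not dilute the value.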

Dear Lydia,

Thank you very much for responding, I do appreciate it. I didn’t write back immediately, as I wanted to make sure I had a clear set of questions for you.

I’m not sure I understand everything that you wrote, so I’ll add questions & commentary to quotes from your post:

“Once the segmentation algorithm has identified expressing vs non-expressing pixels…”

Ok, so this algorithm must be using the luminosity and/or some color information in the original ISH image to determine the expressing/non-expressing state for that pixel. I understand from the documentation that this algorithm is masking away non-tissue areas of the image and may be accounting for occlusions, tissue tears, etc. So does the algorithm use color information at this stage and if so, how?

“…the expressing pixels are colorized based on the luminosity of the ISH image (0.21 R + 0.72 G + 0.07 B) over some small local neighborhood”

Ok, so that sounds like for a given pixel that is determined to be expressing, you then go back to the ISH image, and perhaps position a 2-D Gaussian hill over the pixel and sample the ISH color information around this hill to gather a number for the expression image pixel. I don’t understand (0.21 R + 0.72 G + 0.07 B). That color axis in 3D color space is more or less green and none of the ISH images seem to have greenish color in them. When I plot expressing and non-expressing ISH pixel colors, they seem to be arranged along a line in color space which is roughly grey. Here are the points, plotted as in my first post, along with the 0.21R + 0.72G + 0.07B line in thick green.
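
One way to read the formula is as a weighted sum rather than as a direction in color space. The small sketch below (function names are hypothetical) checks two things: the weights sum to 1, so any neutral grey (v, v, v) maps back to v, and the weight vector itself does lean toward green, roughly 40 degrees away from the grey axis.

```python
import math

WEIGHTS = (0.21, 0.72, 0.07)  # R, G, B weights quoted in the thread


def luminosity(rgb):
    """Weighted sum of the three channels."""
    return sum(w * c for w, c in zip(WEIGHTS, rgb))


def angle_to_grey_axis_degrees(weights=WEIGHTS):
    """Angle between the weight vector and the grey axis (1, 1, 1)."""
    grey = (1.0, 1.0, 1.0)
    dot = sum(w * g for w, g in zip(weights, grey))
    norm_w = math.sqrt(sum(w * w for w in weights))
    norm_g = math.sqrt(3.0)
    return math.degrees(math.acos(dot / (norm_w * norm_g)))
```

So pixels lying along a roughly grey line still get distinct luminosity values: the formula collapses each color to one number and does not require the colors themselves to be greenish.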

Ok, so the expression map is then plotted with the jet colormap, that is helpful. I notice that in the Brain Explorer application, you can actually find some numbers associated with the jet colormap. Are these accessible via the API? That is, is the range which the jet map represents available via the API? Do these numbers have any physical meaning, or are they arbitrary?

Finally, do you know if the luminosity of the stain in the ISH image is roughly linear with the number density of mRNA strands that code for the given protein, or is it some other relationship, such as exponential?

best regards,

Seb James

“So does the algorithm use color information at this stage and if so, how?”

It’s been a long while and people have come and gone. My best recollection is that the color information gets converted to a single “intensity” value during the segmentation process.

“Are these accessible via the API? That is, is the range which the jet map represents available via the API?”

Yes, they are. Have a look at these places first and follow up with questions:
http://help.brain-map.org/display/mousebrain/API
http://api.brain-map.org/examples/foldchange/index.html

“Do these numbers have any physical meaning, or are they arbitrary?”
They are, again, a local average of the grayscale value computed with that formula over “expression pixels” (see the whitepaper PDF).

“…do you know if the luminosity of the stain in the ISH image is roughly linear with the number density of mRNA strands that code for the given protein, or is it some other relationship, such as exponential?”

No, I do not. ISH is semi-quantitative (!): the quantitative part is the good spatial resolution, but the color aspect is not so much.

Hi everyone,

For the in situ hybridization (ISH) assay presented in the Allen Mouse Brain Atlas, colorimetric ISH signal intensity is not linear with mRNA quantity in a cell. No quantitative assessment of the amount of expression of a given transcript should be made based on the appearance of the purple precipitate beyond qualitative terms such as ‘lots’, ‘some’, or ‘little’. Refer to the detailed technical documentation on the ISH process for an explanation of why. Additional experimental data would be required to establish the quantitative nature of transcript level, on a probe-by-probe basis.

Thanks for all the additional information. I believe I have enough information to help me to generate some expression heat maps from the ISH images now.

What I’ve decided to do is to take the ISH color information, and apply a transform in color space so that the “purplish” colours end up being aligned with one of the color space axes (it happens to be the one that was originally ‘blue’). I then make an elliptical tube around this axis, and all color pixels whose (transformed) color lands them inside the tube get a non-zero “expression” value taken from the projection of the point onto the axis. If the pixel is on this axis, but is nonetheless too ‘light’, then it gets expression=0 assigned to it.
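
A rough sketch of that idea in Python, for anyone wanting to try something similar. Instead of rotating the color space it builds an orthonormal basis whose third vector runs from white toward the stain color, which amounts to the same thing. The stain color, the tube semi-axes, and the lightness cutoff are all made-up illustrative values, not the ones I actually used.

```python
import math

WHITE = (1.0, 1.0, 1.0)
STAIN = (0.45, 0.25, 0.60)  # hypothetical "purplish" stain color, channels in [0, 1]


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(v):
    n = math.sqrt(_dot(v, v))
    return tuple(x / n for x in v)


def stain_basis(white=WHITE, stain=STAIN):
    """Orthonormal basis (u, v, w) with w pointing from white toward the stain."""
    w = _unit(_sub(stain, white))
    helper = (1.0, 0.0, 0.0) if abs(w[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = _unit(_cross(w, helper))
    v = _cross(w, u)  # already unit length because w and u are orthonormal
    return u, v, w


def expression_value(rgb, semi_axes=(0.15, 0.25), min_depth=0.10):
    """Projection onto the stain axis, or 0 outside the tube or if too light."""
    u, v, w = stain_basis()
    d = _sub(rgb, WHITE)
    depth = _dot(d, w)             # distance along the stain axis, from white
    x, y = _dot(d, u), _dot(d, v)  # offsets perpendicular to the axis
    inside = (x / semi_axes[0]) ** 2 + (y / semi_axes[1]) ** 2 <= 1.0
    if not inside or depth < min_depth:
        return 0.0
    return depth
```

With these values a pure white pixel and a strongly green pixel both map to 0, while a pixel of exactly the stain color maps to its distance from white.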

It would still be great if the precise algorithm that was applied to turn ISH images into expression maps was published at some time. Is the code that was used open source or available by personal communication?

Hi @jeremyinseattle,

I tried the API and was wondering: for the genes Fam84b, Fam184a, and Rasa2, do the returned energy/density values combine both coronal and sagittal experiments for each gene, since there is no place in the API for me to specify an experiment ID?

Thank you!

Hi,

Can you post your API query so I can see what you are seeing?

It’s just the example from above:

http://api.brain-map.org/api/v2/data/query.json?criteria=model::StructureUnionize,rma::criteria,section_data_set(genes[acronym$in'Fam84b','Fam184a','Rasa2']),rma::include,section_data_set(genes),rma::options[only$eq'genes.acronym,data_sets.id'][start_row$eq8000][num_rows$eq2000]

Each “structure unionize” record is for a specific experiment and structure.

http://api.brain-map.org/api/v2/data/query.xml?criteria=
model::StructureUnionize
,
rma::criteria,
section_data_set(genes[acronym$in'Fam84b','Fam184a','Rasa2'])
,
rma::include,
structure,section_data_set(plane_of_section)
,rma::options
[only$eq'structures.id,structures.acronym,structures.graph_order,genes.acronym,plane_of_sections.name,data_sets.id']

This modification of the query may be helpful for seeing whether each experiment was sagittal or coronal.

Here is example XML for the first record:
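
If you would rather build the query from a script, a minimal sketch in Python could look like the following. The function names are hypothetical, the `only` option is omitted so that all fields come back, and reading the rows from a `msg` key in the JSON payload is an assumption about the response layout that you should verify against a real response.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_BASE = "http://api.brain-map.org/api/v2/data/query.json"
RMA_SAFE = ":,$()[]'"  # keep the RMA punctuation readable in the URL


def build_unionize_url(gene_acronyms, start_row=0, num_rows=2000):
    """Build an RMA query for StructureUnionize rows, including plane of section."""
    genes = ",".join("'{}'".format(g) for g in gene_acronyms)
    parts = [
        "model::StructureUnionize",
        "rma::criteria",
        "section_data_set(genes[acronym$in{}])".format(genes),
        "rma::include",
        "structure,section_data_set(plane_of_section)",
        "rma::options[start_row$eq{}][num_rows$eq{}]".format(start_row, num_rows),
    ]
    return API_BASE + "?criteria=" + quote(",".join(parts), safe=RMA_SAFE)


def fetch_records(url):
    """Network call. Assumes the rows are stored under the 'msg' key."""
    with urlopen(url) as response:
        return json.load(response)["msg"]
```

For example, `build_unionize_url(["Fam84b", "Fam184a", "Rasa2"], start_row=0, num_rows=50)` gives a URL that can be pasted into a browser or passed to `fetch_records`.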

<structure-unionize>
<expression-density>0.00384803</expression-density>
<expression-energy>0.454006</expression-energy>
<id>453370663</id>
<section-data-set-id>75081395</section-data-set-id>
<structure-id>17437</structure-id>
<sum-expressing-pixel-intensity>5350170.0</sum-expressing-pixel-intensity>
<sum-expressing-pixels>45346.6</sum-expressing-pixels>
<sum-pixel-intensity>247297000.0</sum-pixel-intensity>
<sum-pixels>11784400.0</sum-pixels>
<voxel-energy-cv>0.839586</voxel-energy-cv>
<voxel-energy-mean>0.454002</voxel-energy-mean>
<section-data-set>
<id type="integer">75081395</id>
<plane-of-section>
<name>coronal</name>
</plane-of-section>
</section-data-set>
<structure>
<acronym>r8B</acronym>
<graph-order type="integer">2118</graph-order>
<id type="integer">17437</id>
</structure>
</structure-unionize>

Does this help?
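
For reading such a record from a script, here is a small Python sketch using the standard library XML parser on a trimmed copy of the record above. The ratios it recomputes (density as expressing pixels over all pixels, energy as expressing-pixel intensity over all pixels) are a reading of the field names that happens to agree with the reported values for this record; treat the whitepaper as the authoritative definition.

```python
import xml.etree.ElementTree as ET

RECORD_XML = """
<structure-unionize>
  <expression-density>0.00384803</expression-density>
  <expression-energy>0.454006</expression-energy>
  <sum-expressing-pixel-intensity>5350170.0</sum-expressing-pixel-intensity>
  <sum-expressing-pixels>45346.6</sum-expressing-pixels>
  <sum-pixels>11784400.0</sum-pixels>
  <section-data-set>
    <id type="integer">75081395</id>
    <plane-of-section><name>coronal</name></plane-of-section>
  </section-data-set>
  <structure><acronym>r8B</acronym></structure>
</structure-unionize>
"""


def summarize_record(xml_text):
    """Return reported and recomputed values for one structure-unionize record."""
    root = ET.fromstring(xml_text)

    def number(tag):
        return float(root.findtext(tag))

    sum_pixels = number("sum-pixels")
    return {
        "plane": root.findtext("section-data-set/plane-of-section/name"),
        "structure": root.findtext("structure/acronym"),
        "density_reported": number("expression-density"),
        "density_recomputed": number("sum-expressing-pixels") / sum_pixels,
        "energy_reported": number("expression-energy"),
        "energy_recomputed": number("sum-expressing-pixel-intensity") / sum_pixels,
    }
```

Grouping the summaries by `plane` is then a simple way to keep coronal and sagittal experiments apart when comparing values for one gene.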

Hello, what does voxel_energy mean and how could I convert it to expression density?

Hi @caroline78,
More information about term definitions and about calculating gene expression from image density measurements is in the documentation for the Allen Mouse Brain Atlas, downloadable as a PDF here:


You can also learn more about the API in the Help section:
help.brain-map.org/display/mousebrain/API

Hope this helps!

A very informative thread about how ISH intensities are colorized!

I am curious if there is a way to map the expression mask RGBs back to an intensity value? I understand a major caveat here is that probe to intensity signal is itself non-linear. Is the mapping from intensity to jet linear once a feature is detected in the mask?
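
As a starting point, here is a sketch in Python of inverting a jet colormap by nearest-neighbor lookup. It assumes the mask uses a common piecewise-linear approximation of jet with 256 levels and that intensity maps linearly onto the colormap; neither is confirmed in this thread, so the recovered value is only a relative position in [0, 1] and not a calibrated intensity.

```python
def jet(x):
    """Piecewise-linear approximation of the jet colormap for x in [0, 1]."""
    def clamp(t):
        return max(0.0, min(1.0, t))
    return (clamp(1.5 - abs(4 * x - 3)),   # red
            clamp(1.5 - abs(4 * x - 2)),   # green
            clamp(1.5 - abs(4 * x - 1)))   # blue


LEVELS = [i / 255 for i in range(256)]
TABLE = [jet(x) for x in LEVELS]


def jet_to_scalar(rgb):
    """Map an (R, G, B) color with channels in [0, 1] to the nearest jet level."""
    def dist2(color):
        return sum((a - b) ** 2 for a, b in zip(rgb, color))
    best = min(range(len(TABLE)), key=lambda i: dist2(TABLE[i]))
    return LEVELS[best]
```

For 8-bit mask images, divide each channel by 255 before calling `jet_to_scalar`. Colors that are not on the jet curve (for example, compression artifacts) are snapped to the closest level.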

Hello,

My understanding is that the ISH data expressed as “expression energy” is not quantitative with respect to the actual amount of mRNA, because the relationship between color intensity and mRNA quantity is not linear.

  1. However, the data should be roughly linear within a certain range for the same probe. This would allow for relative quantitative comparisons of gene expression in different brain areas within a certain range of expression. Is there any estimate of the range of expression energy values within which the signal can be considered fairly linear?

  2. I found that for some genes different probes have been used at different developmental stages of the mouse brain. In the case of the gene Cdh5, one probe was used for embryonic samples ( RP_080912_03_B02 ) and another for P56 ( RP_070219_02_C05 ).
    In this case the expression energy is quite different between the embryonic and postnatal samples.
    If a different probe was used, it is not possible to compare expression levels between stages.
    Is there any reason why two different probes were used?
    Has this been a common practice and should be checked for each gene?

Thank you for your help
Juan

Hello,
I would like to know what the probe name used for the ISH experiments means in the Developing mouse brain atlas.
Some ISH experiments done for the same gene at different developmental stages show different probe names. However, in some cases the sequence of the primers is the same but the probe number is not. If the primers used for the PCR were the same for generating 2 probes, the sequence of the probe should be the same, but in some cases they have a different number.
Thus, how do I know if the same probe was used for different experiments? What could be the differences between 2 probes with different probe number but generated with the same pair of primers?

Thank you for your help

Juan

Hi Juan,

The probe name, with the embedded date, indicates when the probe was made (“lot control”). The templateID indicates the probe design, so the sequence for a given templateID will always be the same. Hundreds of different probes could have been generated using the same templateID at different points in time, for different projects.

Thank you tylermo for the response about the probe names. It’s very helpful.

I was wondering about the status of the annotation of the P28 Developing Mouse Brain Atlas.
The Reference atlas does not show P28; however, sagittal P28 experiments are shown for a number of genes. A query using RMA with the P28 section_IDs and structure unionize retrieves expression values. Do these values correspond to the annotation used for other postnatal stages? If so, what level of annotation should be considered for P28?

Thank you
Juan

To support quantification of data at P28, we registered the P56 annotation to the P28 reference space. Hence the level of annotation for P28 is the same as for P56 data.

Hello,
Thank you for all previous answers!
I’m trying to find the connectivity values between the isocortex and the subpallium. These data were published as a low-resolution matrix heatmap in Fig. 3 of the Oh et al. 2014 Nature paper, which refers to a supplementary table for the raw values. However, I couldn’t find the supplementary table. I was wondering if someone has access to a higher-resolution figure of the connectivity values between each structure of the isocortex and the subpallium.
Thank you for your help

Juan

Hello,
Never mind. I found the raw data of the connectivity heat map.
Thanks
Juan

Thanks for all the additional information on the topic.