Confused about datasets (mouse gene expression - ISH)

Hi,

I have been going through the literature on mouse gene expression datasets, and I read this article:
https://www.nature.com/articles/nn.2281
It says that the AGEA was built from 4,000+ genes from the ABA, each covering 51,533 cubic voxels. However, when I checked the mouse_expression_data_sets.csv file, I found 25,000+ genes, each on a grid of 58 * 67 * 41 voxels (159,326 in total). A sample raw gene expression file can be downloaded from this link:
http://download.alleninstitute.org/informatics-archive/october-2014/mouse_expression/01/3_grid.zip
I’m wondering why the voxel count of this dataset does not match the paper.

Best,
Momo

Any reply please?

I’m not an expert in this, but I think this line from the methods explains the difference:

The 200-μm 3D grid consists of 67 coronal × 41 horizontal × 58 sagittal voxels (sum = 159,326), of which 51,533 voxels are masked to intersect the ABA data.

I believe the brain itself occupies about 51.5k voxels of the 159k-voxel rectangular volume, so the 159,326 figure is the full bounding grid and 51,533 is the part that actually holds data.
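
As a quick sanity check on the numbers quoted from the methods, here is a minimal Python sketch. It only redoes the arithmetic; the variable names are mine, and it does not read the downloaded grid file.

```python
# Grid dimensions quoted in the paper's methods (200-um voxels).
CORONAL, HORIZONTAL, SAGITTAL = 67, 41, 58

# Voxels that intersect the ABA data, as stated in the paper.
MASKED_VOXELS = 51_533

# Size of the full rectangular grid that each dataset is stored on.
total_voxels = CORONAL * HORIZONTAL * SAGITTAL

# Share of the rectangular grid covered by the mask.
brain_fraction = MASKED_VOXELS / total_voxels

print(f"full grid: {total_voxels:,} voxels")
print(f"masked:    {MASKED_VOXELS:,} voxels ({brain_fraction:.1%} of the grid)")
```

So the product of the three dimensions is 159,326, and the masked voxels make up roughly a third of that grid.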

I think the difference in the number of genes arises because the AGEA uses only the genes that have a coronal image series:

To maximize spatial coverage, registration integrity and correlation data quality, the AGEA was developed based on the coronal ISH image series in the Allen Brain Atlas rather than the full genomic sagittal image data.

Thank you, that helped!