Downloading an Image

Hi @ajuavinett

I think I have an answer for you. This is going to be a bit rambling, because the answer was cobbled together from several community posts and a few hunches about how this API works. I suspect it will not completely suit your use case, so I’m going to try to leave enough breadcrumbs that you can reverse engineer whatever other functionality you need should you get stuck again (though, if that fails, please do post again; the more we can get this stuff written down, the better).

Getting set up

To start with, you will need to install the Python libraries requests and xmltodict. They are both pip installable, so

pip install requests xmltodict

ought to just work.

The key to how this API works is that the various URLs posted above query an API and return metadata about our datasets in the form of an XML document. XML is a structured way of representing data. I don’t particularly like working in XML. Fortunately, the xmltodict library will just convert it to a Python dict for you.
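To make that concrete, here is a minimal sketch of what xmltodict does, run on a made-up XML snippet. The element names `things` and `thing` and all of the values are my own invention, not real API output. Attributes come back as keys prefixed with "@", repeated child elements come back as a list of dicts, and every value is a string.

```python
import xmltodict

# A made-up document, loosely shaped like an API response.
# (Element names and values are illustrative only.)
toy_xml = """
<Response success="true" num_rows="2">
  <things>
    <thing><id>1</id><name>first</name></thing>
    <thing><id>2</id><name>second</name></thing>
  </things>
</Response>
""".strip()

as_dict = xmltodict.parse(toy_xml)

# XML attributes become keys prefixed with "@"; the value is a string
num_rows = as_dict["Response"]["@num_rows"]

# repeated child elements become a list of dicts
names = [thing["name"] for thing in as_dict["Response"]["things"]["thing"]]

print(num_rows, names)
```

The nested indexing (`['Response']['things']['thing']`) is the same pattern used on the real responses in the rest of this post.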

In principle: Mouse data

Let’s pretend that you were asking about mouse data. In the post above, you will see the following URL

http://api.brain-map.org/api/v2/data/query.xml?criteria= \
model::SectionDataSet, 
rma::criteria,[failed$eq'false'],products[abbreviation$eq'Mouse'],plane_of_section[name$eq'sagittal'],genes[acronym$eq'Adora2a']

This URL gives you metadata about all section images in Mouse ISH that are stained for the gene Adora2a. You would use it as follows

import requests
import xmltodict

section_dataset_url = (
    "http://api.brain-map.org/api/v2/data/query.xml?criteria="
    "model::SectionDataSet,"
    "rma::criteria,[failed$eq'false'],products[abbreviation$eq'Mouse']"
    ",plane_of_section[name$eq'sagittal'],genes[acronym$eq'Adora2a']"
)

section_dataset_xml = requests.get(
    section_dataset_url
)

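# force_list keeps a lone <section-data-set> as a one-element list
as_dict = xmltodict.parse(
    section_dataset_xml.text,
    force_list=('section-data-set',)
)
section_list = as_dict['Response']['section-data-sets']['section-data-set']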

for section_dataset in section_list:
    print(section_dataset['id'])

This will give you the result

70813257
69855739

which are the IDs for the two section datasets that fit our criteria. Now, we need to find all of the images that go with those section datasets. To do that, we would use something like the URL

http://api.brain-map.org/api/v2/data/query.xml?criteria= 
model::SectionImage, 
rma::criteria,[data_set_id$eq70813257]

Sven posted above to break down a section_dataset to its individual Section Images. Programmatically, that looks like

for section_dataset in section_list:
    section_id = section_dataset['id']
    section_image_url = (
        "http://api.brain-map.org/api/v2/data/query.xml?criteria="
        "model::SectionImage,"
        f"rma::criteria,[data_set_id$eq{section_id}]"
    )
    print(f'=======section dataset {section_id}=======')
    section_image_xml = requests.get(
        section_image_url
    )
    # force_list keeps a lone <section-image> as a one-element list
    section_image_dict = xmltodict.parse(
        section_image_xml.text,
        force_list=('section-image',)
    )
    image_list = section_image_dict['Response']['section-images']['section-image']
    for image in image_list:
        print(image['id'])

For each section dataset, we are querying its section images, looping over those images, and displaying their IDs. These are the IDs that go into the image download URL

http://api.brain-map.org/api/v2/image_download/70679088
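If you want to go from an ID to a file on disk, something like the following sketch should work. The function names `image_download_url` and `download_image` are hypothetical helpers of mine, not part of any library, and the `.jpg` extension in the usage comment is an assumption about what the endpoint returns, so check the file you actually get.

```python
import requests

IMAGE_DOWNLOAD_BASE = "http://api.brain-map.org/api/v2/image_download"


def image_download_url(image_id):
    """Build the download URL for one section image ID."""
    return f"{IMAGE_DOWNLOAD_BASE}/{image_id}"


def download_image(image_id, out_path):
    """Stream one section image to out_path and return the path."""
    response = requests.get(
        image_download_url(image_id), stream=True, timeout=60
    )
    # raise an exception on 4xx/5xx instead of saving an error page
    response.raise_for_status()
    with open(out_path, "wb") as out_file:
        # write in 1 MB chunks so a large image is never fully in memory
        for chunk in response.iter_content(chunk_size=1 << 20):
            out_file.write(chunk)
    return out_path


# usage (not run here; the .jpg extension is an assumption):
# download_image(70679088, "70679088.jpg")
```

Streaming is a design choice on my part: full-resolution section images can be large, and this keeps memory use flat.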

Generalizing

But you didn’t ask about Mouse data. Obviously, we have to change something about this URL

http://api.brain-map.org/api/v2/data/query.xml?criteria= \
model::SectionDataSet, 
rma::criteria,[failed$eq'false'],products[abbreviation$eq'Mouse'],plane_of_section[name$eq'sagittal'],genes[acronym$eq'Adora2a']

to get the human sections. Just changing 'Adora2a' to 'DISC1' fails (I assume because 'DISC1' is not a valid mouse gene). Unfortunately, changing products[abbreviation$eq'Mouse'] to products[abbreviation$eq'Human'] also fails. The relevant human dataset is not called 'Human'.
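A note on what "fails" can look like in code. This is a sketch under an assumption I have not verified for every query: that a query with no matches comes back with an empty container element such as `<section-data-sets/>`. xmltodict parses an empty element as `None`, so the usual nested indexing raises a `TypeError`. A lone match has a related problem: it is parsed as a single dict rather than a one-element list. The helper `extract_records` below is a hypothetical name of mine that papers over both cases, demonstrated on made-up XML.

```python
import xmltodict


def extract_records(response_dict, plural_key, singular_key):
    """Return the records in a parsed response as a list (possibly empty)."""
    container = response_dict["Response"].get(plural_key)
    if container is None:
        # empty element, e.g. <section-data-sets/>: nothing matched
        return []
    records = container.get(singular_key, [])
    if isinstance(records, dict):
        # a lone child is parsed as a dict, not a one-element list
        records = [records]
    return records


# made-up responses (assumed shapes, not captured from the API)
no_match = xmltodict.parse(
    '<Response success="true"><section-data-sets/></Response>'
)
one_match = xmltodict.parse(
    "<Response><section-data-sets>"
    "<section-data-set><id>1</id></section-data-set>"
    "</section-data-sets></Response>"
)

empty = extract_records(no_match, "section-data-sets", "section-data-set")
single = extract_records(one_match, "section-data-sets", "section-data-set")
print(len(empty), len(single))
```

With this in place, an empty list tells you the criteria matched nothing, which is the signal to go hunting for the right product or gene name.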

If you look in this post, you will see a reference to a URL that queries products

`http://api.brain-map.org/api/v2/data/Product/query.xml?num_rows=10&start_row=20&order=products.name`

I’m going to edit this to list no num_rows and start_row=0 in the hopes that this will get us all of the products. The code I am running is

import requests
import xmltodict

product_query = (
    "http://api.brain-map.org/api/v2/data/Product/"
    "query.xml?start_row=0&order=products.name"
)

product_xml = requests.get(product_query)

as_dict = xmltodict.parse(product_xml.text)

# loop through products; list all with some variation of
# "human" in their abbreviation
for product in as_dict['Response']['products']['product']:
    if 'human' in product['abbreviation'].lower():
        print(product['abbreviation'])

which gives the result

HumanTBIonlyISH
HumanTBInoISH
DevHumanISH
DevHumanMA
DevHumanRef
DevHumanTrans
HumanASD
HumanCtx
HumanNT
HumanSZ
HumanSubCtx
HumanMA
HumanCellTypes
HumanCellTypesHistology
HumanCellTypesTranscriptomics
HumanGBMRNASeq

Based on an educated hunch, let’s assume the product you want is 'DevHumanISH'. Now, our code to get section image IDs is

import requests
import xmltodict

section_dataset_url = (
    "http://api.brain-map.org/api/v2/data/query.xml?criteria="
    "model::SectionDataSet,"
    "rma::criteria,[failed$eq'false'],products[abbreviation$eq'DevHumanISH'],"
    "genes[acronym$eq'DISC1']"
)

section_dataset_xml = requests.get(
    section_dataset_url
)

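# force_list keeps a lone <section-data-set> as a one-element list
as_dict = xmltodict.parse(
    section_dataset_xml.text,
    force_list=('section-data-set',)
)
section_list = as_dict['Response']['section-data-sets']['section-data-set']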

for section_dataset in section_list:
    section_id = section_dataset['id']
    section_image_url = (
        "http://api.brain-map.org/api/v2/data/query.xml?criteria="
        "model::SectionImage,"
        f"rma::criteria,[data_set_id$eq{section_id}]"
    )
    print(f'=======section dataset {section_id}=======')
    section_image_xml = requests.get(
        section_image_url
    )
    # force_list keeps a lone <section-image> as a one-element list
    section_image_dict = xmltodict.parse(
        section_image_xml.text,
        force_list=('section-image',)
    )
    image_list = section_image_dict['Response']['section-images']['section-image']
    for image in image_list:
        print(image['id'])

Note: I dropped the requirement that the images be sagittal. That was returning no results.

The above code returns a bunch of results like

=======section dataset 100116503=======
101679545
101679533
101679460
=======section dataset 100118416=======
101693918
101693910
101699143
101693902
...

I hope this gets you unstuck. If you want to play around some more with the API, you can look for community posts with the keyword “RMA” (“RESTful Model Access”). A lot of legacy documentation got moved into the forums a year or two ago.

And of course, post again if something does not work for you.

PS this post is probably a good resource for anyone wanting to engage with this API.