Hi @ajuavinett
I think I have an answer for you. This is going to be a bit rambling because the answer I have was cobbled together from several community posts and several hunches about how this API works. I suspect that the answer I am ultimately going to give you will not completely suit your use case, so I’m going to try to leave enough bread crumbs that you can reverse engineer whatever other functionality you need should you get stuck again (though, if that fails, please do post again; the more we can get this stuff written down the better).
Getting set up
To start with, you will need to install the Python libraries requests and xmltodict. They are both pip installable, so
pip install requests xmltodict
ought to just work.
The key to how this API works is that the various URLs posted above return metadata about our datasets in the form of an XML document. XML is a structured way of representing data. I don’t particularly like working in XML. Fortunately, the xmltodict library will just convert it to a Python dict for you.
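To make the "XML as a nested dict" idea concrete without touching the network, here is a toy stand-in written with only the standard library. It is not how xmltodict is implemented, and the sample XML is hand-written and heavily simplified (real responses carry many more fields). The names element_to_dict and sample_xml are mine, made up for illustration; the tag names mirror the keys used later in this post.

```python
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Recursively turn an XML element into nested dicts, lists and strings."""
    children = list(element)
    if not children:
        # leaf element: just keep its text
        return element.text
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # a repeated tag becomes a list of values
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


# Hand-written, simplified sample (an assumption, not a real API response)
sample_xml = """
<Response>
  <section-data-sets>
    <section-data-set><id>70813257</id></section-data-set>
    <section-data-set><id>69855739</id></section-data-set>
  </section-data-sets>
</Response>
""".strip()

root = ET.fromstring(sample_xml)
as_dict = {root.tag: element_to_dict(root)}
ids = [
    dataset['id']
    for dataset in as_dict['Response']['section-data-sets']['section-data-set']
]
print(ids)  # ['70813257', '69855739']
```

The point is only that nested tags become nested keys, and a repeated tag becomes a list you can loop over.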
In principle: Mouse data
Let’s pretend that you were asking about mouse data. In the post above, you will see the following URL
http://api.brain-map.org/api/v2/data/query.xml?criteria= \
model::SectionDataSet,
rma::criteria,[failed$eq'false'],products[abbreviation$eq'Mouse'],plane_of_section[name$eq'sagittal'],genes[acronym$eq'Adora2a']
This URL gives you metadata about all sagittal section datasets in Mouse ISH that are stained for the gene Adora2a. You would use it as follows
import requests
import xmltodict
# RMA query: sagittal Mouse section datasets for Adora2a with failed == 'false'
section_dataset_url = (
    "http://api.brain-map.org/api/v2/data/query.xml?criteria="
    "model::SectionDataSet,"
    "rma::criteria,[failed$eq'false'],products[abbreviation$eq'Mouse']"
    ",plane_of_section[name$eq'sagittal'],genes[acronym$eq'Adora2a']"
)
section_dataset_xml = requests.get(section_dataset_url)
# convert the XML response into nested Python dicts and lists
as_dict = xmltodict.parse(section_dataset_xml.text)
section_list = as_dict['Response']['section-data-sets']['section-data-set']
for section_dataset in section_list:
    print(section_dataset['id'])
This will give you the result
70813257
69855739
which are the IDs for the two section datasets that fit our criteria. Now, we need to find all of the images that go with those section datasets. To do that, we would use something like the URL
http://api.brain-map.org/api/v2/data/query.xml?criteria=
model::SectionImage,
rma::criteria,[data_set_id$eq70813257]
Sven posted above to break down a section_dataset to its individual Section Images. Programmatically, that looks like
for section_dataset in section_list:
    section_id = section_dataset['id']
    # RMA query for every section image belonging to this section dataset
    section_image_url = (
        "http://api.brain-map.org/api/v2/data/query.xml?criteria="
        "model::SectionImage,"
        f"rma::criteria,[data_set_id$eq{section_id}]"
    )
    print(f'=======section dataset {section_id}=======')
    section_image_xml = requests.get(section_image_url)
    section_image_dict = xmltodict.parse(section_image_xml.text)
    image_list = section_image_dict['Response']['section-images']['section-image']
    for image in image_list:
        print(image['id'])
For each section dataset, we are querying its section images, looping over those images, and displaying their IDs. These are the IDs that go into the image download URL
http://api.brain-map.org/api/v2/image_download/70679088
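If it helps, that last step can be wrapped in a small helper. This is only a sketch: image_download_url and download_image are hypothetical names of mine, the URL pattern is copied from the line above, and I am assuming the endpoint returns the image bytes directly, so that writing them to a file of your choosing is enough (I have not checked what format or options it supports).

```python
import urllib.request

# URL pattern taken from the example above
IMAGE_DOWNLOAD_ROOT = "http://api.brain-map.org/api/v2/image_download"


def image_download_url(image_id):
    """Build the download URL for one section image ID."""
    return f"{IMAGE_DOWNLOAD_ROOT}/{image_id}"


def download_image(image_id, destination):
    """Fetch one image and write it to `destination` (assumes raw image bytes)."""
    urllib.request.urlretrieve(image_download_url(image_id), destination)
    return destination


print(image_download_url(70679088))
# http://api.brain-map.org/api/v2/image_download/70679088
```

I used urllib here only to keep the sketch free of extra dependencies; requests.get(url).content written to a file should do the same job.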
Generalizing
But you didn’t ask about Mouse data. Obviously, we have to change something about this URL
http://api.brain-map.org/api/v2/data/query.xml?criteria= \
model::SectionDataSet,
rma::criteria,[failed$eq'false'],products[abbreviation$eq'Mouse'],plane_of_section[name$eq'sagittal'],genes[acronym$eq'Adora2a']
to get the human sections. Just changing 'Adora2a' to 'DISC1' fails (I assume because 'DISC1' is not a valid mouse gene). Unfortunately, changing products[abbreviation$eq'Mouse'] to products[abbreviation$eq'Human'] also fails. The relevant human dataset is not called 'Human'.
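Since we are about to swap criteria in and out of this URL, it can be handy to assemble it from its parts. Here is a minimal sketch; build_section_dataset_url is a hypothetical helper of mine, not part of any Allen Institute library, and it only covers the criteria that appear in this thread (product, gene, and an optional plane of section).

```python
QUERY_ROOT = "http://api.brain-map.org/api/v2/data/query.xml"


def build_section_dataset_url(product, gene, plane=None):
    """Assemble the SectionDataSet query URL used in this post."""
    criteria = [
        "[failed$eq'false']",
        f"products[abbreviation$eq'{product}']",
    ]
    if plane is not None:
        # only filter on plane of section when asked to
        criteria.append(f"plane_of_section[name$eq'{plane}']")
    criteria.append(f"genes[acronym$eq'{gene}']")
    return (
        f"{QUERY_ROOT}?criteria=model::SectionDataSet,"
        "rma::criteria," + ",".join(criteria)
    )


# Reproduces the mouse URL from earlier in the post
print(build_section_dataset_url('Mouse', 'Adora2a', plane='sagittal'))
```

Trying a different product or gene is then a one-argument change rather than string surgery. Whether a given combination returns anything is up to the API.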
If you look in this post you will see reference to a URL that queries products
`http://api.brain-map.org/api/v2/data/Product/query.xml?num_rows=10&start_row=20&order=products.name`
I’m going to edit this to drop num_rows and set start_row=0 in the hopes that this will get us all of the products. The code I am running is
import requests
import xmltodict
product_query = (
    "http://api.brain-map.org/api/v2/data/Product/"
    "query.xml?start_row=0&order=products.name"
)
product_xml = requests.get(product_query)
as_dict = xmltodict.parse(product_xml.text)
# loop through products; list all with some variation of
# "human" in their abbreviation
for product in as_dict['Response']['products']['product']:
    if 'human' in product['abbreviation'].lower():
        print(product['abbreviation'])
which gives the result
HumanTBIonlyISH
HumanTBInoISH
DevHumanISH
DevHumanMA
DevHumanRef
DevHumanTrans
HumanASD
HumanCtx
HumanNT
HumanSZ
HumanSubCtx
HumanMA
HumanCellTypes
HumanCellTypesHistology
HumanCellTypesTranscriptomics
HumanGBMRNASeq
Based on an educated hunch, let’s assume the product you want is 'DevHumanISH'. Now, our code to get section image IDs is
import requests
import xmltodict
# RMA query: DevHumanISH section datasets for DISC1 with failed == 'false'
section_dataset_url = (
    "http://api.brain-map.org/api/v2/data/query.xml?criteria="
    "model::SectionDataSet,"
    "rma::criteria,[failed$eq'false'],products[abbreviation$eq'DevHumanISH'],"
    "genes[acronym$eq'DISC1']"
)
section_dataset_xml = requests.get(section_dataset_url)
as_dict = xmltodict.parse(section_dataset_xml.text)
section_list = as_dict['Response']['section-data-sets']['section-data-set']
for section_dataset in section_list:
    section_id = section_dataset['id']
    # RMA query for every section image belonging to this section dataset
    section_image_url = (
        "http://api.brain-map.org/api/v2/data/query.xml?criteria="
        "model::SectionImage,"
        f"rma::criteria,[data_set_id$eq{section_id}]"
    )
    print(f'=======section dataset {section_id}=======')
    section_image_xml = requests.get(section_image_url)
    section_image_dict = xmltodict.parse(section_image_xml.text)
    image_list = section_image_dict['Response']['section-images']['section-image']
    for image in image_list:
        print(image['id'])
Note: I dropped the requirement that the images be sagittal. That was returning no results.
The above code returns a bunch of results like
=======section dataset 100116503=======
101679545
101679533
101679460
=======section dataset 100118416=======
101693918
101693910
101699143
101693902
...
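One gotcha with the loops above, as far as I understand xmltodict: a tag that appears several times comes back as a list, a tag that appears only once comes back as a single dict, and an empty element comes back as None. So the loops only behave when a query matches two or more items. A small defensive helper smooths that over. as_list is a name I made up, and the dicts below are hand-written stand-ins for parsed responses, not real API output.

```python
def as_list(parent, tag):
    """Return whatever `parent` holds under `tag` as a list (possibly empty)."""
    if parent is None:
        # the container element was empty
        return []
    found = parent.get(tag)
    if found is None:
        return []
    # a single match arrives as a bare dict; wrap it so callers can always loop
    return found if isinstance(found, list) else [found]


# Hand-written stand-ins for the three shapes described above
many = {'section-image': [{'id': '101679545'}, {'id': '101679533'}]}
one = {'section-image': {'id': '101679545'}}
empty = None

for parsed in (many, one, empty):
    print([image['id'] for image in as_list(parsed, 'section-image')])
```

In the code above you would write as_list(section_image_dict['Response']['section-images'], 'section-image') in place of the direct lookup. xmltodict.parse also accepts a force_list argument, which should cover the single-item case, though I don't believe it helps when the container is empty.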
I hope this gets you unstuck. If you want to play around some more with the API, you can look for community posts with the keyword “RMA” (“RESTful Model Access”). A lot of legacy documentation got moved into the forums a year or two ago.
And of course, post again if something does not work for you.
PS this post is probably a good resource for anyone wanting to engage with this API.