My computer is about 7 years old. It’s great that it has a GPU and an SSD, but it’s not enough for a holiday project I came up with: I ran up against a wall with the data storage and processing power I need.
I need cloud computing for my little neural data visualization project idea!
I went on Google Cloud and created a Jupyter notebook instance on Vertex AI. I wanted to use this instance to pull Allen data into Google Cloud Storage, so I found Cloud Storage and created a bucket with a folder for the data. I then uploaded a manifest.json file that I had previously generated on my own computer with the Allen Brain Observatory (AllenSDK) commands.
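The local step looked roughly like this; the bucket name (allen-neuropixesl) and folder (AllenNeuroPixelsCache) are just what I called mine, and I’m not sure this is exactly how I copied the file up (it may have been the Cloud Console uploader):

# Rough sketch of the local step: let the AllenSDK write a manifest.json,
# then copy it into the bucket. Names are the ones I created; the upload
# could just as well be done with gsutil or the Cloud Console.
from allensdk.brain_observatory.ecephys.ecephys_project_cache import EcephysProjectCache
from google.cloud import storage

local_manifest = "AllenNeuroPixelsCache/manifest.json"
EcephysProjectCache.from_warehouse(manifest=local_manifest)  # should write manifest.json locally

client = storage.Client()
bucket = client.bucket("allen-neuropixesl")
bucket.blob("AllenNeuroPixelsCache/manifest.json").upload_from_filename(local_manifest)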
Here’s the code for pulling the data that I ran on the Jupyter cloud instance:
Don’t forget to install allensdk (for some reason you also need the --user flag, otherwise it fails): !pip install --user allensdk
from google.cloud import storage
import os
from allensdk.brain_observatory.ecephys.ecephys_project_cache import EcephysProjectCache
import numpy as np
import pandas as pd
client = storage.Client()
def configure_allensdk():
    # Configure the cache to use Google Cloud Storage
    # EcephysProjectCache.cache_data = True
    # Replace bucket_name and storage_path with your own names of the buckets and folders
    # you create on Google Cloud Storage
    # EcephysProjectCache.manifest_uri = f'gs://{bucket_name}/{storage_path}manifest.json'
    manifest_path = "gs://allen-neuropixesl/AllenNeuroPixelsCache/manifest.json"
    # Initialize the cache
    cache = EcephysProjectCache.from_warehouse(manifest=manifest_path)
    sessions = cache.get_session_table().index
    print(sessions)  # See the session identifiers
    return sessions, cache
sessions, cache = configure_allensdk()
def make_csd_images():
    for s in sessions:
        session = cache.get_session_data(
            s,
            isi_violations_maximum=np.inf,
            amplitude_cutoff_maximum=np.inf,
            presence_ratio_minimum=-np.inf,
        )
        probes = session.probes.index
        print(probes)
        # There are 6 probes for each session.
        for p in probes:
            # Accessing the LFP and CSD makes the SDK download and cache the data
            session.get_lfp(p)
            session.get_current_source_density(p)

make_csd_images()
The files loaded faster in the notebook after I ran the script, but I can’t see them in Cloud Storage. The Jupyter instance has 100 GB of storage, so I suppose it could have pulled the data there instead. Now the instance seems to be frozen: I can’t get it to finish provisioning and it takes a long time. Maybe that’s related to my filling up the 100 GB with the data.
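If the cache really did land on the instance’s own disk, I imagine something like this could copy it up into the bucket afterwards; the local path here is only a placeholder, since I’m not sure where the SDK actually wrote the files:

import os
from google.cloud import storage

# Placeholder: wherever the AllenSDK cache actually ended up on the instance
local_cache_dir = "AllenNeuroPixelsCache"
bucket = storage.Client().bucket("allen-neuropixesl")

for root, _, files in os.walk(local_cache_dir):
    for name in files:
        local_path = os.path.join(root, name)
        blob_name = os.path.relpath(local_path)  # keep the folder structure as the object name
        bucket.blob(blob_name).upload_from_filename(local_path)
        print("uploaded", blob_name)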
I was just wondering if you could give me any comments on what I might be doing wrong, or whether it’s possible to pull the data into Cloud Storage somehow. Perhaps there’s a better way than what I tried? I’m also working through the documentation on Google Cloud (support is $29 a month).
Best,
Maria
PS Love the work you are doing!