Opening raw data in a Jupyter notebook created in AWS SageMaker

I recently connected to the Allen Brain Observatory bucket via AWS and now have access to download raw data (i.e. spike_band.dat).

However, the raw data is quite large, and I would like to access it without downloading anything to my computer so that I can choose which data is most suitable for me.

I have attempted to open it in a Jupyter notebook, but I am getting a permission error. This is probably because the files are private and need to be accessed through a dedicated method. However, I do not know of any function of the form get_raw_data(session_id, probe_id) or get_spike_band(session_id, probe_id), analogous to get_ophys_experiments() for the Brain Observatory cache. Does anyone know of a way to open and process the data in a Jupyter notebook?

Hi @maciej-123,

Thanks for the question! There were some public-access permission issues on the S3 bucket for those spike_band.dat files that needed to be addressed. I’ve modified the bucket settings so that those files should now be publicly accessible. Could you try again and let me know how it goes?

Best,

Nick

Thank you for helping me out. I have restarted Jupyter and attempted to reconnect, but I am still getting the same permission error (I assume no extra steps are needed apart from this).

Did you give public access to all the spike_band files? If not, please let me know which files I can access. Also, am I using the correct method to access the data, or should I try some other way?

Hi @maciej-123 ,

I’ve confirmed that all spike_band.dat files in the S3 bucket are now public. Because of the large size of the raw data, there is unfortunately no set of AllenSDK classes and methods for easy access to, or download of, the raw data.

If you want to download and look at downsampled data you can take a look at:
https://allensdk.readthedocs.io/en/latest/_static/examples/nb/ecephys_data_access.html

Additional tutorials for working with and analyzing downsampled data can be found at: Visual Coding – Neuropixels — Allen SDK dev documentation

In terms of accessing the raw spike_band.dat files, it looks like you have mounted the S3 bucket as a local file system. Are you using s3fs? If so, can you share how you are mounting that ecephys raw data S3 bucket?

Best,

Nick

Thank you for your help. I do not quite understand what you mean by mounting the bucket. I am using a Jupyter notebook created via Amazon AWS to try to access the data.

From this sentence on the s3fs GitHub page, ‘s3fs allows Linux and macOS to mount an S3 bucket via FUSE’, it appears that I am not using s3fs, as my operating system is Windows 10.
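
For anyone reading this thread later: mounting is not strictly required to look at part of a public S3 object. Below is a minimal sketch, not an official AllenSDK method, that reads a short window of samples with an unsigned (anonymous) boto3 request and an HTTP byte range, so the full file is never downloaded. It rests on several assumptions that are not confirmed in this thread: that the file is a flat array of little-endian int16 values with 384 interleaved channels, and that you supply the real bucket name and object key yourself. The names fetch_samples, byte_range and decode_samples are hypothetical helpers written for this example.

```python
import struct

N_CHANNELS = 384       # assumption: 384 recording channels per probe
BYTES_PER_VALUE = 2    # assumption: samples stored as int16


def byte_range(start_sample, n_samples, n_channels=N_CHANNELS):
    """Return the inclusive (first, last) byte offsets covering the window."""
    frame_bytes = n_channels * BYTES_PER_VALUE  # one time point, all channels
    first = start_sample * frame_bytes
    last = first + n_samples * frame_bytes - 1
    return first, last


def decode_samples(raw, n_channels=N_CHANNELS):
    """Decode little-endian int16 bytes into a list of per-sample rows."""
    frame_bytes = n_channels * BYTES_PER_VALUE
    n_frames = len(raw) // frame_bytes  # ignore any trailing partial frame
    values = struct.unpack(
        f"<{n_frames * n_channels}h", raw[: n_frames * frame_bytes]
    )
    return [
        list(values[i * n_channels:(i + 1) * n_channels])
        for i in range(n_frames)
    ]


def fetch_samples(bucket, key, start_sample, n_samples, n_channels=N_CHANNELS):
    """Read a window of samples from a public S3 object without credentials."""
    # Imported here so the helpers above work even without boto3 installed.
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
    first, last = byte_range(start_sample, n_samples, n_channels)
    response = s3.get_object(
        Bucket=bucket, Key=key, Range=f"bytes={first}-{last}"
    )
    return decode_samples(response["Body"].read(), n_channels)


# Example call (bucket and key are placeholders you must fill in):
# rows = fetch_samples("<bucket name>", "<key of a spike_band.dat file>", 0, 30000)
```

Because the request is unsigned, this only works if the object really is publicly readable; a permission error here would suggest the issue is on the bucket side rather than in the notebook. The same code should run on Windows, since it does not rely on FUSE.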