Hi everybody,
I am a new user of the AllenSDK, and I would like to start using this dataset with my students in a course. However, I'm running into problems when loading the data.
I followed the basic tutorials and downloaded the data for one session, but I'm unable to work with it: when I call almost any of the session's functions, my laptop runs out of RAM.
# Downloading and loading the session works fine; it takes a few seconds.
session_id = 798911424
oursession = cache.get_session_data(session_id)
#When I call this in a notebook cell, RAM usage starts to gradually increase
oursession.metadata
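For reference, this is roughly how I've been checking the memory growth from inside the notebook. It's a quick stdlib-only sketch (on Linux, ru_maxrss is reported in kilobytes), not anything from the AllenSDK itself:

```python
# Stdlib-only helper to watch this process's memory around an SDK call.
# Note: the resource module is Unix-only, and on Linux ru_maxrss is in KB.
import resource

def peak_rss_mb():
    """Peak resident set size of the current process, in MB (Linux semantics)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

before = peak_rss_mb()
# oursession.metadata  # <- the call that keeps allocating on my machine
after = peak_rss_mb()
print(f"peak RSS before: {before:.0f} MB, after: {after:.0f} MB")
```

Watching this value (or a system monitor) is how I got the numbers I quote below.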
Calling get_session_data(session_id) allocates around 1 GB, which I believe is reasonable given that all the data goes into that variable. Totally fine.
The weird thing is that asking for the metadata gradually allocates more and more memory, and the call never seems to finish. After roughly 1-2 minutes it has allocated enough to exhaust my RAM (my laptop has 8 GB; the call alone was using about 5 GB). And the code is still executing at that point, i.e., it's not that the call finishes and there's a memory leak somewhere else.
Is this normal behaviour? To my understanding, metadata is just a dictionary with a few fields; there's no way it weighs that much. I see the same problem when I call, e.g., oursession.structurewise_unit_counts. Also, when I manually interrupt the kernel, the memory is not freed.
I tried to find system requirements for the AllenSDK to figure out whether this is expected behaviour and I simply need more RAM, or whether something else is going on. To me it looks like a genuine problem, but I haven't been able to diagnose it.
My OS is Ubuntu 20.04. I am using AllenSDK in a clean environment with Python 3.11 and AllenSDK 2.16.2.
Thank you all in advance for the support!