So: I know what is happening. It is going to require a bugfix.
When you input a CSV file, the code has to do some analysis to figure out if the first column in the CSV file is a list of row names or a list of values (in which case the rows have no names). Both shapes are things we expect users to submit to us.
Part of that process is reading the CSV in with pandas. For better or worse, your CSV is big enough that when pandas reads it in (with a very default configuration, because we assume we know nothing about the shape of the CSV file), it requires too much memory and the system crashes. This is probably because pandas doesn’t know by default what is a string and what is an int, and reads everything as strings.
I need to put some work into making this process more intelligent/efficient, apparently.
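For illustration only, here is a rough sketch of one way the check could avoid loading the whole file: look at the header and a small sample of rows instead. This is not the code we run today and not necessarily the fix I will ship. The function name, the `sample_rows` parameter, and the "non-numeric first column means row names" heuristic are all assumptions made up for this example, and it uses the standard-library `csv` module rather than pandas.

```python
import csv
from itertools import islice


def first_column_looks_like_row_names(path, sample_rows=100):
    """Guess whether the first column holds row names (hypothetical heuristic).

    Reads only the header plus up to `sample_rows` data rows, so memory use
    does not grow with the size of the file.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if not header:
            # Empty file or blank first line: nothing to infer.
            return False
        # csv.reader already parses a quoted empty field ("") as an empty
        # string; the literal two-character '' is checked for explicitly.
        if header[0] in ("", "''"):
            return True
        for row in islice(reader, sample_rows):
            if not row:
                continue  # skip blank lines
            try:
                float(row[0])
            except ValueError:
                # A non-numeric entry in the first column: assume row names.
                return True
    return False
```

A sample can obviously guess wrong (for example, row names that happen to be numeric), so treat this as a sketch of the idea rather than a finished design.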
There is one quick way around this: the “read everything into pandas and intuit the meaning of the first column” step would not be necessary if the value of the first column were blank. Unfortunately, the first row in your CSV file looks like:
"",cell_name_A,cell_name_B,cell_name_C,...
which is not the same as
,cell_name_A,cell_name_B,cell_name_C,...
(the first column in the first example is not blank; it is a string with two characters, both of which are "). This is also a bug. Our code should probably treat "" and '' as blank.
I will work to fix these bugs. In the meantime, should you find yourself blocked, editing your CSV file to look like the second example above (with an actual blank first column entry) ought to get your data through.
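If the file is too large to edit comfortably by hand, something like the following should do the edit for you. It is a minimal sketch, not an official tool: the function name and the paths are placeholders, and it assumes the header is the first line of the file and that the quoted-empty entry sits at the very start of that line, as in your example. Only the header is changed; every other line is streamed through untouched.

```python
import shutil


def blank_first_header_cell(src_path, dst_path):
    """Copy a CSV, turning a leading "" or '' in the header into a blank field."""
    # newline="" keeps the file's original line endings intact.
    with open(src_path, "r", newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        header = src.readline()
        for quoted_empty in ('"",', "'',"):
            if header.startswith(quoted_empty):
                # Drop the two quote characters but keep the comma.
                header = header[len(quoted_empty) - 1:]
                break
        dst.write(header)
        # Stream the remaining rows in chunks, so memory use stays small.
        shutil.copyfileobj(src, dst)


# Example usage (placeholder file names):
# blank_first_header_cell("my_data.csv", "my_data_fixed.csv")
```

It writes to a new file rather than editing in place, so your original data stays as it was.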
Sorry about this, and thanks for the bug catch.