Microarray Expression File

Hello, I am an undergrad student at UC Davis with no experience using the API or SDK of the Allen Brain Institute.

I am trying to create a Python module that will match patient microarray expression data to their disease and anatomical structure. Using my module and the downloaded microarray data, I want to be able to query things like: “Given the gene classification ‘autism’, what genes are expressed?” or “Given the anatomical structure ‘prefrontal cortex’, what diseases have their genes significantly expressed here?”

Now, I have the Expression.csv file, and I understand that the rows are the probes and each column holds z-scores. However, I don’t know which column goes with which structure or which patient. So my question is: how is the expression file from the downloaded microarray data organized?

To reiterate, I downloaded all the data for the gene classification “autism”. The expression file has no headers detailing which anatomical structure or patient the z-scores come from. Is there a key for the CSV file?

Hi @GerrikLabra. When you download expression data using a “Download this data” button, you should get a zip file with four files: “Contents.txt”, “Expression.csv”, “Columns.csv”, and “Probes.csv”. Contents.txt provides more details, but in short, Probes.csv tells you about the probes/genes and Columns.csv tells you about the samples, including the information you are asking about here.


Hi Jeremy,

I read the document and the help files, but they only tell me the following.

"The file Expression.csv contains expression values, calculated using zscore normalization. Each row begins with the ID of a probe.

The file Columns.csv contains metadata for each column in Expression.csv, arranged in the same order."

So, does row 1, column B of Expression.csv relate to the second row of Columns.csv? And do rows 2, 3, 4, 5, etc. of column B in Expression.csv relate to that same row of Columns.csv? OK. Then columns B onward in Expression.csv together represent all the rows in Columns.csv?

Correct! Columns B, C, D, … of Expression.csv correspond to rows 2, 3, 4, … of Columns.csv. Rows 1, 2, 3, … of Expression.csv correspond to rows 2, 3, 4, … of Probes.csv, and can also be matched using the IDs in column A.
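In case it helps, here is a minimal Python sketch of that mapping using only the standard `csv` module. The function name `load_labeled_expression` is made up for illustration. The sketch assumes that row 1 of Columns.csv and of Probes.csv is a header row (which is what the one-row offset above suggests) and that Expression.csv has no header. It matches probes by position rather than by ID.

```python
import csv


def load_labeled_expression(expression_file, columns_file, probes_file):
    """Pair every z-score in Expression.csv with its probe and sample metadata.

    Each argument is an open text file (or any iterable of CSV lines).
    Returns a list of (probe_row, sample_row, z_score) tuples, where the
    two metadata rows are dicts keyed by the header names of their file.
    """
    # Assumption: Columns.csv and Probes.csv each begin with a header row,
    # so DictReader turns every following row into a dict.
    samples = list(csv.DictReader(columns_file))
    probes = list(csv.DictReader(probes_file))

    # Expression.csv has no header; skip any blank lines.
    rows = [row for row in csv.reader(expression_file) if row]
    if len(rows) != len(probes):
        raise ValueError(
            f"Expression.csv has {len(rows)} rows but Probes.csv "
            f"describes {len(probes)} probes"
        )

    records = []
    for probe, row in zip(probes, rows):
        # Column A is the probe ID; columns B onward follow the order
        # of the data rows in Columns.csv.
        probe_id, values = row[0], row[1:]
        if len(values) != len(samples):
            raise ValueError(
                f"probe {probe_id}: expected {len(samples)} values, "
                f"got {len(values)}"
            )
        for sample, value in zip(samples, values):
            records.append((probe, sample, float(value)))
    return records
```

Open the three files with `open(path, newline="")` and pass them in. Each record then carries the full Columns.csv row for its sample, so you can read the donor and structure fields from it; check the header line of your own Columns.csv for the exact field names.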
