Overview of the Aging, Dementia and TBI Project

The Aging, Dementia & TBI study incorporates many disparate data modalities (histology, protein quantification, gene expression and clinical diagnoses), which makes visualizing relationships and correlations within the data challenging. To enable exploration of the data, we have created data snapshots that combine t-Distributed Stochastic Neighbor Embedding (t-SNE) plots with parallel coordinate plots. These snapshots are curated walk-throughs that serve as entry points for exploring the data.

Each image on this page links to a subset of the data and describes, in story form, possible interpretations of that subset.

t-Distributed Stochastic Neighbor Embedding (t-SNE) Plots

t-SNE is a visualization technique that reduces multi-dimensional data to two dimensions. In each of the data snapshots, the data are embedded using a limited number of parameters. Because the t-SNE plot is coupled to the parallel coordinate plot, the color of the data points can be changed based on independent data parameters. In some of the examples, there is the option to alter the embedding of the data by clicking the check box above the radio button in the parallel coordinate plot. For more information on this data representation, please visit t-SNE – Laurens van der Maaten.
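To make the idea of reducing multi-dimensional data to two dimensions concrete, the following is a minimal sketch of the t-SNE optimization in plain Python. It is not the code behind the snapshots: the function name tsne_sketch, the toy data, and all parameter values are illustrative assumptions, and a single fixed Gaussian bandwidth (sigma) stands in for the per-point perplexity search that full t-SNE implementations perform.

```python
import math
import random


def pairwise_sq_dists(points):
    """Squared Euclidean distance between every pair of points."""
    n = len(points)
    return [[sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
             for j in range(n)] for i in range(n)]


def joint_probs(weights):
    """Normalise a symmetric weight matrix (zero diagonal) so it sums to 1."""
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


def tsne_sketch(data, n_iter=300, learning_rate=1.0, sigma=1.0, seed=0):
    """Embed `data` (a list of equal-length numeric rows) into 2-D.

    Simplification (assumption): one fixed Gaussian bandwidth `sigma`
    replaces the per-point perplexity search of full t-SNE.
    """
    rng = random.Random(seed)
    n = len(data)
    d_hi = pairwise_sq_dists(data)
    # High-dimensional similarities: Gaussian kernel on the input distances.
    p = joint_probs([[0.0 if i == j
                      else math.exp(-d_hi[i][j] / (2 * sigma ** 2))
                      for j in range(n)] for i in range(n)])
    # Start from a small random 2-D layout.
    y = [[rng.gauss(0, 1e-2), rng.gauss(0, 1e-2)] for _ in range(n)]
    for _ in range(n_iter):
        d_lo = pairwise_sq_dists(y)
        # Low-dimensional similarities: heavy-tailed Student-t kernel.
        w = [[0.0 if i == j else 1.0 / (1.0 + d_lo[i][j]) for j in range(n)]
             for i in range(n)]
        q = joint_probs(w)
        # Plain gradient descent on KL(P || Q).
        grad = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                coeff = 4.0 * (p[i][j] - q[i][j]) * w[i][j]
                grad[i][0] += coeff * (y[i][0] - y[j][0])
                grad[i][1] += coeff * (y[i][1] - y[j][1])
        for i in range(n):
            y[i][0] -= learning_rate * grad[i][0]
            y[i][1] -= learning_rate * grad[i][1]
    return y


# Toy input (made up): two groups of three samples measured on four parameters.
data = [
    [0.0, 0.0, 0.0, 0.0], [0.2, 0.1, 0.0, 0.1], [0.1, 0.3, 0.1, 0.0],
    [5.0, 5.0, 5.0, 5.0], [5.1, 4.9, 5.2, 5.0], [4.8, 5.1, 5.0, 5.2],
]
embedding = tsne_sketch(data)  # one (x, y) pair per sample
```

Samples that are close together in the original parameter space attract each other in the 2-D layout, while dissimilar samples repel, which is why the choice of embedding parameters determines which groups separate in the plot.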

Clicking on one of the data points will lead to a specimen detail page where all data collected from this donor can be accessed.

Parallel Coordinate Plots

This data representation allows n data modalities to be plotted on n distinct axes. For each of the snapshots, the data modality is listed above its axis, along with a radio button that colors the t-SNE and parallel coordinate plots according to that parameter. When available, a checkbox over an axis can also be used to alter the embedding of the data in the t-SNE plot. Each of the coordinate axes is equipped with a slider bar that enables a subset of the data to be highlighted. Hover over the axis to enable the slider, then click and drag to limit the data represented. Excluded data points are shown as colorless circles.
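The slider and radio-button behavior described above amounts to range filtering and value-based coloring. The sketch below illustrates that logic under stated assumptions: the function names brush and color_by, the axis names, and the sample values are all hypothetical and are not taken from the project's code or data.

```python
def brush(samples, ranges):
    """Split samples into (highlighted, excluded) given per-axis slider ranges.

    `samples` is a list of dicts mapping axis name -> numeric value.
    `ranges` maps axis name -> (low, high); axes without an entry are
    unrestricted. A sample is highlighted only if it lies inside every range.
    """
    highlighted, excluded = [], []
    for sample in samples:
        inside = all(low <= sample[axis] <= high
                     for axis, (low, high) in ranges.items())
        (highlighted if inside else excluded).append(sample)
    return highlighted, excluded


def color_by(samples, axis):
    """Scale each sample's value on `axis` to a 0..1 color intensity."""
    values = [s[axis] for s in samples]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all values match
    return [(v - lo) / span for v in values]


# Hypothetical samples with two axes; the values are made up.
samples = [
    {"id": "a", "age": 80, "ptau": 0.2},
    {"id": "b", "age": 90, "ptau": 0.9},
    {"id": "c", "age": 95, "ptau": 0.4},
]
kept, dropped = brush(samples, {"age": (85, 100), "ptau": (0.0, 0.5)})
intensities = color_by(samples, "age")
```

Here the samples in dropped correspond to the colorless circles, and intensities plays the role of the color assigned when a radio button is selected.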

Data Snapshots


This snapshot demonstrates the power of these visualizations using an obvious way to embed the data: genes specific to one sex or the other. In this example, the data were embedded by the top 10 genes enriched in expression in males and the top gene enriched in expression in females. This produces a clear separation of the dataset in the data space. With the data clustered by sex, querying other parameters, such as demographics, diagnoses and histopathology, becomes a simple matter of coloring the dataset by that parameter (by clicking the radio button) and/or limiting samples using the slider bar for each axis.
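As an illustration of choosing embedding parameters, the sketch below ranks genes by how much higher their mean expression is in one group than in the rest. This is an assumption made for illustration only: the page does not state how the enriched genes were actually selected, and the function name top_enriched_genes, the gene names, and the values are hypothetical.

```python
from statistics import mean


def top_enriched_genes(expression, labels, group, k):
    """Return the k genes with the largest mean-expression excess in `group`.

    `expression` maps gene name -> list of per-sample values.
    `labels` holds the group label of each sample, in the same sample order.
    """
    scores = {}
    for gene, values in expression.items():
        in_group = [v for v, lab in zip(values, labels) if lab == group]
        others = [v for v, lab in zip(values, labels) if lab != group]
        scores[gene] = mean(in_group) - mean(others)
    # Highest excess first; keep the top k gene names.
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up expression values for four samples (two per group).
expression = {
    "GENE_A": [9, 8, 1, 1],
    "GENE_B": [1, 1, 7, 9],
    "GENE_C": [5, 5, 5, 5],
}
labels = ["M", "M", "F", "F"]
male_genes = top_enriched_genes(expression, labels, "M", 1)
female_genes = top_enriched_genes(expression, labels, "F", 2)
```

The selected genes would then serve as the input columns for the embedding, so that samples separate along the grouping used to pick them.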

Brain Regions

In this snapshot, the data were embedded by genes that are differentially expressed in each of the brain regions sampled. This lays out the differences between the cortex and hippocampus and highlights the expression similarities of the cortical regions. With this clustering, you can query region-specific genes as well as other demographic, histopathological or diagnostic parameters.

White & Grey Matter

This snapshot embeds the data based on gene expression enhanced in the white matter over the grey matter and vice versa. Not surprisingly, distinguishing these tissues highlights markers for excitatory and inhibitory neurons in the cortex, and glial cells in the white matter. With this clustering, you can then query demographic, diagnostic and neuropathologic parameters.


The data in this snapshot were embedded by gene markers for inflammation. Clustering in this manner allows for querying the data by protein concentration as well as other demographic, diagnostic or neuropathologic parameters.


The data in this snapshot were embedded by the genes most differentially expressed in the hippocampus of donors diagnosed with dementia relative to those who had no such diagnosis. This data clustering allows you to explore relationships of some specific gene markers, as well as other diagnostic, demographic or neuropathologic parameters.

Traumatic Brain Injury

In this snapshot, the data were embedded by genes differentially expressed in the cortex of donors who self-reported at least one traumatic brain injury (TBI) with a loss of consciousness versus controls. The data clustered in this manner can then be queried for other factors regarding TBI, as well as other neuropathologic, demographic or diagnostic criteria.


In this snapshot, the data were embedded by the levels of two proteins known to be increased in the brains of patients afflicted with Alzheimer’s-related dementia: phosphorylated tau (pTau), the form of tau present in neurofibrillary tangles, and the neurotoxic amyloid peptide Aβ42, which is present in amyloid plaques. Clustering the dataset in this manner allows for querying the samples by other neuropathologic, demographic and diagnostic criteria.