Question about Allen's mesh formats

The issue that keeps coming to my attention is that there is NO unified standard for named objects in meshes.

Lots of people need meshes for lots of different purposes, ranging from simple visualization to complex calculations. There are many different starting points, ranging from Neurolucida tracings to ball and stick models to numeric lattices.

To my knowledge there are about 30 commonly used mesh file formats. The most popular ones are OBJ, PLY, and STL, and of course SWC for skeletons. Only one of these (OBJ) natively handles named objects, and the sad truth is that 90% of the software out there completely ignores this aspect of the spec. For example, VTK will not read a multi-object OBJ file in a way that lets you say “that’s a vesicle, that’s a mitochondrion, and the rest of it is the plasma membrane”. Neither will MeshLab. Blender is one of the few tools that handles this properly.
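To make the “named objects” part of OBJ concrete, here is a minimal sketch of a reader that keeps faces grouped under the most recent `o` statement. The function name, the sample data, and the "unnamed" fallback label are all made up for illustration, and it handles only `o` and `f` lines, not the rest of the spec.

```python
from collections import defaultdict

def faces_by_object(obj_text):
    """Group OBJ faces under the name given by the most recent `o` line."""
    groups = defaultdict(list)
    current = "unnamed"  # assumption: label for faces that precede any `o` line
    for raw in obj_text.splitlines():
        parts = raw.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        if parts[0] == "o":
            current = " ".join(parts[1:]) or "unnamed"
        elif parts[0] == "f":
            # Each face entry is v, v/vt, v//vn or v/vt/vn; keep the vertex index only.
            groups[current].append(tuple(int(p.split("/")[0]) for p in parts[1:]))
    return dict(groups)

SAMPLE = """\
o plasma_membrane
v 0 0 0
v 1 0 0
v 0 1 0
f 1 2 3
o vesicle_399
v 0 0 1
v 1 0 1
v 0 1 1
f 4 5 6
"""

objects = faces_by_object(SAMPLE)
for name, faces in objects.items():
    print(name, len(faces))
```

A tool that drops the `o` lines ends up with the same two triangles but no way to tell which one belongs to the vesicle.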

My question is: would you like to drive towards a mesh format that actually works for neuroscientists? The two most essential requirements are named objects and the ability to attach arbitrary data to mesh elements. There are some existing formats that will handle this, but they’re proprietary. And the compartmentalization requires more structure than just a numpy array. (You can put just about anything you want inside an HDF5 file, yes? Numpy arrays are convenient, but there has to be an agreed-upon form for the metadata, so you can say “these are the faces of vesicle #399”.)
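As a rough illustration of what “an agreed-upon form for the metadata” could look like, here is a sketch that keeps one flat face array plus a name index and per-face data columns. Everything in it (the `NamedMesh` class, the field names, the `channel_count` column) is hypothetical and not an existing standard; the same layout could in principle be written as datasets and attributes in a container file, but that mapping is an assumption too.

```python
from dataclasses import dataclass, field

@dataclass
class NamedMesh:
    vertices: list                                  # [(x, y, z), ...]
    faces: list                                     # [(i, j, k), ...], 0-based vertex indices
    objects: dict = field(default_factory=dict)     # name -> (first_face, stop_face)
    face_data: dict = field(default_factory=dict)   # column name -> one value per face

    def add_object(self, name, faces, **per_face):
        """Append faces for one named object, with optional per-face data columns."""
        if name in self.objects:
            raise ValueError(f"duplicate object name: {name}")
        start = len(self.faces)
        self.faces.extend(faces)
        stop = len(self.faces)
        self.objects[name] = (start, stop)
        for key, values in per_face.items():
            if len(values) != stop - start:
                raise ValueError(f"{key}: expected {stop - start} values, got {len(values)}")
            column = self.face_data.setdefault(key, [])
            column.extend([None] * (start - len(column)))  # pad faces added before this column existed
            column.extend(values)
        for column in self.face_data.values():
            column.extend([None] * (stop - len(column)))   # pad columns not supplied for this object

    def faces_of(self, name):
        """Return the faces that belong to one named object."""
        start, stop = self.objects[name]
        return self.faces[start:stop]

mesh = NamedMesh(vertices=[(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)], faces=[])
mesh.add_object("plasma_membrane", [(0, 1, 2), (0, 1, 3)])
mesh.add_object("vesicle_399", [(1, 2, 3)], channel_count=[4])
print(mesh.faces_of("vesicle_399"))
```

The point of the index is that the arrays stay numpy-friendly while the question “which faces are vesicle #399” has exactly one answer.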

Other mesh formats will always be necessary, because people like to use all kinds of tools. But there has to be a USEFUL common format that lets you get anywhere from anywhere else. Currently, if you’re tracing vesicles in a synapse, you lose all identifiability the moment you export and re-import the file. Similarly for an SWC tracing: the cylinders associated with a skeleton become natural computational compartments, but only if they retain their identifiability and separability.
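For the SWC case, here is a minimal sketch of how each (child, parent) pair in a tracing can become one named segment that a mesher could later skin as a capped cylinder. It assumes the usual seven SWC columns (id, type, x, y, z, radius, parent, with parent -1 for the root); the function name and the `type3_seg2` naming scheme are invented for illustration.

```python
def swc_to_segments(swc_text):
    """Turn SWC rows into one named segment per (child, parent) pair."""
    nodes = {}
    for raw in swc_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        node_id, kind, x, y, z, radius, parent = line.split()[:7]
        nodes[int(node_id)] = {
            "type": int(kind),
            "xyz": (float(x), float(y), float(z)),
            "radius": float(radius),
            "parent": int(parent),
        }
    segments = {}
    for node_id, node in nodes.items():
        parent = nodes.get(node["parent"])
        if parent is None:
            continue  # root (parent -1) or dangling parent: no cylinder to build
        name = f"type{node['type']}_seg{node_id}"  # hypothetical naming scheme
        segments[name] = {
            "start": parent["xyz"], "end": node["xyz"],
            "r_start": parent["radius"], "r_end": node["radius"],
        }
    return segments

SWC = """\
# id type x y z radius parent
1 1 0 0 0 5 -1
2 3 10 0 0 1 1
3 3 20 0 0 0.8 2
"""

segments = swc_to_segments(SWC)
for name, seg in segments.items():
    print(name, seg["start"], "->", seg["end"])
```

As long as the name travels with the geometry through export and re-import, each cylinder stays usable as its own compartment.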

Currently, with the exception of OBJ, the mesh formats are single-object, and the only alternative is to store 50k traced cylinders in an HDF5 folder. My question is: before we go too far down that road and it becomes really hard to get back, wouldn’t it be better to agree on a form that’s useful and make it a standard? Allen is generating enough data that this becomes important! How can I say this… we neuroscientists need to encourage private enterprise to give us the tools we NEED instead of the ones they have. The OBJ format actually works, so why is the entirety of private and public enterprise ignoring the standard? It’s probably because no one is yelling and screaming about it, ’cause we scientists haven’t quite gotten to automated computational meshes yet. But we will, this year or maybe next, and what are we going to do when we can’t even tell the computer “that’s a vesicle, and this other thing is ER”?

Supplementary - here’s a link to a very nice description and set of tools from Allen and MICrONS:

These tools focus on visualization, which is definitely a need and a worthy effort. I need the geometry for my work too, but I’m in a computational effort where we have to subdivide meshes in different ways, at a variety of resolutions. We might put 5 layers on a cylinder or only 3, and the idea of “manipulating meshes”, or skinning a skeleton in different ways for computational purposes, is a little different from segmentation or visualization.

In a workflow, we’re going to download a dataset in as close to a useful form as we can get, then work with it offline. During the work, we’ll generate many representations of the same mesh at different resolutions, both higher and lower than the original. We use the methods being proposed in the link, like voxelization, but also some methods peculiar to neuroscience. For instance, there’s a brute-force method that lines up the segments in really bad SWC tracings (which you can download in seconds from neuromorpho instead of spending time in the cloud), so you can pass them to mesh tools and get decent starting points for computation instead of meshes with holes all over them.

How much adjustment is allowable for computation? That’s one of the things we’re trying to find out. Besides the discretization errors, we find that certain things don’t care much about geometry and other things do. One common task is populating a geometry for biophysical purposes. “Channel density” is of limited value when it comes to clustering, so we have to scale the meshes to accommodate these arrangements. The assessment of errors requires some high-power multiphysics: Navier-Stokes with charged particles, Vlasov-Poisson on a friendly mesh. The good news is that these methods also tell us what spines are doing, and astrocytes that are only 10 nm away. It all depends on a computationally accessible mesh, and that’s a mouthful by itself because it has to scale: it gets repeatedly decimated and reskinned.

The formats should be usable for all these purposes, and friendly for them, yes?

Further amplification: this link

suggests SWC is a standard format. Unfortunately, SWC will not work for biophysical simulations, as already mentioned. There seems to be a common misunderstanding of what exactly constitutes a mesh. There are publications from MICrONS talking about meshes of large volumes of brain with thousands of neurons, but connection mappings (no matter how sophisticated) are not meshes. Meshes are for biophysics; they help us understand things we can’t see, like astrocyte leaflets. We want to position ion channels at the vertices of meshes, which requires nm resolution for synaptic scaffolding, and the interior compartments have orientations just like the synapses do.

From my POV the inability to include cellular components is a major oversight; someone just wasn’t thinking. Named objects are absolutely mandatory for this effort, as are parenting abilities and several other things. With good meshes you can look at calcium and vesicular release, even active transport. Without the computational meshes one has to make some highly non-biological assumptions (like homogeneous extracellular space, which doesn’t hold when astrocytes are only 10 nm away). The structural aspects are fundamental: without the differential geometry nothing else works.

Bottom line, we need to go to something better than SWC files. There are already some big libraries of EM tracings detailing cellular components. How can those be integrated with neuron shapes? Short answer: with SWC, they can’t. SWC stops at the plasma membrane. OBJ would work IF the software obeyed the standard, but it doesn’t. Someone has to start making noise about this.

Here’s what to expect: starting from an SWC tracing with 5885 points, converting to cylinders yields 185,000 vertices, and partitioning into computational compartments creates about 800,000 faces for the surface alone. When the object relationships are maintained (each cylinder is named and capped), the resulting OBJ file is about 185 MB. It takes Blender about 5 seconds to read in and render. If instead of surface triangulation we go to full 3D tetrahedra, the mesh becomes 1.7 GB in memory and the OBJ file gets so big Blender won’t read it anymore. This is for “a” single neuron (or in this case an astrocyte).

In a real biophysical simulation we’re not going to use a whole neuron; usually we’re only interested in a small patch of membrane. The big deal is the assumptions you have to make to get the model to work. Let’s say you just want to look at a synapse, so you chop off the rest of the neuron’s geometry upstream. What kinds of boundary conditions do you now have to impose? For instance, if you just put a piece of membrane there, is that sufficient to retain the calcium behavior at the other end of the synapse? If you use the usual Navier-Stokes hard walls it won’t work; you have to go to more of a slip model, or something like acoustics with semi-reflective walls that partially absorb a pressure wave.

This is what meshes are for: they tell us the things the connectomes can’t. Is it really possible for an astrocyte to modify activation functions and regional energy functions? The calcium stuff is very complicated. The connectomes are driving towards libraries of neurons with well-known distributions of channels and receptors, and even that probably isn’t enough. Shape matters: a glutamate molecule is about 0.8 nm long and it’s going to diffuse through a channel 10 nm wide with a bunch of big proteins in the way. The positioning of the EAATs relative to the “cisterns” commonly found near synapses is enormously important; diffusion is only half the story. Not to mention the cytoskeleton and the regulation of gap junctions.

All this is what meshes are for. Between the probabilistic opening and closing of ion channels and the homogeneity of a Nernst or GHK equation are about six layers of assumptions, and 5 of them are highly non-biological. Meshes can bring a frightening amount of accuracy and realism to the table, but we have to be facile with them and the tools have to be robust. Some of these idiotic tools just FAIL when they can’t handle a mesh: you get an empty dataset and no error messages, not even “dendrite 27 is disconnected, can’t process mesh”. If you had the error message you could fix dendrite 27 or just delete it, but without the message the entire mesh becomes worthless because it can’t be converted.
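The kind of error message I mean is cheap to produce. Here is a minimal sketch of a pre-flight check on an SWC-style parent table that reports what is wrong instead of returning an empty dataset. The function name and message wording are made up, and it checks only two things (root count and missing parents), not cycles or geometry.

```python
def find_problems(parent_of):
    """parent_of maps node id -> parent id, with -1 marking a root (as in SWC)."""
    problems = []
    roots = sorted(n for n, p in parent_of.items() if p == -1)
    if len(roots) != 1:
        problems.append(f"expected 1 root, found {len(roots)}: {roots}")
    for node, parent in sorted(parent_of.items()):
        if parent != -1 and parent not in parent_of:
            problems.append(f"node {node} is disconnected: parent {parent} does not exist")
    return problems

# Hypothetical tracing in which node 27 points at a parent that was never traced.
tracing = {1: -1, 2: 1, 3: 2, 27: 99}

problems = find_problems(tracing)
for message in problems:
    print(message)
```

With that one line of output, the user can reattach node 27 or delete it and move on.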

All right, I was just speaking with the software folks over at Open Brain Institute. They’ve recognized the meshing issue and seem to have it licked. They’ve done some great work.

HOWEVER - they store meshes the same way Allen does, as sub-components of an HDF5 file. There is no standard for the hierarchical naming and organization of organelles.

Think about what this means for the user.

Let’s say I find a neuron I like, I look in the metadata and it says someone’s created a mesh, and sure enough I find an SWC file. Great - but no organelles. Aw shucks - but look, here are the raw electron micrographs, I can just retrace these on my Wacom tablet. So NOW we’ll have a bunch of new tracings, and each one will have to be registered to the original image. If I want to share the results with the community, that’s several hundred new SWC files with associated metadata. The NEXT person who comes along with the same organelle issue, and maybe thinks they can save a little time by re-using my meshes, is going to have a devil of a time.

I spoke with the VTK folks about this. The OBJ format is of limited utility: it DOES keep objects separate but does NOT have hierarchical scene management. The only formats with this capability are proprietary, with the exception of 3MF, which is almost like HDF5 but lives in a zip file.
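To show what “hierarchical scene management” adds on top of flat names, here is a minimal sketch of a parent table from which full paths and child lists are derived. The object names, the `/` separator, and both helper functions are hypothetical; paths like these could serve as group names in a container file, but that is an assumption and not something any of the formats above prescribe.

```python
def full_path(name, parent_of):
    """Walk up the parent links and return a root-to-leaf path."""
    parts, seen = [], set()
    while name is not None:
        if name in seen:
            raise ValueError(f"parent cycle at {name}")
        seen.add(name)
        parts.append(name)
        name = parent_of[name]  # KeyError here means a parent was never declared
    return "/".join(reversed(parts))

def children_of(name, parent_of):
    """Return the direct children of one object, sorted by name."""
    return sorted(child for child, parent in parent_of.items() if parent == name)

# Hypothetical scene: each object names its parent, None marks the top level.
scene = {
    "neuron_1": None,
    "dendrite_27": "neuron_1",
    "spine_4": "dendrite_27",
    "mitochondrion_2": "dendrite_27",
    "vesicle_399": "spine_4",
}

print(full_path("vesicle_399", scene))
print(children_of("dendrite_27", scene))
```

With parenting in place, deleting or re-meshing dendrite 27 has a well-defined effect on everything inside it.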

Adding meshes willy-nilly to an HDF5 file is a bad idea. Really. It’ll come back to bite us in short order.

At the opposite end there is a different problem. One can create a perfect reference mesh with 1 nm resolution in less than 5 min, but a typical neuron done this way will have 100 million vertices. However, one can decimate as needed and derive any desired resolution from this reference. (If you need finer than 1 nm, a mesh probably won’t help you; you’re better off with Boltzmann on a lattice.)
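For a rough feel of what decimation buys, here is a back-of-envelope sketch. It assumes a surface mesh with roughly uniform edge length over a fixed area, so the vertex count scales as 1 / (edge length)^2. That scaling is my simplifying assumption; real decimators are adaptive and will not follow it exactly. The function name is made up.

```python
def estimated_vertices(reference_vertices, reference_nm, target_nm):
    """Estimate the vertex count after decimating a surface mesh to a coarser edge length."""
    # Fixed surface area, so the count scales with (reference_nm / target_nm) squared.
    return round(reference_vertices * reference_nm ** 2 / target_nm ** 2)

REFERENCE_VERTICES = 100_000_000  # the 1 nm reference mesh from the text

for target_nm in (1, 5, 10, 50):
    print(f"{target_nm:>3} nm -> {estimated_vertices(REFERENCE_VERTICES, 1, target_nm):,} vertices")
```

Under that assumption, going from 1 nm to 10 nm cuts the vertex count by a factor of 100.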