Browser plugin-free CIF visualization: comparison of the open-source engines

Tilde
Materials Informatics Lab
Nov 18, 2015

Here we compare four open-source browser engines for plugin-free rendering of crystal structures in the CIF format. We’ll touch on the available crystallographic data formats and on modern JavaScript, including the transpilation of other languages to JavaScript. Thus we’ll “teleport” from the world of Schrödinger and Landau to the world of Berners-Lee and Jobs, and then back. Let’s go.

A short intro: the atoms in crystalline solids are arranged in periodically repeating units about a nanometer in size. Solid state physics (in the spirit of reductionism) studies the link between the properties of these structural units (primitive cells) and those of macroscopic objects. For instance, being able to control the adsorption of water molecules on the surfaces of transition metal oxides means reducing the cost of hydrogen and oxygen synthesis, enhancing the performance of fuel cells and sensor devices, and optimizing drug transport in pharmacology. One way or another, solid state physics describes almost any laboratory or technological process, and it’s hard to overestimate the importance of molecular visualization.

CIF is the de facto standard file format for crystal structure data in materials science. An important difference between CIF and many other chemical file formats is its support for periodic translations: to define an infinite crystal, we repeat its unit cell in three dimensions.
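
To make the idea of periodic translations concrete, here is a minimal JavaScript sketch. It is not taken from any of the engines compared below. The function names (`cellToVectors`, `fracToCart`, `buildSupercell`) and the toy cubic cell are assumptions made for illustration, and the axis convention (a along x, b in the xy plane) is one common choice among several.

```javascript
// Convert cell parameters (lengths in angstroms, angles in degrees) into
// three Cartesian lattice vectors. Assumed convention: a lies along x,
// b lies in the xy plane.
function cellToVectors(a, b, c, alphaDeg, betaDeg, gammaDeg) {
  const rad = Math.PI / 180;
  const cosA = Math.cos(alphaDeg * rad);
  const cosB = Math.cos(betaDeg * rad);
  const cosG = Math.cos(gammaDeg * rad);
  const sinG = Math.sin(gammaDeg * rad);
  const cx = c * cosB;
  const cy = (c * (cosA - cosB * cosG)) / sinG;
  const cz = Math.sqrt(c * c - cx * cx - cy * cy);
  return [
    [a, 0, 0],
    [b * cosG, b * sinG, 0],
    [cx, cy, cz],
  ];
}

// Fractional (x, y, z) -> Cartesian: r = x*A + y*B + z*C.
function fracToCart(frac, vectors) {
  return [0, 1, 2].map(
    (k) =>
      frac[0] * vectors[0][k] +
      frac[1] * vectors[1][k] +
      frac[2] * vectors[2][k]
  );
}

// Repeat the cell contents nx * ny * nz times by integer translations
// along the three lattice vectors.
function buildSupercell(atoms, vectors, nx, ny, nz) {
  const out = [];
  for (let i = 0; i < nx; i++) {
    for (let j = 0; j < ny; j++) {
      for (let k = 0; k < nz; k++) {
        for (const atom of atoms) {
          const shifted = [atom.frac[0] + i, atom.frac[1] + j, atom.frac[2] + k];
          out.push({ symbol: atom.symbol, cart: fracToCart(shifted, vectors) });
        }
      }
    }
  }
  return out;
}

// Hypothetical two-atom cubic cell, used only to exercise the functions.
const demoVectors = cellToVectors(4, 4, 4, 90, 90, 90);
const demoCell = [
  { symbol: "A", frac: [0, 0, 0] },
  { symbol: "B", frac: [0.5, 0.5, 0.5] },
];
const demoSupercell = buildSupercell(demoCell, demoVectors, 2, 2, 2);
console.log(demoSupercell.length); // 16
```

A real viewer would also apply the symmetry operations listed in the file before translating, which this sketch leaves out.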

CIF was established in the 1990s by the International Union of Crystallography (IUCr). It is based on a text container format called STAR (Self-defining Text Archive and Retrieval), in which physical properties, obtained e.g. from X-ray diffraction or theoretical modeling, are labeled with standard tags. These tags specify the parameters of the unit cell, its symmetry, the atoms it contains, metadata of the relevant scientific publication, etc. The tags are defined in external CIF dictionaries (much like XSD schemas for XML documents), so it is possible to validate a CIF file against a CIF dictionary, and even to infer new physical properties from those available. The difference from XML is that the CIF format allows arbitrary tags. They are ignored by a CIF parser, but may later become part of the standard CIF dictionaries, according to the IUCr. Furthermore, the CIF format supports a relational data model, so one can refer to a specific atom in the crystal structure by its identifier. The drawback is the lack of convenient support for multi-level hierarchies, and here the STAR container compares poorly with XML. Incidentally, that’s why there is an XML-based competitor to CIF called CML (Chemical Markup Language).
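
The tag-and-table layout described above can be illustrated with a deliberately small JavaScript reader. This is a sketch under stated assumptions, not a conforming CIF parser and not the loader from the repository. The functions `tokenize` and `parseCif` are hypothetical names. The sketch handles only single-line `_tag value` pairs and simple `loop_` tables, treats quotes naively, and skips multi-line text fields. The tag `_my_custom_tag` is made up to show that a non-standard tag is simply carried along.

```javascript
// Split a line into tokens, keeping quoted strings together.
// Simplification: real CIF quoting rules are stricter than this.
function tokenize(line) {
  const tokens = [];
  const re = /'([^']*)'|"([^"]*)"|(\S+)/g;
  let m;
  while ((m = re.exec(line)) !== null) {
    tokens.push(m[1] ?? m[2] ?? m[3]);
  }
  return tokens;
}

// Read "_tag value" pairs into `tags` and each loop_ table into `loops`.
function parseCif(text) {
  const result = { tags: {}, loops: [] };
  let loop = null; // the loop_ table currently being read, if any
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (!line || line.startsWith("#")) continue; // blank line or comment
    if (line.startsWith("data_")) { loop = null; continue; }
    if (line.toLowerCase() === "loop_") {
      loop = { columns: [], rows: [] };
      result.loops.push(loop);
      continue;
    }
    if (line.startsWith("_")) {
      const [tag, ...rest] = tokenize(line);
      if (rest.length === 0 && loop && loop.rows.length === 0) {
        loop.columns.push(tag); // column header of the current loop
      } else {
        loop = null; // a plain tag ends any open loop
        result.tags[tag] = rest.join(" ");
      }
      continue;
    }
    if (loop && loop.columns.length > 0) {
      const values = tokenize(line);
      const row = {};
      loop.columns.forEach((col, i) => { row[col] = values[i]; });
      loop.rows.push(row);
    }
  }
  return result;
}

// Made-up example file; the numbers do not describe a real structure.
const demoCif = `
data_example
_cell_length_a 4.0
_symmetry_space_group_name_H-M 'P 1'
_my_custom_tag 42
loop_
_atom_site_label
_atom_site_type_symbol
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
Ti1 Ti 0.0 0.0 0.0
O1 O 0.5 0.5 0.0
`;

const demoParsed = parseCif(demoCif);
// Relational lookup: find an atom row by its identifier.
const demoAtom = demoParsed.loops[0].rows.find(
  (row) => row["_atom_site_label"] === "O1"
);
console.log(demoAtom["_atom_site_type_symbol"]); // O
```

All values are kept as strings here; converting them to numbers and checking them against a dictionary would be the job of a fuller implementation.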

Traditionally CIF files have been supported by desktop applications such as Vesta, Accelrys (BIOVIA), RasMol, and others. About four years ago, however, browsers began to invade this area too. The known open-source results of this “invasion” are collected below. Their codebases are bundled into a single web app, which can be found in the GitHub repository. The code has been tested in several popular browsers, including IE 11 and mobile Safari. The repository is structured as follows: the folder data contains CIF examples (you probably have your own), the folder engines contains the JavaScript code for all the engines, and the folder utils includes auxiliary code, e.g. a browser-based CIF loader. To run the web app, make its folder accessible to your web server, then navigate to the appropriate address in your web browser (or just use the online branch of this repository). All files are static; no server-side code is required.

Briefly, about the four CIF examples from the data folder:

  • adsorption.cif: a model of the aforementioned water adsorption on a perovskite surface (cf. the picture at the beginning); by default its primitive cell is loaded into each engine.
  • fullerene.cif: the famous fullerene.
  • lfp.cif: lithium iron phosphate, the cathode material of a lithium-ion battery. Pay attention to the lithium ion: it’s agile and lightweight. (This is because lithium is third in the periodic table, just below hydrogen.) The third electron of lithium passes through the external circuit as the battery discharges or charges, while the lithium ion itself travels through the electrolyte.
  • mdma.cif: 3,4-methylenedioxy-methamphetamine. Its functional groups can be replaced relatively easily, forming a new (and therefore legal) compound with nearly the same effect. This creates legal problems with its synthetic analogues in some countries.

See the comparison table below.

Java applets, Flash, and other plugin-based software were not considered, being an evolutionary dead end; only pure (or, as is commonly said, “vanilla”) JavaScript was. However, pure JavaScript can be generated from a number of other programming languages, and the first and third competitors are exactly such cases.

JSmol is obtained from the Java code of Jmol using the Java2Script tool. The total size of the JSmol codebase is 12.7 MB, parts of which are loaded on demand. The table above gives the size required for correct initialization. In terms of functionality, this engine replicates its well-known sibling Jmol, written in Java, and provides the richest feature set. The comparison would end here if this engine weren’t so large (and its initialization so slow). Furthermore, in this case there are the costs of supporting two languages, in particular ugly and unreadable JavaScript code.

ChemDoodle Web Components is based on proprietary software. This has advantages (excellent documentation, high quality of both the code and the product as a whole) and disadvantages (license limitations, user tracking that cannot be turned off). Another disadvantage is the lack of canvas support (WebGL is mandatory), so it will not work on slightly outdated hardware. If one turns a blind eye to these, the ChemDoodle engine is the winner of this comparison.

RasmolJS is generated via Emscripten from its older brother RasMol, written in C. British cheminformatician Noel O’Boyle ported the original graphics part to the SDL library, which is supported by Emscripten, thus enabling rendering inside the HTML canvas element. Furthermore, the transpiled JavaScript code complies with the asm.js specification, which in theory provides a considerable performance gain. In practice, the engine turned out to be quite slow and cumbersome, and it lacks a number of the previous participants’ features. A bit more work on RasmolJS would probably change this.
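
For readers who have not seen asm.js, here is a hand-written toy module in that style. It is not code from RasmolJS or output of Emscripten, and the name `ToyAsmModule` is made up for this illustration. The `| 0` annotations are how asm.js declares 32-bit integers, which lets an engine compile the function ahead of time. In an engine without asm.js support the same code still runs as ordinary JavaScript.

```javascript
// A toy module in asm.js style (hypothetical example, written by hand).
function ToyAsmModule(stdlib) {
  "use asm";

  // Both parameters and the return value are declared as 32-bit integers.
  function add(a, b) {
    a = a | 0;
    b = b | 0;
    return (a + b) | 0;
  }

  return { add: add };
}

const toy = ToyAsmModule(globalThis);
console.log(toy.add(2, 3)); // 5
console.log(toy.add(2147483647, 1)); // -2147483648 (32-bit wraparound)
```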

Player.html was written in-house, using Three.js and Math.js. Its development started relatively recently, so its functionality isn’t very rich so far. The emphasis is on speed and minimalism, as well as on supporting the widest possible range of hardware. As a result, this engine works robustly even on old smartphones and laptops. Plans for further development include improved modularity, extended support for crystal symmetry, and the display of custom CIF tags.

To summarize, we have a healthy competitive environment. The browser has indeed become a convenient tool for crystal structure visualization.

Intelligent software for computational materials science and cheminformatics. Free and open-source. Inspired by #BlueObelisk Web: https://tilde.pro