MathML: Mathematics in XML

Typesetting mathematics has always been a chore, for as long as typesetting has been a thing. These articles give some insight into mathematical typesetting in the days of hot-metal type: From boiling lead and black art: An essay on the history of mathematical typography, by Eddie Smith, and The Monotype 4-Line System for Setting Mathematics, by Daniel Rhatigan. Things got better with the advent of computerized typesetting, but there were still problems:

  • How do you represent the mathematical symbols?
  • How do you represent the equations in a portable way?
  • How do you make some symbols, like summation and integral signs, expand to fit the following part of the equation?

Unicode took care of the first problem by providing code points for an enormous number of symbols. (The value of this, in all sorts of fields, should not be underestimated. For example, I have a book, Medieval Slavic Manuscripts and SGML: Problems and Perspectives, published by the Marin Drinov Academic Publishing House, from the pre-Unicode era. Much of the book deals with how to represent Old Slavic characters in ASCII.)
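To make the point about code points concrete, here is a minimal Python sketch that looks up the official Unicode names of a few mathematical symbols using the standard unicodedata module. The helper name describe and the choice of symbols are mine, for illustration only.

```python
import unicodedata

# A handful of mathematical symbols, given by their Unicode code points.
SYMBOLS = ["\u222E", "\u22C5", "\u2113", "\u2212", "\u222B"]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character (illustrative helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in SYMBOLS:
    print(describe(ch))
```

Because every symbol has a stable code point and name, the same character can be stored, searched, and exchanged without any private ASCII encoding scheme.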

The second and third problems were addressed by LaTeX (generally pronounced like LAY-tek or LAH-tek), a mathematical typesetting system that has become the de facto standard in the scientific and mathematical communities. (It was released in 1984, and is based on the TeX typesetting system introduced by Donald Knuth in 1978.) The Maxwell-Faraday equation shown at the top of this article states that a time-varying magnetic field is always accompanied by a spatially varying, non-conservative electric field, and vice versa. It would be represented like this in LaTeX:

\oint_C E \cdot d\ell = -\frac{d}{dt} \int_S B_n \, dA

This is a compact notation, but not very easy to read unless you have studied LaTeX. It is also not easy to process by machine.

MathML is another way of addressing the second and third problems, using an XML-based language. This makes it accessible to XML-based tools, such as XSLT and XQuery. Here is the Maxwell-Faraday equation expressed in MathML:

<?xml version="1.0" encoding="UTF-8"?>
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display='block'>
   <m:mrow>
      <m:mstyle displaystyle='true'>
         <m:mrow>
            <m:msub>
               <m:mo>&#x222E;</m:mo><!-- Unicode Character 'CONTOUR INTEGRAL' -->
               <m:mi>C</m:mi>
            </m:msub>
            <m:mrow>
               <m:mi>E</m:mi>
               <m:mo>&#x22C5;</m:mo><!-- Unicode Character 'DOT OPERATOR' -->
               <m:mi>d</m:mi>
               <m:mi>&#x2113;</m:mi><!-- Unicode Character 'SCRIPT SMALL L' -->
               <m:mo>=</m:mo>
               <m:mo>&#x2212;</m:mo><!-- Unicode Character 'MINUS SIGN' -->
               <m:mfrac>
                  <m:mi>d</m:mi>
                  <m:mrow>
                     <m:mi>d</m:mi>
                     <m:mi>t</m:mi>
                  </m:mrow>
               </m:mfrac>
            </m:mrow>
         </m:mrow>
      </m:mstyle>
      <m:mstyle displaystyle='true'>
         <m:mrow>
            <m:msub>
               <m:mo>&#x222B;</m:mo><!-- Unicode Character 'INTEGRAL' -->
               <m:mi>S</m:mi>
            </m:msub>
            <m:mrow>
               <m:msub>
                  <m:mi>B</m:mi>
                  <m:mi>n</m:mi>
               </m:msub>
               <m:mi>d</m:mi>
               <m:mi>A</m:mi>
            </m:mrow>
         </m:mrow>
      </m:mstyle>
   </m:mrow>
</m:math>
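Because MathML is ordinary XML, any XML parser can read it. As a rough illustration (using Python's standard xml.etree.ElementTree rather than XSLT or XQuery), the sketch below parses a shortened fragment of the equation and pulls out its operators and identifiers. The fragment, the helper name texts, and the variable names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# A shortened fragment of the equation: the contour integral and its integrand.
FRAGMENT = """
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML">
  <m:msub>
    <m:mo>&#x222E;</m:mo>
    <m:mi>C</m:mi>
  </m:msub>
  <m:mrow>
    <m:mi>E</m:mi>
    <m:mo>&#x22C5;</m:mo>
    <m:mi>d</m:mi>
    <m:mi>&#x2113;</m:mi>
  </m:mrow>
</m:math>
"""


def texts(root, local_name):
    """Collect the text of every MathML element with this local name, in document order."""
    return [el.text for el in root.iter(f"{{{MATHML_NS}}}{local_name}")]


root = ET.fromstring(FRAGMENT)
operators = texts(root, "mo")    # the <m:mo> elements
identifiers = texts(root, "mi")  # the <m:mi> elements

print([f"U+{ord(op):04X}" for op in operators])
print(identifiers)
```

The parser resolves the numeric character references, so the operators come back as real Unicode characters rather than as the &#x...; escapes.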

Although MathML seems more intuitive (at least to me) and more regular than LaTeX, entering equations by hand is extremely monotonous. When manually coding an equation of any complexity, one quickly gets lost in the many levels of nesting. But, like XML itself, MathML is not intended to be entered manually.

So why use MathML? For one thing, it makes it possible to use XML stylesheets (XSLT) to automatically transform equations. For example, the tangent of x is represented in the US as “tan(x)”, but in some countries, it is represented as “tg(x)”. With XSLT, you can easily transform your equations to suit local preferences. And since you can do it in a stylesheet, you do not have to change each equation individually.

There are two different forms of MathML: Content MathML and Presentation MathML. The former is concerned with describing equations in ways that they can be consumed by mathematical engines, like Mathematica. The latter is concerned with how the equations are displayed. If you need to deal with equations in both modes, you can use an XML stylesheet to transform Content MathML into Presentation MathML.
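The tan-to-tg rewrite described above would normally be written as an XSLT template. As a stand-in, here is a minimal Python sketch of the same idea using the standard xml.etree.ElementTree module: it walks a Presentation MathML fragment and renames matching identifiers. The function name localize, the LOCAL_NAMES table, and the sample markup are hypothetical and only meant to show the shape of the transformation.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
# Keep the "m" prefix when serializing, to match the markup used in this article.
ET.register_namespace("m", MATHML_NS)

# Hypothetical table of local function names; extend as needed.
LOCAL_NAMES = {"tan": "tg"}

# Sample Presentation MathML for tan(x).
SOURCE = (
    '<m:math xmlns:m="http://www.w3.org/1998/Math/MathML">'
    "<m:mrow><m:mi>tan</m:mi><m:mo>(</m:mo><m:mi>x</m:mi><m:mo>)</m:mo></m:mrow>"
    "</m:math>"
)


def localize(mathml: str, names: dict) -> str:
    """Rename every <m:mi> whose text appears in `names`; leave everything else alone."""
    root = ET.fromstring(mathml)
    for mi in root.iter(f"{{{MATHML_NS}}}mi"):
        if mi.text in names:
            mi.text = names[mi.text]
    return ET.tostring(root, encoding="unicode")


result = localize(SOURCE, LOCAL_NAMES)
print(result)
```

The point is the same as with a stylesheet: the rule lives in one place and can be applied to every equation in a document, instead of editing each equation by hand.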

There are editors that make it possible to enter equations in MathML by selecting the pieces from menus, and there are even tools that will convert LaTeX to MathML. My favorite MathML editor is MathType, by WIRIS Math & Science. It allows you to build equations by picking pieces from menus. Here is a screenshot of its user interface:

It also has a “math input” panel, where you can draw your equation and it will OCR it. I suspect that this feature works better for those with better handwriting than mine.

Once you have your equation entered, you can copy it in either MathML or LaTeX format (or several others) and paste it into your XML document. MathType can also take LaTeX and convert it to MathML. So if you are accustomed to entering equations in LaTeX, but want to use them in your XML workflow, this may be the tool for you.

The LibreOffice Equation Editor is an open-source tool that can export MathML. I do not have any experience with it yet, but it looks promising.

Even with MathML, displaying equations on the web has been difficult, because of spotty and inconsistent support among the various web browsers. MathJax, a JavaScript display engine that works in all modern browsers, has improved this situation, as have the STIX Fonts, free PostScript Type 1 fonts for math. (By the way, in case you want to enter your equations as LaTeX, and are not interested in an XML workflow, MathJax can display LaTeX, too.)

And if you want to print your equations, you can use the JEuclid plug-in for the Apache FOP processor to turn your XML into PDF files. (Be warned, this is not quite as easy as it sounds. Apache FOP reads XSL-FO [Formatting Objects], which describes the page layout. Setting up page templates will be part of the setup for your workflow.)

Here is an excellent reference for MathML: Ry’s MathML Tutorial [Kindle Edition], by Ryan Hodson (2014).

There is another XML-based mathematical representation called OpenMath. It is comparable to Content MathML, but concentrates on defining the semantics of each symbol. While OpenMath and MathML were once looked at as competing technologies, the two organizations have been working towards improving their compatibility, and the ability to convert between them, so that each can be used where it is stronger.

