DEV Community

Furkan Kalkan

Quick Hack: Converting MathML to LaTeX

Recently, I needed to convert some MathML in article metadata from SCOAP3 to LaTeX. Most institutional repositories escape XML entities, so MathML doesn't render correctly. I tried the Wiris API, but it is very slow and returns errors for most long formulas.
Finally, I found Yaroshevich's XSLT stylesheet (mmltex.xsl), which works without problems.
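To illustrate the rendering problem, here is a minimal sketch of what entity escaping does to a MathML fragment. It assumes the repository applies something equivalent to the standard library's `escape`; the sample formula is made up for the example.

```python
from xml.sax.saxutils import escape

# A made-up MathML fragment, standing in for formula markup in article metadata.
mathml = "<math><mi>x</mi></math>"

# Assumption: the repository escapes markup roughly like this before display.
escaped = escape(mathml)

# The angle brackets are now entities, so a browser shows the tags as
# literal text instead of rendering a formula.
print(escaped)  # &lt;math&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;
```

Plain LaTeX text survives this kind of escaping largely intact, which is why converting the formulas first sidesteps the problem.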

Example Python code:

import re

import lxml.etree as ET

def to_latex(text):
    # Remove TeX code ($$...$$) already present in the text.
    text = re.sub(r"(\$\$.*?\$\$)", " ", text)

    # Load and compile the stylesheet once, not once per formula.
    xslt = ET.parse("mmltex/mmltex.xsl")
    transform = ET.XSLT(xslt)

    # Find MathML fragments and replace each with its LaTeX representation.
    mml_codes = re.findall(r"(<math.*?<\/math>)", text)
    for mml_code in mml_codes:
        # Add the MathML namespace to the root element (required).
        mml_ns = mml_code.replace('<math>', '<math xmlns="http://www.w3.org/1998/Math/MathML">')
        mml_dom = ET.fromstring(mml_ns)
        mmldom = transform(mml_dom)
        # Convert the transform result, not the input tree, to a string.
        latex_code = str(mmldom)
        text = text.replace(mml_code, latex_code)
    return text
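The namespace step above only matches a bare `<math>` tag, and the search pattern stops at line breaks. A possible variation, sketched here under stated assumptions, handles `<math>` tags that carry attributes and formulas that span several lines. The names `add_namespace`, `replace_mathml`, and the `convert` callable are hypothetical and not part of the original post; the sketch uses only the standard library so the XSLT step can be plugged in separately.

```python
import re

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
# DOTALL lets a formula span several lines; \b keeps tags such as
# <mathfoo> from matching.
MATH_RE = re.compile(r"<math\b.*?</math>", re.DOTALL)


def add_namespace(mml_code):
    """Add the MathML namespace to the opening <math> tag if it has none."""
    opening_tag = mml_code[: mml_code.index(">") + 1]
    if re.search(r"\sxmlns\s*=", opening_tag):
        return mml_code  # a default namespace is already declared
    return mml_code.replace("<math", '<math xmlns="%s"' % MATHML_NS, 1)


def replace_mathml(text, convert):
    """Replace every <math>...</math> fragment in text with convert(fragment).

    `convert` is a hypothetical callable that takes a namespaced MathML
    string and returns its LaTeX string. Because the replacement is a
    function, backslashes in the returned LaTeX are inserted literally.
    """
    return MATH_RE.sub(lambda m: convert(add_namespace(m.group(0))), text)
```

With lxml, `convert` could be something like `lambda m: str(transform(ET.fromstring(m)))`, where `transform = ET.XSLT(ET.parse("mmltex/mmltex.xsl"))` is built once as in the code above.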

Top comments (2)

Gael Ruta Gatera

Hello Furkan, ignore my previous comment: the code works fine apart from one small mistake. You store the result of the transform in mmldom (mmldom = transform(mml_dom)), but then pass the untransformed mml_dom to the string conversion (latex_code = str(mml_dom)). It should be:

Gael Ruta Gatera
  • mmldom = transform(mml_dom)
  • latex_code = str(mmldom)
