Josh Holbrook

Posted on May 20, 2020

Declarative LaTeX APIs

#latex #declarative

Over the course of a few days last week, I took on the task of updating my resume. In addition to updating the copy - not my favorite activity, if I can be real - I also overhauled the layout, which hadn't been touched in nearly a decade.

My resume is written in LaTeX. LaTeX is an environment for making pretty PDFs (*waves his hands a little*) built on top of TeX, a typesetting language invented by Donald Knuth in the late 70s. It's popular among mathematicians and other academics because it's really good at typesetting math equations. I was a math major in college and my handwritten homework was extremely messy, so I learned LaTeX in order to hand in work that my professors could actually read. Today I don't use LaTeX very often - usually I'm working in languages like Markdown, reStructuredText or Org, or in WYSIWYG environments like Google Docs - but my resume is a legacy of that time.

I graduated college nearly a decade ago, and over that time I've slowly refactored and made improvements to the code. For instance, a few years ago I refactored it to be ~modular~, using includes:

\section{Employment History}
\input{./jobs/sqsp.tex}
\input{./jobs/gmg.tex}
\input{./jobs/conde_nast.tex}

Each job I've had is specified in its own file and I use \input to include any given job in my resume. This is nice because it means that I can easily include or exclude jobs (or projects, skills, etc) from versions of my resume tailored to different audiences.

This time, as I had renewed focus on my resume, I took on the task of making these specifications declarative and separating out layout and formatting concerns.

What do I mean by that? As an example of what I wanted to fix, here's what the snippet for my past role at Gizmodo Media Group looked like:

\affiliation[Gizmodo Media Group, New York NY]{Senior Developer, Data Engineering}{September 2016 - May 2019}
\normalsize
Job duties and accomplishments include:
\small
\begin{itemize}
    \item Developing, deploying and supporting ETL processes, pipelines and scheduling for data warehousing with Python and AWS Redshift
    \item Designing and delivering dashboards and explores in Looker
    \item Taking development and ops/support ownership of Kinja's internal data analytics pipeline and replatforming batch compute jobs from Scala to Python, using Apache Spark, MySQL, Jenkins, EMR and Kinesis
    \item Leading the overhaul of Kinja's Amazon affiliate program tracking and analytics using Scala, MySQL, Python and Looker
\end{itemize}
\normalsize
\medskip

You can see here that formatting and layout are intermixed with this specification. \normalsize and \small change the size of the font, the itemize environment is used for lists, and \medskip is for vertical space.

By analogy, this is like the set of problems with HTML that CSS was written to avoid. With CSS, you can generally specify the shape of nodes in your HTML - if you're really lucky all of the HTML nodes have semantic meaning - and all of the details on what those elements actually look like get pushed elsewhere.

TeX is much older than HTML, has a very different design and, well, doesn't have an analog to CSS. LaTeX is a macro language, meaning that any one of these commands - \normalsize, \medskip - gets expanded into other commands until all that's left are primitives. That said, this sort of factoring is still possible.
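
To make the expansion idea concrete, here's a minimal sketch. The command names \shout and \warning are made up for illustration and aren't part of my resume; the point is only that one command is defined in terms of another, and LaTeX keeps expanding until it reaches things it can typeset directly.

```latex
\documentclass{article}

% \shout is a hypothetical helper: bold, upper-cased text.
\newcommand{\shout}[1]{\textbf{\MakeUppercase{#1}}}

% \warning is defined in terms of \shout, so using it takes
% more than one round of expansion before any text is typeset.
\newcommand{\warning}[1]{\shout{Warning:} #1}

\begin{document}
\warning{it's macros all the way down.}
\end{document}
```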

LaTeX has the concept of "document classes", which are specified in files that end in the ".cls" extension. At the top of my resume I have a statement that looks like this:

\documentclass[9pt]{resume}

and I have a file called resume.cls that specifies the document class. In this file is a big pile of package imports and code that defines commands used in my resume, which are then used to actually render it. For an example, here's the definition of \section:

\renewcommand{\section}[1]{
  \goodbreak\vspace{1em}\Large\textsf{\textbf{#1}}\normalsize\medskip
}

\renewcommand redefines a command - this document class builds on top of another one called "extarticle", which already defines a section command. This snippet defines the \section command as taking one argument, the section name. This command gets expanded to signal a "good place to put a page break", a small amount of vertical space, some big bold sans-serif text, and a medium-sized amount of vertical space.
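
A class file that builds on extarticle could be laid out roughly like this. It's a sketch rather than the real resume.cls: the option forwarding and the date in \ProvidesClass are illustrative assumptions, and only the \section redefinition is the one shown above.

```latex
% resume.cls (illustrative skeleton, not the real file)
\NeedsTeXFormat{LaTeX2e}
\ProvidesClass{resume}[2020/05/20 Illustrative resume class]

% Forward options such as 9pt to the underlying class (an assumption).
\DeclareOption*{\PassOptionsToClass{\CurrentOption}{extarticle}}
\ProcessOptions\relax
\LoadClass{extarticle}

% extarticle already defines \section, hence \renewcommand.
\renewcommand{\section}[1]{
  \goodbreak\vspace{1em}\Large\textsf{\textbf{#1}}\normalsize\medskip
}
```

Loading the parent class first matters here: \renewcommand only works on a command that already exists.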

As I rewrote the document class, I refactored the commands used to specify various entities in the resume, like jobs, skills lists, projects and degrees. The specification for GMG now looks like this:

\begin{job}
  \employer{Gizmodo Media Group}
  \offices{New York NY}
  \jobtitle{Senior Developer}
  \team{Data Engineering}
  \dates[September 2016]{May 2019}

  \begin{accomplishments}
    \item Acted as the technical lead to deliver Amazon affiliate program
    commerce analytics involving three teams and multiple contractors, providing
    experiment data used to optimize conversions on a front page redesign
    \item Architected an A/B testing API so experiments could be configured with
    a web UI instead of disparate scripts
    \item Owned internal analytics and recommendation systems that served over
    100 million users a month; overhauled legacy systems and implemented
    monitoring, vastly decreasing total outages and virtually eliminating
    undetected outages
    \item Ported the organization's microservice framework from Scala and Play
    to Python and Flask to enable development of services by the data team
    \item Directly mentored two junior employees
  \end{accomplishments}
\end{job}

At first glance, this more clearly separates formatting/layout from specification with an API that describes the what rather than the how. I now have commands and environments that capture the different concepts in my job entry: my employer, the dates, my "accomplishments" and so on.

This is what I mean by "declarative". I don't want to specify the procedures or the actual process of how to render a job - I want to specify what the job is and delegate the actual decisions of how to present it to my document class.

It's not apparent from the format of this snippet, but I actually took this a step further: the order of the commands and environments doesn't matter. For instance, if I reformatted my job at GMG to look like this:

\begin{job}
  \dates[September 2016]{May 2019}
  \employer{Gizmodo Media Group}
  \offices{New York NY}

  \begin{accomplishments}
    \item Acted as the technical lead to deliver Amazon affiliate program
    commerce analytics involving three teams and multiple contractors, providing
    experiment data used to optimize conversions on a front page redesign
    \item Architected an A/B testing API so experiments could be configured with
    a web UI instead of disparate scripts
    \item Owned internal analytics and recommendation systems that served over
    100 million users a month; overhauled legacy systems and implemented
    monitoring, vastly decreasing total outages and virtually eliminating
    undetected outages
    \item Ported the organization's microservice framework from Scala and Play
    to Python and Flask to enable development of services by the data team
    \item Directly mentored two junior employees
  \end{accomplishments}

  \jobtitle{Senior Developer}
  \team{Data Engineering}
\end{job}

it would render the same thing on my resume. This is actually not completely straightforward in LaTeX - it requires some fiddling - and it's this process that I wanted to go over in this humble blog post.

-

The blocks that look like \begin{job} and \end{job} are called "environments" in LaTeX. Specifying an environment looks something like this:

\newenvironment{pullquote}{
  % commands to insert before the body
  \goodbreak\hrule\sffamily\Large
}{
  % commands to insert after the body
  \normalsize\rmfamily\hrule\goodbreak
}

The specification of an environment basically gives you a before hook and an after hook for the environment. In this example, I make a "pull quote" environment that wraps the body in some calls to make the text really big and in a sans serif font, with horizontal rules and good breaks above and below.
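
For completeness, here's a small self-contained document that uses that environment. The surrounding text is filler; the definition is the same one as above, with % signs after the opening braces to suppress stray spaces.

```latex
\documentclass{article}

\newenvironment{pullquote}{%
  % before hook: runs at \begin{pullquote}
  \goodbreak\hrule\sffamily\Large
}{%
  % after hook: runs at \end{pullquote}
  \normalsize\rmfamily\hrule\goodbreak
}

\begin{document}
Some ordinary body text before the quote.

\begin{pullquote}
  Describe the what, not the how.
\end{pullquote}

Some ordinary body text after the quote.
\end{document}
```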

In my declarative environments, I used the body to set some state, and then used the after hook to take that state and render it. For a simpler example, let's look at my college education:


\begin{education}
  \institution{University of Alaska Fairbanks}
  \degree{Master of Science}
  \major{Mechanical Engineering}
  \dates[September 2005]{May 2011}
  \gpa{3.62}
\end{education}

To make this work, I first define some internal commands that at the outset resolve to nothing:

\newcommand{\@institution}{}
\newcommand{\@degree}{}
\newcommand{\@major}{}
\newcommand{\@minor}{}
\newcommand{\@thesis}{}
\newcommand{\@honor}{}
\newcommand{\@gpa}{}

Then, I create wrapper commands that redefine those commands to have values:

\newcommand{\institution}[1]{\renewcommand{\@institution}{#1}}
\newcommand{\degree}[1]{\renewcommand{\@degree}{#1}}
\newcommand{\major}[1]{\renewcommand{\@major}{#1}}
\newcommand{\minor}[1]{\renewcommand{\@minor}{#1}}
\newcommand{\thesis}[1]{\renewcommand{\@thesis}{#1}}
\newcommand{\honor}[1]{\renewcommand{\@honor}{#1}}
\newcommand{\gpa}[1]{\renewcommand{\@gpa}{#1}}

Remember, LaTeX is a macro language, so it doesn't really have the concept of variables - but here, we're effectively using macros to simulate variables.
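
Here's the "macro as variable" trick boiled down to a single field, as a minimal sketch. \showgpa is a hypothetical reader command added for the demo, and the \makeatletter and \makeatother pair is only needed because this lives in a regular document instead of a .cls file, where @ is already allowed in command names.

```latex
\documentclass{article}

\makeatletter % allow @ in command names outside a .cls or .sty file
\newcommand{\@gpa}{}                            % the "variable", initially empty
\newcommand{\gpa}[1]{\renewcommand{\@gpa}{#1}}  % the "setter"
\newcommand{\showgpa}{GPA: \@gpa}               % a hypothetical "reader"
\makeatother

\begin{document}
\gpa{3.62}
\showgpa
\end{document}
```

Calling \gpa typesets nothing on its own; it only changes what \@gpa will expand to the next time something uses it.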

In the body of the education environment, I can call my commands to define those "variables", and then in the after hook use them to render the section in my resume:

\newenvironment{education}{}
{
  \goodbreak
  \large
  \@degree\ifthenelse{\equal{\@major}{}}{}{,
    \@major}\ifthenelse{\equal{\@minor}{}}{}{ (Minor: \@minor)} \hfill \makedaterange \normalsize \\
  \@institution\ifthenelse{\equal{\@gpa}{}}{}{\hfill GPA: \@gpa} \\
  \ifthenelse{\equal{\@thesis}{}}{\smallskip}{\smallskip Thesis: \textit{\@thesis}\medskip}

and finally, before exiting, reset the "variables" so that they're blank again:

  \renewcommand{\@institution}{}
  \renewcommand{\@degree}{}
  \renewcommand{\@major}{}
  \renewcommand{\@minor}{}
  \renewcommand{\@thesis}{}
  \renewcommand{\@honor}{}
  \renewcommand{\@gpa}{}
  \renewcommand{\@fromdate}{}
  \renewcommand{\@todate}{}
}
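
Putting the pieces together, here's a cut-down version of the pattern that should compile on its own. It's a sketch under a few assumptions: the definitions of \dates and \makedaterange below are hypothetical stand-ins (I haven't shown the real ones in this post), the layout is simplified, and only a handful of fields are included.

```latex
\documentclass{article}
\usepackage{ifthen} % provides \ifthenelse and \equal

\makeatletter % allow @ in command names outside a .cls or .sty file

% "Variables" that start out empty.
\newcommand{\@institution}{}
\newcommand{\@degree}{}
\newcommand{\@gpa}{}
\newcommand{\@fromdate}{}
\newcommand{\@todate}{}

% Setters. The shape of \dates (optional start date, mandatory end
% date) is an assumption based on how it is called in the examples.
\newcommand{\institution}[1]{\renewcommand{\@institution}{#1}}
\newcommand{\degree}[1]{\renewcommand{\@degree}{#1}}
\newcommand{\gpa}[1]{\renewcommand{\@gpa}{#1}}
\newcommand{\dates}[2][]{%
  \renewcommand{\@fromdate}{#1}\renewcommand{\@todate}{#2}}

% Hypothetical stand-in for \makedaterange.
\newcommand{\makedaterange}{%
  \ifthenelse{\equal{\@fromdate}{}}{\@todate}{\@fromdate{} -- \@todate}}

% Empty before hook; the after hook renders whatever the body set.
\newenvironment{education}{}{%
  \par\noindent
  \textbf{\@degree}\hfill\makedaterange\\
  \@institution\ifthenelse{\equal{\@gpa}{}}{}{\hfill GPA: \@gpa}\par
}
\makeatother

\begin{document}
\begin{education}
  \gpa{3.62}
  \dates[September 2005]{May 2011}
  \institution{University of Alaska Fairbanks}
  \degree{Master of Science}
\end{education}
\end{document}
```

The fields are deliberately set in a scrambled order to show that only the after hook decides the layout. I've left the explicit resets out of this sketch: as far as I understand it, an environment body is also a TeX group, so \renewcommand calls made inside it are local and fall away at \end{education}, which would make the resets a safety net more than a requirement.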

This is similar to a family of OOP design patterns called the builder pattern. In my version of a builder pattern, an analogous API in JavaScript that generates HTML might look something like this:

const collegeDegree = educationDirector(htmlBuilder)
  .institution('University of Alaska Fairbanks')
  .degree('Master of Science')
  .major('Mechanical Engineering')
  .from('September 2005')
  .to('May 2011')
  .gpa(3.62)
  .build();

You can imagine that calling these methods on my director collects some state on a director object, maybe something like this:

  institution(name) {
    this._institution = name;
    return this;
  }

and maybe build looks something like this:

  build() {
    let output = '';

    const left = [
      this._degree,
      this._major || '',
      this._minor ? `(Minor: ${this._minor})` : ''
    ].join(' ');

    const right = this._builder.daterange(this._from, this._to);

    output += this._builder.headline(left, right);

    // ...and so on

This hypothetical JavaScript version is a lot more sophisticated than the one in LaTeX - it uses objects and has pluggable builders - but the general notion of collecting state and then "building" the desired output at the end is similar.

So we have a technique for making declarative environments in LaTeX, but there's one important wrinkle that I haven't covered yet: what about the nested environments? My "job" snippet had one called "accomplishments". There isn't really a clear way to "put an environment in a variable" like we did in the simpler cases. How did that work?

After some digging, I came across a LaTeX package called collect, which "Provides a 'collect' environment, that typesets text and saves it for later re-use". With this in mind, I defined the "accomplishments" environment like so:

\definecollection{job}

\newenvironment{accomplishments}{
  \@nameuse{collect}{job}{\begin{itemize}[topsep=0pt,parsep=1pt,itemsep=0pt,leftmargin=10pt]}{\end{itemize}}
}{
  \@nameuse{endcollect}
}

The normal API for defining a collect block is environment-based - \begin{collect} - but when using it inside another environment's definition, you have to use \@nameuse instead. In this case, I define a collect that wraps the stuff inside it with a list environment.

One thing I don't like about LaTeX is that some things aren't documented very well and a lot of times people kinda naively follow along without really understanding the thing that they're using. Unfortunately, I'm doing that here - I don't understand why the environment doesn't work, and I don't know where \@nameuse is defined or how it relates to environments. Some things will remain a mystery, I suppose.

The upshot, though, is that when I open an "accomplishments" block the list that I define gets captured and put into a file called resume.job. Then, in the after hook for my "job" environment, I can pull the contents back out. That looks like this:

\includecollection{job}

This automatically resets the job collector, so I don't have to do any renewing like I did with the "scalar" fields.
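
A footnote on \@nameuse, since I called it a mystery: as far as I can tell it's a small helper from the LaTeX kernel that builds a command name from a string and runs it, and \begin{foo} and \end{foo} ultimately run commands named \foo and \endfoo. If that's right, \@nameuse{collect} and \@nameuse{endcollect} are a roundabout way of calling the collect environment's underlying commands without going through \begin and \end. The sketch below shows the same trick with a throwaway environment; shouty and wrapper are made-up names, and it doesn't touch the collect package at all.

```latex
\documentclass{article}

\makeatletter % allow @ in command names outside a .cls or .sty file

% The LaTeX kernel's definition of \@nameuse is essentially:
%   \def\@nameuse#1{\csname #1\endcsname}
% so \@nameuse{foo} turns into \foo.

% A throwaway environment, which LaTeX stores as \shouty and \endshouty.
\newenvironment{shouty}{\bfseries\itshape}{}

% Inside another environment's definition, the inner environment is
% opened and closed by name instead of with \begin and \end.
\newenvironment{wrapper}{%
  \@nameuse{shouty}%
}{%
  \@nameuse{endshouty}%
}
\makeatother

\begin{document}
\begin{wrapper}
  This text is styled by the inner environment's code.
\end{wrapper}
\end{document}
```

This doesn't explain why \begin{collect} itself misbehaves inside another environment's definition, so that part stays a mystery.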

So that's how I refactored my resume to use declarative APIs! I was able to use this combination of techniques to make the items in my resume describe their semantics instead of their layout while pushing presentation concerns into my document class. By combining this with prior work to modularize my resume, I now have a fairly pleasant language for describing my work history and qualifications - and can focus on that copy.