# Stepping into math: Open-sourcing our step-by-step solver

### Evy Kassirer · 11 min read

*[fork us on **GitHub**!]*

Our mission at Socratic is to "make learning easy". Our app lets you take a picture of a homework question, and we teach you how to answer it. Magic!

Millions of students use our app and website to learn, and math (especially algebra) is consistently the top subject, for good reason: everyone has to take math, they take it for years, concepts build on each other, and many find it hard to understand.

To provide an excellent math learning experience, we wanted to guide students through their math problems, step-by-step. A good step-by-step solution for an algebra problem (such as "simplify x + ½ + x + ⅓") should be **detailed** and have good **explanations** of what happens along the way. These steps should also feel **intuitive**: not just any step-by-step solution, but one that a tutor would show their student.

We looked around for existing solutions that we could integrate into our app, but the ones we found were closed-source, behind paywalls, or did not focus on the teaching behind the steps, so we decided to build our own.

We're thrilled to release **mathsteps**, **the first open-source project that teaches math step-by-step**. We would love for you to join us in making math easy and fun to learn.

#### mathsteps in the Socratic app

We use mathsteps to power the math experience in our latest update. Students can take a picture of a math question, and we teach them how to answer it.

#### mathsteps in your own project

Our primary goal for this project is to build a math solver library that is focused on **pedagogy** (how best to teach). The math problems we're currently focusing on are pre-algebra and algebra problems involving simplifying expressions, for example getting from (1 + 2) - abs(-3) * x² to 3 - 3x². Our solution is a node module that, given a string of math, produces a list of steps to the solution. It is important that this step-by-step solution is similar to what a tutor would show a student.

To install mathsteps:

```
npm install mathsteps
```

To use mathsteps in your project:

```
const mathsteps = require('mathsteps');
const steps = mathsteps.simplifyExpression('2x + 2x + x + x');
steps.forEach(step => {
  console.log(step.oldNode.toString());  // "2 x + 2 x + x + x"
  console.log(step.changeType);          // "ADD_POLYNOMIAL_TERMS"
  console.log(step.newNode.toString());  // "6 x"
  console.log(step.substeps.length);     // 3
});
```

### How mathsteps works

There are three main parts to building this math step-by-step solver:

- Parsing the input (a string of math) to create an expression tree
- Modifying the tree to make it easier to work with
- Changing the tree in small ways, each change acting as a "step"

### 1. Parsing math input

#### Math expressions are trees

As humans, we read and write math as a line of text. If you were to type a math expression, it would probably look something like this:

(1 + 2) - abs(-3) * x²

You could also just look at that math expression and use your intuition to prioritize where to start simplifying. But a computer will understand the expression best when it's stored in a tree. These trees can be surprisingly complicated: even a short expression like (1 + 2) - abs(-3) * x² becomes this tree:
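As a rough text rendering of such a tree, here is a minimal sketch that builds the expression by hand from plain objects and prints one node per line. The helper names (`num`, `sym`, `op`, `fn`, `paren`, `render`) and the node shapes are assumptions made for illustration, not the math.js node classes, and treating -3 as a single constant is a simplification.

```javascript
// Hypothetical plain-object nodes, for illustration only.
const num = value => ({ type: 'constant', value });
const sym = name => ({ type: 'symbol', name });
const op = (symbol, ...args) => ({ type: 'operator', op: symbol, args });
const fn = (name, ...args) => ({ type: 'function', name, args });
const paren = content => ({ type: 'parenthesis', args: [content] });

// (1 + 2) - abs(-3) * x^2
const tree = op('-',
  paren(op('+', num(1), num(2))),
  op('*',
    fn('abs', num(-3)),
    op('^', sym('x'), num(2))));

// Render one node per line, indented by its depth in the tree.
function render(node, depth = 0) {
  const label = node.op || node.name ||
    (node.type === 'parenthesis' ? '( )' : String(node.value));
  const lines = ['  '.repeat(depth) + label];
  for (const child of node.args || []) {
    lines.push(...render(child, depth + 1));
  }
  return lines;
}

console.log(render(tree).join('\n'));
```

Even in this simplified form the short expression expands to eleven nodes across four levels.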

There are many existing open source projects that parse strings of math and create trees like this one. Several of these projects are also full Computer Algebra Systems (CAS) which can provide answers to math problems, though not with step-by-step explanations.

When we were researching this project, we considered using an existing CAS and *adding* steps to it. SymPy, a well-known open-source CAS, stood out as a great choice. Upon diving into the code, however, we realized that the structure of SymPy expression trees is optimized for finding answers, but not for teaching. Its trees don't store division or subtraction because these operations can be represented by multiplication, addition, and exponents.

For example, a division such as x / y is stored as x * y^{-1}.

SymPy introduces ambiguity; given a SymPy tree, there are multiple user inputs that could have yielded it, which are mathematically equivalent but not necessarily the same to a student. When building a CAS, it's beneficial to reduce the problem to make it easier to achieve the goal. **But the goal of a CAS, only getting the answer, is different from our goal, a step-by-step solution**. And the step-by-step solution requires a different architecture.

#### The math.js expression tree

Looking further, we found math.js, a powerful and extensive open-source math library. Its expression trees provide lots of details about the structure of the math expression, which is well suited to creating the steps we want. It's been a huge pleasure working with math.js. Its community is great, and Jos has been very responsive and supportive as we've been building mathsteps.

It's important to note that when math.js creates an expression tree, it represents all operations as binary (i.e. a node can have at most two children). This can be explained by the textbook definition of arithmetic operations. For example, + is adding exactly *two* things together. So 2 + 3 + 4 is actually either (2 + 3) + 4 or 2 + (3 + 4). This means math.js has to make a choice about which two things are being added together. It implicitly adds parentheses when constructing its tree to make the operations binary.
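To make the implicit parentheses visible, here is a small sketch that folds a list of operands into nested two-argument nodes and prints the grouping. The `add`, `toBinary`, and `show` helpers are hypothetical, and grouping from the left is an assumption about the parser's choice rather than something taken from the math.js source.

```javascript
// Hypothetical two-argument node: { op: '+', args: [left, right] }.
const add = (left, right) => ({ op: '+', args: [left, right] });

// Fold the operands into nested binary nodes, grouping from the left.
const toBinary = operands =>
  operands.slice(1).reduce((tree, next) => add(tree, next), operands[0]);

// Print the tree with every implicit pair of parentheses written out.
const show = node =>
  typeof node === 'number'
    ? String(node)
    : `(${show(node.args[0])} + ${show(node.args[1])})`;

console.log(show(toBinary([2, 3, 4])));       // ((2 + 3) + 4)
console.log(show(toBinary([2, 3, 4, 5, 6]))); // ((((2 + 3) + 4) + 5) + 6)
```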

But because + and * are commutative and associative binary operations, they feel intuitively like they aren't binary but could have any number of arguments. 2 + 3 + 4 + 5 + 6 feels like an addition of 5 numbers. x * y * x² * x feels like a multiplication of 4 terms, 3 of which have x in them and could be combined together. This combining step turns out to be important in teaching, and is why we need to change the math.js tree to not be binary anymore.

### 2. Modifying expression trees for step-by-step solutions

After using math.js to create a tree from a string of math, we transform the tree by **flattening operations**. This flattening step removes grouping choices made by the math.js parser. Through converting the binary tree into one that represents math in a more humanly intuitive way, it becomes much easier to perform step-by-step simplifications.

For example, consider the expression 2 + x + 2 + x. These are the steps for simplifying:

- Your question: 2 + x + 2 + x
- Collect like terms: (x + x) + (2 + 2)
- Combine like terms: 2x + 4

But here's the issue: the binary tree that math.js generates for 2 + x + 2 + x requires iterating up and down the tree to find like terms. These steps are way easier to do when we first transform the tree like this:

The transformed tree is a lot closer to the way we all intuitively view addition. We can then look at the children of (+), see that two of them are x and two of them are numbers, and collect those like terms to get (x + x) + (2 + 2). Flattening multiplication works in the same way.
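The flattening idea can be sketched in a few lines. This is an illustration under simplifying assumptions, not the mathsteps implementation: nodes are plain objects of the form `{ op, args }`, leaves are numbers or symbol strings, and explicit parentheses typed by the user are not modelled.

```javascript
// Hypothetical node helper: an operator with any number of arguments.
const add = (...args) => ({ op: '+', args });

// Merge nested + (or *) nodes into their parent when the operator matches.
function flatten(node) {
  if (typeof node !== 'object' || node === null) return node; // leaf
  const args = node.args.map(flatten);
  const canMerge = node.op === '+' || node.op === '*';
  const merged = [];
  for (const arg of args) {
    if (canMerge && arg !== null && arg.op === node.op) {
      merged.push(...arg.args); // lift the grandchildren up one level
    } else {
      merged.push(arg);
    }
  }
  return { op: node.op, args: merged };
}

// 2 + x + 2 + x as a left-grouped binary tree: ((2 + x) + 2) + x
const binary = add(add(add(2, 'x'), 2), 'x');
const flat = flatten(binary);
console.log(JSON.stringify(flat.args)); // [2,"x",2,"x"]

// With one flat list of children, like terms are easy to collect.
const symbols = flat.args.filter(arg => typeof arg === 'string');
const numbers = flat.args.filter(arg => typeof arg === 'number');
console.log(JSON.stringify([symbols, numbers])); // [["x","x"],[2,2]]
```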

Notice that even though we change the tree, we still preserve the user's input and therefore our ability to teach what the student is asking. There is exactly one situation where how we store the tree is a bit different from what the student gave as input: subtraction. When you see the expression 2 - x - 2 - x you probably still see -x and -x as like terms. We restructure the tree in the following way to represent this:

However, the printed expression 2 + -x + -2 + -x doesn't really make sense, and we assume the student would never input this. So when we print the tree on the right, we replace the + - with just - to get "2 - x - 2 - x".
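The printing rule can be shown with a tiny sketch. It works on strings purely for illustration (the `terms` list is a hypothetical stand-in for the children of the flattened + node), whereas a real printer would walk the tree.

```javascript
// Children of the flattened + node for 2 - x - 2 - x, with signs attached.
const terms = ['2', '-x', '-2', '-x'];

// Printing the sum literally gives the awkward form.
const naive = terms.join(' + ');
console.log(naive); // 2 + -x + -2 + -x

// Replacing every "+ -" with "- " recovers what the student typed.
const printed = naive.replace(/\+ -/g, '- ');
console.log(printed); // 2 - x - 2 - x
```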

You can read the code for flattening operations here.

### 3. Simplifying math expressions step-by-step

Once we have an expression tree that's modified to best support step-by-step simplifications, we iteratively apply **simplifying rules** to the tree. Here are some examples of the main categories of simplifying rules that we iterate over each step, which are applied in an order that a tutor would show their student:

- Simplify basics (e.g. ▢⁰ => 1, where ▢ can be any expression)
- Evaluate arithmetic (e.g. 2 + 2 => 4)
- Collect and combine (e.g. 2x + 4x² + x => 4x² + 3x)
- Distribute (e.g. (2x + 3)(x + 4) => 2x² + 11x + 12)

Each of these simplifying rules is a tree search that traverses the whole math expression tree to see if we can perform that simplification anywhere. For example, searching for the rule ▢⁰ => 1 would look like this:

During the tree search, the algorithm checks nodes one at a time (shown in red in the gif) to see if they match a rule. For the ▢⁰ => 1 rule, the check looks like this:

- Check if the node is a ^ operation node. If not, move on.
- If so, check the exponent argument. If that exponent is not 0, move on.
- If it is, then this node has matched the rule. Replace the node with the constant node 1, to be recorded as the next "step".
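The three checks above can be sketched as a single function. The node shape (`{ op, args }` with plain numbers and strings as leaves), the function name, and the `ZERO_EXPONENT` label are all assumptions for illustration; the real math.js nodes and mathsteps change types are richer than this.

```javascript
// Hypothetical node shape: { op: '^', args: [base, exponent] }.
function checkZeroExponent(node) {
  // 1. Is this a ^ operation node? If not, move on.
  if (typeof node !== 'object' || node === null || node.op !== '^') {
    return null;
  }
  // 2. Is the exponent 0? If not, move on.
  if (node.args[1] !== 0) {
    return null;
  }
  // 3. Matched: replace the node with the constant 1 and record the step.
  return { changeType: 'ZERO_EXPONENT', oldNode: node, newNode: 1 };
}

const match = checkZeroExponent({ op: '^', args: ['x', 0] });
console.log(match.newNode); // 1
console.log(checkZeroExponent({ op: '^', args: ['x', 2] })); // null
```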

Every tree search in mathsteps finds one place in the tree to apply a simplification, then returns from the search with that simplification. We keep looking for simplifications, starting at the very top of the tree each time, until no more simplifications can be applied. As we go, we keep a list of each simplification that is applied; together, these make up the final step-by-step solution.
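Here is a minimal sketch of that loop, assuming plain-object nodes (`{ op, args }`, with numbers and strings as leaves) and two hypothetical rules. It shows the "apply one change, restart from the top, record the step" pattern and is not the actual mathsteps code.

```javascript
const isNode = n => typeof n === 'object' && n !== null;

// Hypothetical rule: anything raised to the power 0 becomes 1.
function zeroExponent(node) {
  if (isNode(node) && node.op === '^' && node.args[1] === 0) return 1;
  return undefined;
}

// Hypothetical rule: a sum of plain numbers is evaluated.
function addNumbers(node) {
  if (isNode(node) && node.op === '+' &&
      node.args.every(arg => typeof arg === 'number')) {
    return node.args.reduce((sum, arg) => sum + arg, 0);
  }
  return undefined;
}

// Search from the top; apply the first rule that matches, exactly once.
// Returns the changed tree, or undefined when nothing matched.
function applyOnce(node, rules) {
  for (const rule of rules) {
    const result = rule(node);
    if (result !== undefined) return result;
  }
  if (!isNode(node)) return undefined;
  for (let i = 0; i < node.args.length; i++) {
    const changed = applyOnce(node.args[i], rules);
    if (changed !== undefined) {
      const args = node.args.slice();
      args[i] = changed;
      return { op: node.op, args };
    }
  }
  return undefined;
}

// Keep applying single changes until no rule matches anywhere.
function simplify(tree, rules) {
  const steps = [];
  let current = tree;
  let next;
  while ((next = applyOnce(current, rules)) !== undefined) {
    steps.push({ oldNode: current, newNode: next });
    current = next;
  }
  return steps;
}

// x^0 + 2  =>  1 + 2  =>  3
const steps = simplify(
  { op: '+', args: [{ op: '^', args: ['x', 0] }, 2] },
  [zeroExponent, addNumbers]);
console.log(steps.length);                       // 2
console.log(steps[steps.length - 1].newNode);    // 3
```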

#### Intuitive doesn't mean simple

To create the best teaching experience, we sometimes add extra steps for detail. Ideally, there's as much detail as possible, so that we're less likely to leave the student confused.

In this example, the best solution is more steps, and also requires more code. **Having pedagogical opinions makes the math solver more complex, but these extra details create a more intuitive learning experience** because we're explaining things more thoroughly.

However, complexity can make things... well, more complex. Once we start incorporating the process of *teaching* math, steps can stop being just simplifications, but that can create technical challenges. For example, the best way to teach someone to add fractions would be to explain making a common denominator first.

Note how the first step makes the expression *more* complicated! But then, if the tree search only does one change at a time without any context, this could happen:

Infinite loop! Well, darn.

To fix this, we have to remember that we're in the middle of adding fractions while choosing the next step. Our solution in mathsteps is to group related simplifications (for example, all the steps to add two fractions together) in the same tree search iteration.
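One way to picture the grouping is a single function that produces the whole fraction addition at once, so the search never sees the "more complicated" intermediate form on its own. The sketch below is an illustration with assumed names (`addFractions`, the `ADD_FRACTIONS` label, the description strings) and our own example, 1/2 + 1/3. It handles positive integers only and does not reduce the result.

```javascript
const gcd = (a, b) => (b === 0 ? a : gcd(b, a % b));

// Returns one top-level step; the details live in its substeps.
function addFractions(n1, d1, n2, d2) {
  const common = (d1 * d2) / gcd(d1, d2); // least common denominator
  const k1 = common / d1;
  const k2 = common / d2;
  const substeps = [
    { description: 'make a common denominator',
      result: `(${n1} * ${k1})/(${d1} * ${k1}) + (${n2} * ${k2})/(${d2} * ${k2})` },
    { description: 'multiply',
      result: `${n1 * k1}/${common} + ${n2 * k2}/${common}` },
    { description: 'combine the numerators',
      result: `(${n1 * k1} + ${n2 * k2})/${common}` },
    { description: 'add the numerators together',
      result: `${n1 * k1 + n2 * k2}/${common}` },
  ];
  return {
    changeType: 'ADD_FRACTIONS', // hypothetical label
    oldNode: `${n1}/${d1} + ${n2}/${d2}`,
    newNode: substeps[substeps.length - 1].result,
    substeps,
  };
}

const step = addFractions(1, 2, 1, 3);
console.log(step.oldNode, '=>', step.newNode); // 1/2 + 1/3 => 5/6
step.substeps.forEach(s => console.log('  ' + s.description + ': ' + s.result));
```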

#### Grouping steps to make teaching better

When we group steps, we can also introduce substeps: extra details behind a step that aren't shown at the top level. This is what it looks like in our app:

Collapsed substeps allow us to give detailed steps without overwhelming students at first glance. We can expose only the high-level changes at first, and let the student explore the details of steps they don't understand. Grouped steps aren't just a technical simplification; they represent a real, intuitive, pedagogical concept.

If you're curious, you can see the code for adding fractions here.

#### Detail in the explanations

We tried to make the step descriptions as detailed and specific to the situation as possible; for example, "add the numerators together" is better than just "add the numbers together". This brings us closer to the language a human tutor might naturally use in a multi-step explanation. We also want to allow users of mathsteps to reference what changed specifically in the expression, for example "add 2 and 3 to make 5". We keep track of what part of the tree changed in the node.

#### Optimizing simplifications

In mathsteps, there are two kinds of tree searches: one that simplifies the children of a node first and one that simplifies the parent first.

In this example, simplifying the children before the parent (which is called postorder search) simplifies (x⁰)⁰ to 1⁰ and *then* to 1, which is harder to follow than simplifying the parent before the children (preorder search) and going straight from (x⁰)⁰ to 1. So for the rule ▢⁰ => 1, we change the *highest* part of the tree that matches the rule.
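The difference between the two orders can be checked with a short sketch. It assumes plain-object nodes (`{ op, args }`) and a hypothetical `search` helper that applies a rule once, either parent-first or children-first; it is not the mathsteps search code.

```javascript
const isNode = n => typeof n === 'object' && n !== null;

// Hypothetical rule: anything raised to the power 0 becomes 1.
const zeroExponent = node =>
  (isNode(node) && node.op === '^' && node.args[1] === 0 ? 1 : undefined);

// Apply the rule at one place in the tree, in the given search order.
function search(node, rule, order) {
  if (order === 'preorder') {
    const hit = rule(node); // try the parent before its children
    if (hit !== undefined) return hit;
  }
  if (isNode(node)) {
    for (let i = 0; i < node.args.length; i++) {
      const changed = search(node.args[i], rule, order);
      if (changed !== undefined) {
        const args = node.args.slice();
        args[i] = changed;
        return { op: node.op, args };
      }
    }
  }
  // In postorder, the parent is only tried after all of its children.
  return order === 'postorder' ? rule(node) : undefined;
}

// Count how many single-change steps it takes to finish.
function countSteps(tree, order) {
  let count = 0;
  let current = tree;
  let next;
  while ((next = search(current, zeroExponent, order)) !== undefined) {
    current = next;
    count++;
  }
  return count;
}

// (x^0)^0
const nested = { op: '^', args: [{ op: '^', args: ['x', 0] }, 0] };
console.log(countSteps(nested, 'preorder'));  // 1
console.log(countSteps(nested, 'postorder')); // 2
```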

For order of operations in arithmetic, however, a postorder search makes more sense, since we're taught to simplify whatever is deepest in the expression first. For example, (2 * (2 + 3)) will simplify 2 + 3 first, so it's more efficient for the tree search to start by attempting to perform simplifications lower in the tree.

In general, some searches are better as preorder searches, and others as postorder searches. Sometimes this is for technical reasons and efficiency, but a lot of the time it's for teaching reasons. It's important that the codebase is organized around teaching, rather than just solving math.

### There's more to do

We've accomplished a lot, and currently provide high-quality step-by-step solutions for a subset of high school algebra. However, there is so much to be done to increase our coverage and teach more students. We're excited for the future of this project, and we're also excited to see what else will be built with mathsteps.

Here are some ideas we've had about great teaching experiences built off of mathsteps:

- Before showing students a step, have them guess what to do next and check their work as they go
- Keep track of the types of problems a student asks and when they look at substeps, and use this information to customize the detail of their future step-by-step solutions

... and there are many more possibilities! If you have ideas, we'd love to hear them.

### Help expand mathsteps

We've made mathsteps open-source! Our goal is to help as many students as possible, so we'd love for you to join us to help expand mathsteps to support all kinds of math! A lot of this project was built by a student intern (me!) so you don't need a fancy PhD to understand it or contribute.

Here are some great places to start:

- Check out our small tasks on GitHub. They're great candidates for your first change!
- Read through our tests of the steps or the examples in comments in the code, if you want to explore what the existing code does or how it works
- Read our CONTRIBUTING.md for more details about contributing

We're excited about mathsteps and hope it will improve the world of math educational tech for both engineers and students. If you'd like to chat, work with us to create your first PR, or get some help using mathsteps in your projects, please reach out. We'd **love to hear from you**!

*Evy Kassirer* *is a Computer Science student at the University of Waterloo and has interned at Google, Khan Academy, and Socratic. She works to inspire curiosity, likes building stuff that helps people, and loves her communities.*

*This post was originally published on medium.com*