Background
A nasty surprise
Last summer, while trying to deliver a feature for one of our customers, I ran into a nasty situation. The software we were developing depended on a production-grade license of Gurobi. Everyone except my team and some unrelated staff was on vacation, so developing the feature was in principle blocked. As I learned, the research staff, who had the final say on the license, were attending conferences and could not renew it. The situation was very uncomfortable for me, because the feature would be delayed considerably. Months before, I had cautioned that depending solely on a closed-source solver was bad practice when free, open-source alternatives like HiGHS existed. Gurobi is the leading player in the field, with a very performant product that offers many conveniences; in our case it was actually much more performant than the open-source solutions. But license disruptions can happen, and when they do, users of the feature are left in a difficult position.
In summary, the feature amounted to the following workflow. Users parametrize a process in a web GUI. These parameters are translated into an ILP (Integer Linear Programming) problem, which is then solved, and the results are returned to the GUI. We followed the standard approach of sending the parameters as a REST payload to a server; the server performs the translation to the ILP, solves it, and sends the results back.
You can get a taste of that here.
The plan
With some time available, I decided to evaluate providing an alternate implementation of the solution step instead of mocking it. This mattered because performance considerations were also in scope. The first attempt bombed because the code was not clean; it was written by researchers, after all. I was lucky enough to have some of their notebooks with outputs for comparison. So, given this opportunity, I went ahead and cleaned up their code considerably (and fixed a number of serious bugs, yay!!!). This post focuses on bringing up the alternative, not the other parts of the feature, which were equally important. But first, let's outline the plan of attack I settled on. We are talking about a Python code base.
- Clean up the code so that the ILP problem is clear. Building on an earlier cleanup attempt by a colleague, I was able to take it further, attach types, and make sense of the code. I will not go into more detail, but it was not very pleasant.
- Given the Gurobi code, and the fact that there is an interchange format for ILP problems called MPS, the workaround was to serialize the Gurobi formulation to an MPS file, load it, and solve the ILP with HiGHS. It involved some work, mostly writing a bunch of adapters and understanding how HiGHS works, but it was the path of least resistance and worked fine. Since moving the huge MPS file across the network, instead of the much smaller set of parameters as in the original plan, would have been a bottleneck, I hid the file generation inside the computation server.
- While this was not the best solution, I was more confident: the whole feature was progressing after all. I decided to give the re-implementation in HiGHS a shot, which would bring me to parity with the original plan by eliminating the serialization/deserialization of a big file. By that point it was easier than I had anticipated.
Obviously I will not be able to share the code, but I will use a toy example to highlight the principles.
Highlights of the porting
Introduction
As a toy example I will use the famous "assignment problem". It is a very common and simple ILP problem that pales in comparison to the customer's ILP problem, but it is enough to highlight the main issues. I use this excellent reference, a good set of lectures on solving ILP problems. You can try to replicate what is presented here for the other problems.
The typical assignment problem amounts to assigning M people to N jobs, with every possible assignment job -> person incurring a cost of C(job, person). The task is to find the minimum-cost assignment. The constraints are:
- Every job must be assigned exactly one person
- Every person can be assigned to at most one job.
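In symbols, with a binary variable x_{jw} per potential assignment of job j to worker w, the constraints and the cost to minimize read:

```latex
\min \sum_{j=1}^{N}\sum_{w=1}^{M} C(j, w)\, x_{jw}
\quad \text{s.t.} \quad
\sum_{w=1}^{M} x_{jw} = 1 \;\; \forall j, \qquad
\sum_{j=1}^{N} x_{jw} \le 1 \;\; \forall w, \qquad
x_{jw} \in \{0, 1\}
```

The equality per job enforces "exactly one person", and the inequality per worker enforces "at most one job".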
Obviously M must be at least N to cover all the jobs, and at most N so that no one is left out; in effect M = N. Our plan here is to solve this in three ways:
- Gurobi (Model in Gurobi and solve in Gurobi)
- Pseudo Gurobi (Model in Gurobi solve in HiGHS)
- HiGHS (Model in HiGHS and solve in HiGHS)
Code is here.
Gurobi approach
First of all, we will use named binary variables to represent the potential assignments. Any variable that takes the value 1 in a solution corresponds to a realized assignment.
import gurobipy as gp
from gurobipy import GRB
env = gp.Env()
model = gp.Model(env=env)
Njobs = 10  # problem size: number of jobs (and workers), e.g. 10
x = {}
for job_index in range(0, Njobs):
    for worker_index in range(0, Njobs):
        var_name = f"x_{job_index}_{worker_index}"
        x[(job_index, worker_index)] = model.addVar(vtype=GRB.BINARY, name=var_name)
Now we need some assignment costs, as mentioned previously.
import random
from typing import Dict, Tuple
random.seed(0)
cost: Dict[Tuple[int, int], float] = {}
for job_index in range(0, Njobs):
    for worker_index in range(0, Njobs):
        cost[job_index, worker_index] = random.randint(2, 4) * 0.5
We selected random weights (fixing the random process by the seed for reproducibility) because if all the costs were the same, the trivial assignment i -> i for every i would already be optimal.
Now it is time for the constraints and the objective, which model exactly what we described in the previous subsection.
# every job must be assigned exactly one worker
for job_index in range(0, Njobs):
    model.addConstr(gp.quicksum(x[job_index, worker_index] for worker_index in range(0, Njobs)) == 1)
# every worker can take at most one job
for worker_index in range(0, Njobs):
    model.addConstr(gp.quicksum(x[job_index, worker_index] for job_index in range(0, Njobs)) <= 1)
# objective function
objective = gp.quicksum(cost[job_index, worker_index] * x[job_index, worker_index] for job_index in range(0, Njobs) for worker_index in range(0, Njobs))
model.setObjective(objective, GRB.MINIMIZE)
This covers the first part, namely the modeling of our problem. The second and last part is the solution. It is enough to invoke the process:
model.Params.timeLimit = 200.0 # seconds
model.Params.LogToConsole = 1
model.Params.IntegralityFocus = 1
model.optimize()
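Once `optimize()` returns, the realized assignments are the variables that solved to 1. A minimal sketch of reading them back, written solver-agnostically so the same helper works later with HiGHS (the helper name and the `get_value` parameter are my own, not part of either API):

```python
# Hypothetical helper: collect realized assignments from solved binary variables.
# `get_value` abstracts the solver API; with Gurobi it would be lambda v: v.X.
def realized_assignments(x, get_value, threshold=0.5):
    """Return the (job, worker) pairs whose binary variable solved to 1."""
    return sorted(
        (job, worker)
        for (job, worker), var in x.items()
        if get_value(var) > threshold  # tolerate tiny floating-point noise
    )

# Demonstration with plain numbers standing in for solved variables:
fake_solution = {(0, 1): 1.0, (0, 0): 0.0, (1, 0): 0.9999}
print(realized_assignments(fake_solution, get_value=lambda v: v))
# → [(0, 1), (1, 0)]
```

The 0.5 threshold is the usual trick for binary variables, since MIP solvers may return values like 0.9999 instead of an exact 1.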
The rest of the code just displays the solution, nothing special. The deal breaker is the following notification from the library:
Restricted license - for non-production use only - expires 2027-11-29
This means two things. First, we are working on borrowed time. Second, the size of the problems we can solve is limited: if we set Njobs = 100 we are greeted with a crash.
GurobiError: Model too large for size-limited license; visit https://gurobi.com/unrestricted for more information
This work is in the gurobipy_formulation.ipynb notebook.
Pseudo-Gurobi and HiGHS approaches
In my case I was greeted with an "Unauthenticated" error because the license had expired, and with the exact error above when I tried to run without a license. But not all is lost. The solution step, which is the selling point of Gurobi, no longer works, but the modelling part works perfectly. Armed with this knowledge, I decided to follow the hybrid method: model in Gurobi, solve in HiGHS. It is true that the HiGHS documentation takes a bit of getting used to, but I only had to make two changes. The first and more important is to swap the solution process. Thanks to interoperability (an underappreciated concept in the software engineering business) it was painless. More specifically, we swap this
model.Params.timeLimit = 200.0 # seconds
model.Params.LogToConsole = 1
model.Params.IntegralityFocus = 1
model.optimize()
with this
import highspy
h: highspy.Highs = highspy.Highs()
model.write('mymodel.mps')
status = h.readModel('mymodel.mps')
print('Reading model file mymodel.mps returns a status of ', status)
h.setOptionValue("time_limit", 200)
h.solve()
print('Model has status ', h.getModelStatus())
Simple as that. The second change, which understandably is HiGHS-specific, has to do with the pretty printing of the solutions.
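For context, HiGHS reports solution values as a flat array of column values rather than per-variable attributes (with highspy, roughly `h.getSolution().col_value`). A sketch of the kind of pretty printing involved, under the assumption that columns were created in job-major order as in our loops above (the helper itself is my own illustration, not part of the highspy API):

```python
# Sketch: map HiGHS's flat column-value array back to assignments,
# assuming column job * Njobs + worker holds the variable x_{job}_{worker}.
def pretty_print(col_value, Njobs, threshold=0.5):
    """Return human-readable lines for the assignments encoded in col_value."""
    lines = []
    for job in range(Njobs):
        for worker in range(Njobs):
            if col_value[job * Njobs + worker] > threshold:
                lines.append(f"job {job} -> worker {worker}")
    return lines

# Toy check with a hand-made solution for Njobs = 2 (x_0_1 = 1, x_1_0 = 1);
# with highspy this list would come from h.getSolution().col_value.
print(pretty_print([0.0, 1.0, 1.0, 0.0], Njobs=2))
# → ['job 0 -> worker 1', 'job 1 -> worker 0']
```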
This work is in pseudogurobipy_formulation.ipynb notebook.
Now, for the pure HiGHS approach, we replace the model instantiation. In other words, we swap
import gurobipy as gp
from gurobipy import GRB
env = gp.Env()
model = gp.Model(env=env)
with this
import highspy
h = highspy.Highs()
Keep in mind that this replaces the first part (the modeling) of the hybrid approach. We no longer need the MPS file: the solution process is simply a swap of this
model.Params.timeLimit = 200.0 # seconds
model.Params.LogToConsole = 1
model.Params.IntegralityFocus = 1
model.optimize()
with this
h.setOptionValue("time_limit", 200)
h.solve()
print('Model has status ', h.getModelStatus())
The modeling changes slightly. We have to define a utility function, quicksum, to mimic and replace the gp.quicksum helper.
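A minimal stand-in can be a generic fold over `+`, assuming (as highspy's operator overloading allows) that its variables and linear expressions can be added to each other; `reduce` avoids starting from the integer 0 as plain `sum()` would:

```python
from functools import reduce
import operator

# Minimal replacement for gp.quicksum: fold the terms with `+`.
# Works for anything addable, including highspy variables/expressions
# (assumption: highspy overloads `+` between them).
def quicksum(terms):
    return reduce(operator.add, terms)

# Demonstration with plain numbers:
print(quicksum([1.5, 2.0, 0.5]))  # → 4.0
```

With HiGHS this would be used exactly like the Gurobi original, e.g. in a constraint such as `h.addConstr(quicksum(x[job_index, worker_index] for worker_index in range(0, Njobs)) == 1)`.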
The second change has to do with how we instantiate a variable. We swap
x[job_index, worker_index] = model.addVar(vtype=GRB.BINARY, name=var_name)
with this
x[job_index, worker_index] = h.addBinary(name=var_name)
As you can see, the swap itself is easy. What was not easy was cleaning up and debugging the modelling process, which is not straightforward at all.
This work is in the highspy_formulation.ipynb notebook.
Epilogue
We showed how a problem that seemed insurmountable had two solutions. Not ideal, but solutions nonetheless. While the license of a production-ready commercial ILP solver had expired, we could still fall back to slower processing to keep the business moving. Moreover, I had to carefully review my options and clean up the code base to make it amenable to the workaround. In the process the code became cleaner and bug-free, and I re-evaluated some modelling approaches (which I did not mention previously); the researchers approved them. The end result narrowed the memory and processing gap between the Gurobi and HiGHS approaches quite a bit. Since then, we have renewed the license and the feature has been delivered. This time, we are prepared for a possible outage. I hope you enjoyed the article.
As always, the code is provided. Feel free to open an issue if you see something wrong, or add a comment.