Doug Sillars for Mindee

Posted on • Originally published at mindee.co

Parsing receipts with Mindee's Machine Learning API

Anyone who has filed an expense report can tell you: receipt tracking and expense logging are a headache. Enter Mindee’s receipt parsing API, which uses deep learning to parse your receipt details automatically, accurately and almost instantly.

 

In this tutorial, we will walk through the steps to use Mindee’s Receipt Parsing API.  Let’s get started! 

 

API Prerequisites

  1. You’ll need a free Mindee account. Sign up and confirm your email to log in.
  2. A receipt. Look in your bag or wallet for a recent one, or do a Google Image search for a receipt and download a few to test with.

Setting up the API
 

Log into your Mindee account and access your Expense Receipt API environment by clicking the Expense Receipts card.

To activate the API, click the “Try for Free” button to access our generous free tier. You’ll land on the dashboard page, where you can quickly see your API usage (you have none right now, but that will change). The left navigation has links to “Documentation”, “Credentials” and “Live Interface”. The Documentation tab has all of the technical details you’ll need to build against the receipts API endpoint, and the Live Interface is a cool interactive demo. Since we want to build with the API rather than try out the demo, click on “Credentials” to create an API token.

 

Add a new token. In this example, I’ve named it “Tutorial”.
 

Click “Add New Key” and you’ll be able to see your API token.

 

Now, we are ready to make an API call.  In this example, we’ll be using cURL.

 

# -F makes curl send multipart/form-data and set the Content-Type header (including the boundary) itself
curl -X POST \
  https://api.mindee.net/products/expense_receipts/v2/predict \
  -H 'X-Inferuser-Token: {apiToken}' \
  -F file=@/path/to/your/file.png

 

Simply replace {apiToken} with your new API token and /path/to/your/file.png with the path to your receipt.

NOTE: You can also copy this code right from the documentation tab of the API with your API token inserted for you.
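
If you would rather make the call from code than from a terminal, here is a minimal Python sketch of the same request using only the standard library. The URL, the X-Inferuser-Token header and the file field name come from the cURL sample; the function names build_multipart and parse_receipt are hypothetical, and the sketch assumes the endpoint accepts a standard multipart upload just as cURL sends it.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path
from typing import Tuple

# Endpoint copied from the cURL sample in this tutorial.
API_URL = "https://api.mindee.net/products/expense_receipts/v2/predict"


def build_multipart(field: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def parse_receipt(path: str, api_token: str) -> dict:
    """POST the receipt image and return the decoded JSON response."""
    receipt = Path(path)
    body, content_type = build_multipart("file", receipt.name, receipt.read_bytes())
    request = urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"X-Inferuser-Token": api_token, "Content-Type": content_type},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (needs a real token and a real file, so it is left commented out):
# response = parse_receipt("/path/to/your/file.png", "{apiToken}")
```

The boundary is generated per request and placed in the Content-Type header by the same function that builds the body, so the two can never disagree.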

 

In this example, I used a receipt from the grocery store at Köln airport (my last business trip in 2020).
 

I pasted the cURL sample into my terminal, hit enter, and about a second later received a JSON response with the receipt details. Since the response is quite verbose, we will walk through the various fields section by section.

 

API Response: Parsing Results

 

Summary & Documents section:

 

The first two sections of the response contain information about the API call made:

 

"call": {
    "endpoint": {
        "name": "expense_receipts",
        "version": "2.1"
    },
    "finished_at": "2020-08-29T18:01:22+00:00",
    "id": "e47d8654-0df7-4839-a282-2c04bf293886",
    "n_documents": 1,
    "n_inputs": 1,
    "processing_time": 1.087,
    "started_at": "2020-08-29T18:01:21+00:00"
},
"documents": [
    {
        "id": "66d9adc6-76cf-4c42-8622-3dadb660ac32",
        "name": "IMG_20200301_073354.jpg"
    }
],

 

The call section tells us that we ran on the expense receipts endpoint, uploading one document that is one page long, and after just about 1 second, the file was processed.  The documents section gives the Mindee id for the file, and the filename.
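
To make those fields concrete, here is a small Python sketch that reads the call object shown above and turns it into a one-line summary. The function name summarize_call is hypothetical, and the sketch assumes processing_time is reported in seconds, which matches the timestamps in this response.

```python
from datetime import datetime

# The "call" object exactly as it appears in the response above.
call = {
    "endpoint": {"name": "expense_receipts", "version": "2.1"},
    "finished_at": "2020-08-29T18:01:22+00:00",
    "id": "e47d8654-0df7-4839-a282-2c04bf293886",
    "n_documents": 1,
    "n_inputs": 1,
    "processing_time": 1.087,
    "started_at": "2020-08-29T18:01:21+00:00",
}


def summarize_call(call: dict) -> str:
    """Build a short human-readable summary of the call metadata."""
    started = datetime.fromisoformat(call["started_at"])
    finished = datetime.fromisoformat(call["finished_at"])
    # The timestamps only have one-second resolution.
    wall_clock = (finished - started).total_seconds()
    endpoint = call["endpoint"]
    return (
        f"{endpoint['name']} v{endpoint['version']}: "
        f"{call['n_documents']} document(s), "
        f"{call['processing_time']}s processing, "
        f"{wall_clock:.0f}s between timestamps"
    )


print(summarize_call(call))
# expense_receipts v2.1: 1 document(s), 1.087s processing, 1s between timestamps
```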

 

Predictions:  

 

Now we are getting to the exciting stuff. The Predictions section is broken into several subsections. Some of these identify fields printed on the receipt, and others use machine learning to deduce information from the receipt. Let’s go through each one:

 

Category

 

"category": {
     "probability": 0.51,
     "value": "miscellaneous"
},

 

The API makes a prediction about the type of purchase. In this case, it is 51% sure the purchase is miscellaneous. The possible categories are [toll, food, parking, transport, accommodation, gasoline, miscellaneous].
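
A probability this low is a good candidate for a human check. The sketch below, in Python, flags any prediction whose probability falls under a cut-off. The 0.9 threshold and the name needs_review are assumptions made for illustration, not values recommended by Mindee.

```python
REVIEW_THRESHOLD = 0.9  # hypothetical cut-off, tune it for your own workflow


def needs_review(prediction: dict, threshold: float = REVIEW_THRESHOLD) -> bool:
    """Return True when a prediction's probability is below the threshold."""
    return prediction["probability"] < threshold


# The category prediction from the response above.
category = {"probability": 0.51, "value": "miscellaneous"}
print(needs_review(category))  # True
```

The same check works for any prediction object in the response that carries a probability field.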

 

Date

 

The date is identified from text on the receipt and converted into ISO format. This purchase was made on February 3, 2020, and the model is 99% confident in that choice. The segmentation bounding box provides 4 (x,y) coordinates indicating where the date was pulled from the receipt [(0,0) is the upper left corner, (1,1) is the bottom right corner].

"date": {
    "iso": "2020-02-03",
    "probability": 0.99,
    "raw": "03-02-2020",
    "segmentation": {
        "bounding_box": [
            [0.64, 0.661],
            [0.801, 0.661],
            [0.801, 0.686],
            [0.64, 0.686]
        ]
    }
},
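
Because the bounding box coordinates are relative to the image, they need to be scaled before you can draw them. Here is a minimal Python sketch of that conversion. The function name to_pixels is hypothetical, and the 1000 x 2000 image size is made up for the example.

```python
from typing import List, Tuple


def to_pixels(bounding_box: List[List[float]], width: int, height: int) -> List[Tuple[int, int]]:
    """Scale relative (x, y) points to pixel coordinates of a width x height image."""
    return [(round(x * width), round(y * height)) for x, y in bounding_box]


# Bounding box of the date field from the response above.
date_box = [[0.64, 0.661], [0.801, 0.661], [0.801, 0.686], [0.64, 0.686]]

# 1000 x 2000 pixels is an invented image size, used only for illustration.
print(to_pixels(date_box, 1000, 2000))
# [(640, 1322), (801, 1322), (801, 1372), (640, 1372)]
```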

 

Locale

 

Using data from the receipt, the API can predict where the purchase was made, the language and the currency. Check the documentation for the latest list of supported locales. At the time of writing, support is centered on Europe and North America.

 

"locale": {
    "country": "DE",
    "currency": "EUR",
    "language": "de",
    "probability": 0.77,
    "value": "de-DE"
},

 

In the case of my receipt, it is 77% confident that the purchase is in German, made in Germany, and in euros.

 

Merchant

"merchant": {
    "name": "REWE",
    "probability": 0.91,
    "segmentation": {
        "bounding_box": [
            [0.279, 0.135],
            [0.719, 0.135],
            [0.719, 0.23],
            [0.279, 0.23]
        ]
    }
},

The API correctly predicted (it was 91% sure) that it was a REWE store. Again, four (x,y) points mark the location of the text naming the Merchant on the image.

 

Orientation

"orientation": {
    "degrees": 0,
    "probability": 0.99
},

Did the document require rotation before parsing? Rotation is measured in 90 degree increments [0, 90, 180, 270]. In this case, it did not require any rotation.

 

Taxes

"taxes": [],

 

If any taxes are identified in the receipt, they will appear here. In this case, no taxes were found, which is the correct result for this receipt.

 

Time

"time": {
    "iso": "15:50",
    "probability": 0.99,
    "raw": "15:50",
    "segmentation": {
        "bounding_box": [
            [0.649, 0.898],
            [0.732, 0.898],
            [0.732, 0.925],
            [0.649, 0.925]
        ]
    }
},

 

This is the time the receipt was printed, along with the confidence and the (x,y) coordinates that bound the field in the image.
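
Since the date and the time arrive as separate ISO strings, an expense system will usually want them joined. The Python sketch below combines them and also checks the raw date text against the ISO value. The function name purchase_datetime is hypothetical, and the result carries no time zone because the response shown here does not include one.

```python
from datetime import datetime

# "iso" and "raw" values copied from the date and time objects in the response.
date = {"iso": "2020-02-03", "raw": "03-02-2020"}
time = {"iso": "15:50", "raw": "15:50"}


def purchase_datetime(date: dict, time: dict) -> datetime:
    """Join the two ISO strings into one naive datetime (no time zone attached)."""
    return datetime.fromisoformat(f"{date['iso']}T{time['iso']}")


purchased_at = purchase_datetime(date, time)
print(purchased_at.isoformat())  # 2020-02-03T15:50:00

# Sanity check: reading the raw text as day-month-year gives the same date.
raw_date = datetime.strptime(date["raw"], "%d-%m-%Y").date()
print(raw_date == purchased_at.date())  # True
```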

 

Total

Perhaps the most important part of the receipt is the total spent, which is returned along with the confidence and the box indicating its location on the receipt.

"total": {
    "amount": 17.74,
    "probability": 0.99,
    "segmentation": {
        "bounding_box": [
            [0.663, 0.589],
            [0.765, 0.589],
            [0.765, 0.617],
            [0.663, 0.617]
        ]
    }
},

 

 

Summary:

 

In just over 1 second, a receipt was uploaded, parsed and the response returned to the end user.  We know that €17.74 was spent at a REWE in Germany on February 3, 2020 at 15:50.  Using the bounding boxes, we can have the user validate the values, and then input this data into an expense management system.  
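
As a rough illustration of that last step, here is a Python sketch that maps the predictions onto a record an expense system could store. ExpenseRecord and to_expense_record are hypothetical names, and the sample dictionary holds only the fields used here, with values copied from the response in this tutorial. How the predictions are nested inside the full response is not shown, so check the documentation for the exact path.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal


@dataclass
class ExpenseRecord:
    merchant: str
    total: Decimal
    currency: str
    purchased_at: datetime
    category: str


def to_expense_record(predictions: dict) -> ExpenseRecord:
    """Pick the fields an expense report needs out of the predictions."""
    return ExpenseRecord(
        merchant=predictions["merchant"]["name"],
        # Going through str() keeps the amount as 17.74 rather than a binary float.
        total=Decimal(str(predictions["total"]["amount"])),
        currency=predictions["locale"]["currency"],
        purchased_at=datetime.fromisoformat(
            f"{predictions['date']['iso']}T{predictions['time']['iso']}"
        ),
        category=predictions["category"]["value"],
    )


# Values copied from the response walked through above.
predictions = {
    "category": {"probability": 0.51, "value": "miscellaneous"},
    "date": {"iso": "2020-02-03", "probability": 0.99},
    "locale": {"country": "DE", "currency": "EUR", "language": "de"},
    "merchant": {"name": "REWE", "probability": 0.91},
    "time": {"iso": "15:50", "probability": 0.99},
    "total": {"amount": 17.74, "probability": 0.99},
}

record = to_expense_record(predictions)
print(f"{record.total} {record.currency} at {record.merchant} on {record.purchased_at:%Y-%m-%d %H:%M}")
# 17.74 EUR at REWE on 2020-02-03 15:50
```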

 

Conclusion

 

Using the Mindee receipt parsing API, you can quickly validate receipts, allowing for faster, more accurate (and less painful) expense management for your users. If you have questions, please reach out to us in the chat widget in the bottom right.
