sehmimhaque

Posted on Apr 21, 2021

Serverless AWS Textract Document Scanner

#aws #node #serverless #machinelearning

In this post we will use AWS Textract to scan a picture of a document, extract its text, and get the result back as JSON. We will also use an AWS Lambda function with Node.js to build the backend.

1. Setting up Backend with Serverless using Node

Assuming you already know how Serverless works, we can continue with AWS Textract and the flow it follows. If you're not familiar with Serverless on Node, please don't jump the gun; go check out some tutorials here.

Okay. Let's quickly set up our Serverless project:

sls create --template aws-nodejs --path myService

Make sure you have the dependencies in your package.json file as well. Then run

npm install

{
  "name": "document-scanner",
  "version": "1.0.0",
  "description": "",
  "main": "handler.js",
  "scripts": {
    "test": "mocha src/test/**"
  },
  "author": "",
  "license": "ISC",
  "devDependencies": {
    "aws-sdk": "^2.860.0",
    "aws-sdk-mock": "^4.5.0",
    "dirty-chai": "^2.0.1",
    "generator-serverless-policy": "^2.0.0",
    "mocha": "^8.3.1",
    "serverless": "^1.43.0",
    "serverless-iam-roles-per-function": "^1.0.4",
    "serverless-mocha": "^1.12.0",
    "serverless-mocha-plugin": "^1.12.0",
    "serverless-pseudo-parameters": "^2.4.0",
    "serverless-tag-api-gateway": "^1.0.0",
    "standard": "^11.0.1"
  },
  "dependencies": {
    "chai": "^4.3.3",
    "fs-extra": "^9.1.0",
    "serverless-secrets-plugin": "^0.1.0",
    "sharp": "^0.27.2"
  }
}

NOTE

Some things to keep in mind before continuing

Make sure you have proper authorization for this task.
Check your region.
Make sure the bucket URL is accurate.
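
The checklist above maps onto the generated serverless.yml roughly as follows. This is only a sketch under stated assumptions, not the configuration used for this post: YOUR_REGION and BUCKET_NAME are the same placeholders used in the handler below, the http path is hypothetical, and iamRoleStatements is the Serverless Framework v1 syntax (matching the "serverless": "^1.43.0" entry in package.json). Merge the relevant keys into your own file.

```yaml
service: myService

provider:
  name: aws
  region: YOUR_REGION              # placeholder: region used for the bucket and Textract
  iamRoleStatements:               # Serverless v1 syntax
    - Effect: Allow
      Action:
        - textract:StartDocumentAnalysis
        - textract:GetDocumentAnalysis
      Resource: "*"
    - Effect: Allow
      Action:
        - s3:GetObject             # Textract reads the file with the function's credentials
      Resource: arn:aws:s3:::BUCKET_NAME/*

functions:
  textractAnalyinzer:
    handler: handler.textractAnalyinzer
    timeout: 30                    # seconds
    events:
      - http:
          path: analyze            # hypothetical path
          method: post
```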

2. Now that the AWS SDK is configured, we can write the code for Textract

'use strict';
const AWS = require('aws-sdk');
AWS.config.update({region:'YOUR_REGION'});
const textract = new AWS.Textract();

module.exports.textractAnalyinzer = async (event) => {

  let { fileKey } = JSON.parse(event.body)

  const ttparams = {
      DocumentLocation: { S3Object: { Bucket: 'BUCKET_NAME', Name:  fileKey } },
      FeatureTypes: [ 
          "TABLES" , 
          // "FORMS" 
      ],
    };

  const analysis = await textract.startDocumentAnalysis(ttparams).promise();
  console.log(analysis);
  const JobId = analysis.JobId
  console.log('Waiting for processing');
  let response = {};
  do {
      await sleep(1000);
      response = await textract.getDocumentAnalysis({
          JobId,
          MaxResults : 1
      }).promise();
      //console.log(response.JobStatus)
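  } while (response.JobStatus === "IN_PROGRESS");

  console.log(response);
  // A failed job may return no Blocks, so report the failure instead of crashing.
  if (response.JobStatus === "FAILED") {
    return {
      statusCode: 500,
      body: JSON.stringify({ message: response.StatusMessage || 'Textract analysis failed' }),
    };
  }
  let Blocks = [...response.Blocks];

  // Fetch the remaining pages, but only while Textract returns a NextToken.
  while (response.NextToken) {
      response = await textract.getDocumentAnalysis({
          JobId,
          NextToken : response.NextToken
      }).promise();
      Blocks = Blocks.concat(response.Blocks);
  }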

  // All text by line
  let textByLine = purifyAnalyzedDataToAllLines(Blocks)

  return {
    statusCode: 200,
    body: JSON.stringify(
      {
        message: 'Go Serverless v1.0! Your function executed successfully!',
        "fileKey": fileKey,
        "textByLine": textByLine,
        "texTractblocks" : Blocks // Full response from Textract
      },
      null,
      2
    ),
  };
};


function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

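function purifyAnalyzedDataToAllLines(data) {
  // Keep only LINE blocks and reduce each one to its text and confidence.
  return data
    .filter(item => item.BlockType === "LINE")
    .map(item => ({ line: item.Text, confidence: item.Confidence }));
}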

The code above takes the fileKey from the request, looks for that file in the S3 bucket (for example under public/), and then runs a Textract analysis on it.
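
The polling loop in the handler checks once per second and never gives up on its own. A minimal sketch of a bounded alternative follows; pollUntil is a hypothetical helper (not part of aws-sdk), and the interval and attempt limit are made-up defaults you would tune to your function timeout.

```javascript
// Hypothetical helper: call `check` until `isDone(result)` is true,
// waiting `intervalMs` between attempts and giving up after `maxAttempts`.
async function pollUntil(check, isDone, { intervalMs = 1000, maxAttempts = 25 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await check();
    if (isDone(result)) return result;
    if (attempt < maxAttempts) {
      await new Promise(resolve => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Job still running after ${maxAttempts} attempts`);
}

// Possible use inside the handler (assumes `textract` and `JobId` from the code above):
// const response = await pollUntil(
//   () => textract.getDocumentAnalysis({ JobId, MaxResults: 1 }).promise(),
//   r => r.JobStatus !== 'IN_PROGRESS'
// );
```

Throwing after a fixed number of attempts lets the function return a clear error before Lambda cuts it off.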

3. Deploy the Code

sls deploy

Find the endpoint URL in the output of the deploy command.

4. For our next step, we will drop a file manually on the bucket so we can use it for testing.

Go to S3,
then navigate to /public
and then upload an image file
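
The same upload can be scripted instead of done by hand in the console. A sketch, assuming an aws-sdk v2 S3 client is passed in; uploadImage is a hypothetical helper, and BUCKET_NAME and demo.jpeg are placeholders.

```javascript
// Hypothetical helper: upload an image buffer under the public/ prefix.
// `s3` is expected to be an AWS.S3 instance from aws-sdk v2; it is passed in
// so that it can be replaced by a stub in tests.
async function uploadImage(s3, bucket, fileName, body) {
  const Key = `public/${fileName}`;
  await s3.putObject({ Bucket: bucket, Key, Body: body }).promise();
  return Key; // this is the fileKey the Lambda function expects
}

// Possible use (placeholders, not real names):
// const AWS = require('aws-sdk');
// const fs = require('fs');
// uploadImage(new AWS.S3(), 'BUCKET_NAME', 'demo.jpeg', fs.readFileSync('demo.jpeg'));
```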

I'm using this old receipt.

5. Finally, test it in Postman.

Payload:

{
    "fileKey": "public/demo.jpeg"
}

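The handler reads this payload with JSON.parse(event.body), which throws when the body is missing or malformed. A small guard could look like the sketch below; parseFileKey is a hypothetical name, and answering with a 400 status is an assumption about the behaviour you want.

```javascript
// Hypothetical guard: returns { fileKey } on success, or { error } describing the problem.
function parseFileKey(event) {
  let body;
  try {
    body = JSON.parse((event && event.body) || '');
  } catch (err) {
    return { error: 'Request body must be valid JSON' };
  }
  if (!body || typeof body.fileKey !== 'string' || body.fileKey.trim() === '') {
    return { error: 'Request body must contain a non-empty "fileKey" string' };
  }
  return { fileKey: body.fileKey };
}

// Possible use at the top of the handler:
// const { fileKey, error } = parseFileKey(event);
// if (error) return { statusCode: 400, body: JSON.stringify({ message: error }) };
```
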
If you get a timeout error, raise the function's timeout to 30 seconds in the serverless.yml file.
The response shows the kind of data Textract returns. For this demo I'm going to take every line and collect them in an array.

Your response should look something like this

{
    "fileKey": "public/demo.jpeg",
    "textByLine": [
        {
            "line": "01/027 APPROVED - THANK YOU",
            "confidence": 99.5232162475586
        },
        .
        .
        .
        .
    ],
    "texTractblocks": [
        {
            "BlockType": "PAGE",
            "Geometry": {
                "BoundingBox": {
                    "Width": 0.8844140768051147,
                    "Height": 0.8354079723358154,
                    "Left": 0.048781704157590866,
                    "Top": 0.15526676177978516
                },
                "Polygon": [
                    {
                        "X": 0.07131516188383102,
                        "Y": 0.1597394049167633
                    },
                    {
                        "X": 0.9331957697868347,
                        "Y": 0.15526676177978516
                    },
                    {
                        "X": 0.9245083928108215,
                        "Y": 0.9906747341156006
                    },
                    {
                        "X": 0.048781704157590866,
                        "Y": 0.9588059782981873
                    }
                ]
            },
            "Id": "9b384b8d-dcb8-4596-8511-af18659a9787",
            "Relationships": [
                {
                    "Type": "CHILD",
                    "Ids": [
                        "250a9339-d1ed-4c21-ad50-5a2154cd89da",
                        "aac798f2-3c05-41a2-979c-869509b53d58",
                        "eb878ad4-8b37-415d-b6ac-8cc909dab0a3",
                        "376c375f-94d1-47b7-9f4e-a9fb203043f2",
                        "628dbdd6-1225-43c9-867c-9a83ea91e1ae",
                        "aecacbf9-8727-4334-a904-6795df9c455b",
                        "c8e51b32-d010-4300-8e98-6002d6e5eee3",
                        "20e6422a-16c0-41b6-be2d-6c0c9d09ed44",
                        "82bfdb0d-20bd-407f-bc3b-33aef24fc097",
                        "aa3125fd-2e2d-48a5-9416-84ef7a987976",
                        "10ec162e-a937-4cd2-87d5-6d6b9205d719",
                        "b05a2ece-0a7f-4e65-87e5-fe4e49277f25",
                        "561f5c75-bbb4-4dc6-8660-fbc3f7386f9c",
                        "665bb6fe-8ac9-44b3-af49-189ac3ea7757",
                        "5d42a676-0621-42ad-89ff-7a16873290c4",
                        "bdb02d6e-3b80-4913-8359-ef7e70068582",
                        "28691f75-aef5-418d-8519-1d05bb991fda",
                        "8c4b9208-c2c5-4ad8-96a6-35e962043fbd"
                    ]
                }
            ]
        },
        .
        .
        .
    ]
}
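
Each block in texTractblocks points at its children through Relationships entries of type CHILD, as the PAGE block above shows. A minimal sketch of following those links, assuming the blocks array has the shape shown in the response; childBlocks is a hypothetical helper, and the sample ids and text are made up.

```javascript
// Hypothetical helper: return the child blocks of `block`, in the order listed.
function childBlocks(block, blocks) {
  const byId = new Map(blocks.map(b => [b.Id, b]));
  return (block.Relationships || [])
    .filter(rel => rel.Type === 'CHILD')
    .flatMap(rel => rel.Ids)
    .map(id => byId.get(id))
    .filter(Boolean); // skip ids that are not present in `blocks`
}

// Example with made-up ids and text:
const sample = [
  { Id: 'p1', BlockType: 'PAGE', Relationships: [{ Type: 'CHILD', Ids: ['l1', 'l2'] }] },
  { Id: 'l1', BlockType: 'LINE', Text: 'TOTAL 12.50' },
  { Id: 'l2', BlockType: 'LINE', Text: 'THANK YOU' },
];
console.log(childBlocks(sample[0], sample).map(b => b.Text));
```

Building the id lookup once per call keeps the example short; for large documents you would probably build the Map once and reuse it.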

That's it!

Next Step

Next week I will continue with this app and build a front end for it using Flutter and AWS Amplify.

We will set up AWS Amplify using Flutter,
set up our camera to take pictures.
Once that's done, we will confirm and send the picture to the S3 store,
which will trigger our Lambda function and send the response back to our front end.

