DEV Community

Vincent

Posted on Dec 22, 2020

Implement a Minimalistic KYC Form & Identity Verification Check

#devops #python #php #webdev

In this post, you will learn how to make use of the PixLab API to implement a minimalistic KYC (Know Your Customer) form for your webapps, and verify the authenticity of the personal information submitted by any of your users using an image upload of their ID card or Passport. Let's dive in!

Table Of Contents

The Process of ID Verification
The PixLab API
Code Samples
Conclusion

The Process of ID Verification

Many services have long accepted ID cards or passports as identification documents from their customers to complete their KYC (Know Your Customer) forms, as required by the legislation in force. This is especially true, and enforced, in the Finance, HR and Travel sectors. In most cases, a human operator verifies the authenticity of the submitted document and then validates or rejects it.

Needless to say, the verification process has been manual for a very long time. We can all agree that lining up and waiting for one person to verify our documents takes too much time. Let's not forget that this process is also prone to errors.

Things can get really complicated if you have hundreds of KYC forms to check, and even more so if your clients differ in nationality. You will quickly find yourself drowning in physical copies of passports in languages that you cannot even understand, not to mention the potential legal problems you can face with passport copies lying around the office.
As a developer, you may have thought about how you can solve this problem. Well, an automated & safe solution for ID verification is more than required!

The PixLab API

PixLab is an ML-focused SaaS platform offering Machine Vision & Media Processing APIs for developers via straightforward Web or Offline SDKs. The PixLab API feature set includes, but is not limited to:

Passport & ID card scanning using state-of-the-art Machine Learning models via a straightforward HTTP REST API, as shown in these Python & PHP gists, and publicly available via the /docscan API endpoint.
A broad set of face analysis API endpoints including face detection, landmark extraction, facial recognition, content moderation and many more.
On-the-fly image encryption, conversion and compression, plus full support for HTTP/2 and HTTP/3 (QUIC) as of the PixLab API 1.9.72 release.
Proxy for AWS S3 and other cloud storage providers.
Over 130 Machine Vision & Media Processing API endpoints.

Python/PHP Code Samples

Given an input passport specimen as follows:

The /docscan endpoint processes the passport's Machine Readable Zone (MRZ), detects and extracts any face present, and finally transforms the raw MRZ data into text content in JSON format, ready to be consumed by your app. The JSON object output should look like the following:

The result above was obtained via the following Python gist:

import requests

# Given a government issued passport document, extract the user face and parse all MRZ fields.
#
# PixLab recommends that you connect your AWS S3 bucket via your dashboard at https://pixlab.io/dashboard
# so that any cropped face or MRZ crop is stored automatically on your S3 bucket rather than the PixLab one.
# This feature should give you full control over your analyzed media files.
#
# https://pixlab.io/#/cmd?id=docscan for additional information.

req = requests.get('https://api.pixlab.io/docscan',params={
    'img':'https://i.stack.imgur.com/oJY2K.png', # Passport sample
    'type':'passport', # Type of document we are going to scan
    'key':'Pixlab_key'
})
reply = req.json()
if reply['status'] != 200:
    print (reply['error'])
else:
    print ("User Cropped Face: " + reply['face_url'])
    print ("MRZ Cropped Image: " + reply['mrz_img_url'])
    print ("Raw MRZ Text: " + reply['mrz_raw_text'])
    print ("MRZ Fields: ")
    # Display all parsed MRZ fields
    print ("\tIssuing Country: " + reply['fields']['issuingCountry'])
    print ("\tFull Name: "       + reply['fields']['fullName'])
    print ("\tDocument Number: " + reply['fields']['documentNumber'])
    print ("\tCheck Digit: "   + reply['fields']['checkDigit'])
    print ("\tNationality: "   + reply['fields']['nationality'])
    print ("\tDate Of Birth: " + reply['fields']['dateOfBirth'])
    print ("\tSex: "           + reply['fields']['sex'])
    print ("\tDate Of Expiry: "    + reply['fields']['dateOfExpiry'])
    print ("\tPersonal Number: "   + reply['fields'].get('personalNumber', '')) # Optional field: may be absent when not set by the issuing country, so use .get() to avoid a KeyError.
    print ("\tFinal Check Digit: " + reply['fields']['finalcheckDigit'])

Python Gist Source Code: https://github.com/symisc/pixlab/blob/master/python/passport_scan.py
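
The check digits printed above can also be re-computed on your side as an extra sanity check. The following is a minimal sketch of the ICAO Doc 9303 check digit rule (weights 7, 3, 1 repeated, letters A-Z valued 10-35, the filler `<` valued 0, sum taken modulo 10). The function names are hypothetical, and it is an assumption that the `checkDigit` field returned by the endpoint refers to the document number's check digit.

```python
# Sketch of the ICAO Doc 9303 MRZ check digit computation.
# Function names are illustrative and not part of the PixLab API.

DIGITS = "0123456789"


def mrz_char_value(ch: str) -> int:
    """Map one MRZ character to its numeric value."""
    if ch in DIGITS:
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10  # A=10 ... Z=35
    if ch == "<":
        return 0  # filler character
    raise ValueError(f"invalid MRZ character: {ch!r}")


def mrz_check_digit(field: str) -> int:
    """Weighted sum with weights 7, 3, 1 repeating, modulo 10."""
    weights = (7, 3, 1)
    total = sum(mrz_char_value(c) * weights[i % 3] for i, c in enumerate(field))
    return total % 10


def field_is_consistent(field: str, check_digit: str) -> bool:
    """True when the given check digit matches the one computed from the field."""
    if len(check_digit) != 1 or check_digit not in DIGITS:
        return False
    return mrz_check_digit(field) == int(check_digit)


# Values from the ICAO Doc 9303 specimen passport (not from a real document).
print(mrz_check_digit("L898902C3"))        # document number
print(field_is_consistent("740812", "2"))  # date of birth, YYMMDD
```

A mismatch here usually means the MRZ was misread or the document was altered, so it is a reasonable signal to send the form to manual review.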

The same logic using PHP now:

Regardless of the underlying programming language, the logic is always the same. We made a simple HTTP GET request, passing the input passport image URL along with the document type and our API key. Most PixLab endpoints support multiple HTTP methods, so you can easily switch to POST-based requests if you want to upload your images & videos directly from your mobile or web app for analysis. Back to our sample, only a single API endpoint is needed for our official document scanning task:

docscan is the sole endpoint needed for such a task. It supports various ID cards besides Passports & Visas and does face extraction automatically for you.
PixLab recommends that you connect your AWS S3 bucket via the dashboard so that any extracted face or MRZ crop is automatically stored on your S3 bucket rather than the PixLab one. This feature should give you full control over your analyzed media files.
Finally, take a look at the docscan endpoint documentation for additional information such as the set of scanned fields, where face crops are stored, how to process PDF documents instead of images and so forth.
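
To close the loop on the KYC form itself, the scanned fields can be compared against what the user typed in. The following is a rough sketch under stated assumptions: the helper names (`normalize`, `compare_kyc`) and the form layout are hypothetical, the field keys mirror those printed in the Python sample above, and the exact value formats returned by the endpoint should be checked against the docscan documentation before relying on this.

```python
import unicodedata


def normalize(value: str) -> str:
    """Uppercase, strip accents, and turn punctuation/MRZ fillers into single spaces."""
    decomposed = unicodedata.normalize("NFKD", value)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in no_accents.upper())
    return " ".join(cleaned.split())


def tokens(value) -> list:
    """Sorted word list, so 'Anna Eriksson' matches 'ERIKSSON ANNA' (order-insensitive)."""
    return sorted(normalize(str(value)).split())


def compare_kyc(form: dict, fields: dict,
                keys=("fullName", "documentNumber", "nationality")) -> dict:
    """Return the keys whose submitted value does not match the scanned one."""
    mismatches = {}
    for key in keys:
        submitted = form.get(key, "")
        scanned = fields.get(key, "")
        if tokens(submitted) != tokens(scanned):
            mismatches[key] = {"submitted": submitted, "scanned": scanned}
    return mismatches


# Hypothetical form input and scan result, for illustration only.
form = {"fullName": "Anna Maria Eriksson", "documentNumber": "l898902c3", "nationality": "UTO"}
scanned = {"fullName": "ERIKSSON ANNA MARIA", "documentNumber": "L898902C3", "nationality": "UTO"}
print(compare_kyc(form, scanned))
```

An empty result means the submitted data agrees with the document. Token-based matching is deliberately lenient about name order, so stricter rules may be needed depending on your compliance requirements.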

Conclusion

And that's it. Scanning government-issued documents for your app is as easy as it gets. You don't need advanced Machine Learning or DevOps skills, since the endpoint does all the heavy lifting for you and lets you automate the scanning process for your KYC tasks. If you would like to test with different languages, or to try other PixLab endpoints, please check them out in our GitHub repository.

symisc / pixlab

PixLab Resources & Code Samples

PixLab & FACEIO Guides, Announcements & Tutorials
