<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sujal Pandey</title>
    <description>The latest articles on DEV Community by Sujal Pandey (@sujal58).</description>
    <link>https://dev.to/sujal58</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3308373%2Fb0beb065-142e-4c2a-91b7-d8acfe597cda.jpg</url>
      <title>DEV Community: Sujal Pandey</title>
      <link>https://dev.to/sujal58</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sujal58"/>
    <language>en</language>
    <item>
      <title>Auto KYC Verification with: How I Built a Smarter Identity Check System</title>
      <dc:creator>Sujal Pandey</dc:creator>
      <pubDate>Mon, 30 Jun 2025 12:01:42 +0000</pubDate>
      <link>https://dev.to/sujal58/auto-kyc-verification-with-how-i-built-a-smarter-identity-check-system-25aa</link>
      <guid>https://dev.to/sujal58/auto-kyc-verification-with-how-i-built-a-smarter-identity-check-system-25aa</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;“Upload your ID, selfie, and personal details and wait 24 to 48 hours for verification.” That’s the traditional KYC process. But in the future? We can do better.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Manual KYC verification not only slowed the onboarding process but also created friction for users. So, I decided to take matters into my own hands.&lt;br&gt;
In this post, I’ll walk you through how I built an Auto KYC (Know Your Customer) Verification System using a combination of OpenCV, Tesseract, and DeepFace (FaceNet) to create a faster, smarter, and more secure identity check process.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1dm0muqrrynhxny9utyi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1dm0muqrrynhxny9utyi.png" alt="Person filling form to submit the kyc for verification." width="800" height="600"&gt;&lt;/a&gt;&lt;br&gt;
Photo by &lt;a href="https://unsplash.com/@romaindancre" rel="noopener noreferrer"&gt;Romain Dancre&lt;/a&gt; on &lt;a href="https://unsplash.com/" rel="noopener noreferrer"&gt;Unsplash&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  What is KYC and Why Automate It?
&lt;/h3&gt;

&lt;p&gt;KYC (Know Your Customer) is a standard process in fintech and crowdfunding to verify a user’s identity. Traditionally, this involves:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Uploading a valid government-issued document (e.g., citizenship, license)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Uploading a selfie&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Waiting for a human reviewer to verify both&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This manual method is time-consuming, costly, and prone to human error. Automating it not only reduces overhead but also enhances user experience.&lt;/p&gt;
&lt;h3&gt;
  
  
  The Problem with Manual KYC
&lt;/h3&gt;

&lt;p&gt;The typical KYC process is slow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Users upload their details, documents, and selfies.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A human manually cross-checks everything.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It takes hours or even days.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That doesn’t scale, especially when users expect instant access. I wanted to create a system where users could upload their personal details, citizenship/ID image, and selfie, and let the system automatically verify two things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Do the details the user entered match the text extracted from the document?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Does the face on the document match the selfie?&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Can Emerging Technologies Solve This?
&lt;/h3&gt;

&lt;p&gt;Manual KYC processes have always been resource-heavy — requiring human verification, document handling, and judgment-based approval. But in an era where AI and automation are becoming mainstream, there’s a clear opportunity to streamline identity verification using emerging technologies.&lt;/p&gt;

&lt;p&gt;That’s where Computer Vision comes in.&lt;/p&gt;

&lt;p&gt;By leveraging OCR (Optical Character Recognition) and Facial Recognition, we can intelligently extract and verify identity data from uploaded documents and photos — with minimal human intervention.&lt;br&gt;
Modern open-source libraries like Tesseract, OpenCV, and DeepFace make it possible to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Automatically read and extract text from scanned ID cards&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Detect faces from document photos and selfies&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Compare facial features to ensure that the same person is present in both&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  How My System Aims to Solve This
&lt;/h3&gt;

&lt;p&gt;The system I’m building aims to do just that — with a workflow that looks like this:&lt;/p&gt;
&lt;h4&gt;
  
  
  1. User submits:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Their basic personal information (e.g., name, DOB)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A scanned ID document&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A selfie&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  2. The system:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Uses Tesseract to extract text from the document&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Applies OpenCV to detect and crop faces from both images&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Uses DeepFace (with the FaceNet model) to compare the selfie and document photo&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Cross-verifies the form data with OCR-extracted data and the selfie with the ID face&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  3. Based on this, it either:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Automatically approves the KYC request&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Flags the submission for manual review&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This intelligent approach reduces verification time from hours to seconds without compromising trust or security.&lt;/p&gt;
&lt;h3&gt;
  
  
  Tech Stack Overview
&lt;/h3&gt;

&lt;p&gt;Here’s what I used to implement my KYC automation pipeline:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;OpenCV — For image pre-processing and face detection&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Tesseract OCR — To extract text from ID cards&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;DeepFace (FaceNet) — For comparing the ID photo with a selfie&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Spring Boot + ReactJS — Backend and frontend integration&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;PostgreSQL — Storing KYC metadata&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Step-by-Step: How Auto KYC Works
&lt;/h3&gt;

&lt;p&gt;Let’s break down the flow of the auto-verification process:&lt;/p&gt;
&lt;h4&gt;
  
  
  1. User Uploads KYC Document &amp;amp; Selfie
&lt;/h4&gt;

&lt;p&gt;We allow users to upload:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;An image of their citizenship card&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A selfie&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are sent to the backend in a multipart/form-data request, where the verification logic begins.&lt;/p&gt;
&lt;h4&gt;
  
  
  2. Preprocessing with OpenCV
&lt;/h4&gt;

&lt;p&gt;Before any recognition or comparison, I apply preprocessing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import cv2

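# Load the uploaded ID image (cv2.imread returns None if the file cannot be read)
img = cv2.imread('citizenship.jpg')
# Convert to grayscale, then blur lightly to suppress noise before OCR
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
denoised = cv2.GaussianBlur(gray, (5, 5), 0)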
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why this matters:&lt;/strong&gt;&lt;br&gt;
Grayscale conversion and a light Gaussian blur reduce noise&lt;br&gt;
Cleaner input tends to improve OCR and face detection accuracy&lt;/p&gt;
&lt;h4&gt;
  
  
  3. Text Extraction via Tesseract
&lt;/h4&gt;

&lt;p&gt;Once the image is preprocessed, I pass it to Tesseract OCR to extract information like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Name&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Citizenship Number&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Date of Birth&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import pytesseract

text = pytesseract.image_to_string(denoised)
print(text)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
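&lt;p&gt;Tesseract returns one unstructured string, so the individual fields still have to be pulled out of it. Here is a minimal sketch of that step, assuming the card prints English labels such as "Name:", "Citizenship No:", and "DOB:". The labels, the patterns, the made-up sample text, and the &lt;code&gt;parse_id_fields&lt;/code&gt; helper are all hypothetical and would need adjusting to the real card layout:&lt;/p&gt;

```python
import re

# Hypothetical label patterns; a real card layout will differ.
FIELD_PATTERNS = {
    "name": re.compile(r"Name\s*[:\-]\s*(.+)", re.IGNORECASE),
    "citizenship_number": re.compile(r"Citizenship\s*No\.?\s*[:\-]\s*([\d\-/]+)", re.IGNORECASE),
    "date_of_birth": re.compile(r"(?:DOB|Date\s*of\s*Birth)\s*[:\-]\s*([\d\-/]+)", re.IGNORECASE),
}


def parse_id_fields(ocr_text):
    """Return the fields found in raw OCR text; missing fields map to None."""
    fields = {}
    for key, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        fields[key] = match.group(1).strip() if match else None
    return fields


# Made-up sample standing in for pytesseract.image_to_string() output.
sample = "Name: Ram Bahadur Thapa\nCitizenship No: 12-34-56-78901\nDOB: 1998-04-12\n"
print(parse_id_fields(sample))
```

&lt;p&gt;Returning None for a missing field keeps the later checks simple: any field that could not be read can send the submission to manual review instead of failing silently.&lt;/p&gt;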

&lt;h4&gt;
  
  
  4. Face Detection and Cropping
&lt;/h4&gt;

&lt;p&gt;Using OpenCV’s Haar cascades or DNN modules, I detect and crop the face from both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;ID document&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Selfie&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import cv2

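# Load the ID image and build the grayscale copy the detector expects
doc_img = cv2.imread('citizenship.jpg')
gray = cv2.cvtColor(doc_img, cv2.COLOR_BGR2GRAY)

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
if len(faces) == 0:
    raise ValueError('No face detected in the document image')

# Get the largest face
x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
face_crop = doc_img[y:y+h, x:x+w]

# Resize to 224x224 (helps DeepFace)
face_resized = cv2.resize(face_crop, (224, 224), interpolation=cv2.INTER_CUBIC)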
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This step is critical because FaceNet requires clean face crops to compute accurate embeddings.&lt;/p&gt;
&lt;h4&gt;
  
  
  5. Face Matching with DeepFace (FaceNet)
&lt;/h4&gt;

&lt;p&gt;Here comes the magic.&lt;br&gt;
I use DeepFace’s FaceNet backend to generate embeddings for both cropped faces, and then calculate the cosine distance between them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from deepface import DeepFace
# verify() returns a dict that includes 'verified' and 'distance'
result = DeepFace.verify(img1_path="id_face.jpg", img2_path="selfie.jpg", model_name='Facenet')
print(result)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;&lt;p&gt;If the distance &amp;lt; threshold (e.g. 0.4), the faces match&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;This means the selfie is likely from the same person as on the ID&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
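&lt;p&gt;To make the threshold rule concrete, here is a small standalone sketch of the comparison. DeepFace handles this internally, so this is only an illustration: the &lt;code&gt;cosine_distance&lt;/code&gt; and &lt;code&gt;faces_match&lt;/code&gt; helpers are hypothetical, and the toy 3-dimensional vectors stand in for real face embeddings.&lt;/p&gt;

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two equal-length vectors: 1 - cos(angle)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def faces_match(emb_a, emb_b, threshold=0.4):
    """True when the distance falls below the threshold (0.4 as in the text)."""
    return threshold > cosine_distance(emb_a, emb_b)


# Toy vectors, not real embeddings.
print(faces_match([1.0, 0.0, 0.0], [0.9, 0.1, 0.0]))  # nearly parallel vectors
print(faces_match([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # orthogonal vectors
```

&lt;p&gt;A distance near 0 means the two embeddings point in almost the same direction, while a distance near 1 means they are unrelated. The right threshold depends on the model and on how many false matches the platform can tolerate.&lt;/p&gt;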

&lt;h3&gt;
  
  
  Final Decision Logic
&lt;/h3&gt;

&lt;p&gt;The user is marked as KYC verified only when all of these checks pass:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Text is extracted correctly (e.g. name, citizenship number)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Face match confidence is high&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The name from OCR matches the user-entered full name&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
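&lt;p&gt;The decision logic above can be sketched as a single function. This is an illustration under assumptions, not the production code: the &lt;code&gt;KycChecks&lt;/code&gt; fields, the &lt;code&gt;decide&lt;/code&gt; helper, the status strings, and the simple case-insensitive name comparison are all hypothetical.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class KycChecks:
    """Inputs to the decision; all field names here are illustrative."""
    form_name: str
    ocr_name: str
    ocr_citizenship_number: str
    face_distance: float


def normalize(text):
    """Case-fold and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.casefold().split())


def decide(checks, face_threshold=0.4):
    """Return 'APPROVED' only if every check passes, else 'MANUAL_REVIEW'."""
    text_extracted = bool(checks.ocr_name.strip()) and bool(checks.ocr_citizenship_number.strip())
    names_match = normalize(checks.form_name) == normalize(checks.ocr_name)
    face_ok = face_threshold > checks.face_distance
    if text_extracted and names_match and face_ok:
        return "APPROVED"
    return "MANUAL_REVIEW"


# Made-up inputs: the second one simulates an OCR misread of the name.
print(decide(KycChecks("Ram Thapa", "RAM  THAPA", "12-34-56", 0.21)))
print(decide(KycChecks("Ram Thapa", "Rarn Thapa", "12-34-56", 0.21)))
```

&lt;p&gt;Falling back to manual review on any failed check keeps the automatic path strict: a misread name or a weak face match never results in an automatic approval.&lt;/p&gt;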

&lt;h3&gt;
  
  
  Challenges Faced
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1. Poor Image Quality
&lt;/h4&gt;

&lt;p&gt;Some users uploaded blurry or low-light images. Applying CLAHE (contrast-limited adaptive histogram equalization) and adaptive thresholding during preprocessing helped improve OCR and face detection.&lt;/p&gt;

&lt;h4&gt;
  
  
  2. OCR Misreads
&lt;/h4&gt;

&lt;p&gt;Tesseract isn’t perfect — especially with fonts used in Nepali citizenship cards. I built a fallback where users can manually edit extracted fields before submission.&lt;/p&gt;

&lt;h3&gt;
  
  
  What About Security?
&lt;/h3&gt;

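&lt;ul&gt;
&lt;li&gt;&lt;p&gt;All images are encrypted and stored temporarily&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Verification happens in memory; nothing is stored permanently unless KYC succeeds&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Sensitive data (like extracted text) is masked during logging&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Every KYC API is served over HTTPS and requires JWT authentication&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;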
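&lt;p&gt;As one way to picture the log masking mentioned above, here is a minimal sketch using Python's standard &lt;code&gt;logging&lt;/code&gt; module. The masking rule (hide every run of four or more digits except its last two), the &lt;code&gt;mask_digits&lt;/code&gt; helper, and the &lt;code&gt;MaskDigitsFilter&lt;/code&gt; class are assumptions for illustration, not the rule the platform actually uses.&lt;/p&gt;

```python
import logging
import re

_DIGITS = re.compile(r"\d{4,}")


def mask_digits(text):
    """Replace each run of 4+ digits with asterisks, keeping its last two digits."""
    return _DIGITS.sub(lambda m: "*" * (len(m.group(0)) - 2) + m.group(0)[-2:], text)


class MaskDigitsFilter(logging.Filter):
    """Logging filter that masks long digit runs in the final message."""

    def filter(self, record):
        # Format the message first so digits passed as arguments are masked too.
        record.msg = mask_digits(record.getMessage())
        record.args = ()
        return True


logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
logger = logging.getLogger("kyc")
logger.addFilter(MaskDigitsFilter())

# Made-up number; the log line shows only its last two digits.
logger.info("OCR extracted citizenship number %s", "1234567890")
```

&lt;p&gt;Attaching the filter to the logger means individual log calls do not have to remember to mask values themselves.&lt;/p&gt;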

&lt;h3&gt;
  
  
  What’s Next?
&lt;/h3&gt;

&lt;p&gt;Some exciting upgrades I’m planning:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Liveness Detection: To prevent photo spoofing&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Nepali OCR: To support native-script ID cards&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Real-World Impact
&lt;/h3&gt;

&lt;p&gt;This auto-verification system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Reduced verification time from hours to seconds&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Improved accuracy by combining textual and facial matching&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Allowed scaling to hundreds of verifications daily without human intervention&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And most importantly, users loved the instant onboarding experience.&lt;/p&gt;

&lt;h3&gt;
  
  
  Real-World Use Cases of Face Recognition
&lt;/h3&gt;

&lt;p&gt;Face recognition goes far beyond just KYC. It’s already transforming multiple industries with practical and impactful applications:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Banking &amp;amp; FinTech&lt;br&gt;
Used for remote KYC, fraud detection, and secure account recovery. Customers no longer need to visit a bank or fintech office in person to update their KYC.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;E-commerce&lt;br&gt;
Enables secure logins, customer identity verification, and personalized shopping experiences.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Healthcare&lt;br&gt;
Helps in patient check-ins, record matching, and reducing administrative overhead.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Travel&lt;br&gt;
Facilitates faster airport check-ins, e-passport systems, and border control automation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Security &amp;amp; Surveillance&lt;br&gt;
Provides real-time face detection and matching for access control and public safety.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Face-Verified Smart Card Attendance System&lt;br&gt;
A student taps their ID card (RFID/NFC) to mark attendance, and the system also uses face recognition to verify that the person tapping the card is its actual owner.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  How It Fits My Use Case (Crowdfunding Platform)
&lt;/h3&gt;

&lt;p&gt;In the context of my crowdfunding platform, facial recognition is a game-changer. Here’s how:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Prevents fake campaigns and fraudulent actors from misusing the platform.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ensures each user is genuinely who they claim to be by matching ID and selfie.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Speeds up user onboarding with instant verification and no manual review bottlenecks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Builds trust between donors and campaign creators, which is especially crucial when money and social impact are involved.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In short, face recognition doesn’t just check an identity; it helps protect the entire system’s integrity.&lt;/p&gt;

&lt;h3&gt;
  
  
  Final Thoughts
&lt;/h3&gt;

&lt;p&gt;Building an auto KYC system was one of the most technically rewarding parts of my crowdfunding platform. It wasn’t just about writing code. It was about building trust at scale, solving real problems, saving users time, making onboarding seamless, and ensuring security in a world moving faster every day.&lt;/p&gt;

&lt;p&gt;If you’re building anything in fintech, banking, or even decentralized apps — I’d highly recommend exploring automated KYC with OpenCV + OCR + Face Verification.&lt;/p&gt;

&lt;p&gt;Let the code do the boring work — and let humans focus on what matters.&lt;/p&gt;

&lt;h3&gt;
  
  
  Let’s Collaborate!
&lt;/h3&gt;

&lt;p&gt;If you’re working on something similar — or have ideas to improve OCR accuracy, face matching, or KYC workflows — I’d love to chat!&lt;br&gt;
 Feel free to connect with me on LinkedIn or drop a message here.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Thanks for reading — and stay tuned!&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>ai</category>
      <category>automation</category>
    </item>
  </channel>
</rss>
