Xiao Ling

Posted on • Originally published at dynamsoft.com

Leveraging Dynamic Web TWAIN's New OCR API for Modern Document Management

Organizations of all sizes face a common challenge: efficiently managing the large volume of documents that power their operations. Manual data entry remains a persistent bottleneck that is time-consuming, error-prone, and costly. Optical Character Recognition (OCR) addresses this by converting scanned documents into searchable, editable text for digital workflows.

Dynamsoft's Dynamic Web TWAIN has long been a leader in web-based document scanning. With its new OCRKit addon, developers can now seamlessly integrate powerful OCR capabilities into their web applications, revolutionizing document management workflows across industries.

This tutorial will guide you through building a document management application using Dynamic Web TWAIN's OCR API, showcasing its key features and benefits.

Demo Video: Document OCR and PDF Saving

Prerequisites

Step 1: Setting Up the Development Environment

  1. Install the Dynamic Web TWAIN SDK.
  2. Extract DynamicWebTWAINOCRResources.zip.
  3. Unzip DynamicWebTWAINOCRPack.zip and run Install.cmd as administrator to copy the necessary model files to the correct locations.

  4. Copy the Resources folder from the Dynamic Web TWAIN installation directory to your project directory.

  5. Copy the dynamsoft.webtwain.addon.ocrkit.js file from DynamicWebTWAINOCRResources to the Resources\addon folder in your project.

Step 2: Understanding the Project Structure

The application follows a simple, maintainable structure:

├── index.html          # Main HTML structure
├── css/
│   └── style.css       # Modern CSS with variables
├── js/
│   └── app.js          # Core application logic
├── Resources/          # Dynamic Web TWAIN SDK files

The heart of the application is js/app.js, which integrates Dynamic Web TWAIN's OCR API into the workflow.

Step 3: Exploring the OCR API Integration

Let's dive into the key parts of the code that demonstrate Dynamic Web TWAIN's OCR capabilities.

Initializing the OCR Addon

First, ensure the OCR addon is properly loaded. The application checks this in the OnWebTwainReady event:

Dynamsoft.DWT.RegisterEvent("OnWebTwainReady", function () {
    DWTObject = Dynamsoft.DWT.GetWebTwain("dwtcontrolContainer");

    checkOCRInstalled();
});
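
The body of checkOCRInstalled() is not shown in this article. As a minimal sketch, assuming all you need is a guard that the add-on script has attached itself to the Web TWAIN object, a feature check could look like the following. The helper name hasOCRKit is hypothetical and not part of the SDK, and it only confirms that the JavaScript add-on is loaded; it does not verify that the model files copied by Install.cmd are present.

```javascript
// Hypothetical helper (not part of the SDK): returns true when the OCRKit
// add-on object is attached and exposes the Recognize method that this
// tutorial calls later.
function hasOCRKit(dwtObject) {
    return Boolean(
        dwtObject &&
        dwtObject.Addon &&
        dwtObject.Addon.OCRKit &&
        typeof dwtObject.Addon.OCRKit.Recognize === "function"
    );
}
```

Calling hasOCRKit(DWTObject) before enabling the OCR buttons avoids a confusing "undefined is not a function" error when the add-on file was not copied into Resources\addon.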

Recognizing Text from Images

The core OCR functionality is implemented in the recognizeOnePage function, which demonstrates the API's simplicity:

async function recognizeOnePage(index) {
    let language = document.getElementById("language").value;
    let result = await DWTObject.Addon.OCRKit.Recognize(index, { settings: { language: language } });
    await saveOCRResult(result.imageID, result);
    printPrettyResult(result);
}

Key API Features Highlighted:

  • Multiple Language Support: Recognize text in English, French, Spanish, German, Italian, and Portuguese
  • Structured Results: Returns organized text data with blocks, lines, and words
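
To make the structured result more concrete, here is a small sketch that flattens blocks, lines, and words into plain text. The property names blocks, lines, words, and value are assumptions made for illustration; check the SDK documentation for the exact result shape. The sample object is made up.

```javascript
// Assumed result shape (field names are illustrative, not confirmed):
// { blocks: [ { lines: [ { words: [ { value: "..." } ] } ] } ] }
function ocrResultToText(result) {
    const lines = [];
    for (const block of result.blocks || []) {
        for (const line of block.lines || []) {
            // Join the words of one line with single spaces.
            const words = (line.words || []).map((word) => word.value);
            lines.push(words.join(" "));
        }
        lines.push(""); // blank line between blocks
    }
    return lines.join("\n").trim();
}

// Made-up sample data with two blocks of one line each.
const sample = {
    blocks: [
        { lines: [{ words: [{ value: "Invoice" }, { value: "2024" }] }] },
        { lines: [{ words: [{ value: "Total:" }, { value: "42" }] }] },
    ],
};
console.log(ocrResultToText(sample));
```

Keeping the hierarchy until the last moment lets the same result feed both a plain-text view and a layout-aware one.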

Step 4: Implementing OCR Workflow

Let's walk through a complete document processing workflow:

Step 4.1: Scanning or Loading Images

The application provides two ways to get images into the buffer: scanning documents from a connected scanner or loading images from your local machine.

// Scan from connected device
async function acquireImage() {
    if (!DWTObject) return;
    try {
        // Let the user pick a scanner, then acquire and release the source.
        await DWTObject.SelectSourceAsync();
        await DWTObject.AcquireImageAsync({
            IfCloseSourceAfterAcquire: true,
        });
    } catch (exp) {
        alert(exp.message);
    }
}

// Load from local files
function loadImage() {
    DWTObject.IfShowFileDialog = true;
    DWTObject.LoadImageEx(
        "",
        Dynamsoft.DWT.EnumDWT_ImageType.IT_ALL,
        function () {
            console.log("Image loaded successfully");
        },
        function (errorCode, errorString) {
            console.error(errorString);
        }
    );
}

Step 4.2: Correcting Image Orientation

OCR is sensitive to character orientation. To ensure accurate results, call the DetectPageOrientation() method to detect the page's rotation angle, then rotate the image back with Rotate() before recognition:

async function correctOrientationForOne(index) {
    let result = await DWTObject.Addon.OCRKit.DetectPageOrientation(index);
    if (result.angle != 0) {
        DWTObject.Rotate(index, -result.angle, true);
    }
}
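
When a whole batch has been scanned, the same check can be applied to every page. The following is a sketch only: correctOrientationForAll is a hypothetical name, and the Web TWAIN object is passed in as a parameter so the loop can be tried against a stub. It uses only the calls shown above plus HowManyImagesInBuffer.

```javascript
// Sketch: detect and correct the orientation of every page in the buffer.
// `dwtObject` is passed in (rather than read from a global) so the
// function can be exercised with a stub object.
async function correctOrientationForAll(dwtObject) {
    const rotated = [];
    const count = dwtObject.HowManyImagesInBuffer;
    for (let i = 0; i < count; i++) {
        const result = await dwtObject.Addon.OCRKit.DetectPageOrientation(i);
        if (result.angle !== 0) {
            // Same correction as the single-page version above.
            dwtObject.Rotate(i, -result.angle, true);
            rotated.push(i);
        }
    }
    return rotated; // indices of the pages that were rotated
}
```

Returning the rotated indices makes it easy to tell the user which pages were changed.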

Step 4.3: Performing OCR Recognition

When you click "Recognize Text", the application calls:

async function recognize() {
    if (document.getElementById("processingTarget").value === "all") {
        let count = DWTObject.HowManyImagesInBuffer;
        for (let i = 0; i < count; i++) {
            await recognizeOnePage(i);
        }
    } else {
        await recognizeOnePage(DWTObject.CurrentImageIndexInBuffer);
    }
}
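
In the loop above, one page that fails recognition rejects the whole batch. As a hedged alternative, the sketch below keeps going and collects the failures. The name recognizePages and the shape of the returned entries are assumptions; recognizePage can be any async function that takes a page index, such as recognizeOnePage.

```javascript
// Sketch: run a per-page recognizer over `count` pages and continue
// when a single page fails, collecting the errors instead of aborting.
async function recognizePages(count, recognizePage) {
    const failures = [];
    for (let i = 0; i < count; i++) {
        try {
            await recognizePage(i);
        } catch (error) {
            failures.push({ index: i, message: error.message });
        }
    }
    return failures; // empty array when every page succeeded
}
```

In the application this could be called as recognizePages(DWTObject.HowManyImagesInBuffer, recognizeOnePage), with the returned list shown to the user once the batch finishes.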

Step 4.4: Exporting to PDF

The application can export documents to PDF with or without a text layer:

async function saveAsPDF() {
    try {
        // Pick the output format from the UI selection.
        let format = document.getElementById('outputFormat').value;
        let outputFormat = format === "extralayer"
            ? Dynamsoft.DWT.EnumDWT_OCRKitOutputFormat.PDF_WITH_EXTRA_TEXTLAYER
            : Dynamsoft.DWT.EnumDWT_OCRKitOutputFormat.PDF_PLAIN_TEXT;
        await DWTObject.Addon.OCRKit.SaveToPath(
            DWTObject.SelectAllImages(),
            outputFormat,
            "document"
        );
    } catch (error) {
        alert(error.message);
    }
}

Step 5: Managing Images and OCR Results

The application provides intuitive tools for image management:

  • Remove Selected Image: Deletes the current image and its OCR results
  • Remove All Images: Clears all images and OCR data

async function removeSelectedImage() {
    let currentImageId = DWTObject.IndexToImageID(DWTObject.CurrentImageIndexInBuffer);
    DWTObject.RemoveImage(DWTObject.CurrentImageIndexInBuffer);
    await deleteOCRResult(currentImageId); // Remove from IndexedDB
    // Update UI...
}
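
The saveOCRResult and deleteOCRResult helpers are not listed in this article; the comment above indicates they are backed by IndexedDB. To illustrate the keying scheme only, here is an in-memory stand-in built on a Map. The function bodies are assumptions, and getOCRResult is a hypothetical addition. Results are keyed by image ID rather than buffer index because indices shift when an image is removed, which is why removeSelectedImage reads the ID before calling RemoveImage.

```javascript
// In-memory stand-in for the article's IndexedDB-backed helpers.
// The names saveOCRResult/deleteOCRResult match the calls in the article;
// the bodies and getOCRResult are assumptions made for illustration.
const ocrResults = new Map();

async function saveOCRResult(imageId, result) {
    ocrResults.set(imageId, result);
}

async function getOCRResult(imageId) {
    return ocrResults.get(imageId); // undefined when nothing is stored
}

async function deleteOCRResult(imageId) {
    return ocrResults.delete(imageId); // true when an entry was removed
}
```

The functions are async so that callers written against this sketch keep working if the Map is later swapped for a real IndexedDB store.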

Running the Application

  1. Set the license key in Resources/dynamsoft.webtwain.config.js:

    Dynamsoft.DWT.ProductKey = "LICENSE-KEY";
    
  2. Start a web server in the project directory:

    # Using Python 3
    python -m http.server 8000
    
    # Using Node.js with http-server
    npx http-server -p 8000
    
  3. Open your web browser and navigate to http://localhost:8000.

Industry Benefits of Dynamic Web TWAIN's OCR API

Now that you've seen the API in action, let's look at how it benefits document-heavy industries:

1. Enhanced Efficiency

  • Automates data extraction from scanned documents
  • Reduces manual data entry by up to 90%
  • Accelerates document processing workflows

2. Improved Accuracy

  • Advanced OCR algorithms with high recognition accuracy
  • Structured output format for reliable data extraction
  • Multiple language support for global operations

3. Accessibility & Searchability

  • Makes scanned documents text-searchable
  • Lets screen readers read scanned documents aloud for visually impaired users
  • Facilitates text-based document analysis

4. Cost Savings

  • Eliminates the need for expensive dedicated OCR software
  • Reduces storage costs by enabling text compression
  • Minimizes human error and associated correction costs

5. Compliance & Security

  • Local processing ensures data privacy
  • Audit trails for document processing
  • Supports regulatory requirements for document management

6. Flexible Integration

  • Seamless integration with web applications
  • Works with existing scanning hardware
  • Supports various output formats (PDF, text)

Real-World Use Cases

Dynamic Web TWAIN's OCR API is transforming workflows across industries:

  • Healthcare: Automating patient record digitization and data extraction
  • Finance: Processing invoices, receipts, and financial documents
  • Legal: Digitizing contracts and case files for text search
  • Education: Converting physical documents into accessible digital formats
  • Government: Streamlining form processing and citizen services

Source Code

https://github.com/yushulx/web-twain-document-scan-management/tree/main/examples/dynamic_web_twain_ocr
