Organizations of all sizes face a common challenge: efficiently managing the large volumes of documents that power their operations. Manual data entry is a persistent bottleneck. It is time-consuming, error-prone, and costly, which hurts productivity and increases expenses. Optical Character Recognition (OCR) technology addresses this by converting scanned documents into searchable, editable text for modern digital workflows.
Dynamsoft's Dynamic Web TWAIN has long been a leader in web-based document scanning. With its new OCRKit addon, developers can now seamlessly integrate powerful OCR capabilities into their web applications, revolutionizing document management workflows across industries.
This tutorial will guide you through building a document management application using Dynamic Web TWAIN's OCR API, showcasing its key features and benefits.
Demo Video: Document OCR and PDF Saving
Prerequisites
Step 1: Setting Up the Development Environment
- Install the Dynamic Web TWAIN SDK.
- Extract DynamicWebTWAINOCRResources.zip.
- Unzip DynamicWebTWAINOCRPack.zip and run Install.cmd as administrator to copy the necessary model files to the correct locations.
- Copy the Resources folder from the Dynamic Web TWAIN installation directory to your project directory.
- Copy the dynamsoft.webtwain.addon.ocrkit.js file from DynamicWebTWAINOCRResources to the Resources\addon folder in your project.
Step 2: Understanding the Project Structure
The application follows a simple, maintainable structure:
├── index.html          # Main HTML structure
├── css/
│   └── style.css       # Modern CSS with variables
├── js/
│   └── app.js          # Core application logic
└── Resources/          # Dynamic Web TWAIN SDK files
The heart of the application is js/app.js, which integrates Dynamic Web TWAIN's OCR API into the workflow.
Step 3: Exploring the OCR API Integration
Let's dive into the key parts of the code that demonstrate Dynamic Web TWAIN's OCR capabilities.
Initializing the OCR Addon
First, ensure the OCR addon is properly loaded. The application checks this in the OnWebTwainReady event:
Dynamsoft.DWT.RegisterEvent("OnWebTwainReady", function () {
    DWTObject = Dynamsoft.DWT.GetWebTwain("dwtcontrolContainer");
    checkOCRInstalled();
});
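The article does not show the body of checkOCRInstalled, so the following is not the author's implementation. As a minimal sketch, a plain feature-detection check can confirm that the object exposes the OCRKit entry point used later in this tutorial. The helper name hasOCRKit is hypothetical, and the check only looks for the Recognize method. It does not verify that the model files are installed.

```javascript
// Hypothetical helper (not part of the SDK): returns true when the given
// object exposes Addon.OCRKit.Recognize, the method this tutorial calls.
// It does NOT verify that the OCR model files are installed.
function hasOCRKit(dwtObject) {
  const kit = dwtObject && dwtObject.Addon && dwtObject.Addon.OCRKit;
  return Boolean(kit) && typeof kit.Recognize === "function";
}
```

A check like this could run inside the OnWebTwainReady handler to disable the OCR buttons when the addon script was not loaded.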
Recognizing Text from Images
The core OCR functionality is implemented in the recognizeOnePage function, which demonstrates the API's simplicity:
async function recognizeOnePage(index) {
    let language = document.getElementById("language").value;
    let result = await DWTObject.Addon.OCRKit.Recognize(index, { settings: { language: language } });
    await saveOCRResult(result.imageID, result);
    printPrettyResult(result);
}
Key API Features Highlighted:
- Multiple Language Support: Recognize text in English, French, Spanish, German, Italian, and Portuguese
- Structured Results: Returns organized text data with blocks, lines, and words
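To make the "blocks, lines, and words" structure concrete, here is a minimal sketch that flattens a result into plain text. The field names blocks, lines, words, and value are assumptions based on the description above and are not confirmed by this article, so check them against the object that Recognize actually returns. The function name resultToText is hypothetical.

```javascript
// Assumed result shape (not confirmed by the article):
// { blocks: [ { lines: [ { words: [ { value: "..." } ] } ] } ] }
function resultToText(result) {
  const lines = [];
  for (const block of result.blocks || []) {
    for (const line of block.lines || []) {
      // Join the words of one line with single spaces
      const words = (line.words || []).map((word) => word.value);
      lines.push(words.join(" "));
    }
  }
  // One output line per recognized text line
  return lines.join("\n");
}
```

A helper along these lines could back a function such as printPrettyResult, which the article calls but does not show.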
Step 4: Implementing OCR Workflow
Let's walk through a complete document processing workflow:
Step 4.1: Scanning or Loading Images
The application provides two ways to get images into the buffer: scanning documents from a connected scanner or loading images from your local machine.
// Scan from connected device
async function acquireImage() {
    if (!DWTObject) return;
    try {
        // Let the user pick a scanner, then wait for the scan to finish
        await DWTObject.SelectSourceAsync();
        await DWTObject.AcquireImageAsync({
            IfCloseSourceAfterAcquire: true,
        });
    } catch (exp) {
        alert(exp.message);
    }
}
// Load from local files
function loadImage() {
    DWTObject.IfShowFileDialog = true;
    DWTObject.LoadImageEx(
        "",
        Dynamsoft.DWT.EnumDWT_ImageType.IT_ALL,
        function () {
            console.log("Image loaded successfully");
        },
        function (errorCode, errorString) {
            console.error(errorString);
        }
    );
}
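Because LoadImageEx reports its outcome through callbacks while the rest of the workflow uses async/await, it can be convenient to wrap it in a Promise. The sketch below assumes only the four-argument call shown above. The wrapper name loadImageAsync is hypothetical, and the object is passed in as a parameter so the function can be exercised with a stub.

```javascript
// Hypothetical wrapper: turns the callback-style LoadImageEx call shown
// above into a Promise so it can be awaited alongside the other steps.
function loadImageAsync(dwtObject, imageType) {
  return new Promise((resolve, reject) => {
    dwtObject.IfShowFileDialog = true;
    dwtObject.LoadImageEx(
      "",
      imageType,
      () => resolve(),
      (errorCode, errorString) =>
        reject(new Error(`${errorCode}: ${errorString}`))
    );
  });
}
```

In the application this might be called as `await loadImageAsync(DWTObject, Dynamsoft.DWT.EnumDWT_ImageType.IT_ALL)`.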
Step 4.2: Correcting Image Orientation
OCR is sensitive to character orientation. To ensure accurate results, call the DetectPageOrientation() method to detect and correct the document orientation before recognition:
async function correctOrientationForOne(index) {
    let result = await DWTObject.Addon.OCRKit.DetectPageOrientation(index);
    // Rotate back by the detected angle only when the page is not upright
    if (result.angle !== 0) {
        DWTObject.Rotate(index, -result.angle, true);
    }
}
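The same correction can be applied to every page in the buffer. The following sketch uses only the calls that appear in this tutorial (HowManyImagesInBuffer, DetectPageOrientation, Rotate). The function name correctOrientationForAll and the returned list of rotated indices are illustrative additions, not part of the SDK.

```javascript
// Hypothetical batch version: checks each page in turn and rotates the
// ones that are not upright. Returns the indices that were rotated.
async function correctOrientationForAll(dwtObject) {
  const rotated = [];
  const count = dwtObject.HowManyImagesInBuffer;
  for (let i = 0; i < count; i++) {
    const result = await dwtObject.Addon.OCRKit.DetectPageOrientation(i);
    if (result.angle !== 0) {
      dwtObject.Rotate(i, -result.angle, true);
      rotated.push(i);
    }
  }
  return rotated;
}
```

Pages are processed sequentially here, mirroring the page-by-page loop the application uses for recognition.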
Step 4.3: Performing OCR Recognition
When you click "Recognize Text", the application calls:
async function recognize() {
    if (document.getElementById("processingTarget").value === "all") {
        let count = DWTObject.HowManyImagesInBuffer;
        for (let i = 0; i < count; i++) {
            await recognizeOnePage(i);
        }
    } else {
        await recognizeOnePage(DWTObject.CurrentImageIndexInBuffer);
    }
}
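In the loop above, an exception on one page stops recognition of the remaining pages. If that is not the desired behavior, one possible variation is to catch errors per page and report them at the end. This is a sketch of that alternative, not the article's code. The name recognizePages is hypothetical, and the per-page function is passed in as a parameter.

```javascript
// Hypothetical variation: keeps going when one page fails and returns a
// list of { index, message } entries for the pages that could not be read.
async function recognizePages(indices, recognizeOne) {
  const failures = [];
  for (const index of indices) {
    try {
      await recognizeOne(index);
    } catch (error) {
      failures.push({ index, message: error.message });
    }
  }
  return failures;
}
```

With the application's own function this might be invoked as `await recognizePages([0, 1, 2], recognizeOnePage)`.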
Step 4.4: Exporting to PDF
The application can export documents to PDF with or without a text layer:
async function saveAsPDF() {
    try {
        let format = document.getElementById('outputFormat').value;
        if (format === "extralayer") {
            let indicesOfAll = DWTObject.SelectAllImages();
            await DWTObject.Addon.OCRKit.SaveToPath(
                indicesOfAll,
                Dynamsoft.DWT.EnumDWT_OCRKitOutputFormat.PDF_WITH_EXTRA_TEXTLAYER,
                "document"
            );
        } else {
            await DWTObject.Addon.OCRKit.SaveToPath(
                DWTObject.SelectAllImages(),
                Dynamsoft.DWT.EnumDWT_OCRKitOutputFormat.PDF_PLAIN_TEXT,
                "document"
            );
        }
    } catch (error) {
        alert(error.message);
    }
}
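The two branches above differ only in the output format constant, so the selection can be factored into a small helper. The sketch below takes the enum object as a parameter so it can run outside the browser. The function name pickOutputFormat is hypothetical. The member names PDF_WITH_EXTRA_TEXTLAYER and PDF_PLAIN_TEXT come from the code above, and their numeric values are deliberately not assumed.

```javascript
// Hypothetical helper: maps the dropdown value used in saveAsPDF to one of
// the two format constants. `formats` is expected to be
// Dynamsoft.DWT.EnumDWT_OCRKitOutputFormat in the real application.
function pickOutputFormat(formatValue, formats) {
  return formatValue === "extralayer"
    ? formats.PDF_WITH_EXTRA_TEXTLAYER
    : formats.PDF_PLAIN_TEXT;
}
```

With this in place, saveAsPDF would need a single SaveToPath call that passes `pickOutputFormat(format, Dynamsoft.DWT.EnumDWT_OCRKitOutputFormat)` as its second argument.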
Step 5: Managing Images and OCR Results
The application provides intuitive tools for image management:
- Remove Selected Image: Deletes the current image and its OCR results
- Remove All Images: Clears all images and OCR data
async function removeSelectedImage() {
    let currentImageId = DWTObject.IndexToImageID(DWTObject.CurrentImageIndexInBuffer);
    DWTObject.RemoveImage(DWTObject.CurrentImageIndexInBuffer);
    await deleteOCRResult(currentImageId); // Remove from IndexedDB
    // Update UI...
}
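The article stores OCR results in IndexedDB through saveOCRResult and deleteOCRResult but does not show those functions. To illustrate the idea of keying results by image ID, here is a minimal in-memory stand-in built on a Map. It is not the author's IndexedDB code. The name createOCRResultStore and its method names are hypothetical, and the methods are async only to mirror how an IndexedDB-backed version would be awaited.

```javascript
// Hypothetical in-memory stand-in for the IndexedDB store: OCR results are
// keyed by image ID so they can be removed together with the image.
function createOCRResultStore() {
  const results = new Map();
  return {
    async save(imageId, result) { results.set(imageId, result); },
    async get(imageId) { return results.get(imageId); },
    // Resolves to true when an entry existed and was removed
    async remove(imageId) { return results.delete(imageId); },
    async clear() { results.clear(); },
  };
}
```

Keying by image ID rather than by buffer index matters because indices shift when an image is removed, which is why removeSelectedImage looks up the ID before calling RemoveImage.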
Running the Application
- Set the license key in Resources/dynamsoft.webtwain.config.js:
Dynamsoft.DWT.ProductKey = "LICENSE-KEY";
- Start a web server in the project directory:
# Using Python 3
python -m http.server 8000
# Using Node.js with http-server
npx http-server -p 8000
- Open your web browser and navigate to http://localhost:8000.
Industry Benefits of Dynamic Web TWAIN's OCR API
Now that you've seen the API in action, let's explore how it benefits document-heavy industries:
1. Enhanced Efficiency
- Automates data extraction from scanned documents
- Reduces manual data entry by up to 90%
- Accelerates document processing workflows
2. Improved Accuracy
- Advanced OCR algorithms with high recognition accuracy
- Structured output format for reliable data extraction
- Multiple language support for global operations
3. Accessibility & Searchability
- Makes scanned documents text-searchable
- Enables screen readers for visually impaired users
- Facilitates text-based document analysis
4. Cost Savings
- Eliminates the need for expensive dedicated OCR software
- Reduces storage costs by enabling text compression
- Minimizes human error and associated correction costs
5. Compliance & Security
- Local processing ensures data privacy
- Audit trails for document processing
- Supports regulatory requirements for document management
6. Flexible Integration
- Seamless integration with web applications
- Works with existing scanning hardware
- Supports various output formats (PDF, text)
Real-World Use Cases
Dynamic Web TWAIN's OCR API is transforming workflows across industries:
- Healthcare: Automating patient record digitization and data extraction
- Finance: Processing invoices, receipts, and financial documents
- Legal: Digitizing contracts and case files for text search
- Education: Converting physical documents into accessible digital formats
- Government: Streamlining form processing and citizen services


