Read the original article: How to Implement OCR in HarmonyOS: A Step-by-Step Guide with Regex
Optical Character Recognition (OCR) extracts text from images, but raw text alone is rarely enough. In a HarmonyOS application, regular expressions (regex) let you pull specific fields, such as ID numbers, names, or dates, out of the recognized text.
In this guide, we’ll show how to:
- Set up the camera in a HarmonyOS app
- Capture and process images using the AI Kit
- Perform OCR with the Core Vision Kit
- Extract meaningful data using Regex
Let’s dive in.
Prerequisites
To follow along, make sure you have:
- DevEco Studio installed
- A HarmonyOS project with the following kits enabled:
  - CameraKit
  - CoreVisionKit
  - ImageKit
  - AbilityKit
- Basic TypeScript knowledge
Step 1: Set Up the Camera
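HarmonyOS provides camera access through CameraKit. The method below initializes the camera and starts a session; the helper methods it calls (getCameraDevices, getCameraInput, getPreviewOutput, getPhotoOutput, getCaptureSession, beginConfig, startSession) belong to the same component and are not shown here: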
async initCamera(surfaceId: string): Promise<void> {
  this.cameraMgr = camera.getCameraManager(getContext(this) as common.UIAbilityContext);
  let cameraArray = this.getCameraDevices(this.cameraMgr);
  // Assumes the first device is the back camera; check the device's position if that matters
  this.cameraDevice = cameraArray[0];
  this.cameraInput = this.getCameraInput(this.cameraDevice, this.cameraMgr)!;
  await this.cameraInput.open();
  this.capability = this.cameraMgr.getSupportedOutputCapability(this.cameraDevice, camera.SceneMode.NORMAL_PHOTO);
  this.previewOutput = this.getPreviewOutput(this.cameraMgr, this.capability, surfaceId)!;
  this.photoOutput = this.getPhotoOutput(this.cameraMgr, this.capability)!;

  // Register listener for photo capture
  this.photoOutput.on('photoAvailable', (err: BusinessError, photo: camera.Photo) => {
    if (err || photo === undefined) {
      return; // capture failed, nothing to recognize
    }
    const imageObj = photo.main;
    imageObj.getComponent(image.ComponentType.JPEG, async (compErr: BusinessError, component: image.Component) => {
      if (compErr || component === undefined) {
        imageObj.release();
        return;
      }
      const buffer = component.byteBuffer;
      this.idCardResult = await this.recognizeImage(buffer);
      this.result = JSON.stringify(this.idCardResult);
      imageObj.release(); // free the native image once the buffer has been processed
    });
  });

  // Set up photo session
  this.captureSession = this.getCaptureSession(this.cameraMgr)!;
  this.beginConfig(this.captureSession);
  this.startSession(this.captureSession, this.cameraInput, this.previewOutput, this.photoOutput);
}
You can then capture an image by calling:
async takePicture(): Promise<void> {
  // capture() returns a Promise; awaiting it lets callers see failures
  await this.photoOutput!.capture();
}
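The initCamera method takes cameraArray[0] and treats it as the back camera. The device list may not be ordered that way on every device, so it can be safer to select by position and fall back to the first entry. The sketch below uses stand-in types (CameraLike, Position) instead of the real CameraKit classes so that it runs anywhere; in a real app you would compare against the position field and enum that CameraKit exposes on its camera device objects, which should be checked against the CameraKit reference.

```typescript
// Stand-in types for illustration only; these are NOT the CameraKit classes.
type Position = 'back' | 'front' | 'unspecified';

interface CameraLike {
  id: string;
  position: Position;
}

// Prefer a back-facing camera; fall back to the first device, or undefined if the list is empty.
function pickBackCamera<T extends CameraLike>(devices: T[]): T | undefined {
  for (let i = 0; i < devices.length; i++) {
    if (devices[i].position === 'back') {
      return devices[i];
    }
  }
  return devices.length > 0 ? devices[0] : undefined;
}

// Made-up device list where the front camera happens to come first.
const sample: CameraLike[] = [
  { id: 'cam-front', position: 'front' },
  { id: 'cam-back', position: 'back' },
];
const chosen = pickBackCamera(sample);
console.log(chosen ? chosen.id : 'no camera');
```

Returning undefined for an empty list also gives the caller a clear place to show a "no camera available" message instead of failing later on an undefined device.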
Step 2: Perform OCR with CoreVisionKit
Once an image is captured, it is passed to the textRecognition.recognizeText() API for text recognition.
async recognizeImage(buffer: ArrayBuffer): Promise<IDCardData> {
  const imageResource = image.createImageSource(buffer);
  const pixelMapInstance = await imageResource.createPixelMap();
  const visionInfo: textRecognition.VisionInfo = { pixelMap: pixelMapInstance };
  const textConfig: textRecognition.TextRecognitionConfiguration = { isDirectionDetectionSupported: true };
  let recognitionString = '';
  try {
    if (canIUse('SystemCapability.AI.OCR.TextRecognition')) {
      const result = await textRecognition.recognizeText(visionInfo, textConfig);
      recognitionString = result.value;
    }
  } finally {
    // Release native resources even when OCR is unavailable or throws
    await pixelMapInstance.release();
    await imageResource.release();
  }
  return this.extractDataWithRegex(recognitionString);
}
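OCR output can arrive with uneven spacing and stray blank lines, which makes label-based patterns more brittle. A small cleanup pass before extractDataWithRegex may help. The helper below, normalizeOcrText, is a hypothetical addition and not part of Core Vision Kit; it only unifies line endings, collapses runs of spaces and tabs, trims each line, and drops empty lines.

```typescript
// Hypothetical helper (not part of any HarmonyOS kit): tidy raw OCR text
// before running label-based regexes over it.
function normalizeOcrText(raw: string): string {
  return raw
    .replace(/\r\n?/g, '\n')                              // unify line endings
    .split('\n')
    .map((line) => line.replace(/[ \t]+/g, ' ').trim())   // collapse spaces and tabs, trim ends
    .filter((line) => line.length > 0)                    // drop blank lines
    .join('\n');
}

// Made-up messy input for illustration.
const messy = '  Adı:   MEHMET \r\n\r\n\tSoyadı:\tYILMAZ  ';
console.log(normalizeOcrText(messy));
```

Line breaks are kept on purpose, since the label and its value usually sit on the same or adjacent lines and the patterns already tolerate whitespace between them.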
Step 3: Extract Specific Information Using Regex
OCR returns all the text in the image, but we usually want only certain parts (such as the ID number or name). That is where regex helps.
Here’s how we define patterns and extract matches:
const patterns: RegexPatterns = {
  tckn: /(?:T\.?\s*C\.?\s*Kimlik\s*No|TR\s*identity\s*No)[\s:]*([1-9]\d{10})/i,
  surname: /(?:Soyadı|Surname)[\s:]*([A-ZÇĞİÖŞÜ]+)/i,
  // \b keeps "Adı" from matching inside "Soyadı"
  name: /\b(?:Adı|Given Name)[\s:]*([A-ZÇĞİÖŞÜ]+)/i,
  birthDate: /(?:Doğum\s*Tarihi|Date\s*of\s*Birth)[\s:]*([\d./-]+)/i,
  // E/K (Erkek/Kadın) in Turkish, M/F in English
  gender: /(?:Cinsiyeti\s*\/\s*Gender)[\s:]*([EKMF])/i,
  documentNo: /(?:Seri\s*No|Document\s*No)[\s:]*([A-Z0-9]{5,})/i,
};

// Written as a method so that recognizeImage can call this.extractDataWithRegex
extractDataWithRegex(text: string): IDCardData {
  return {
    tckn: text.match(patterns.tckn)?.[1],
    name: text.match(patterns.name)?.[1],
    surname: text.match(patterns.surname)?.[1],
    birthDate: text.match(patterns.birthDate)?.[1],
    gender: text.match(patterns.gender)?.[1],
    documentNo: text.match(patterns.documentNo)?.[1],
    rawText: text
  };
}
This regex approach gives you full control over what you extract from the noisy output of OCR.
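To see label-based patterns behave on concrete input, here is a self-contained sketch that runs outside HarmonyOS. The sample text and its values are made up, and the IdFields type is a stand-in for IDCardData. It also illustrates one pitfall worth knowing: because "Soyadı" contains "adı", a case-insensitive "Adı" pattern without a leading \b can capture the surname when the surname line comes first.

```typescript
// Stand-in for IDCardData, reduced to three fields for the demo.
interface IdFields {
  name?: string;
  surname?: string;
  birthDate?: string;
}

// Made-up OCR text with the surname line BEFORE the given-name line.
const ocrText = 'Soyadı: YILMAZ\nAdı: MEHMET\nDoğum Tarihi: 01.01.1990';

// Without a word boundary, "Adı" can match inside "Soyadı".
const looseName = /(?:Adı|Given Name)[\s:]*([A-ZÇĞİÖŞÜ]+)/i;
// With a leading \b, the label has to start a word.
const strictName = /\b(?:Adı|Given Name)[\s:]*([A-ZÇĞİÖŞÜ]+)/i;
const surnamePattern = /(?:Soyadı|Surname)[\s:]*([A-ZÇĞİÖŞÜ]+)/i;
const birthDatePattern = /(?:Doğum\s*Tarihi|Date\s*of\s*Birth)[\s:]*([\d.\/-]+)/i;

// Return the first capture group, or undefined when the pattern does not match.
function firstGroup(text: string, pattern: RegExp): string | undefined {
  const m = text.match(pattern);
  return m ? m[1] : undefined;
}

function extractFields(text: string): IdFields {
  return {
    name: firstGroup(text, strictName),
    surname: firstGroup(text, surnamePattern),
    birthDate: firstGroup(text, birthDatePattern),
  };
}

const looseResult = firstGroup(ocrText, looseName); // "YILMAZ": the surname, not the given name
const fields = extractFields(ocrText);
console.log(looseResult, fields);
```

Fields that do not match stay undefined, so callers can tell "not found" apart from an empty string and decide whether to retry the capture.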
Example Output
Let’s say OCR returns the following raw text (some characters are masked with * here; the patterns above expect the full, unmasked values):
T.C. Kimlik No: 1234*******
Adı: MEHMET
Soyadı: YILMAZ
Doğum Tarihi: 01.01.1990
Cinsiyeti / Gender: E
Seri No: A1234**
Our parser will return (rawText omitted for brevity):
{
"tckn": "1234*******",
"name": "MEHMET",
"surname": "YILMAZ",
"birthDate": "01.01.1990",
"gender": "E",
"documentNo": "A1234**"
}
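OCR can misread digits, so it can be worth sanity-checking the extracted tckn before trusting it. Turkish identity numbers carry two check digits: the 10th is derived from the first nine and the 11th from the first ten. The function below is a sketch of that commonly documented rule; isValidTckn is a hypothetical helper name, not something from the original code or from any HarmonyOS kit.

```typescript
// Hypothetical helper: checks the shape and the two check digits of a TCKN.
function isValidTckn(value: string): boolean {
  // 11 digits, first digit non-zero (same shape the tckn regex captures).
  if (!/^[1-9]\d{10}$/.test(value)) {
    return false;
  }
  const d = value.split('').map((ch) => Number(ch));

  // Sums over 1-based positions 1,3,5,7,9 and 2,4,6,8.
  const oddSum = d[0] + d[2] + d[4] + d[6] + d[8];
  const evenSum = d[1] + d[3] + d[5] + d[7];

  // 10th digit: (oddSum * 7 - evenSum) mod 10, kept non-negative.
  const tenth = (((oddSum * 7 - evenSum) % 10) + 10) % 10;

  // 11th digit: sum of the first ten digits mod 10.
  let firstTenSum = 0;
  for (let i = 0; i < 10; i++) {
    firstTenSum += d[i];
  }
  const eleventh = firstTenSum % 10;

  return d[9] === tenth && d[10] === eleventh;
}

console.log(isValidTckn('10000000146')); // true: satisfies both check digits
console.log(isValidTckn('10000000147')); // false: last digit altered
```

A failed check does not say which digit was misread, so a reasonable response is to ask the user to retake the photo rather than to guess a correction.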
Bonus: Releasing Camera Resources
Don’t forget to clean up when you’re done:
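async releaseCamera(): Promise<void> {
  // Stop the session before tearing down its input and outputs
  await this.captureSession?.stop();
  await this.cameraInput?.close();
  await this.previewOutput?.release();
  // receiver is not created in the code shown here; keep this line only if your component defines one
  await this.receiver?.release();
  await this.photoOutput?.release();
  await this.captureSession?.release();
}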
Final Thoughts
By combining CameraKit, CoreVisionKit, and Regex, you can build smart and efficient OCR features in your HarmonyOS apps. Whether you’re processing ID cards, receipts, or business cards, this method allows for structured and precise text extraction.
TL;DR
- Use CameraKit to capture images
- Process images with CoreVisionKit OCR
- Use Regex to extract structured data like TCKN, name, date of birth
- Always release camera resources properly