How to Implement OCR in HarmonyOS: A Step-by-Step Guide with Regex

Optical Character Recognition (OCR) extracts text from images, but raw text alone isn’t always enough. In a HarmonyOS application, regular expressions (regex) let you pull specific fields, such as ID numbers, names, or dates, out of the recognized text.

In this guide, we’ll show how to:

Set up the camera in a HarmonyOS app
Capture and process images using CameraKit and ImageKit
Perform OCR with the Core Vision Kit
Extract meaningful data using Regex

Let’s dive in.

Prerequisites

To follow along, make sure you have:

DevEco Studio installed
A HarmonyOS project with the following kits enabled:
- CameraKit
- CoreVisionKit
- ImageKit
- AbilityKit
Basic TypeScript knowledge

Step 1: Set Up the Camera

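HarmonyOS exposes the device camera through CameraKit. Below is how you can initialize and start a camera session. The helper methods called here (getCameraDevices, getCameraInput, getPreviewOutput, getPhotoOutput, getCaptureSession, beginConfig, startSession) are the app’s own wrappers around CameraKit calls and are not shown in this guide: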

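import { camera } from '@kit.CameraKit';
import { image } from '@kit.ImageKit';
import { textRecognition } from '@kit.CoreVisionKit';
import { common } from '@kit.AbilityKit';
import { BusinessError } from '@kit.BasicServicesKit';

// The methods in this guide are members of the component or helper class that owns the camera.
async initCamera(surfaceId: string): Promise<void> {
  this.cameraMgr = camera.getCameraManager(getContext(this) as common.UIAbilityContext);
  let cameraArray = this.getCameraDevices(this.cameraMgr);
  this.cameraDevice = cameraArray[0]; // First reported device, usually the back camera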
  this.cameraInput = this.getCameraInput(this.cameraDevice, this.cameraMgr)!;
  await this.cameraInput.open();

  this.capability = this.cameraMgr.getSupportedOutputCapability(this.cameraDevice, camera.SceneMode.NORMAL_PHOTO);
  this.previewOutput = this.getPreviewOutput(this.cameraMgr, this.capability, surfaceId)!;
  this.photoOutput = this.getPhotoOutput(this.cameraMgr, this.capability)!;

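  // Register listener for photo capture
  this.photoOutput.on('photoAvailable', (err: BusinessError, photo: camera.Photo) => {
    if (err || !photo) {
      console.error(`photoAvailable failed, code: ${err?.code}`);
      return;
    }
    const imageObj = photo.main;
    imageObj.getComponent(image.ComponentType.JPEG, async (compErr: BusinessError, component: image.Component) => {
      try {
        if (compErr || !component) return;
        // Run OCR on the JPEG bytes
        this.idCardResult = await this.recognizeImage(component.byteBuffer);
        this.result = JSON.stringify(this.idCardResult);
      } finally {
        imageObj.release(); // Free the native image buffer in every case
      }
    });
  });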

  // Set up photo session
  this.captureSession = this.getCaptureSession(this.cameraMgr)!;
  this.beginConfig(this.captureSession);
  this.startSession(this.captureSession, this.cameraInput, this.previewOutput, this.photoOutput);
}
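
Taking cameraArray[0] relies on the back camera being listed first. A minimal sketch of a more explicit selection follows. It uses local stand-in types so it can run outside HarmonyOS: CameraPosition and CameraDeviceLike are hypothetical names for this example, and in a real app you would compare against camera.CameraPosition.CAMERA_POSITION_BACK on the devices returned by CameraKit instead.

```typescript
// Local stand-ins so the selection logic runs anywhere.
// Assumption: real devices expose a comparable cameraPosition property.
enum CameraPosition {
  UNSPECIFIED = 0,
  BACK = 1,
  FRONT = 2,
}

interface CameraDeviceLike {
  cameraId: string;
  cameraPosition: CameraPosition;
}

// Prefer a back-facing camera; fall back to the first device if none reports BACK.
function pickBackCamera<T extends CameraDeviceLike>(devices: T[]): T | undefined {
  const back = devices.find((device) => device.cameraPosition === CameraPosition.BACK);
  return back ?? devices[0];
}

const devices: CameraDeviceLike[] = [
  { cameraId: 'front-0', cameraPosition: CameraPosition.FRONT },
  { cameraId: 'back-0', cameraPosition: CameraPosition.BACK },
];

console.log(pickBackCamera(devices)?.cameraId); // back-0
```

An empty device list yields undefined, so the caller can show an error instead of opening a camera that does not exist.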

You can then capture an image by calling:

async takePicture(): Promise<void> {
  // capture() returns a promise, so await it to surface any failure
  await this.photoOutput?.capture();
}

Step 2: Perform OCR with CoreVisionKit

Once an image is captured, convert its bytes to a PixelMap and pass that to the textRecognition.recognizeText() API.

async recognizeImage(buffer: ArrayBuffer): Promise<IDCardData> {
  let recognitionString = '';
  // Skip OCR entirely on devices that lack the text recognition capability
  if (canIUse('SystemCapability.AI.OCR.TextRecognition')) {
    const imageResource = image.createImageSource(buffer);
    const pixelMapInstance = await imageResource.createPixelMap();
    const visionInfo: textRecognition.VisionInfo = { pixelMap: pixelMapInstance };
    const textConfig: textRecognition.TextRecognitionConfiguration = { isDirectionDetectionSupported: true };
    try {
      const result = await textRecognition.recognizeText(visionInfo, textConfig);
      recognitionString = result.value;
    } finally {
      // Release native image memory even if recognition throws
      await pixelMapInstance.release();
      await imageResource.release();
    }
  }

  // extractDataWithRegex is the standalone function defined in Step 3
  return extractDataWithRegex(recognitionString);
}

Step 3: Extract Specific Information Using Regex

OCR returns all the text it finds, but we usually want only certain fields (like the ID number or name). Regex handles that step.

Here’s how we define patterns and extract matches:

const patterns: RegexPatterns = {
  tckn: /(?:T\.?\s*C\.?\s*Kimlik\s*No|TR\s*identity\s*No)[\s:]*([1-9]\d{10})/i,
  surname: /(?:Soyadı|Surname)[\s:]*([A-ZÇĞİÖŞÜ]+)/i,
  // \b keeps "Adı" from matching inside "Soyadı" when the surname line comes first
  name: /(?:\bAdı|Given\s*Name)[\s:]*([A-ZÇĞİÖŞÜ]+)/i,
  birthDate: /(?:Doğum\s*Tarihi|Date\s*of\s*Birth)[\s:]*([\d./-]+)/i,
  // E/K are the Turkish codes (Erkek/Kadın), M/F the English ones
  gender: /(?:Cinsiyeti\s*\/\s*Gender)[\s:]*([EKMF])/i,
  documentNo: /(?:Seri\s*No|Document\s*No)[\s:]*([A-Z0-9]{5,})/i,
};

function extractDataWithRegex(text: string): IDCardData {
  return {
    tckn: text.match(patterns.tckn)?.[1],
    name: text.match(patterns.name)?.[1],
    surname: text.match(patterns.surname)?.[1],
    birthDate: text.match(patterns.birthDate)?.[1],
    gender: text.match(patterns.gender)?.[1],
    documentNo: text.match(patterns.documentNo)?.[1],
    rawText: text
  };
}

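This regex approach gives you full control over what you extract from the noisy output of OCR. Each field comes back as undefined when its pattern does not match, so the caller can tell which values were missing or too garbled to parse.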
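
A pattern such as [1-9]\d{10} only checks the shape of the ID number, so a misread digit still passes. As a hedged sketch, the commonly described check-digit rule for Turkish identity numbers can catch many OCR misreads: the 10th digit depends on the first nine, and the 11th is the sum of the first ten modulo 10. The function name isValidTckn is hypothetical and not part of the original code, and the sample value is a synthetic number built to satisfy the rule.

```typescript
// Returns true if `value` has 11 digits, does not start with 0,
// and both check digits (10th and 11th) are consistent with the rest.
function isValidTckn(value: string): boolean {
  if (!/^[1-9]\d{10}$/.test(value)) {
    return false;
  }
  const digits = value.split('').map(Number);

  let oddSum = 0;      // digits in positions 1, 3, 5, 7, 9
  let evenSum = 0;     // digits in positions 2, 4, 6, 8
  let firstTenSum = 0; // digits in positions 1 through 10
  digits.forEach((digit, index) => {
    if (index < 9) {
      if (index % 2 === 0) {
        oddSum += digit;
      } else {
        evenSum += digit;
      }
    }
    if (index < 10) {
      firstTenSum += digit;
    }
  });

  // The difference can be negative, so normalize the remainder into 0..9.
  const expectedTenth = (((oddSum * 7 - evenSum) % 10) + 10) % 10;
  const expectedEleventh = firstTenSum % 10;
  return digits[9] === expectedTenth && digits[10] === expectedEleventh;
}

console.log(isValidTckn('10000000146')); // true
console.log(isValidTckn('10000000147')); // false: last check digit is wrong
```

One way to use this is to treat a tckn value that fails the check as missing and prompt the user to retake the photo.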

Example Output

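Let’s say OCR returns the following raw text. The asterisks stand in for masked digits; on a real card the full values appear, and the patterns above match only those complete values: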

T.C. Kimlik No: 1234*******
Adı: MEHMET
Soyadı: YILMAZ
Doğum Tarihi: 01.01.1990
Cinsiyeti / Gender: E
Seri No: A1234**

Our parser will return (rawText omitted for brevity):

{
  "tckn": "1234*******",
  "name": "MEHMET",
  "surname": "YILMAZ",
  "birthDate": "01.01.1990",
  "gender": "E",
  "documentNo": "A1234**"
}
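
Because each field is read with optional chaining, a pattern that fails to match leaves that field undefined, and JSON.stringify drops it from the output. The sketch below shows one way to report which fields are missing. The IDCardData shape is an assumption here (the guide does not show its definition), and missingFields and EXTRACTED_FIELDS are hypothetical names.

```typescript
// Assumed shape of IDCardData: every extracted field is optional
// because its pattern may not match the OCR text.
interface IDCardData {
  tckn?: string;
  name?: string;
  surname?: string;
  birthDate?: string;
  gender?: string;
  documentNo?: string;
  rawText: string;
}

const EXTRACTED_FIELDS = ['tckn', 'name', 'surname', 'birthDate', 'gender', 'documentNo'] as const;

// Lists the fields whose pattern did not match, so the UI can ask for a retake.
function missingFields(data: IDCardData): string[] {
  return EXTRACTED_FIELDS.filter((key) => data[key] === undefined);
}

// Simulates a capture where only the two name lines were recognized.
const partial: IDCardData = {
  name: 'MEHMET',
  surname: 'YILMAZ',
  rawText: 'Adı: MEHMET\nSoyadı: YILMAZ',
};

console.log(missingFields(partial)); // [ 'tckn', 'birthDate', 'gender', 'documentNo' ]
console.log(JSON.stringify(partial)); // undefined fields are absent from the JSON
```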

Bonus: Releasing Camera Resources

Don’t forget to clean up when you’re done:

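async releaseCamera(): Promise<void> {
  await this.captureSession?.stop();     // Stop the running session first
  await this.cameraInput?.close();
  await this.previewOutput?.release();
  await this.receiver?.release();        // Only applies if your class also holds an image receiver
  await this.photoOutput?.release();
  await this.captureSession?.release();  // Release the session last
}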
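
In a chain of awaited calls like this, the first call that throws skips every release after it. As a minimal sketch, a small helper can run each step independently and report the ones that failed. ReleaseStep and releaseAll are hypothetical names, and the demo steps are fakes so that the example runs outside HarmonyOS; in the app each run would wrap a real call such as () => this.cameraInput?.close().

```typescript
// One cleanup action; run may return undefined when the resource was never created.
type ReleaseStep = { name: string; run: () => Promise<void> | undefined };

// Runs every step in order, continuing after failures.
// Resolves with the names of the steps that threw.
async function releaseAll(steps: ReleaseStep[]): Promise<string[]> {
  const failed: string[] = [];
  for (const step of steps) {
    try {
      await step.run();
    } catch {
      failed.push(step.name);
    }
  }
  return failed;
}

// Demo with fake resources: the second step fails, the third still runs.
const released: string[] = [];
const demoSteps: ReleaseStep[] = [
  { name: 'cameraInput', run: async () => { released.push('cameraInput'); } },
  { name: 'previewOutput', run: async () => { throw new Error('already released'); } },
  { name: 'captureSession', run: async () => { released.push('captureSession'); } },
];

releaseAll(demoSteps).then((failed) => {
  console.log(failed);   // [ 'previewOutput' ]
  console.log(released); // [ 'cameraInput', 'captureSession' ]
});
```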

Final Thoughts

By combining CameraKit, CoreVisionKit, and regex, you can build OCR features in your HarmonyOS apps that return structured fields instead of raw text. Whether you’re processing ID cards, receipts, or business cards, the same flow applies once you adapt the patterns to the document.