kouwei qing

Posted on Jun 30, 2025 • Edited on Jul 13, 2025

HarmonyOS Audio-Video: Audio Capture Practice

#harmonyosnext

Background

Many scenarios in application development require audio capture, such as voice messaging in chat functions, real-time speech-to-text conversion, voice calls, and video calls. On Android and iOS, the system provides two forms of audio capture:

Real-time audio stream capture
Audio file recording

The system also offers different API forms. For example, on Android:

AudioRecord Java interface
MediaRecorder Java interface
OpenSL ES C/C++ interface
AAudio C/C++ interface

Apps adapted to HarmonyOS need audio capture as well. This article walks through implementing it step by step.

Introduction to Audio Recording Interfaces

HarmonyOS provides two audio capture interfaces, one for ArkTS (TS) and one for C/C++:

AudioCapturer (ArkTS)
OHAudio (C/C++)

Below is an introduction to the APIs for both languages.

AudioCapturer

Recording with AudioCapturer involves creating an AudioCapturer instance, configuring the capture parameters, starting and stopping capture, and releasing resources. The official state diagram marks each method call and the state transition it triggers.
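The lifecycle can be pictured as a small state table. The sketch below is a standalone TypeScript illustration, not the AudioKit implementation: the state names mirror audio.AudioState, while `CapturerState`, `CapturerAction`, `transitions`, and `nextState` are hypothetical names, and the allowed transitions are an assumption based on the official state diagram.

```typescript
// Hypothetical model of the capturer lifecycle; state names mirror audio.AudioState.
type CapturerState = 'PREPARED' | 'RUNNING' | 'STOPPED' | 'RELEASED';
type CapturerAction = 'start' | 'stop' | 'release';

// For each action: the states it may be called from and the state it leads to.
// Assumption: these rules follow the official state diagram.
const transitions: Record<CapturerAction, { from: CapturerState[]; to: CapturerState }> = {
  start: { from: ['PREPARED', 'STOPPED'], to: 'RUNNING' },
  stop: { from: ['RUNNING'], to: 'STOPPED' },
  release: { from: ['PREPARED', 'RUNNING', 'STOPPED'], to: 'RELEASED' },
};

// Returns the next state, or throws if the call is not legal in the current state.
function nextState(current: CapturerState, action: CapturerAction): CapturerState {
  const rule = transitions[action];
  if (rule.from.indexOf(current) === -1) {
    throw new Error(`Cannot call ${action}() in state ${current}`);
  }
  return rule.to;
}
```

Guarding calls this way makes misuse, such as calling start() on a released capturer, fail early instead of surfacing as an error callback later.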

createAudioCapturer

Creating a capturer mainly involves parameter configuration:

import { audio } from '@kit.AudioKit';

let audioStreamInfo: audio.AudioStreamInfo = {
  samplingRate: audio.AudioSamplingRate.SAMPLE_RATE_48000, // Sampling rate
  channels: audio.AudioChannel.CHANNEL_2, // Number of channels
  sampleFormat: audio.AudioSampleFormat.SAMPLE_FORMAT_S16LE, // Sample format
  encodingType: audio.AudioEncodingType.ENCODING_TYPE_RAW // Encoding format
};

let audioCapturerInfo: audio.AudioCapturerInfo = {
  source: audio.SourceType.SOURCE_TYPE_MIC,
  capturerFlags: 0
};

let audioCapturerOptions: audio.AudioCapturerOptions = {
  streamInfo: audioStreamInfo,
  capturerInfo: audioCapturerInfo
};

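audio.createAudioCapturer(audioCapturerOptions, (err, data) => {
  if (err) {
    console.error(`Failed to create AudioCapturer. Code: ${err.code}, Message: ${err.message}`);
  } else {
    // Keep this instance; the on/start/stop/release calls below use it
    let audioCapturer = data;
  }
});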

Parameters include two main sections:

AudioStreamInfo: Audio format configuration
- samplingRate: Sampling rate
- channels: Number of channels
- sampleFormat: Sample format
- encodingType: Audio encoding type (only PCM's ENCODING_TYPE_RAW is supported currently)
AudioCapturerInfo: Capture configuration
- source: Audio source type, including:
  - SOURCE_TYPE_INVALID: Invalid audio source
  - SOURCE_TYPE_MIC: Microphone audio source
  - SOURCE_TYPE_VOICE_RECOGNITION: Speech recognition source
  - SOURCE_TYPE_PLAYBACK_CAPTURE: Playback audio stream (internal recording)
  - SOURCE_TYPE_VOICE_COMMUNICATION: Voice call scenario source
  - SOURCE_TYPE_VOICE_MESSAGE: Short voice message source
- capturerFlags: Audio capturer flags (0 represents an audio capturer)
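The stream parameters determine how much raw PCM data the capturer produces. The following standalone sketch works through that arithmetic for the 48 kHz, stereo, S16LE configuration used above. `PcmFormat`, `BYTES_PER_SAMPLE`, and the three helper functions are hypothetical names introduced for illustration; they are not part of AudioKit.

```typescript
// Bytes per sample for common PCM formats (an illustrative subset).
const BYTES_PER_SAMPLE = { U8: 1, S16LE: 2, S24LE: 3, S32LE: 4 } as const;
type SampleFormat = keyof typeof BYTES_PER_SAMPLE;

interface PcmFormat {
  samplingRate: number; // frames per second, e.g. 48000
  channels: number;     // 1 = mono, 2 = stereo
  sampleFormat: SampleFormat;
}

// Size of one frame: one sample for every channel.
function bytesPerFrame(format: PcmFormat): number {
  return format.channels * BYTES_PER_SAMPLE[format.sampleFormat];
}

// Raw PCM throughput in bytes per second.
function bytesPerSecond(format: PcmFormat): number {
  return format.samplingRate * bytesPerFrame(format);
}

// Duration in milliseconds represented by a buffer of `byteLength` bytes.
function bufferDurationMs(format: PcmFormat, byteLength: number): number {
  return (byteLength * 1000) / bytesPerSecond(format);
}

const format: PcmFormat = { samplingRate: 48000, channels: 2, sampleFormat: 'S16LE' };
console.log(bytesPerSecond(format));         // 192000
console.log(bufferDurationMs(format, 3840)); // 20
```

At roughly 188 KB per second, uncompressed stereo capture grows quickly, which is one reason to encode the stream before storing or sending it.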

on('readData')

The on('readData') method subscribes to audio data read callbacks:

let readDataCallback = (buffer: ArrayBuffer) => {
  // Process the audio stream
}
audioCapturer.on('readData', readDataCallback);
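The callback delivers the recording as a sequence of ArrayBuffer chunks, so code that needs the whole recording has to join them. Below is a minimal standalone sketch of one way to do that. `PcmChunkCollector` is a hypothetical helper, and copying each chunk rests on the assumption that the system may reuse the buffer once the callback returns.

```typescript
// Collects PCM chunks delivered by a data callback and joins them on demand.
class PcmChunkCollector {
  private chunks: Uint8Array[] = [];
  private totalBytes = 0;

  // Copy defensively: this sketch assumes the buffer may be reused after the callback returns.
  push(buffer: ArrayBuffer): void {
    const copy = new Uint8Array(buffer.byteLength);
    copy.set(new Uint8Array(buffer));
    this.chunks.push(copy);
    this.totalBytes += copy.byteLength;
  }

  // Total number of bytes collected so far.
  get byteLength(): number {
    return this.totalBytes;
  }

  // Concatenate everything captured so far into one contiguous buffer.
  toArrayBuffer(): ArrayBuffer {
    const joined = new Uint8Array(this.totalBytes);
    let offset = 0;
    for (const chunk of this.chunks) {
      joined.set(chunk, offset);
      offset += chunk.byteLength;
    }
    return joined.buffer;
  }
}
```

With a collector instance in scope, the callback body reduces to `collector.push(buffer)`. For long recordings, writing each chunk to a file as it arrives avoids holding everything in memory.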

start

The start method begins recording:

import { BusinessError } from '@kit.BasicServicesKit';
audioCapturer.start((err: BusinessError) => {
  if (err) {
    // Handle error
  } else {
    // Recording started
  }
});

stop

The stop method stops recording:

import { BusinessError } from '@kit.BasicServicesKit';
audioCapturer.stop((err: BusinessError) => {
  if (err) {
    // Handle error
  } else {
    // Recording stopped
  }
});

release

The release method destroys the instance and releases resources:

import { BusinessError } from '@kit.BasicServicesKit';
audioCapturer.release((err: BusinessError) => {
  if (err) {
    // Handle error
  } else {
    // Resources released
  }
});

OHAudio

OHAudio is a set of C APIs introduced in API version 10. These APIs are designed to be unified, supporting both normal and low-latency audio paths. They only support PCM format and are suitable for scenarios where native-layer audio input is required. Since many audio encoding libraries are implemented in C/C++, using OHAudio C++ interfaces on HarmonyOS reduces data transfer overhead between TS and C++ layers, improving efficiency.

OHAudio depends on the libohaudio.so dynamic library. Import the header files <native_audiostreambuilder.h> and <native_audiocapturer.h> to use audio recording-related APIs.

Create a Builder

OH_AudioStreamBuilder* builder;
OH_AudioStreamBuilder_Create(&builder, AUDIOSTREAM_TYPE_CAPTURER);

Configure Audio Stream Parameters

Refer to the following example:

// Set audio sampling rate
OH_AudioStreamBuilder_SetSamplingRate(builder, 48000);
// Set audio channel count
OH_AudioStreamBuilder_SetChannelCount(builder, 2);
// Set audio sample format
OH_AudioStreamBuilder_SetSampleFormat(builder, AUDIOSTREAM_SAMPLE_S16LE);
// Set audio stream encoding type
OH_AudioStreamBuilder_SetEncodingType(builder, AUDIOSTREAM_ENCODING_TYPE_RAW);
// Set the working scenario for the input audio stream
OH_AudioStreamBuilder_SetCapturerInfo(builder, AUDIOSTREAM_SOURCE_TYPE_MIC);

These parameters have the same meaning as their AudioCapturer counterparts.

Set Audio Callback Functions

// Custom data read function
int32_t MyOnReadData(
    OH_AudioCapturer* capturer,
    void* userData,
    void* buffer,
    int32_t length)
{
    // Extract 'length' bytes of recording data from 'buffer'
    return 0;
}
// Custom audio stream event function
int32_t MyOnStreamEvent(
    OH_AudioCapturer* capturer,
    void* userData,
    OH_AudioStream_Event event)
{
    // Update recorder state and UI based on the audio stream event
    return 0;
}
// Custom audio interrupt event function
int32_t MyOnInterruptEvent(
    OH_AudioCapturer* capturer,
    void* userData,
    OH_AudioInterrupt_ForceType type,
    OH_AudioInterrupt_Hint hint)
{
    // Update recorder state and UI based on the audio interrupt info
    return 0;
}
// Custom error callback function
int32_t MyOnError(
    OH_AudioCapturer* capturer,
    void* userData,
    OH_AudioStream_Result error)
{
    // Handle the audio error based on 'error'
    return 0;
}

OH_AudioCapturer_Callbacks callbacks;
// Configure callback functions
callbacks.OH_AudioCapturer_OnReadData = MyOnReadData;
callbacks.OH_AudioCapturer_OnStreamEvent = MyOnStreamEvent;
callbacks.OH_AudioCapturer_OnInterruptEvent = MyOnInterruptEvent;
callbacks.OH_AudioCapturer_OnError = MyOnError;

// Set the callback for the audio input stream
OH_AudioStreamBuilder_SetCapturerCallback(builder, callbacks, nullptr);

Configure callback functions via OH_AudioStreamBuilder_SetCapturerCallback.
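Whichever layer receives the buffer, S16LE data is interpreted the same way: every two bytes form one signed little-endian sample. As an illustration of processing inside the read callback, the standalone sketch below computes the peak and RMS level of such a buffer. It is written in TypeScript to match the earlier ArkTS snippets, and `measureLevelS16LE` is a hypothetical helper, not an OHAudio or AudioKit API.

```typescript
// Computes peak and RMS level (both in the range 0..1) of a signed 16-bit little-endian PCM buffer.
function measureLevelS16LE(buffer: ArrayBuffer): { peak: number; rms: number } {
  const view = new DataView(buffer);
  const sampleCount = Math.floor(buffer.byteLength / 2);
  if (sampleCount === 0) {
    return { peak: 0, rms: 0 };
  }
  let peak = 0;
  let sumSquares = 0;
  for (let i = 0; i < sampleCount; i++) {
    // `true` selects little-endian byte order, matching S16LE.
    const normalized = view.getInt16(i * 2, true) / 32768;
    peak = Math.max(peak, Math.abs(normalized));
    sumSquares += normalized * normalized;
  }
  return { peak: peak, rms: Math.sqrt(sumSquares / sampleCount) };
}
```

A level meter like this can drive a recording indicator in the UI or serve as a quick check that the microphone is delivering non-silent data.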

Construct the Recording Audio Stream

OH_AudioCapturer* audioCapturer;
OH_AudioStreamBuilder_GenerateCapturer(builder, &audioCapturer);

Use the Audio Stream

OH_AudioStream_Result OH_AudioCapturer_Start(OH_AudioCapturer* capturer): Start recording
OH_AudioStream_Result OH_AudioCapturer_Pause(OH_AudioCapturer* capturer): Pause recording
OH_AudioStream_Result OH_AudioCapturer_Stop(OH_AudioCapturer* capturer): Stop recording
OH_AudioStream_Result OH_AudioCapturer_Flush(OH_AudioCapturer* capturer): Discard cached data
OH_AudioStream_Result OH_AudioCapturer_Release(OH_AudioCapturer* capturer): Release the recording instance

Release the Builder

OH_AudioStreamBuilder_Destroy(builder);

Audio Recording Best Practices

Let's walk through a complete audio capture flow with OHAudio, with recording to an MP3 file as the end goal.

Permission Application

Audio capture requires dynamic permission application. First, declare the permission in module.json5:

"requestPermissions": [  
  {  
    "name": "ohos.permission.MICROPHONE",  
    "reason": "$string:reason",  
    "usedScene": {  
      "abilities": [  
        "FormAbility"  
      ],  
      "when": "inuse"  
    }  
  }  
],

Dynamic permission application:

import { abilityAccessCtrl, common, Permissions } from '@kit.AbilityKit';
import { BusinessError } from '@kit.BasicServicesKit';

function reqPermissionsFromUser(permissions: Array<Permissions>, context: common.UIAbilityContext): void {
  let atManager: abilityAccessCtrl.AtManager = abilityAccessCtrl.createAtManager();
  atManager.requestPermissionsFromUser(context, permissions).then((data) => {
    let grantStatus: Array<number> = data.authResults;
    let length: number = grantStatus.length;
    for (let i = 0; i < length; i++) {
      if (grantStatus[i] !== 0) {
        // User denied permission; prompt and guide to settings
        return;
      }
    }
    // Permission granted successfully
  }).catch((err: BusinessError) => {
    console.error(`Failed to request permissions. Code: ${err.code}, Message: ${err.message}`);
  })
}

Call the permission request method in aboutToAppear and start recording after authorization:

const permissions: Array<Permissions> = ['ohos.permission.MICROPHONE'];
const context: common.UIAbilityContext = getContext(this) as common.UIAbilityContext;
reqPermissionsFromUser(permissions, context);
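When several permissions are requested together, it helps to know which ones were refused rather than stopping at the first denial. The standalone sketch below shows one way to derive that from the authResults array. `deniedPermissions` is a hypothetical helper, and the constant mirrors abilityAccessCtrl.GrantStatus, where 0 means granted and -1 means denied.

```typescript
// Mirrors abilityAccessCtrl.GrantStatus.PERMISSION_GRANTED (0); denied is -1.
const PERMISSION_GRANTED = 0;

// Returns the names of the permissions the user did not grant.
// `authResults[i]` is the result for `permissions[i]`.
function deniedPermissions(permissions: string[], authResults: number[]): string[] {
  return permissions.filter((_, index) => authResults[index] !== PERMISSION_GRANTED);
}
```

An empty result means recording can start; otherwise the returned names can be shown in the prompt that guides the user to settings.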

Configure C++ Project

After creating a C++ module, configure the ohaudio dynamic library dependency:

cmake_minimum_required(VERSION 3.5.0)  
project(audiorecorderdemo)  

set(NATIVERENDER_ROOT_PATH ${CMAKE_CURRENT_SOURCE_DIR})  

if(DEFINED PACKAGE_FIND_FILE)  
    include(${PACKAGE_FIND_FILE})  
endif()  

include_directories(${NATIVERENDER_ROOT_PATH}  
                    ${NATIVERENDER_ROOT_PATH}/include)  

add_library(capture SHARED napi_init.cpp)  
target_link_libraries(capture PUBLIC libace_napi.z.so)  
target_link_libraries(capture PUBLIC libohaudio.so)

Configure NAPI methods:

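#include "napi/native_api.h"

static napi_value start(napi_env env, napi_callback_info info)
{
    // Implementation omitted
    return nullptr;
}
static napi_value stop(napi_env env, napi_callback_info info)
{
    // Implementation omitted
    return nullptr;
}
EXTERN_C_START
static napi_value Init(napi_env env, napi_value exports)
{
    napi_property_descriptor desc[] = {
        { "start", nullptr, start, nullptr, nullptr, nullptr, napi_default, nullptr },
        { "stop", nullptr, stop, nullptr, nullptr, nullptr, napi_default, nullptr }
    };
    napi_define_properties(env, exports, sizeof(desc) / sizeof(desc[0]), desc);
    return exports;
}
EXTERN_C_END

// Module descriptor, as in the DevEco Studio native template;
// nm_modname matches the library name in CMakeLists.txt
static napi_module captureModule = {
    .nm_version = 1,
    .nm_flags = 0,
    .nm_filename = nullptr,
    .nm_register_func = Init,
    .nm_modname = "capture",
    .nm_priv = nullptr,
    .reserved = { 0 },
};
// Register the module when the shared library is loaded
extern "C" __attribute__((constructor)) void RegisterCaptureModule(void)
{
    napi_module_register(&captureModule);
}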

Implement Start Recording

// Custom data read function  
int32_t MyOnReadData(  
    OH_AudioCapturer* capturer,  
    void* userData,  
    void* buffer,  
    int32_t length)  
{  
    // TODO: Extract recording data from buffer  
    return 0;  
}  
// Custom audio stream event function  
int32_t MyOnStreamEvent(  
    OH_AudioCapturer* capturer,  
    void* userData,  
    OH_AudioStream_Event event)  
{  
    // TODO: Update state/UI based on event  
    return 0;  
}  
// Custom audio interrupt event function  
int32_t MyOnInterruptEvent(  
    OH_AudioCapturer* capturer,  
    void* userData,  
    OH_AudioInterrupt_ForceType type,  
    OH_AudioInterrupt_Hint hint)  
{  
    // TODO: Update state/UI based on interrupt info  
    return 0;  
}  
// Custom error callback function  
int32_t MyOnError(  
    OH_AudioCapturer* capturer,  
    void* userData,  
    OH_AudioStream_Result error)  
{  
    // TODO: Handle error based on 'error'  
    return 0;  
}  
// Kept at file scope so that stop() can reach them later
static OH_AudioStreamBuilder* builder = nullptr;
static OH_AudioCapturer* audioCapturer = nullptr;

static napi_value start(napi_env env, napi_callback_info info)
{
    OH_AudioStreamBuilder_Create(&builder, AUDIOSTREAM_TYPE_CAPTURER);
    // Set audio parameters
    OH_AudioStreamBuilder_SetSamplingRate(builder, 48000);
    OH_AudioStreamBuilder_SetChannelCount(builder, 2);
    OH_AudioStreamBuilder_SetSampleFormat(builder, AUDIOSTREAM_SAMPLE_S16LE);
    OH_AudioStreamBuilder_SetEncodingType(builder, AUDIOSTREAM_ENCODING_TYPE_RAW);
    OH_AudioStreamBuilder_SetCapturerInfo(builder, AUDIOSTREAM_SOURCE_TYPE_MIC);

    OH_AudioCapturer_Callbacks callbacks;
    // Configure callbacks
    callbacks.OH_AudioCapturer_OnReadData = MyOnReadData;
    callbacks.OH_AudioCapturer_OnStreamEvent = MyOnStreamEvent;
    callbacks.OH_AudioCapturer_OnInterruptEvent = MyOnInterruptEvent;
    callbacks.OH_AudioCapturer_OnError = MyOnError;
    OH_AudioStreamBuilder_SetCapturerCallback(builder, callbacks, nullptr);

    // Generate the capturer, then start recording only if creation succeeded
    OH_AudioStream_Result result = OH_AudioStreamBuilder_GenerateCapturer(builder, &audioCapturer);
    if (result == AUDIOSTREAM_SUCCESS) {
        OH_AudioCapturer_Start(audioCapturer);
    }
    return nullptr;
}

Best Practice 1:

To avoid unexpected behavior, ensure each callback in OH_AudioCapturer_Callbacks is initialized with a custom callback or a null pointer. For example:

OH_AudioCapturer_Callbacks callbacks;

// Configure needed callbacks
callbacks.OH_AudioCapturer_OnReadData = MyOnReadData;
callbacks.OH_AudioCapturer_OnInterruptEvent = MyOnInterruptEvent;

// (Mandatory) Initialize unused callbacks with null
callbacks.OH_AudioCapturer_OnStreamEvent = nullptr;
callbacks.OH_AudioCapturer_OnError = nullptr;

Best Practice 2:

On devices that support it, scenarios with strict latency requirements (e.g., voice calls) should enable low-latency mode on the builder before generating the capturer:

OH_AudioStream_LatencyMode latencyMode = AUDIOSTREAM_LATENCY_MODE_FAST;
OH_AudioStreamBuilder_SetLatencyMode(builder, latencyMode);

Audio File Processing

Process the captured data in the read callback (MyOnReadData): it can be passed to ASR or written directly to a file. The next article will cover encoding to MP3 and writing to a file.
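Until an encoder is in place, one simple way to make raw PCM playable is to prepend a WAV header to it. This is a swapped-in alternative, not the MP3 path the next article covers. The standalone sketch below builds the standard 44-byte PCM WAV header; `buildWavHeader` is a hypothetical helper name.

```typescript
// Builds the 44-byte header of a canonical PCM WAV file.
// `dataBytes` is the total size of the PCM payload that follows the header.
function buildWavHeader(dataBytes: number, sampleRate: number, channels: number, bitsPerSample: number): ArrayBuffer {
  const header = new ArrayBuffer(44);
  const view = new DataView(header);
  const blockAlign = (channels * bitsPerSample) / 8; // bytes per frame

  // Writes a four-character ASCII chunk tag at the given offset.
  const writeTag = (offset: number, tag: string): void => {
    for (let i = 0; i < tag.length; i++) {
      view.setUint8(offset + i, tag.charCodeAt(i));
    }
  };

  writeTag(0, 'RIFF');
  view.setUint32(4, 36 + dataBytes, true);           // size of everything after this field
  writeTag(8, 'WAVE');
  writeTag(12, 'fmt ');
  view.setUint32(16, 16, true);                      // fmt chunk size for PCM
  view.setUint16(20, 1, true);                       // audio format 1 = linear PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // byte rate
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeTag(36, 'data');
  view.setUint32(40, dataBytes, true);
  return header;
}
```

For the configuration used in this article the call would be `buildWavHeader(totalBytes, 48000, 2, 16)`. Because the header contains the payload size, it is typically written, or rewritten, once recording has stopped.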

Stop and Release

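// Stop recording, release the capturer instance, then destroy the builder
OH_AudioCapturer_Stop(audioCapturer);
OH_AudioCapturer_Release(audioCapturer);
OH_AudioStreamBuilder_Destroy(builder);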

Summary

This article introduced two audio capture methods in HarmonyOS, AudioCapturer at the TS layer and OHAudio at the C++ layer, and implemented real-time audio capture using the OHAudio interfaces.