HarmonyOS Audio-Video: Audio Capture Practice
Background
Many scenarios in application development require audio capture, such as voice messaging in chat functions, real-time speech-to-text conversion, voice calls, and video calls. On Android and iOS, the system provides two forms of audio capture:
- Real-time audio stream capture
- Audio file recording
The system also offers different API forms. For example, on Android:
- AudioRecord Java interface
- MediaRecorder Java interface
- OpenSL ES C++ interface
- AAudio C++ interface
During HarmonyOS adaptation, audio capture is also necessary. This article guides you through implementing audio capture step by step.
Introduction to Audio Recording Interfaces
HarmonyOS provides two audio capture interfaces, one for TS and one for C++:
- AudioCapturer (TS)
- OHAudio (C++)
Below is an introduction to the APIs for both languages.
AudioCapturer
Using AudioCapturer for audio recording involves creating an AudioCapturer instance, configuring audio capture parameters, starting and stopping capture, and releasing resources. The official state diagram clearly marks method calls and state transitions:
createAudioCapturer
Creating a capturer mainly involves parameter configuration:
import { audio } from '@kit.AudioKit';
let audioStreamInfo: audio.AudioStreamInfo = {
samplingRate: audio.AudioSamplingRate.SAMPLE_RATE_48000, // Sampling rate
channels: audio.AudioChannel.CHANNEL_2, // Number of channels
sampleFormat: audio.AudioSampleFormat.SAMPLE_FORMAT_S16LE, // Sample format
encodingType: audio.AudioEncodingType.ENCODING_TYPE_RAW // Encoding format
};
let audioCapturerInfo: audio.AudioCapturerInfo = {
source: audio.SourceType.SOURCE_TYPE_MIC,
capturerFlags: 0
};
let audioCapturerOptions: audio.AudioCapturerOptions = {
streamInfo: audioStreamInfo,
capturerInfo: audioCapturerInfo
};
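audio.createAudioCapturer(audioCapturerOptions, (err, data) => {
  if (err) {
    // Creation failed; log the error instead of ignoring it
    console.error(`Failed to create AudioCapturer. Code: ${err.code}, Message: ${err.message}`);
  } else {
    let audioCapturer = data;
  }
});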
Parameters include two main sections:
- AudioStreamInfo: Audio format configuration
  - samplingRate: Sampling rate
  - channels: Number of channels
  - sampleFormat: Sample format
  - encodingType: Audio encoding type (currently only ENCODING_TYPE_RAW, i.e. PCM, is supported)
- AudioCapturerInfo: Capture configuration
  - source: Audio source type, including:
    - SOURCE_TYPE_INVALID: Invalid audio source
    - SOURCE_TYPE_MIC: Microphone audio source
    - SOURCE_TYPE_VOICE_RECOGNITION: Speech recognition source
    - SOURCE_TYPE_PLAYBACK_CAPTURE: Playback audio stream (internal recording)
    - SOURCE_TYPE_VOICE_COMMUNICATION: Voice call scenario source
    - SOURCE_TYPE_VOICE_MESSAGE: Short voice message source
  - capturerFlags: Audio capturer flags (0 represents an audio capturer)
on('readData')
The on('readData') method subscribes to audio data read callbacks:
let readDataCallback = (buffer: ArrayBuffer) => {
// Process the audio stream
}
audioCapturer.on('readData', readDataCallback);
start
The start method begins recording:
import { BusinessError } from '@kit.BasicServicesKit';
audioCapturer.start((err: BusinessError) => {
if (err) {
// Handle error
} else {
// Recording started
}
});
stop
The stop method stops recording:
import { BusinessError } from '@kit.BasicServicesKit';
audioCapturer.stop((err: BusinessError) => {
if (err) {
// Handle error
} else {
// Recording stopped
}
});
release
The release method destroys the instance and releases resources:
import { BusinessError } from '@kit.BasicServicesKit';
audioCapturer.release((err: BusinessError) => {
if (err) {
// Handle error
} else {
// Resources released
}
});
OHAudio
OHAudio is a set of C APIs introduced in API version 10. These APIs are designed to be unified, supporting both normal and low-latency audio paths. They only support PCM format and are suitable for scenarios where native-layer audio input is required. Since many audio encoding libraries are implemented in C/C++, using OHAudio C++ interfaces on HarmonyOS reduces data transfer overhead between TS and C++ layers, improving efficiency.
OHAudio depends on the libohaudio.so dynamic library. Include the header files <native_audiostreambuilder.h> and <native_audiocapturer.h> to use the audio recording APIs.
Create a Builder
OH_AudioStreamBuilder* builder;
OH_AudioStreamBuilder_Create(&builder, AUDIOSTREAM_TYPE_CAPTURER);
Configure Audio Stream Parameters
Refer to the following example:
// Set audio sampling rate
OH_AudioStreamBuilder_SetSamplingRate(builder, 48000);
// Set audio channel count
OH_AudioStreamBuilder_SetChannelCount(builder, 2);
// Set audio sample format
OH_AudioStreamBuilder_SetSampleFormat(builder, AUDIOSTREAM_SAMPLE_S16LE);
// Set audio stream encoding type
OH_AudioStreamBuilder_SetEncodingType(builder, AUDIOSTREAM_ENCODING_TYPE_RAW);
// Set the working scenario for the input audio stream
OH_AudioStreamBuilder_SetCapturerInfo(builder, AUDIOSTREAM_SOURCE_TYPE_MIC);
These parameters have the same meaning as those in AudioCapturer.
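To get a feel for what these settings mean in practice, the data rate of the raw PCM stream can be worked out from the sampling rate, channel count, and sample width. The sketch below is only illustrative: both helper names are hypothetical and not part of OHAudio, and it assumes an integer sample format such as S16LE (2 bytes per sample).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of OHAudio): bytes of raw PCM produced per second.
// bytesPerSample is 2 for 16-bit formats such as S16LE.
inline int64_t PcmBytesPerSecond(int32_t sampleRate, int32_t channels, int32_t bytesPerSample)
{
    assert(sampleRate > 0 && channels > 0 && bytesPerSample > 0);
    return static_cast<int64_t>(sampleRate) * channels * bytesPerSample;
}

// Hypothetical helper: how many milliseconds of audio a buffer of 'lengthBytes' holds.
inline double PcmBufferDurationMs(int32_t lengthBytes, int32_t sampleRate,
                                  int32_t channels, int32_t bytesPerSample)
{
    return 1000.0 * lengthBytes /
           static_cast<double>(PcmBytesPerSecond(sampleRate, channels, bytesPerSample));
}
```

With the configuration above (48000 Hz, 2 channels, S16LE), the stream produces 192000 bytes per second, so a 3840-byte buffer would hold 20 ms of audio.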
Set Audio Callback Functions
// Custom data read function
int32_t MyOnReadData(
OH_AudioCapturer* capturer,
void* userData,
void* buffer,
int32_t length)
{
// Extract 'length' bytes of recording data from 'buffer'
return 0;
}
// Custom audio stream event function
int32_t MyOnStreamEvent(
OH_AudioCapturer* capturer,
void* userData,
OH_AudioStream_Event event)
{
// Update recorder state and UI based on the audio stream event
return 0;
}
// Custom audio interrupt event function
int32_t MyOnInterruptEvent(
OH_AudioCapturer* capturer,
void* userData,
OH_AudioInterrupt_ForceType type,
OH_AudioInterrupt_Hint hint)
{
// Update recorder state and UI based on the audio interrupt info
return 0;
}
// Custom error callback function
int32_t MyOnError(
OH_AudioCapturer* capturer,
void* userData,
OH_AudioStream_Result error)
{
// Handle the audio error based on 'error'
return 0;
}
OH_AudioCapturer_Callbacks callbacks;
// Configure callback functions
callbacks.OH_AudioCapturer_OnReadData = MyOnReadData;
callbacks.OH_AudioCapturer_OnStreamEvent = MyOnStreamEvent;
callbacks.OH_AudioCapturer_OnInterruptEvent = MyOnInterruptEvent;
callbacks.OH_AudioCapturer_OnError = MyOnError;
// Set the callback for the audio input stream
OH_AudioStreamBuilder_SetCapturerCallback(builder, callbacks, nullptr);
Configure the callback functions via OH_AudioStreamBuilder_SetCapturerCallback.
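As a rough sketch of what the body of the read callback might do, the helper below copies the captured bytes into an application-owned container. PcmSink and AppendPcm are hypothetical names, not OHAudio APIs. The sketch assumes that a pointer to a PcmSink is passed as the userData argument of OH_AudioStreamBuilder_SetCapturerCallback instead of nullptr, and that the buffer should only be treated as valid for the duration of the callback.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical container that the app would pass as 'userData' when registering callbacks.
struct PcmSink {
    std::vector<uint8_t> bytes;
};

// Hypothetical helper with the same (userData, buffer, length) shape as the read callback.
// It copies the captured bytes out so they can be processed after the callback returns.
inline int32_t AppendPcm(void* userData, const void* buffer, int32_t length)
{
    assert(length >= 0); // 'length' is a byte count
    if (userData == nullptr || buffer == nullptr || length == 0) {
        return 0; // nothing to copy
    }
    auto* sink = static_cast<PcmSink*>(userData);
    const auto* src = static_cast<const uint8_t*>(buffer);
    sink->bytes.insert(sink->bytes.end(), src, src + length);
    return 0;
}
```

A read callback could then simply forward to it, for example `return AppendPcm(userData, buffer, length);`. Growing a vector may allocate memory on the audio thread, so a preallocated ring buffer is likely a better fit for production code.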
Construct the Recording Audio Stream
OH_AudioCapturer* audioCapturer;
OH_AudioStreamBuilder_GenerateCapturer(builder, &audioCapturer);
Use the Audio Stream
- OH_AudioStream_Result OH_AudioCapturer_Start(OH_AudioCapturer* capturer): Start recording
- OH_AudioStream_Result OH_AudioCapturer_Pause(OH_AudioCapturer* capturer): Pause recording
- OH_AudioStream_Result OH_AudioCapturer_Stop(OH_AudioCapturer* capturer): Stop recording
- OH_AudioStream_Result OH_AudioCapturer_Flush(OH_AudioCapturer* capturer): Release cached data
- OH_AudioStream_Result OH_AudioCapturer_Release(OH_AudioCapturer* capturer): Release the recording instance
Release the Builder
OH_AudioStreamBuilder_Destroy(builder);
Audio Recording Best Practices
Let's implement the full audio capture process with OHAudio, as the first step toward recording to an MP3 file.
Permission Application
Audio capture requires requesting the microphone permission at runtime. First, declare the permission in module.json5:
"requestPermissions": [
{
"name": "ohos.permission.MICROPHONE",
"reason": "$string:reason",
"usedScene": {
"abilities": [
"FormAbility"
],
"when": "inuse"
}
}
],
Then request the permission at runtime:
import { abilityAccessCtrl, common, Permissions } from '@kit.AbilityKit';
import { BusinessError } from '@kit.BasicServicesKit';

function reqPermissionsFromUser(permissions: Array<Permissions>, context: common.UIAbilityContext): void {
  let atManager: abilityAccessCtrl.AtManager = abilityAccessCtrl.createAtManager();
  atManager.requestPermissionsFromUser(context, permissions).then((data) => {
    let grantStatus: Array<number> = data.authResults;
    let length: number = grantStatus.length;
    for (let i = 0; i < length; i++) {
      // 0 means granted; any other value means the permission was not granted
      if (grantStatus[i] !== 0) {
        // User denied permission; prompt and guide to settings
        return;
      }
    }
    // Permission granted successfully
  }).catch((err: BusinessError) => {
    console.error(`Failed to request permissions. Code: ${err.code}, Message: ${err.message}`);
  })
}
Call the permission request method in aboutToAppear and start recording after authorization:
const permissions: Array<Permissions> = ['ohos.permission.MICROPHONE'];
const context: common.UIAbilityContext = getContext(this) as common.UIAbilityContext;
reqPermissionsFromUser(permissions, context);
Configure C++ Project
After creating a C++ module, add the ohaudio dynamic library dependency in CMakeLists.txt:
cmake_minimum_required(VERSION 3.5.0)
project(audiorecorderdemo)
set(NATIVERENDER_ROOT_PATH ${CMAKE_CURRENT_SOURCE_DIR})
if(DEFINED PACKAGE_FIND_FILE)
include(${PACKAGE_FIND_FILE})
endif()
include_directories(${NATIVERENDER_ROOT_PATH}
${NATIVERENDER_ROOT_PATH}/include)
add_library(capture SHARED napi_init.cpp)
target_link_libraries(capture PUBLIC libace_napi.z.so)
target_link_libraries(capture PUBLIC libohaudio.so)
Configure NAPI methods:
static napi_value start(napi_env env, napi_callback_info info)
{
// Implementation omitted
return nullptr;
}
static napi_value stop(napi_env env, napi_callback_info info)
{
// Implementation omitted
return nullptr;
}
EXTERN_C_START
static napi_value Init(napi_env env, napi_value exports)
{
napi_property_descriptor desc[] = {
{ "start", nullptr, start, nullptr, nullptr, nullptr, napi_default, nullptr },
{ "stop", nullptr, stop, nullptr, nullptr, nullptr, napi_default, nullptr }
};
napi_define_properties(env, exports, sizeof(desc) / sizeof(desc[0]), desc);
return exports;
}
EXTERN_C_END
Implement Start Recording
// Custom data read function
int32_t MyOnReadData(
OH_AudioCapturer* capturer,
void* userData,
void* buffer,
int32_t length)
{
// TODO: Extract recording data from buffer
return 0;
}
// Custom audio stream event function
int32_t MyOnStreamEvent(
OH_AudioCapturer* capturer,
void* userData,
OH_AudioStream_Event event)
{
// TODO: Update state/UI based on event
return 0;
}
// Custom audio interrupt event function
int32_t MyOnInterruptEvent(
OH_AudioCapturer* capturer,
void* userData,
OH_AudioInterrupt_ForceType type,
OH_AudioInterrupt_Hint hint)
{
// TODO: Update state/UI based on interrupt info
return 0;
}
// Custom error callback function
int32_t MyOnError(
OH_AudioCapturer* capturer,
void* userData,
OH_AudioStream_Result error)
{
// TODO: Handle error based on 'error'
return 0;
}
// Keep the builder and capturer at file scope so that stop() can reach them later
static OH_AudioStreamBuilder* builder = nullptr;
static OH_AudioCapturer* audioCapturer = nullptr;
static napi_value start(napi_env env, napi_callback_info info)
{
    OH_AudioStreamBuilder_Create(&builder, AUDIOSTREAM_TYPE_CAPTURER);
    // Set audio parameters
    OH_AudioStreamBuilder_SetSamplingRate(builder, 48000);
    OH_AudioStreamBuilder_SetChannelCount(builder, 2);
    OH_AudioStreamBuilder_SetSampleFormat(builder, AUDIOSTREAM_SAMPLE_S16LE);
    OH_AudioStreamBuilder_SetEncodingType(builder, AUDIOSTREAM_ENCODING_TYPE_RAW);
    OH_AudioStreamBuilder_SetCapturerInfo(builder, AUDIOSTREAM_SOURCE_TYPE_MIC);
    OH_AudioCapturer_Callbacks callbacks;
    // Configure callbacks
    callbacks.OH_AudioCapturer_OnReadData = MyOnReadData;
    callbacks.OH_AudioCapturer_OnStreamEvent = MyOnStreamEvent;
    callbacks.OH_AudioCapturer_OnInterruptEvent = MyOnInterruptEvent;
    callbacks.OH_AudioCapturer_OnError = MyOnError;
    OH_AudioStreamBuilder_SetCapturerCallback(builder, callbacks, nullptr);
    // Construct the capturer, then start capturing; MyOnReadData receives the data
    OH_AudioStreamBuilder_GenerateCapturer(builder, &audioCapturer);
    OH_AudioCapturer_Start(audioCapturer);
    return nullptr;
}
Best Practice 1:
To avoid unexpected behavior, ensure each callback in OH_AudioCapturer_Callbacks is initialized with a custom callback or a null pointer. For example:
OH_AudioCapturer_Callbacks callbacks;
// Configure needed callbacks
callbacks.OH_AudioCapturer_OnReadData = MyOnReadData;
callbacks.OH_AudioCapturer_OnInterruptEvent = MyOnInterruptEvent;
// (Mandatory) Initialize unused callbacks with null
callbacks.OH_AudioCapturer_OnStreamEvent = nullptr;
callbacks.OH_AudioCapturer_OnError = nullptr;
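One way to make this harder to forget in C++ is to value-initialize the struct with `{}` before assigning anything, which sets every function pointer member to null. The sketch below uses a stand-in struct, DemoCallbacks, with hypothetical member names so that it compiles without the SDK. The same brace initialization should apply to any plain C struct of function pointers, but that is an assumption worth verifying against the real header.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a C callback struct such as OH_AudioCapturer_Callbacks
// (the struct and member names here are hypothetical).
struct DemoCallbacks {
    int32_t (*onReadData)(void* userData, void* buffer, int32_t length);
    int32_t (*onError)(void* userData, int32_t error);
};

inline int32_t DemoOnReadData(void* /*userData*/, void* /*buffer*/, int32_t /*length*/)
{
    return 0;
}

// Value-initialization with {} sets every member to nullptr before any assignment.
inline DemoCallbacks MakeDemoCallbacks()
{
    DemoCallbacks callbacks{};
    callbacks.onReadData = DemoOnReadData; // only the callback that is needed
    assert(callbacks.onError == nullptr);  // untouched members stay null
    return callbacks;
}
```

With this pattern, a callback member that is added to the struct later also starts out as null rather than holding an indeterminate value.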
Best Practice 2:
On devices that support low-latency mode, enable it on the audio recording builder for scenarios with strict latency requirements (e.g., voice calls) to get a better audio experience:
OH_AudioStream_LatencyMode latencyMode = AUDIOSTREAM_LATENCY_MODE_FAST;
OH_AudioStreamBuilder_SetLatencyMode(builder, latencyMode);
Audio File Processing
Process the audio data in the read callback: it can be passed to an ASR engine or written to a file directly. The next article will cover encoding to MP3 and writing to a file.
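Raw PCM written straight to a file carries no format information, so most players cannot open it. One common option, which is not something this article itself uses, is to prepend a 44-byte WAV header. The sketch below builds such a header for integer PCM. MakeWavHeader and PutLe are hypothetical helper names, and the total data size is assumed to be known when the header is written.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Write 'value' into 'out' as little-endian, independent of host byte order.
inline void PutLe(uint8_t* out, uint32_t value, int bytes)
{
    for (int i = 0; i < bytes; ++i) {
        out[i] = static_cast<uint8_t>((value >> (8 * i)) & 0xFF);
    }
}

// Hypothetical helper: build the canonical 44-byte WAV header for integer PCM data.
inline std::array<uint8_t, 44> MakeWavHeader(uint32_t sampleRate, uint16_t channels,
                                             uint16_t bitsPerSample, uint32_t dataBytes)
{
    assert(channels > 0 && bitsPerSample > 0 && bitsPerSample % 8 == 0);
    const uint16_t blockAlign = static_cast<uint16_t>(channels * (bitsPerSample / 8));
    std::array<uint8_t, 44> h{};
    std::memcpy(&h[0], "RIFF", 4);
    PutLe(&h[4], 36 + dataBytes, 4);           // RIFF chunk size
    std::memcpy(&h[8], "WAVEfmt ", 8);
    PutLe(&h[16], 16, 4);                      // fmt chunk size
    PutLe(&h[20], 1, 2);                       // format tag 1 = PCM
    PutLe(&h[22], channels, 2);
    PutLe(&h[24], sampleRate, 4);
    PutLe(&h[28], sampleRate * blockAlign, 4); // byte rate
    PutLe(&h[32], blockAlign, 2);              // bytes per frame
    PutLe(&h[34], bitsPerSample, 2);
    std::memcpy(&h[36], "data", 4);
    PutLe(&h[40], dataBytes, 4);               // size of the PCM payload
    return h;
}
```

For the stream used in this article, `MakeWavHeader(48000, 2, 16, dataBytes)` would describe the captured data. When the final size is not known up front, a typical approach is to write a placeholder header first and rewrite it once recording stops.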
Stop and Release
OH_AudioCapturer_Stop(audioCapturer);
// Release the capturer instance and its resources
OH_AudioCapturer_Release(audioCapturer);
OH_AudioStreamBuilder_Destroy(builder);
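Teardown calls like these are easy to miss on early-return or error paths. One option in C++ is to tie the destroy call to a std::unique_ptr with a custom deleter. The sketch below uses stand-in types (DemoBuilder and DemoBuilder_Destroy are hypothetical) so that it compiles without the SDK. With the real API, the deleter would presumably call OH_AudioStreamBuilder_Destroy, and a similar wrapper could call OH_AudioCapturer_Release.

```cpp
#include <cassert>
#include <memory>

// Stand-ins for an opaque C handle and its destroy function
// (hypothetical, not OHAudio types).
struct DemoBuilder {
    int* destroyCount; // lets the caller observe how often destroy ran
};

inline void DemoBuilder_Destroy(DemoBuilder* builder)
{
    assert(builder != nullptr);
    ++(*builder->destroyCount);
    delete builder;
}

// Deleter that forwards to the C-style destroy function.
struct DemoBuilderDeleter {
    void operator()(DemoBuilder* builder) const { DemoBuilder_Destroy(builder); }
};

using DemoBuilderPtr = std::unique_ptr<DemoBuilder, DemoBuilderDeleter>;

// Creates a builder whose destroy call runs when the owning pointer goes out of scope.
inline DemoBuilderPtr MakeDemoBuilder(int* destroyCount)
{
    return DemoBuilderPtr(new DemoBuilder{destroyCount});
}
```

The owner then never calls the destroy function by hand: leaving the scope, or resetting the pointer, triggers it exactly once.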
Summary
This article introduces two audio capture methods in HarmonyOS: AudioCapturer at the TS layer and OHAudio at the C++ layer, and implements real-time audio capture using the OHAudio interfaces.