kouwei qing

Posted on Dec 25, 2024 • Edited on Jan 8

HarmonyOS Next Audio and Video - Lame MP3 Encoding Implementation

#harmonyos #harmonyosnext

Background

MP3 is a widely used audio compression format, known for its efficient compression and broad compatibility. Almost all audio playback devices, mobile devices, computers, and audio software support MP3 playback, which makes it a de facto standard. For MP3's market share, this compatibility matters more than its compression performance.

However, MP3 is encumbered by copyright and licensing issues. As a result, mobile phone manufacturers generally ship only MP3 hardware decoders, not hardware encoders. The most commonly used open-source MP3 software encoder is Lame. This article uses Lame to walk through the whole process of building an MP3 software encoder, from cross-compilation to integration into an application.

Compiling Lame

There are generally three compilation methods for third-party open-source C/C++ libraries:

cmake
make
configure

Each build system needs different variables to be configured. Lame uses a configure script, and you can list its options with ./configure -h. OpenHarmony provides a cross-compilation framework called lycium: after describing a third-party library according to the template, you run the build script. The tpc_c_cplusplus project already includes a lame module in its thirdparty directory, so we can compile it directly.

Enter the lycium directory and run ./build.sh lame to start the build. When it finishes, the usr/lame directory under the lycium directory contains the compiled dynamic libraries.

Compiling on an ARM-based Mac failed. The log in tpc_c_cplusplus/thirdparty/lame/lame-3.100/armeabi-v7a-build/build.log shows:

1 error generated.
1 error generated.
1 error generated.
../../mpglib/dct64_i386.c:34:10: fatal error: 'config.h' file not found
#include <config.h>
         ^~~~~~~~~~
make[2]: *** [tabinit.lo] Error 1
make[2]: *** Waiting for unfinished jobs....
make[2]: *** [common.lo] Error 1
make[2]: *** [interface.lo] Error 1
make[2]: *** [decode_i386.lo] Error 1
make[2]: *** [layer1.lo] Error 1
1 error generated.
1 error generated.
make[2]: *** [layer2.lo] Error 1
make[2]: *** [dct64_i386.lo] Error 1
1 error generated.
make[2]: *** [layer3.lo] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2

Generation of the config.h header failed. Because Lame is built with configure, the failure is probably caused by a version mismatch in some of the build tools it depends on. After switching to an Intel computer, the compilation succeeded.

Integrating into the HarmonyOS Project

Next, integrate the compiled Lame dynamic library into the project as a prebuilt library.

First, create a native C++ project.

Then create a third_party/lame folder under the cpp directory, and copy the compiled .so files and the exported header files into it. The CMake configuration below expects the libraries under libs/<architecture> and the headers under include.

Next, modify the CMakeLists.txt file to import the pre-compiled lame dynamic library and add the link:

add_library(lame SHARED IMPORTED)  
set_target_properties(lame  
    PROPERTIES  
    IMPORTED_LOCATION ${CMAKE_CURRENT_SOURCE_DIR}/third_party/lame/libs/${OHOS_ARCH}/libmp3lame.so)  

add_library(audio_engine SHARED napi_init.cpp)  
target_link_libraries(audio_engine PUBLIC libace_napi.z.so lame)

To use the lame header files, you also need to configure the header file search path:

include_directories(${NATIVERENDER_ROOT_PATH}
                    ${NATIVERENDER_ROOT_PATH}/include
                    ${NATIVERENDER_ROOT_PATH}/third_party/lame/include
                    )

Once this is done, the Lame API can be called from the native code.

Recording MP3 Audio Files

With the encoding library integrated, the C++ side needs to expose interfaces that the ArkTS side can call. We provide three basic ones:

Create an encoder.
Encode the data.
Close the encoder.

Creating the Encoder

After creating the encoder with Lame, it is necessary to set the encoding parameters:

Input audio sampling rate
Input audio channel number
Output sampling rate
Output bit rate
Output quality

Define the initLame method and provide five parameters:

size_t argc = 5;  
napi_value args[5] = {nullptr};  
napi_get_cb_info(env, info, &argc, args, nullptr, nullptr);  
int inSamplerate;  
napi_get_value_int32(env, args[0], &inSamplerate);  

int inChannel;  
napi_get_value_int32(env, args[1], &inChannel);  

int outSamplerate;  
napi_get_value_int32(env, args[2], &outSamplerate);  

int outBitrate;  
napi_get_value_int32(env, args[3], &outBitrate);  

int quality;  
napi_get_value_int32(env, args[4], &quality);

Next, configure the encoder:

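lame = lame_init();                            // Allocate the global encoder context
lame_set_in_samplerate(lame, inSamplerate);    // Input sample rate in Hz
lame_set_num_channels(lame, inChannel);        // Number of input channels
lame_set_out_samplerate(lame, outSamplerate);  // Output sample rate in Hz
lame_set_brate(lame, outBitrate);              // Output bit rate in kbps
lame_set_quality(lame, quality);               // 0 = best (slowest) ... 9 = worst (fastest)
lame_init_params(lame);                        // Apply the settings; a negative return value means failure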
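Because the five values arrive straight from ArkTS, it may be worth checking them before they are handed to Lame. The following is a minimal sketch, not part of the original project: LameParams and validateLameParams are hypothetical names, and the accepted ranges (the standard MP3 output sample rates, 1 or 2 channels, 8 to 320 kbps, quality 0 to 9) are assumptions based on the MP3 format rather than checks taken from Lame itself.

```cpp
#include <algorithm>
#include <array>
#include <string>

// Hypothetical container for the five values passed to initLame.
struct LameParams {
    int inSampleRate;   // Hz
    int inChannels;     // 1 = mono, 2 = stereo
    int outSampleRate;  // Hz
    int outBitrate;     // kbps, the unit lame_set_brate expects
    int quality;        // 0 (best, slowest) .. 9 (worst, fastest)
};

// Returns an empty string when the values look plausible, otherwise a
// short description of the first problem found.
inline std::string validateLameParams(const LameParams& p) {
    // Assumed set of valid output rates (MPEG-1, MPEG-2 and MPEG-2.5 Layer III).
    static const std::array<int, 9> kMp3Rates = {
        8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000};

    if (p.inSampleRate <= 0) return "input sample rate must be positive";
    if (p.inChannels != 1 && p.inChannels != 2) return "channels must be 1 or 2";
    if (std::find(kMp3Rates.begin(), kMp3Rates.end(), p.outSampleRate) == kMp3Rates.end())
        return "unsupported output sample rate";
    if (p.outBitrate < 8 || p.outBitrate > 320) return "bitrate out of range (kbps)";
    if (p.quality < 0 || p.quality > 9) return "quality must be in 0..9";
    return "";
}
```

In initLame, a non-empty result could be turned into an error for the ArkTS caller instead of continuing to lame_init_params.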

Encoding Audio Data

The prototype of the Lame encoding function is as follows:

/*
 * input pcm data, output (maybe) mp3 frames.
 * This routine handles all buffering, resampling and filtering for you.
 *
 * return code     number of bytes output in mp3buf. Can be 0
 *                 -1:  mp3buf was too small
 *                 -2:  malloc() problem
 *                 -3:  lame_init_params() not called
 *                 -4:  psycho acoustic problems
 *
 * The required mp3buf_size can be computed from num_samples,
 * samplerate and encoding rate, but here is a worst case estimate:
 *
 * mp3buf_size in bytes = 1.25*num_samples + 7200
 *
 * I think a tighter bound could be:  (mt, March 2000)
 * MPEG1:
 *    num_samples*(bitrate/8)/samplerate + 4*1152*(bitrate/8)/samplerate + 512
 * MPEG2:
 *    num_samples*(bitrate/8)/samplerate + 4*576*(bitrate/8)/samplerate + 256
 *
 * but test first if you use that!
 *
 * set mp3buf_size = 0 and LAME will not check if mp3buf_size is
 * large enough.
 *
 * NOTE:
 * if gfp->num_channels=2, but gfp->mode = 3 (mono), the L & R channels
 * will be averaged into the L channel before encoding only the L channel
 * This will overwrite the data in buffer_l[] and buffer_r[].
 *
 */
int CDECL lame_encode_buffer (  
        lame_global_flags*  gfp,           /* global context handle         */  
        const short int     buffer_l [],   /* PCM data for left channel     */  
        const short int     buffer_r [],   /* PCM data for right channel    */  
        const int           nsamples,      /* number of samples per channel */  
        unsigned char*      mp3buf,        /* pointer to encoded MP3 stream */  
        const int           mp3buf_size ); /* number of valid octets in this  
                                              stream                        */

Besides the encoder context, the function takes the left-channel and right-channel PCM data, the number of samples per channel, the output buffer for the encoded data, and the size of that buffer.
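The header comment above gives a worst-case estimate for the output buffer, 1.25 * num_samples + 7200 bytes. As a small sketch, the helper below (worstCaseMp3BufferSize is a hypothetical name, not a Lame function) computes that bound in integer arithmetic, rounding the 1.25 factor up so the result is never smaller than the formula.

```cpp
#include <cstddef>

// Worst-case output size from the lame.h comment:
//   mp3buf_size in bytes = 1.25 * num_samples + 7200
// numSamples is the number of samples per channel passed to the encoder.
inline std::size_t worstCaseMp3BufferSize(std::size_t numSamples) {
    // 1.25 * n is computed as n + ceil(n / 4) to stay in integers.
    return numSamples + (numSamples + 3) / 4 + 7200;
}
```

For example, a chunk of 1152 samples per channel needs an output buffer of 8640 bytes under this estimate. The ArkTS side could allocate its output typed array with this size before calling the encode interface.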

The NAPI wrapper takes three buffers (left channel, right channel, and MP3 output) plus the number of samples per channel:

static napi_value NAPI_Global_encodeLame(napi_env env, napi_callback_info info)
{
    size_t argc = 4;
    napi_value args[4] = {nullptr};
    napi_get_cb_info(env, info, &argc, args, nullptr, nullptr);

    napi_typedarray_type type;  // Element type of the typed array
    size_t byte_offset;         // Offset of the typed array within its ArrayBuffer
    size_t length;              // Number of elements in the typed array

    // args[0]: left-channel PCM. The data pointer is taken from the backing
    // ArrayBuffer, so the typed array is assumed to start at byte offset 0.
    napi_value left_input_buffer;
    napi_get_typedarray_info(env, args[0], &type, &length, NULL, &left_input_buffer, &byte_offset);
    void* leftBuffer;
    size_t leftLength;
    napi_get_arraybuffer_info(env, left_input_buffer, &leftBuffer, &leftLength);

    // args[1]: right-channel PCM
    napi_value right_input_buffer;
    napi_get_typedarray_info(env, args[1], &type, &length, NULL, &right_input_buffer, &byte_offset);
    void* rightBuffer;
    size_t rightLength;
    napi_get_arraybuffer_info(env, right_input_buffer, &rightBuffer, &rightLength);

    // args[2]: number of samples per channel
    int samples;
    napi_get_value_int32(env, args[2], &samples);

    // args[3]: output buffer that receives the encoded MP3 data
    napi_value mp3_output_buffer;
    napi_get_typedarray_info(env, args[3], &type, &length, NULL, &mp3_output_buffer, &byte_offset);
    void* mp3Buffer;
    size_t mp3Length;
    napi_get_arraybuffer_info(env, mp3_output_buffer, &mp3Buffer, &mp3Length);

    // Returns the number of bytes written to mp3Buffer, or a negative error code
    int result = lame_encode_buffer(lame, (short int*)leftBuffer, (short int*)rightBuffer,
                                    samples, (unsigned char*)mp3Buffer, (int)mp3Length);
    napi_value result_value;
    napi_create_int32(env, result, &result_value);
    return result_value;
}
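The integer returned to ArkTS is either a byte count or one of the negative error codes listed in the lame.h comment. As an illustration only, a helper such as the hypothetical describeEncodeResult below could translate that value into a readable message for logging. The messages are copied from the header comment; the function itself is not part of Lame or of the original project.

```cpp
#include <string>

// Maps the return value of lame_encode_buffer to a readable message,
// following the return codes listed in the lame.h comment.
inline std::string describeEncodeResult(int result) {
    if (result >= 0) return "ok: " + std::to_string(result) + " bytes written";
    switch (result) {
        case -1: return "mp3buf was too small";
        case -2: return "malloc() problem";
        case -3: return "lame_init_params() not called";
        case -4: return "psycho acoustic problems";
        default: return "unknown error " + std::to_string(result);
    }
}
```

A result of 0 is not an error: the encoder buffers input internally and may not emit a frame for every call.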

The main methods used are napi_get_typedarray_info and napi_get_arraybuffer_info:

On the native C++ side, the function receives typed arrays from ArkTS. napi_get_typedarray_info returns the properties of a typed array, including the ArrayBuffer that backs it, and napi_get_arraybuffer_info then returns the data pointer and byte length of that ArrayBuffer.
In the opposite direction, when the native side returns an array to ArkTS, it creates an ArrayBuffer with napi_create_arraybuffer, wraps it in a typed array with napi_create_typedarray, fills the buffer with data, and returns the typed array. The encodeLame function above does not need this path, because the output buffer is supplied by ArkTS.
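The encode interface takes the left and right channels as two separate buffers. If the recorder delivers interleaved 16-bit stereo PCM (an assumption, since the article does not show how the audio is captured), the data has to be split first. The sketch below shows one way to do that; deinterleaveStereo is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Splits interleaved stereo PCM (L0 R0 L1 R1 ...) into separate channel
// buffers. frames is the number of samples per channel.
inline void deinterleaveStereo(const std::int16_t* interleaved, std::size_t frames,
                               std::vector<std::int16_t>& left,
                               std::vector<std::int16_t>& right) {
    left.resize(frames);
    right.resize(frames);
    for (std::size_t i = 0; i < frames; ++i) {
        left[i] = interleaved[2 * i];       // even positions: left channel
        right[i] = interleaved[2 * i + 1];  // odd positions: right channel
    }
}
```

Lame also declares lame_encode_buffer_interleaved, which accepts interleaved input directly, so the split is only needed when the two-buffer interface is kept.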

Closing the Encoder

Call lame_close to close the encoder. Before closing, call lame_encode_flush to retrieve the data still buffered inside the encoder, so that the end of the recording is not lost:

static napi_value NAPI_Global_flushLame(napi_env env, napi_callback_info info)
{
    size_t argc = 1;
    napi_value args[1] = {nullptr};
    napi_get_cb_info(env, info, &argc, args, nullptr, nullptr);

    napi_typedarray_type type;  // Element type of the typed array
    napi_value output_buffer;   // ArrayBuffer backing the output typed array
    size_t byte_offset;         // Offset of the typed array within its ArrayBuffer
    size_t length;              // Number of elements in the typed array
    napi_get_typedarray_info(env, args[0], &type, &length, NULL, &output_buffer, &byte_offset);
    void* outBuffer;
    size_t outLength;
    napi_get_arraybuffer_info(env, output_buffer, &outBuffer, &outLength);

    // Writes the remaining MP3 frames to outBuffer and returns the byte count
    int result = lame_encode_flush(lame, (unsigned char*)outBuffer, (int)outLength);
    napi_value result_value;
    napi_create_int32(env, result, &result_value);
    return result_value;
}
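Both the encode and the flush interface return the number of valid bytes in the output buffer, so the caller has to collect exactly that many bytes after each call. The sketch below shows the bookkeeping on a plain byte vector; appendEncodedChunk is a hypothetical helper, and collecting the stream in memory is an assumption made to keep the example self-contained (the real application would more likely write to a file).

```cpp
#include <vector>

// Appends one encoder result to the collected MP3 stream. result is the
// value returned by the encode or flush call: the number of valid bytes in
// mp3buf, 0 when nothing was produced, or a negative error code.
// Returns false when the encoder reported an error.
inline bool appendEncodedChunk(std::vector<unsigned char>& out,
                               const unsigned char* mp3buf, int result) {
    if (result < 0) return false;                    // encoder error, nothing appended
    out.insert(out.end(), mp3buf, mp3buf + result);  // no-op when result is 0
    return true;
}
```

The same helper applies to the flush result. According to the lame.h comment on lame_encode_flush, the flush buffer should be at least 7200 bytes long.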

Problems Encountered

After linking the mp3lame library and calling a native method, the error "Cannot read property encodeLame of undefined" is reported. The compiled mp3lame .so file carries a version number in its file name, and copying it into the project with the version number removed entirely causes this problem, most likely because the library name recorded at link time includes the major version, so the loader cannot find a file without it. Keeping the major version number in the file name solves the problem.
Thread issues. Encoding is time-consuming and should run on its own thread. Create the thread and the data buffers on the C++ side, and let the ArkTS side interact with them.
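The threading point above can be sketched with a small producer-consumer queue. This is one possible arrangement under stated assumptions, not the article's implementation: EncoderWorker is a hypothetical class, and the encode callback stands in for the real call into Lame.

```cpp
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical worker that moves encoding off the caller's thread.
class EncoderWorker {
public:
    using Chunk = std::vector<std::int16_t>;

    explicit EncoderWorker(std::function<void(const Chunk&)> encode)
        : encode_(std::move(encode)), thread_([this] { run(); }) {}

    ~EncoderWorker() { stop(); }

    // Called from the recording thread: queue one PCM chunk and return.
    void push(Chunk chunk) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(chunk));
        }
        cv_.notify_one();
    }

    // Lets the worker drain the remaining chunks, then joins it.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_one();
        if (thread_.joinable()) thread_.join();
    }

private:
    void run() {
        for (;;) {
            Chunk chunk;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty()) return;  // stopping and fully drained
                chunk = std::move(queue_.front());
                queue_.pop_front();
            }
            encode_(chunk);  // runs outside the lock
        }
    }

    std::function<void(const Chunk&)> encode_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::deque<Chunk> queue_;
    bool stopping_ = false;
    std::thread thread_;  // declared last so the other members exist first
};
```

In the real project the callback would call lame_encode_buffer and store the output. NAPI calls themselves must stay on the ArkTS thread, so the worker should only touch plain C++ data.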

Summary

Due to copyright issues, Android and iOS phones do not directly provide MP3 hardware encoders, so MP3 recording usually relies on the third-party Lame library. This article walked through the whole process of implementing MP3 software encoding on HarmonyOS: compiling the third-party library, integrating it into the project, and wrapping and calling it. Although HarmonyOS provides a hardware encoding method for MP3, Lame serves here as a worked example of how to integrate a third-party C/C++ library.
