kouwei qing
HarmonyOS Next Video Compression Coding Principles and Best Practices


Background

Previous articles showed that HarmonyOS Next allows apps to acquire images and videos from the system album and camera tools without special permissions. After acquiring these assets, we generally need to compress them to reduce bandwidth and storage consumption. For example, a 10-second video shot by the Huawei Mate60 Pro's system camera produces a file of about 38MB, with a video bitrate as high as 30Mbps—significantly higher than needed in many scenarios with modest quality requirements.

Image compression methods have been discussed previously. This article focuses on video compression coding based on the HarmonyOS Next platform.

Principles of Video Compression

Related Concepts

What is a video?

A video is a sequence of continuous still images. A frame rate of at least about 18 frames per second (fps) is generally needed for playback to appear smooth.

What is video compression?

Video compression uses digital signal processing techniques and algorithms to reduce video data size, minimizing storage space and transmission bandwidth while preserving as much of the original quality as possible.

Why is compression possible?

Because images and video sequences contain redundant information, compression aims to remove this redundancy:

  • Intra-frame redundancy: Large areas of uniform color or blank spaces in an image can be stored with fewer bits.
  • Inter-frame redundancy: Adjacent frames in a video sequence (e.g., a person running) often differ only slightly (e.g., leg movements).

Key compression techniques include:

  1. Intra-frame compression: Encodes individual frames without considering adjacent frames (e.g., DCT, Discrete Cosine Transform).
  2. Inter-frame compression: Utilizes correlation between adjacent frames (e.g., motion estimation, motion compensation).
  3. Transform coding: Converts spatial-domain image information to the frequency domain for efficient encoding.
  4. Quantization: Reduces data volume by quantizing transformed coefficients.

What is compression transcoding?

The degree of video compression is measured by bitrate, the number of bits transmitted per unit time (e.g., kbps, Mbps). Higher bitrates generally allow higher resolution, smoother motion, and more accurate color reproduction.

For example, a 1280×720 video at 18 fps in RGBA format (4 bytes per pixel) would require:

  • 1 second uncompressed: 1280 × 720 × 4 bytes = 3,686,400 bytes per frame; × 18 frames × 8 bits/byte = 530,841,600 bits ≈ 530 Mbps. In contrast, compressed streams at this kind of resolution typically run at only a few Mbps, demonstrating the roughly 100× compression achievable through coding.

Videos captured by mobile phone cameras are often large, but we may need to reduce their size for scenarios where full quality isn’t necessary—hence the focus on video compression coding.

Introduction to Audio-Video Processing Flow

The typical audio-video processing flow covers capture, encoding, muxing, transmission/storage, demuxing, decoding, and rendering.

The main principle of video compression coding is to:

  1. Decode and demultiplex the video into raw data.
  2. Optionally process the raw data.
  3. Re-encode it with a new bitrate to reduce file size.

Introduction to HarmonyOS Next Audio-Video Codec APIs

Third-party libraries like FFmpeg can compress videos, but they rely on software codecs (CPU processing). Mobile phones often provide specialized hardware for audio-video processing, accessible via system-provided hardware codec interfaces. This section introduces HarmonyOS Next’s hardware audio-video codec APIs.

Software vs. Hardware Codecs:

  • Software codecs: Run on CPU, offering flexible iteration, better compatibility, and easier protocol/format extension.
  • Hardware codecs: Operate on dedicated hardware, delivering better power efficiency, lower latency, higher throughput, and reduced CPU load.

HarmonyOS Next provides audio-video codec interfaces only via C APIs, covering demultiplexing, decoding, encoding, and muxing.

Demultiplexing

Demultiplexing operates on files (local or network), parsing them to extract DRM information and media samples (audio, video, subtitles).

HarmonyOS Next supports two data input types:

  • Remote connections (HTTP protocol, requires ohos.permission.INTERNET).
  • File descriptors (FD, requires ohos.permission.READ_MEDIA for local files).

Dynamic libraries for demultiplexing:

  • libnative_media_codecbase.so: Codec base library
  • libnative_media_avdemuxer.so: Demultiplexer library
  • libnative_media_avsource.so: Media source library
  • libnative_media_core.so: Media core library

Demultiplexing involves structured file reading (IO operations) with minimal CPU overhead.

1. Include Header Files

#include <multimedia/player_framework/native_avdemuxer.h>
#include <multimedia/player_framework/native_avsource.h>
#include <multimedia/player_framework/native_avcodec_base.h>
#include <multimedia/player_framework/native_avformat.h>
#include <multimedia/player_framework/native_avbuffer.h>
#include <fcntl.h>
#include <sys/stat.h>

2. Create a Resource Management Object

// Create a source object for an FD resource file (offset=0, size=fileSize is recommended for completeness)
OH_AVSource *source = OH_AVSource_CreateWithFD(fd, 0, fileSize);
if (source == nullptr) {
   printf("create source failed");
   return;
}

For network resources:

// Create a source object for a URI resource (optional)
OH_AVSource *source = OH_AVSource_CreateWithURI(uri);

For custom data sources (requires implementing AVSourceReadAt):

g_filePath = filePath;
OH_AVDataSource dataSource = {fileSize, AVSourceReadAt};
OH_AVSource *source = OH_AVSource_CreateWithDataSource(&dataSource);

3. Create a Demultiplexer Instance

// Create a demuxer for the source object
OH_AVDemuxer *demuxer = OH_AVDemuxer_CreateWithSource(source);
if (demuxer == nullptr) {
   printf("create demuxer failed");
   return;
}

4. Get the Number of File Tracks

// Get the source format to obtain the track count
OH_AVFormat *sourceFormat = OH_AVSource_GetSourceFormat(source);
if (sourceFormat == nullptr) {
   printf("get source format failed");
   return;
}
int32_t trackCount = 0;
if (!OH_AVFormat_GetIntValue(sourceFormat, OH_MD_KEY_TRACK_COUNT, &trackCount)) {
   printf("get track count from source format failed");
   return;
}
OH_AVFormat_Destroy(sourceFormat);

5. Get Track Indices and Information

uint32_t audioTrackIndex = 0;
uint32_t videoTrackIndex = 0;
int32_t w = 0, h = 0, trackType;
for (uint32_t index = 0; index < static_cast<uint32_t>(trackCount); index++) {
   OH_AVFormat *trackFormat = OH_AVSource_GetTrackFormat(source, index);
   if (trackFormat == nullptr) {
      printf("get track format failed");
      return;
   }
   if (!OH_AVFormat_GetIntValue(trackFormat, OH_MD_KEY_TRACK_TYPE, &trackType)) {
      printf("get track type from track format failed");
      return;
   }
   // Classify tracks as audio or video
   if (static_cast<OH_MediaType>(trackType) == OH_MediaType::MEDIA_TYPE_AUD) {
      audioTrackIndex = index;
   } else {
      videoTrackIndex = index;
      // Get video dimensions
      if (!OH_AVFormat_GetIntValue(trackFormat, OH_MD_KEY_WIDTH, &w) ||
          !OH_AVFormat_GetIntValue(trackFormat, OH_MD_KEY_HEIGHT, &h)) {
         printf("get track dimensions failed");
         return;
      }
   }
   OH_AVFormat_Destroy(trackFormat);
}

6. Select Tracks for the Demuxer

if (OH_AVDemuxer_SelectTrackByID(demuxer, audioTrackIndex) != AV_ERR_OK) {
   printf("select audio track failed: %d", audioTrackIndex);
   return;
}
if (OH_AVDemuxer_SelectTrackByID(demuxer, videoTrackIndex) != AV_ERR_OK) {
   printf("select video track failed: %d", videoTrackIndex);
   return;
}

Use OH_AVDemuxer_UnselectTrackByID to deselect tracks when done.

7. Start Demultiplexing and Read Samples

// Create a buffer large enough for one YUV420 frame (w × h × 3/2 bytes)
OH_AVBuffer *buffer = OH_AVBuffer_Create((w * h * 3) >> 1);
if (buffer == nullptr) {
   printf("build buffer failed");
   return;
}
OH_AVCodecBufferAttr info;
bool videoIsEnd = false, audioIsEnd = false;
int32_t ret;
while (!audioIsEnd || !videoIsEnd) {
   // Read audio samples
   if (!audioIsEnd) {
      ret = OH_AVDemuxer_ReadSampleBuffer(demuxer, audioTrackIndex, buffer);
      if (ret == AV_ERR_OK) {
         OH_AVBuffer_GetBufferAttr(buffer, &info);
         printf("audio info.size: %d\n", info.size);
         if (info.flags == OH_AVCodecBufferFlags::AVCODEC_BUFFER_FLAGS_EOS) {
            audioIsEnd = true;
         }
      }
   }
   // Read video samples
   if (!videoIsEnd) {
      ret = OH_AVDemuxer_ReadSampleBuffer(demuxer, videoTrackIndex, buffer);
      if (ret == AV_ERR_OK) {
         OH_AVBuffer_GetBufferAttr(buffer, &info);
         printf("video info.size: %d\n", info.size);
         if (info.flags == OH_AVCodecBufferFlags::AVCODEC_BUFFER_FLAGS_EOS) {
            videoIsEnd = true;
         }
      }
   }
}
OH_AVBuffer_Destroy(buffer);

Use OH_AVDemuxer_SeekToTime to seek to a specific time (like scrubbing in a player).

8. Destroy Demultiplexer Resources

OH_AVSource_Destroy(source);
OH_AVDemuxer_Destroy(demuxer);

Decoding

HarmonyOS Next supports AVC (H.264) and HEVC (H.265) for both software and hardware decoding.

Decoded data can be output via:

  • Surface: Uses OHNativeWindow for data transfer (e.g., to XComponent).
  • Buffer: Outputs decoded data via shared memory.

Like Android’s MediaCodec, HarmonyOS Next decoders operate as state machines with six main states (Initialized, Configured, Prepared, Executing, Released, Error).

State transitions:

  • Creation or OH_VideoDecoder_Reset → Initialized
  • OH_VideoDecoder_Configure → Configured
  • OH_VideoDecoder_Prepare → Prepared
  • OH_VideoDecoder_Start → Executing
  • OH_VideoDecoder_Destroy → Released

Refer to the flow chart in the official documentation for the interaction process.

Dynamic libraries for codecs:

  • libnative_media_codecbase.so: Codec base library
  • libnative_media_core.so: Media processing core
  • libnative_media_venc.so: Video encoding
  • libnative_media_vdec.so: Video decoding

For specific interfaces, see the official example:

https://developer.huawei.com/consumer/cn/doc/harmonyos-guides-V5/video-decoding-V5

Decoders accept buffers from demuxers and output YUV buffers or surface data. Use OH_VideoDecoder_SetSurface(videoDec, window) to set a surface for direct rendering (reducing data copy overhead). Data writing is triggered via callback functions registered with OH_VideoDecoder_RegisterCallback:

// Error callback implementation
static void OnError(OH_AVCodec *codec, int32_t errorCode, void *userData)
{
    (void)codec;
    (void)errorCode;
    (void)userData;
}

// Stream change callback implementation
// (width, height, widthStride, and heightStride are assumed to be
// file-scope int32_t variables defined elsewhere in the sample)
static void OnStreamChanged(OH_AVCodec *codec, OH_AVFormat *format, void *userData)
{
    (void)codec;
    (void)userData;
    OH_AVFormat_GetIntValue(format, OH_MD_KEY_VIDEO_PIC_WIDTH, &width);
    OH_AVFormat_GetIntValue(format, OH_MD_KEY_VIDEO_PIC_HEIGHT, &height);
    OH_AVFormat_GetIntValue(format, OH_MD_KEY_VIDEO_STRIDE, &widthStride);
    OH_AVFormat_GetIntValue(format, OH_MD_KEY_VIDEO_SLICE_HEIGHT, &heightStride);
}

// Input buffer request callback implementation
static void OnNeedInputBuffer(OH_AVCodec *codec, uint32_t index, OH_AVBuffer *buffer, void *userData)
{
    // Process input buffer index and data
}

// New output buffer callback implementation (buffer is null in Surface mode)
static void OnNewOutputBuffer(OH_AVCodec *codec, uint32_t index, OH_AVBuffer *buffer, void *userData)
{
    // Process output buffer index and data
}

// Register asynchronous callbacks
OH_AVCodecCallback cb = {&OnError, &OnStreamChanged, &OnNeedInputBuffer, &OnNewOutputBuffer};
int32_t ret = OH_VideoDecoder_RegisterCallback(videoDec, cb, NULL);
if (ret != AV_ERR_OK) {
    // Error handling
}

Software vs. Hardware Decoding Differences:

  • Software decoding currently supports only H.264 (MIME type OH_AVCODEC_MIMETYPE_VIDEO_AVC).
  • Hardware decoding supports both H.264 (OH_AVCODEC_MIMETYPE_VIDEO_AVC) and H.265 (OH_AVCODEC_MIMETYPE_VIDEO_HEVC).

Encoding

Video encoding is the reverse of decoding; HarmonyOS Next currently supports only hardware encoding, for H.264 and H.265.

Encoders accept input via Surface or Buffer modes (e.g., screen rendering or camera capture data).

Like decoders, encoders operate as state machines with the same six-state life cycle.

Refer to the flow chart in the official documentation for the encoding process.

For specific interfaces, see:

https://developer.huawei.com/consumer/cn/doc/harmonyos-guides-V5/video-encoding-V5

To compress videos, focus on encoder configuration:


// Configure encoder parameters
double frameRate = 30.0;
bool rangeFlag = false;
int32_t primary = static_cast<int32_t>(OH_ColorPrimary::COLOR_PRIMARY_BT709);
int32_t transfer = static_cast<int32_t>(OH_TransferCharacteristic::TRANSFER_CHARACTERISTIC_BT709);
int32_t matrix = static_cast<int32_t>(OH_MatrixCoefficient::MATRIX_COEFFICIENT_IDENTITY);
int32_t profile = static_cast<int32_t>(OH_AVCProfile::AVC_PROFILE_BASELINE);
int32_t rateMode = static_cast<int32_t>(OH_VideoEncodeBitrateMode::CBR);
int32_t iFrameInterval = 23000;
int64_t bitRate = 3000000;
int32_t quality = 0;

// Create and configure the format
OH_AVFormat *format = OH_AVFormat_Create();
OH_AVFormat_SetIntValue(format, OH_MD_KEY_WIDTH, width);       // Required
OH_AVFormat_SetIntValue(format, OH_MD_KEY_HEIGHT, height);     // Required
OH_AVFormat_SetIntValue(format, OH_MD_KEY_PIXEL_FORMAT, DEFAULT_PIXELFORMAT); // Required

OH_AVFormat_SetDoubleValue(format, OH_MD_KEY_FRAME_RATE, frameRate);
OH_AVFormat_SetIntValue(format, OH_MD_KEY_RANGE_FLAG, rangeFlag);
OH_AVFormat_SetIntValue(format, OH_MD_KEY_COLOR_PRIMARIES, primary);
OH_AVFormat_SetIntValue(format, OH_MD_KEY_TRANSFER_CHARACTERISTICS, transfer);
OH_AVFormat_SetIntValue(format, OH_MD_KEY_MATRIX_COEFFICIENTS, matrix);
OH_AVFormat_SetIntValue(format, OH_MD_KEY_I_FRAME_INTERVAL, iFrameInterval);
OH_AVFormat_SetIntValue(format, OH_MD_KEY_PROFILE, profile);

// Set bitrate based on rate mode
if (rateMode == static_cast<int32_t>(OH_VideoEncodeBitrateMode::CQ)) {
    OH_AVFormat_SetIntValue(format, OH_MD_KEY_QUALITY, quality);
} else if (rateMode == static_cast<int32_t>(OH_VideoEncodeBitrateMode::CBR) ||
           rateMode == static_cast<int32_t>(OH_VideoEncodeBitrateMode::VBR)) {
    // (the original snippet was truncated here; this branch follows the
    // pattern in the official encoding guide)
    OH_AVFormat_SetLongValue(format, OH_MD_KEY_BITRATE, bitRate);
}
OH_AVFormat_SetIntValue(format, OH_MD_KEY_VIDEO_ENCODE_BITRATE_MODE, rateMode);
