DEV Community

SameX
SameX

Posted on

HarmonyOS Next Array Deep Exploration: Array and VArray Memory Model

In HarmonyOS Next development, arrays, as commonly used data structures, their performance and memory management are crucial to the overall performance of the application.Cangjie Language provides reference type Array<T> and value type VArray<T, $N>. In-depth understanding of their memory model differences is very important for optimizing code performance and resource utilization.As a senior technical expert, I will conduct in-depth analysis of these two array types based on practical experience.

Chapter 1: Type System

Array<T> is a reference type array, and its layout in memory is to allocate a continuous piece of memory space on the heap to store references to elements.The actual data for each element is stored elsewhere in the heap, and the reference in Array<T> points to this data.This means that when Array<T> is passed or assigned to other variables as an argument, only the reference is passed, not the data itself.For example:

let arr1: Array<Int64> = [1, 2, 3]
let arr2 = arr1
arr2[0] = 4
println(arr1[0]) // Output 4, because arr1 and arr2 point to the same array
Enter fullscreen mode Exit fullscreen mode

From the perspective of memory layout, assuming that array elements 1, 2, and 3 are stored at the heap address 0x1000, 0x1008, and 0x1010, arr1 stores references to 0x1000 on the stack.When arr2 = arr1, arr2 also points to 0x1000, so modification of the arr2 element will affect arr1.

VArray<T, $N> is an array of value types, and its memory layout is more compact.It allocates a continuous piece of memory space on the stack, directly storing the value of the element.$N represents the length of the array, which is determined at compile time.For example:

var vArr: VArray<Int64, $3> = [1, 2, 3]
Enter fullscreen mode Exit fullscreen mode

At this time, vArr allocates a continuous memory space on the stack that can accommodate 3 Int64 type values, directly storing 1, 2, and 3.This memory layout allows VArray<T, $N> to copy the value when passing and assigning values, rather than passing the reference.

Chapter 2: Performance comparison

To understand the performance differences between Array<T> and VArray<T, $N>, benchmarks were conducted with 100,000 element accesses.The test code is as follows:

import std.time.*

func testArrayAccess() {
    let startTime = getCurrentTime()
    let arr: Array<Int64> = Array(100000, item: 0)
    for (i in 0..100000) {
        let _ = arr[i]
    }
    let endTime = getCurrentTime()
    let elapsedTime = endTime - startTime
println("Array access takes 100,000 times: \(elapsedTime) ms")
}

func testVArrayAccess() {
    let startTime = getCurrentTime()
    var vArr: VArray<Int64, $100000> = VArray(item: 0)
    for (i in 0..100000) {
        let _ = vArr[i]
    }
    let endTime = getCurrentTime()
    let elapsedTime = endTime - startTime
println("VArray access takes 100,000 times: \(elapsedTime) ms")
}

testArrayAccess()
testVArrayAccess()
Enter fullscreen mode Exit fullscreen mode

Normally, since VArray<T, $N> directly accesses elements on the stack, avoiding the overhead of heap memory lookup and pointer indirect access, the access speed will be faster than Array<T> in small-scale data access scenarios.However, as the array size increases, the value copying feature of VArray<T, $N> will bring greater performance overhead, such as when passing parameters or assigning values.When Array<T> is used to process large-scale data, it has advantages in data sharing and delivery due to the characteristics of reference delivery.

Chapter 3: C Interoperability

In the interoperability scenario with C language, VArray<T, $N> provides the possibility to implement a zero-copy FFI (Foreign Function Interface) interface.Since VArray<T, $N> stores data continuously on the stack, similar to the memory layout of arrays in C language, you can directly pass the memory address of VArray<T, $N> to the C function to avoid data copying.

Suppose there is a function in C language that calculates the sum of array elements:

#include <stdint.h>

int64_t sumArray(int64_t* arr, int64_t size) {
    int64_t sum = 0;
    for (int64_t i = 0; i < size; i++) {
        sum += arr[i];
    }
    return sum;
}
Enter fullscreen mode Exit fullscreen mode

In Cangjie language, you can call it like this:

import "C"

func callCSumFunction() {
    var vArr: VArray<Int64, $5> = [1, 2, 3, 4, 5]
    let sum = C.sumArray(&vArr[0], vArr.size)
println("Sum of array calculated by C function: \(sum)")
}

callCSumFunction()
Enter fullscreen mode Exit fullscreen mode

In this way, the first address of VArray<T, $N> is directly passed to the C function, which realizes efficient C interoperability, reduces the performance loss caused by data copying, and improves the efficiency of cross-language calls.

A deep understanding of the memory model, performance differences and C interoperability characteristics of Array<T> and VArray<T, $N> can help developers select appropriate array types based on specific scenarios in HarmonyOS Next development, optimize code performance, and improve the overall quality of their applications.

Top comments (0)