DEV Community: Leesoo Ahn

The Anatomy of Barriers

Leesoo Ahn — Mon, 05 May 2025 10:08:31 +0000

I've been working on migrating an Arm-based product to a different architecture. Throughout the process, I came across some lines of code with barriers. The challenge was that I couldn't fully understand or modify them properly without knowing exactly what those barriers were.

In this post, we'll take a closer look at the two major types of barriers you often encounter: compiler barriers and memory barriers.

Barriers?

A barrier (also known as a fence) is a mechanism that prevents memory operations from being reordered by compilers or CPUs. Modern compilers and processors often reorder instructions and execute them out of order to improve performance, which can lead to unexpected behavior in concurrent or low-level programming. Barriers enforce ordering by ensuring that certain memory operations are completed before others begin.

Compiler Barriers

A compiler barrier is an instruction or directive that tells the compiler:

Do not move memory operations across this point. Preserve the order of instructions as written.

It is important to understand:

A compiler barrier only affects the compiler's optimization passes.
It does not directly affect the CPU's execution or memory ordering. In other words, a compiler barrier controls the compiler, not the hardware.

Modern compilers are insanely aggressive. They assume that reordering memory accesses is fine as long as the program's observable behavior within a single thread (the as-if rule) doesn't change.

However, preserving the written order is a critical correctness requirement if you're writing:

Lock-free data structures
Spinlocks or mutexes
Hardware drivers
IPC code

Even a simple-looking optimization can break things:

*ptr = 1;
flag = 1;

If flag signals to another thread that *ptr is ready, but the compiler reorders these two stores, your system may crash or behave unpredictably, because the other thread can see flag set before *ptr has been set to one.

You need a compiler barrier in that case to guarantee the order you wrote is the order that gets emitted in the machine code:

*ptr = 1;
asm volatile("" ::: "memory");
flag = 1;

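The empty asm statement with a "memory" clobber tells the compiler not to move memory accesses across it. It emits no instruction, so it does not stop the CPU itself from reordering the two stores; that is the job of the memory barriers described next.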
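
The asm statement above is GCC/Clang syntax. A roughly equivalent spelling, sketched here under the assumption of a C11 compiler, is atomic_signal_fence from <stdatomic.h>, which constrains the compiler without emitting a CPU barrier. The names payload, ready_flag, and publish are made up for illustration.

```c
#include <stdatomic.h>

/* Hypothetical shared variables, named for illustration only. */
static int payload;
static volatile int ready_flag;

/* Store the payload, then raise the flag, with a compiler-only
 * fence in between. No CPU barrier instruction is emitted. */
void publish(int value)
{
    payload = value;
    atomic_signal_fence(memory_order_seq_cst);
    ready_flag = 1;
}

/* Small accessors so the state can be inspected from outside. */
int payload_value(void) { return payload; }
int flag_value(void) { return ready_flag; }
```

As with the asm version, this only keeps the compiler from swapping the stores; on a weakly ordered CPU another core may still observe them out of order.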

Memory Barriers on Arm

When working with Arm processors, especially in multi-core or multi-threaded environments, memory consistency issues quickly become a real concern. Because Arm implements a weakly ordered memory model, the order in which memory operations appear to execute is not always the order you wrote in your code.

This can lead to subtle, hard-to-reproduce bugs unless you use memory barriers properly.

Modern CPUs like Arm prioritize performance. Hence, they allow memory accesses (loads and stores) to be reordered, delayed, and speculated.

For instance:

A store you wrote earlier might become visible to another core after a later store.
A load you wrote later might complete before an earlier store is visible.

This is usually harmless in single-threaded programs, but when multiple cores or devices are involved, it can break correctness.

Arm defines three main types of memory barrier instructions: dmb, dsb, and isb.

DMB (Data Memory Barrier)

Ensures that memory accesses before the dmb are globally observed before memory accesses after it.
Only affects memory accesses - instructions can still be fetched and decoded out of order.

use cases:

Message passing: ensuring that data is visible before a flag is set.
Synchronizing shared variables across cores.

str r5, [r1]    ; write data
dmb             ; make sure the data is globally visible
str r0, [r2]    ; signal that data is ready

DSB (Data Synchronization Barrier)

Stronger than dmb.
Ensures that all memory accesses and side effects before the dsb are complete, and execution doesn't proceed until they are done.

use cases:

Before entering low-power states with wfi or wfe.
Before sending interrupts via memory-mapped registers.
After cache or TLB maintenance operations.

str r5, [r1]    ; update a shared buffer
dsb             ; ensure the update is complete
wfi             ; wait for interrupt

ISB (Instruction Synchronization Barrier)

Flushes the CPU's pipeline.
Ensures that all instructions following the ISB are fetched anew.
Used after changing system control registers or modifying code at runtime.

use cases:

After enabling or disabling the MMU or caches, or changing other system registers.

mcr p15, 0, r0, c1, c0, 0  ; update system control register
isb                        ; ensure the update takes effect immediately

Let's summarize when to use each barrier:

Scenario	Barrier
Ensure memory write ordering across cores	`dmb`
Complete all previous memory transactions before continuing	`dsb`
Flush the instruction pipeline after system configuration changes	`isb`
Sending an interrupt (through a mailbox) after writing data	`dsb`
Cleaning cache lines and invalidating TLBs	`dsb` + `isb`

Conclusion

Compiler and memory barriers are essential tools for writing correct and reliable low-level code on a specific architecture. They might seem like magic words at first, but once you understand their role, which is controlling the visibility and ordering of memory accesses, they become a logical part of your system design.

Understanding barriers is a rite of passage for serious system programmers. And once you get it right, your systems will be faster, safer, and far less mysterious.

Remember: your code doesn't always execute in the order you wrote it and expected.

The Power of Memory Map

Leesoo Ahn — Sat, 02 Nov 2024 16:59:58 +0000

Since early this year, I’ve been working on a BSP project. The biggest challenge was understanding physical memory layout, specifically why certain addresses are defined in the DTS and don’t fall within other expected ranges.

To tackle this, I created a complete memory map of the chip, which helped me gain a clear understanding. In this post I use the resources of the reference board to explain it.

Fortunately, NXP has made the kernel source and reference manuals for the S32G3 chipset publicly available. This allows us to practice designing memory map diagrams freely using these resources—big thanks to NXP!

The image below shows the complete memory map, including kernel-reserved memory regions.

The S32G3 chip includes five categories of memory ranges:

Extended Address Map
External DRAM
Peripherals
RAM
QSPI Flash Memory

Extended Address Map

A 4GB DRAM can be mapped within a 32-bit address space. However, since the lower half of this range is allocated to peripherals, only up to 2GB is available for DRAM.

To overcome this limitation, the system can extend the address space to 40-bit mode. This allows more than 2GB of DRAM to be mapped and provides additional address space for other devices, including the PCIe endpoint as shown in the diagram.
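
The arithmetic behind these limits can be checked with a few lines of C. This is only a worked calculation with made-up helper names; it does not model the actual S32G3 address decoder.

```c
#include <stdint.h>

#define GiB (UINT64_C(1) << 30)

/* Size in bytes of an address space that is `bits` wide (bits < 64). */
uint64_t address_space_size(unsigned bits)
{
    return UINT64_C(1) << bits;
}

/* DRAM window left when the lower half of a 32-bit space is
 * allocated to peripherals, as described above. */
uint64_t dram_window_32bit(void)
{
    return address_space_size(32) / 2;
}
```

A 32-bit space is 4 GiB, so the upper half leaves 2 GiB for DRAM, while a 40-bit space is 1024 GiB (1 TiB).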

External DRAM

This is the basic range where DRAM is mapped. It serves as the main memory used by the kernel for tasks such as loading the kernel image during boot and memory management such as page allocation.

Peripherals

This range is where most peripherals are mapped to specific areas of the SoC, allowing access to their controllers.

RAM

This RAM is integrated into the SoC because key components such as PCIe, the CPU, and the GPU need ultra-high-speed communication. Its size is quite limited compared to DRAM, typically ranging from KB to MB, due to the high cost of larger capacities. This type of memory is commonly used for cache and CPU registers, where ultra-high speed is essential.

QSPI Flash Memory

A QSPI-interfaced flash memory is used to store resources such as pre/boot loaders, kernel images, and additional binaries. This area is used by the M7 cores to store their firmware.

However, the address range actually accessible on the board ends at 0x03FF_FFFF (64MB), even though the total address space extends up to 0x1FFF_FFFF (512MB), because the NOR flash on the board is a 64MB part.
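
The two end addresses translate into sizes as sketched below. The sketch assumes both values are offsets from the start of the QSPI range, and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define MiB (UINT64_C(1) << 20)

/* Last offsets quoted in the text, treated here as offsets from
 * the start of the QSPI range (an assumption). */
#define QSPI_WINDOW_LAST UINT64_C(0x1FFFFFFF) /* whole address range */
#define QSPI_FLASH_LAST  UINT64_C(0x03FFFFFF) /* fitted NOR flash */

/* Number of bytes covered by offsets 0..last inclusive. */
uint64_t span_bytes(uint64_t last)
{
    return last + 1;
}

/* True if [offset, offset + len) is backed by the NOR flash. */
bool qspi_range_is_backed(uint64_t offset, uint64_t len)
{
    return len != 0 && offset <= QSPI_FLASH_LAST &&
           len <= span_bytes(QSPI_FLASH_LAST) - offset;
}
```

A check of this kind is useful before placing an image in flash: an offset can be inside the 512MB window and still have no storage behind it.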

Conclusion

We have explored the memory map of the chip. This can be challenging for BSP newcomers, but it’s essential knowledge. For instance, U-Boot, a bootloader, uses environment variables such as loadaddr and fdtaddr to load binaries into DRAM. In such cases, understanding the accessible memory range is crucial.

I hope you found this post helpful and insightful!

A Yocto Cheatsheet

Leesoo Ahn — Fri, 01 Nov 2024 03:48:37 +0000

Use external kernel source

Add the following lines to the local.conf file.

INHERIT += "externalsrc"
EXTERNALSRC:pn-linux-raspberrypi = "/path/to/linux-kernel"

Add extra tasks in recipe file

do_reloc_what_you_want() {
    # specific jobs (shell tasks use shell comment syntax)
}
addtask reloc_what_you_want before do_configure after do_prepare_recipe_sysroot

DebConf24, a conference trip

Leesoo Ahn — Sat, 12 Oct 2024 04:40:47 +0000

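It was one day between May and June, when the weather was starting to get hot... On hearing that DebConf24, the annual gathering of Debian developers, would be held in Busan, South Korea, I started registering without even wiping off my sweat. Since an international conference was coming to Korea, I wanted to go all in and made up my mind to give a talk as well. I would have to speak English for the audience, but that didn't scare me. All I could think was: if not now, how many chances would I get to present on a big stage?

Register first, worry later

As a kid I was a real scaredy-cat. I'm told I would burst into tears crying "Mom!" at the slightest thing, though I have no memory of it at all. After middle and high school, I started all sorts of things at university: I ran projects on my own and entered research competitions as an individual, handling both development and presentation, and built up experience along the way.

It has been a few years since I graduated, but when there is something I want to do, I still start without much deliberation and figure out how to solve it afterwards.

If I mess up, what's the worst that can happen besides some embarrassment? It's not like I'll die ~

And so, with nothing in my head but "It'll work out!", I submitted a talk proposal on AppArmor Namespaces + Linux Container, a topic I was researching and had not even tested yet.

Time passed, early August arrived, and I headed to Pukyong National University. Having gone through the 'prepare, present, regret' cycle for the past few years, I just took it in stride and didn't get nervous.

The talk? Let's think about it after having some fun!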

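Long conferences like DebConf (about two weeks) sometimes include sessions such as film screenings, music and wine parties, and a Day trip along the way. There were courses for Gyeongju, Ulsan, and Busan, and I chose Gyeongju.

Even though it was quite hot that day, everyone eagerly picked out a hanbok that suited them.

The hat matters too! They chose carefully. You're a king, so you need a hat befitting your rank!

The scenery was pretty, so, snap!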

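Heart-pounding presentation day

The initial plan was to present with slides only, without a demo. But presenting without a demo felt amateurish, so I hurriedly prepared one the day before the talk. I built a QEMU-based test environment on WSL for the demo, but the tests did not go as expected and I was on edge. Fortunately, with the help of the AppArmor manual and some code analysis, I got through it safely.

The talk, titled Linux Containers with AppArmor Policy Namespaces, covered a technique that lets the Host and a Container use different AppArmor security policies in a Linux environment running LXC containers.

LSM-based security modules such as AppArmor and SELinux run in the kernel, which means the same policy applies to every system-call regardless of whether it comes from the Host or a Container. Because AppArmor supports the Policy Namespace feature, the kernel can tell Host and Container system-calls apart and apply independent policies to each.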

Sessions I attended

loong64 port BoF
- About porting the loong64 arch to the Debian distribution, led by the maintainer. It was a valuable chance to see how an architecture maintainer works.
What's new in eBPF and how you could use it today
- A session on eBPF that briefly covered what the technology is, which tools exist, and how to use them.
Past, Present and Future of Networking in Debian
- Led by the Netplan maintainer; we discussed the project's current status and future plans.

Kernel Engineer BoF

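With only a few days of the conference left, it felt like a shame to let it end like this, so I casually asked the Korean kernel developers whether we should at least hold a Kernel BoF, and everyone was positive about it. I quickly contacted the Contents team, asked them to register a Kernel Engineer BoF session, and we held it on the Friday near the end of the conference.

The session was organized so that each person briefly talked about a kernel project or patch they were working on, followed by discussion. I gave a review based on the sparsemap_buf optimization I was working on at the time. Since it was my first time running a BoF, the schedule was tight and there were many small mistakes, but plenty of people came anyway.

A commemorative photo after the BoF! (Portrait rights matter, right?)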

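Afterthoughts

I really regretted joining partway through. There were several sessions I wanted to attend but could not because of timing, and they were not video-recorded either. Still, the atmosphere was quite different from domestic conferences, which made it fresh and fun.

People joke and chat in a relaxed setting, and it feels very much like a university project.
Nobody looks down on or up to others for being a maintainer or a contributor. Everyone is treated as a fellow contributor and speaks freely.
It was my first BoF session, and I wish Korea had more micro-conferences in the same format.

Sponsorship

I attended DebConf24 with support from NIPA and Open-UP. I would like to take this opportunity to thank these organizations for their support.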

Tracing the Arm64 Linux System Call Path

Leesoo Ahn — Tue, 13 Aug 2024 13:23:12 +0000

An Arm64 system has two types of traps,

Synchronous
Asynchronous

and four exception levels, whose names start with el (which stands for exception level).

el0 (userspace)
el1 (kernel)
el2 (hypervisor)
el3 (secure monitor)

A system-call is the best-known example of a synchronous trap, while the Arm whitepaper describes hardware interrupts as asynchronous. The latter is off-topic in this article.

A process runs in el0 and raises its hand whenever it needs a system resource. This is a system-call, and it switches the CPU's exception level from el0 to el1. The kernel takes over the CPU and does the requested work on behalf of the process. Once it's done, it hands the CPU back to the process.

The following code is one of the (real) system-call helpers from musl, a well-known libc implementation.

#define __asm_syscall(...) do { \
    __asm__ __volatile__ ( "svc 0" \
    : "=r"(x0) : __VA_ARGS__ : "memory", "cc"); \
    return x0; \
} while (0)

static inline long __syscall0(long n)
{
    register long x8 __asm__("x8") = n;
    register long x0 __asm__("x0");
    __asm_syscall("r"(x8));
}

Imagine that the process mentioned above is about to call fork(). The API doesn't take any arguments, and therefore it maps to __syscall0(..).

The thing to keep in mind about this code is the svc instruction (short for supervisor call), which switches from el0 to el1 while the x8 register holds the system-call number.
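
To try a zero-argument system-call from user space without writing inline assembly, libc's syscall() function can be used. The sketch below assumes Linux with glibc or musl and uses getpid rather than fork so that it has no side effects; raw_getpid is a made-up name.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>
#include <unistd.h>

/* Issue a zero-argument system-call by number. On arm64 the libc
 * wrapper ends up placing the number in x8 and executing svc 0,
 * much like __syscall0 above. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

The result should match what the regular getpid() wrapper returns, since both end up in the same kernel handler.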

In el1, the exception vector table, which describes what to do when svc is raised, calls el0t_64_sync_handler. The handler then jumps to el0_svc(..) based on the esr system register, which holds the syndrome information used to recognize the exception class (also known as the exception reason).

el0t_64_sync_handler(struct pt_regs *regs)
{
    unsigned long esr = read_sysreg(esr_el1);

    switch (ESR_ELx_EC(esr)) {
    case ESR_ELx_EC_SVC64:
        el0_svc(regs);
    ...
}
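
The exception class is a bit field inside the syndrome value, so ESR_ELx_EC(esr) boils down to a shift and a mask. Below is a stand-alone sketch of that decoding; the macro and function names are redefined here for illustration and are not the kernel's own definitions.

```c
#include <stdint.h>

/* The exception class (EC) lives in bits [31:26] of ESR_ELx. */
#define ESR_EC_SHIFT 26
#define ESR_EC_MASK  UINT64_C(0x3F)
#define ESR_EC_SVC64 UINT64_C(0x15) /* svc executed in AArch64 state */

/* Extract the exception class from a raw syndrome value. */
uint64_t esr_exception_class(uint64_t esr)
{
    return (esr >> ESR_EC_SHIFT) & ESR_EC_MASK;
}

/* Nonzero if the syndrome describes an AArch64 svc. */
int esr_is_svc64(uint64_t esr)
{
    return esr_exception_class(esr) == ESR_EC_SVC64;
}
```

For example, a syndrome of 0x56000000 decodes to class 0x15, which is the case that leads to el0_svc(..).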

From here on, a code diagram is easier to follow than words. (The code is based on v5.15.)

el0_svc(struct pt_regs *regs)
{
    ...
    do_el0_svc(regs);
    ...   |
}         |
    +-----+
    |
    V
do_el0_svc(struct pt_regs *regs)
{
    ...
    el0_svc_common(regs, regs->regs[8],
           |       __NR_syscalls,
    ...    |       sys_call_table);
}          |
    +------+
    |
    V
el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
               const syscall_fn_t syscall_table[])
{
    ...
    invoke_syscall(regs, scno, sc_nr, syscall_table);
    ...
}

We're almost at our destination now. scno came from the x8 register (again, it holds the system-call number), and invoke_syscall(..) looks up the system-call function in syscall_table using scno. Eventually, it carries out what was requested.

invoke_syscall(struct pt_regs *regs, unsigned int scno,
               unsigned int sc_nr,
               const syscall_fn_t syscall_table[])
{
    ...
    if (scno < sc_nr) {
        syscall_fn_t syscall_fn;
        syscall_fn = syscall_table[array_index_nospec(scno, sc_nr)];
        ret = __invoke_syscall(regs, syscall_fn);
    }                |
    ...              |
}                    |
    +----------------+
    |
    V
__invoke_syscall(struct pt_regs *regs, syscall_fn_t syscall_fn)
{
    return syscall_fn(regs);
}
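
The lookup can be mimicked in user space with a table of function pointers. This is a simplified sketch, not kernel code: the fake_ names are invented, array_index_nospec is omitted, and returning -ENOSYS for an out-of-range number is a simplification of what the kernel does.

```c
#include <errno.h>

/* Stand-in for struct pt_regs: only the general-purpose registers. */
struct fake_regs {
    unsigned long regs[31];
};

typedef long (*fake_syscall_fn)(const struct fake_regs *regs);

/* Two toy handlers; every handler has the same one-argument shape. */
static long fake_sys_zero(const struct fake_regs *regs)
{
    (void)regs; /* takes no parameters, so regs is unused */
    return 0;
}

static long fake_sys_add(const struct fake_regs *regs)
{
    return (long)(regs->regs[0] + regs->regs[1]);
}

static const fake_syscall_fn fake_table[] = { fake_sys_zero, fake_sys_add };
#define FAKE_NR (sizeof(fake_table) / sizeof(fake_table[0]))

/* Read the number from regs[8], bounds-check it, then call through
 * the table, in the spirit of invoke_syscall. */
long fake_invoke(const struct fake_regs *regs)
{
    unsigned long scno = regs->regs[8];

    if (scno >= FAKE_NR)
        return -ENOSYS;
    return fake_table[scno](regs);
}
```

Because every entry shares one signature, the table needs no knowledge of how many parameters each call really takes; that is resolved inside each handler.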

You may wonder: each system-call has a different number of parameters, yet syscall_fn(..) takes only one, regs. Let's look at two cases in code, one that takes no parameters and one that takes five.

fork() takes no parameters, so the struct pt_regs object passed to syscall_fn is unused.

#define SYSCALL_DEFINE0(sname) \
    ...
    asmlinkage long __arm64_sys_##sname(const struct pt_regs *__unused)

On the other hand, clone() takes five parameters, so the struct pt_regs object is expanded into that number of arguments by SC_ARM64_REGS_TO_ARGS(..) and __MAP(..).

SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp,
      |        int __user *, parent_tidptr,
      |        unsigned long, tls,
      |        int __user *, child_tidptr)
      |
      +--------+
               |
               V
#define __SYSCALL_DEFINEx(x, name, ...) \
    ...
    __arm64_sys##name(const struct pt_regs *regs) \
    { \
        return __se_sys##name(SC_ARM64_REGS_TO_ARGS(x,__VA_ARGS__)); \
    } \           |
         +--------+
         |
         V
    __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
    { \
        long ret = __do_sys##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
        ...              |
        return ret; \    |
    } \                  |
         +---------------+
         |
         V
    __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))
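
Written out by hand for a five-parameter call, the macro expansion amounts to something like the sketch below. The demo_ names are invented and the body is a placeholder; only the unpacking of regs[0] through regs[4] mirrors what the macros generate.

```c
/* Stand-in for struct pt_regs, holding only x0..x30. */
struct demo_regs {
    unsigned long regs[31];
};

/* Plays the role of __do_sys_<name>: the implementation with typed
 * parameters. It just sums three arguments so the result is easy
 * to check. */
static long demo_do_sys(unsigned long flags, unsigned long newsp,
                        int *parent_tid, unsigned long tls,
                        int *child_tid)
{
    (void)parent_tid;
    (void)child_tid;
    return (long)(flags + newsp + tls);
}

/* Plays the role of __arm64_sys_<name>: unpack regs[0..4] and cast
 * each value to the declared parameter type. */
long demo_arm64_sys(const struct demo_regs *regs)
{
    return demo_do_sys(regs->regs[0],
                       regs->regs[1],
                       (int *)regs->regs[2],
                       regs->regs[3],
                       (int *)regs->regs[4]);
}
```

The single regs parameter is therefore only a transport: the wrapper fans it out into as many typed arguments as the system-call declares.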

We have walked through the system-call code from el0 to el1. It wasn't a long journey, but wasn't easy either. I hope this tiny map (I like using metaphors) guides you to where you want to be.

happy hacking!

esBPF: Stress-Testing compares Software-Offload with iptables

Leesoo Ahn — Mon, 12 Aug 2024 15:30:08 +0000

This article was written on Nov 29th, 2022.

The esBPF project is now over a year old. It began with a question: is it worth filtering ingress packets in a Software-Offload layer instead of the Network Stack? Software-Offload is similar to Hardware-Offload, but it works in the Ethernet driver. Now that a prototype has been released, it is time for stress testing, with iptables as the point of comparison.

Before walking through the article, let me define a few short terms to avoid typing long ones:

Long Term	Short Term
Raspberry Pi 3	Rpi3
Host Machine	Host

Testbed

Host and Rpi3 are connected to the same LAN through the AP below. The AP supports HW-offload and runs in Bridge mode, so that its kernel does not get involved in forwarding packets between them.

                    High-Performance AP
                      - HW-offload Supported
                      - Bridge Mode
                    +-----------------+
                    |   Wireless AP   |
                    +-----------------+
      100Mbps link    |             |     1Gbps link
           +----------+             +-----------+
           |                                    |
+-------------------+                 +-------------------+
| Raspberry Pi 3    |                 | Host Machine      |
| (192.168.219.103) |                 | (192.168.219.108) |
+-------------------+                 +-------------------+

The hping3 program is used for the stress test; it simply floods Rpi3 with ICMP packets.

$ hping3 --icmp --faster 192.168.219.103 -d 20

Tuning Raspberry-Pi 3 for the testing

Ubuntu 22.10 Kinetic Release - Kernel 5.19.0-1007 (Arm64)
Enable CONFIG_HOTPLUG_CPU to turn CPU cores on and off
esBPF-based customized eth driver, smsc95xx-esbpf
Turn off the wlan0 interface so it doesn't mess up routing

Rpi3 is booted with maxcpus=2 on the kernel command line, so that the full traffic load lands on a fixed number of cores instead of all four. Hence we have 2 online and 2 offline cores:

ubuntu@ubuntu:~$ lscpu
Architecture:            aarch64
  CPU op-mode(s):        32-bit, 64-bit
  Byte Order:            Little Endian
CPU(s):                  4
  On-line CPU(s) list:   0,1
  Off-line CPU(s) list:  2,3
Vendor ID:               ARM
  Model name:            Cortex-A53

Briefing about smsc95xx-esbpf

Once the driver has been loaded into the kernel, two significant files exist under the /proc/smsc95xx/esbpf directory. They are responsible for:

rx_enable : turns esbpf operations on and off.
rx_hooks : receives a program of cBPF instructions written to it.

Stress-testing

We are going to look at mpstat values and compare NET_RX in /proc/softirqs before and after executing hping3. Assume hping3 runs on Host for 60 seconds in each case.

Here is the CPU usage of Rpi3 at idle. Before the massive traffic is generated on the LAN, the %idle columns are almost the same in both test cases, iptables and Software-Offload.

$ mpstat -P ALL 3
CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
all    0.00    0.00    0.17    0.00    0.00    0.17    0.00    0.00    0.00   99.66
  0    0.00    0.00    0.34    0.00    0.00    0.00    0.00    0.00    0.00   99.66
  1    0.00    0.00    0.00    0.00    0.00    0.34    0.00    0.00    0.00   99.66

1. iptables

In the first test, the following rule is appended to the INPUT chain on Rpi3. As a result, one of the CPUs is fully occupied by softirq processing, which means it is too busy to do any other work.

$ iptables -A INPUT -p icmp -j DROP
$ iptables -nvL
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DROP       icmp --  *      *       0.0.0.0/0            0.0.0.0/0

# NET_RX softirq count before massive traffic
                    CPU0       CPU1       CPU2       CPU3
      NET_RX:        123         66          0          0

# NET_RX softirq count after that
                    CPU0       CPU1       CPU2       CPU3
      NET_RX:      15040      35021          0          0

# mpstat
CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
all    0.00    0.00    0.18    0.00    0.00   52.89    0.00    0.00    0.00   46.94
  0    0.00    0.00    0.37    0.00    0.00    0.74    0.00    0.00    0.00   98.89
  1    0.00    0.00    0.00    0.00    0.00  100.00    0.00    0.00    0.00    0.00

2. esBPF

In the second test, the same type of packets is dropped in Software-Offload, in other words, in the driver. Two special tools are involved, tcpdump and filter_icmp, but the latter already has the cBPF instructions hard-coded, so tcpdump isn't necessary at this point.

The hard-coded part is as follows:

struct sock_filter insns[] = {
  /* tcpdump -dd -nn icmp */
  { 0x28, 0, 0, 0x0000000c },
  { 0x15, 0, 3, 0x00000800 },
  { 0x30, 0, 0, 0x00000017 },
  { 0x15, 0, 1, 0x00000001 },
  { 0x6, 0, 0, 0x00040000 },
  { 0x6, 0, 0, 0x00000000 },
};

The program is run with the following commands, which write the above instructions to the esBPF module and then enable it.

$ sudo ./filter_icmp /proc/smsc95xx/esbpf/rx_hooks
$ echo 1 | sudo tee /proc/smsc95xx/esbpf/rx_enable
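
To see what the six cBPF instructions hard-coded in filter_icmp actually do, they can be run through a tiny user-space interpreter. This is a sketch that handles only the four opcodes this filter uses; it is not the esBPF or kernel implementation, and all function names are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as struct sock_filter, redefined so the sketch does
 * not depend on kernel headers. */
struct cbpf_insn {
    uint16_t code;
    uint8_t jt, jf;
    uint32_t k;
};

static const struct cbpf_insn icmp_filter[] = {
    { 0x28, 0, 0, 0x0000000c }, /* ldh [12]: EtherType */
    { 0x15, 0, 3, 0x00000800 }, /* jeq IPv4, else skip 3 */
    { 0x30, 0, 0, 0x00000017 }, /* ldb [23]: IP protocol */
    { 0x15, 0, 1, 0x00000001 }, /* jeq ICMP, else skip 1 */
    { 0x06, 0, 0, 0x00040000 }, /* ret 262144: match */
    { 0x06, 0, 0, 0x00000000 }, /* ret 0: no match */
};

/* Run a program over a frame. Unknown opcodes and out-of-bounds
 * loads return 0. */
uint32_t cbpf_run(const struct cbpf_insn *prog, size_t n,
                  const uint8_t *pkt, size_t len)
{
    uint32_t a = 0; /* accumulator */

    for (size_t pc = 0; pc < n; pc++) {
        const struct cbpf_insn *i = &prog[pc];

        switch (i->code) {
        case 0x28: /* load big-endian halfword at offset k */
            if (len < 2 || i->k > len - 2)
                return 0;
            a = ((uint32_t)pkt[i->k] << 8) | pkt[i->k + 1];
            break;
        case 0x30: /* load byte at offset k */
            if (i->k >= len)
                return 0;
            a = pkt[i->k];
            break;
        case 0x15: /* skip jt instructions if a == k, else jf */
            pc += (a == i->k) ? i->jt : i->jf;
            break;
        case 0x06: /* return constant k */
            return i->k;
        default:
            return 0;
        }
    }
    return 0;
}

/* Convenience wrapper around the ICMP filter above. */
uint32_t icmp_filter_run(const uint8_t *pkt, size_t len)
{
    return cbpf_run(icmp_filter,
                    sizeof(icmp_filter) / sizeof(icmp_filter[0]),
                    pkt, len);
}
```

A frame with EtherType 0x0800 at offset 12 and protocol 1 at offset 23 yields a non-zero result (a match); anything else yields 0.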

Even though hping3 generates the same flood, NET_RX rose far less than in the first case.

# NET_RX softirq count before massive traffic
                    CPU0       CPU1       CPU2       CPU3
      NET_RX:        129         81          0          0

# NET_RX softirq count after that
                    CPU0       CPU1       CPU2       CPU3
      NET_RX:        141         94          0          0
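
The difference is easier to appreciate as deltas. The small calculation below uses the counter values printed in the two tests; the struct and function names are made up for illustration.

```c
/* NET_RX counters for the two online CPUs. */
struct net_rx_sample {
    unsigned long cpu0, cpu1;
};

/* Total NET_RX softirqs raised between two samples on both CPUs. */
unsigned long net_rx_delta(struct net_rx_sample before,
                           struct net_rx_sample after)
{
    return (after.cpu0 - before.cpu0) + (after.cpu1 - before.cpu1);
}

/* Counter values copied from the iptables test. */
unsigned long iptables_delta(void)
{
    struct net_rx_sample before = { 123, 66 }, after = { 15040, 35021 };
    return net_rx_delta(before, after);
}

/* Counter values copied from the esBPF test. */
unsigned long esbpf_delta(void)
{
    struct net_rx_sample before = { 129, 81 }, after = { 141, 94 };
    return net_rx_delta(before, after);
}
```

That works out to 49872 NET_RX softirqs with iptables against 25 with esBPF for the same flood.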

Also, average CPU usage by softirq ranges from about 8% in the best case to about 27% in the worst case.

# mpstat in the best case
CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
all    0.00    0.00    0.64    0.00    0.00    7.99    0.00    0.00    0.00   91.37
  0    0.00    0.00    0.65    0.00    0.00    6.54    0.00    0.00    0.00   92.81
  1    0.00    0.00    0.62    0.00    0.00    9.38    0.00    0.00    0.00   90.00

# mpstat in the worst case
CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
all   18.31    0.00    4.58    0.96    0.00   27.47    0.00    0.00    0.00   48.67
  0   14.50    0.00    4.00    1.00    0.00   26.00    0.00    0.00    0.00   54.50
  1   21.86    0.00    5.12    0.93    0.00   28.84    0.00    0.00    0.00   43.26

Note that you may sometimes see a few ICMP packets reaching the Network Stack even though esBPF is enabled. No worries, they are just from the lo interface.

Conclusion

esBPF works in Software-Offload, that is, the device driver layer, whereas Netfilter, the framework behind iptables, works in the Network Stack. Therefore esBPF drops all incoming packets that match the filters at the tasklet level instead of in NET_RX (part of the Network Stack), and as the esBPF results show, the kernel is spared that extra work.

In some cases the project could do better than packet filtering in the Network Stack, even though its worst case used about 3.4 times as much softirq CPU time as its best case (27.47% versus 7.99%). Of course, this depends on how large the cBPF program in esBPF is.

The project is still in progress: making it more flexible, optimizing it, and adding a cache mechanism.

This stress test convinced me that the project is worth more effort and that I should keep working on it. It was also a great opportunity to take responsibility for the entire process, from design to testing.

happy hacking!

AppArmor testsuite

Leesoo Ahn — Sat, 11 May 2024 08:47:21 +0000

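Most (?) middleware programs developed at user level support unit tests. The kernel is a somewhat different story... Because it is the program that manages resources directly on top of the hardware, there is no environment in which to run unit tests. These days KUnit is used to test at the function or feature level, but today let's talk about tests that run at user level.

When doing kernel-related work with clients, the word stability always comes up. Especially in advanced research projects, a feature the kernel does not support is developed as a prototype and then considered for mass production; the client wants no critical issues while the feature is running and therefore asks for tests that can verify stability. The framework (?) introduced today is one way to verify the stability of AppArmor.

On the AppArmor project site you can find the repo that hosts the user-space tools, and within it tests/regression/apparmor is the testsuite. How to install it is off topic; here we will simply run it and look at the results.

Clone the repo and move into tests/regression/apparmor, and you will see a number of files as shown below.

Now run the make tests command to execute the testsuite sequentially. (The dependencies needed to run it are already installed.)

Being an automation framework, it runs on its own and prints a report like the one below, from which the user can see which subtests PASS and FAIL.

In the end, running the testsuite lets us judge whether a kernel with the research patches applied has any problems performing AppArmor functions, and thus decide whether to apply it to mass production. Of course, there are still issues it cannot catch, but the testsuite serves as a first step toward verifying that the feature works stably.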