This article was partially developed with the support of AI‑assisted writing tools.
0. Introduction
Over the past half‑century, the core assumptions of computer architecture have remained largely unchanged:
the CPU is the center, the instruction set is the foundation, the operating system manages hardware, and applications rely on system calls.
However, with the rapid rise of heterogeneous computing, compute‑in‑memory architectures, neuromorphic processors, and edge‑native systems, this traditional model is gradually failing.
Hardware has become diverse, complex, and unpredictable, while software abstractions remain stuck in a 20th‑century paradigm.
The abstraction boundaries of traditional operating systems—syscalls, drivers, process models, kernel/user mode—are no longer capable of handling the complexity of future computing.
Meanwhile, the diversification of hardware architectures (x86, ARM, RISC‑V, GPU ISAs, NPU ISAs, compute‑in‑memory arrays, etc.) is fragmenting the software ecosystem.
This proposal introduces a new direction:
Shift the microkernel downward into firmware, and use a “firmware‑level System Macro‑Instruction Set (System Macro‑ISA)” as the unified abstraction layer to achieve true cross‑architecture, cross‑device, and cross‑era computing.
In such a system:
- Programs no longer depend on CPU ISAs
- The operating system no longer manages hardware
- The microkernel resides in firmware rather than the OS
- All system capabilities are exposed as an “instruction set”
- Compute‑in‑memory and neuromorphic hardware become naturally compatible
- Extended capabilities are provided through “instruction extension sets”
- Unsupported hardware maintains semantic consistency through fallback mechanisms
This is not an incremental improvement to existing systems—it is a redefinition of future computing.
1. Limitations of Traditional OS Architectures
1.1 The Core Assumptions of Traditional OS Design Are Collapsing
Traditional operating systems (Linux, Windows, Android, macOS) rely on assumptions such as:
- The CPU is the sole execution unit
- The ISA is the foundation of software
- The driver model can abstract all hardware
- Syscalls define the boundary between applications and the kernel
- The process/thread model fits all computation
These assumptions no longer hold.
1.2 The Explosive Growth of Heterogeneous Computing
Modern devices include:
- CPUs
- GPUs
- NPUs
- TPUs
- DSPs
- FPGAs
- Compute‑in‑memory arrays
- Neuromorphic chips
- Cryptographic engines
- Security modules
Each of these has its own:
- Instruction set
- Memory model
- Scheduling model
- Execution model
- Dataflow model
Traditional OS abstractions cannot unify these.
1.3 Data Movement Costs Far Exceed Instruction Execution
The bottleneck of future computing is no longer CPU performance but:
- Data movement
- Memory access
- Cross‑device communication
Traditional OS abstraction layers cannot optimize these paths.
1.4 ISA Diversity Is Fragmenting Software
x86, ARM, RISC‑V, GPU ISAs, NPU ISAs…
Software must be recompiled, adapted, and optimized for each architecture.
This is unsustainable.
2. Architectural Vision: Unified Abstraction Layer (UAL)
The core goal of UAL is:
Eliminate hardware differences, unify execution models, and free programs from CPU ISAs.
It is built on three key ideas:
2.1 De‑Kernelization
Traditional OS kernels handle:
- Scheduling
- Memory management
- Drivers
- Security
- IPC
UAL moves all of these into firmware.
2.2 De‑ISA‑ization
Programs no longer depend on CPU ISAs.
They depend on the firmware‑provided System Macro‑ISA.
2.3 Capability‑Based Execution
Hardware no longer exposes registers and instructions, but capabilities:
- Compute capability
- Storage capability
- Communication capability
- Security capability
- Heterogeneous execution capability
3. A Three‑Layer Architecture
3.1 Firmware Layer — The “Kernel” of the Future
The firmware layer handles:
- Scheduling
- Memory management
- Security isolation
- Capability modeling
- System Macro‑ISA execution
- Device abstraction
- Heterogeneous scheduling
It is a firmware‑level microkernel.
3.2 Infrastructure Layer — The “OS” of the Future
Responsible for:
- Language runtimes
- Component models
- Optional file systems
- Optional networking stacks
- Optional UI frameworks
It no longer manages hardware.
3.3 Application Layer
- Pure logic
- ISA‑independent
- Syscall‑independent
- Driver‑independent
- Kernel‑independent
4. System Macro‑ISA (Firmware‑Level System Macro‑Instruction Set)
In this proposal, “macro‑instructions” are not syscalls or high‑level functions.
They form a system‑level instruction set architecture that abstracts microkernel‑level capabilities: scheduling, memory isolation, security, task models, IPC, device access, and more.
It can be understood as:
A “System Macro‑ISA” defined above hardware ISAs (x86/ARM/RISC‑V, etc.), with fixed encodings and execution semantics describing all OS‑level logic.
Upper‑layer OSes, runtimes, and even applications can target this System Macro‑ISA directly, without relying on CPU ISAs or traditional syscalls.
4.1 Positioning: A System‑Level ISA, Not an API
In short:
It is not “calling the kernel,” but “executing system‑level instructions.”
Characteristics:
- Instruction‑oriented
- Compiler‑targetable
- Pipeline‑friendly
- Formally verifiable
- ISA‑independent
4.2 Semantic Scope: Microkernel Capabilities as Instructions
System Macro‑ISA covers all responsibilities of a traditional microkernel, expressed as instructions.
4.2.1 Address Space and Memory Isolation Instructions
Examples:
- CREATE_ASID_REGION
- MAP_REGION
- SET_REGION_POLICY
- SWITCH_ASID
These turn memory management into explicit system instructions.
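As a rough sketch of how such instructions could behave, the toy Python below models firmware-held region state driven purely by instruction-like calls. Every name, argument, and policy string here is a hypothetical illustration, not a proposed interface.

```python
# Toy simulation of the memory-isolation instructions above.
# Firmware state is modeled as plain dicts; all names are hypothetical.

regions = {}           # region_id -> {"asid", "base", "size", "policy"}
current_asid = [0]     # stands in for the hardware's active address-space ID

def create_asid_region(region_id, asid, size):       # CREATE_ASID_REGION
    regions[region_id] = {"asid": asid, "base": None,
                          "size": size, "policy": "none"}

def map_region(region_id, base):                     # MAP_REGION
    # Firmware would choose and validate the mapping; here we just record it.
    regions[region_id]["base"] = base

def set_region_policy(region_id, policy):            # SET_REGION_POLICY
    regions[region_id]["policy"] = policy            # e.g. "r", "rw", "rx"

def switch_asid(asid):                               # SWITCH_ASID
    current_asid[0] = asid

# An address space is set up purely by issuing system instructions:
create_asid_region("heap", asid=1, size=0x4000)
map_region("heap", base=0x8000_0000)
set_region_policy("heap", "rw")
switch_asid(1)
```

The point of the sketch is that no syscall or driver appears anywhere: memory management is a sequence of explicit, checkable operations.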
4.2.2 Task / Process / Thread Model Instructions
Examples:
- CREATE_TASK
- SET_TASK_PRIORITY
- BIND_TASK_UNIT
- YIELD
- WAIT_EVENT / SIGNAL_EVENT
These form an instruction‑level scheduling interface.
4.2.3 Capability and Security Instructions
Examples:
- GRANT_CAP
- REVOKE_CAP
- CHECK_CAP
- ENTER_SECURE_DOMAIN / EXIT_SECURE_DOMAIN
Each permission operation becomes a fixed‑semantic system instruction.
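A minimal sketch of what those three capability operations could amount to, assuming a firmware-held table keyed by task. The capability names and task IDs are invented for illustration.

```python
# Toy model of the capability instructions above: a per-task capability set
# consulted before any privileged operation. All names are hypothetical.

cap_table = {}   # task_id -> set of capability names

def grant_cap(task, cap):          # GRANT_CAP
    cap_table.setdefault(task, set()).add(cap)

def revoke_cap(task, cap):         # REVOKE_CAP
    cap_table.get(task, set()).discard(cap)

def check_cap(task, cap):          # CHECK_CAP
    return cap in cap_table.get(task, set())

grant_cap("task_a", "device.nic.send")
allowed = check_cap("task_a", "device.nic.send")   # True
revoke_cap("task_a", "device.nic.send")
denied = check_cap("task_a", "device.nic.send")    # False
```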
4.2.4 IPC and Communication Instructions
Examples:
- CREATE_CHANNEL
- SEND_MSG / RECV_MSG
- MAP_SHARED_REGION
- NOTIFY
These form a unified system‑level communication ISA.
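One way to picture the channel instructions is as firmware-owned message queues; a real engine could back them with zero-copy shared regions. The sketch below is purely illustrative and every name in it is an assumption.

```python
from collections import deque

# Toy model of the communication instructions above. A channel is a message
# queue owned by firmware; all names are hypothetical.

channels = {}

def create_channel(name):            # CREATE_CHANNEL
    channels[name] = deque()

def send_msg(name, payload):         # SEND_MSG
    channels[name].append(payload)

def recv_msg(name):                  # RECV_MSG
    return channels[name].popleft()

create_channel("log")
send_msg("log", b"sensor: 42")
msg = recv_msg("log")
```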
4.2.5 Device and Heterogeneous Unit Control Instructions
Examples:
- ATTACH_DEVICE
- SUBMIT_IO
- SUBMIT_COMPUTE
- QUERY_UNIT_CAP
Devices become capability sources accessed through instructions.
4.3 Instruction Form: More Like an ISA Than an API
Key points:
- Fixed binary encoding
- Compiler‑targetable
- Firmware‑optimizable
- IR‑friendly
- System‑level ISA, not an API
4.4 Extension Instructions and Fallback
System Macro‑ISA supports:
- Base Macro‑ISA
- Extended Macro‑ISA
- Semantic fallback
This ensures:
- Compatibility
- Performance scalability
- Hardware innovation without fragmentation
4.5 Fundamental Differences from Syscalls
| Feature | Syscall | System Macro‑ISA |
|---|---|---|
| Form | Function call | Instruction |
| Execution | OS kernel | Firmware‑level instruction engine |
| Optimizable | Low | High |
| Verifiable | Weak | Strong |
| ISA dependency | Strong | None |
| Hardware abstraction | Drivers | Capability model |
| Heterogeneous support | Weak | Strong |
In short:
Syscalls enter the kernel; System Macro‑ISA instructions are the kernel.
5. ISA‑Independent Execution Model
One core goal of System Macro‑ISA is to free programs from CPU ISAs (x86/ARM/RISC‑V, etc.).
Programs no longer contain machine code but:
System Macro‑ISA instruction streams (Macro‑ISA Binaries).
Firmware translates, schedules, and maps these instructions to hardware.
5.1 Executable Formats Will Fundamentally Change
Traditional executables (ELF/PE/Mach‑O) contain:
- CPU‑specific machine code
- Syscall tables
- Linking information
- Dynamic library dependencies
In UAL, executables contain:
- System Macro‑ISA instruction streams
- Capability declarations
- Isolation domain descriptions
- Resource model descriptions
Executables become system‑level IR, not CPU machine code.
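A possible shape for such an executable is sketched below: a small header, a capability-declaration section, and a raw macro-instruction stream. The magic number, section order, and use of JSON for metadata are all assumptions made for readability; a real format would be fully binary.

```python
import json
import struct

# Sketch of a hypothetical Macro-ISA binary container. Everything about the
# layout (magic, field order, JSON metadata) is invented for illustration.

MAGIC = b"MISA"

def pack_binary(caps, isolation, instr_stream):
    meta = json.dumps({"capabilities": caps, "isolation": isolation}).encode()
    header = MAGIC + struct.pack("<II", len(meta), len(instr_stream))
    return header + meta + instr_stream

def unpack_binary(blob):
    assert blob[:4] == MAGIC
    meta_len, code_len = struct.unpack("<II", blob[4:12])
    meta = json.loads(blob[12:12 + meta_len])
    code = blob[12 + meta_len:12 + meta_len + code_len]
    return meta, code

# A program declares what it needs; it contains no CPU machine code.
blob = pack_binary(["device.nic.send"], {"asid": 7}, b"\x01\x00\x00\x00" * 3)
meta, code = unpack_binary(blob)
```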
5.2 Firmware as the “System Instruction Engine”
Firmware handles:
- Decoding
- Mapping
- Scheduling
- Optimization
- Extension handling
- Security enforcement
It resembles JVM/WASM/GPU drivers but at a lower abstraction level.
5.3 Execution Path of a Macro‑Instruction
Steps:
- Decode
- Capability check
- Execution path selection
- Scheduling
- Execution
- Synchronization
5.4 Unified Execution Semantics
All hardware executes the same system‑level semantics.
5.5 Natural Fit for Compute‑in‑Memory and Neuromorphic Hardware
Because they lack:
- Traditional ISAs
- Register models
- CPU‑style execution semantics
System Macro‑ISA becomes their common language.
5.6 Not a Virtual Machine
It is:
A system‑level ISA whose execution engine resides in firmware.
6. Drivers and Hardware Abstraction
System Macro‑ISA fundamentally restructures the driver model.
6.1 Drivers Move from OS to Firmware
Firmware handles:
- Device initialization
- Capability exposure
- Resource management
- Execution path selection
- Security isolation
- Extension support
The OS no longer needs drivers.
6.2 Device Capability Model
Devices expose:
- Storage
- Sensing
- Compute
- Communication
- Acceleration
6.3 Unified Abstraction for Heterogeneous Hardware
All hardware becomes “instruction execution units.”
6.4 Vendor Ecosystem Transformation
Vendors provide:
- Extended Macro‑ISA
- Capability descriptors
- Firmware plugins
Not drivers or proprietary SDKs.
7. Security Model
Based on:
- Capabilities
- Isolation domains
- Firmware‑level TCB
- Instruction‑level semantics
7.1 Capability Granting
Tasks can only execute instructions they have capabilities for.
7.2 Isolation Domains
Each task has:
- Its own address space
- Its own capabilities
- Its own scheduling context
- Its own security policy
7.3 Firmware as the TCB
Smaller, more auditable, more secure.
8. Performance Considerations
A common question regarding System Macro‑ISA is:
“Will raising the abstraction layer reduce efficiency?”
This intuition comes from the traditional CPU‑centric era, but it no longer holds in future computing systems.
8.1 Future Bottlenecks Lie in Data Movement, Not Instruction Execution
In modern and future computing, performance bottlenecks primarily arise from:
- Memory access latency
- Data movement costs
- Cross‑device communication
- Synchronization across heterogeneous units
- Data layout constraints in compute‑in‑memory arrays
Rather than:
- CPU instruction execution speed
- Instruction set complexity
The advantage of System Macro‑ISA is:
It enables firmware‑level optimization of data paths, rather than relying on OS‑level or application‑level workarounds.
For example:
- SUBMIT_COMPUTE can keep data in an NPU’s local SRAM
- MAP_REGION can avoid unnecessary memory copies
- SEND_MSG can achieve zero‑copy communication at the firmware level
- SUBMIT_IO can directly schedule DMA engines
Such optimizations are hard to achieve in traditional OS architectures, where the kernel, drivers, and user‑space libraries each control only a fragment of the data path.
8.2 Firmware‑Level Scheduling Is More Efficient Than OS Scheduling
Problems with traditional OS schedulers:
- Frequent kernel traps
- Complex process/thread structures
- No unified scheduling across CPU/GPU/NPU
- No pipeline optimization for system‑level operations
System Macro‑ISA’s scheduler:
- Resides in firmware
- Directly controls all execution units
- Can reorder system instructions
- Can batch IPC, memory, and task operations
- Can unify scheduling across heterogeneous units
Thus:
System‑level operations execute far more efficiently than in traditional OS designs.
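One concrete consequence of a firmware-level instruction engine is batching: instead of trapping once per operation as in the syscall model, the engine can coalesce runs of identical operations in an instruction stream. The sketch below is purely illustrative; real batching would operate on binary encodings, not tuples.

```python
from itertools import groupby

# Toy illustration of firmware-level batching: adjacent macro-instructions
# with the same opcode are dispatched together. Opcodes are hypothetical.

stream = [
    ("SEND_MSG", "ch1"), ("SEND_MSG", "ch1"), ("SEND_MSG", "ch2"),
    ("MAP_REGION", "r1"), ("SEND_MSG", "ch1"),
]

# One dispatch per run of identical opcodes rather than one per instruction:
batches = [(op, len(list(grp)))
           for op, grp in groupby(stream, key=lambda instr: instr[0])]
```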
8.3 Extension Instructions Provide Optimal Performance Paths
Examples:
- GPU vendors may provide MATRIX_MUL_EXT
- Compute‑in‑memory arrays may provide MEM_ARRAY_REDUCE_EXT
- Neuromorphic chips may provide NEURON_UPDATE_EXT
These extension instructions:
- Map directly to hardware‑optimal execution paths
- Avoid OS‑layer abstraction overhead
- Avoid driver‑layer context switching
- Avoid API‑layer encapsulation overhead
Fallback ensures:
- Unsupported hardware still executes the same semantics
- Performance differences depend on hardware, not software
8.4 Performance Summary
System Macro‑ISA’s performance advantages come from:
- Data‑path optimization
- Firmware‑level scheduling
- Extension instructions
- Zero‑copy communication
- Heterogeneous execution path selection
- Removal of syscall/driver/kernel‑mode transitions
Therefore:
System Macro‑ISA does not reduce performance; it is likely the most performance‑enhancing abstraction layer for future computing systems.
9. Extensibility & Ecosystem
The ecosystem design goals of System Macro‑ISA are:
- No fragmentation
- No vendor lock‑in
- No loss of compatibility
- No obstruction to innovation
To achieve this, it adopts a three‑layer extension mechanism.
9.1 Base Macro‑ISA
Mandatory for all hardware, including:
- Memory isolation instructions
- Task model instructions
- Capability model instructions
- IPC instructions
- Basic device access instructions
This ensures:
- Program portability
- Firmware verifiability
- Consistent system semantics
9.2 Extended Macro‑ISA
Hardware vendors may provide their own extension instructions, such as:
- GPU: matrix multiplication, convolution, rasterization
- NPU: tensor operations, activation functions
- Compute‑in‑memory: array reduction, local computation
- Neuromorphic chips: spike propagation, synapse updates
Extension instructions have:
- Fixed encodings
- Fixed semantics
- Firmware‑level execution paths
They do not break compatibility with the base instruction set.
9.3 Fallback Mechanism
If hardware does not support an extension instruction:
- Firmware automatically falls back
- Uses base instruction sequences to implement the same semantics
- Ensures functional consistency
- Performance differences depend on hardware
Thus:
- New hardware can showcase capabilities via extensions
- Old hardware can still execute the same programs
- The software ecosystem remains unified
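The fallback idea can be sketched in a few lines: an extension instruction runs natively when the hardware advertises it, and is otherwise expanded by firmware into a base-instruction sequence with identical semantics. The extension name and dispatch logic below are hypothetical illustrations.

```python
# Toy model of semantic fallback. This simulated hardware lacks the
# (hypothetical) MATRIX_MUL_EXT extension, so firmware falls back.

HW_EXTENSIONS = set()   # empty: no extensions supported

def matrix_mul_base(a, b):
    # Fallback path: the same semantics built from scalar base operations.
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def execute(op, a, b):
    if op == "MATRIX_MUL_EXT" and op in HW_EXTENSIONS:
        raise NotImplementedError("native accelerator path")
    return matrix_mul_base(a, b)    # firmware falls back transparently

result = execute("MATRIX_MUL_EXT", [[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

The caller never learns which path ran; only latency differs, which is exactly the "performance depends on hardware, semantics do not" property claimed above.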
9.4 Vendor Ecosystem Transformation
Vendors no longer provide:
- Drivers
- SDKs
- Proprietary APIs
Instead, they provide:
- Extension instruction sets
- Capability description files
- Firmware plugins
This results in a more unified, secure, and compatible ecosystem.
10. Use Cases
System Macro‑ISA applies to a wide range of devices—from IoT to supercomputers.
10.1 IoT and Embedded Devices
Characteristics:
- Limited resources
- Diverse architectures
- High security requirements
Advantages of System Macro‑ISA:
- Firmware‑level isolation
- No OS kernel required
- Programs run across architectures
- Device capabilities exposed as instructions
10.2 Mobile and Consumer Electronics
Mobile SoCs include:
- CPU
- GPU
- NPU
- ISP
- DSP
- Security modules
System Macro‑ISA enables:
- Unified scheduling
- Unified capability abstraction
- Unified security model
- Unified execution semantics
This is more efficient and secure than Android/Linux driver models.
10.3 PCs and General‑Purpose Computing
Future PC trends:
- Heterogeneous acceleration
- Security isolation
- Virtualization
- Multi‑unit collaboration
System Macro‑ISA can replace:
- Syscalls
- Drivers
- Kernel/user mode transitions
- Traditional virtualization layers
10.4 Compute‑in‑Memory and Neuromorphic Computing
This is the most revolutionary use case.
Compute‑in‑memory arrays:
- Have no traditional ISA
- Have no register model
- Have no CPU‑style execution model
Neuromorphic chips:
- Event‑driven
- Spike‑based
- Synapse‑update‑driven
System Macro‑ISA:
- Exposes capabilities via extension instructions
- Maps semantics through firmware
- Maintains compatibility via fallback
Thus:
All future computing devices can run the same “system instruction code.”
11. Challenges & Limitations
Despite its potential, System Macro‑ISA faces real challenges.
11.1 Designing the Macro‑Instruction Set Is Extremely Difficult
It requires expertise from:
- Microkernel design
- ISA design
- Compiler engineering
- Heterogeneous computing
- Compute‑in‑memory systems
- Security models
- Firmware engineering
This is a massive cross‑disciplinary effort.
11.2 Firmware Security Requirements Are Extremely High
Firmware becomes:
- Scheduler
- Memory manager
- Capability manager
- Instruction engine
It must be:
- Minimal
- Stable
- Verifiable
- Auditable
This demands exceptional engineering rigor.
11.3 Vendor Interest Conflicts
System Macro‑ISA:
- Smooths out hardware differences
- Weakens ISA moats
- Weakens driver ecosystems
- Weakens platform lock‑in
Vendors may resist:
- ARM
- Intel
- NVIDIA
- Qualcomm
- Apple
However, in the long run:
The complexity of heterogeneous computing will force the industry toward a unified abstraction layer.
11.4 Ecosystem Migration Costs
Migrating from:
- Syscalls
- Drivers
- OS kernels
- CPU ISAs
to:
- System Macro‑ISA
requires:
- New compilers
- New runtimes
- New firmware
- New toolchains
This is a long‑term process.
12. Future Directions
System Macro‑ISA is not an improvement to existing systems—it is a redefinition of future computing.
It may become:
(1) The Next‑Generation BIOS Standard
Firmware becomes a “system instruction engine,” not just hardware initialization.
(2) The Foundation of Next‑Generation Operating Systems
OSes run on top of System Macro‑ISA rather than managing hardware.
(3) The Unified Abstraction Layer of Future Computers
All devices execute the same “system instruction code.”
(4) The Common Language of the Heterogeneous Computing Era
CPU/GPU/NPU/TPU/compute‑in‑memory/neuromorphic chips all understand the same semantics.
(5) The Compilation Target of Future Software
Programs compile to System Macro‑ISA, not x86/ARM/RISC‑V.
Conclusion
This proposal introduces a unified abstraction layer for future computing systems:
the firmware‑level System Macro‑Instruction Set (System Macro‑ISA).
Its core ideas include:
- Microkernel downward integration into firmware
- System capabilities expressed as instructions
- ISA‑independent program execution
- Capability‑based device access
- Unified heterogeneous execution
- Capability‑based security
- Extension instructions + fallback
It is not an improvement to existing systems—it is a redefinition of future computing.
Regardless of whether this architecture is ultimately adopted, it represents a possibility:
future computers may be defined not by CPU instruction sets, but by system‑level instruction semantics.