Result Handling in the Hybrid Async-Native Engine for Python


Author: @hejhdiss (Muhammed Shafin P)

Original Concept: Hybrid Async-Native Engine for Python – Design Concept

Introduction

This article clarifies how task results are handled and returned in the Hybrid Async-Native Engine, focusing on Python coroutines and C-level task results stored in the Shared Memory Bus.

A key rule: Python can only access raw binary representations of memory managed by C threads, and all Python reads are non-destructive. Python cannot modify or remove existing data in the Shared Memory Bus.

Python Coroutine Results

Every Python coroutine task submitted via spawn() returns an awaitable, Future-like handle:

future = engine.spawn(fetch_data("https://example.com"))
data = await future  # Retrieves the result when complete

Features:

Futures track completion of the coroutine.
Exceptions propagate through the Future.
Multiple awaits on the same Future are supported.
Results are stored in Python-managed memory.
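
The engine's spawn() is a design concept, not a published API, so the behaviour listed above can only be approximated here. The sketch below uses asyncio.ensure_future as a stand-in for spawn() (an assumption), and the helpers fetch_data, failing_task, and demo are hypothetical names for illustration. It shows a repeated await returning the stored result and an exception surfacing at the await site.

```python
import asyncio


async def fetch_data(url: str) -> str:
    # Stand-in for real network I/O: yields to the event loop once.
    await asyncio.sleep(0)
    return f"payload from {url}"


async def failing_task() -> None:
    # Hypothetical task that fails after yielding once.
    await asyncio.sleep(0)
    raise ValueError("upstream failure")


async def demo() -> tuple:
    # Assumption: asyncio.ensure_future plays the role of engine.spawn().
    future = asyncio.ensure_future(fetch_data("https://example.com"))
    first = await future   # waits for the coroutine to complete
    second = await future  # a second await returns the stored result

    failed = asyncio.ensure_future(failing_task())
    try:
        await failed
        error = None
    except ValueError as exc:
        # The exception raised inside the task surfaces at the await site.
        error = str(exc)
    return first, second, error


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Whether the real engine returns an asyncio Task or its own handle type is not stated in the design; only the awaitable contract is.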

C Task Results

C tasks execute on Non-GIL threads for parallel computation.

Results are written to the Shared Memory Bus in binary format.

Push / Pop Semantics

From C threads:
memory_bus_push(key, data, size) → writes freely.
memory_bus_pop(key) → destructive read; the entry is removed.
memory_bus_get(key) (optional) → read-only, non-destructive access.

From Python:
memory_bus_pop(key) → returns a binary snapshot without removing the entry (non-destructive).
memory_bus_push(key, data, size) → writes only if the key does not already exist; existing data is never modified.
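
The asymmetry between the C side and the Python side can be modelled with a toy, in-process bus. This is only a sketch under stated assumptions: MockMemoryBus and its method names (c_push, c_pop, py_read, py_push) are hypothetical, a Python dict stands in for shared memory, and a lock stands in for whatever atomicity the real C layer provides. The py_read method corresponds to what the design calls memory_bus_pop on the Python side.

```python
import threading
from typing import Optional


class MockMemoryBus:
    """Toy model of the Shared Memory Bus semantics (illustrative only)."""

    def __init__(self) -> None:
        self._slots = {}  # key -> bytes
        self._lock = threading.Lock()

    def c_push(self, key: str, data: bytes) -> None:
        # C side: writes freely, replacing any existing entry.
        with self._lock:
            self._slots[key] = bytes(data)

    def c_pop(self, key: str) -> Optional[bytes]:
        # C side: destructive read, the entry is removed.
        with self._lock:
            return self._slots.pop(key, None)

    def py_read(self, key: str) -> Optional[bytes]:
        # Python side: immutable snapshot, the entry stays in the bus.
        with self._lock:
            return self._slots.get(key)

    def py_push(self, key: str, data: bytes) -> bool:
        # Python side: write-once, refuses to touch an existing key.
        with self._lock:
            if key in self._slots:
                return False
            self._slots[key] = bytes(data)
            return True
```

A short usage example: after bus.c_push("result:1", b"\x01\x02"), repeated bus.py_read("result:1") calls keep returning the same bytes, while bus.py_push("result:1", b"\xff") returns False and leaves the entry unchanged.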

Concurrency Rules

Multiple C threads may read and write concurrently; each operation is atomic.
Python threads must check the memory-busy flag before reading.
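
The busy-flag rule might look like the following from the Python side. This is a minimal sketch, not the engine's mechanism: Slot, try_read, and read_when_idle are hypothetical names, and a threading.Event stands in for the memory-busy flag. A plain check-then-read like this is not race-free on its own; a real implementation would need the C layer to make the check and the copy atomic (for example with a sequence counter).

```python
import threading
import time
from typing import Optional


class Slot:
    """One bus entry plus its busy flag (hypothetical structure)."""

    def __init__(self) -> None:
        self.busy = threading.Event()  # set while a writer is active
        self.data = b""


def try_read(slot: Slot) -> Optional[bytes]:
    # Refuse to read while a writer holds the slot.
    if slot.busy.is_set():
        return None
    return slot.data


def read_when_idle(slot: Slot, timeout: float = 1.0,
                   poll: float = 0.001) -> Optional[bytes]:
    # Poll the flag until the slot is idle or the timeout expires.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        snapshot = try_read(slot)
        if snapshot is not None:
            return snapshot
        time.sleep(poll)  # back off instead of spinning
    return None
```

Polling keeps the Python reader from ever blocking a writer, which matches the stated goal that Python's slowness must not hold up C threads.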

Result Access Summary

Task Type	Result Access	Python Interaction	Notes
Python Coroutine	Awaitable Future	Full access	Supports exceptions, await, multiple awaits
C Function	Shared Memory / Binary	Non-destructive read only	Only when C threads are idle; Python reads raw binary
Hybrid Scenarios	N/A	Python can wait for completion	Safe only if memory is idle

Task Capacity per Thread

The tasks_per_thread setting defines how many concurrent async I/O coroutines a single GIL thread can manage. This parameter only applies to GIL threads, not to Non-GIL C threads, which execute tasks immediately and in parallel.

In effect, it controls how many Python async operations a thread can handle without blocking, keeping the Python workload isolated from C-level performance. Limiting concurrency per GIL thread prevents Python slowness from interfering with high-performance C computation, while still allowing thousands of asynchronous operations per thread.

Example scenario:

If the engine is configured with 3 GIL threads and tasks_per_thread set to 1000, each thread can handle 1000 async tasks simultaneously. In practice, this means the Python layer could manage up to 3000 concurrent I/O operations, such as HTTP requests or database queries, without affecting the Non-GIL C threads.

Another example: with 2 GIL threads and 500 tasks_per_thread, Python can handle 1000 async operations concurrently. The Non-GIL threads continue executing CPU-intensive computations independently, ensuring that Python's async tasks never block the parallel C workload.
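
The arithmetic in both examples is simply the number of GIL threads multiplied by tasks_per_thread. The sketch below computes that capacity and shows one way a per-thread cap could be enforced inside a single event loop. The function names are hypothetical, and using asyncio.Semaphore as the limiter is an assumption; the design does not say how the engine enforces the limit.

```python
import asyncio


def python_async_capacity(gil_threads: int, tasks_per_thread: int) -> int:
    # Total concurrent Python async operations across all GIL threads.
    return gil_threads * tasks_per_thread


async def run_bounded(num_jobs: int, tasks_per_thread: int) -> int:
    """Run num_jobs coroutines with at most tasks_per_thread in flight.

    Returns the peak number of jobs observed in flight at once.
    """
    limit = asyncio.Semaphore(tasks_per_thread)
    in_flight = 0
    peak = 0

    async def job() -> None:
        nonlocal in_flight, peak
        async with limit:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0)  # stand-in for an I/O wait
            in_flight -= 1

    await asyncio.gather(*(job() for _ in range(num_jobs)))
    return peak


if __name__ == "__main__":
    print(python_async_capacity(3, 1000))
    print(python_async_capacity(2, 500))
```

Calling asyncio.run(run_bounded(20, 5)) never reports a peak above 5, which is the property the tasks_per_thread setting is meant to guarantee for each GIL thread.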

Note About Result Handling

When the original article was generated, the LLM may have missed some details because the explanation was spread across multiple messages. This section clarifies the design choices for task results and the conditions under which Python may access them.

Python Coroutine Results: Returned as awaitable Futures, fully integrated with Python memory and exception handling.
C Task Results: Written to the Shared Memory Bus in binary format for parallel computation.

Python Access Rules for C Results:

Reads are non-destructive; data remains in the bus.
Reads are only allowed if no C thread is actively writing to that memory key.
Python cannot mutate C-managed memory.
Binary data must be converted by Python if structured data is needed.
Push from Python is allowed only if there is no existing data at that key (it does not modify existing data).
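
Because Python only ever sees raw bytes, turning a C result into structured data is the caller's job. The sketch below uses the standard struct module for that conversion. The record layout (a little-endian uint32 task id, a float64 value, and a uint8 status) is purely hypothetical; the design does not define any binary format, so both sides would have to agree on one.

```python
import struct

# Hypothetical layout: uint32 task id, float64 value, uint8 status.
# "<" selects little-endian with no padding between fields.
RESULT_LAYOUT = struct.Struct("<IdB")


def decode_result(raw: bytes) -> dict:
    # Convert a raw snapshot from the bus into a Python dict.
    task_id, value, status = RESULT_LAYOUT.unpack(raw)
    return {"task_id": task_id, "value": value, "ok": status == 0}


def encode_result(task_id: int, value: float, ok: bool) -> bytes:
    # Mirror of what a C writer would place in the bus (for testing).
    return RESULT_LAYOUT.pack(task_id, value, 0 if ok else 1)
```

Decoding works on a copy of the bytes, so it never touches C-managed memory, in line with the rule that Python cannot mutate it.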

This is why the result access table does not list a destructive memory pop as usable from Python.

The design ensures that Python’s relative slowness does not interfere with or block C thread memory operations, preserving true parallel performance on Non-GIL threads.

The omission in the original article was not a flaw in the design itself. Result handling had already been designed, but it was skipped during article generation. This note gives readers a complete picture of how results and memory access rules are enforced.

Note on Python Push Behavior:

Python is allowed to push data to the Shared Memory Bus because keys are managed by the engine and new pushes allocate fresh memory. Existing C-managed memory is never modified, so Python additions do not interfere with active Non-GIL C execution.