Building AI Services on Tangle: Inference and Sandboxes
Day 5 of the Tangle Re-Introduction Series
An AI agent just analyzed your private financial data, generated a Python script, executed it, and returned investment recommendations. How do you know the model was real? That your data wasn't logged? That the code ran correctly? You don't. That's the problem Tangle solves.
The Full Picture: Inference + Sandbox, Chained
Tangle chains verified inference and sandboxed execution into one accountable pipeline for AI agents.
Most posts about verifiable AI focus on one piece. We're starting with the combined pattern because that's what agents actually do in practice: reason, then act.
```
Agent receives task
  → Inference service generates code (verified model, TEE-attested)
  → Sandbox service executes code (isolated, resource-tracked)
  → Agent returns verified result with full accountability chain
```
Here's what this looks like as a Tangle blueprint:
```rust
use blueprint_sdk::Router;
use blueprint_sdk::tangle::TangleLayer;
use blueprint_sdk::tangle::extract::{TangleArg, TangleResult};

/// Agent that analyzes data using generated code
pub async fn analyze(
    TangleArg(request): TangleArg<AnalysisRequest>,
) -> TangleResult<AnalysisResult> {
    // Step 1: Generate analysis code via inference
    let TangleResult(code_response) = inference(TangleArg((
        "gpt-4".to_string(),
        format!("Write Python to analyze this data: {:?}", request.schema),
        InferenceConfig::default(),
    ))).await;

    // Step 2: Execute the generated code in sandbox
    let TangleResult(exec_result) = execute(TangleArg((
        code_response.text.clone(),
        "python3".to_string(),
        request.data.clone(),
        SandboxConfig::default(),
    ))).await;

    TangleResult(AnalysisResult {
        output: exec_result.stdout,
        code_used: code_response.text,
        model_hash: code_response.model_hash,
    })
}
```
The customer gets the analysis result, the code that produced it, and the model hash proving which model generated the code. Full accountability chain, two services, one request. Tangle is a purpose-built infrastructure where AI inference and code execution both carry cryptoeconomic guarantees.
Now let's look at how each piece works.
The Trust Problem
Current AI inference APIs require blind trust in the provider.
When you call an inference API, you're trusting:
- They're running the model they claim (not a cheaper substitute)
- They're not logging or selling your prompts
- They're not modifying outputs (filtering, biasing, watermarking)
- They're actually running inference (not returning cached or fabricated responses)
Most inference APIs ask you to trust their reputation. Tangle makes these properties verifiable. And the same class of problems applies to code execution: operators could observe your data, modify results, or lie about resource usage.
How Payment Works: x402
Agents pay per-request using x402 HTTP payment headers, no accounts or API keys required.
x402 is an HTTP-native micropayment protocol, built on the HTTP 402 status code, that lets AI agents pay for compute with a signed payment header and settle on-chain in seconds. This is where Tangle diverges from other "verifiable compute" projects: agents don't sign up for accounts or manage API keys. They pay per call, machine to machine.
The flow:
```
Agent sends request with x402 payment header (signed token amount)
  → Operator verifies payment is sufficient for the job
  → Executes job inside TEE
  → Returns result + attestation
  → Payment settles automatically on-chain
```
In practice, the agent's HTTP request includes a payment header alongside the normal job payload. The following is pseudocode illustrating the x402 flow (no official Python SDK exists yet):
```python
# Pseudocode -- conceptual x402 flow
import requests

response = requests.post(
    "https://operator.example/inference",
    headers={
        "X-402-Payment": sign_payment(amount=0.003, asset="TNT"),
        "Content-Type": "application/json",
    },
    json={
        "model": "gpt-4",
        "prompt": "Analyze this portfolio...",
        "config": {"max_tokens": 2048},
    },
)

# Response includes result + attestation
result = response.json()
assert verify_attestation(result["attestation"])
```
This means agents can autonomously discover operators, compare prices, pay for compute, and verify results. No human in the loop. The economic layer and the verification layer are the same system: if an operator cheats, they lose their stake. If they deliver, they get paid. x402 makes this settlement automatic.
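To make the header round trip concrete, here is a toy sketch of a `sign_payment` helper and the operator-side check. This is illustrative only: the real x402 scheme uses on-chain-verifiable signatures, and the HMAC, field layout, and key below are stand-ins, not the protocol's wire format.

```python
# Toy sketch of an x402-style payment header round trip.
# HMAC stands in for the real on-chain signature scheme.
import hashlib
import hmac
import json

SECRET = b"agent-demo-key"  # hypothetical shared key for the sketch

def sign_payment(amount: float, asset: str, secret: bytes = SECRET) -> str:
    """Agent side: build a header value of payload + authentication tag."""
    payload = json.dumps({"amount": amount, "asset": asset}, sort_keys=True)
    tag = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{tag}"

def verify_payment(header: str, min_amount: float, secret: bytes = SECRET) -> bool:
    """Operator side: reject tampered headers, then check the amount covers the job."""
    payload, tag = header.rsplit("|", 1)
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected):
        return False
    return json.loads(payload)["amount"] >= min_amount

header = sign_payment(amount=0.003, asset="TNT")
assert verify_payment(header, min_amount=0.001)       # covers the job
assert not verify_payment(header, min_amount=0.01)    # underpaid
```

The key property is that the payment and the job travel in the same request, so a cheating operator can't claim payment without producing a verifiable result.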
Inference Service
Verifiable AI inference runs a model inside a TEE with cryptographic proof of execution.
An inference service runs AI models on behalf of customers. The customer sends a prompt, the operator runs it through the model, and returns the response with proof.
The Blueprint
The inference blueprint defines request types, runs the model, and returns an attested response.
Note: The inference logic below is application code. The SDK provides the service framework (Router, TangleArg, TangleResult, TangleLayer) while you bring the AI logic. Model loading and attestation are your responsibility, not SDK types.
```rust
use blueprint_sdk::Router;
use blueprint_sdk::tangle::TangleLayer;
use blueprint_sdk::tangle::extract::{TangleArg, TangleResult};
use serde::{Deserialize, Serialize};

// --- Application types (your code, not SDK types) ---
#[derive(Serialize, Deserialize)]
pub struct InferenceConfig {
    pub max_tokens: u32,
    pub temperature: f32,
    pub top_p: f32,
}

#[derive(Serialize, Deserialize)]
pub struct InferenceResponse {
    pub text: String,
    pub tokens_used: u32,
    pub model_hash: [u8; 32],
}

/// Your model loading logic -- bring your own inference runtime
/// (e.g., candle, llama.cpp bindings, or an external API client)
async fn load_model(model_id: &str) -> anyhow::Result<MyModel> {
    // Application code: load weights, verify hash, etc.
    todo!("Implement model loading for your runtime")
}

/// Job 0: Run inference on a prompt
///
/// The SDK handles job routing and result submission.
/// You handle model loading and inference execution.
pub async fn inference(
    TangleArg((model_id, prompt, config)): TangleArg<(String, String, InferenceConfig)>,
) -> TangleResult<InferenceResponse> {
    let model = load_model(&model_id).await
        .expect("model loading failed");
    let output = model.generate(&prompt, config.max_tokens, config.temperature);
    let hash = model.weight_hash();

    TangleResult(InferenceResponse {
        text: output.text,
        tokens_used: output.token_count,
        model_hash: hash,
    })
}

pub fn router() -> Router {
    Router::new()
        .route(0, inference.layer(TangleLayer))
}
```
Verification
Tangle combines TEE attestation, multi-operator consensus, and on-chain model hashes.
TEE attestation. When TEE hardware is available, every response can include an attestation signed by the enclave, proving that the code ran inside an enclave, that a specific model binary was loaded, and that the hardware is genuine. Customers verify attestations client-side. An on-chain model registry mapping model identifiers to verified weight hashes is in development. Combined with cryptoeconomic settlement via x402, this gives agents a single request that both pays for compute and verifies the result.
Multi-operator consensus. For additional security, configure inference to require multiple operators. If three operators independently run the same prompt and two must agree, an operator running a cheaper substitute model gets caught. This works today, even without TEE hardware.
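A minimal sketch of that agreement check: hash each operator's response and require a quorum on one digest. The threshold and hashing scheme here are illustrative assumptions, not an SDK API.

```python
# Sketch of 2-of-3 multi-operator consensus on inference outputs.
from collections import Counter
import hashlib

def consensus(results, threshold=2):
    """Return the result at least `threshold` operators agree on, else None."""
    digests = [hashlib.sha256(r.encode()).hexdigest() for r in results]
    digest, count = Counter(digests).most_common(1)[0]
    if count < threshold:
        return None
    return results[digests.index(digest)]

# Two honest operators agree; the third ran a cheaper substitute model
assert consensus(["Paris", "Paris", "Lyon"]) == "Paris"

# No quorum at all: escalate or refuse to settle
assert consensus(["a", "b", "c"]) is None
```

In practice the dissenting operator's payout is withheld and its stake put at risk, which is what makes substitution economically irrational.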
Hardware Reality
TEE-based inference works but has memory and GPU constraints today.
SGX memory is limited (~256MB), AMD SEV-SNP is more generous but still bounded, and GPU TEE support (NVIDIA H100 Confidential Computing) exists but isn't widely deployed. For now, TEE attestation covers model loading and result signing, with GPU computation verified through multi-operator consensus. This is a real tradeoff, not a solved problem.
What This Doesn't Solve
Verification proves fidelity and execution integrity, not output quality or model safety.
Output quality. We verify the right model ran. We don't verify the output is "good." Quality is subjective.
Prompt injection. If the model itself has been fine-tuned maliciously, the TEE faithfully runs the malicious model. Verification proves fidelity, not safety.
Side-channel leakage. TEEs have known side-channel vulnerabilities. For most use cases, this risk is acceptable. For state-level adversaries, it isn't.
Sandbox Service
Sandboxed execution runs untrusted code in an isolated container with strict resource limits.
A sandbox service executes arbitrary code in an isolated environment. The customer sends code, the operator runs it, and returns the result.
Sandboxes need isolation in both directions: protecting operators from malicious customer code, and protecting customers from operators who might observe data or tamper with results.
The Blueprint
The sandbox blueprint accepts code, a language, inputs, and a config, then executes in isolation.
Note: VM-level sandboxing is infrastructure provided by the Blueprint Manager (using the vm-sandbox feature with cloud-hypervisor), not an application-level API. The types below are application code; the SDK provides the service framework.
```rust
use blueprint_sdk::Router;
use blueprint_sdk::tangle::TangleLayer;
use blueprint_sdk::tangle::extract::{TangleArg, TangleResult};
use serde::{Deserialize, Serialize};
use std::process::Command;

// --- Application types (your code, not SDK types) ---
#[derive(Serialize, Deserialize)]
pub struct SandboxConfig {
    pub max_memory_mb: u32,
    pub max_cpu_seconds: u32,
    pub allow_network: bool,
}

#[derive(Serialize, Deserialize)]
pub struct ExecutionResult {
    pub stdout: String,
    pub stderr: String,
    pub exit_code: i32,
}

/// Job 0: Execute code in isolated environment
///
/// The operator's infrastructure handles VM-level isolation.
/// This function implements the execution logic within that sandbox.
pub async fn execute(
    TangleArg((code, language, inputs, config)): TangleArg<(String, String, Vec<u8>, SandboxConfig)>,
) -> TangleResult<ExecutionResult> {
    // Application code: spawn a subprocess with resource limits.
    // VM-level isolation is handled by the Blueprint Manager's
    // vm-sandbox feature, not by application code.
    let output = Command::new(&language)
        .arg("-c")
        .arg(&code)
        .output()
        .expect("execution failed");

    TangleResult(ExecutionResult {
        stdout: String::from_utf8_lossy(&output.stdout).to_string(),
        stderr: String::from_utf8_lossy(&output.stderr).to_string(),
        exit_code: output.status.code().unwrap_or(-1),
    })
}

pub fn router() -> Router {
    Router::new()
        .route(0, execute.layer(TangleLayer))
}
```
When the Blueprint Manager runs with the vm-sandbox feature enabled, each execution runs in a fresh VM (powered by cloud-hypervisor) with configurable memory limits, CPU quotas, and network isolation. VMs are destroyed after execution. No state persists.
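In Python terms, the per-run lifecycle the execute job implements can be sketched with the standard library alone: spawn, bound, collect, discard. The wall-clock timeout below stands in for the CPU and memory limits in SandboxConfig; the real isolation boundary is the Blueprint Manager's VM, not this code.

```python
# Sketch of the execute job's core lifecycle: run code in a subprocess
# with a time bound and capture stdout/stderr/exit code. Illustrative
# only -- VM isolation and resource quotas live in the operator's
# infrastructure, not in this function.
import subprocess
import sys

def run_sandboxed(code: str, timeout_s: int = 5) -> dict:
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=timeout_s,
        )
        return {"stdout": proc.stdout, "stderr": proc.stderr,
                "exit_code": proc.returncode}
    except subprocess.TimeoutExpired:
        # Runaway code is killed at the time bound
        return {"stdout": "", "stderr": "timeout", "exit_code": -1}

result = run_sandboxed("print(2 + 2)")
assert result["stdout"].strip() == "4"
assert result["exit_code"] == 0
```

Because each run starts from a fresh interpreter (or, in production, a fresh VM) and returns only the captured streams, no state leaks between customers' executions.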
Verification Approaches
Verification strategy depends on whether the workload is TEE-attested, deterministic, or general.
TEE-enabled execution. Hardware attestation proves the sandbox ran the code correctly. Strongest guarantee, but requires TEE-capable operators.
Deterministic code (WASM, seeded execution). Replay verification. Run the same inputs through multiple operators and compare outputs. Exact match required.
General code (Python, Node). Most real code isn't deterministic. Dict ordering, floating-point operations, and timing-dependent behavior vary across runs. For these workloads, multi-operator consensus with semantic similarity checking is the practical approach.
Honest limitation: For workloads that are neither TEE-attested nor consensus-verifiable, economic security (operator stake at risk) is the primary deterrent. This is weaker than cryptographic verification, but often sufficient for lower-stakes computation.
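The deterministic and general cases above reduce to two different comparison functions. A minimal sketch of both, where the tolerance value is an illustrative choice rather than a protocol constant:

```python
# Two verification modes: bit-exact match for deterministic (WASM/seeded)
# runs, and tolerance-based comparison for general code whose numeric
# output can drift slightly across operators.
def exact_match(outputs):
    """Deterministic workloads: every operator must produce identical bytes."""
    return len(set(outputs)) == 1

def within_tolerance(values, rel_tol=1e-6):
    """Nondeterministic numeric workloads: results must agree within rel_tol."""
    lo, hi = min(values), max(values)
    return hi - lo <= rel_tol * max(abs(lo), abs(hi), 1.0)

assert exact_match([b"\x01\x02", b"\x01\x02", b"\x01\x02"])
assert not exact_match([b"\x01", b"\x02"])

# Three operators report nearly identical floating-point results
assert within_tolerance([3.141592653, 3.141592654, 3.141592653])
assert not within_tolerance([3.14, 2.71, 3.14])
```

Semantic similarity for free-text output is harder than either of these and is where the open research problems live.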
How Tangle AI Services Compare
Tangle is the only platform combining TEE-attested inference, sandboxed execution, and x402 payments.
Here's how the full stack compares to existing inference and execution platforms:
| Feature | Tangle | Together AI | Replicate | Ritual |
|---|---|---|---|---|
| Inference verification | TEE + multi-operator | None (trust-based) | None | On-chain proof |
| Code execution | Sandboxed + verified | N/A | Container-based | N/A |
| Payment model | x402 micropayments | API key + billing | Per-prediction | Token-based |
| Model substitution protection | Multi-operator consensus (canary prompts on roadmap) | None | None | ZK attestation |
| Agent-native | Yes | No | No | Partial |
Real Use Case: AI Agent Tool Execution
Agents generate code via inference and execute it in a sandbox, getting verified results in one pipeline.
```python
# Agent generates this code via inference service
def analyze_data(data):
    import pandas as pd
    df = pd.DataFrame(data)
    return {
        "mean": df["value"].mean(),
        "std": df["value"].std(),
        "outliers": df[df["value"] > df["value"].mean() + 2 * df["value"].std()].to_dict(),
    }
```
The sandbox executes it safely, the agent gets results, and with TEE the operator can't see the data. Chain this with the inference service and you get the full pattern from the top of this post.
What's Shipped vs. What's Coming
The Blueprint SDK, multi-operator verification, and x402 settlement are live; TEE integration is next.
Developers deserve to know what works today and what's still being built. Blueprint SDK v0.1.0-alpha.22 (Rust 2024 edition, minimum Rust 1.88) enables building verifiable AI inference services in under 200 lines of Rust, using the same Router and extractor patterns shown throughout this series.
Shipped. The Blueprint SDK (Router, TangleArg, TangleResult, TangleLayer), multi-operator verification, container isolation, and x402 payment settlement. You can build and deploy inference and sandbox services today using these primitives.
Design complete, building now. TEE attestation integration with the Blueprint SDK, and the on-chain model registry for hash-verified model loading.
Roadmap. A canary prompt is a challenge input with a known expected output used to detect model substitution without the operator's knowledge. Canary prompts would run on a configurable interval, providing ongoing model verification without degrading throughput. Full GPU-in-TEE support will arrive as NVIDIA hardware matures. WASM deterministic replay will enable bit-exact verification across operators.
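A sketch of how such a check could work, assuming deterministic decoding (temperature 0) so the expected answer hashes stably. The canary table, IDs, and answers below are invented for illustration; this is not a shipped API.

```python
# Sketch of canary prompt verification: compare the operator's answer
# to a precomputed digest of the known-good answer for that challenge.
import hashlib

# Hypothetical canary table: challenge id -> digest of the expected answer
CANARIES = {
    "canary-001": hashlib.sha256(b"expected canary answer").hexdigest(),
}

def check_canary(canary_id: str, operator_output: str) -> bool:
    """True if the operator's output matches the known-good digest."""
    expected = CANARIES[canary_id]
    return hashlib.sha256(operator_output.encode()).hexdigest() == expected

assert check_canary("canary-001", "expected canary answer")
assert not check_canary("canary-001", "output from a cheaper model")
```

Storing only digests means the canary table itself never reveals the expected answers to operators, which is what keeps the challenge useful.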
The gap between "shipped" and "roadmap" is real. Multi-operator consensus and economic security work today. Hardware-level attestation is close. Full deterministic replay for arbitrary code is a harder research problem. We're building in that order because each layer adds value independently.
Frequently Asked Questions
Common questions about Tangle's AI inference, sandbox execution, and verification services.
What is verifiable AI inference?
Verifiable AI inference is running a model inside a TEE so the customer receives cryptographic proof of which model executed their prompt.
How does TEE attestation work for AI models?
TEE hardware signs an attestation proving that a specific model binary loaded inside an isolated enclave the operator cannot observe.
What is a sandboxed code execution service?
A sandboxed execution service runs untrusted code in an isolated container with no capabilities, no persistent state, and strict resource limits.
How does x402 payment work for AI services?
An agent sends an HTTP request with a signed payment header; the operator verifies payment, executes the job, and settlement happens on-chain automatically.
Can AI agents chain inference and code execution together?
Yes. A single blueprint can call the inference service to generate code and the sandbox service to execute it, returning a fully verified result chain.
What is canary prompt verification?
A canary prompt is a challenge input with a known expected output, sent periodically to detect whether an operator has substituted a cheaper model.
How does Tangle prevent model substitution?
Tangle uses multi-operator consensus (multiple operators must agree on results) and TEE hardware attestation to detect substitution. Weight hash verification via an on-chain model registry and canary prompts are on the roadmap.
What's Next
Day 6 covers Tangle's roadmap, ecosystem positioning, and infrastructure bets.
Day 6 covers where Tangle is headed: the features in the pipeline, where we fit in verifiable compute, and the bets we're making on where AI infrastructure goes next.
Start building:
Join the conversation: