The problem nobody talks about in EMEA AI development
Every tutorial about building LLM-powered apps assumes
the same thing: you can freely send your user data to
OpenAI or Anthropic.
In EMEA, that assumption is wrong.
Tunisia is currently examining a 123-article organic law
regulating AI and automated decision-making.
France enforces GDPR strictly, and sending an LLM prompt that
contains personal data counts as processing that data.
Morocco is formalizing its digital governance framework.
If you're building AI features for EMEA customers and
forwarding their data to US inference providers, you have
a compliance problem you may not know about yet.
What I built
SovereignGuard is an open source AI privacy gateway.
It sits between your application and any LLM API.
It intercepts outbound prompts, strips PII before the
request leaves your server, and restores original values
locally after the response comes back.
The LLM never sees real customer data.
Your app code changes in only one place.
How it works — the full mechanism
Here's the complete request lifecycle:
Your App
|
| sends prompt with real customer data
v
SovereignGuard (running on YOUR server)
|
|-- creates session ID
|-- runs PII detection (fast regex + heavy recognizer)
|-- replaces PII with reversible SG tokens
|-- stores token→original mapping locally
v
LLM Provider (OpenAI / Anthropic / Mistral / etc.)
|
| receives only tokenized text
| returns tokenized response
v
SovereignGuard
|
|-- detects SG tokens in response
|-- restores original values from local mapping
|-- destroys or expires session mapping
v
Your App
|
| receives clean, usable response
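To make the lifecycle above concrete, here is a minimal round-trip sketch. It is not SovereignGuard's actual code: the `PATTERNS` table, the two simplified regexes, and the `tokenize` / `restore` function names are all assumptions made for illustration, and a real gateway would use a much richer detection stage.

```python
import re
import secrets

# Hypothetical, simplified detectors (assumption: the real recognizers are richer).
PATTERNS = {
    "TN_PHONE": re.compile(r"\+216(?:[ .-]?\d){8}(?!\d)"),
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def tokenize(prompt: str):
    """Replace detected PII with SG tokens.

    Returns the masked prompt and the token -> original mapping,
    which stays on the local server.
    """
    mapping = {}
    for entity, pattern in PATTERNS.items():
        def swap(match, entity=entity):
            # Random suffix, not derived from the PII value itself.
            token = "{{SG_" + entity + "_" + secrets.token_hex(3) + "}}"
            mapping[token] = match.group(0)
            return token

        prompt = pattern.sub(swap, prompt)
    return prompt, mapping


def restore(response: str, mapping: dict) -> str:
    """Put the original values back into the provider's response."""
    for token, original in mapping.items():
        response = response.replace(token, original)
    return response
```

Only the masked prompt would be forwarded to the provider; the mapping would be discarded once `restore` has run on the response.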
The token format
Tokens look like this:
{{SG_PERSON_NAME_a3f9b2}}
{{SG_TN_PHONE_c4d5e6}}
{{SG_TN_NATIONAL_ID_f7e3b1}}
{{SG_EMAIL_b2c3d4}}
Format: {{SG_ENTITY_TYPE_randomhex}}
The hex suffix is random per session.
Even if someone intercepts the tokenized prompt,
they cannot reverse the tokens without the local mapping.
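A small sketch of how tokens in this format could be generated and recognized. The helper names `make_token` and `parse_token` are hypothetical, and the 6-character suffix length is inferred from the examples above rather than taken from the project's code.

```python
import re
import secrets

# Matches {{SG_<ENTITY_TYPE>_<6 hex chars>}} as shown in the examples above.
TOKEN_RE = re.compile(r"\{\{SG_([A-Z][A-Z_]*)_([0-9a-f]{6})\}\}")


def make_token(entity_type: str) -> str:
    """Build a token with a random suffix that carries no information about the value."""
    return "{{SG_" + entity_type + "_" + secrets.token_hex(3) + "}}"


def parse_token(token: str):
    """Return (entity_type, suffix), or None if the string is not an SG token."""
    match = TOKEN_RE.fullmatch(token)
    return (match.group(1), match.group(2)) if match else None
```

Because the suffix comes from a random source and not from a hash of the original value, the token alone gives an eavesdropper nothing to brute-force against.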
Live proof
Input from your app:
"Contact Baha at +216 XX XXX XXX, CIN 12345678"
What the LLM actually receives:
"Contact {{SG_PERSON_NAME_a3f9b2}} at {{SG_TN_PHONE_c4d5e6}},
{{SG_TN_NATIONAL_ID_f7e3b1}}"
What your app gets back:
"Contact Baha at +216 XX XXX XXX, CIN 12345678"
restoration_completeness = 1.0
tokens_restored = 3
tokens_not_found = 0
Tested live against the DeepSeek API. It works.
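The three counters above could be produced by a restore step along these lines. This is a sketch under assumptions: the function name `restore_with_stats` is hypothetical, and the formula `restoration_completeness = tokens_restored / (tokens_restored + tokens_not_found)` is my guess at how the metric is defined, not something confirmed by the project.

```python
import re

TOKEN_RE = re.compile(r"\{\{SG_[A-Z_]+_[0-9a-f]{6}\}\}")


def restore_with_stats(response: str, mapping: dict):
    """Restore known tokens and count the ones that have no local mapping."""
    restored = 0
    not_found = 0

    def swap(match):
        nonlocal restored, not_found
        token = match.group(0)
        if token in mapping:
            restored += 1
            return mapping[token]
        # Unknown token (for example, one the model invented): leave it in place.
        not_found += 1
        return token

    text = TOKEN_RE.sub(swap, response)
    total = restored + not_found
    stats = {
        "tokens_restored": restored,
        "tokens_not_found": not_found,
        # Assumed definition; a response with no tokens counts as complete.
        "restoration_completeness": restored / total if total else 1.0,
    }
    return text, stats
```

Counting unknown tokens separately makes it easy to spot responses where the model altered or invented a placeholder.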
The EMEA gap
Most existing PII tools were built around US and Western European data formats.
Here's what they miss:
Tunisia:
- CIN: exactly 8 digits (e.g., 12345678)
- Phone: +216 followed by 8 digits
- Matricule Fiscale: 7 digits + letter + 3 digits + 3 digits
Morocco:
- CIN: 1–2 letters + 5–6 digits (e.g., AB123456)
- Phone: +212 followed by 9 digits
- ICE: exactly 15 digits
France:
- NIR (social security): 13 digits + 2-digit key
- SIRET: 14 digits
- Phone: +33 followed by 9 digits
As far as I know, SovereignGuard is the only open source gateway
with native recognizers for these patterns.
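As a rough illustration, a subset of the formats listed above can be written as regular expressions. These patterns are a sketch derived only from the descriptions in this post, not the project's actual recognizers; the entity names and the `detect` helper are hypothetical. Note that a bare 15-digit run fits both the Moroccan ICE and the French NIR description, so a real recognizer would need surrounding context (keywords, country hints) to tell them apart.

```python
import re

# Patterns follow the formats described in this post (assumption: no checksum validation).
RECOGNIZERS = {
    "TN_PHONE": r"\+216(?:[ .-]?\d){8}(?!\d)",   # +216 followed by 8 digits
    "TN_NATIONAL_ID": r"\b\d{8}\b",              # CIN: exactly 8 digits
    "MA_PHONE": r"\+212(?:[ .-]?\d){9}(?!\d)",   # +212 followed by 9 digits
    "MA_NATIONAL_ID": r"\b[A-Z]{1,2}\d{5,6}\b",  # CIN: 1-2 letters + 5-6 digits
    "MA_ICE": r"\b\d{15}\b",                     # ICE: exactly 15 digits
    "FR_PHONE": r"\+33(?:[ .-]?\d){9}(?!\d)",    # +33 followed by 9 digits
    "FR_SIRET": r"\b\d{14}\b",                   # SIRET: 14 digits
    "FR_NIR": r"\b\d{13}\d{2}\b",                # NIR: 13 digits + 2-digit key
}
COMPILED = {name: re.compile(pattern) for name, pattern in RECOGNIZERS.items()}


def detect(text: str):
    """Return (entity_type, matched_text) pairs for every recognizer hit."""
    hits = []
    for entity, pattern in COMPILED.items():
        for match in pattern.finditer(text):
            hits.append((entity, match.group(0)))
    return hits
```

Digit-only patterns such as the 8-digit CIN will also match order numbers and similar values, which is one reason a pure-regex pass is usually paired with a heavier, context-aware recognizer.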
Integration — one line
If you're using the OpenAI Python SDK:
from openai import OpenAI

# Before
client = OpenAI(api_key="sk-...")

# After: that's it
client = OpenAI(
    api_key="sg-your-gateway-key",
    base_url="http://localhost:8000/v1",
)
Your entire application stays the same.
SovereignGuard handles everything transparently.
Running it
Docker (recommended):
git clone https://github.com/bahaeddinmselmi/sovereignguard
cd sovereignguard
cp .env.example .env
# edit .env with your provider API key
docker compose up --build
Smoke test:
curl http://localhost:8000/health
# {"status":"healthy","gateway":"SovereignGuard","version":"0.2.0"}
What's next
Countries I need recognizers for:
🇩🇿 Algeria
🇩🇪 Germany
🇳🇬 Nigeria
🇸🇦 Saudi Arabia
🇦🇪 UAE
🇸🇳 Senegal
Adding a country recognizer takes about 30 minutes.
See docs/adding-recognizers.md for the guide.
GitHub: https://github.com/bahaeddinmselmi/sovereignguard
If you're building AI features for EMEA customers,
this layer belongs in your stack.
Star it, break it, contribute to it.
