Hassann

Posted on May 21 • Originally published at apidog.com

Self-Hosted API Tools: Should You Leave the Cloud?

Self-hosted API tools are no longer just a compliance checkbox. After GitHub confirmed attackers stole data from roughly 3,800 internal repositories through a poisoned VS Code extension on one employee’s laptop, API teams have a practical question to answer: where do your API specs, shared collections, test data, and environment secrets actually live?

For many teams, the answer is: “in a vendor cloud, and we are not fully sure where.” That is not automatically wrong. Cloud-synced API tooling is convenient, fast to adopt, and useful for collaboration. But the GitHub incident is a good trigger to review your API source-of-truth and decide deliberately whether it belongs inside your perimeter or outside it.

TL;DR

Self-hosted API tools, also called on-premise API platforms, keep your OpenAPI specs, request collections, test data, and credentials inside infrastructure you control instead of a vendor’s multi-tenant cloud.

After the May 2026 GitHub breach, in which attackers exfiltrated data from about 3,800 internal repositories through a trojanized VS Code extension, more teams are weighing data residency against cloud convenience.

Use self-hosted or offline API tooling when:

You work in a regulated environment.
Your API client stores credentials or customer-like test data.
Your network is air-gapped or restricted.
You need a clear chain of custody for auditors.
You want to reduce dependency on one vendor’s cloud.

Cloud sync still makes sense when:

Your team needs real-time collaboration.
Your API data is low sensitivity.
You do not have the ops capacity to run self-hosted infrastructure safely.

Apidog supports all three models: a cloud product, a self-hosted/on-premise deployment, and an offline mode, so teams can choose where API data lives.

What happened at GitHub, and why API teams should care

On May 20, 2026, GitHub confirmed that attackers had stolen data from approximately 3,800 internal code repositories.

The entry point was not a zero-day in GitHub’s core platform. It was a poisoned VS Code extension installed on a GitHub employee’s device. Once the extension ran with the employee’s permissions, attackers gained a foothold inside GitHub’s internal network.

The threat group, tracked as TeamPCP, is known for supply-chain attacks across npm, PyPI, and PHP package ecosystems. Security reporting indicates the group put the stolen dataset up for sale on underground forums for more than $50,000.

GitHub said it found no evidence that customer data stored outside its internal repositories was affected, and the investigation is ongoing.

This was not GitHub’s only rough month. In April 2026, cloud security firm Wiz disclosed CVE-2026-3854, a critical remote code execution flaw in GitHub’s internal Git infrastructure that, before it was patched, exposed millions of repositories. SecurityWeek documented the vulnerability and its scope.

For API teams, the key point is this: GitHub is often more than a code host. It is also where your API source-of-truth lives.

That can include:

OpenAPI and Swagger specs
Request collections committed to repos
.env.example files
Terraform for API gateways
CI workflows with deploy tokens
Integration test fixtures
Mock-server definitions
Internal API documentation

The stolen GitHub data was GitHub’s own internal code, not customer repositories. That distinction matters.

But the pattern generalizes: a malicious developer tool running on one laptop can become a path into a much larger environment. The same attack chain can work against any vendor whose product connects to your development workflow.

This article focuses on the platform-level question: should your API design data live in a vendor cloud at all?

What your API client may sync to a vendor cloud

Before choosing cloud, self-hosted, or offline tooling, inventory what your API client actually syncs.

In a shared cloud workspace, the following data often leaves local machines and lands in vendor infrastructure.

1. API specifications

OpenAPI documents describe:

Endpoints
Parameters
Schemas
Authentication flows
Error responses
Service boundaries

A full API spec is not always a secret, but it is a map. If exposed, it can shorten an attacker’s reconnaissance by showing which endpoints exist, what identifiers are used, and where authentication boundaries may be.

2. Request collections and saved examples

Saved requests often contain realistic payloads.

Examples may include:

{
  "email": "customer@example.com",
  "accountId": "acct_12345",
  "plan": "enterprise",
  "internalRegion": "us-east-1-private"
}

Saved responses can be more sensitive because they may contain entire user objects, account records, internal IDs, or staging data copied during debugging.
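
One way to limit this exposure is to scrub saved examples before they are shared. The following is a minimal sketch, assuming examples are plain JSON-like structures. The key list `SENSITIVE_KEYS` and the function name `scrub_example` are hypothetical and not part of any API client.

```python
import re
from typing import Any

# Hypothetical set of field names to blank out; adjust to your own payloads.
SENSITIVE_KEYS = {"email", "accountid", "internalregion"}
# Deliberately loose email pattern, used only to catch stray addresses.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def scrub_example(value: Any, key: str = "") -> Any:
    """Return a copy of a JSON-like value with sensitive fields replaced."""
    if isinstance(value, dict):
        return {k: scrub_example(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [scrub_example(item, key) for item in value]
    if key.lower() in SENSITIVE_KEYS:
        return "<redacted>"
    if isinstance(value, str):
        # Catch email addresses stored under unexpected keys.
        return EMAIL_RE.sub("<redacted-email>", value)
    return value
```

A key list like this only catches fields you already know about, so it complements a review of saved responses rather than replacing it.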

3. Environment variables and secrets

This is usually the highest-risk category.

Teams often store values like:

API_BASE_URL=https://api.internal.example.com
BEARER_TOKEN=eyJhbGciOi...
OAUTH_CLIENT_SECRET=...
DATABASE_URL=postgres://...

If those environments sync to a cloud workspace, production or staging credentials may now exist in a third-party multi-tenant database.
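
Before an environment is exported or shared, secret-looking values can be replaced with placeholders. This is a minimal sketch, assuming dotenv-style `KEY=VALUE` text. The hint list `SECRET_HINTS`, the placeholder string, and the function name `mask_env` are illustrative assumptions.

```python
# Hypothetical substrings that mark a variable name as secret-bearing.
SECRET_HINTS = ("TOKEN", "SECRET", "PASSWORD", "KEY", "DATABASE_URL")


def mask_env(text: str) -> str:
    """Replace the values of secret-looking keys in KEY=VALUE text."""
    masked = []
    for line in text.splitlines():
        stripped = line.strip()
        # Leave blank lines, comments, and non-assignments untouched.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            masked.append(line)
            continue
        name = stripped.partition("=")[0].strip()
        if any(hint in name.upper() for hint in SECRET_HINTS):
            masked.append(f"{name}=<set-locally>")
        else:
            masked.append(line)
    return "\n".join(masked)
```

Name-based matching misses secrets stored under neutral names, so treat the result as a first pass.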

If you have debugged teammate sync issues before, you know how opaque this layer can become. See the diagnostic guide on Postman environment sync issues for a closer look at this surface.

4. Test data and mock definitions

Mock servers and automated tests can expose:

Example customer records
Internal business rules
Validation logic
State transitions
Role and permission behavior

Even when individual records are fake, the structure can reveal how your system works.

5. Workspace metadata and activity

This can include:

Service names
Folder structure
Comments
Team member lists
Change history
Internal naming conventions

Individually, these may seem minor. Together, they can describe your org chart and product roadmap.

Cloud sync is not reckless by default. But the synced data is real, sensitive in aggregate, and worth classifying before an incident forces the question.

For a deeper breakdown of the cloud-sync model, see Is Postman secure?

The real attack surface of cloud-synced API tooling

Cloud-synced API tooling adds attack surface that does not exist when data stays local.

That does not mean every cloud vendor is weak. Many vendors have stronger security teams than their customers. The issue is structural: the more places data can be reached from, the more places it can be attacked from.

Vendor compromise

A multi-tenant SaaS platform that stores API specs and credentials for many companies is a high-value target.

When you use it, you inherit:

The vendor’s security posture
Their patch cadence
Their employee device security
Their incident response process
Their cloud and sub-processor exposure

The GitHub incident is a concrete example: one employee device became the weak link, and the blast radius included thousands of internal repositories.

Account takeover

Cloud tools rely on accounts, sessions, API tokens, and OAuth integrations.

Attackers can get access through:

Phished credentials
Reused passwords
Leaked session tokens
OAuth token theft
Compromised SSO accounts

MFA helps and should be enforced, but it does not eliminate session hijacking or token theft.

Over-broad workspace sharing

Shared workspaces are useful, but they tend to accumulate stale access.

Common patterns:

Contractors added temporarily and never removed
New hires added to broad “Engineering” workspaces
Old production environments left in shared projects
Teams copying collections with secrets still attached

Access reviews need to include API tooling, not just source control and cloud infrastructure.

Extensions, plugins, and integrations

This is the vector that hit GitHub.

Developer tools often support:

IDE extensions
API client plugins
CI integrations
Browser extensions
Third-party automation

Each integration can run with permissions close to the developer’s own access. A poisoned extension can read local files, tokens, synced data, or API client configuration.

Supply-chain attacks through npm packages, PyPI packages, and editor extensions are now an established path into developer environments.

Telemetry, logs, and sub-processors

Cloud tools may generate:

Crash reports
Debug logs
Analytics events
Request traces
Support snapshots

If request bodies or headers are captured, they may include credentials or sensitive payloads.
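
Where you control the logging path, for example in your own test runners or proxies, headers can be redacted before they are written anywhere. This is a minimal sketch. The header list is an assumption (`x-api-key` is a common convention, not a standard), and `redact_headers` is a hypothetical helper.

```python
# Assumed list of credential-bearing headers, compared case-insensitively.
SENSITIVE_HEADERS = {
    "authorization",
    "proxy-authorization",
    "cookie",
    "set-cookie",
    "x-api-key",
}


def redact_headers(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of the headers that is safe to log."""
    return {
        name: "<redacted>" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }
```

This does not change what a vendor's own telemetry collects. It only covers logs produced by code you run.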

Your data may also flow through:

The vendor’s cloud provider
Analytics tools
Support tools
Logging systems
Error monitoring providers

The lesson is similar to the Vercel breach and what it taught API teams: map which third parties can touch sensitive data, then shrink that map where the risk justifies it.

To keep this balanced: mature cloud vendors usually encrypt data at rest and in transit, run security programs, hold SOC 2 or ISO 27001 certifications, and patch quickly. A small team’s data may be safer in a mature vendor cloud than on a poorly maintained self-hosted server.

The point is not “cloud is unsafe.” The point is “cloud sync is a trade-off.”

Compliance and data residency: when self-hosted becomes required

For regulated teams, self-hosted API tooling may not be optional. It may be required by law, contract, or audit policy.

Data residency and sovereignty

Regulations like GDPR and national data-localization laws can restrict where personal data is stored or processed.

If your API test data or saved request payloads contain personal data of EU residents, syncing that data to a US-region multi-tenant database may create compliance risk.

A self-hosted API platform running in your data center or in a cloud region you explicitly control gives you direct control over data location.

The European Data Protection Board’s guidance is a reference point for cross-border transfer rules.

Industry-specific frameworks

Some industries have explicit requirements for regulated data handling:

Healthcare teams handling PHI under HIPAA
Payment teams under PCI DSS
US federal vendors under FedRAMP
Defense contractors under CMMC

In sensitive environments, air-gapped or on-premise tooling may be the only acceptable option.

For that scenario, see air-gapped API testing tools for secure environments.

Contractual data-handling obligations

Even without formal regulation, customer contracts may limit where data can go.

Example: a customer contract says their data cannot be processed by unapproved sub-processors. If your API client syncs test payloads containing that customer’s data to the tool vendor’s cloud, you may violate that commitment without realizing it.

Audit and chain of custody

Auditors often ask:

Who can access this data, and how do you know?

With self-hosted infrastructure, the answer is easier to evidence:

Data is on your servers.
Access is behind your network controls.
Logs are in your systems.
Identity policies are yours.
Backups are under your process.

With multi-tenant SaaS, part of the answer is always: “we trust the vendor.”

That may be acceptable, but it is harder to prove.

When self-hosted wins vs. when cloud wins

Self-hosting is not automatically better. It is an engineering trade-off with real operational cost.

Factor	Cloud-synced API tooling	Self-hosted / on-premise / offline
Setup and maintenance	Minutes; vendor runs everything	You provision, patch, back up, monitor
Real-time collaboration	Strong; built for distributed teams	Works, but inside your network or VPN
Data residency control	Limited to vendor regions and policy	Full; you choose the exact location
Attack surface	Vendor cloud, account auth, sub-processors	Your perimeter only
Compliance fit (HIPAA, PCI, FedRAMP)	Depends on vendor certifications	Strong; data never leaves your control
Cost model	Per-seat subscription	License plus your infrastructure and ops time
Works air-gapped or offline	No	Yes
Disaster recovery	Vendor’s responsibility	Yours to design and test

Choose self-hosted or offline when

Use self-hosted, on-premise, or offline API tooling when:

You are in a regulated industry.
You store production credentials in your API client.
Your collections include customer data or customer-like test data.
You work in air-gapped or restricted networks.
Security or legal teams need a defensible chain of custody.
You want to reduce concentration of critical data in one vendor.

In these cases, the operational overhead is not waste. It is the cost of control.

Choose cloud sync when

Cloud-synced tooling can be the right choice when:

Real-time collaboration is central to your workflow.
Your team is distributed across time zones.
You do not have ops capacity to run infrastructure securely.
Your API data is low sensitivity.
You are moving fast in early-stage product work.

A poorly maintained self-hosted server can be worse than a mature vendor cloud.

The best model is often split by data class:

Use self-hosted or offline tooling for secrets, production-like data, and regulated workloads.
Use cloud workspaces for low-risk collaboration, public docs, and non-sensitive API design.

Revisit the decision periodically as your team, data, and compliance exposure change.

Keeping API data inside your perimeter with Apidog

If the GitHub breach has you reviewing where API data lives, the practical move is to choose tooling that gives you deployment options.

Apidog is an all-in-one API platform for design, debugging, testing, mocking, and documentation. It supports cloud usage, but it also supports self-hosted and offline workflows for teams that need tighter control.

This is not an anti-cloud argument. Apidog also offers a cloud product, and for many teams that is the right fit.

The point is that teams can choose where API specs, collections, test data, and credentials live.

Option 1: Self-hosted / on-premise deployment

Apidog offers a fully self-hosted, on-premise deployment for enterprises.

You can run the platform inside:

A private data center
Your own cloud VPC
A hybrid environment

According to the Apidog self-hosting documentation, supported deployment models include:

Standalone Docker, with the application, MySQL database, and Redis cache running on hosts you own
Hybrid deployment, where the application runs in your environment while database/cache services use managed infrastructure you control
Kubernetes for enterprise-scale deployments

In this model, your API data stays behind your controls:

OpenAPI specs
Collections
Test data
Environment variables
Mock definitions
Access logs

For audit questions like “who can access this data?”, the answer is concrete because the infrastructure is yours.

The self-hosted edition also supports self-hosted test runners. That means automated API tests can execute inside your network instead of routing through a third party.

That matters when tests:

Use real tokens
Hit internal-only services
Exercise staging systems
Carry sensitive request bodies

Self-hosted Apidog also includes enterprise user and access management, so teams can scope access by project instead of relying on broad shared workspaces.

Option 2: Offline Space for local-first work

You do not need a full on-premise rollout to keep sensitive work local.

Apidog’s Offline Space lets a developer or small team work entirely on-device. Per the Apidog Offline Space documentation, all data stays on the local machine and is never uploaded to the cloud.

That means:

No background sync
No temporary “cache until reconnect” behavior
Local-only API design, debugging, and testing
Local environment and global variables

Offline Space is useful for secrets because environment values are stored locally and are not shared with teammates or synced to the cloud.

For example, values like these can remain local:

INTERNAL_API_TOKEN=...
STAGING_BASIC_AUTH=...
PRIVATE_GATEWAY_URL=...

For air-gapped or restricted networks, this can be the difference between a usable API tool and one that cannot be approved.

Local-first control as a security posture

The shared idea behind both options is local-first control.

With on-premise deployment, the team’s API source-of-truth lives on infrastructure you control.

With Offline Space, an individual developer’s sensitive work lives on their device.

Either way, API specs, test data, and credentials are not delegated to a multi-tenant cloud by default.

To try the desktop workflow, download Apidog and enable Offline Space. If you are evaluating enterprise deployment, review the self-hosting documentation.

Apidog would not have stopped GitHub’s breach. No API tool would have. What it does provide is deployment choice, so you can decide where your API data belongs before an incident forces the question.

Implementation checklist: audit your API tooling this week

Use this checklist to turn the discussion into action.

1. Inventory synced data

List every API tool your team uses.

For each one, document whether it stores or syncs:

OpenAPI specs
Collections
Saved examples
Environment variables
Global variables
Test data
Mock data
API documentation
Workspace comments
Access logs

2. Classify each data type

Use simple labels:

Data type	Sensitivity
Public API docs	Low
Internal OpenAPI specs	Medium
Staging request examples	Medium
Customer-like payloads	High
Production tokens	Critical
Regulated data	Critical

3. Decide the allowed location

For each category, define where it may live:

Sensitivity	Allowed location
Low	Cloud workspace allowed
Medium	Cloud allowed with access controls
High	Self-hosted or approved region only
Critical	Offline, self-hosted, or restricted network only
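
The two tables above can be encoded as a small policy check that runs against your inventory. This is a minimal sketch under stated assumptions: the location labels are hypothetical, unknown data types are treated as critical, and stricter locations are assumed acceptable for low and medium data (the tables do not say this explicitly).

```python
# Sensitivity labels, taken from the classification table.
SENSITIVITY = {
    "public API docs": "low",
    "internal OpenAPI specs": "medium",
    "staging request examples": "medium",
    "customer-like payloads": "high",
    "production tokens": "critical",
    "regulated data": "critical",
}

# Hypothetical location labels; high and critical follow the table literally.
STRICT = {"approved-region", "self-hosted", "offline", "restricted-network"}
ALLOWED = {
    "low": {"cloud", "cloud-with-access-controls"} | STRICT,
    "medium": {"cloud-with-access-controls"} | STRICT,
    "high": {"self-hosted", "approved-region"},
    "critical": {"offline", "self-hosted", "restricted-network"},
}


def violations(inventory: list[tuple[str, str]]) -> list[str]:
    """List (data type, location) pairs that break the policy."""
    problems = []
    for data_type, location in inventory:
        # Fail closed: anything unclassified is handled as critical.
        level = SENSITIVITY.get(data_type, "critical")
        if location not in ALLOWED[level]:
            problems.append(f"{data_type}: {location!r} not allowed for {level} data")
    return problems
```

Failing closed on unclassified data is a design choice. It produces noise at first but surfaces data types nobody has reviewed yet.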

4. Review access

Check:

Who has access to shared workspaces?
Do former contractors still have access?
Are disbanded or reorganized teams still assigned?
Are production environments shared too broadly?
Is MFA enforced?
Are API client tokens rotated?

5. Remove secrets from shared collections

Search for common secret patterns:

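grep -RnI --exclude-dir=.git "Bearer " .
grep -RnI --exclude-dir=.git "api_key" .
grep -RnI --exclude-dir=.git "client_secret" .
grep -RnI --exclude-dir=.git "DATABASE_URL" .
grep -RnI --exclude-dir=.git "Authorization" .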

Move secrets into approved local, self-hosted, or vault-backed storage.
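
The grep commands above print each matching line, secret included. If the output will land in a ticket or CI log, a scanner that reports only the location may be preferable. This is a minimal sketch. The patterns are illustrative assumptions rather than a complete secret-detection rule set, and `scan_text` and `scan_tree` are hypothetical helpers.

```python
import re
from pathlib import Path

# Illustrative patterns only; dedicated secret scanners ship far larger sets.
PATTERNS = {
    "bearer token": re.compile(r"Bearer\s+[A-Za-z0-9._~+/-]{8,}"),
    "api key": re.compile(r"api_key", re.IGNORECASE),
    "client secret": re.compile(r"client_secret", re.IGNORECASE),
    "database url": re.compile(r"DATABASE_URL"),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line number, pattern label) pairs without echoing the match."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


def scan_tree(root: str) -> list[tuple[str, int, str]]:
    """Scan every readable text file under root, skipping .git directories."""
    findings = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        findings.extend((str(path), n, label) for n, label in scan_text(text))
    return findings
```

Run `scan_tree(".")` from the directory that holds your exported collections, then handle each finding by hand.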

6. Revisit vendor and deployment fit

Ask:

Do we need cloud collaboration for this workspace?
Does this data include regulated or contractual data?
Would a vendor breach expose credentials or customer data?
Can we operate self-hosted infrastructure safely?
Should this workflow be offline instead?

Conclusion

The GitHub breach is not proof that cloud tooling is broken. It is a reminder to verify where sensitive developer data lives.

Key takeaways:

GitHub was breached through a poisoned VS Code extension on one employee’s device.
About 3,800 internal repositories had data stolen.
Many teams keep API specs, collections, test data, and environment secrets near their source code.
Cloud-synced API tooling adds attack surface through vendor infrastructure, account takeover, broad workspace sharing, extensions, integrations, logs, and sub-processors.
Cloud sync also has real benefits, especially for distributed teams and low-sensitivity work.
Self-hosted or offline tooling becomes important for regulated data, production credentials, customer data, and air-gapped environments.
The best approach is often per-data-class, not all-or-nothing.

The next step is straightforward: inventory what your API client syncs, classify each data type by sensitivity, and decide where each class is allowed to live.

If part of the answer is “inside our perimeter,” Apidog provides self-hosted deployment and offline mode to support that model. Download Apidog to start, or review the self-hosting documentation if an enterprise rollout is on the table.
