Self-hosted API tools are no longer just a compliance checkbox. After GitHub confirmed attackers stole data from roughly 3,800 internal repositories through a poisoned VS Code extension on one employee’s laptop, API teams have a practical question to answer: where do your API specs, shared collections, test data, and environment secrets actually live?
For many teams, the answer is: “in a vendor cloud, and we are not fully sure where.” That is not automatically wrong. Cloud-synced API tooling is convenient, fast to adopt, and useful for collaboration. But the GitHub incident is a good trigger to review your API source-of-truth and decide deliberately whether it belongs inside your perimeter or outside it.
TL;DR
Self-hosted API tools, also called on-premise API platforms, keep your OpenAPI specs, request collections, test data, and credentials inside infrastructure you control instead of a vendor’s multi-tenant cloud.
After the May 2026 GitHub breach, where attackers exfiltrated data from about 3,800 internal repositories through a trojanized VS Code extension, more teams are comparing data residency against cloud convenience.
Use self-hosted or offline API tooling when:
- You work in a regulated environment.
- Your API client stores credentials or customer-like test data.
- Your network is air-gapped or restricted.
- You need a clear chain of custody for auditors.
- You want to reduce dependency on one vendor’s cloud.
Cloud sync still makes sense when:
- Your team needs real-time collaboration.
- Your API data is low sensitivity.
- You do not have the ops capacity to run self-hosted infrastructure safely.
Apidog supports all three models: a cloud product, a self-hosted/on-premise deployment, and an offline mode, so teams can choose where API data lives.
What happened at GitHub, and why API teams should care
On May 20, 2026, GitHub confirmed that attackers had stolen data from approximately 3,800 internal code repositories.
The entry point was not a zero-day in GitHub’s core platform. It was a poisoned VS Code extension installed on a GitHub employee’s device. Once the extension ran with the employee’s permissions, attackers gained a foothold inside GitHub’s internal network.
The threat group, tracked as TeamPCP, is known for supply-chain attacks across npm, PyPI, and PHP package ecosystems. Security reporting indicates the group put the stolen dataset up for sale on underground forums for more than $50,000.
GitHub said it found no evidence that customer data stored outside its internal repositories was affected, and the investigation is ongoing.
This was not GitHub’s only rough month. In April 2026, cloud security firm Wiz disclosed CVE-2026-3854, a critical remote code execution flaw in GitHub’s internal Git infrastructure that, before it was patched, exposed millions of repositories. SecurityWeek documented the vulnerability and its scope.
For API teams, the key point is this: GitHub is often more than a code host. It is also where your API source-of-truth lives.
That can include:
- OpenAPI and Swagger specs
- Request collections committed to repos
- .env.example files
- Terraform for API gateways
- CI workflows with deploy tokens
- Integration test fixtures
- Mock-server definitions
- Internal API documentation
The stolen GitHub data was GitHub’s own internal code, not customer repositories. That distinction matters.
But the pattern generalizes: a malicious developer tool running on one laptop can become a path into a much larger environment. The same attack chain can work against any vendor whose product connects to your development workflow.
This article focuses on the platform-level question: should your API design data live in a vendor cloud at all?
What your API client may sync to a vendor cloud
Before choosing cloud, self-hosted, or offline tooling, inventory what your API client actually syncs.
In a shared cloud workspace, the following data often leaves local machines and lands in vendor infrastructure.
1. API specifications
OpenAPI documents describe:
- Endpoints
- Parameters
- Schemas
- Authentication flows
- Error responses
- Service boundaries
A full API spec is not always a secret, but it is a map. If exposed, it can reduce attacker recon time by showing which endpoints exist, what identifiers are used, and where authentication boundaries may be.
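To make the "map" point concrete, here is a minimal sketch of how much a reader can extract from a spec in a few lines. It assumes an OpenAPI 3.x document already loaded into a Python dict; the function name `summarize_spec` and the output keys are illustrative, not part of any tool.

```python
from typing import Any

# HTTP methods that OpenAPI 3.x allows as operation keys on a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def summarize_spec(spec: dict[str, Any]) -> dict[str, Any]:
    """Summarize what an OpenAPI 3.x document reveals (illustrative helper)."""
    operations = []
    unauthenticated = []
    global_security = spec.get("security", [])
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            label = f"{method.upper()} {path}"
            operations.append(label)
            # Operation-level "security" overrides the global setting;
            # an empty list means the operation requires no auth.
            security = op.get("security", global_security)
            if not security:
                unauthenticated.append(label)
    schemes = sorted(spec.get("components", {}).get("securitySchemes", {}))
    return {
        "operations": operations,
        "unauthenticated": unauthenticated,
        "auth_schemes": schemes,
    }
```

Running this against your own spec is a quick way to see it the way an outsider would: every operation, every unauthenticated route, and every auth scheme in one pass.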
2. Request collections and saved examples
Saved requests often contain realistic payloads.
Examples may include:
{
  "email": "customer@example.com",
  "accountId": "acct_12345",
  "plan": "enterprise",
  "internalRegion": "us-east-1-private"
}
Saved responses can be more sensitive because they may contain entire user objects, account records, internal IDs, or staging data copied during debugging.
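One way to reduce this exposure is to mask sensitive fields before an example is saved to a shared workspace. The sketch below is a minimal illustration, assuming JSON-like payloads; the `SENSITIVE_KEYS` set and the `redact` helper are hypothetical and should be adapted to your own field names.

```python
from typing import Any

# Hypothetical field names (lowercased); adjust to your own payloads.
SENSITIVE_KEYS = {"email", "accountid", "internalregion", "token", "password"}


def redact(value: Any) -> Any:
    """Return a copy with sensitive keys masked, recursing into dicts and lists."""
    if isinstance(value, dict):
        return {
            key: "<redacted>" if key.lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value  # scalars pass through unchanged
```

The function builds a new structure rather than mutating the input, so the original response stays available locally for debugging while only the masked copy is shared.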
3. Environment variables and secrets
This is usually the highest-risk category.
Teams often store values like:
API_BASE_URL=https://api.internal.example.com
BEARER_TOKEN=eyJhbGciOi...
OAUTH_CLIENT_SECRET=...
DATABASE_URL=postgres://...
If those environments sync to a cloud workspace, production or staging credentials may now exist in a third-party multi-tenant database.
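A quick first pass is to flag which variable names in an environment look like secrets before deciding whether that environment may sync. This is a rough sketch that assumes simple `KEY=value` lines; the name heuristics in `SECRET_KEY_PATTERN` are assumptions and will miss secrets stored under neutral names.

```python
import re

# Assumed naming heuristics; not an exhaustive detector.
SECRET_KEY_PATTERN = re.compile(
    r"(TOKEN|SECRET|PASSWORD|API_KEY|DATABASE_URL|AUTH)", re.IGNORECASE
)


def secret_like_keys(env_text: str) -> list[str]:
    """Return variable names that look like credentials (hypothetical helper)."""
    keys = []
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines that are not assignments.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        if SECRET_KEY_PATTERN.search(key):
            keys.append(key)
    return keys
```

Only the names are returned, never the values, so the output can be pasted into a ticket or audit note without leaking the credentials themselves.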
If you have debugged teammate sync issues before, you know how opaque this layer can become. See the diagnostic guide on Postman environment sync issues for a closer look at this surface.
4. Test data and mock definitions
Mock servers and automated tests can expose:
- Example customer records
- Internal business rules
- Validation logic
- State transitions
- Role and permission behavior
Even when individual records are fake, the structure can reveal how your system works.
5. Workspace metadata and activity
This can include:
- Service names
- Folder structure
- Comments
- Team member lists
- Change history
- Internal naming conventions
Individually, these may seem minor. Together, they can describe your org chart and product roadmap.
Cloud sync is not reckless by default. But the synced data is real, sensitive in aggregate, and worth classifying before an incident forces the question.
For a deeper breakdown of the cloud-sync model, see Is Postman secure?
The real attack surface of cloud-synced API tooling
Cloud-synced API tooling adds attack surface that does not exist when data stays local.
That does not mean every cloud vendor is weak. Many vendors have stronger security teams than their customers. The issue is structural: the more places data can be reached from, the more places it can be attacked from.
Vendor compromise
A multi-tenant SaaS platform that stores API specs and credentials for many companies is a high-value target.
When you use it, you inherit:
- The vendor’s security posture
- Their patch cadence
- Their employee device security
- Their incident response process
- Their cloud and sub-processor exposure
The GitHub incident is a concrete example: one employee device became the weak link, and the blast radius included thousands of internal repositories.
Account takeover
Cloud tools rely on accounts, sessions, API tokens, and OAuth integrations.
Attackers can get access through:
- Phished credentials
- Reused passwords
- Leaked session tokens
- OAuth token theft
- Compromised SSO accounts
MFA helps and should be enforced, but it does not eliminate session hijacking or token theft.
Over-broad workspace sharing
Shared workspaces are useful, but they tend to accumulate stale access.
Common patterns:
- Contractors added temporarily and never removed
- New hires added to broad “Engineering” workspaces
- Old production environments left in shared projects
- Teams copying collections with secrets still attached
Access reviews need to include API tooling, not just source control and cloud infrastructure.
Extensions, plugins, and integrations
This is the vector that hit GitHub.
Developer tools often support:
- IDE extensions
- API client plugins
- CI integrations
- Browser extensions
- Third-party automation
Each integration can run with permissions close to the developer’s own access. A poisoned extension can read local files, tokens, synced data, or API client configuration.
Supply-chain attacks against npm, PyPI, and editor extension marketplaces are now a reliable path into developer environments.
Telemetry, logs, and sub-processors
Cloud tools may generate:
- Crash reports
- Debug logs
- Analytics events
- Request traces
- Support snapshots
If request bodies or headers are captured, they may include credentials or sensitive payloads.
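If you write your own request logging or support snapshots, a common mitigation is to mask credential-bearing headers before anything is recorded. The sketch below is illustrative only: the header list is an assumption (`x-api-key` is a widespread convention, not a standard), and `safe_headers` is a hypothetical helper.

```python
# Header names (lowercased) that commonly carry credentials.
# This list is an assumption; extend it for your own services.
SENSITIVE_HEADERS = {
    "authorization",
    "proxy-authorization",
    "cookie",
    "set-cookie",
    "x-api-key",
}


def safe_headers(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of the headers that is safe to log."""
    return {
        name: "<redacted>" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }
```

Header names are compared case-insensitively because HTTP treats them that way; the original casing is kept in the output so logs remain readable.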
Your data may also flow through:
- The vendor’s cloud provider
- Analytics tools
- Support tools
- Logging systems
- Error monitoring providers
The lesson is similar to the Vercel breach and what it taught API teams: map which third parties can touch sensitive data, then shrink that map where the risk justifies it.
To keep this balanced: mature cloud vendors usually encrypt data at rest and in transit, run security programs, hold SOC 2 or ISO 27001 certifications, and patch quickly. A small team’s data may be safer in a mature vendor cloud than on a poorly maintained self-hosted server.
The point is not “cloud is unsafe.” The point is “cloud sync is a trade-off.”
Compliance and data residency: when self-hosted becomes required
For regulated teams, self-hosted API tooling may not be optional. It may be required by law, contract, or audit policy.
Data residency and sovereignty
Regulations like GDPR and national data-localization laws can restrict where personal data is stored or processed.
If your API test data or saved request payloads contain personal data of EU residents, syncing that data to a US-region multi-tenant database may create compliance risk.
A self-hosted API platform running in your data center or in a cloud region you explicitly control gives you direct control over data location.
The European Data Protection Board’s guidance is a reference point for cross-border transfer rules.
Industry-specific frameworks
Some industries have explicit requirements for regulated data handling:
- Healthcare teams handling PHI under HIPAA
- Payment teams under PCI DSS
- US federal vendors under FedRAMP
- Defense contractors under CMMC
In sensitive environments, air-gapped or on-premise tooling may be the only acceptable option.
For that scenario, see air-gapped API testing tools for secure environments.
Contractual data-handling obligations
Even without formal regulation, customer contracts may limit where data can go.
Example: a customer contract says their data cannot be processed by unapproved sub-processors. If your API client syncs test payloads containing that customer data to its own cloud, you may violate that commitment without realizing it.
Audit and chain of custody
Auditors often ask:
Who can access this data, and how do you know?
With self-hosted infrastructure, the answer is easier to evidence:
- Data is on your servers.
- Access is behind your network controls.
- Logs are in your systems.
- Identity policies are yours.
- Backups are under your process.
With multi-tenant SaaS, part of the answer is always: “we trust the vendor.”
That may be acceptable, but it is harder to prove.
When self-hosted wins vs. when cloud wins
Self-hosting is not automatically better. It is an engineering trade-off with real operational cost.
| Factor | Cloud-synced API tooling | Self-hosted / on-premise / offline |
|---|---|---|
| Setup and maintenance | Minutes; vendor runs everything | You provision, patch, back up, monitor |
| Real-time collaboration | Strong; built for distributed teams | Works, but inside your network or VPN |
| Data residency control | Limited to vendor regions and policy | Full; you choose the exact location |
| Attack surface | Vendor cloud, account auth, sub-processors | Your own perimeter, hosts, and identity systems |
| Compliance fit (HIPAA, PCI, FedRAMP) | Depends on vendor certifications | Strong; data never leaves your control |
| Cost model | Per-seat subscription | License plus your infrastructure and ops time |
| Works air-gapped or offline | No | Yes |
| Disaster recovery | Vendor’s responsibility | Yours to design and test |
Choose self-hosted or offline when
Use self-hosted, on-premise, or offline API tooling when:
- You are in a regulated industry.
- You store production credentials in your API client.
- Your collections include customer data or customer-like test data.
- You work in air-gapped or restricted networks.
- Security or legal teams need a defensible chain of custody.
- You want to reduce concentration of critical data in one vendor.
In these cases, the operational overhead is not waste. It is the cost of control.
Choose cloud sync when
Cloud-synced tooling can be the right choice when:
- Real-time collaboration is central to your workflow.
- Your team is distributed across time zones.
- You do not have ops capacity to run infrastructure securely.
- Your API data is low sensitivity.
- You are moving fast in early-stage product work.
A poorly maintained self-hosted server can be worse than a mature vendor cloud.
The best model is often split by data class:
- Use self-hosted or offline tooling for secrets, production-like data, and regulated workloads.
- Use cloud workspaces for low-risk collaboration, public docs, and non-sensitive API design.
Revisit the decision periodically as your team, data, and compliance exposure change.
Keeping API data inside your perimeter with Apidog
If the GitHub breach has you reviewing where API data lives, the practical move is to choose tooling that gives you deployment options.
Apidog is an all-in-one API platform for design, debugging, testing, mocking, and documentation. It supports cloud usage, but it also supports self-hosted and offline workflows for teams that need tighter control.
This is not an anti-cloud argument. Apidog also offers a cloud product, and for many teams that is the right fit.
The point is that teams can choose where API specs, collections, test data, and credentials live.
Option 1: Self-hosted / on-premise deployment
Apidog offers a fully self-hosted, on-premise deployment for enterprises.
You can run the platform inside:
- A private data center
- Your own cloud VPC
- A hybrid environment
According to the Apidog self-hosting documentation, supported deployment models include:
- Standalone Docker, with the application, MySQL database, and Redis cache running on hosts you own
- Hybrid deployment, where the application runs in your environment while database/cache services use managed infrastructure you control
- Kubernetes for enterprise-scale deployments
In this model, your API data stays behind your controls:
- OpenAPI specs
- Collections
- Test data
- Environment variables
- Mock definitions
- Access logs
For audit questions like “who can access this data?”, the answer is concrete because the infrastructure is yours.
The self-hosted edition also supports self-hosted test runners. That means automated API tests can execute inside your network instead of routing through a third party.
That matters when tests:
- Use real tokens
- Hit internal-only services
- Exercise staging systems
- Carry sensitive request bodies
Self-hosted Apidog also includes enterprise user and access management, so teams can scope access by project instead of relying on broad shared workspaces.
Option 2: Offline Space for local-first work
You do not need a full on-premise rollout to keep sensitive work local.
Apidog’s Offline Space lets a developer or small team work entirely on-device. Per the Apidog Offline Space documentation, all data stays on the local machine and is never uploaded to the cloud.
That means:
- No background sync
- No temporary “cache until reconnect” behavior
- Local-only API design, debugging, and testing
- Local environment and global variables
Offline Space is useful for secrets because environment values are stored locally and are not shared with teammates or synced to the cloud.
For example, values like these can remain local:
INTERNAL_API_TOKEN=...
STAGING_BASIC_AUTH=...
PRIVATE_GATEWAY_URL=...
For air-gapped or restricted networks, this can be the difference between a usable API tool and one that cannot be approved.
Local-first control as a security posture
The shared idea behind both options is local-first control.
With on-premise deployment, the team’s API source-of-truth lives on infrastructure you control.
With Offline Space, an individual developer’s sensitive work lives on their device.
Either way, API specs, test data, and credentials are not delegated to a multi-tenant cloud by default.
To try the desktop workflow, download Apidog and enable Offline Space. If you are evaluating enterprise deployment, review the self-hosting documentation.
Apidog would not have stopped GitHub’s breach. No API tool would have. What it does provide is deployment choice, so you can decide where your API data belongs before an incident forces the question.
Implementation checklist: audit your API tooling this week
Use this checklist to turn the discussion into action.
1. Inventory synced data
List every API tool your team uses.
For each one, document whether it stores or syncs:
- OpenAPI specs
- Collections
- Saved examples
- Environment variables
- Global variables
- Test data
- Mock data
- API documentation
- Workspace comments
- Access logs
2. Classify each data type
Use simple labels:
| Data type | Sensitivity |
|---|---|
| Public API docs | Low |
| Internal OpenAPI specs | Medium |
| Staging request examples | Medium |
| Customer-like payloads | High |
| Production tokens | Critical |
| Regulated data | Critical |
3. Decide the allowed location
For each category, define where it may live:
| Sensitivity | Allowed location |
|---|---|
| Low | Cloud workspace allowed |
| Medium | Cloud allowed with access controls |
| High | Self-hosted or approved region only |
| Critical | Offline, self-hosted, or restricted network only |
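The two tables above can be encoded as a small policy check and run against your inventory. This is a sketch under stated assumptions: the location labels (`cloud`, `cloud_restricted`, and so on) are made up for illustration, and it assumes that low and medium data may also live in any stricter location, which the tables do not state explicitly.

```python
# Sensitivity per data type, taken from the classification table.
SENSITIVITY = {
    "Public API docs": "low",
    "Internal OpenAPI specs": "medium",
    "Staging request examples": "medium",
    "Customer-like payloads": "high",
    "Production tokens": "critical",
    "Regulated data": "critical",
}

# Allowed locations per sensitivity. Labels are hypothetical.
STRICT = {"approved_region", "self_hosted", "restricted_network", "offline"}
ALLOWED_LOCATIONS = {
    "low": {"cloud", "cloud_restricted"} | STRICT,      # assumption: stricter is fine
    "medium": {"cloud_restricted"} | STRICT,            # cloud only with access controls
    "high": {"self_hosted", "approved_region"},
    "critical": {"offline", "self_hosted", "restricted_network"},
}


def violations(inventory: list[tuple[str, str]]) -> list[str]:
    """Check (data_type, location) pairs against the policy."""
    problems = []
    for data_type, location in inventory:
        level = SENSITIVITY.get(data_type)
        if level is None:
            problems.append(f"{data_type}: unclassified")
        elif location not in ALLOWED_LOCATIONS[level]:
            problems.append(f"{data_type} ({level}) must not live in {location}")
    return problems
```

Treating unclassified data as a finding, rather than silently allowing it, keeps the inventory honest as new data types appear.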
4. Review access
Check:
- Who has access to shared workspaces?
- Are contractors still active?
- Are old teams still assigned?
- Are production environments shared too broadly?
- Is MFA enforced?
- Are API client tokens rotated?
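Part of this review can be automated if your tooling exports a member list with last-activity dates. The sketch below assumes such an export exists; the `Member` fields and the 90-day idle threshold are illustrative choices, not a recommendation from any vendor.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Member:
    name: str
    role: str          # e.g. "employee" or "contractor" (hypothetical labels)
    last_active: date  # last recorded activity in the workspace


def stale_members(
    members: list[Member], today: date, max_idle_days: int = 90
) -> list[str]:
    """Return names of members idle for longer than max_idle_days."""
    cutoff = today - timedelta(days=max_idle_days)
    return [m.name for m in members if m.last_active < cutoff]
```

Passing `today` explicitly keeps the check reproducible in tests and audit reruns. A stricter threshold for contractors is a natural extension.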
5. Remove secrets from shared collections
Search for common secret patterns:
grep -R "Bearer " .
grep -R "api_key" .
grep -R "client_secret" .
grep -R "DATABASE_URL" .
grep -R "Authorization" .
Move secrets into approved local, self-hosted, or vault-backed storage.
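For exported collections, a small script can report the file and line number of each hit and skip binary files. This is a minimal sketch, not a replacement for a dedicated secret scanner; the patterns are rough assumptions and will produce both false positives and misses.

```python
import re
from pathlib import Path

# Rough, assumed patterns for illustration only.
PATTERNS = {
    "bearer_token": re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]{8,}"),
    "jwt": re.compile(r"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*"),
    "key_assignment": re.compile(
        r"(api_key|client_secret|database_url)\s*[\"']?\s*[:=]", re.IGNORECASE
    ),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for each match in the text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


def scan_tree(root: str) -> dict[str, list[tuple[int, str]]]:
    """Scan every readable UTF-8 text file under root."""
    results = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        hits = scan_text(text)
        if hits:
            results[str(path)] = hits
    return results
```

The report contains only locations and pattern names, not the matched values, so it can be shared with the team without spreading the secrets further.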
6. Revisit vendor and deployment fit
Ask:
- Do we need cloud collaboration for this workspace?
- Does this data include regulated or contractual data?
- Would a vendor breach expose credentials or customer data?
- Can we operate self-hosted infrastructure safely?
- Should this workflow be offline instead?
Conclusion
The GitHub breach is not proof that cloud tooling is broken. It is a reminder to verify where sensitive developer data lives.
Key takeaways:
- GitHub was breached through a poisoned VS Code extension on one employee’s device.
- About 3,800 internal repositories had data stolen.
- Many teams keep API specs, collections, test data, and environment secrets near their source code.
- Cloud-synced API tooling adds attack surface through vendor infrastructure, account takeover, broad workspace sharing, extensions, integrations, logs, and sub-processors.
- Cloud sync also has real benefits, especially for distributed teams and low-sensitivity work.
- Self-hosted or offline tooling becomes important for regulated data, production credentials, customer data, and air-gapped environments.
- The best approach is often per-data-class, not all-or-nothing.
The next step is straightforward: inventory what your API client syncs, classify each data type by sensitivity, and decide where each class is allowed to live.
If part of the answer is “inside our perimeter,” Apidog provides self-hosted deployment and offline mode to support that model. Download Apidog to start, or review the self-hosting documentation if an enterprise rollout is on the table.

