b0gy

Posted on Jun 1 • Originally published at b0gy.com

AI vendor selection is not software procurement

#ai #strategy #security

When you buy a SaaS tool, your data sits in someone else's database. When you buy an AI service, your data flows through someone else's model. It might become training data. It might get embedded in weights you can never inspect. It might surface, partially reconstructed, in another customer's retrieval results.

Procurement teams treat these as the same purchase. Same checklist — SOC 2, uptime SLA, data encryption at rest — check every box, sign the contract, and discover six months later that their customer data has been training a global model by default. Or that their embeddings are locked in a proprietary format with no export path. Or that the vendor quietly added an AI sub-processor whose data handling invalidates three of their customer contracts.

SOC 2 Type II and ISO 27001 are table stakes — almost every SaaS vendor has them nowadays. They tell you the vendor has a security program. They tell you nothing about what happens to your data once it enters the model. ISO 42001 — the AI management systems standard — is the first certification that actually covers AI governance: how training data is sourced, how models are monitored, how risks like bias and hallucination are managed. Most vendors don't have it yet. The ones that do are telling you they take AI-specific risk seriously, not just infrastructure security.

Your data has more owners than you think

The thing that trips up most procurement teams is sub-processing — when your vendor passes your data to another company for processing. In traditional SaaS, this is usually a hosting provider or an analytics tool. In AI, it's the model provider. Different thing entirely.

Best example: AWS Bedrock and Claude on Anthropic's platform both let you use Claude. Both run on AWS infrastructure. They are not the same product.

AWS Bedrock: AWS is the sole processor. Anthropic has zero access to your data. Your prompts, responses, and any fine-tuning data never leave the AWS boundary. This is FedRAMP High eligible. Your existing AWS BAA covers it.

Claude Platform on AWS: Anthropic is the processor. AWS handles billing and identity. Your data flows through Anthropic's systems, governed by Anthropic's data policies. Same model. Same cloud. Completely different compliance posture.

The same pattern is everywhere. Azure OpenAI keeps data in the Azure boundary — OpenAI the company never sees it. But Azure's SRE Agent uses Anthropic as a sub-processor, and that data may leave the EU Data Boundary entirely. Salesforce lists OpenAI as a sub-processor and processed over a trillion OpenAI tokens in FY2026. Their Einstein Trust Layer claims zero data retention, but "zero retention" and "zero access" aren't the same statement.

Why does this matter if you're a US company that doesn't care about GDPR? Your customer contracts probably include a clause like "no unauthorized sub-processing." When your vendor adds an AI sub-processor you didn't know about, you may be breaching your customer contracts. Under HIPAA, your BAA with the vendor doesn't automatically cover AI sub-processors — PHI might flow through 5 or 6 sub-processors without coverage. Under ITAR, AI infrastructure with foreign nationals having admin access constitutes a deemed export — civil penalties up to $1M per violation. Under SOX, audit trail opacity in AI-processed financial data is a finding waiting to happen.

The evaluation framework

Five dimensions. For each: what to ask, what good looks like, and the red flag. These are the things your SaaS checklist skips or gets wrong.

None of this matters if the tool doesn't solve a real problem. Technical due diligence isn't a substitute for asking: does this vendor actually help us move faster, serve customers better, or cut cost? The framework below assumes you've already established business value. What it protects is your ability to keep that value — by making sure the vendor's data practices, lock-in posture, and contractual commitments don't turn a good tool into an expensive liability.

1. Data usage and training defaults

This is the one that burns people. The question isn't "is my data secure?" It's "is my data making the vendor's product better for everyone else?"

The defaults are worse than you'd expect. ChatGPT trains on user data by default — even on paid plans. Team and Enterprise tiers don't, but Plus does. GitHub Copilot, as of April 2026, uses interaction data from Free, Pro, and Pro+ plans for training by default. Salesforce's Spring '26 release feeds customer data into global predictive AI models, on by default. Atlassian will use customer metadata to train Rovo starting August 2026 — and Free, Standard, and Premium tiers can't opt out of metadata collection.

What to ask: "Show me the written policy on data usage for model training — not the marketing page, the contractual language. Does it distinguish between content, metadata, and interaction patterns? Is opt-out a settings toggle or a contractual guarantee?"

What good looks like: A contractual commitment — not a toggle — that customer data is not used for training. A clear definition of what "data" means — including metadata and usage patterns, not just content. A specific policy on what happens to data after inference.

Red flag: "We don't train on your data" but the ToS says otherwise. Zoom updated their terms in 2023 to grant a perpetual license to customer content for ML/AI — the backlash forced three revisions in a week. Slack opts customers in for non-generative AI training on messages by default, and opting out requires contacting support. If the vendor's answer to "do you train on my data" requires a footnote, the answer is yes.

The gate: Know exactly what data gets used, for what purpose, and whether that's acceptable for your use case. Some services are designed to learn from your data — that might be the value proposition. The question isn't "does it train" but "do you know what it trains on, did you consent, and can you control it?" If the answer to any of those is no, that's the problem.

2. Sub-processor transparency

Most vendors will hand you their SOC 2 and/or ISO 27001 report unprompted. Ask for their sub-processor list and watch the conversation shift.

What to ask: "Provide a complete list of sub-processors that handle customer data, including AI model providers. What data does each sub-processor receive? What is your notification process when you add a new AI sub-processor?"

What good looks like: A published sub-processor list — updated regularly — with 30+ day advance notice before adding new ones and contractual objection rights. Specific disclosure of which data each sub-processor receives. Nightfall AI maintains a public AI sub-processor tracker for major SaaS vendors — if your vendor isn't on it, ask why.

Red flag: No sub-processor list. Or a list that omits AI model providers. Or notification "at our discretion." If the vendor can't tell you exactly who touches your data, they either don't know or don't want you to.

The gate: You need to know who touches your data. How formal that looks — published list, advance notice, objection rights — depends on your regulatory environment and customer commitments. But the baseline is: the vendor can tell you, clearly, who their sub-processors are and what data each one receives.
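The notification question is also something you can check from your side. A minimal sketch, assuming you keep dated snapshots of a vendor's published sub-processor list as plain records. The field names (`name`, `purpose`, `data_received`), the `ai_model_provider` label, and the example entries are all hypothetical, not any vendor's actual format:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubProcessor:
    name: str
    purpose: str            # hypothetical labels, e.g. "hosting", "ai_model_provider"
    data_received: tuple    # categories of customer data this party receives


def diff_subprocessors(previous, current):
    """Return (added, removed, changed) between two snapshots of a sub-processor list."""
    prev_by_name = {sp.name: sp for sp in previous}
    curr_by_name = {sp.name: sp for sp in current}

    added = [curr_by_name[n] for n in sorted(set(curr_by_name) - set(prev_by_name))]
    removed = [prev_by_name[n] for n in sorted(set(prev_by_name) - set(curr_by_name))]
    # "changed" means same name, but different purpose or different data received
    changed = [
        (prev_by_name[n], curr_by_name[n])
        for n in sorted(set(prev_by_name) & set(curr_by_name))
        if prev_by_name[n] != curr_by_name[n]
    ]
    return added, removed, changed


def new_ai_subprocessors(previous, current):
    """Names of newly added sub-processors labeled as AI model providers."""
    added, _, _ = diff_subprocessors(previous, current)
    return [sp.name for sp in added if sp.purpose == "ai_model_provider"]


# Placeholder snapshots, not real vendor data
january = [SubProcessor("ExampleCloud", "hosting", ("content", "metadata"))]
march = january + [SubProcessor("ExampleModelCo", "ai_model_provider", ("content",))]
print(new_ai_subprocessors(january, march))  # ['ExampleModelCo']
```

Any non-empty result is a prompt to ask whether you received the advance notice your contract promises, and whether the new party is covered by your own customer commitments.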

3. Data security and incident response

Samsung employees pasted proprietary source code into ChatGPT 3 times in 20 days in 2023. Samsung banned all generative AI. In April 2026, an employee at Vercel with OAuth delegation access to Context AI's systems was compromised — the breach data ended up on BreachForums, priced at $2M. An unauthenticated endpoint in AnythingLLM's Pinecone integration exposed API keys. A supply chain attack hid malicious instructions in GitHub READMEs that executed when developers used Cursor IDE's AI features.

This is the attack surface now. It isn't shrinking.

What to ask: "What's your incident notification timeline? Do you have AI-specific indemnity provisions — if your model produces infringing output that ends up in my product, who carries liability? What controls exist for agent execution, and can I kill them remotely?"

What good looks like: 72-hour incident notification. AI-specific indemnity for IP infringement on outputs. A kill switch for agent execution. Explicit acknowledgment that prompts are your IP.

Red flag: Boilerplate SaaS security language with no AI-specific provisions. If their incident response plan doesn't mention model-specific attack vectors — prompt injection, embedding inversion, training data extraction — it was written before they added AI to the product.

4. Lock-in and portability

AI lock-in runs deeper than traditional software lock-in. With SaaS, your data is in their database and you need an export. With an AI vendor, your data might be in their embeddings (proprietary format, no export), in a fine-tuned model on their infrastructure (weights not exportable), and woven into an orchestration layer that couples your workflows to their tooling. Three layers where traditional software has one.

Every month of investment increases exit cost — and exit cost is a business risk, not just a technical one. If a better model ships next quarter and you can't switch, you're paying more for less. If your vendor gets acquired and the roadmap shifts, you're stuck. Re-embedding 10 million documents costs $30–$900 in API fees alone, plus the engineering time to validate that new embeddings produce equivalent retrieval quality. Fine-tuned model weights generally aren't exportable from managed services. Workflow logic built on vendor-specific orchestration APIs requires a rewrite, not a migration.
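The API-fee range is easy to sanity-check against your own corpus. A rough sketch: the tokens-per-document figures and per-million-token prices below are illustrative assumptions chosen to bracket the range above, not quotes from any vendor's price list:

```python
def reembedding_cost_usd(num_documents, avg_tokens_per_doc, price_per_million_tokens):
    """Estimate API fees for re-embedding a corpus (fees only, no engineering time)."""
    total_tokens = num_documents * avg_tokens_per_doc
    return total_tokens / 1_000_000 * price_per_million_tokens


# Hypothetical low end: short documents, inexpensive embedding model
low = reembedding_cost_usd(10_000_000, avg_tokens_per_doc=150, price_per_million_tokens=0.02)
# Hypothetical high end: longer documents, pricier embedding model
high = reembedding_cost_usd(10_000_000, avg_tokens_per_doc=750, price_per_million_tokens=0.12)

print(f"${low:,.0f} to ${high:,.0f}")  # $30 to $900
```

The fee itself is small either way. The number worth estimating next is the engineering time to re-validate retrieval quality, which this sketch deliberately leaves out.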

What to ask: "Can I export my embeddings, fine-tuning datasets, prompts, and interaction logs in open formats — JSON, CSV, Parquet? What is your model deprecation policy — how much notice, and do I retain access to the previous version during the notice period? What happens to my data if you're acquired or shut down?"

What good looks like: Contractual data exit provisions covering embeddings, fine-tuned model weights, and workflow configurations. 90-day minimum deprecation notice with continued access to the previous version. Data export in open formats, not proprietary dumps.

Red flag: "We'll work with you on migration." That's not a contractual guarantee — it's a vibe. Notion AI is instructive here: Enterprise plans get zero retention with LLM providers. Non-Enterprise plans allow up to 30 days of retention. Same product, different data posture based on the plan you're on.

5. Jurisdictional sovereignty

The legal ground under AI data transfers is moving faster than most legal teams can track. The EU-US Data Privacy Framework was invalidated in late 2025. CNIL issued guidance in February 2026 requiring supplementary measures beyond Standard Contractual Clauses — including encryption where the provider doesn't hold the keys. That's technically impossible for most AI API usage. The model needs plaintext to process it.

Even if you're US-only, the regulatory surface is expanding. Colorado's AI Act takes effect June 30, 2026. Texas RAIGA hit January 1, 2026. Illinois followed in February 2026. The US CLOUD Act means US-headquartered providers are subject to extraterritorial data access regardless of where servers sit — which makes "data residency in the EU" a weaker guarantee than it sounds.

What to ask: "Where is my data processed, stored, and in transit? Which jurisdictions does it touch? What happens when a government requests access to my data under CLOUD Act or equivalent?"

What good looks like: Specific answers — not "it depends." A vendor who can name the regions, the legal mechanisms, and the fallback plan. 61% of Western European CIOs now prioritize local providers. That number wasn't close 2 years ago.

Red flag: "Our servers are in the US/EU." That's not an answer. Server location tells you where data sleeps, not where it travels.

Deployment model changes the risk, not the framework

Everything above assumes cloud-hosted SaaS. But AI gets deployed on-prem, hybrid, and air-gapped too — and the risk profile shifts depending on where the model runs.

Cloud / SaaS. All five dimensions apply at full weight. Sub-processor risk is highest here — your data leaves your boundary by design. Training defaults matter most because you have the least control over what the vendor does with data after inference. This is where most enterprises start and where most procurement failures happen.

Hybrid. Inference might run on your infrastructure, but model updates, fine-tuning, or telemetry may still call home. The sub-processor question doesn't disappear — it just gets harder to answer. Ask specifically: what data leaves my environment, when, and to where? A vendor who says "nothing leaves" but ships telemetry to their cloud for model monitoring isn't being straight with you. Sovereignty risk doesn't go away either — data in transit between your on-prem components and the vendor's cloud still crosses jurisdictional boundaries. If your cloud component runs in us-east-1 but your on-prem sits in Frankfurt, you've got a transatlantic data flow that CNIL's 2026 guidance doesn't smile on. Lock-in risk is higher here because you've invested in on-prem infrastructure that's coupled to the vendor's model format and orchestration.

On-prem. Training defaults and sub-processor risk drop — but don't assume they disappear. Many on-prem AI deployments still phone home: license validation, telemetry, usage metering, model update checks, and crash reporting all create outbound data flows you probably haven't audited. Ask the vendor for a complete list of outbound connections the software makes and what data each one carries. If it genuinely doesn't phone home, treat it as air-gapped and enforce network rules accordingly — don't leave outbound access open on trust. Lock-in and portability become the dominant risks. You're running their model on your hardware, likely with their orchestration tooling, and if they deprecate the model version or change licensing terms, you're exposed. AI governance shifts to you too — bias testing, output monitoring, and model lifecycle management are now your ops problem, not the vendor's.
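One way to act on the vendor's outbound-connection list is to compare it with what your network actually sees. A minimal sketch, assuming you can export observed outbound destinations (for example from firewall or proxy logs) as host and port pairs. The hostnames are placeholders and the data shapes are assumptions:

```python
def audit_outbound(declared, observed):
    """Compare vendor-declared outbound endpoints with observed traffic.

    declared: dict mapping (host, port) -> stated purpose, from the vendor's list
    observed: iterable of (host, port) pairs seen leaving your environment
    """
    declared_set = set(declared)
    observed_set = set(observed)
    return {
        # Traffic the vendor never told you about: investigate first
        "undeclared": sorted(observed_set - declared_set),
        # Declared but never seen: candidates for blocking outright
        "declared_but_unseen": sorted(declared_set - observed_set),
    }


# Placeholder data, not a real vendor's endpoints
declared = {
    ("license.vendor.example", 443): "license validation",
    ("updates.vendor.example", 443): "model update checks",
}
observed = [
    ("license.vendor.example", 443),
    ("telemetry.vendor.example", 443),
]
report = audit_outbound(declared, observed)
print(report["undeclared"])  # [('telemetry.vendor.example', 443)]
```

An empty "undeclared" list only covers the window you sampled, so it supports enforcing network rules rather than replacing them.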

Air-gapped. Sovereignty and sub-processor concerns are mostly solved by architecture. What remains is lock-in (you're deeply coupled to whatever you deployed), model currency (you can't easily update), and operational governance (you own the entire AI risk surface with no vendor support). The contract conversation shifts from "what do you do with my data" to "what happens when I need a model update and can't connect to your systems."

The five evaluation dimensions still apply in every case — but the weight shifts. Use the scoring table below and adjust weights based on your deployment model. A cloud deployment should weight security, data handling, and sub-processors highest. An air-gapped deployment should weight lock-in, portability, and governance highest.

How to score

Keep the rubric to 15-20 criteria across these five dimensions. Anything more causes evaluation fatigue and everything gets scored a 3. Open the scorecard (Google Sheet).

Baseline checks first. Before you score, establish where the vendor stands on the fundamentals:

Security certifications — SOC 2 Type II and ISO 27001 are standard. If they don't have them, ask why — but a smaller vendor solving a real problem might not have them yet. That's a risk to manage, not necessarily a dealbreaker.
Data usage transparency — you understand what data is used for training, whether you consented, and whether you can control it. The right answer depends on your use case — but "we don't know" is never the right answer.
Sub-processor visibility — the vendor can tell you who touches your data and what each party receives. The formality of disclosure (published list, advance notice, objection rights) scales with your regulatory exposure and customer commitments.

If a vendor can't clear these basics, proceed with caution. But "proceed with caution" isn't "walk away" — the business need, the alternatives, and the risk you can actually manage all factor in.

Then score on a 1–5 scale across these categories. Weight them based on what matters most to your organization — a healthcare company will weight data handling differently than a startup building internal tools.

Category	What it covers
Security & privacy	Incident response, access controls, AI-specific security
Data handling	Training defaults, retention, deletion, sub-processors
AI governance	Model transparency, bias testing, output monitoring, ISO 42001
Technical fit	Model quality for your use case, latency, reliability
Integration & portability	API design, export capabilities, lock-in mitigation
Business & vendor viability	Vendor financial health, TCO, support, business continuity, roadmap alignment
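Mechanically, the weighted score is a weighted average of the six category scores. A minimal sketch: the weights and scores below are hypothetical examples for a cloud deployment, not recommended values:

```python
CATEGORIES = [
    "Security & privacy",
    "Data handling",
    "AI governance",
    "Technical fit",
    "Integration & portability",
    "Business & vendor viability",
]


def weighted_score(scores, weights):
    """Weighted average of 1-5 category scores; weights need not sum to 1."""
    for category in CATEGORIES:
        if category not in scores or category not in weights:
            raise ValueError(f"missing score or weight for {category!r}")
        if not 1 <= scores[category] <= 5:
            raise ValueError(f"score for {category!r} must be between 1 and 5")
    total_weight = sum(weights[c] for c in CATEGORIES)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[c] * weights[c] for c in CATEGORIES) / total_weight


# Hypothetical cloud-deployment weights: security and data handling count most
cloud_weights = dict(zip(CATEGORIES, [3, 3, 1, 2, 1, 1]))
# Hypothetical scores for one vendor
vendor_scores = dict(zip(CATEGORIES, [4, 2, 3, 5, 3, 4]))

print(round(weighted_score(vendor_scores, cloud_weights), 2))  # 3.45
```

Keeping the weights in one place makes the deployment-model adjustment explicit: an air-gapped evaluation would reuse the same scores with a different weight vector.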

Who needs to be in the room. Not just procurement. You need CISO or security lead, privacy/DPO, legal, engineering, and the business unit owner. The business owner isn't there to rubber-stamp — they're there to validate the vendor solves a real problem, that total cost of ownership (not just per-token price) makes sense, and that the vendor's roadmap aligns with where the business is headed. 80% of enterprise AI purchases now face stricter scrutiny than standard SaaS — if your process doesn't reflect that, you're rubber-stamping risk.

Timeline. Plan for 3-6 months, not the 4-8 weeks of standard SaaS procurement. The extra time isn't bureaucracy — it's what it takes to evaluate dimensions that don't exist in your SaaS checklist. Use part of that time to run a proof of concept against your actual use case. A vendor who scores well on paper but underperforms on your data, your latency requirements, or your team's workflow is still the wrong vendor.

Contract provisions worth negotiating

Six clauses that aren't standard in SaaS agreements but matter for AI. Not every vendor will agree to all of them, and not every deal requires all of them. The point is knowing what you're accepting when you sign without them.

Data usage for training — understand and document what's used, what's not, and who controls it. If training on your data is the product, make sure the terms reflect that explicitly — not buried in a ToS update.
Sub-processor disclosure — know who touches your data. The formality (published list, advance notice, objection rights) should match your regulatory exposure.
Model deprecation — understand the vendor's deprecation policy. How much notice? Can you pin to a version during transition? What's the migration path?
Output IP — clarify ownership of outputs and whether the vendor acknowledges prompts as your IP. This matters more in some industries than others.
Data deletion — understand whether deletion extends to embeddings, vectors, and sub-processors. Ask for certification of completion if your compliance framework requires it.
Export — can you get your prompts, logs, embeddings, and fine-tuning datasets out in open formats? The answer shapes your exit cost.

If a vendor won't put their marketing claims in the contract, that tells you something. It might be a dealbreaker — or it might be a risk you accept because the business value justifies it. Either way, make that call consciously, not by default.

The heuristic

Add one question to every AI vendor evaluation that your SaaS checklist never asks: "Show me every company that touches my data after I hit send, and show me the contract that governs each one." The vendor who answers clearly is worth signing. The vendor who hedges is the one who will surprise you.

tl;dr

The pattern. Teams evaluate AI vendors with their SaaS procurement rubric — checking SOC 2 and uptime SLAs — while missing the AI-specific risks that actually burn them: default training on customer data, undisclosed sub-processors, proprietary lock-in, and sovereignty gaps.
The fix. Validate business value first — then run AI vendors through the dimensions SaaS checklists skip: training defaults, sub-processor transparency, AI-specific security, portability, vendor viability, and jurisdiction.
The outcome. You pick vendors that solve real business problems and protect your data at the model layer, not just the platform layer. Your exit cost stays manageable because you negotiated portability before you had no leverage.
