In 2000, Google had a problem. It had engineered extraordinary search technology, accumulated millions of users, and burned through its venture capital without a working business model. Then it discovered something more valuable than search: behavioral data.
By logging what people searched for, what they clicked, how long they spent on results, and what they did next, Google could predict what users would do. And what they would buy. Advertisers would pay a premium not just to reach audiences, but to reach audiences whose behavior predicted receptivity to specific products.
Google's innovation wasn't search. It was the discovery that human behavior, when captured at sufficient scale, becomes a commodity — a raw material that can be processed into predictions and sold.
This is surveillance capitalism. Harvard Business School professor emerita Shoshana Zuboff named it; her 2019 book The Age of Surveillance Capitalism mapped it in full. The tech industry built it over two decades. AI has now made it vastly more powerful.
The Behavioral Surplus — The Core Mechanism
Surveillance capitalism operates on behavioral surplus.
When you use a free service — search engine, social media, email, maps, streaming — you generate behavioral data as a byproduct. Some of that data is necessary to deliver the service: your search query to return search results, your location to show directions. The surplus is everything else:
- How you move your mouse before clicking
- How long you pause before deciding
- Which alternative you almost chose
- What you searched for before and after
- The physical context (time, location, device, weather) of every action
- The emotional valence of your language
- The network of people you interact with and how
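To make the split concrete, here is a minimal sketch of what a logged search event might look like, with the service-necessary fields separated from the surplus. All field names are invented; no platform publishes its actual schema.

```typescript
// Hypothetical search-event record illustrating the necessary/surplus
// split. Field names are invented, not any platform's real schema.
interface SearchEvent {
  // Necessary to deliver the service:
  query: string;
  resultsShown: string[];
  // Surplus: captured as a byproduct, useful only for prediction:
  hoverDurationsMs: number[]; // mouse dwell over each result
  hesitationMs: number;       // pause before the final click
  almostClicked?: string;     // the alternative nearly chosen
  priorQuery?: string;        // session context
  device: string;
  localTime: string;
  querySentiment?: number;    // emotional valence of the language
}

const event: SearchEvent = {
  query: "best running shoes",
  resultsShown: ["storeA", "storeB", "reviewSite"],
  hoverDurationsMs: [1200, 400, 2600],
  hesitationMs: 3100,
  almostClicked: "storeB",
  priorQuery: "knee pain running",
  device: "mobile",
  localTime: "2024-05-01T21:14:00",
};
console.log(event);
```

Only the first two fields are needed to answer the query. Everything below the line is surplus.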
This surplus is not disclosed in terms of service. It is not used to improve the service you're receiving. It is fed into prediction engines that produce behavioral futures — probabilistic estimates of what you will do, want, buy, or believe.
Those behavioral futures are the actual product. You are not the customer. Advertisers, political campaigns, employers, insurers, and anyone else who wants to predict and influence behavior are the customers.
Zuboff calls this the "behavioral modification" phase — where surveillance capitalism moves beyond predicting behavior to actively shaping it. The goal shifts from knowing what you'll do to making you do what the customer wants.
Google's Origin — The Accidental Discovery of Surveillance
Google launched AdWords in 2000; by 2002 it had shifted to the model that mattered: advertisers bid on keywords, pay per click. But Google had something other ad platforms didn't: logs.
Every Google search was logged with the precise query, the results shown, which result was clicked, how long the user spent on the destination page, and what they searched for next. Google's engineers realized they could use this behavioral exhaust — surplus data beyond what was needed to run search — to make the ad system dramatically more effective.
Instead of matching ads to keywords, they could match ads to users — specifically, users whose behavioral history predicted purchase intent. A user who searched for "best running shoes," "marathon training plan," and "knee pain running" didn't just want running content; they were a high-probability buyer of running gear. Ads shown to this behavioral profile would convert at rates an order of magnitude higher than untargeted ads.
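A toy version of that inference, assuming nothing more than keyword matching over a session's query history (real systems use learned models over far richer signals; this only illustrates the structure of the inference):

```typescript
// Toy purchase-intent scorer: fraction of a session's queries that
// signal interest in a product category. Keywords are illustrative.
const intentKeywords = ["running shoes", "marathon", "knee pain running"];

function purchaseIntentScore(queryHistory: string[]): number {
  const hits = queryHistory.filter((q) =>
    intentKeywords.some((k) => q.toLowerCase().includes(k))
  ).length;
  return hits / queryHistory.length;
}

const history = [
  "best running shoes",
  "marathon training plan",
  "knee pain running",
  "weather tomorrow",
];
console.log(purchaseIntentScore(history)); // 0.75 → high-probability buyer
```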
The insight: behavioral data is a prediction oracle. Google's financial trajectory from the mid-2000s onward reflects the value of that discovery.
Google's revenue:
- 2002: $440 million
- 2007: $16.6 billion
- 2012: $50 billion
- 2023: $307 billion (as Alphabet)
The growth curve corresponds to improvements in behavioral data collection, analysis, and targeting precision — not improvements in search quality.
Facebook — From Social Network to Behavioral Modification Engine
Facebook's trajectory is more explicit than Google's, because it required a deliberate architectural pivot.
Early Facebook was a social utility: connect with friends, share updates, see photos. The revenue model was social advertising — reach people where they spend time with their friends. This worked, but not as well as Google's behavioral targeting.
The key insight Facebook developed around 2012: engagement is the metric that maximizes behavioral data collection. More time on platform = more behavioral exhaust = better predictions = more valuable advertising inventory. Everything should be optimized for engagement.
The problem is that engagement is not equivalent to wellbeing. The content that generates the most engagement tends to be:
- Emotionally arousing (outrage, fear, desire, awe)
- Socially validating (likes, shares, comments)
- Algorithmically amplified because it triggers predictable psychological responses
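The structural point fits in a few lines. A minimal sketch of an engagement-ranked feed, with invented scores: note that the ranking function has no term for accuracy, wellbeing, or harm, even when a harm estimate is sitting right there in the data.

```typescript
// Toy feed ranker: orders posts purely by predicted engagement.
interface Post {
  id: string;
  predictedEngagement: number; // model output: expected clicks/shares
  harmScore: number;           // known to the platform, unused in ranking
}

function rankFeed(posts: Post[]): Post[] {
  return [...posts].sort(
    (a, b) => b.predictedEngagement - a.predictedEngagement
  );
}

const feed: Post[] = [
  { id: "calm-news", predictedEngagement: 0.12, harmScore: 0.0 },
  { id: "outrage-bait", predictedEngagement: 0.87, harmScore: 0.9 },
];
console.log(rankFeed(feed).map((p) => p.id)); // ["outrage-bait", "calm-news"]
```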
Facebook's internal research (revealed in the Frances Haugen whistleblower documents, 2021) showed the company understood its platform was amplifying harmful content because harmful content is more engaging. The company prioritized engagement metrics over harm reduction because engagement = behavioral data = revenue.
The 2018 Cambridge Analytica scandal crystallized the endpoint of this logic.
Cambridge Analytica — Behavioral Modification for Political Outcomes
Cambridge Analytica was a political consulting firm that used Facebook data to build psychological profiles of American voters and target them with tailored political messaging.
The mechanism:
A Facebook app called "This Is Your Digital Life" offered a personality quiz to 270,000 users. It also harvested the Facebook data of all their friends, approximately 87 million people, without those friends' knowledge or consent. Facebook's Graph API permitted exactly this at the time: the collection was a feature, not a breach.
Cambridge Analytica processed this data through psychographic models (OCEAN: Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism) to predict personality types for each of the 87 million people.
Targeting: ads, content, and messaging were tailored by personality type. Voters profiled as high-neuroticism and high-agreeableness received content emphasizing threat and community values. Voters profiled as high-openness received different content emphasizing novelty and change.
The Trump 2016 campaign paid Cambridge Analytica approximately $5.9 million for this service.
The behavioral modification loop was complete: collect behavioral data → build psychological profiles → target individuals with psychologically optimized content to modify their behavior (voting, sharing, donating) → repeat.
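A sketch of the targeting step of that loop. The OCEAN traits are real psychology; the thresholds and message framings below are invented for illustration.

```typescript
// Psychographic message selection, illustrative only. A real system
// would score traits with a model trained on behavioral data.
interface OceanProfile {
  openness: number;
  conscientiousness: number;
  extraversion: number;
  agreeableness: number;
  neuroticism: number; // all traits scored 0..1
}

function selectMessage(p: OceanProfile): string {
  if (p.neuroticism > 0.7 && p.agreeableness > 0.7) {
    return "threat-and-community framing";
  }
  if (p.openness > 0.7) {
    return "novelty-and-change framing";
  }
  return "generic framing";
}

console.log(
  selectMessage({
    openness: 0.2,
    conscientiousness: 0.5,
    extraversion: 0.4,
    agreeableness: 0.8,
    neuroticism: 0.9,
  })
); // "threat-and-community framing"
```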
Cambridge Analytica was shut down in 2018. The underlying capability — mass behavioral profiling + psychographic targeting for political influence — remains standard practice for political campaigns on every major platform.
The Prediction Products Market — Who Buys Your Behavioral Futures
Behavioral predictions are sold across a sprawling market:
Retail: Amazon's recommendation engine is a behavioral prediction system; it predicts what you'll buy based on browsing history, purchase history, similar users, and contextual signals. The "customers who bought this also bought" feature isn't helpful serendipity; it's a conversion optimization system. Amazon reported $46.9 billion in advertising revenue, the direct output of that prediction system, in 2023.
Insurance: Behavioral prediction is used for risk pricing. Driving behavior apps (Progressive Snapshot, State Farm Drive Safe & Save) collect continuous driving data and adjust premiums based on predicted risk. Health insurers buy data from brokers about exercise habits, dietary purchases, and health app usage. The premium you pay reflects behavioral predictions about your future claims.
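A toy version of behavioral risk pricing, with invented coefficients (actual actuarial models are proprietary, but the inputs-to-premium structure is the same):

```typescript
// Toy telematics pricing: adjusts a base premium from driving behavior.
interface DrivingTelemetry {
  hardBrakesPer100Mi: number;
  nightMilesShare: number;   // 0..1
  avgSpeedOverLimit: number; // mph
}

function adjustedPremium(base: number, t: DrivingTelemetry): number {
  const riskMultiplier =
    1 +
    0.02 * t.hardBrakesPer100Mi +
    0.3 * t.nightMilesShare +
    0.01 * t.avgSpeedOverLimit;
  return Math.round(base * riskMultiplier);
}

console.log(
  adjustedPremium(1200, {
    hardBrakesPer100Mi: 8,
    nightMilesShare: 0.25,
    avgSpeedOverLimit: 4,
  })
); // 1530: the premium encodes a behavioral prediction
```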
Finance: Credit scoring is explicitly a behavioral prediction system. Alternative credit scoring (for the unbanked) uses behavioral data as prediction inputs: smartphone usage patterns, social network connections, app download history. FICO's Score XD incorporates non-traditional data such as utility and telecom payment histories. The prediction: will you repay debt?
Employment: Behavioral predictions are used to screen job applicants (personality assessments), predict performance (analytics on current employees), predict attrition risk (retention models), and score communication patterns for performance reviews. The behavioral modification angle: employees monitored by "productivity software" modify their behavior to optimize for metrics.
Political targeting: Every major political campaign in the US uses behavioral data to segment voters and optimize messaging. The industry has a term for this: "microtargeting." It means delivering different versions of a message to different behavioral profiles — a perfectly legal form of behavioral manipulation.
Dark patterns: The behavioral modification dimension applied to product design. Endless scroll removes friction from continued use. Variable reward schedules (you don't know if the next post will be interesting) create compulsive checking. Default notifications maximize return visits. These are not accidental design choices; they are behavioral engineering based on psychological research.
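The variable reward schedule is simple enough to show directly. A minimal sketch, with an illustrative payoff probability:

```typescript
// Variable-ratio reward schedule, the mechanism behind "pull to refresh".
// The user cannot predict which refresh pays off, which is precisely
// what makes checking compulsive. The probability here is illustrative.
function refreshFeed(rewardProbability = 0.3): string {
  return Math.random() < rewardProbability
    ? "novel, engaging post"  // intermittent reward
    : "nothing interesting";  // most pulls pay nothing
}

for (let i = 0; i < 5; i++) {
  console.log(`pull ${i + 1}: ${refreshFeed()}`);
}
```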
AI as Surveillance Capitalism's Accelerant
Every prior phase of surveillance capitalism operated on behavioral data collected from explicit actions: searches, clicks, purchases, locations. AI adds a new dimension: language.
Large language models process text. When you interact with ChatGPT, Claude, Gemini, or any other LLM assistant, you produce language — detailed, nuanced, personally revealing language about your problems, plans, fears, desires, medical conditions, relationship situations, professional challenges, and financial concerns.
This language data is extraordinarily valuable for behavioral prediction. It reveals:
- What you're trying to accomplish (intent)
- What obstacles you're facing (friction points)
- What you know and don't know (knowledge gaps)
- What you're worried about (emotional valence)
- What decisions you're considering (prediction targets)
For AI companies operating the surveillance capitalism model, user conversations are not just a service delivery mechanism — they're behavioral data of unprecedented richness.
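Even a crude keyword pass shows how much a single message reveals. A sketch, with regex matching standing in for the classifiers or LLMs a real pipeline would use:

```typescript
// Crude signal extraction from conversational text. The point is what
// one message can reveal, not how it is parsed.
function extractSignals(message: string): Record<string, boolean> {
  const lower = message.toLowerCase();
  return {
    intent: /trying to|want to|how do i/.test(lower),
    frictionPoint: /stuck|can't|failing/.test(lower),
    knowledgeGap: /what is|don't understand/.test(lower),
    worry: /worried|afraid|anxious/.test(lower),
    decision: /should i|deciding between/.test(lower),
  };
}

console.log(
  extractSignals(
    "I'm worried about my mortgage and deciding between refinancing options"
  )
); // worry: true, decision: true — two prediction targets from one line
```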
OpenAI's privacy policy (as of 2024): "We may use Personal Information to provide, maintain, analyze, improve, and develop our Services, including by training our models."
Google's Gemini privacy disclosure: conversations may be "reviewed by human reviewers" and used to "improve Google products and services."
The specifics of how conversation data flows into training and targeting are opaque. The behavioral data principle is clear.
The AI agent multiplication problem: As AI agents become more autonomous — browsing the web, reading email, taking actions on behalf of users — the behavioral exhaust grows exponentially. An AI agent that manages your email has access to every relationship, transaction, and communication in your inbox. An AI agent that manages your calendar knows your schedule, relationships, work patterns, and habits. An agent that manages your finances knows your complete economic life.
If these agents are built on surveillance capitalism models, the behavioral data they generate doesn't stay between you and the agent. It flows into prediction markets.
The Engagement-Reality Divergence
The deepest structural problem with surveillance capitalism is what Zuboff calls the "behavioral modification" imperative — the shift from predicting behavior to actively shaping it.
Optimizing for engagement doesn't optimize for truth, wellbeing, or good decisions. It optimizes for psychological arousal. The result:
Epistemic harm: Recommendation algorithms push users toward increasingly extreme content because extreme content is more engaging. This has been documented in YouTube radicalization pathways, in health misinformation on Facebook, and in the amplification of political polarization on Twitter.
Mental health effects: Facebook's own research found Instagram use correlated with increased body image issues in teenage girls. The company knew. Engagement metrics were prioritized.
Political polarization: Algorithms that maximize engagement with political content amplify outrage and conflict — the most engaging emotional register for political content — creating information environments that make compromise and deliberation harder.
Commercial manipulation: Dark patterns (confusing unsubscribe flows, guilt-trip cancellation screens, misleading default settings) exploit behavioral research to extract money from users against their interests.
Surveillance capitalism doesn't just observe behavior — it shapes it, and it shapes it toward outcomes that serve the prediction products market, not the humans being modified.
What Structural Change Actually Looks Like
Regulation of individual harms (California's $1.2 million CCPA settlement with Sephora, a consent requirement for biometric data) doesn't address surveillance capitalism's structural logic. The prediction products market exists because behavioral data is a commodity with no property rights attached.
Structural change would require:
Data as property: Individuals own their behavioral data. Using it for prediction products requires a license. The value of that license flows to the individuals whose data created it. This is the conceptual foundation of several legislative proposals; it has not passed into law anywhere.
Behavioral surplus prohibition: Platforms may collect only data necessary to deliver the service. The surplus — everything beyond operational necessity — cannot be collected, retained, or sold. This is GDPR's data minimization principle; enforcement in the AI context is incomplete.
Algorithmic accountability: Recommendation algorithms that optimize for engagement must be tested and disclosed. If an algorithm amplifies harmful content to maximize engagement, the operator is liable. The EU's Digital Services Act moves in this direction for large platforms.
Ban on behavioral microtargeting: The EU's Digital Services Act bans ads targeted at minors and ads based on sensitive data, and EU regulators have ordered Meta to stop serving behavioral ads in Europe without user consent. The US has no equivalent.
AI conversation data restrictions: LLM providers cannot use conversation data for behavioral profiling or prediction products without explicit consent. The default should be that your conversation is used only to deliver the response — not to build a behavioral profile.
None of these reforms are close to passing in the US. The lobbying power of the surveillance capitalism beneficiaries is enormous, and the public's awareness of the mechanism is limited.
Your Behavioral Exhaust Is Being Processed Right Now
Every search query, every app session, every AI conversation, every purchase, every location check-in contributes to behavioral profiles that are sold in real-time auctions to influence what you see, buy, believe, and do.
This system was built without your meaningful consent: clicking "I agree" on a terms-of-service document nobody reads is not meaningful consent. It was built incrementally, each step small enough to avoid triggering a regulatory response. It is now so embedded in digital infrastructure that using the modern internet means participating in it whether you want to or not.
The tools for minimizing your behavioral exhaust are technical (VPN, browser isolation, tracking blockers, GPC signals, privacy-preserving analytics). The structural solutions are political. Both matter — but one is available today.
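Of those, GPC is the one with a formal spec: a compliant browser sends the Sec-GPC: 1 request header, and under CCPA/CPRA regulators treat it as a valid opt-out of sale or sharing (it was central to the Sephora settlement). A minimal sketch of a server honoring it; the handler shape is illustrative:

```typescript
// Global Privacy Control: a browser with GPC enabled sends "Sec-GPC: 1"
// on every request. A compliant server treats it as an opt-out.
function shouldDisableTracking(headers: Record<string, string>): boolean {
  return headers["sec-gpc"] === "1";
}

const incoming = { "sec-gpc": "1", "user-agent": "ExampleBrowser/1.0" };
console.log(shouldDisableTracking(incoming)); // true → opt this user out
```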
For AI specifically: if you're going to interact with AI systems, the minimum protection is ensuring your identity and sensitive information don't flow into behavioral profiles. A privacy proxy that scrubs your identity before your query reaches AI providers doesn't solve surveillance capitalism — but it keeps your AI conversations from being prediction products.
Because that's what they're designed to become.
TIAMAT's privacy proxy at tiamat.live is a small piece of the structural answer: AI interactions that don't feed behavioral surveillance. /api/proxy with "scrub": true strips your identity from every AI query. Your conversation ends when you close the tab.
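A sketch of what calling it looks like. The endpoint and the "scrub" flag come from the line above; the rest of the request and response shape is assumed for illustration.

```typescript
// Hypothetical client for the proxy. Only "/api/proxy" and the "scrub"
// flag are documented above; the payload and response fields are assumed.
async function askThroughProxy(prompt: string): Promise<string> {
  const res = await fetch("https://tiamat.live/api/proxy", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt, scrub: true }), // strip identity first
  });
  const data = await res.json();
  return data.response; // field name assumed for illustration
}

askThroughProxy("Summarize my options for refinancing").then(console.log);
```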