Your phone sent your GPS coordinates to 47 different companies yesterday. You consented to this by tapping "Allow" on a weather app 18 months ago. You probably don't remember doing it. You almost certainly don't remember which app it was. But somewhere in a data center right now, a server owned by a company you've never heard of holds a precise log of where you slept last night, where you worked this morning, which hospital you visited three Tuesdays ago, and which political rally you attended in November. That log has been bought and sold multiple times. It will be sold again before you finish reading this article. Welcome to the location data economy.
The SDK Supply Chain: How Your Location Gets Harvested
The mechanism is elegant in its invisibility. You download a flashlight app, a local news app, a weather widget. The developer who built it — a small shop trying to monetize their creation — has embedded a third-party software development kit (SDK) provided by a data broker. This SDK arrives pre-packaged, documented, and revenue-generating: the developer earns a few cents per user per month. All they have to do is include a few lines of code.
That SDK does not ask for your location permission separately. It inherits the permission you already granted the host app. When you tapped "Allow" on the weather app's location prompt, you were consenting to weather forecasts. You were simultaneously authorizing a background process to transmit your GPS coordinates — latitude, longitude, altitude, timestamp, device ID — to a data broker's ingestion pipeline, potentially dozens of times per hour, indefinitely.
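To make the mechanism concrete, here is a minimal sketch of what an SDK's location report might look like. Real SDKs are native mobile code, and the endpoint and field names below are hypothetical stand-ins, but the shape of the payload — coordinates, timestamp, persistent device identifier — is the heart of the product.

```python
import json
import time
import urllib.request

# Hedged sketch of an embedded SDK's location report. The endpoint and
# field names are hypothetical, not any specific broker's API.
def report_location(lat: float, lon: float, alt: float, ad_id: str) -> None:
    payload = {
        "ad_id": ad_id,                 # persistent advertising identifier
        "lat": lat,
        "lon": lon,
        "alt": alt,
        "ts": int(time.time() * 1000),  # millisecond timestamp
        "source": "gps",
    }
    req = urllib.request.Request(
        "https://ingest.example-broker.com/v1/locations",  # hypothetical
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)  # fires in the background, many times an hour
```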
The major SDK providers operating in this supply chain include Foursquare's Pilgrim SDK, SafeGraph's data collection layer, X-Mode Social (now rebranded as Outlogic), and Venntel. These companies don't build the apps you use. They are infrastructure. They sit inside apps the way parasitic organisms sit inside a host — extracting value, invisible to the end user, protected by terms-of-service language that technically constitutes "consent" but functions as a liability shield rather than genuine disclosure.
The numbers are staggering. A single SDK embedded in a moderately popular app can reach tens of millions of devices. X-Mode Social, before platform bans and regulators caught up with it, claimed to collect data from tens of millions of devices through hundreds of partner apps. That single "Allow" tap on one app doesn't send your location to one company. Through SDK sub-licensing arrangements — where one broker resells data rights to others — a single permission grant can result in your location data flowing to 20 or more distinct corporate entities before the end of the week.
Who the Brokers Are
The location data broker industry is dominated by a handful of major players who operate largely outside public awareness, despite handling the most intimate digital traces of hundreds of millions of people.
SafeGraph built its business on "Places" data — precise foot traffic analytics for physical locations. Before 2021, it sold academic data access packages, including to Harvard's Dataverse. After a Motherboard investigation revealed it was selling data on visits to Planned Parenthood clinics, SafeGraph announced it would stop selling such data. The policy lasted one news cycle. The underlying data collection continued.
Veraset is SafeGraph's data licensing spinoff, handling bulk commercial and government sales. It operates in the shadows of its better-known parent, processing mobility data that tracks where populations of devices move at city scale.
X-Mode Social, rebranded as Outlogic, became notorious through a single app: a Muslim prayer app used by millions of people globally. The app's location permission fed X-Mode's SDK, and the resulting movement patterns let users' religious identities be inferred and tagged. The company sold this data — along with data from hundreds of other apps — to US military contractors and intelligence agencies, a practice Vice/Motherboard documented in 2020. Following the resulting public outcry, Google and Apple banned apps carrying the SDK from their stores. X-Mode rebranded as Outlogic and continued operating.
Babel Street is primarily a government-facing analytics company. Its Locate X platform aggregates location data purchased from brokers and provides a search interface for law enforcement and intelligence agencies. It does not require warrants, court orders, or any legal process beyond a government contract. You search a location, you see which device IDs were present, you trace those IDs over time.
Near Intelligence went public via SPAC merger in early 2023, giving the public a rare window into location data broker financials. The company reported $100+ million in annual revenue, with clients including retailers, hedge funds, advertising agencies, and government contractors. Near Intelligence's CEO, Anil Mathews, made headlines in 2022 when he publicly stated the company could identify and track visits to abortion clinics — a statement that would prove consequential when the FTC came calling.
Gravy Analytics was, until December 2024, one of the industry's quieter major players. That changed dramatically when it was breached.
What They Actually Sell: Patterns of Life
"Location data" is the sanitized term. What's actually being sold is something more invasive: a behavioral dossier constructed through location inference.
The methodology works like this. A device that appears at the same address between 10 PM and 7 AM on most nights is almost certainly at its owner's home address. A device that clusters at a different address between 9 AM and 5 PM on weekdays is at its owner's workplace. These inferences are correct roughly 90% of the time and require no additional information. Within 72 hours of data collection, any person's home address and employer can be derived automatically.
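The heuristic is simple enough to sketch in a few lines. The following toy implementation — data shapes and thresholds are illustrative, not any broker's actual pipeline — applies the overnight and working-hours windows described above:

```python
from collections import Counter

def infer_home_and_work(pings):
    """pings: (hour_of_day, is_weekday, rounded_location) tuples for one
    device. Returns the modal overnight and working-hours locations."""
    home = Counter(loc for hour, _, loc in pings if hour >= 22 or hour < 7)
    work = Counter(loc for hour, weekday, loc in pings
                   if weekday and 9 <= hour < 17)
    return (home.most_common(1)[0][0] if home else None,
            work.most_common(1)[0][0] if work else None)

# A week of ordinary pings pins down both addresses.
week = [(23, True, "145 Elm St")] * 20 + [(10, True, "500 Office Pkwy")] * 15
print(infer_home_and_work(week))  # ('145 Elm St', '500 Office Pkwy')
```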
From that foundation, the analysis extends. A device that appears at a mosque every Friday, a synagogue every Saturday, or a church every Sunday has its owner's religious affiliation flagged in the data. A device present at a campaign rally, a union hall, or a political fundraiser has its owner's political associations inferred. A device visiting an oncology clinic, a psychiatric facility, a methadone clinic, or a Planned Parenthood has its owner's health conditions probabilistically logged.
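Categorical tagging follows the same pattern: match pings against a table of points of interest and flag any category that repeats. A hedged sketch, with an invented POI table:

```python
# The coordinates and categories below are invented for illustration.
SENSITIVE_POIS = {
    (40.7510, -73.9870): "place_of_worship",
    (40.7620, -73.9710): "reproductive_health_clinic",
    (40.7440, -73.9890): "political_event_venue",
}

def flag_inferences(visits, min_visits=2):
    """visits: (lat, lon) pairs rounded to POI resolution. Returns the
    sensitive categories attributed to the device after repeat visits."""
    counts = {}
    for point in visits:
        category = SENSITIVE_POIS.get(point)
        if category:
            counts[category] = counts.get(category, 0) + 1
    return {cat for cat, n in counts.items() if n >= min_visits}
```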
All of this is sold as "anonymized." The actual product, in industry terminology, is Patterns of Life analysis — the same phrase used by military intelligence to describe surveillance of high-value targets in conflict zones. The term migrated from the battlefield to the commercial data industry with essentially no friction, because the underlying methodology is identical.
The applications are diverse and lucrative. Retailers purchase foot traffic data to optimize store placement and measure whether advertising campaigns drive physical visits. Insurance companies buy mobility data to audit whether policyholders are actually driving as infrequently as they claim. Hedge funds purchase location signals to predict retail earnings before they're announced. AI training labs purchase location datasets to ground-truth behavioral models. And governments — local, state, and federal — buy all of it, for purposes ranging from urban planning to immigration enforcement.
The Anonymization Fiction
Every data broker selling location data insists it is "anonymized." The claim does not survive scrutiny.
A landmark 2013 paper published in Scientific Reports by Yves-Alexandre de Montjoye and colleagues at MIT demonstrated that just four spatio-temporal data points — four location-timestamp pairs — are sufficient to uniquely identify 95% of individuals in a mobility dataset of 1.5 million people. The result is all the more striking because the underlying data was coarse: hourly pings at cell-antenna resolution, far less precise than the GPS coordinates brokers sell. The finding has been replicated and extended repeatedly, including to credit card metadata, where four transactions were enough to identify 90% of shoppers.
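The uniqueness test itself is straightforward to reproduce on any mobility dataset. A toy version, assuming traces are sets of (antenna, hour) points:

```python
import random

def fraction_unique(traces, k=4):
    """traces: {device_id: set of (antenna_id, hour) points}. A trace counts
    as unique if no other trace contains all k sampled points — a toy
    version of the uniqueness measure in de Montjoye et al. (2013)."""
    unique = 0
    for dev, pts in traces.items():
        sample = set(random.sample(sorted(pts), min(k, len(pts))))
        if not any(sample <= other
                   for d, other in traces.items() if d != dev):
            unique += 1
    return unique / len(traces)
```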
The "anonymous ID" in location data is, in practice, a permanent pseudonym. It is a consistent identifier tied to a specific device. That device travels with one person. The same anonymous ID appearing at a home address, a workplace, a hospital, and a political rally is not anonymous. It is a portrait of a specific human being, assembled automatically, requiring no additional effort from the analyst.
Re-identification takes minutes. A researcher with access to a location dataset and public voter registration rolls — which include home addresses — can cross-reference the two datasets to recover real names attached to "anonymous" device IDs at scale. Researchers at Princeton's Center for Information Technology Policy documented this process in 2019 using commercially available datasets. The data brokers knew. They chose not to fix it because fixing it would destroy the product.
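The join itself is trivial. A hedged sketch with invented data — the device ID, name, and coordinates below are placeholders:

```python
# "Anonymous" device IDs resolved to names by matching inferred home
# locations against a public voter roll. Both datasets are hypothetical.
inferred_homes = {"ad-id-7f3e": (44.0521, -123.0868)}   # device -> home
voter_roll = [("Jane Example", (44.0521, -123.0868))]   # name, address

def reidentify(inferred_homes, voter_roll, tolerance=0.0005):
    matches = {}
    for device, (hlat, hlon) in inferred_homes.items():
        for name, (vlat, vlon) in voter_roll:
            if abs(hlat - vlat) < tolerance and abs(hlon - vlon) < tolerance:
                matches[device] = name
    return matches

print(reidentify(inferred_homes, voter_roll))  # {'ad-id-7f3e': 'Jane Example'}
```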
Government Purchases: The Warrant-Free Surveillance Loophole
In Carpenter v. United States (2018), the Supreme Court ruled 5-4 that law enforcement accessing historical cell-site location information — records held by phone carriers — requires a warrant under the Fourth Amendment. Chief Justice Roberts, writing for the majority, noted that seven days of location data reveals "a detailed chronicle of a person's physical presence compiled every day, every moment."
The ruling was hailed as a landmark privacy decision. It contained a critical loophole.
Carpenter addressed government requests to carriers for phone records. It said nothing about the government simply purchasing location data from commercial brokers. Because the data is voluntarily sold on the open market — because anyone with a contract can buy it — the government's position is that no Fourth Amendment protection applies. The "third party doctrine" that Carpenter partly curtailed still governs commercial data purchases.
The result: federal agencies can buy a real-time location history of anyone in the United States without a warrant, a court order, or any judicial oversight, as long as they pay a data broker for it.
ICE and CBP have done exactly this. Documents obtained by the ACLU through FOIA requests, first reported in detail by Vice/Motherboard and the Wall Street Journal, confirmed that Immigration and Customs Enforcement purchased access to Venntel's location data platform. Venntel, a subsidiary of Gravy Analytics at the time, provided a tool allowing agents to query the location histories of specific device IDs. CBP held a separate Venntel contract. Both agencies used the data without warrants to track individuals, including undocumented immigrants, and, according to ACLU analysis of the contracts, to monitor activity in border regions and at protest locations.
The Department of Homeland Security purchased location data from LexisNexis Risk Solutions — a company better known for credit checks and legal research — under contracts that FOIA requesters documented in 2022. DHS was not purchasing location data to fight terrorism or organized crime. Internal documents showed the data was used for routine immigration enforcement, including tracking individuals who had overstayed visas.
The Babel Street Locate X platform, used by the Secret Service, the IRS Criminal Investigation division, and multiple other federal agencies, allows analysts to draw a geofence around any location — a mosque, a protest, a political campaign office — and pull the device IDs of everyone present during a specified time window. No warrant. No judge. A government credit card and a data broker contract.
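The core operation such a tool exposes is a radius-and-time filter over a table of pings. A minimal sketch — the data layout is an assumption for illustration, not Babel Street's actual schema:

```python
from math import asin, cos, radians, sin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in meters."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 6_371_000 * 2 * asin(sqrt(a))

def geofence_query(pings, center, radius_m, t_start, t_end):
    """pings: (device_id, lat, lon, ts) rows. Returns every device ID seen
    inside the fence during the time window."""
    return {dev for dev, lat, lon, ts in pings
            if t_start <= ts <= t_end
            and haversine_m(lat, lon, center[0], center[1]) <= radius_m}
```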
Post-Dobbs: Abortion Surveillance and the $160 Dataset
On June 24, 2022, the Supreme Court issued its decision in Dobbs v. Jackson Women's Health Organization, overturning Roe v. Wade and returning abortion regulation to states. Within days, data brokers were fielding inquiries about abortion clinic visit data.
Motherboard's Joseph Cox had already documented the most concrete demonstration of the danger. In May 2022, days after a draft of the Dobbs opinion leaked, Cox purchased a dataset from a broker called SafeGraph for just over $160 that showed aggregate movement patterns around Planned Parenthood clinics across the United States — including where groups of visiting devices had come from and where they traveled afterward. The purchase required no credentials, no explanation, no verification. A credit card and a web form.
Near Intelligence CEO Anil Mathews was more explicit than most of his peers. In public statements reported by multiple outlets, Mathews confirmed that Near Intelligence could identify devices visiting abortion clinics and that such data could be sold. The statements were made in a matter-of-fact business context, not as warnings. To Mathews, "reproductive health POI visits" was simply another data category in a catalog.
SafeGraph, after the Motherboard story, announced it would stop selling data on visits to clinics including "domestic violence shelters, addiction treatment facilities, weight loss clinics, cosmetic surgery clinics, fertility clinics, erectile dysfunction clinics, and abortion clinics." Other brokers made similar announcements. The announcements were not accompanied by auditable enforcement mechanisms, third-party verification, or regulatory teeth. The data already collected — years of clinic visit patterns, home address inferences, device IDs — was not deleted.
At least four states with abortion restrictions passed laws following Dobbs that could theoretically criminalize traveling out of state to obtain an abortion. Location data in the hands of a state prosecutor, obtained without a warrant via commercial purchase, would be the primary evidence in such a case.
The Gravy Analytics Breach: A National Security Liability
On December 30, 2024, a threat actor identifying as "Javaghost" posted on a cybercriminal forum claiming to have breached Gravy Analytics, one of the largest location data brokers in the United States. The data posted as proof contained precise GPS coordinates, timestamps, and device identifiers for tens of millions of devices.
Security researchers who analyzed the leaked sample data confirmed its authenticity. The dataset included location histories showing devices at US military bases including Fort Bragg and Naval Station Norfolk. It showed devices at nuclear facilities. It showed devices at the NSA's headquarters at Fort Meade and CIA facilities in northern Virginia. It showed devices at foreign embassy compounds in Washington DC.
Gravy Analytics held contracts with multiple US government agencies. Its subsidiary Venntel was a direct vendor to ICE and CBP. The company had argued, as all data brokers do, that its data was "anonymized" and therefore not a national security concern. The breach demonstrated the obvious: a database containing the location histories of every device that entered a classified facility is itself a classified-grade intelligence asset, regardless of whether real names are attached.
The breach was not the result of a sophisticated nation-state attack. It was a routine intrusion of a commercial company with inadequate security practices. The same surveillance infrastructure built to track immigrants and target abortion clinic patients had quietly accumulated a comprehensive map of US national security personnel's movements. It was sitting on commercial servers, unencrypted, protected by the same security posture as a mid-sized marketing company — because, legally, that's what Gravy Analytics was.
FTC Enforcement: The $1 Fine and Its Meaning
The Federal Trade Commission has taken enforcement action against two major location data brokers. The results are instructive.
Near Intelligence FTC settlement (2024): Near Intelligence, which had gone public in 2023 and posted over $100 million in annual revenue, settled FTC charges (File No. 222 3096) that it had sold sensitive location data including visits to medical facilities, religious institutions, and political gatherings without adequate consumer notice or consent. The settlement required Near Intelligence to delete historical location data near "sensitive locations" and to implement a data minimization program.
The financial penalty: $1.
Not a million dollars. Not a thousand. One dollar. The FTC's authority to impose civil penalties was constrained by the absence of a prior rule violation, leaving the agency able to impose only nominal fines in the initial settlement. Near Intelligence, by the time the settlement was finalized, had filed for bankruptcy — though not before its executives had extracted significant compensation during the SPAC-era valuation run-up.
Outlogic (formerly X-Mode Social) settlement (2024): The FTC banned Outlogic from selling "sensitive location data" — defined to include visits to medical facilities, religious organizations, correctional facilities, domestic violence shelters, and places associated with political or union activity. The company was also banned from selling any location data to US military contractors and intelligence-adjacent firms, the business that had made X-Mode Social controversial in the first place.
The settlements' deterrent effect on a $32 billion industry is approximately zero. The FTC's actions were significant as legal precedents — establishing that selling sensitive location data constitutes an unfair practice — but the enforcement timeline spans years, the penalties are nominal, and the companies' core data collection practices continue under modified policies that remain self-reported and unaudited.
The Hedge Fund Angle: Legal Insider Information
In the financial world, location data has generated a parallel economy with its own ethical murk.
Hedge funds and quantitative trading firms have purchased foot traffic data from location data brokers since at least 2015. The use case is simple: if you can measure how many devices visited Walmart stores across the country in a given month before Walmart reports its quarterly earnings, you have a highly predictive signal for the earnings number itself. Buy or short accordingly.
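The signal construction is almost embarrassingly simple. A hedged sketch under assumed data shapes — real pipelines add panel-bias corrections, but the core is a device count and a ratio:

```python
def quarterly_visits(pings, store_locations):
    """pings: (device_id, rounded_location, quarter) rows. Returns the number
    of distinct devices seen at the chain's stores in each quarter."""
    seen = {}
    for device, loc, quarter in pings:
        if loc in store_locations:
            seen.setdefault(quarter, set()).add(device)
    return {q: len(devices) for q, devices in seen.items()}

def traffic_signal(visits, current_q, prior_q):
    """Positive means foot traffic grew quarter over quarter — a long
    signal ahead of the earnings announcement."""
    return (visits[current_q] - visits[prior_q]) / visits[prior_q]
```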
Firms including Two Sigma, Point72, and dozens of smaller quant shops pay millions annually for "alternative data" subscriptions that include location signals. SafeGraph's and Veraset's marketing materials cite hedge funds and financial services firms as major customer segments. The business is legal. Under prevailing SEC interpretation, purchasing commercially available alternative data does not by itself constitute insider trading, as long as the data was obtained through normal commercial channels and not through a breach of fiduciary duty.
The irony is precise. A person's daily commute, medical appointments, church attendance, and shopping habits — collected without their meaningful understanding through an SDK they never heard of — are being used by a fund manager in Greenwich, Connecticut to trade on their collective behavior before the companies they patronize report their financials. The individual bears all the privacy risk. The fund captures all the economic value.
State Laws vs. the Federal Vacuum
Legislative responses to location data collection have been scattered and inadequate.
California's SB 362 — the DELETE Act, signed into law in October 2023 — requires all data brokers doing business in California to register with the state, honor deletion requests submitted through a single centralized portal, and undergo periodic independent audits. It is the most comprehensive state-level data broker law in the country. It applies to California residents.
Oregon, Texas, and Montana have passed broader consumer privacy laws with provisions covering sensitive data including location. Virginia, Colorado, and Connecticut have data privacy frameworks that include location data categories. As of early 2026, the large majority of states still have no specific regulation of location data brokers.
There is no federal location privacy law. There has never been one. Multiple bills have been introduced — the Location Privacy Protection Act, the American Data Privacy and Protection Act — without passing. The advertising industry's lobbying apparatus, which spent over $100 million opposing federal privacy legislation between 2019 and 2024, has successfully prevented the passage of any comprehensive federal framework.
The practical consequence: a data broker incorporated in a state with no privacy law, selling data about residents of states with no privacy laws, to government agencies and hedge funds that face no restrictions on purchasing it, operates in a regulatory vacuum. The FTC can pursue Section 5 unfair practices cases but cannot issue fines at scale without prior rulemaking. Congress has shown no capacity to act. The $32 billion industry continues to grow.
What Actually Works: A Realistic Threat Model
The conventional privacy advice for location data is largely useless against the SDK supply chain.
Airplane mode stops real-time transmission but does not stop collection. Many SDKs buffer location data locally when no connection is available and transmit the buffered log when connectivity resumes. Switching airplane mode on during sensitive visits and off afterward produces a gap in the timeline, which is itself a data point, but doesn't delete buffered records.
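The buffering behavior is worth seeing in outline. A sketch — the class and method names are illustrative, not any specific SDK's API:

```python
import time

class BufferingCollector:
    """Airplane mode interrupts transmission, not collection: GPS still
    works offline, and the log ships when connectivity returns."""

    def __init__(self):
        self.buffer = []  # pings accumulate here while offline

    def record(self, lat, lon):
        self.buffer.append((lat, lon, time.time()))

    def on_connectivity_restored(self, upload):
        upload(self.buffer)   # the entire offline gap is backfilled at once
        self.buffer.clear()
```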
VPNs do not help with location data collection. A VPN masks your IP address and encrypts traffic to the VPN server. It does not affect the GPS coordinates your device's location hardware collects, nor the device ID transmitted by SDKs, nor the data already collected before you activated the VPN.
Disabling location permissions entirely at the OS level is the most effective single action available on stock Android or iOS. On iOS, the "Precise Location" toggle — introduced in iOS 14 — lets you restrict an app to approximate location (a region spanning several square kilometers) rather than GPS-precise coordinates. Turning precise access off for every app except navigation significantly degrades the value of collected data. On both platforms, reviewing location permissions quarterly and revoking them from apps that don't need them for core functionality reduces exposure.
DNS-layer blockers — NextDNS, Pi-hole, or similar — can block the known domains of major SDK data collection endpoints. NextDNS maintains blocklists that include SafeGraph, X-Mode/Outlogic, and related broker infrastructure. This approach requires technical setup but meaningfully interrupts data transmission at the network level.
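The matching logic behind such a blocklist is simple suffix comparison. A sketch in the spirit of a Pi-hole rule — the domains listed are illustrative placeholders, not a vetted blocklist:

```python
BLOCKED_SUFFIXES = {"ingest.example-broker.com", "sdk.example-telemetry.net"}

def should_block(hostname: str) -> bool:
    """Refuse to resolve a listed domain or any subdomain of one."""
    parts = hostname.lower().rstrip(".").split(".")
    candidates = {".".join(parts[i:]) for i in range(len(parts))}
    return bool(candidates & BLOCKED_SUFFIXES)

print(should_block("v1.ingest.example-broker.com"))  # True -> query refused
```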
Ad ID reset and opt-out: Both iOS and Android allow you to reset your Advertising Identifier (the persistent pseudonym underlying much location data collection) and opt out of ad tracking. iOS's App Tracking Transparency framework, introduced in 2021, requires apps to request explicit permission before accessing the Ad ID. Roughly 75% of iOS users deny this permission when asked, which has substantially degraded the precision of commercial location data collected from iOS devices specifically.
GrapheneOS, an Android fork focused on security and privacy, provides the most comprehensive technical protections available on a mobile device. It includes per-app Network permission controls (allowing an app to function without internet access), a Sensors Off hardware toggle, and hardened permission models. It runs on Google Pixel hardware. For individuals with elevated threat models — journalists, activists, healthcare workers, attorneys — it represents the current gold standard in mobile privacy.
The AI Layer: Location Data Enters the Inference Pipeline
There is a newer, less-discussed dimension to location surveillance: the AI assistant.
When you ask a voice assistant or AI chatbot for restaurant recommendations "near me," directions to a specific address, or information about a medical facility you're visiting, the location data embedded in that query enters the AI provider's logging infrastructure. Prompt logs are retained for model improvement, safety review, abuse detection, and in many cases, commercial purposes. The location context in those prompts — "restaurants near 1247 Oak Street," "directions to the Mercy Hospital oncology wing" — is as sensitive as any GPS coordinate.
Major AI providers have published privacy policies governing this data, but the policies are not audited in real time and the logging practices are opaque to end users. The AI query layer has become a new vector for location data aggregation, one that combines the sensitivity of GPS coordinates with the explicit intent revealed by natural language.
Projects building privacy-preserving AI infrastructure are beginning to address this directly. Approaches include local inference (running the model on-device so queries never leave the hardware), query scrubbing pipelines that strip or generalize location identifiers before queries reach cloud infrastructure, and proxy architectures that insert a sanitization layer between the user's device and the AI provider's API. These approaches trade some capability for privacy protection — a local model can't match a frontier cloud model — but for sensitive queries, the tradeoff is worth making.
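A scrubbing layer can be as simple as a set of substitution rules applied before the prompt leaves the device. A minimal sketch — the patterns below are illustrations, nowhere near production-grade PII detection:

```python
import re

ADDRESS = re.compile(
    r"\b\d{1,5}\s+\w+(\s\w+)?\s(St|Ave|Blvd|Rd|Street|Avenue|Road)\b", re.I)
COORDS = re.compile(r"-?\d{1,3}\.\d{3,},\s*-?\d{1,3}\.\d{3,}")

def scrub(prompt: str) -> str:
    """Generalize location identifiers before the query reaches the cloud."""
    prompt = ADDRESS.sub("[ADDRESS]", prompt)
    prompt = COORDS.sub("[COORDINATES]", prompt)
    return prompt

print(scrub("directions to 1247 Oak Street"))        # directions to [ADDRESS]
print(scrub("restaurants near 44.0521, -123.0868"))  # restaurants near [COORDINATES]
```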
The location data broker industry and the AI inference industry are converging. Behavioral data collected by brokers over years provides training signal for AI models. AI queries generate new behavioral data that may itself be brokered. The infrastructure is merging, and the user is at the center of it, generating value in both directions simultaneously.
The Invisible Infrastructure
The picture that emerges from these threads is of an infrastructure so thoroughly woven into daily digital life that most people cannot see it, let alone opt out of it.
Every morning, millions of people wake up and carry a device that was collecting location data while they slept. They commute to work — location logged. They stop at a pharmacy — flagged as a pharmacy visit, possible health inference triggered. They attend a religious service — religious affiliation noted. They visit a doctor — medical category tagged. They drive past a political campaign office — affiliation signal recorded. None of this requires them to do anything. They consented, technically, eighteen months ago, to a weather app.
The data flows through SDKs to brokers to resellers to government agencies, hedge funds, insurance underwriters, and AI training pipelines. Some of it ends up in a federal law enforcement database used to track people without warrants. Some ends up in a trading algorithm that profits from their shopping habits. Some ends up in the hands of a foreign intelligence service via a breach at a mid-sized data broker with inadequate security.
The $32 billion figure represents what the market pays for this surveillance. It does not represent what it costs the people being surveilled — in privacy, in autonomy, in the potential for that data to be used against them in a reproductive healthcare investigation, an immigration enforcement action, or a political persecution.
Carpenter v. United States told us that persistent location surveillance "risks government encroachment of the sort the Framers, 'after consulting the lessons of history,' drafted the Fourth Amendment to prevent." The Court was talking about government surveillance. What it didn't fully address is what happens when the government simply purchases the surveillance it cannot legally conduct itself.
The answer is in the data. It's always been in the data.
Key cases and regulatory actions referenced in this article: Carpenter v. United States, 585 U.S. 296 (2018); FTC v. Near Intelligence, File No. 222 3096 (2024); FTC v. Outlogic (formerly X-Mode Social), File No. 212 3038 (2024); California SB 362 (DELETE Act, 2023). Breach details: Gravy Analytics incident, December 2024. FOIA documentation: ACLU v. DHS, Venntel contract records; Vice/Motherboard FOIA analysis, 2020-2022.