DEV Community

Super Funicular

How to Read an Android Camera App's Architecture Like a Texas Regulator: A Three-File Audit

A week ago I wrote about the Texas AG opening a privacy investigation into Meta's AI Glasses and what it implies for any product that captures video. That piece was about the legal precedent. This piece is the developer-facing companion: if you wanted to audit an Android camera app the way a regulator's expert witness would, where would you actually look? It turns out you can answer most of the meaningful privacy questions by reading three files inside the APK — no marketing copy, no privacy policy, no vendor blog. The architecture is forced to tell on itself.

I'll walk through the three files using Background Camera RemoteStream, the Android app I ship, as the worked example. The point isn't that we're better than anyone else — the point is that this audit applies to any APK on the Play Store, and the patterns are easy to spot once you know where to look.

File 1: AndroidManifest.xml — the permission and intent surface

Every Android app declares what it can do in its manifest. You can extract it from any APK with apktool d <app>.apk and read the result. The interesting question isn't "what does this app advertise" — it's "what does the OS think this app is allowed to do?" The two are often different.

Three patterns to look for:

Permission inflation. A camera app should request CAMERA, RECORD_AUDIO, and storage permissions appropriate to its API level. A camera app should not need ACCESS_FINE_LOCATION, READ_PHONE_STATE, READ_CONTACTS, GET_ACCOUNTS, or BLUETOOTH_CONNECT. When you see permission inflation, you're looking at a signal that the app does something other than record video — usually telemetry, ad targeting, or device-fingerprinting. A flashlight app that wants location permission is a meme for a reason; the same dynamic applies to camera apps that want unrelated permissions.

Background-execution intent filters. Look for receivers that filter on BOOT_COMPLETED (paired with the RECEIVE_BOOT_COMPLETED permission), JobService declarations, the androidx.work.impl.* components that WorkManager merges into the manifest, and foreground services with non-camera types. An app that registers itself to run at boot, or that declares a dataSync foreground service in addition to a camera one, has a path to do work when the user isn't looking at it. That isn't automatically bad, since many legitimate apps need background sync, but for a camera app where the user expectation is "I press record, the camera records," any background-execution scaffolding deserves scrutiny.

Third-party SDK manifest entries. Many SDKs — analytics, crash reporting, ad networks, attribution — inject their own activities, services, and receivers into the host manifest. You can read off the SDK inventory by looking at the package prefixes on declared components: com.google.android.gms.ads.*, com.facebook.appevents.*, com.amplitude.*, io.branch.*, com.appsflyer.*. Each one of those is a data flow the user didn't see in the marketing copy.
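
The three manifest checks above can be sketched as a short script. This is a minimal sketch rather than a full auditor: it assumes you have already run apktool d and are passing in the text of the decoded AndroidManifest.xml, and that apktool has rendered foregroundServiceType as pipe-separated names. The function name audit_manifest and the two watchlists are hypothetical, seeded from the examples in this section; extend them for your own audit.

```python
import xml.etree.ElementTree as ET

# Attributes in a decoded manifest live in the android: XML namespace.
A = "{http://schemas.android.com/apk/res/android}"

# Illustrative watchlists taken from the examples in the article (assumptions).
UNEXPECTED_FOR_CAMERA = {
    "android.permission.ACCESS_FINE_LOCATION",
    "android.permission.READ_PHONE_STATE",
    "android.permission.READ_CONTACTS",
    "android.permission.GET_ACCOUNTS",
    "android.permission.BLUETOOTH_CONNECT",
}
SDK_PREFIXES = (
    "com.google.android.gms.ads.",
    "com.facebook.appevents.",
    "com.amplitude.",
    "io.branch.",
    "com.appsflyer.",
)
BOOT_ACTION = "android.intent.action.BOOT_COMPLETED"
COMPONENT_TAGS = ("activity", "service", "receiver", "provider")


def audit_manifest(xml_text):
    """Summarize the three manifest patterns for one decoded manifest."""
    root = ET.fromstring(xml_text)
    perms = [p.get(A + "name") for p in root.iter("uses-permission")]
    report = {
        "flagged_permissions": sorted(p for p in perms if p in UNEXPECTED_FOR_CAMERA),
        "boot_receivers": [],
        "service_types": {},
        "sdk_components": [],
    }
    app = root.find("application")
    if app is None:
        return report
    for comp in app:
        if comp.tag not in COMPONENT_TAGS:
            continue
        name = comp.get(A + "name", "")
        # Components whose class lives under a known SDK package prefix.
        if name.startswith(SDK_PREFIXES):
            report["sdk_components"].append(name)
        # Receivers that wake the app at boot.
        if comp.tag == "receiver":
            actions = {a.get(A + "name") for a in comp.iter("action")}
            if BOOT_ACTION in actions:
                report["boot_receivers"].append(name)
        # Foreground service types, e.g. "camera|dataSync".
        if comp.tag == "service":
            fgs = comp.get(A + "foregroundServiceType")
            if fgs:
                report["service_types"][name] = fgs.split("|")
    return report
```

Calling audit_manifest(open("AndroidManifest.xml").read()) on a decoded manifest returns a dictionary you can eyeball or diff between releases. An empty report isn't proof of good behavior; it only means none of these particular patterns were declared.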

For our app, the manifest declares CAMERA, RECORD_AUDIO, FOREGROUND_SERVICE with the camera type, INTERNET (needed for the optional YouTube Live RTMP stream), and the storage permissions appropriate to the SDK version. There is no ACCESS_FINE_LOCATION, no READ_PHONE_STATE, no BOOT_COMPLETED receiver, and no third-party SDK manifest entries. That is what a "structurally incapable of tracking you" manifest looks like in source. It isn't a virtue — it's the absence of the things that would make tracking possible.

File 2: network_security_config.xml — the cloud-boundary declaration

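Since Android 7.0 (API 24), apps can declare their network trust policy in a network_security_config.xml referenced from the manifest. The file is not an endpoint allowlist (an app with INTERNET permission can still open a socket to any host), but it is the cleanest declarative record of which hosts the developer planned for and under what rules: where cleartext is allowed, which certificates are trusted, and which domains are pinned. For a camera app, this is where you find the first evidence of whether a cloud-relay architecture exists.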

Three patterns to look for:

Cleartext traffic permissions. A cleartextTrafficPermitted="true" flag at the base-config level means the app can talk to any HTTP endpoint without TLS. Apps targeting API 28 or higher get cleartext disabled by default, so on a modern app this flag is a deliberate opt-in. Combined with INTERNET permission, it is usually a sign of legacy SDK behavior or a sloppy backend. That isn't necessarily malicious, but it makes traffic interception trivial and often means the app talks to endpoints the developer doesn't control. A camera app whose only network use is the user's own local Wi-Fi shouldn't need app-wide cleartext at all.

Domain pinning. An app with a deliberately narrow network surface often names its domains explicitly, and sometimes pins them. The absence of any per-domain configuration can be a signal that the app talks to many endpoints, often through SDK abstractions where the destination isn't even known at build time. If you see <domain-config> blocks with explicit certificate pins for googleapis.com (with includeSubdomains="true"; the file has no wildcard syntax) and youtube.com, you're looking at a deliberate, narrow cloud surface. If you see no <domain-config> blocks at all but the app has INTERNET permission and ad SDKs, the network surface is whatever the SDKs decide it is.

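Cleartext local-network exceptions. This is the interesting one for camera apps that stream over LAN. Two caveats first. The config governs the app's outbound connections, so an embedded HTTP server that only accepts inbound requests does not by itself need a cleartext exception. And <domain> entries match hostnames or literal IP addresses, not CIDR blocks, so a range such as 10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16 cannot be declared in a single entry. What you can look for is a <domain-config cleartextTrafficPermitted="true"> block whose entries are all private addresses or local hostnames, with cleartext disabled everywhere else. Seeing this pattern (cleartext narrowly scoped to local hosts) is a positive signal: it means the app expects to exchange traffic locally, not to ship it to a vendor cloud.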
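
Those three checks can be sketched the same way. This is a minimal sketch under stated assumptions: the input is the text of res/xml/network_security_config.xml from a decoded APK, the function names audit_network_config and is_private_host are hypothetical, and the sketch ignores the inheritance rules for nested <domain-config> blocks, so treat its output as a starting point for manual review.

```python
import ipaddress
import xml.etree.ElementTree as ET


def is_private_host(host):
    """True if host is an IP literal in a private-use range.

    ipaddress's is_private also covers loopback and link-local addresses.
    Hostnames return False so that they get reviewed by hand.
    """
    try:
        return ipaddress.ip_address(host).is_private
    except ValueError:
        return False  # not an IP literal


def audit_network_config(xml_text):
    """Summarize cleartext scope and pinning for one config file."""
    root = ET.fromstring(xml_text)
    base = root.find("base-config")
    report = {
        # None means "not declared"; the platform default then depends on
        # the app's target API level.
        "base_cleartext": None if base is None else base.get("cleartextTrafficPermitted"),
        "cleartext_domains": [],
        "non_local_cleartext": [],
        "pinned_domains": [],
    }
    # Attributes in this file are not namespaced, unlike the manifest.
    for cfg in root.iter("domain-config"):
        domains = [d.text.strip() for d in cfg.findall("domain") if d.text]
        if cfg.get("cleartextTrafficPermitted") == "true":
            report["cleartext_domains"].extend(domains)
            report["non_local_cleartext"].extend(
                d for d in domains if not is_private_host(d)
            )
        if cfg.find("pin-set") is not None:
            report["pinned_domains"].extend(domains)
    return report
```

The field to read first is non_local_cleartext: any entry there is a host the app may reach over plain HTTP that isn't a private address. A local hostname would also land in that list, which is intentional, since a name can resolve anywhere.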

Our app's network_security_config.xml scopes cleartext to local-network use only (the in-app HTTP server is reached by the user's other devices on the same Wi-Fi), and the only outbound internet endpoint is YouTube's RTMP/RTMPS ingest for the optional Live feature. There is no analytics endpoint, no attribution endpoint, no crash reporter, no SDK-driven backend. The cloud bill is literally zero per user per day, and that economic property is what removes the structural incentive to monetize the data in the first place, a dynamic I unpacked in The Cloud-Bill Theory of Free Camera Apps.

File 3: the dependency graph — what is actually linked into the binary

The third file isn't one file, exactly. If you have the source, it's the dependency declaration in build.gradle. If you only have the APK, it's the linked set you can reconstruct from the package names in the dex files and the version markers that many libraries leave in META-INF/. This is where the real answer to "what does this app do" lives, because every behavior an app can have ultimately comes from code that's either in its own source tree or in a library it depends on.
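
For the META-INF/ route, a minimal sketch: an APK is a zip archive, and many AndroidX and Google libraries leave small *.version marker files there (for example META-INF/androidx.core_core.version). The helper name library_markers is hypothetical, and the approach is deliberately partial: not every library ships a marker and build tooling can strip them, so absence proves nothing.

```python
import zipfile


def library_markers(apk_file):
    """Map marker names (usually 'group_artifact') to version strings.

    apk_file may be a filesystem path or a binary file-like object.
    """
    markers = {}
    with zipfile.ZipFile(apk_file) as apk:
        for name in apk.namelist():
            if name.startswith("META-INF/") and name.endswith(".version"):
                key = name[len("META-INF/"):-len(".version")]
                # Marker files are tiny text files holding a version string.
                markers[key] = apk.read(name).decode("utf-8", "replace").strip()
    return markers
```

Running library_markers("app-release.apk") gives a quick inventory to compare against what the vendor says is in the binary. Anything the markers miss still has to be found by reading package names out of the dex files.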

Three patterns to look for:

Analytics/attribution SDK presence. Firebase Analytics, Mixpanel, Amplitude, Segment, Branch, Adjust, AppsFlyer, Singular, Kochava: any one of these is a built-in telemetry pipeline. The user opens the app, the SDK reports session data to the vendor's backend, and what happens to it from there is governed by the vendor's terms rather than yours. The presence of any of these in a camera app where the user wasn't told about it is the gap between privacy policy and code.

Ad network SDK presence. Google AdMob, IronSource, AppLovin MAX, Meta Audience Network, Unity Ads. Each one is a separate data flow. A camera app that displays no visible ads but still depends on AdMob is worth investigating: sometimes it's a leftover from a free tier, and sometimes it's making invisible ad requests to collect fraudulent impression revenue.

Cloud-storage / cloud-relay SDK presence. AWS Mobile SDK, Firebase Storage, Google Cloud Storage client libraries, third-party WebRTC TURN servers, custom RTMP relays. These are the load-bearing dependencies of a cloud-backed camera app. If you see them in the dependency graph, you're looking at an architecture that requires the cloud — independent of what the marketing says about local storage.
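
When you do have the source, the three categories above can be checked against the text output of a Gradle dependency report. This is a minimal sketch, not a complete SDK database: the function name scan_dependencies is hypothetical, and the Maven coordinate prefixes in WATCHLIST are my assumptions about how a few of the SDKs named above are published, so verify them against each vendor's current documentation before relying on a clean result.

```python
import re

# Category -> Maven coordinate prefixes. Illustrative and incomplete (assumption).
WATCHLIST = {
    "analytics/attribution": (
        "com.google.firebase:firebase-analytics",
        "com.mixpanel.android",
        "com.amplitude",
        "io.branch.sdk.android",
        "com.adjust.sdk",
        "com.appsflyer",
    ),
    "ads": (
        "com.google.android.gms:play-services-ads",
        "com.applovin",
        "com.unity3d.ads",
        "com.facebook.android:audience-network-sdk",
    ),
    "cloud storage/relay": (
        "com.google.firebase:firebase-storage",
        "com.amazonaws",
    ),
}

# Matches the "group:artifact" part of lines such as
# "+--- com.example:lib:1.0" or "\--- com.example:lib -> 1.2".
COORD = re.compile(r"([A-Za-z0-9_.\-]+):([A-Za-z0-9_.\-]+)")


def scan_dependencies(report_text):
    """Return {category: sorted list of matching group:artifact coordinates}."""
    hits = {}
    for group, artifact in COORD.findall(report_text):
        coord = group + ":" + artifact
        for category, prefixes in WATCHLIST.items():
            if coord.startswith(prefixes):
                hits.setdefault(category, set()).add(coord)
    return {category: sorted(found) for category, found in hits.items()}
```

Feed it the captured output of the dependency task for your release variant. Because it matches on prefixes, transitive pulls show up too, which is usually where the surprises are.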

Our app's dependency graph: AndroidX core, AndroidX Camera2 wrappers, NanoHTTPD for the embedded local HTTP server, the RTMP client library used for the optional YouTube Live feature, and a small number of utility libraries. There is no analytics SDK, no attribution SDK, no ad network SDK, no cloud-storage SDK. The dependency graph is the architecture — there isn't a hidden "and also" line.

Why the three-file audit holds up

The interesting property of this audit is that it can be done by a third party, on any APK, without trusting the vendor's claims. The Texas AG's investigators won't ask Meta to self-report. They'll have someone extract the APK and read the same files. The files don't lie: the manifest and network config are what the OS enforces at runtime, and the dependency graph is the code that actually ships.

The three architectural questions I raised in the anchor piece map cleanly onto the three files:

| Question from the anchor | File that answers it |
| --- | --- |
| Is recording opt-in per session, or always-on by default? | Manifest (intent filters, services, BOOT_COMPLETED receivers) |
| Where does the footage live: locally, in a vendor cloud, or fed into a model? | network_security_config.xml + dependency graph |
| Is biometric inference happening, and if so, where? | Dependency graph (vision/ML SDK presence) + manifest (background services) |

A privacy claim that the manifest, network config, and dependency graph all agree with is structurally durable. A privacy claim that only the marketing copy makes is structurally fragile — the next product manager can break it without breaking anything users can see.

What this changes for developers

If you build anything that touches a camera, audio, or biometric-derivable data, this is the audit your product will eventually be subjected to. The Texas Meta filing is one instance of a broader pattern: regulators are getting more comfortable reading architecture, not just policy.

Three concrete things worth doing this week:

  1. Run apktool against your own release APK and read the manifest. Look for permissions you don't remember requesting, services you don't remember adding, and SDK-injected components. Decide for each one whether you can defend it on the record.
  2. Open your network_security_config.xml and walk through every endpoint your app can talk to. Some will be your own infrastructure, some will be SDK-driven. Decide whether each is documented in your privacy policy.
  3. Resolve your full transitive dependency tree (./gradlew app:dependencies) and grep for the SDK names in this article. Decide whether each one earns its place in the binary.

The interesting outcome of this exercise isn't that you'll find a smoking gun — it's that you'll discover what your product actually does, separate from what it says it does. For most products, those two are closer than for Meta Glasses but further apart than you'd hope.

The architecture is the privacy story. The marketing copy is a hypothesis.


Related reading

If you want to look at what a manifest, network config, and dependency graph deliberately designed to make tracking structurally impossible look like in shipping form, Background Camera RemoteStream is the one we ship. More on the architectural reasoning at superfunicular.com.
