DEV Community: Spicy

Why HIPAA Doesn't Cover the Health Data Your API Just Pulled

Spicy — Sat, 04 Jul 2026 12:10:45 +0000

If you're building anything on top of HealthKit, Google Fit, or a direct wearable API integration, there's a compliance assumption worth checking before you ship: HIPAA almost certainly doesn't apply to the data you're pulling.

The Covered Entity Problem

HIPAA only regulates "covered entities" (healthcare providers, health plans, and healthcare clearinghouses) and their business associates. Pulling from a consumer wearable API (Apple HealthKit, Google Fit, Fitbit Web API, Oura API, Whoop API) doesn't make your app a covered entity just because it handles heart rate or sleep data. Unless you're building specifically for a healthcare provider under a signed business associate agreement, you're outside HIPAA's scope entirely, according to a legal breakdown from the Law Office of Jeffrey Hall.

That means the usual HIPAA checklist — encryption at rest, audit logs, minimum necessary access — is good practice, but it's not a legal requirement for most consumer-facing health apps. Nobody is going to fine you under HIPAA for over-collecting sleep data. That's the trap, not the reassurance.

What Actually Regulates This Data

Regulation	Applies to consumer wearable apps?
HIPAA	No (unless working with a covered entity)
GDPR (EU users)	Yes, as sensitive personal data
CCPA (California users)	Yes
Washington My Health My Data Act	Yes, if any WA users
App store health data policies	Yes

The most interesting one for engineers is Washington's My Health My Data Act (RCW 19.373), passed in 2023. It's the first US state law written specifically to close the HIPAA gap for consumer health data, and it applies to any business with Washington users, not just Washington-based companies. If your app has any US user base, you likely have Washington users. The Act requires:

A separate, prominently linked Consumer Health Data Privacy Policy (not folded into your general privacy policy)
Explicit opt-in consent before collection or sharing, not just a terms-of-service checkbox
A functioning deletion request path across all systems, including backups
Written authorization retained for six years if you ever sell health data

Full text is at RCW 19.373 if you're scoping compliance work.

Practical Implementation Notes

Data retention. If your backend keeps synced health metrics after a user deletes their account, you're not just accumulating liability, you're likely violating deletion-right requirements in at least one jurisdiction your users touch. Build a real deletion pipeline that hits backups, not just the primary table.
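A minimal sketch of what "hits backups, not just the primary table" can look like in code. The store names and the in-memory dictionaries are hypothetical stand-ins for real systems; the point is only that every store is registered in one place and the pipeline reports which ones actually purged the user.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DeletionReport:
    user_id: str
    completed: List[str] = field(default_factory=list)
    failed: Dict[str, str] = field(default_factory=dict)

    @property
    def fully_deleted(self) -> bool:
        # Only true when no store reported an error.
        return not self.failed


def delete_user_everywhere(
    user_id: str, purgers: Dict[str, Callable[[str], object]]
) -> DeletionReport:
    """Run every registered purge function and record the outcome per store."""
    report = DeletionReport(user_id=user_id)
    for store_name, purge in purgers.items():
        try:
            purge(user_id)
            report.completed.append(store_name)
        except Exception as exc:
            # Keep going so one failing store does not hide the others.
            report.failed[store_name] = str(exc)
    return report


# Hypothetical in-memory stand-ins for a database, a backup, and an export.
primary_table = {"u1": [72, 68], "u2": [80]}
backup_snapshot = {"u1": [72], "u2": [80]}
analytics_export = {"u1": {"avg_hr": 70}}

purgers = {
    "primary_table": lambda uid: primary_table.pop(uid, None),
    "backup_snapshot": lambda uid: backup_snapshot.pop(uid, None),
    "analytics_export": lambda uid: analytics_export.pop(uid, None),
}

report = delete_user_everywhere("u1", purgers)
print(report.completed, report.fully_deleted)
```

In a real backend each purger would wrap a database delete, a backup rewrite or expiry job, or a vendor deletion API. A failed entry in the report is the signal to retry rather than tell the user the request is done.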

Third-party SDKs. If you're piping HealthKit data into an analytics SDK, an ad network, or a crash reporter, check that SDK's data handling terms specifically. "Anonymized" health data has been shown to be re-identifiable at high accuracy when cross-referenced with other datasets, which is a real risk if you're forwarding granular biometric streams anywhere.

Consent UX. A single "Allow Health Access" iOS permission prompt is not equivalent to the affirmative, specific consent required under Washington's law or GDPR. If you're monetizing health data in any way (even aggregated), build a separate consent flow for that specific use.
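One way to keep "specific" consent honest in the backend is to store it per purpose rather than as a single boolean. This is a rough sketch, not a legal implementation: the purpose strings, the policy version label, and the ConsentLedger class are all hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Tuple


@dataclass(frozen=True)
class ConsentRecord:
    user_id: str
    purpose: str          # one record per distinct use of the data
    granted_at: datetime
    policy_version: str   # which policy text the user actually saw


class ConsentLedger:
    def __init__(self) -> None:
        self._records: Dict[Tuple[str, str], ConsentRecord] = {}

    def grant(self, user_id: str, purpose: str, policy_version: str) -> ConsentRecord:
        record = ConsentRecord(user_id, purpose, datetime.now(timezone.utc), policy_version)
        self._records[(user_id, purpose)] = record
        return record

    def revoke(self, user_id: str, purpose: str) -> None:
        self._records.pop((user_id, purpose), None)

    def allows(self, user_id: str, purpose: str) -> bool:
        # Consent for one purpose never implies consent for another.
        return (user_id, purpose) in self._records


ledger = ConsentLedger()
ledger.grant("u1", "collect_sleep_data", "chd-policy-v1")

print(ledger.allows("u1", "collect_sleep_data"))     # True
print(ledger.allows("u1", "share_aggregated_data"))  # False
```

The design choice is that any code path that shares or monetizes data has to ask the ledger about its own purpose, so the OS permission prompt alone can never unlock it.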

Employer/insurer integrations. If you're building a wellness-program integration that feeds data to an employer or insurer, treat that data flow as high-risk by default. There's no HIPAA wall preventing that data from eventually informing risk decisions, and the legal exposure sits with whoever built the pipe.

What I Actually Found

Most developer documentation for these APIs focuses entirely on the technical integration and barely mentions the regulatory landscape. Apple's HealthKit guidelines are the exception — they're genuinely strict about app review for health data access. Google Fit and most third-party wearable APIs leave compliance almost entirely up to you.

If I were architecting a new wearable-data feature today, I'd default to treating every user as if they were covered by Washington's My Health My Data Act, since it's currently the strictest applicable standard and building to it covers most other jurisdictions by default.

Full write-up with the consumer-facing angle and more on the legal background: https://lucas8.com/wearable-health-data-hipaa-gap

The Solar Tax Credit Died in 2026. Here's the Decision Tree.

Spicy — Sat, 04 Jul 2026 12:09:41 +0000

If you priced out solar before 2026 and priced it out again this year, the numbers probably look worse. That's not a sales tactic or a market shift. The 30% federal residential solar tax credit (Section 25D) ended on December 31, 2025, with zero phase-down period. No grandfathering, no partial credit for contracts signed in 2025 but installed in 2026.

That said, the credit's disappearance created a fork in the decision tree rather than a dead end. Here's the breakdown, without the sales pitch.

What Actually Ended

Section 25D let homeowners take 30% of system cost as a credit directly against their federal taxes, with no cap and no income limit. It was legislated to run through 2034. The One Big Beautiful Bill Act, signed July 2025, moved the termination date to the end of 2025 instead, a sunset roughly nine years early. If your system was placed in service on or before December 31, 2025, you still claim the credit on that year's return. If it goes live in 2026 or later, the credit doesn't apply, full stop.

The mechanism that survived is ownership structure. Section 25D covered direct ownership only. Section 48E, the commercial/business investment credit, is still active for entities that own solar equipment and lease or sell power from it. That single distinction is why "no tax credit" isn't accurate for every path to solar in 2026.

The Three Remaining Paths

1. Third-party ownership (lease/PPA). A company owns the panels on your roof; you buy the electricity. That company can still claim the Section 48E credit and typically passes savings through as a lower monthly rate instead of a lump-sum credit on your return. You lose equity-building and any ownership-dependent state incentives, but if your tax liability was too small to use 30% anyway, this closes most of the gap.

2. State-level programs, independent of federal policy. These never depended on 25D and mostly survived untouched:

State income tax credits (a minority of states)
Sales tax exemptions on solar equipment (CA, FL, AZ, CO, NY, and others)
Property tax exemptions so added home value doesn't raise your tax bill (30+ states)
Production-based payments like New Jersey's SuSI program, which pays per kWh for 15 years

Check your specific state through the Database of State Incentives for Renewables & Efficiency, the DOE-funded clearinghouse — coverage is wildly inconsistent state to state, so a generic "solar incentives still exist" claim is close to useless without checking your own address.

3. Net metering and community solar. Net metering — getting credited for excess power sent back to the grid — is a utility mechanism, not a tax credit, so it wasn't touched by the OBBBA at all. Community solar (buying a share of an off-site array instead of installing your own panels) extends this to renters and shaded-roof homes with zero upfront installation cost.

Quick Comparison

Path	Who claims the credit	You own the system	Works if you have low tax liability
Cash/loan purchase	No one (credit gone)	Yes	No benefit either way
Lease / PPA	Third-party owner	No	Yes — savings via lower rate
State credit + exemptions	You	Yes	Depends on state
Net metering / community solar	You	Optional	Yes — utility-side, not tax-side

What I Actually Found

Most coverage treats this as one national story: "credit's gone, solar's still worth it long-term." That framing skips the part that actually matters — the size of the gap left behind depends entirely on your state's net metering rules and exemption stack, not on a national average payback period.

If I were running these numbers for myself, I'd ask an installer for two quotes side by side: total cost with zero incentives applied, and total cost after every program that applies specifically to my address. If an installer can't produce both numbers without a follow-up call, they're probably still quoting off a pre-2026 script.
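The two-quote comparison is simple arithmetic, sketched here with made-up numbers. Every dollar figure and incentive name is a placeholder, and simple payback ignores financing, rate changes, and panel degradation, so treat it as a sanity check on an installer's quote rather than a forecast.

```python
from typing import Dict


def net_cost(gross_cost: float, incentives: Dict[str, float]) -> float:
    """Total cost after subtracting every incentive that applies to this address."""
    return gross_cost - sum(incentives.values())


def simple_payback_years(cost: float, annual_bill_savings: float) -> float:
    """Years until cumulative bill savings equal the cost."""
    if annual_bill_savings <= 0:
        raise ValueError("annual_bill_savings must be positive")
    return cost / annual_bill_savings


# Placeholder figures for illustration only.
gross = 24_000.0
incentives = {"state_tax_credit": 1_000.0, "sales_tax_exemption": 1_500.0}
annual_savings = 1_800.0

quote_without = gross
quote_with = net_cost(gross, incentives)

print(f"No incentives:   ${quote_without:,.0f}, payback {simple_payback_years(quote_without, annual_savings):.1f} years")
print(f"With incentives: ${quote_with:,.0f}, payback {simple_payback_years(quote_with, annual_savings):.1f} years")
```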

The federal credit for direct ownership isn't coming back this cycle. But state programs, net metering, and lease structures are still on the table for most of the country, and stacking even two of them gets you closer to 2025's numbers than most headlines suggest.

Full breakdown with a state-by-state incentive walkthrough: https://lucas8.com/solar-tax-credit-alternatives-2026

The Nutrition Label That Software Always Lacked

Spicy — Fri, 03 Jul 2026 16:45:31 +0000

On December 9, 2021, a security researcher posted a proof-of-concept exploit for a vulnerability in a Java library called Log4j. Within 72 hours, hundreds of millions of systems were at risk. The chaos that followed wasn't primarily about the vulnerability's severity. It was about something more fundamental: most organizations had no way to quickly search their own software to find out whether Log4j was buried somewhere inside it.

That moment made the software bill of materials — SBOM — go from a niche compliance topic to a board-level conversation almost overnight.

What Is an SBOM?

A software bill of materials is a structured, machine-readable inventory of every component inside a piece of software. Think of it as the nutrition label on a food package, applied to code. It tells you which open-source libraries, third-party packages, and internal components make up a given application — along with their versions, licenses, and any known vulnerabilities.

Modern software is rarely written entirely from scratch. A typical commercial application might contain 500 to 1,500 third-party dependencies by the time all transitive relationships are counted. Without an SBOM, tracking all of those relationships manually is effectively impossible.

Two open standards dominate today:

SPDX (Software Package Data Exchange)
  Origin: Linux Foundation
  Primary focus: License compliance
  Best for: Teams where legal is the main concern

CycloneDX
  Origin: OWASP
  Primary focus: Security and vulnerability tracking
  Best for: DevSecOps and security-focused teams

Why Log4Shell Made This Non-Negotiable

The Log4Shell vulnerability (CVE-2021-44228) affected an estimated 3 billion devices. But the more instructive part of the story was the response time gap.

Companies that had SBOMs — even rough, incomplete ones — could run an automated search and know within hours whether they were affected. Companies without them spent weeks manually auditing every application, tracing every dependency tree, and waiting on vendor responses.
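To make "run an automated search" concrete, here is a small sketch that walks a CycloneDX-style JSON document and looks for a component by name. The sample SBOM is invented for illustration. The only structural assumption is the CycloneDX convention of a "components" array whose entries can carry their own nested "components".

```python
import json
from typing import Dict, Iterator, List

# Invented sample in CycloneDX JSON shape; a real file would come from your SBOM tool.
SAMPLE_SBOM = json.loads("""
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {"type": "library", "name": "spring-core", "version": "5.3.9"},
    {
      "type": "application", "name": "billing-service", "version": "1.4.0",
      "components": [
        {"type": "library", "name": "log4j-core", "version": "2.14.1",
         "purl": "pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1"}
      ]
    }
  ]
}
""")


def iter_components(node: Dict) -> Iterator[Dict]:
    """Yield every component, including ones nested inside other components."""
    for component in node.get("components", []):
        yield component
        yield from iter_components(component)


def find_component(sbom: Dict, name: str) -> List[Dict]:
    return [c for c in iter_components(sbom) if c.get("name") == name]


hits = find_component(SAMPLE_SBOM, "log4j-core")
for hit in hits:
    print(hit["name"], hit["version"])
```

Run across a directory of SBOM files, the same function answers "which of our applications ship this library, and at what version" without opening a single build file.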

The 2024 XZ Utils backdoor reinforced the same point. An attacker spent nearly two years contributing to an open-source project before inserting a hidden backdoor into a widely used compression library. Most organizations still had no automated way to know whether that library was in their stack. Supply chain attacks exploit exactly the visibility gap that an SBOM closes.

Who Requires SBOM Now

Compliance pressure has moved faster than most timelines:

US Executive Order 14028 (May 2021)
  → Federal software vendors must provide SBOMs
  → Status: Active

FDA Medical Device Guidance (Oct 2023)
  → SBOM required with every premarket submission
  → Status: Active

EU Cyber Resilience Act (Dec 2027)
  → Covers all software and connected devices sold in the EU
  → SBOM documentation explicitly required
  → Status: Full enforcement December 2027

According to ENISA's 2026 figures, 78% of enterprises have begun SBOM adoption. A DigiCert survey found that 43% of companies expect to face explicit SBOM requirements within 24 months. The gap between "starting adoption" and "production-ready compliance" remains substantial for most of them.

How to Actually Get Started

CISA maintains a comprehensive SBOM resource library covering formats, tooling, and implementation guidance. For most teams, the practical starting point looks like this:

Step 1: Pick a standard
  → Security-first team: CycloneDX
  → License compliance priority: SPDX

Step 2: Integrate automated generation
  → Open source: Syft (Anchore) — generates from container images and filesystems
  → Open source: Grype — vulnerability scanning against generated SBOMs
  → Commercial: FOSSA, Snyk, Black Duck — CI/CD pipeline integration

Step 3: Connect to a vulnerability database
  → NIST NVD
  → OSV (Open Source Vulnerabilities)
  → Commercial feeds from vendors above

Step 4: Establish update cadence
  → Tie SBOM generation to every build, not a quarterly audit

Start with a single application. Attempting to retrofit SBOM generation across an entire portfolio at once is where most programs stall.
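Step 3 can be sketched as turning SBOM components into queries for the OSV database. This assumes the OSV query endpoint at https://api.osv.dev/v1/query, which, as I understand its documentation, accepts a package identified by purl. The sample components are invented, and the network call is defined but deliberately not executed here.

```python
import json
import urllib.request
from typing import Dict, List

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def osv_payloads(sbom: Dict) -> List[Dict]:
    """Build one OSV query body per top-level component that carries a purl."""
    payloads = []
    for component in sbom.get("components", []):
        purl = component.get("purl")
        if purl:
            # The version is already encoded in the purl, so no separate field is sent.
            payloads.append({"package": {"purl": purl}})
    return payloads


def post_query(payload: Dict) -> Dict:
    """Send one query to OSV. Not called in this sketch; needs network access."""
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Invented sample; a real SBOM would come from a generator such as Syft.
sbom = {
    "components": [
        {"name": "log4j-core", "version": "2.14.1",
         "purl": "pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1"},
        {"name": "internal-lib", "version": "0.3.0"},
    ]
}

queries = osv_payloads(sbom)
print(json.dumps(queries, indent=2))
```

Components without a purl, like the internal library above, are skipped rather than guessed at. In practice those are the entries that need a human owner.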

What I Actually Found

The framing that dominates most SBOM coverage — "generate an SBOM, know what's in your software, fix vulnerabilities faster" — is accurate but incomplete in a way that matters practically.

Most organizations that invest in SBOM tooling discover the same thing in the first few months: the SBOM reveals a dependency tree that's far messier than anyone expected, and the organizational process for responding to what it reveals doesn't exist yet.

An SBOM is a diagnostic tool. It tells you what's there. But knowing your application has 847 dependencies, 23 of which have known CVEs, doesn't automatically generate a prioritization framework, a patch ownership model, or a vendor communication process.

The companies that responded fastest to Log4Shell weren't the ones with the most sophisticated tooling. They were the ones that had already answered "who owns this dependency when something goes wrong?" before the crisis hit. The SBOM tells you what you have. The harder question it surfaces is whether your organization is structured to act on it.

Full piece with more detail and external sources: lucas8.com/what-is-sbom-software-bill-of-materials

Is AI to Blame for Your Rising Electric Bill?

Spicy — Wed, 01 Jul 2026 15:06:30 +0000

In January 2026, a Virginia man received an electric bill for $281 — nearly triple the $100 he had paid the previous month. He had lived in the same house for 40 years. Nothing in his home had changed. What had changed was what surrounded it: Northern Virginia's "Data Center Alley," the largest concentration of AI data centers on the planet.

The question he started asking has become one of the defining consumer issues of 2026: are AI data centers genuinely responsible for rising electric bills? The honest answer is more nuanced than most headlines suggest. But the short version is: yes, partly, and the share is growing.

How Much Electricity Do AI Data Centers Actually Use?

The numbers are genuinely staggering. According to the IEA, data centers accounted for roughly half of all new electricity demand growth in the US last year — a share the agency expects to hold through 2030.

Total US data center energy demand is projected to nearly double between 2025 and 2028, jumping from 80 to 150 gigawatts. That 70-gigawatt addition is roughly equivalent to the entire annual electricity consumption of Spain added to the US grid in just three years.

Virginia alone now hosts nearly 600 data centers, with facilities accounting for close to 40 percent of all electricity used in the state in 2024. The race to find enough power for this demand has pushed the energy industry toward nuclear and other alternative energy sources, turning what was once a tech industry concern into a kitchen-table issue for millions of American households.

Why the Connection to Your Bill Is Real

The mechanism isn't a direct surcharge. But the path from data center construction to higher household costs is real, and it runs through the utility infrastructure system.

When a data center connects to the grid, utilities must upgrade infrastructure to handle the added load — new transmission lines, transformers, generation capacity. Those investments get approved by state regulators and recovered through rate increases spread across all customers in the service territory, including residential ones.
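A deliberately oversimplified sketch of that cost-spreading arithmetic. Real rate cases involve allowed returns, customer-class allocation, and depreciation schedules, none of which are modeled here, and every number below is a placeholder rather than data from any utility.

```python
def monthly_bill_impact(
    upgrade_cost: float,
    recovery_years: int,
    customers: int,
    share_paid_by_data_center: float = 0.0,
) -> float:
    """Flat per-customer monthly charge needed to recover an upgrade cost."""
    if not 0.0 <= share_paid_by_data_center <= 1.0:
        raise ValueError("share_paid_by_data_center must be between 0 and 1")
    socialized_cost = upgrade_cost * (1.0 - share_paid_by_data_center)
    return socialized_cost / (recovery_years * 12) / customers


# Placeholder scenario: $1.2B of upgrades, 20-year recovery, 2 million customers.
fully_socialized = monthly_bill_impact(1.2e9, 20, 2_000_000)
half_covered = monthly_bill_impact(1.2e9, 20, 2_000_000, share_paid_by_data_center=0.5)

print(f"Spread across all customers:    ${fully_socialized:.2f}/month")
print(f"Data center covers half:        ${half_covered:.2f}/month")
```

The share parameter is the lever regulators argue over: the more of the upgrade cost is assigned directly to the large new load, the less lands on residential bills.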

Goldman Sachs reported that US electricity prices jumped 6.9% in 2025 — more than double the headline inflation rate — and forecasts an additional 6% increase through 2027. In areas with heavy data center concentrations, Bloomberg found electricity costs rose 267% over the past five years.

A Consumer Reports survey from May 2026 found that 78% of Americans are concerned that data centers will continue raising their household energy bills. And when told about the "Ratepayer Protection Pledge" — a White House-sponsored agreement signed by Amazon, Google, Meta, Microsoft, OpenAI, Oracle, and xAI pledging to cover their full energy costs — 75% said they were not confident companies would follow through.

But Is AI the Only Culprit?

Here's where the honest answer gets complicated.

US residential electricity prices rose more than 30% between 2021 and early 2026, a trend that began well before ChatGPT launched. Aging grid infrastructure, climate change, coal and natural gas plant closures, and structural issues in regional electricity markets were already pushing bills upward before hyperscalers started adding tens of gigawatts of demand.

SemiAnalysis published a detailed analysis arguing that an obscure capacity auction mechanism in the PJM market — covering 13 mid-Atlantic and Midwest states — accounts for much of the electricity price surge, with data center load playing an amplifying but not singular role.

The most accurate framing: AI data centers are a real and growing contributor to higher bills, but they're landing on top of a system that was already struggling with affordability. Your bills would be lower without the data center boom. They wouldn't be cheap regardless.

What I Actually Found

What surprised me wasn't the scale of the numbers — those are well documented. It was how clearly the politics had already shifted by mid-2026.

In 2023 and 2024, criticism of data center energy use came mostly from environmental advocates and local communities. By 2026, it had become a mainstream voter concern: bipartisan calls in Congress, gubernatorial campaigns in Virginia and New Jersey fought partly on utility affordability, a Maine moratorium on new data center construction, and a White House pledge that would not have existed if this weren't a live political issue at the presidential level.

The practical implication for a household is limited — there's no individual action that fully insulates you from regional rate increases driven by infrastructure decisions made at the utility and regulatory level. The most effective lever is engagement with state public utility commission proceedings, where rate increase requests are actually approved or rejected.

The Virginia man with the $281 bill was right to ask the question. The full answer is complicated, but it starts with the same honest admission: the data centers that run the AI tools we use every day are not free, and someone is paying for them.

Full piece with more detail and sources: lucas8.com/ai-data-center-electric-bill

That QR Code on the Parking Meter Might Be a Scam

Spicy — Tue, 30 Jun 2026 15:35:19 +0000

You scan QR codes constantly without thinking about it — the restaurant menu, the parking meter, the flyer taped to a lamppost. A QR code scam usually doesn't look like a scam at all, which is exactly the problem. It looks like a sticker. It looks like part of the wallpaper.

That gap between how QR codes look and what they can actually do is where quishing — QR code phishing — has quietly become one of the fastest-growing scam categories of 2026.

How QR Code Scams Work

A QR code scam works because the malicious part is invisible until the moment you scan it. Unlike a phishing email, where a suspicious link sits in plain text, a QR code hides its destination inside a pattern of black and white squares that no human can read without a camera.

Attackers exploit that blind spot by printing their own code on a sticker and placing it directly over a legitimate one — on a parking meter, a restaurant table tent, a flyer, or a delivery package. The fake code usually leads to a cloned login page, a fraudulent payment screen, or a malware download.

This isn't a small trend. Quishing incidents jumped 146% in the first quarter of 2026 alone, with nearly 18.7 million cases recorded in March, according to threat intelligence data reported by the Watauga Democrat.

Where It Shows Up Most

Public parking meters are one of the most common targets, since the sticker format is easy to replicate. Police departments in Denver and Austin have documented fake QR stickers placed over legitimate parking codes, redirecting drivers to payment pages that steal card details.

Restaurant table tents, parcel delivery notices, fake court summonses, and event posters round out the most common categories. All rely on the same trick: a context where scanning feels expected, not suspicious.

Email-based quishing has also grown. QR codes embedded in PDF attachments or images slip past traditional phishing filters that only scan visible text. According to the complete quishing guide from Is This QR Safe, this is exactly why security teams describe it as a blind spot in standard email defenses.

Why Even Careful People Fall For It

Most people scan QR codes without checking the destination first, and surveys have found that the majority of consumers can't reliably tell a malicious code from a legitimate one just by looking at it. High trust plus low verification is exactly what gives quishing room to grow even as awareness of regular email phishing improves.

How to Spot a Fake Code Before You Scan

Look at the sticker itself before you look at your phone. A code that's crooked, layered on top of a different sticker, or peeling at one corner is a strong sign someone placed it there after the fact.

Use your phone's built-in camera preview instead of a dedicated scanning app whenever possible. Most modern phones show you the destination URL before opening it — check whether the domain matches who you'd expect.

Dynamic QR codes route through a shortened link before reaching their final destination, which makes the preview less useful on its own. In those cases, look at whether the page that finally loads matches the branding, fonts, and layout you'd expect.
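The "does the domain match who you'd expect" check is the same one a scanner app or kiosk validator could run automatically. A minimal sketch follows, with example-parking.com as a made-up expected domain. It only inspects the URL it is given, so a shortened link still has to be resolved before the check means anything.

```python
from urllib.parse import urlsplit


def host_matches(url: str, expected_domain: str) -> bool:
    """True only for HTTPS URLs whose host is the expected domain or a subdomain of it."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower().rstrip(".")
    expected = expected_domain.lower()
    return host == expected or host.endswith("." + expected)


# Made-up URLs showing the lookalike patterns worth rejecting.
candidates = [
    "https://pay.example-parking.com/zone/12",
    "https://example-parking.com.evil.test/pay",
    "https://example-parking.com@evil.test/pay",
    "http://example-parking.com/pay",
]

for url in candidates:
    print(host_matches(url, "example-parking.com"), url)
```

Comparing the parsed hostname, rather than searching the raw string for the brand name, is what catches the second and third cases, where the legitimate domain appears in the URL but is not the actual host.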

Pause on anything that asks you to log in or enter payment details immediately after scanning. Legitimate parking apps and ordering systems rarely require fresh credentials every single time.

What I Actually Found

What surprised me digging into this wasn't how sophisticated these scams are — it's how little sophistication they need. Most of the fake QR codes documented by police departments weren't elaborate forgeries. They were printer-paper stickers, sometimes a visibly different shade of white than the surface underneath, placed by someone who knew nobody really inspects a parking meter before paying it.

Most security advice focuses on apps and scanner tools that preview links before opening them. Those help, but they miss the simpler habit that actually prevents most of these scams: physically looking at the sticker for two seconds before you scan, not just checking the link after you've already decided to trust it.

If I had to pick one habit to actually keep, it would be treating any QR code on a payment-related surface — meters, parking lots, toll booths — as default-suspicious until proven otherwise. Those are the highest-value, lowest-effort targets for this exact scam.

Full piece with more detail and visuals: lucas8.com/qr-code-scam-parking-meter

Samsung vs LG vs Roku vs Fire TV vs Apple TV: Which One Tracks You the Least?

Spicy — Sun, 28 Jun 2026 15:33:23 +0000

Your smart TV is running a feature you almost certainly never turned on.

It's called ACR — Automatic Content Recognition. It works like Shazam, but instead of identifying songs on request, it runs continuously in the background: capturing screenshots of whatever is on your screen, matching those frames against a content database, and sending that viewing history to advertisers and data brokers. It tracks everything — streaming apps, live TV, and devices plugged into HDMI like your game console or cable box.

In December 2025, Texas Attorney General Ken Paxton sued Samsung, LG, Sony, Hisense, and TCL over ACR, calling it "an uninvited, invisible digital invader." His office obtained temporary restraining orders against Samsung and Hisense while the cases proceed.

Here's how the six major platforms actually compare.

The Comparison at a Glance

Platform	ACR Used	HDMI Tracking	On by Default	Data Sold
Samsung Tizen	✅ Yes	✅ Yes	✅ Yes	✅ Yes
LG webOS	✅ Yes	✅ Yes	✅ Yes	✅ Yes
Roku	✅ Yes	✅ Yes	✅ Yes	✅ Yes
Google TV	⚠️ Platform: No / Brand: Varies	⚠️ Varies	✅ Yes	✅ Yes
Amazon Fire TV	❌ No (HDMI)	❌ No	⚠️ Partial	⚠️ Partial
Apple TV 4K	❌ No	❌ No	N/A	❌ No

Samsung Tizen

Samsung labels ACR "Viewing Information Services" rather than ACR, which is why most users never find it. It's buried under Settings → Support → Terms & Policy, not in the Privacy menu.

Texas obtained a temporary restraining order against Samsung specifically. The court agreed there was sufficient reason to pause data collection while the lawsuit proceeds.

Disable it:

Settings → Support → Terms & Policy → Viewing Information Services → Off
Also disable: Interest-Based Advertisement Services

LG webOS

LG's ACR is called "Live Plus." It defaults to on and tracks content across HDMI inputs.

Disable it:

Settings → All Settings → General → Live Plus → Off
Settings → All Settings → General → About This TV → User Agreements → Viewing Information → Off

Two separate settings, not one. Missing the second one still leaves partial tracking enabled.

Roku

Roku is the most transparent about ACR — its published documentation explicitly states that collected data is shared with third parties and that previously collected data is retained even if you disable the feature later.

Disabling ACR on Roku stops HDMI-based tracking but doesn't affect data collected from Roku's own streaming channels. That requires separate steps.

Disable it:

Settings → Privacy → Smart TV Experience → Use Info from TV Inputs → uncheck
Settings → Privacy → Advertising → Limit Ad Tracking → on
Settings → Privacy → Privacy Choices → disable data sharing

Google TV (Sony, TCL, Philips, Hisense)

Google's platform doesn't use ACR directly. But the individual TV brand (Sony, TCL, etc.) may add its own ACR layer separately. And Google's own data collection from the platform — searches, YouTube viewing, app usage — feeds into the same ad profile used across all your Google-connected devices. There's no opting out of Google's core policies if you want smart TV functionality.

Check for brand-specific ACR under:

Settings → Device Preferences → Samba Interactive TV → Disable
or
Settings → Display & Sound → Intelligent Settings → all off
Settings → Privacy → Usage & Diagnostics → off

Amazon Fire TV

Amazon has publicly confirmed it doesn't use ACR to track content from HDMI-connected devices. That's the key differentiator from every other platform in this list.

It still collects data on what you watch through antenna inputs and Fire TV streaming apps. But the absence of HDMI ACR tracking is a meaningful privacy win.

Tighten it further:

Settings → Preferences → Privacy Settings → Collect App and Over-the-Air Usage Data → off
Settings → Preferences → Privacy Settings → Interest-Based Ads → off

Apple TV 4K

No ACR. Full stop. Apple has confirmed this, and independent research supports it.

Apple does collect some usage data within its own apps, but processes it through differential privacy — anonymized in aggregate before use. Third-party apps must explicitly request tracking permission under tvOS, which you can block globally.

Settings → Privacy → Tracking → Allow Apps to Request to Track → off

One caveat: Apple TV is a streaming box, not a TV. If the physical TV it's plugged into has ACR enabled, that TV's OS is still screenshotting whatever Apple TV displays. The correct setup is Apple TV as your streaming device plus ACR disabled on the TV itself.

The Part Most Guides Miss

Disabling ACR at the OS level doesn't stop individual streaming apps from tracking you — Netflix, YouTube, and every other app collect data separately under their own policies. You need to treat the TV's OS and each app as separate tracking systems.

Also: firmware updates on Samsung, LG, and Roku have been documented to reset privacy settings to their defaults. Set a reminder to recheck after every major update.

The practical hierarchy if privacy matters to you:

Apple TV 4K — no ACR, anonymized data, explicit tracking consent model
Amazon Fire TV — no HDMI ACR, still collects streaming data
Google TV — platform ACR-free, but brand ACR and Google's data model apply
Roku — ACR on by default, unusually transparent documentation
LG webOS — ACR on by default, requires two separate settings to disable
Samsung Tizen — ACR on by default, buried deepest in legal menus

Full breakdown with per-brand step-by-step settings: https://lucas8.com/smart-tv-acr-privacy-comparison

Your Baby Monitor's Biggest Security Flaw Isn't Hackers. It's the Company That Built It.

Spicy — Thu, 25 Jun 2026 16:50:56 +0000

In May 2026, a French ethical hacker named Sammy Azdoufal bought a baby monitor off Amazon and spent a few hours looking at its network traffic. What he found: 1.1 million cameras across 300+ brand names, all running on the same shared platform, accessible to anyone with a free account. No password cracking. No exploit chain. He clicked a URL and got the image.

The vulnerability wasn't a clever attack. It was negligence — hardcoded credentials, an MQTT broker with no per-device access controls, and motion-alert images sitting on an Alibaba OSS bucket with no authentication required.

This is the actual baby monitor security problem. Not a stranger breaking in through your Wi-Fi. The architecture itself.

The White-Label Problem

Most budget smart cameras on Amazon are the same product under different names.

Meari Technology (Hangzhou, China) supplies hardware, software, and cloud infrastructure to 300+ brands. When you buy a monitor you've never heard of — or even some you have — there's a real chance it shares a backend platform with hundreds of other products. The box doesn't disclose which cloud it connects to. The app is often interchangeable across brands.

A flaw at the platform level means millions of devices are exposed simultaneously, regardless of brand name.

Rapid7 tested nine popular baby monitors and gave eight of them an "F" for security. Higher prices didn't correlate with better security — more features meant more attack surface.

What "Encrypted" Actually Means Here

Many monitors market end-to-end encryption. Most don't implement it correctly.

Real E2E encryption: only your device and your app can decrypt the stream — not the manufacturer's servers.

What most Wi-Fi monitors actually do: encrypt the leg from your camera to their servers, and the leg from their servers to your app. The manufacturer sees plaintext at the server level.

The Meari MQTT broker (CVE-2026-33356) let any authenticated platform account subscribe to camera activity across all devices on a regional broker. Azdoufal observed 2,000+ device messages within minutes from a single broker endpoint.
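For contrast, here is roughly what a per-device access check on subscriptions could look like. The topic layout (devices/<device_id>/...), the ownership table, and the function name are hypothetical and not taken from the Meari platform. The sketch only illustrates the rule that an account may subscribe beneath devices it owns and that wildcards may not stand in for the device ID.

```python
from typing import Dict, Set

# Hypothetical account-to-device ownership table.
OWNERSHIP: Dict[str, Set[str]] = {
    "alice": {"cam-001"},
    "bob": {"cam-002", "cam-003"},
}


def may_subscribe(account: str, topic_filter: str) -> bool:
    """Allow a subscription only under devices/<device_id>/ for an owned device."""
    levels = topic_filter.split("/")
    if len(levels) < 2 or levels[0] != "devices":
        return False
    device_id = levels[1]
    # MQTT wildcards in the device position would match other people's cameras.
    if device_id in ("+", "#"):
        return False
    return device_id in OWNERSHIP.get(account, set())


print(may_subscribe("alice", "devices/cam-001/#"))       # True
print(may_subscribe("alice", "devices/+/motion"))        # False
print(may_subscribe("alice", "devices/cam-002/motion"))  # False
print(may_subscribe("alice", "#"))                       # False
```

A broker that skips this check treats "authenticated" as "authorized for everything", which is the failure mode described above.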

What Actually Reduces Risk

Network segmentation — most underused, highest impact. Put your baby monitor on a guest Wi-Fi VLAN, isolated from your main devices. A compromised monitor can't pivot to your laptop or NAS. Takes 5 minutes on most home routers.

Firmware updates over passwords — a strong password won't protect you if the backend has an unpatched CVE. Enable auto-updates or check monthly. This matters more than password complexity on a vulnerable platform.

Brand selection criteria that actually signal security:

Published vulnerability disclosure program or bug bounty
Documented patch history (not just "we take security seriously")
Privacy policy that specifies data collected during idle operation — not just when you're actively viewing

What doesn't move the needle as much:

Strong passwords alone on a platform-level vulnerability
"No cloud" marketing that still phones home for firmware and analytics
Price as a proxy for security

The Regulatory Gap

The FCC's US Cyber Trust Mark — a voluntary IoT security labeling program — is still in development. "Good cybersecurity is invisible to consumers. They can't tell what products are risky," Stacey Higginbotham of Consumer Reports said following the Meari disclosure.

Until mandatory baseline security requirements exist for IoT devices sold in the US, the research burden falls on buyers. Consumer Reports now runs security and privacy evaluations on baby monitors specifically — worth checking before purchase.

Practical Checklist

[ ] Monitor is on a dedicated guest/IoT VLAN
[ ] Firmware auto-updates enabled (or manual check set monthly)
[ ] Default credentials changed immediately on setup
[ ] Brand has published vulnerability disclosure program
[ ] Privacy policy reviewed for third-party data sharing
[ ] App permissions audited (disable anything not needed)

The Meari disclosure made the problem concrete at a scale that's hard to ignore. 1.1 million cameras. Five critical CVEs. Footage of children's bedrooms accessible to anyone with a free account and a few minutes.

That's not a hacking story. It's a product design story.

Full breakdown: https://lucas8.com/baby-monitor-privacy-risks

Two-Factor Auth Isn't the Shield You Think It Is.

Spicy — Mon, 22 Jun 2026 13:58:28 +0000

You enabled two-factor authentication. Good call. But here's what most security guides skip: the biggest account takeovers of the past three years didn't require attackers to crack your 2FA code at all. They found ways to make you hand it over — or made the authentication step irrelevant entirely.

MFA fatigue attacks are now one of the most documented techniques in real breaches, and they work precisely because 2FA gave everyone a false sense of being done.

Why 2FA Still Gets Beaten in 2026

Traditional 2FA was designed to stop credential stuffing — someone stealing your password and trying to log in. For that specific threat, it works brilliantly. Microsoft's data shows MFA blocks over 99% of automated credential attacks.

The problem is the threat didn't stay still. Attackers noticed 2FA made the password less valuable, so they shifted focus: steal the session, exhaust the user, or reroute the auth flow entirely. These aren't exotic techniques. They show up in the Verizon DBIR every year, and the companies hit aren't careless ones.

The Three Techniques Hackers Use to Walk Past MFA

1. MFA Fatigue (Push Bombing)

An attacker who has your username and password triggers your push notification repeatedly — sometimes dozens of times overnight. Most people eventually tap "Approve" to make it stop. That single tap hands over a valid session token.

This exact method was used to breach Uber in 2022. The attacker bought leaked credentials, bombed the contractor's phone with push requests, then sent a WhatsApp message posing as Uber IT saying "approve once and it'll stop." The contractor did. Game over.

Mitigation: Enable number matching on push apps. You type a two-digit code shown on your screen into the push prompt — blind approvals stop working immediately. Microsoft Authenticator and Duo both support this and Microsoft enabled it by default across Entra ID in 2023.
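
A minimal sketch of why number matching defeats blind approvals, assuming a hypothetical server that shows a two-digit number on the login screen and expects the same number back from the authenticator app. The function names are illustrative and not any vendor's API:

```python
import hmac
import secrets
from typing import Optional


def new_push_challenge() -> str:
    """Return the two-digit number shown on the login screen."""
    return f"{secrets.randbelow(100):02d}"


def verify_push_response(expected: str, entered: Optional[str]) -> bool:
    """Approve only if the user typed the number shown on the login screen."""
    if entered is None:  # a blind "Approve" tap carries no number
        return False
    # Compare as bytes in constant time
    return hmac.compare_digest(expected.encode(), entered.strip().encode())
```

A user being push-bombed never sees the attacker's login screen, so they have no number to type and the approval cannot succeed by accident.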

2. Adversary-in-the-Middle (AiTM) Attacks

An attacker sets up a proxy that looks identical to a real login page. When you log in through it:

The proxy relays your credentials to the real site
The real site sends back a 2FA challenge — the proxy relays that too
You enter your code — the proxy captures your authenticated session cookie
The attacker replays that cookie from a clean browser, no further auth required

You → [Fake Proxy] → Real Site
             ↓
      Session Cookie stolen
             ↓
Attacker → [Replays Cookie] → Logged In

The session cookie is the key. Your 2FA was used — legitimately — to create a session the attacker now owns. The Verizon 2025 DBIR flags stolen session tokens as a growing proportion of breach vectors for exactly this reason.

Mitigation: Only passkeys and hardware security keys resist this. They use cryptographic proofs tied to the exact origin domain, so a credential registered for the real site will not produce a valid login for the proxy's lookalike domain, and the proxied login fails.
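
The origin binding can be illustrated with one step of WebAuthn assertion verification: checking the `clientDataJSON` that the browser signs over. This is a partial sketch only (signature and authenticator-data checks are omitted), and the function names are illustrative:

```python
import base64
import json


def b64url_decode(data: str) -> bytes:
    # Restore the padding that base64url encoding strips
    return base64.urlsafe_b64decode(data + "=" * (-len(data) % 4))


def check_client_data(client_data_json: bytes, expected_challenge: bytes,
                      expected_origin: str) -> bool:
    """Check the type, challenge, and origin fields of WebAuthn clientDataJSON.

    Only one step of assertion verification; the signature check over
    authenticatorData and the clientDataJSON hash is omitted here.
    """
    client_data = json.loads(client_data_json)
    if client_data.get("type") != "webauthn.get":
        return False
    if b64url_decode(client_data.get("challenge", "")) != expected_challenge:
        return False
    # The browser, not the page, fills in the origin it actually loaded.
    return client_data.get("origin") == expected_origin
```

Because the browser records the origin the user is really on, a login relayed through a phishing domain carries that domain in the signed data and is rejected by the real site.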

3. SIM Swapping

An attacker calls your carrier posing as you — armed with info from data brokers or social media — and convinces a rep to transfer your number to a SIM they control. From that point, every SMS code sent to your phone goes to them instead.

No malware. Nothing on your device. The FTC has received thousands of SIM swap reports annually since 2021.

Mitigation: Move off SMS 2FA entirely for anything financial or high-value. Authenticator apps (Google Authenticator, Aegis, Authy) generate codes locally on your device — a SIM swap can't intercept them.
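
To show why a SIM swap cannot intercept these codes, here is a minimal sketch of TOTP generation as specified in RFC 6238 (HMAC-SHA1 variant). The code is derived only from a shared secret and the current time, so nothing travels over the phone network. The function name and defaults are illustrative:

```python
import hashlib
import hmac
import struct
import time


def totp(secret: bytes, at_time=None, digits: int = 6, step: int = 30) -> str:
    """Compute a time-based one-time password (RFC 6238, HMAC-SHA1)."""
    if at_time is None:
        at_time = time.time()
    counter = int(at_time // step)  # number of 30-second steps since the epoch
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

For the RFC 6238 test secret `b"12345678901234567890"` at time 59 with 8 digits, this yields `94287082`, matching the published test vector.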

The 2FA Method Comparison

Method	Stops Credential Stuffing	Stops Push Bombing	Stops AiTM	Stops SIM Swap
SMS OTP	✅	✅	❌	❌
Authenticator App (TOTP)	✅	✅	❌	✅
Push (standard)	✅	❌	❌	✅
Push + Number Matching	✅	✅	❌	✅
Hardware Key / Passkey	✅	✅	✅	✅

What "Phishing-Resistant" Actually Means

Phishing-resistant MFA uses public-key cryptography where your device proves it's physically present at the real domain. There's no code to intercept, no push to trick you into approving, no SMS to redirect.

Both hardware security keys (YubiKey etc.) and passkeys qualify. FIDO Alliance reports 5 billion active passkeys worldwide as of 2026, and 68% of organizations surveyed are actively deploying them for employee sign-in. Google, Apple, and Microsoft have all made passkeys the default for new accounts.

The Practical Upgrade Path

You don't need to switch everything at once:

This week: Remove SMS 2FA from any financial account. Switch to an authenticator app — takes ~30 seconds per account in settings.

This month: Enable number matching on any push-based MFA app. Check your organization's Entra ID or Okta settings to confirm it's on.

When ready: Add a hardware key ($25–55 for a basic YubiKey) for email, password manager, and any admin access. Set up passkeys on Google, Apple, and GitHub — each takes under three minutes.

The goal isn't to scare you off 2FA — it's to make sure what you're running matches the current threat, not the threat from 2019. Getting off SMS and onto an authenticator app this week already moves you out of the most exploited tier.

Full breakdown with real breach examples: https://lucas8.com/mfa-fatigue-attack-two-factor-bypass

Browser Fingerprinting in Practice — The Signals, the Math, and What Actually Defeats It

Spicy — Thu, 18 Jun 2026 16:38:31 +0000

Most privacy advice still centers on cookies — clear them, block them, use incognito. Meanwhile, fingerprinting has become the dominant tracking method precisely because it doesn't touch cookies at all. Here's what's actually happening at the API level, and what countermeasures hold up.

The Core Signals (And the Code Behind Them)

Canvas fingerprinting exploits subtle rendering differences between GPU/driver combinations:

function getCanvasFingerprint() {
  const canvas = document.createElement('canvas');
  const ctx = canvas.getContext('2d');

  // Draw text with specific font, size, and emoji — 
  // rendering varies by OS font rasterizer and GPU
  ctx.textBaseline = 'top';
  ctx.font = '14px Arial';
  ctx.fillStyle = '#f60';
  ctx.fillRect(125, 1, 62, 20);
  ctx.fillStyle = '#069';
  ctx.fillText('Cwm fjordbank glyphs vext quiz 🎮', 2, 15);
  ctx.fillStyle = 'rgba(102, 204, 0, 0.7)';
  ctx.fillText('Cwm fjordbank glyphs vext quiz 🎮', 4, 17);

  // Serialize the rendered pixels; hashFingerprint() below hashes the result
  return canvas.toDataURL();
}

// Hash the output for compact comparison
async function hashFingerprint(dataUrl) {
  const encoder = new TextEncoder();
  const data = encoder.encode(dataUrl);
  const hashBuffer = await crypto.subtle.digest('SHA-256', data);
  return Array.from(new Uint8Array(hashBuffer))
    .map(b => b.toString(16).padStart(2, '0')).join('');
}

The same code produces different pixel-level output across GPU vendors (NVIDIA vs AMD vs Apple Silicon), driver versions, and even font hinting settings — none of which the user can see, but all of which produce a consistent hash per device.

WebGL fingerprinting goes deeper into hardware identification:

function getWebGLFingerprint() {
  const canvas = document.createElement('canvas');
  const gl = canvas.getContext('webgl') || canvas.getContext('experimental-webgl');
  if (!gl) return null;

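  // The extension can be unavailable (blocked or masked by privacy settings)
  const debugInfo = gl.getExtension('WEBGL_debug_renderer_info');
  return {
    vendor: debugInfo ? gl.getParameter(debugInfo.UNMASKED_VENDOR_WEBGL) : null,
    renderer: debugInfo ? gl.getParameter(debugInfo.UNMASKED_RENDERER_WEBGL) : null,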
    // e.g. "ANGLE (NVIDIA, NVIDIA GeForce RTX 4070 Direct3D11..."
    extensions: gl.getSupportedExtensions(),
    maxTextureSize: gl.getParameter(gl.MAX_TEXTURE_SIZE),
  };
}

This often reveals your exact GPU model, which on its own significantly narrows the population of matching devices.

AudioContext fingerprinting uses hardware-dependent audio processing variance:

async function getAudioFingerprint() {
  // Render offline: nothing is played, and no user gesture is needed
  const audioCtx = new OfflineAudioContext(1, 44100, 44100); // mono, 1 s at 44.1 kHz
  const oscillator = audioCtx.createOscillator();
  const compressor = audioCtx.createDynamicsCompressor();

  oscillator.type = 'triangle';
  oscillator.frequency.value = 10000;
  oscillator.connect(compressor);
  compressor.connect(audioCtx.destination);
  oscillator.start(0);

  // Wait for the graph to be processed before reading any samples;
  // reading right after start() would return no meaningful data
  const rendered = await audioCtx.startRendering();
  const samples = rendered.getChannelData(0);

  return Array.from(samples.slice(4500, 4530)).join(','); // sample of rendered output
}

Font enumeration via measurement comparison (no direct font list API exists, so it's inferred):

function detectFonts(testFonts) {
  const baseFonts = ['monospace', 'sans-serif', 'serif'];
  const testString = 'mmmmmmmmmmlli';
  const testSize = '72px';
  const span = document.createElement('span');
  span.style.fontSize = testSize;
  span.textContent = testString;
  document.body.appendChild(span);

  const baseWidths = {};
  baseFonts.forEach(font => {
    span.style.fontFamily = font;
    baseWidths[font] = span.offsetWidth;
  });

  const detected = testFonts.filter(font => {
    return baseFonts.some(base => {
      span.style.fontFamily = `'${font}', ${base}`;
      return span.offsetWidth !== baseWidths[base];
    });
  });

  document.body.removeChild(span);
  return detected;
}

Composite Scoring — How These Combine

A single signal rarely identifies anyone uniquely. The entropy comes from combining them:

import math

def calculate_entropy(signal_distribution: dict) -> float:
    """
    Shannon entropy in bits (higher = more identifying)
    e.g. a value shared by only 1 in 1000 users carries
    log2(1000) ≈ 10 bits of identifying information
    """
    total = sum(signal_distribution.values())
    entropy = 0
    for count in signal_distribution.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy

# Example combined fingerprint entropy
signals = {
    'screen_resolution': 4.2,      # bits
    'canvas_hash': 8.1,
    'webgl_renderer': 6.7,
    'font_list': 5.9,
    'timezone': 2.1,
    'audio_fingerprint': 5.3,
}

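total_entropy = sum(signals.values())  # ~32.3 bits
# Summing assumes the signals are independent; real signals correlate
# (e.g. GPU model and canvas hash), so treat this as an upper bound.
# 2^32.3 ≈ 5.3 billion possible combinations
# Global population ~8 billion, so this fingerprint
# alone approaches unique identification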

This is why the EFF Panopticlick research consistently found most tested browsers had fingerprints unique among hundreds of thousands of samples — the combinatorial entropy adds up fast even when no individual signal is rare.
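
As a rough worked example of what a given number of bits means in practice, the expected number of other users sharing one fingerprint can be estimated. This assumes every fingerprint value is equally likely, which is a simplification, and the function name is illustrative:

```python
def expected_matches(entropy_bits: float, population: int) -> float:
    """Expected number of *other* users sharing one fingerprint value,
    assuming all 2**entropy_bits values are equally likely."""
    return (population - 1) / 2 ** entropy_bits
```

With 32.3 bits and a population of 8 billion this gives roughly 1.5 expected matches, which is the sense in which such a fingerprint "approaches" unique identification.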

What Actually Reduces Entropy (Tested)

Firefox privacy.resistFingerprinting standardizes the highest-entropy signals. Testing before/after on the same machine with amiunique.org:

Signal	Default Firefox	resistFingerprinting
Canvas	Unique hash	Blocked/randomized
Timezone	Local (e.g. PST)	UTC
Screen resolution	Exact (e.g. 1512x982)	Rounded (1500x950)
Fonts detected	40+ system fonts	~12 bundled fonts
WebGL renderer	Full GPU string	Generic string

The tradeoff: sites relying on accurate viewport dimensions for layout can render incorrectly, and some canvas-dependent web apps (image editors, games) break entirely.

Brave's fingerprinting protection takes a different approach: randomizing per-session rather than blocking.

Tor Browser standardizes nearly everything to a single shared profile across all users, which is the only approach that achieves near-zero fingerprint uniqueness, at the cost of significant performance and compatibility overhead.
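
The per-session randomization idea mentioned above for Brave can be sketched as follows. This is not Brave's implementation, only an illustration of the general technique: derive a seed from a per-session key and the site, then nudge the lowest bit of some output bytes. The result is stable within one session and site, but different across sessions. All names here are hypothetical:

```python
import hashlib
import hmac
import secrets

SESSION_KEY = secrets.token_bytes(32)  # regenerated every browser session


def farble(pixels: bytes, site: str, session_key: bytes = SESSION_KEY) -> bytes:
    """Flip the lowest bit of some bytes, deterministically per (session, site)."""
    seed = hmac.new(session_key, site.encode(), hashlib.sha256).digest()
    out = bytearray(pixels)
    for i in range(len(out)):
        # Use one bit of the (repeating) seed per byte to decide whether to flip.
        if (seed[(i // 8) % len(seed)] >> (i % 8)) & 1:
            out[i] ^= 1
    return bytes(out)
```

Each value changes by at most 1, which is invisible to the eye but enough to change the hash a tracker computes from the canvas output.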

Detection-Side: If You're Building Anti-Fraud Systems

For legitimate use cases (fraud detection, not tracking users for ads), fingerprinting libraries like FingerprintJS provide production-ready implementations:

import FingerprintJS from '@fingerprintjs/fingerprintjs';

const fpPromise = FingerprintJS.load();

(async () => {
  const fp = await fpPromise;
  const result = await fp.get();
  console.log(result.visitorId); // stable identifier
  console.log(result.confidence.score); // 0-1 reliability
})();

Worth noting: privacy-focused browsers and extensions specifically target known fingerprinting libraries, so production fraud detection systems increasingly see degraded signal quality from privacy-conscious users — which itself becomes a (weaker) signal.

Practical Takeaway

Cookie-based privacy controls (clearing cookies, incognito mode, cookie blockers) have zero effect on any of the above. If you're building privacy-respecting infrastructure or just hardening your own setup, the signals that matter are canvas, WebGL, audio context, and font enumeration — and the only consistently effective countermeasures are resistFingerprinting-style signal normalization or full standardization (Tor).

Consumer-level explanation without the code: lucas8.com/incognito-mode-browser-fingerprinting

Data Brokers: What They Collect, How the Industry Works, and How to Opt Out at Scale

Spicy — Sun, 14 Jun 2026 15:16:43 +0000

Most developers know abstractly that data brokers exist. Fewer have actually looked up their own profile and seen what's there — their home address, every previous address, relatives' names and addresses, income estimate, vehicle history, court records, and consumer interest categories.

Here's how the data pipeline actually works, what a profile contains at the data level, and how to approach opt-outs at scale rather than one form at a time.

How Data Broker Pipelines Work

Data brokers aggregate from three primary source categories:

Public records — property records, voter registration, court filings, professional licenses, business registrations, marriage/divorce records, death records. These are legally public in the US and most other countries. Brokers ingest them continuously via bulk data agreements with county, state, and federal agencies.

Commercial data — purchase history from retailers (via loyalty programs and direct sales), subscription records, warranty registrations, financial transaction metadata (purchased from banks and credit card processors), insurance records, telecommunications data.

Third-party data — scraped from social media and public web, purchased from other data brokers (the industry extensively resells to itself), purchased from app developers who include data-sharing SDKs.

The aggregation logic:

# Simplified version of identity resolution logic
# (what brokers call "entity resolution" or "data matching")

def resolve_identity(records: list[dict]) -> list[PersonProfile]:
    """
    Match records across sources using probabilistic 
    identity resolution — name + address + DOB + phone
    combinations weighted by confidence score
    """
    clusters = []

    for record in records:
        matched = False
        for cluster in clusters:
            if identity_match_score(record, cluster) > THRESHOLD:
                cluster.merge(record)
                matched = True
                break
        if not matched:
            clusters.append(PersonCluster(record))

    return [cluster.to_profile() for cluster in clusters]

def identity_match_score(record, cluster) -> float:
    score = 0.0
    # Records are dicts (see resolve_identity), so use key access
    if fuzzy_name_match(record["name"], cluster.names): score += 0.4
    if record["address"] in cluster.addresses: score += 0.3
    if record["dob"] == cluster.dob: score += 0.2
    if record["phone"] in cluster.phones: score += 0.1
    return score

This is why a data broker profile contains people you've lived with — shared address history creates a probabilistic link that their systems treat as a relationship signal.
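
A self-contained, runnable variant of the matching logic above. The sketch makes several assumptions: the threshold and weights are illustrative, the fuzzy match uses the standard library's `difflib` as a stand-in, and the `Cluster` class is hypothetical. Real broker systems are not public:

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher

THRESHOLD = 0.6  # illustrative cut-off, not an industry value


@dataclass
class Cluster:
    names: set = field(default_factory=set)
    addresses: set = field(default_factory=set)
    dobs: set = field(default_factory=set)
    phones: set = field(default_factory=set)

    def merge(self, rec: dict) -> None:
        self.names.add(rec["name"])
        for key, bucket in (("address", self.addresses),
                            ("dob", self.dobs), ("phone", self.phones)):
            if rec.get(key):
                bucket.add(rec[key])


def fuzzy_name_match(name: str, names: set, cutoff: float = 0.8) -> bool:
    return any(SequenceMatcher(None, name.lower(), n.lower()).ratio() >= cutoff
               for n in names)


def match_score(rec: dict, cluster: Cluster) -> float:
    score = 0.0
    if fuzzy_name_match(rec["name"], cluster.names): score += 0.4
    if rec.get("address") in cluster.addresses: score += 0.3
    if rec.get("dob") in cluster.dobs: score += 0.2
    if rec.get("phone") in cluster.phones: score += 0.1
    return score


def resolve(records: list) -> list:
    clusters = []
    for rec in records:
        # Join the first cluster that scores above the threshold, else start one
        target = next((c for c in clusters if match_score(rec, c) > THRESHOLD), None)
        if target is None:
            target = Cluster()
            clusters.append(target)
        target.merge(rec)
    return clusters
```

Fed "Jane Smith" and "Jane A. Smith" at the same address plus "Robert Smith" at that address, this produces two clusters. The shared address between them is what a broker would surface as an "associate" link.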

What's Actually in a Profile (Data Schema)

A typical commercial data broker profile at the API level:

{
  "person": {
    "names": ["Jane Smith", "Jane A. Smith", "Jane Adams"],
    "dob_range": {"min": "1985-01-01", "max": "1985-12-31"},
    "phones": ["+15551234567", "+15559876543"],
    "emails": ["jane@gmail.com", "jane.smith@oldwork.com"]
  },
  "locations": [
    {
      "address": "123 Main St, Austin TX 78701",
      "type": "current",
      "confidence": 0.94,
      "date_range": {"from": "2021-03", "to": "present"}
    },
    {
      "address": "456 Oak Ave, Denver CO 80203", 
      "type": "previous",
      "confidence": 0.87,
      "date_range": {"from": "2018-06", "to": "2021-02"}
    }
  ],
  "associates": [
    {
      "name": "Robert Smith",
      "relationship": "relative",
      "confidence": 0.76,
      "shared_addresses": ["123 Main St, Austin TX 78701"]
    }
  ],
  "financials": {
    "income_estimate": {"min": 75000, "max": 100000},
    "net_worth_estimate": {"min": 50000, "max": 150000},
    "homeowner": true,
    "property_value": 385000
  },
  "records": {
    "criminal": [],
    "civil": [],
    "bankruptcies": [],
    "liens": []
  },
  "consumer_segments": [
    "health_conscious_shopper",
    "frequent_traveler", 
    "suburban_homeowner",
    "political_donor_democrat"
  ]
}

The consumer_segments field is the advertising product — these interest/demographic categories are what marketers buy. The address and associate data is what stalkers, scammers, and PI firms buy.

The Opt-Out Landscape

There are approximately 4,000 data brokers. Manually opting out of each one is not realistic. The practical approach is tiered:

Tier 1 — High-traffic consumer-facing sites (manual opt-out, highest priority)

Site	Opt-Out URL	Method	TTL
Spokeo	spokeo.com/optout	Email form	~3-6 months
WhitePages	whitepages.com/suppression_requests	Web form	~3-6 months
BeenVerified	beenverified.com/opt-out	Web form	~3-6 months
MyLife	mylife.com	Phone call required	~3-6 months
Radaris	radaris.com/page/privacy	Email form	~3-6 months
Intelius	intelius.com/optout	Web form	~3-6 months

TTL = time before listing typically reappears from re-aggregation.

Tier 2 — Automated opt-out via paid services

Services like DeleteMe, Incogni, and Privacy Bee submit opt-outs across 100–750 brokers and resubmit on a schedule. Worth the cost if you're doing this for yourself or building it into a product for users.

Tier 3 — Enterprise data brokers (requires legal process)

Acxiom, LexisNexis, CoreLogic, Equifax (non-credit), TransUnion Marketing — these serve enterprise customers and have different opt-out mechanisms. Acxiom has an opt-out at aboutthedata.com. LexisNexis requires a written request with ID verification. California CCPA requests get the fastest response for these sources.

Automating Opt-Out Submissions

For the manual tier, the process is repetitive and automatable for the sites that use web forms rather than email or phone:

// Playwright automation for form-based opt-outs
// (Shown for educational purposes — 
//  check each site's ToS before automating)

const { chromium } = require('playwright');

async function submitOptOut(site, profileUrl, email) {
  const browser = await chromium.launch({ headless: false });
  try {
    const page = await browser.newPage();

    // Selectors below are illustrative; inspect each live form before relying on them
    switch (site) {
      case 'spokeo':
        await page.goto('https://www.spokeo.com/optout');
        await page.fill('#email', email);
        await page.fill('#profile_url', profileUrl);
        await page.click('[type="submit"]');
        break;

      case 'radaris':
        await page.goto('https://radaris.com/page/privacy');
        await page.fill('input[name="email"]', email);
        await page.fill('input[name="url"]', profileUrl);
        await page.click('.submit-btn');
        break;

      default:
        throw new Error(`No opt-out flow defined for site: ${site}`);
    }
  } finally {
    // Always release the browser, even if a step above throws
    await browser.close();
  }
}

// Rate limit to avoid triggering bot detection
async function batchOptOut(profiles) {
  for (const profile of profiles) {
    await submitOptOut(profile.site, profile.url, profile.email);
    await new Promise(r => setTimeout(r, 2000 + Math.random() * 3000));
  }
}

The main friction points: CAPTCHA on some forms, email confirmation required on most, and a few sites require the user to find their own profile URL first (can't just submit a name).

CCPA as a Lever

For California residents (other US residents can still send requests to California-based brokers, though those brokers are not obliged to honor them), the CCPA gives individuals the right to:

Know what data is collected about them
Request deletion
Opt out of sale

Submitting a CCPA deletion request often gets faster and more thorough responses than the standard opt-out form, even from brokers that theoretically don't have to respond. Enabling the Global Privacy Control (GPC) signal in your browser also sends an opt-out header with every request. It is legally recognized in California and several other states:

// GPC header: supported by Firefox and Brave natively
// Read-only from page script (the user enables it in browser settings):
navigator.globalPrivacyControl // true if GPC enabled

// For server-side requests:
headers['Sec-GPC'] = '1'
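
On the receiving side, honoring the signal means checking for the `Sec-GPC: 1` request header. A minimal sketch, assuming the request headers arrive as a plain dict; the function name is illustrative:

```python
def gpc_opt_out(headers: dict) -> bool:
    """Return True when the request carries a Global Privacy Control signal."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {k.lower(): v.strip() for k, v in headers.items()}
    return normalised.get("sec-gpc") == "1"
```

A service could call this early in request handling and treat a `True` result as an opt-out of sale or sharing for that visitor.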

The Realistic Picture

Opting out manually from the top 10-15 consumer-facing sites takes about 2-3 hours and provides meaningful short-term privacy improvement, particularly for address exposure. The data typically comes back in 3-6 months.

The deeper problem is that the legal architecture in the US makes this a whack-a-mole exercise until federal privacy legislation passes. For users who need durable protection — domestic violence survivors, public figures, journalists — the paid services plus CCPA requests plus synthetic identity strategies (PO boxes, registered agents for property) are the more serious toolkit.

Consumer-facing explanation of the same topic, including the exact opt-out steps for each major site: lucas8.com/data-broker-opt-out-guide

How AI Phishing Emails Are Built (And the One Pattern That Always Gives Them Away)

Spicy — Sun, 14 Jun 2026 14:24:56 +0000

Most phishing detection advice is now actively harmful. Teaching users to look for typos and generic greetings made sense when phishing was a spray-and-pray operation running on bad grammar. That era is over.

Here's how modern AI phishing is actually constructed, what signals remain reliable, and how to implement detection logic that accounts for the current threat model.

How an AI Spear Phishing Email Gets Built

The workflow an attacker follows in 2026 is largely automated:

Step 1: Target reconnaissance

# Typical OSINT data sources for a targeted attack
sources = {
    "linkedin": "job title, manager name, team structure, tenure",
    "company_website": "email format (first.last@company.com), press releases",
    "social_media": "recent posts, projects mentioned, travel",
    "data_broker": "personal email, phone, home address",
    "previous_breaches": "password patterns, security question answers"
}

All of this is publicly available or purchasable. A targeted attack on a finance manager will include their correct name, their CFO's actual name, and may reference a real business event pulled from a press release.

Step 2: Prompt engineering for the attack

The attacker doesn't write the email. They prompt a model, and the output is indistinguishable from a real internal email.

Step 3: Infrastructure

Lookalike domains are registered with realistic names (company-billing.com, companyfinance.io), SSL certificates acquired (free via Let's Encrypt — the padlock means nothing), and emails sent through legitimate SMTP infrastructure to pass basic spam filters.

What Traditional Detection Gets Wrong

The signals security training still teaches:

Signal	Why It Fails Now
Typos / bad grammar	LLMs produce perfect prose
Generic greeting	OSINT provides correct names trivially
Unknown sender	Lookalike domains pass visual inspection
Suspicious links	Links go to legitimate sites that redirect
Urgency alone	Legitimate emails also have urgency

None of these are reliable discriminators in 2026.

What Still Works: Authentication Layer Checks

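SPF, DKIM, and DMARC operate at the email infrastructure level, and an attacker can't produce passing results for your domain without compromising it. They only vouch for the domain that actually sent the message, though: a lookalike domain can pass all three for itself, so always compare the authenticated domain with the one you expect.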

# Check authentication results for a received email
# Look for these headers in the raw message

# SPF: did the email originate from an authorized server?
Received-SPF: pass (google.com: domain of cfo@company.com designates 
  198.51.100.1 as permitted sender)

# DKIM: was the email cryptographically signed by the domain?
DKIM-Signature: v=1; a=rsa-sha256; d=company.com; s=selector1;

# DMARC: does the domain's policy require both to pass?
Authentication-Results: mx.google.com;
  dkim=pass header.d=company.com;
  spf=pass smtp.mailfrom=company.com;
  dmarc=pass (p=REJECT)

A legitimate internal email from your CFO should pass all three. Any failure is a hard signal — not a soft one. Most attackers can't pass DMARC on the domain they're spoofing without compromising it directly.

Programmatic header parsing:

// Parse authentication results from email headers
function parseAuthResults(headers) {
  const authHeader = headers['authentication-results'] || '';

  return {
    // \w+ also captures results such as none, temperror, and permerror
    spf: authHeader.match(/\bspf=(\w+)/)?.[1] || 'missing',
    dkim: authHeader.match(/\bdkim=(\w+)/)?.[1] || 'missing',
    dmarc: authHeader.match(/\bdmarc=(\w+)/)?.[1] || 'missing',
  };
}

function isAuthenticationSuspicious(authResults) {
  const { spf, dkim, dmarc } = authResults;
  // Any failure on a supposedly internal or financial email = flag
  return spf !== 'pass' || dkim !== 'pass' || dmarc !== 'pass';
}
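
Passing results only say that some domain authenticated the message. Below is a sketch of the follow-up comparison against the visible From domain. It uses a naive "last two labels" rule in place of a real Public Suffix List lookup, an assumption that mishandles suffixes such as co.uk. All function names are illustrative:

```python
def org_domain(domain: str) -> str:
    """Naive organisational domain: last two labels (no Public Suffix List)."""
    return ".".join(domain.lower().rstrip(".").split(".")[-2:])


def is_aligned(from_domain: str, authenticated_domain: str) -> bool:
    """Relaxed alignment: both share the same organisational domain."""
    return org_domain(from_domain) == org_domain(authenticated_domain)


def from_matches_expected(from_domain: str, authenticated_domain: str,
                          expected_domains: set) -> bool:
    """True only if the mail is aligned AND comes from a domain we expect."""
    expected = {org_domain(d) for d in expected_domains}
    return (is_aligned(from_domain, authenticated_domain)
            and org_domain(from_domain) in expected)
```

A message from company-billing.com can be perfectly aligned with itself and still fail `from_matches_expected` when the only expected domain is company.com.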

The Signal That Doesn't Depend on Content

Authentication checks require access to headers. The signal that works at the human layer — and that AI cannot defeat — is the request pattern.

Legitimate organizations have consistent behavioral signatures:

const PHISHING_REQUEST_PATTERNS = [
  'wire transfer outside normal approval chain',
  'request for credentials or MFA codes via email',
  'urgency to bypass standard process',
  'confidentiality instruction (do not tell X)',
  'new payment method or vendor not in system',
  'action requested on behalf of unavailable approver',
];

// The key insight: legitimate urgent requests
// arrive through established channels with context.
// Phishing creates the urgency in the email itself.

This pattern holds regardless of how the email is written. AI can generate perfect prose but cannot change the fact that a real CFO initiating a real wire transfer uses the company's actual payment system, not a direct email to a finance manager with a new bank account.

Building a Detection Heuristic

For teams building email security tooling or internal automation:

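def phishing_risk_score(email, legitimate_domains):
    score = 0

    # Authentication failures (high weight)
    if email.spf != 'pass': score += 40
    if email.dkim != 'pass': score += 30
    if email.dmarc != 'pass': score += 30

    # Domain analysis (is_lookalike_domain needs the list of real domains)
    if is_lookalike_domain(email.sender_domain, legitimate_domains): score += 50
    # Only penalize a Reply-To that is present and differs from From
    if email.reply_to and email.reply_to != email.from_address: score += 25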

    # Request pattern signals (content analysis)
    content = email.body.lower()
    if any(phrase in content for phrase in [
        'wire transfer', 'bank account', 'routing number'
    ]): score += 20
    if any(phrase in content for phrase in [
        'urgent', 'immediately', 'today only', 'close of business'
    ]): score += 10
    if any(phrase in content for phrase in [
        'keep this confidential', "don't mention", 'just between us'
    ]): score += 30

    # High score = route to additional verification, not auto-block
    return score

def is_lookalike_domain(domain, legitimate_domains):
    from jellyfish import jaro_winkler_similarity
    return any(
        jaro_winkler_similarity(domain, legit) > 0.85 
        and domain != legit
        for legit in legitimate_domains
    )

The key design decision: high-risk emails should trigger an out-of-band verification requirement, not an auto-block. Auto-blocking has false positive costs; requiring phone verification for flagged financial requests has almost none.

The Defense That Defeats All Variants

Out-of-band verification: any email requesting financial action, credential changes, or process exceptions gets verified via phone call to a number already on record.

This rule is architecturally sound because it breaks the attack at the social engineering layer regardless of how convincing the email is. It doesn't matter how good the AI gets at writing emails — it can't intercept a phone call the target initiates to a known number.

The consumer version of this — what non-technical users should watch for — is at lucas8.com/how-to-spot-ai-phishing-email

What a VPN Actually Does (And Why Most Devs Use It Wrong)

Spicy — Thu, 11 Jun 2026 17:02:35 +0000

Every developer I know has a VPN. Most of them have it running while they're logged into Google, sending data through Chrome, and using apps that do their own certificate pinning — which means the VPN is protecting approximately nothing meaningful in that moment.

This isn't a knock on VPNs. It's a scoping problem. Here's what the tool actually covers, what leaks around it, and how to test it properly.

What's Actually Happening at the Network Layer

A VPN operates at Layer 3 (Network) of the OSI model. It creates an encrypted tunnel — typically using WireGuard, OpenVPN, or IKEv2/IPSec — between your device and a VPN server. All IP traffic gets routed through that tunnel.

Protocol comparison:

Protocol	Speed	Security	Port	Notes
WireGuard	⭐⭐⭐ Fast	⭐⭐⭐ Strong	UDP 51820	Modern, audited, ~4000 lines of code
OpenVPN	⭐⭐ Medium	⭐⭐⭐ Strong	TCP 443 / UDP 1194	Battle-tested, ~100k lines
IKEv2/IPSec	⭐⭐⭐ Fast	⭐⭐ Good	UDP 500/4500	Native on mobile, good reconnect
L2TP/IPSec	⭐ Slow	⭐ Weak	UDP 1701	Avoid — potentially compromised

WireGuard is the correct choice in 2026 unless you have a specific reason not to use it. Smaller codebase = smaller attack surface. Most audited VPN providers now use it by default.

What Leaks Around the Tunnel

The VPN handles IP routing. It doesn't handle everything else.

DNS leaks are the most common issue. If your system's DNS resolver isn't explicitly routed through the tunnel, your DNS queries go directly to your ISP — revealing every domain you visit even with the VPN active.

Test this from the terminal:

# Check your current DNS resolver
cat /etc/resolv.conf

# Test for DNS leak (run while VPN is active)
# Should show your VPN provider's DNS, not your ISP's
nslookup whoami.akamai.net

# Check the public IP your traffic exits from (should be the VPN server's, not your ISP's)
dig +short myip.opendns.com @resolver1.opendns.com

Most reputable VPN clients handle DNS routing automatically, but worth verifying — especially on Linux where DNS management is fragmented across systemd-resolved, dnsmasq, and NetworkManager depending on your distro.
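
A small sketch of automating the first check: parse resolv.conf-style text and flag resolvers outside the ranges your VPN is expected to use. The function names and the example range are hypothetical. On systems using systemd-resolved the file may list only the local stub (127.0.0.53), in which case this check alone is not conclusive:

```python
import ipaddress


def nameservers(resolv_conf_text: str) -> list:
    """Extract nameserver addresses from resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def leaking_resolvers(resolv_conf_text: str, vpn_dns_networks: list) -> list:
    """Return configured resolvers that fall outside the VPN's DNS ranges."""
    allowed = [ipaddress.ip_network(n) for n in vpn_dns_networks]
    leaks = []
    for server in nameservers(resolv_conf_text):
        addr = ipaddress.ip_address(server)
        if not any(addr in net for net in allowed):
            leaks.append(server)
    return leaks
```

Usage would be something like `leaking_resolvers(open("/etc/resolv.conf").read(), ["10.8.0.0/24"])`, where the range is whatever your provider documents for its in-tunnel DNS.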

WebRTC leaks expose your real IP through browser APIs even when a VPN is active. This is a browser-layer problem, not a network-layer problem.

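// WebRTC leak test: run in the browser console while on the VPN
// If this logs your real public IP (not the VPN's), you have a WebRTC leak.
// Modern browsers often mask local host candidates as *.local mDNS names, which won't match here.
const pc = new RTCPeerConnection({
  iceServers: [{ urls: 'stun:stun.l.google.com:19302' }]
});
// Attach the handler before gathering starts so no candidate is missed
pc.onicecandidate = (e) => {
  if (!e.candidate) {
    pc.close(); // a null candidate means gathering has finished
    return;
  }
  // Only IPv4 addresses are matched; IPv6 candidates are ignored by this test
  const ip = e.candidate.candidate.match(/(\d+\.\d+\.\d+\.\d+)/);
  if (ip) console.log('IP exposed via WebRTC:', ip[1]);
};
pc.createDataChannel('');
pc.createOffer().then(offer => pc.setLocalDescription(offer));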

Fix: in Firefox, set media.peerconnection.enabled to false in about:config (this disables WebRTC entirely). Chrome has no equivalent built-in switch, so it requires an extension or a VPN client with WebRTC leak protection built in.

Application-layer tracking doesn't touch the network layer at all. If you're authenticated in your browser, that session follows you. Cookies, localStorage, IndexedDB — none of this is affected by routing your IP through Amsterdam.

The Kill Switch and Why It Matters

A kill switch blocks all traffic if the VPN connection drops. Without it, your traffic silently falls back to your normal connection, exposing your real IP, until the tunnel reconnects. Enabling a kill switch is the correct default for any privacy-sensitive setup.

On Linux with ufw:

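# Block all traffic by default
sudo ufw default deny outgoing
sudo ufw default deny incoming

# Allow the encrypted tunnel packets themselves to reach the VPN server,
# otherwise the tunnel can never (re)connect. Adjust port/proto to your protocol.
sudo ufw allow out to <server_ip> port 51820 proto udp

# Allow only traffic through VPN interface (tun0 for OpenVPN, wg0 for WireGuard)
sudo ufw allow out on tun0
sudo ufw allow out on wg0

# Allow LAN traffic if needed (eth0 is an example; use your physical interface)
sudo ufw allow out on eth0 to 192.168.0.0/16
sudo ufw allow out on eth0 to 10.0.0.0/8

sudo ufw enable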

WireGuard users can use PostUp/PreDown hooks in the config for a more integrated approach:

[Interface]
PrivateKey = <your_private_key>
Address = 10.x.x.x/32
DNS = 1.1.1.1

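# Reject anything that isn't leaving via wg0, isn't WireGuard's own marked traffic,
# and isn't addressed to this machine (loopback). This relies on the fwmark that
# wg-quick sets for a full-tunnel config (AllowedIPs = 0.0.0.0/0, ::/0).
PostUp = iptables -I OUTPUT ! -o wg0 -m mark ! --mark $(wg show wg0 fwmark) -m addrtype ! --dst-type LOCAL -j REJECT
PostUp = ip6tables -I OUTPUT ! -o wg0 -m mark ! --mark $(wg show wg0 fwmark) -m addrtype ! --dst-type LOCAL -j REJECT
PreDown = iptables -D OUTPUT ! -o wg0 -m mark ! --mark $(wg show wg0 fwmark) -m addrtype ! --dst-type LOCAL -j REJECT
PreDown = ip6tables -D OUTPUT ! -o wg0 -m mark ! --mark $(wg show wg0 fwmark) -m addrtype ! --dst-type LOCAL -j REJECT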
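Once the tunnel is up, it is worth confirming that the kill-switch rule actually landed in the OUTPUT chain. The sketch below scans the output of `iptables -S OUTPUT` for a rule that rejects or drops packets leaving through any interface other than the tunnel. The function name `has_kill_switch` is hypothetical, and `wg0` is an assumed interface name.

```shell
# has_kill_switch IFACE: reads `iptables -S OUTPUT` output on stdin and
# succeeds if some rule rejects or drops packets that leave via any
# interface other than IFACE.
has_kill_switch() {
  grep -Eq -- "^-A OUTPUT ! -o $1 .*-j (REJECT|DROP)"
}

# Usage (needs root; wg0 is an assumed interface name):
#   sudo iptables -S OUTPUT | has_kill_switch wg0 && echo "kill switch rule present"
```

This only confirms the rule exists. It does not prove that nothing leaks, so treat it as a quick sanity check rather than a full test.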

Split Tunneling: Route Only What You Need

Split tunneling lets you route specific traffic through the VPN while everything else goes directly. Useful when you need VPN for specific services but don't want to tunnel your local development traffic or internal network requests.

Most GUI clients support this natively. For WireGuard directly:

[Peer]
PublicKey = <server_public_key>
Endpoint = <server_ip>:51820

# Route only specific subnets through VPN instead of all traffic
AllowedIPs = 10.0.0.0/8, 172.16.0.0/12

# vs. route everything:
# AllowedIPs = 0.0.0.0/0, ::/0
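To check that a split-tunnel config routes what you expect, you can ask the kernel which interface it would use for a given destination. The sketch below pulls the interface name out of `ip route get` output. The helper name `route_dev` and the example addresses are assumptions for illustration.

```shell
# route_dev: reads one `ip route get <destination>` result on stdin and
# prints the name of the interface the kernel would send that packet out of.
route_dev() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Usage (10.1.2.3 and 1.1.1.1 are arbitrary example destinations):
#   ip route get 10.1.2.3 | route_dev   # should be the tunnel interface, e.g. wg0
#   ip route get 1.1.1.1  | route_dev   # should be the physical interface, e.g. eth0
```

With the AllowedIPs shown above, an address inside 10.0.0.0/8 should resolve to the tunnel interface and a public address should not. If both report the tunnel, you are running a full tunnel.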

What a VPN Actually Buys You (Honest Summary)

✅ Your ISP can't see which domains you visit
✅ Traffic encrypted against public WiFi eavesdropping
✅ Your IP hidden from destination servers
✅ DNS queries protected (if configured correctly)
❌ No protection against cookie/session tracking
❌ No protection against browser fingerprinting
❌ No anonymity toward any app or service you're logged into
❌ No protection from the VPN provider themselves

The tool solves a specific set of network-layer problems. For application-layer privacy, you need application-layer solutions — content blocking, session isolation, fingerprint-resistant browsers.

If you're setting up a VPN for a team or for yourself and want the technical layer done properly: WireGuard, a kill switch enabled, and DNS and WebRTC leaks tested before you trust it. Everything else is marketing.

Consumer version — what this means for people who don't want to touch iptables: lucas8.com/what-vpn-actually-protects