<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Arcjet</title>
    <description>The latest articles on DEV Community by Arcjet (@arcjet).</description>
    <link>https://dev.to/arcjet</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F9240%2F340e54de-ab24-45af-929b-5a71289be1ef.png</url>
      <title>DEV Community: Arcjet</title>
      <link>https://dev.to/arcjet</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/arcjet"/>
    <language>en</language>
    <item>
      <title>Devcontainers, Little Snitch, macOS TCC - protecting developer laptops</title>
      <dc:creator>David Mytton</dc:creator>
      <pubDate>Tue, 08 Jul 2025 09:39:51 +0000</pubDate>
      <link>https://dev.to/arcjet/devcontainers-little-snitch-macos-tcc-protecting-developer-laptops-3e7g</link>
      <guid>https://dev.to/arcjet/devcontainers-little-snitch-macos-tcc-protecting-developer-laptops-3e7g</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F79113a24hsh01t6rpm7e.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F79113a24hsh01t6rpm7e.jpg" alt="Devcontainers, Little Snitch, macOS TCC - protecting developer laptops" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A single compromised npm package on a developer's laptop is all it takes - a quiet threat that executes with the familiar &lt;code&gt;npm install&lt;/code&gt; command.&lt;/p&gt;

&lt;p&gt;The potential for damage is significant - compromised commit rights to source repositories, stolen session tokens, exposed secrets from environment variables, and even direct access to production networks. Once you gain a foothold on a developer laptop, there are many opportunities to reach sensitive production systems.&lt;/p&gt;
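&lt;p&gt;As a sketch of how such a payload is typically wired up - the package name, version, and script below are invented for illustration - a malicious dependency only needs a lifecycle script in its &lt;code&gt;package.json&lt;/code&gt;:&lt;/p&gt;

```json
{
  "name": "innocuous-helper",
  "version": "1.0.2",
  "scripts": {
    "postinstall": "node ./collect.js"
  }
}
```

&lt;p&gt;The &lt;code&gt;postinstall&lt;/code&gt; script runs automatically during &lt;code&gt;npm install&lt;/code&gt; with the installing user's permissions, which is why the layers below focus on containing what that process can reach.&lt;/p&gt;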

&lt;p&gt;At &lt;a href="https://arcjet.com/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;Arcjet&lt;/a&gt;, developer laptops consistently come top in our assessments of the "most likely" threats. By the very nature of the job, developers are regularly expected to install dependencies, execute code on their local systems, use third-party editor extensions, and connect to sensitive environments. This inherent risk likely explains the recent surge in developer-focused exploits, such as malware bundled within &lt;a href="https://socket.dev/blog/malicious-pypi-package-targets-discord-developers-with-token-theft-and-backdoor?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Python&lt;/u&gt;&lt;/a&gt;, &lt;a href="https://socket.dev/blog/typosquatted-go-packages-deliver-malware-loader?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Go&lt;/u&gt;&lt;/a&gt;, and &lt;a href="https://socket.dev/blog/npm-package-wipes-codebases-with-remote-trigger?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Node&lt;/u&gt;&lt;/a&gt; packages; &lt;a href="https://control-plane.io/posts/abusing-vscode-from-malicious-extensions-to-stolen-credentials-part-1/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;VS Code extension&lt;/u&gt;&lt;/a&gt; exploits; and &lt;a href="https://github.blog/security/vulnerability-research/attacking-browser-extensions/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Chrome extension hijacking&lt;/u&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Arcjet is a devtools startup. Our security-as-code SDK helps developers implement features like bot detection and signup form spam detection. We’re thinking about security all day, every day - not just in our product, but also in how we run the company. In this blog post, I’ll talk through some of the work we’ve done to improve our own developer security.&lt;/p&gt;

&lt;h2&gt;
  
  
  Devcontainers
&lt;/h2&gt;

&lt;p&gt;The first line of defense is containing the development environment itself. Originally developed by Microsoft, &lt;a href="https://containers.dev/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Devcontainers&lt;/u&gt;&lt;/a&gt; is an open specification that defines the development environment for a project:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A development container (or dev container for short) allows you to use a container as a full-featured development environment.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Using a &lt;code&gt;.devcontainer/devcontainer.json&lt;/code&gt; file, you can define a container environment by specifying a base image from a public or private registry. Various &lt;a href="https://containers.dev/features?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;optional features&lt;/u&gt;&lt;/a&gt; can be added to install common tools, such as the GitHub or AWS CLIs, linters, formatters, and other language runtimes. Include recommended VS Code extensions and scripts to run after installation, and within a few seconds of launching the container you have a fully configured development environment.&lt;/p&gt;

&lt;p&gt;When you have a team of developers, getting them all running the same versions of the same tools can be a big challenge. Devcontainers solve this by defining a consistent environment as configuration, rather than having everyone set things up manually. The Devcontainers config &lt;a href="https://github.com/arcjet/arcjet-js/blob/main/.devcontainer/devcontainer.json?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;in our public JS SDK&lt;/u&gt;&lt;/a&gt; and &lt;a href="https://github.com/arcjet/arcjet-docs/blob/main/.devcontainer/devcontainer.json?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;docs repos&lt;/u&gt;&lt;/a&gt; has made it easy for external contributors to get started.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// For format details, see https://aka.ms/devcontainer.json. For config options, see the
// README at: https://github.com/devcontainers/templates/tree/main/src/javascript-node
{
  "name": "arcjet-docs",
  "image": "mcr.microsoft.com/devcontainers/javascript-node:1-22-bookworm",
  // Features to add to the dev container. More info: https://containers.dev/features.
  "features": {
    "ghcr.io/devcontainers/features/common-utils:2.5.2": {},
    "ghcr.io/trunk-io/devcontainer-feature/trunk:1.1.0": {}
  },
  // Use 'forwardPorts' to make a list of ports inside the container available locally.
  // "forwardPorts": [],
  // Install trunk tools inside the container
  // Uses array syntax to skip the shell: https://containers.dev/implementors/json_reference/#formatting-string-vs-array-properties
  "updateContentCommand": ["trunk", "install"],
  // Install npm dependencies within the container
  // Uses array syntax to skip the shell: https://containers.dev/implementors/json_reference/#formatting-string-vs-array-properties
  "postCreateCommand": ["npm", "ci"],
  "customizations": {
    "vscode": {
      "extensions": [
        "astro-build.astro-vscode",
        "unifiedjs.vscode-mdx",
        "trunk.io"
      ]
    }
  }
  // Uncomment to connect as root instead. More info: https://aka.ms/dev-containers-non-root.
  // "remoteUser": "root",
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Devcontainer config for the Arcjet docs repo.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Using a Devcontainer isolates the dev environment from the host system (the developer laptop). Code is executed inside the container rather than directly on the host. This isolation mitigates most attack vectors such as malware executed via post-install scripts or from backdoored dependencies.&lt;/p&gt;

&lt;p&gt;The downside is a minor performance overhead, particularly with I/O-intensive operations on macOS due to the underlying virtualization layer. However, for most development workflows, the security benefits far outweigh the cost. &lt;a href="https://code.visualstudio.com/remote/advancedcontainers/improve-performance?ref=blog.arcjet.com#_use-clone-repository-in-container-volume" rel="noopener noreferrer"&gt;&lt;u&gt;Cloning a repository directly into a container volume&lt;/u&gt;&lt;/a&gt; rather than binding to the host filesystem mitigates most of the performance issues.&lt;/p&gt;

&lt;p&gt;While credentials and source code within the devcontainer could still be exfiltrated, the damage is constrained by the container's boundaries, making it easier to quarantine. It also denies code unrestricted access to the host - the keychain, password manager vaults, browser history databases, and so on.&lt;/p&gt;

&lt;p&gt;Containers are not designed for security or 100% isolation - they’re more of a convenient packaging and deployment format - so there is always the potential for container breakout. However, most attackers will assume that code is executed on the host system directly. All the code knows is that it’s running on a (pretty sparse) Linux machine. Devcontainers can therefore be a very effective layer of security for development.&lt;/p&gt;

&lt;h2&gt;
  
  
  Outbound firewall
&lt;/h2&gt;

&lt;p&gt;The next layer is controlling what the isolated environment can access. macOS has a good built-in firewall, but it is primarily designed to protect against inbound connections. Tracking outbound connections is just as important.&lt;/p&gt;

&lt;p&gt;Using an outbound firewall such as &lt;a href="https://objective-see.org/products/lulu.html?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;LuLu firewall&lt;/u&gt;&lt;/a&gt; (free, open source) or &lt;a href="https://obdev.at/products/littlesnitch/index.html?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Little Snitch&lt;/u&gt;&lt;/a&gt; (paid, or its free variant &lt;a href="https://obdev.at/products/littlesnitch-mini/index.html?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Little Snitch Mini&lt;/u&gt;&lt;/a&gt;) will alert you the first time any application attempts to make outbound connections. This is initially quite noisy, but you get a good baseline of common applications pretty quickly.&lt;/p&gt;

&lt;p&gt;Why is this important? A compromised dependency might try to "phone home" by sending exfiltrated secrets (like your &lt;code&gt;AWS_ACCESS_KEY_ID&lt;/code&gt;) to a remote server over a standard port like DNS (53) or HTTPS (443). A default-deny firewall forces you to explicitly allow connections, making this anomalous traffic immediately obvious.&lt;/p&gt;
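&lt;p&gt;To make the pattern concrete, here's a harmless sketch of the kind of "phone home" a default-deny firewall would surface. The key is AWS's documented example value, the domain is invented, and the script only prints the lookup it would make - nothing is actually sent anywhere:&lt;/p&gt;

```shell
# Sketch of DNS exfiltration: encode a secret as a DNS-safe label.
# The key is a fake example value and exfil.example.com is invented.
AWS_ACCESS_KEY_ID="AKIAIOSFODNN7EXAMPLE"

# Base64-encode, then strip padding and map +/ to -_ so the result
# is valid inside a hostname
ENCODED=$(printf '%s' "$AWS_ACCESS_KEY_ID" | base64 | tr -d '=' | tr '+/' '-_')

# A real payload would trigger a DNS lookup of this name; here we just print it
echo "would resolve: ${ENCODED}.exfil.example.com"
```

&lt;p&gt;A single DNS query like this blends into normal traffic unless something is watching which processes are allowed to make outbound connections in the first place.&lt;/p&gt;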

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4gn7azt9g2ykspt7fduu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4gn7azt9g2ykspt7fduu.png" alt="Devcontainers, Little Snitch, macOS TCC - protecting developer laptops" width="800" height="579"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Little Snitch rules configuration.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Built-in macOS protections
&lt;/h2&gt;

&lt;p&gt;This layer focuses on using mechanisms built into macOS. Introduced in macOS Mojave (10.14), the &lt;a href="https://eclecticlight.co/2023/02/10/privacy-what-tcc-does-and-doesnt/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;Transparency, Consent, and Control (TCC) framework&lt;/a&gt; restricts application access to sensitive user data and system resources.&lt;/p&gt;

&lt;p&gt;You’ll have seen this in action with the consent boxes appearing whenever applications try to access your microphone, camera, location, photos, contacts, and other areas of your system that macOS considers sensitive. &lt;/p&gt;

&lt;p&gt;This protection also extends to the &lt;code&gt;~/Downloads&lt;/code&gt;, &lt;code&gt;~/Documents&lt;/code&gt;, and &lt;code&gt;~/Desktop&lt;/code&gt; folders, so any process that tries to read or write files in these locations will be blocked until you approve access. macOS 10.15 &lt;a href="https://developer.apple.com/forums/thread/663889?answerId=640805022&amp;amp;ref=blog.arcjet.com#640805022" rel="noopener noreferrer"&gt;&lt;u&gt;introduced additional access controls&lt;/u&gt;&lt;/a&gt; for anything located in the &lt;code&gt;~/Desktop&lt;/code&gt; and &lt;code&gt;~/Documents&lt;/code&gt; directories that &lt;a href="https://github.com/h4llow3En/mac-notification-sys/issues/33?ref=blog.arcjet.com#issuecomment-1480294409" rel="noopener noreferrer"&gt;&lt;u&gt;lock down kernel access even further&lt;/u&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa23ksglnodr018vp3hlw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa23ksglnodr018vp3hlw.png" alt="Devcontainers, Little Snitch, macOS TCC - protecting developer laptops" width="800" height="874"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;macOS Privacy &amp;amp; Security controls.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;These restrictions apply per application, which includes any scripts or processes that might attempt to exfiltrate the contents of source code directories on disk. If you place all your code into one of these three directories, it will also benefit from the TCC protections (although &lt;a href="https://www.qt.io/blog/the-curious-case-of-the-responsible-process?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;the responsible process&lt;/u&gt;&lt;/a&gt; might show up as your editor or terminal).&lt;/p&gt;

&lt;p&gt;For example, if you check out your Git repository to &lt;code&gt;~/Documents/repo&lt;/code&gt; then any malware that attempts to scrape the contents of &lt;code&gt;~/Documents&lt;/code&gt; will trigger the consent popup.&lt;/p&gt;

&lt;p&gt;The role of TCC becomes more nuanced when using Devcontainers. This is because the container runtime itself (e.g., Docker Desktop or OrbStack) is the application that receives TCC authorization to access directories on the host. Consequently, malware executing within the container (e.g., via a post-install script) that accesses these mounted files will not trigger a new TCC prompt. The I/O request is proxied through the trusted runtime, effectively bypassing a direct TCC check on the malicious process.&lt;/p&gt;

&lt;p&gt;While this means TCC's file-access prompts offer less direct protection from threats inside the container, the container itself still provides a layer of isolation. TCC remains a useful defense against a potential container escape, where malware might try to break out and execute directly on the macOS host.&lt;/p&gt;

&lt;h2&gt;
  
  
  SSH agent for Git keys
&lt;/h2&gt;

&lt;p&gt;The final layer is ensuring the developer's identity and access are secure. Any keys stored directly on disk are easily accessible. This is one reason why AWS recommends using &lt;a href="https://aws.amazon.com/iam/identity-center/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;IAM Identity Center&lt;/u&gt;&lt;/a&gt; with the &lt;a href="https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-sso.html?ref=blog.arcjet.com#cli-configure-sso-login" rel="noopener noreferrer"&gt;&lt;u&gt;SSO CLI flow&lt;/u&gt;&lt;/a&gt; for logging into AWS accounts - so static credentials aren’t stored on disk.&lt;/p&gt;

&lt;p&gt;The same problems arise with SSH keys, often &lt;a href="https://docs.github.com/en/authentication/connecting-to-github-with-ssh?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;used for GitHub authentication&lt;/u&gt;&lt;/a&gt;. Generated keys are stored in &lt;code&gt;~/.ssh&lt;/code&gt; by default, which makes them easy to exfiltrate. &lt;/p&gt;

&lt;p&gt;Balancing UX with security is always a challenge. One option is to set a passphrase for the key and store it in the macOS Keychain. This happens automatically if you access a passphrase-protected key stored at &lt;code&gt;.ssh/id_rsa&lt;/code&gt; or &lt;code&gt;.ssh/identity&lt;/code&gt; - macOS will manage access for you. If you have multiple keys or the key is stored somewhere else, you can &lt;a href="https://docs.github.com/en/authentication/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent?ref=blog.arcjet.com#adding-your-ssh-key-to-the-ssh-agent" rel="noopener noreferrer"&gt;&lt;u&gt;manually add it to ssh-agent and ask for the passphrase to be stored in Keychain&lt;/u&gt;&lt;/a&gt;.&lt;/p&gt;
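&lt;p&gt;On recent versions of macOS, the Keychain integration can be made persistent with an &lt;code&gt;~/.ssh/config&lt;/code&gt; entry along these lines (the key path is an example - substitute your own):&lt;/p&gt;

```
Host *
  AddKeysToAgent yes
  UseKeychain yes
  IdentityFile ~/.ssh/id_ed25519
```

&lt;p&gt;Running &lt;code&gt;ssh-add --apple-use-keychain ~/.ssh/id_ed25519&lt;/code&gt; once stores the passphrase in the Keychain so you aren't prompted on every connection.&lt;/p&gt;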

&lt;p&gt;An alternative to the macOS Keychain is the &lt;a href="https://developer.1password.com/docs/ssh/get-started/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;SSH key support in 1Password&lt;/u&gt;&lt;/a&gt;. This avoids storing any key files on disk and, in contrast to the Keychain (which unlocks with your login password), 1Password prompts whenever a new application wants to access the key. We have 1Password reporting to our logging infrastructure, which means we also get audit logs for all credential access.&lt;/p&gt;

&lt;p&gt;At Arcjet, we mandate signed commits, which requires developers to manage a &lt;a href="https://docs.github.com/en/authentication/managing-commit-signature-verification/telling-git-about-your-signing-key?ref=blog.arcjet.com#telling-git-about-your-ssh-key" rel="noopener noreferrer"&gt;&lt;u&gt;signing key&lt;/u&gt;&lt;/a&gt;. Enforcing signed commits is a foundational practice for securing the software supply chain. It provides verifiable attestation that code originates from a trusted developer (who has also authenticated recently), protecting the repository from unauthorized code injection even if a developer's GitHub access is compromised.&lt;/p&gt;
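&lt;p&gt;If you sign with an SSH key rather than GPG, the setup is a few &lt;code&gt;git config&lt;/code&gt; commands. The key path here is an example - with 1Password, you'd point it at the public key 1Password exposes:&lt;/p&gt;

```shell
# Sign commits with an SSH key instead of GPG
git config --global gpg.format ssh

# The signing key is the *public* half; adjust the path to your own key
git config --global user.signingkey "$HOME/.ssh/id_ed25519.pub"

# Sign every commit by default
git config --global commit.gpgsign true
```

&lt;p&gt;After uploading the same public key to GitHub as a signing key, commits show as "Verified" in the web UI; local verification with &lt;code&gt;git log --show-signature&lt;/code&gt; additionally requires an allowed signers file.&lt;/p&gt;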

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F91ofqjda1h2p3clkay0f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F91ofqjda1h2p3clkay0f.png" alt="Devcontainers, Little Snitch, macOS TCC - protecting developer laptops" width="800" height="710"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;1Password Developer tools.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  MDM
&lt;/h2&gt;

&lt;p&gt;It’s only a matter of time before an incident happens. Ideally, one of the above layers will catch attacks or mistakes, but when something does happen we need to be able to detect it, understand how it happened, apply effective quarantine measures, and quickly remediate the situation.&lt;/p&gt;

&lt;p&gt;The focus is often on the fancy detection and response part of this, but logging is just as important because it helps you answer questions like: What happened? What data (if any) was extracted? How long has this been compromised? Were any other systems impacted?&lt;/p&gt;

&lt;p&gt;All Arcjet devices are provisioned with MDM tooling to help detect potential problems quickly, alert the right people, and protect our team &amp;amp; customers. We’ve partnered with &lt;a href="https://www.latacora.com/services/detection-and-response/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Latacora for 24/7 monitoring &amp;amp; response&lt;/u&gt;&lt;/a&gt; and their team acts like our internal security experts. Various detection rules notify us of any suspicious activity, we have escalation channels to trigger rapid response investigations, and we regularly run tabletop exercises to test our processes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusions
&lt;/h2&gt;

&lt;p&gt;Security is all about layers. Securing a developer workstation is not about achieving an impenetrable state; it's about creating layers of defense that systematically reduce the attack surface. By isolating development work in containers, controlling network egress, making use of built-in features on the host OS, and securing developer identity, we’re able to build a robust security posture without getting in the way of development.&lt;/p&gt;

</description>
      <category>engineering</category>
      <category>security</category>
    </item>
    <item>
      <title>How we run Arcjet like an open source project</title>
      <dc:creator>David Mytton</dc:creator>
      <pubDate>Mon, 23 Jun 2025 13:47:45 +0000</pubDate>
      <link>https://dev.to/arcjet/how-we-run-arcjet-like-an-open-source-project-128f</link>
      <guid>https://dev.to/arcjet/how-we-run-arcjet-like-an-open-source-project-128f</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6p42a8zhifzw3wqsz1of.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6p42a8zhifzw3wqsz1of.jpg" alt="How we run Arcjet like an open source project" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Building a startup culture that matches the users of the product is an advantage. We're a remote team, and since several of our team members have a long history of open source contributions, our practices have evolved from how many open source projects operate.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://arcjet.com/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;Arcjet&lt;/a&gt; is a remote-first company. We're developing SDK that streamlines bot detection, attack prevention, and spam protection for developers. Even though our core service is not open source &lt;a href="https://github.com/arcjet?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;our SDK and docs are&lt;/a&gt;, so we've adopted many of the workflows that are used in open source. We're not quite requiring contributions go through &lt;a href="https://git-send-email.io/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;git email&lt;/a&gt;, but remote working requires systems and tools to make it work well.&lt;/p&gt;

&lt;p&gt;For example, we track feature requests, ideas, bugs, and technical debates in GitHub issues. Code comments reference issue IDs (e.g., &lt;code&gt;// TODO(#123): something something&lt;/code&gt;), and we use draft PRs to solicit feedback before formal reviews.&lt;/p&gt;

&lt;p&gt;Over the last 16 years &lt;a href="https://davidmytton.blog/a-guide-to-remote-working-for-startups/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;I’ve led three startups remotely&lt;/a&gt; - Server Density (acquired 2018), Console.dev (a devtools newsletter) and now Arcjet. In this blog post I'll talk through some of the processes we're using to make remote engineering work at a devtools startup.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-time sometimes, asynchronous most of the time
&lt;/h2&gt;

&lt;p&gt;Slack aimed to replace email as a single workspace for all communication, but in practice it fragments attention and search is unreliable.&lt;/p&gt;

&lt;p&gt;Following the &lt;a href="https://basecamp.com/guides/how-we-communicate?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;37signals Guide to Internal Communication&lt;/u&gt;&lt;/a&gt; recommendation that unrestricted group chat can overload teams, we set clear boundaries.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Slack for socializing, quick clarifications or discussions, and link-sharing.&lt;/li&gt;
&lt;li&gt;No expectation of immediate replies.&lt;/li&gt;
&lt;li&gt;If it will &lt;a href="https://critter.blog/2021/01/12/if-it-matters-after-today-stop-talking-about-it-in-a-chat-room/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;matter after today&lt;/u&gt;&lt;/a&gt;, it migrates to GitHub issues/PRs or Notion pages - lasting ideas, decisions, and conclusions all get a permanent home.&lt;/li&gt;
&lt;li&gt;We auto-delete Slack history after 7 days to enforce permanent records elsewhere.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Over the past two years, we shifted from GitHub Discussions to GitHub Issues due to notification limitations. Engineering questions, bugs, and feature ideas are all tracked via an issue. For code-in-progress, Pull Requests are used for context-specific discussions and PRs are often opened in draft first for early feedback. The principle here is that everything has a permanent link we can reference in future. This has worked well in general, but big PRs and extensive comments easily get lost in the GitHub web UI.&lt;/p&gt;

&lt;p&gt;Notion is for design notes, policies, research, and general information that doesn't belong in a PR. It works like a powerful wiki; however, version history and comment threading can be cumbersome, causing duplication between Notion pages and GitHub discussions (e.g., architecture feedback appearing in both).&lt;/p&gt;

&lt;p&gt;We’ve recently started capturing more structured &lt;a href="https://adr.github.io/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Architectural Decision Records&lt;/u&gt;&lt;/a&gt; as a way to improve this. The advantage of committing ADRs to the codebase is the ability to use PRs and commit history for tracking. However, discovery is more of a challenge and we've not yet solved the problem of information existing in multiple places e.g. design notes in Notion and decision records in GitHub.&lt;/p&gt;
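&lt;p&gt;A minimal ADR skeleton, loosely following Michael Nygard's widely used template (the title and content here are invented for illustration), looks like this:&lt;/p&gt;

```markdown
# ADR 7: Record architecture decisions as Markdown files

## Status
Accepted

## Context
What problem or constraint prompted the decision.

## Decision
What we chose, and the alternatives we rejected.

## Consequences
The trade-offs we accept as a result.
```

&lt;p&gt;Because each record is just a file in the repo, the PR that adds it captures the review discussion, and the commit history tracks any later amendments.&lt;/p&gt;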

&lt;p&gt;I am optimistic about using Notion’s AI integration with GitHub, Google Docs, and Slack, to answer questions and summarize discussions. There’s a lot of information embedded across several tools, so it seems the perfect AI use case.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stripe's internal email groups
&lt;/h2&gt;

&lt;p&gt;A decade ago, Stripe &lt;a href="https://stripe.com/blog/email-transparency?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;wrote about&lt;/u&gt;&lt;/a&gt; how they use internal email lists to improve transparency. They provided more detail on their methodology with &lt;a href="https://stripe.com/blog/scaling-email-transparency?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;a post a year later&lt;/u&gt;&lt;/a&gt; and even built &lt;a href="https://github.com/stripe-archive/gaps?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;a custom tool&lt;/u&gt;&lt;/a&gt; for managing Google Groups.&lt;/p&gt;

&lt;p&gt;Stripe was trying to solve the problem of email being directed to individuals. Using a group meant that anyone could search all past history and get a link to any specific thread. However, their tool was archived in 2019 and I couldn't find anything recent about whether the approach still works. I know they use Slack, but has that replaced email?&lt;/p&gt;

&lt;p&gt;We continue using email externally and use group addresses for finance, operations, hiring, and marketing. Almost all external emails are CC’d to the internal group to maintain transparency. This has proven useful as we hire people into roles that I was previously involved with - the handover is much easier when context is contained within a searchable group.&lt;/p&gt;

&lt;h2&gt;
  
  
  Internal weekly update emails
&lt;/h2&gt;

&lt;p&gt;We also introduced a weekly email to an internal-only updates mailing list. Every Friday everyone sends an email to the list answering three questions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;What did you ship?&lt;/strong&gt; Briefly explain the main thing you shipped this week. “Shipped” = merged to &lt;code&gt;main&lt;/code&gt; and deployed to production. Dependency updates don’t count unless there was major migration work.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What did you work on?&lt;/strong&gt; A couple of bullet points of what else you worked on, not including what you shipped.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What did you learn or find interesting?&lt;/strong&gt; A couple of sentences explaining one or two things you found interesting, with links. Must be related to Arcjet in some way.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This keeps everyone up to date and you can read the updates at your leisure. I particularly enjoy reading everyone's responses to the final question, which I adapted from &lt;a href="https://www.wsj.com/business/nvidia-jensen-huang-book-advice-b9794576?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Jensen Huang’s Top 5 Things (T5T) emails&lt;/u&gt;&lt;/a&gt;. Sometimes an update gets no replies, but there's often discussion with people replying to ask about or add to these points.&lt;/p&gt;

&lt;p&gt;So far Google Groups has worked well (it’s a shame GitHub Discussions wasn’t better). Each person can subscribe how they like (every email, summaries, digests), they’re searchable, and there’s a link to each message. However, the web interface is outdated, threading doesn’t work perfectly, and external spam protection seems poor compared to regular Gmail. &lt;a href="https://wordpress.com/p2/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;WordPress P2&lt;/u&gt;&lt;/a&gt; might be an option, but it feels too much like a hacked-together blog.&lt;/p&gt;

&lt;h2&gt;
  
  
  Remote-first, in-person regularly
&lt;/h2&gt;

&lt;p&gt;The remote-only purists are wrong. So are those who say everyone must be in the office all the time. There’s always been a balance to strike and there are plenty of examples of both styles working well. It comes down to who is setting principles and defining the culture.&lt;/p&gt;

&lt;p&gt;One thing I’ve learned is that you can create a remote-first culture and then add an in-person or in-office element later, but &lt;a href="https://davidmytton.blog/how-to-make-remote-working-work/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;not the other way around&lt;/a&gt;. The most important thing is to have the right tools and systems in place.&lt;/p&gt;

&lt;p&gt;If the default assumption is that everyone is in the office then these systems don’t get fully developed and important information gets lost. The result is the remote team being left out, the whole system failing, and then you get “return to office” mandates. The solution is to always assume that everyone is remote and record every discussion and decision.&lt;/p&gt;

&lt;p&gt;But working in-person works well for coming up with ideas and iterating quickly. I’ve always found a big difference before and after people meet for the first time. Once you’ve met in-person, it’s a lot easier to work together remotely.&lt;/p&gt;

&lt;p&gt;Arcjet is set up as a remote-first, distributed company (US and Western EU timezones to ease collaboration). However, we aim to do in-person meetups 2-3 times a year. It’s not cheap to organize, but remote working has never been about saving costs - the money you save on an office goes into travel. We’ve run these multiple times in London, New York, and Las Vegas (for Defcon) and are planning other varied locations for the next ones. The main challenge is finding somewhere to work from together - comfortable chairs and good wifi are hard to find!&lt;/p&gt;

&lt;h2&gt;
  
  
  Key principles
&lt;/h2&gt;

&lt;p&gt;Open-source-inspired workflows suit our developer-focused security product by improving transparency and feedback loops. Our team has more context to help make good decisions because we can search through past work to understand the current state of things.&lt;/p&gt;

&lt;p&gt;To make this work we follow a few key principles:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Document everything in linkable, permanent repositories.&lt;/li&gt;
&lt;li&gt;Favor async workflows; use real-time sparingly.&lt;/li&gt;
&lt;li&gt;Invest in periodic in-person meetups to build rapport.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The tools have never been better, and I expect AI to make linking systems more effective. It'll be interesting to look back in another 10 years and see what else has changed.&lt;/p&gt;

</description>
      <category>engineering</category>
    </item>
    <item>
      <title>Bot detection techniques for developers</title>
      <dc:creator>David Mytton</dc:creator>
      <pubDate>Thu, 05 Jun 2025 19:50:00 +0000</pubDate>
      <link>https://dev.to/arcjet/bot-detection-techniques-for-developers-5524</link>
      <guid>https://dev.to/arcjet/bot-detection-techniques-for-developers-5524</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxp5ehgsuvulkqvohxd51.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxp5ehgsuvulkqvohxd51.jpg" alt="Bot detection techniques for developers" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;tl;dr: Bot traffic now dominates the web, and AI scrapers are making it worse. Blocking by user agent or IP isn’t enough. This post covers practical detection and enforcement strategies - including fingerprinting, rate limiting, and proof-of-work - plus how&lt;/em&gt; &lt;a href="https://arcjet.com/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;em&gt;Arcjet&lt;/em&gt;&lt;/a&gt;&lt;em&gt;’s security as code product builds defenses directly into your app logic.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Bots have always been a part of the internet. Most site owners like good bots because they want to be indexed in search engines and they tend to follow the rules.&lt;/p&gt;

&lt;p&gt;Bad bots have also been a part of the internet for a long time. You can observe this within seconds of exposing a server on a public IP address. If it’s a web server you’ll quickly see scanners testing for known WordPress vulnerabilities, accidentally published .git directories, exposed config files, etc. Same for other servers: SSH brute forcing, SMTP relay attacks, SMB login attempts, etc.&lt;/p&gt;

&lt;p&gt;But recently things seem to have become worse. Depending on who you ask, bots made up &lt;a href="https://www.malwarebytes.com/blog/uncategorized/2025/04/hi-robot-half-of-all-internet-traffic-now-automated?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;37%&lt;/u&gt;&lt;/a&gt;, &lt;a href="https://www.akamai.com/newsroom/press-release/bots-compose-42-percent-of-web-traffic-nearly-two-thirds-are-malicious?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;42%&lt;/u&gt;&lt;/a&gt; or almost &lt;a href="https://www.imperva.com/resources/resource-library/reports/2024-bad-bot-report/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;50%&lt;/u&gt;&lt;/a&gt; of all internet traffic in 2024. The share also varies by industry, from &lt;a href="https://www.statista.com/statistics/1264540/human-and-bot-web-traffic-share-industry/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;18% in marketing to 57% in gaming&lt;/u&gt;&lt;/a&gt; (2023).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.arcjet.com/bot-protection/concepts?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Arcjet’s bot detection&lt;/u&gt;&lt;/a&gt; is our most popular feature and we see millions of requests from bots every day with all sorts of abuse patterns. In this post we’ll look at the problem bots cause, the techniques they use to evade defenses, and how you can protect yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  What problems do bots cause?
&lt;/h2&gt;

&lt;p&gt;A human using a web browser has certain expected behaviors. Even if many requests to the various page assets can be executed in parallel, they will still make requests at human speed. Their progress will be gradual (they won’t load every page on a site simultaneously), and caching mechanisms (local in-browser and/or through a CDN) work to reduce the number of requests and/or data transfer.&lt;/p&gt;

&lt;p&gt;Good bots mimic this behavior by progressively visiting site pages at a more human-like pace. They typically follow the rules posted by site owners (the &lt;a href="https://en.wikipedia.org/wiki/Robots.txt?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;robots.txt voluntary standard&lt;/u&gt;&lt;/a&gt; was first published in 1994 and became &lt;a href="https://doi.org/10.17487/RFC9309?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;a proposed standard in 2022&lt;/u&gt;&lt;/a&gt;).&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Problem&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Examples&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Expensive requests&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Bots request resource-intensive pages - from static HTML to dynamic pages backed by costly database queries. These pages often can’t be cached or pre-rendered.&lt;/td&gt;
&lt;td&gt;Bots crawling every Git commit, blame, and history page on &lt;a href="https://drewdevault.com/2025/03/17/2025-03-17-Stop-externalizing-your-costs-on-me.html?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;SourceHut&lt;/a&gt;. Dynamic content on &lt;a href="https://techcrunch.com/2025/01/10/how-openais-bot-crushed-this-seven-person-companys-web-site-like-a-ddos-attack/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;online stores&lt;/a&gt; and &lt;a href="https://fabulous.systems/posts/2025/05/anubis-saved-our-websites-from-a-ddos-attack/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;wikis&lt;/a&gt; being repeatedly requested.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Large downloads&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Projects hosting big files - like ISOs or software archives - suffer when bots download at scale, straining bandwidth.&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://www.scrye.com/blogs/nirik/posts/2025/03/15/mid-march-infra-bits-2025/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;Fedora Linux&lt;/a&gt; mirrors overwhelmed by bot downloads. Open source projects struggle with abusive scraping of images, documentation, and archives. Even large vendors like Red Hat and Canonical have to manage these loads; smaller projects rely on limited infrastructure or donations.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Resource exhaustion&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Every request has a cost - whether for dynamic or static content. Bots can saturate compute, bandwidth, or memory limits, degrading service or creating DoS-like conditions.&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://mastodon.social/@AndresFreundTec/113868582630760229?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;LWN&lt;/a&gt; and &lt;a href="https://diff.wikimedia.org/2025/04/01/how-crawlers-impact-the-operations-of-the-wikimedia-projects/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;Wikipedia&lt;/a&gt; fighting traffic spikes. Brute-force login attempts on mail servers (e.g., &lt;a href="https://jan.wildeboer.net/2025/02/Blocking-Stealthy-Botnets/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;this case&lt;/a&gt;) seeking spam relays. Even generous hosts like Hetzner can’t handle infinite abuse; serverless (per request) pricing makes this even riskier.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Are AI bots worse?
&lt;/h2&gt;

&lt;p&gt;Many recent scraping incidents have been attributed to AI crawlers. Attribution is tricky - user agents and IPs are easy to spoof - but detailed logs and traffic patterns from open-source platforms strongly suggest AI bots are a major contributor:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;a href="https://diasporafoundation.org/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Diaspora&lt;/u&gt;&lt;/a&gt; open source web infrastructure &lt;a href="https://pod.geraspora.de/posts/3d473600a616013da02e268acd52edbf?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;traffic logs&lt;/u&gt;&lt;/a&gt; show 24% of traffic from &lt;a href="https://openai.com/gptbot?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;OpenAI’s GPTBot&lt;/u&gt;&lt;/a&gt; and 4.3% from Anthropic’s Claudebot. Around 16% comes from &lt;a href="https://developer.amazon.com/support/amazonbot?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Amazonbot&lt;/u&gt;&lt;/a&gt;, although it’s not clear whether that traffic is for AI.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://about.readthedocs.com/blog/2024/07/ai-crawlers-abuse/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;ReadTheDocs posted examples&lt;/u&gt;&lt;/a&gt; of crawlers excessive download requests. The user agents weren’t listed, but applying Cloudflare’s AI crawler block list cut bandwidth by 75% - from 800GB/day to 200GB/day.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://thelibre.news/foss-infrastructure-is-under-attack-by-ai-companies/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Discussions around bot traffic&lt;/u&gt;&lt;/a&gt; from KDE’s GitLab instance suggested traffic from “Chinese AI companies” did not include proper user agent identification whereas traffic from “Western LLM operators” did.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As with everything, incentives matter. Site owners are happy to serve Googlebot because its reasonable behavior means it doesn’t cost (much) and the site gets traffic from searches in return. Win-win. If you don’t want that, it’s easy to restrict or block with &lt;code&gt;robots.txt&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Contrast this with AI where scraping is usually for training purposes with no guarantee that the source of that training data will ever be cited or receive traffic. Why would a site owner want to participate in this “trade”? They’re more likely to want to block the traffic, so the AI scrapers need to hide their identity.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wildcard: agents acting on behalf of humans
&lt;/h2&gt;

&lt;p&gt;Most site owners want human traffic, so the definition of good vs bad bots comes down to whether that automated traffic is acceptable or not. There’s a spectrum:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automated API clients = good.&lt;/li&gt;
&lt;li&gt;Search engine indexing bots = usually good.&lt;/li&gt;
&lt;li&gt;AI crawlers = sometimes good or bad, depending on your philosophical stance.&lt;/li&gt;
&lt;li&gt;Scrapers = bad.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This becomes more challenging when you introduce AI agents acting on behalf of humans. The difficulty is nicely illustrated by &lt;a href="https://platform.openai.com/docs/bots/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;the type of bots OpenAI operates&lt;/u&gt;&lt;/a&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Bot&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Training?&lt;/th&gt;
&lt;th&gt;Citations?&lt;/th&gt;
&lt;th&gt;Identification&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://platform.openai.com/docs/bots?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;OAI-SearchBot&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Crawls sites to power ChatGPT’s search index.&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅ &lt;a href="https://www.linkedin.com/posts/zenorocha_chatgpt-is-now-the-top-3-source-of-traffic-activity-7329153728060538880-CeaA?utm_source=share&amp;amp;utm_medium=member_desktop" rel="noopener noreferrer"&gt;Drives referral traffic&lt;/a&gt;
&lt;/td&gt;
&lt;td&gt;Clear user agent. Site owners can verify traffic sources. Generally considered beneficial.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://platform.openai.com/docs/bots?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;ChatGPT-User&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Real-time bot for ChatGPT Q&amp;amp;A sessions - reads live content to summarize responses.&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;⚠️ Sometimes cited, sometimes not. Requires monitoring traffic to assess impact.&lt;/td&gt;
&lt;td&gt;Uses a dedicated user agent. Behavior is passive until invoked by a user prompt.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://openai.com/gptbot?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;GPTBot&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Crawler used to collect training data for foundation models.&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌ Provides no return value to the site owner.&lt;/td&gt;
&lt;td&gt;User agent is &lt;a href="https://openai.com/gptbot?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;documented&lt;/a&gt; and can be blocked via robots.txt. High bandwidth and content costs.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;em&gt;(Operator)&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;Full browser agent (Chrome in a VM) used by OpenAI agents to interact with the web on user request.&lt;/td&gt;
&lt;td&gt;❓&lt;/td&gt;
&lt;td&gt;❓ Depends on use case - behaves like a human user.&lt;/td&gt;
&lt;td&gt;No public documentation on how to identify it. Mimics normal Chrome user traffic. Cannot be reliably blocked without false positives.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;As AI tools become more popular, simply blocking all AI bots is probably not the right approach. For example, allowing AI bots that act more like search engines, such as OAI-SearchBot, means your site will receive traffic from users who are moving away from traditional search engines.&lt;/p&gt;

&lt;p&gt;Distinguishing between different areas of your site is also important. You should allow search indexing of your content but block automated bots from a signup page. This is what &lt;code&gt;robots.txt&lt;/code&gt; is supposed to be for, but a bot detection tool like &lt;a href="https://arcjet.com/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Arcjet&lt;/u&gt;&lt;/a&gt; lets you enforce different rules for different pages of your site.&lt;/p&gt;

&lt;p&gt;Operators like OAI-SearchBot offer value (e.g., traffic, citations). Others, like GPTBot, provide no benefit and can impose high costs. Treat each agent class differently. Blocking all AI bots is a blunt instrument.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to detect and block bots
&lt;/h2&gt;

&lt;p&gt;The first step to detecting and managing bots is to create rules in your &lt;code&gt;robots.txt&lt;/code&gt; file. Good bots like Google will behave and follow these rules. It’s a good exercise to develop an understanding of how you want to control bots on your site. &lt;a href="https://developers.google.com/search/docs/crawling-indexing/robots/intro?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Use Google’s documentation&lt;/u&gt;&lt;/a&gt; to guide creating the rules.&lt;/p&gt;
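
&lt;p&gt;For example, a minimal &lt;code&gt;robots.txt&lt;/code&gt; might allow general indexing while keeping crawlers away from a signup page and opting out of GPTBot entirely (the paths here are placeholders - adjust them for your own site):&lt;/p&gt;

```text
# Allow well-behaved crawlers, but keep them away from the signup flow
User-agent: *
Disallow: /signup

# Opt out of OpenAI training data collection (GPTBot documents that it
# honors robots.txt)
User-agent: GPTBot
Disallow: /
```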

&lt;p&gt;But we have to assume that the bad bots won’t follow these rules, so this is where we start to build layers of defenses. Start with low-cost signals (headers), then move to harder-to-spoof data (IP reputation, TLS/HTTP fingerprinting), and finally consider active challenges (CAPTCHAs, PoW).&lt;/p&gt;

&lt;h3&gt;
  
  
  Blocking user agents
&lt;/h3&gt;

&lt;p&gt;A surprising number of bad bots actually identify themselves with the user agent HTTP header. We often see requests identified as curl, python-urllib, or Go-http-client from simplistic scrapers that haven’t changed the default user agent. We track many hundreds of known user agents in our open source &lt;a href="https://github.com/arcjet/well-known-bots?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;well-known-bots project&lt;/u&gt;&lt;/a&gt; (forked from &lt;a href="https://github.com/monperrus/crawler-user-agents?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;crawler-user-agents&lt;/u&gt;&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;You can use open source projects like &lt;a href="https://github.com/omrilotan/isbot?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;isbot&lt;/u&gt;&lt;/a&gt; (Node.js), &lt;a href="https://nextjs.org/docs/app/api-reference/functions/userAgent?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;userAgent&lt;/u&gt;&lt;/a&gt; (Next.js), and &lt;a href="https://github.com/JayBizzle/Crawler-Detect?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;CrawlerDetect&lt;/u&gt;&lt;/a&gt; (PHP, &lt;a href="https://github.com/moskrc/CrawlerDetect?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;also available for Python&lt;/u&gt;&lt;/a&gt;) to write checks in your application web server or middleware.&lt;/p&gt;
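
&lt;p&gt;A minimal version of this check can be hand-rolled with a regular expression over the User-Agent header. This is an illustrative sketch, not the isbot API, and the pattern list is deliberately incomplete - real libraries track thousands of patterns:&lt;/p&gt;

```javascript
// Minimal bot check on the User-Agent header. Illustrative only:
// libraries like isbot maintain far larger, regularly updated pattern lists.
const BOT_PATTERN = /bot|crawler|spider|curl|python-urllib|go-http-client/i;

function looksLikeBot(userAgent) {
  // An empty or missing User-Agent is suspicious in itself
  if (!userAgent) return true;
  return BOT_PATTERN.test(userAgent);
}

console.log(looksLikeBot("curl/8.4.0")); // true
console.log(looksLikeBot("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)")); // false
```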

&lt;p&gt;There are two obvious downsides to this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;New crawlers and bots are released all the time. As we saw above, OpenAI has at least 3 bots, plus its Operator agent, and Anthropic &lt;a href="https://www.404media.co/websites-are-blocking-the-wrong-ai-scrapers-because-ai-companies-keep-making-new-ones/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;has changed the name of its bot multiple times&lt;/u&gt;&lt;/a&gt;. Keeping up with the latest user agent variants is time consuming.&lt;/li&gt;
&lt;li&gt;Clients can set the user agent header to whatever they like and can pretend to be something else. The User-Agent header &lt;a href="https://www.rfc-editor.org/rfc/rfc9110?ref=blog.arcjet.com#section-10.1.5" rel="noopener noreferrer"&gt;&lt;u&gt;should be set for every HTTP request and it has a specific format&lt;/u&gt;&lt;/a&gt;, but there is no enforcement of this. If you want to allow GoogleBot, a bad bot could pretend to be Google by using the same user agent header.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://docs.arcjet.com/bot-protection/quick-start?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Arcjet’s rule configuration&lt;/u&gt;&lt;/a&gt; allows you to choose specific bots to allow or deny and use categories of common bots, which get regularly updated. This means bot protection rules can be granular and easy to understand. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const aj = arcjet({
  key: process.env.ARCJET_KEY!, // Get your site key from https://app.arcjet.com
  rules: [
    detectBot({
      mode: "LIVE",
      // Block all bots except the following
      allow: [
        "CATEGORY:SEARCH_ENGINE", // Google, Bing, etc
        // Uncomment to allow these other common bot categories
        // See the full list at https://arcjet.com/bot-list
        //"CATEGORY:MONITOR", // Uptime monitoring services
        //"CATEGORY:PREVIEW", // Link previews e.g. Slack, Discord
      ],
    }),
  ],
});

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Verifying user agents
&lt;/h3&gt;

&lt;p&gt;To mitigate spoofed requests where a client pretends to be a bot we want to allow, you need to verify the request. For example, if we see a request with a GoogleBot user agent, we need to verify that request is actually coming from Google.&lt;/p&gt;

&lt;p&gt;The big crawler operators all provide methods for verifying their bots, e.g. &lt;a href="https://support.apple.com/en-gb/119829?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Applebot&lt;/u&gt;&lt;/a&gt;, &lt;a href="https://www.bing.com/webmasters/help/how-to-verify-bingbot-3905dc26?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Bing&lt;/u&gt;&lt;/a&gt;, &lt;a href="https://docs.datadoghq.com/synthetics/guide/identify_synthetics_bots/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Datadog&lt;/u&gt;&lt;/a&gt;, &lt;a href="https://developers.google.com/search/docs/crawling-indexing/verifying-googlebot?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Google&lt;/u&gt;&lt;/a&gt;, and &lt;a href="https://platform.openai.com/docs/bots?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;OpenAI&lt;/u&gt;&lt;/a&gt;. This is usually through a reverse DNS lookup to verify that the source IP address belongs to the organization claimed in the user agent. Some instead provide a list of IPs that needs to be checked, e.g. &lt;a href="https://developers.google.com/search/docs/crawling-indexing/verifying-googlebot?ref=blog.arcjet.com#automatic" rel="noopener noreferrer"&gt;&lt;u&gt;Google has a machine readable list of IPs&lt;/u&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example: Verifying Googlebot
&lt;/h3&gt;

&lt;p&gt;To check if a request is coming from a Google Crawler:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Run a reverse DNS lookup on the client source IP address - for example, using the &lt;code&gt;host&lt;/code&gt; command on macOS or Linux for a request from 66.249.66.1 claiming to be Google. Check that the resulting domain is googlebot.com, google.com, or googleusercontent.com:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;~ host 66.249.66.1
1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="2"&gt;
&lt;li&gt;Run a forward DNS lookup on the domain returned and check that the IP address matches the original source IP address:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;~ host crawl-66-249-66-1.googlebot.com
crawl-66-249-66-1.googlebot.com has address 66.249.66.1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can see that the IP address matches the source IP, so we know that this request is actually coming from Google.&lt;/p&gt;
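
&lt;p&gt;The two-step check can be automated in Node.js. Here is a sketch that takes the resolver functions as parameters so it can be tested offline; in production you would pass in &lt;code&gt;reverse&lt;/code&gt; and &lt;code&gt;lookup&lt;/code&gt; from &lt;code&gt;node:dns&lt;/code&gt; promises:&lt;/p&gt;

```javascript
// Verify a claimed Googlebot request: reverse DNS on the source IP,
// check the domain, then forward-resolve and compare with the source IP.
// reverseFn(ip) resolves to a list of hostnames; lookupFn(host) to an IP.
const GOOGLE_DOMAINS = [".googlebot.com", ".google.com", ".googleusercontent.com"];

async function verifyGooglebot(ip, reverseFn, lookupFn) {
  const hostnames = await reverseFn(ip); // e.g. ["crawl-66-249-66-1.googlebot.com"]
  for (const host of hostnames) {
    const trusted = GOOGLE_DOMAINS.some((domain) => host.endsWith(domain));
    if (!trusted) continue; // reverse DNS points somewhere else: not Google
    const resolved = await lookupFn(host); // forward lookup on the hostname
    if (resolved === ip) return true; // round trip matches: genuine Googlebot
  }
  return false; // spoofed user agent or mismatched DNS
}
```

&lt;p&gt;With the real resolvers this would be called as, roughly, &lt;code&gt;verifyGooglebot(ip, dns.reverse, (h) =&gt; dns.lookup(h).then((r) =&gt; r.address))&lt;/code&gt; where &lt;code&gt;dns&lt;/code&gt; is &lt;code&gt;require("node:dns").promises&lt;/code&gt;. Cache results - DNS lookups on every request add latency.&lt;/p&gt;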

&lt;h3&gt;
  
  
  IP address reputation
&lt;/h3&gt;

&lt;p&gt;As you receive requests from a variety of IP addresses, you can build up a picture of what normal traffic looks like. This technique has been used to prevent email spam for decades - the same principles apply when analyzing suspicious web traffic. If an IP address has recently been associated with bot traffic then it’s more likely that a new request is also a bot.&lt;/p&gt;

&lt;p&gt;Various commercial databases exist from providers like &lt;a href="https://www.maxmind.com/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;MaxMind&lt;/u&gt;&lt;/a&gt;, &lt;a href="https://ipinfo.io/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;IPInfo&lt;/u&gt;&lt;/a&gt;, and &lt;a href="https://ipapi.is/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;IP API&lt;/u&gt;&lt;/a&gt;. They offer an API or downloadable database of IPs with associated metadata like location and fraud scoring.&lt;/p&gt;

&lt;p&gt;Looking up IP data like the network owner, IP address type, and geo-location all help to build a picture of whether the request is likely to be abusive or not. For example, requests coming from a data center or cloud provider are highly likely to be automated so you might want to block them from signup forms. &lt;a href="https://blog.cloudflare.com/radar-2024-year-in-review/?ref=blog.arcjet.com#the-united-states-was-responsible-for-over-a-third-of-global-bot-traffic-amazon-web-services-was-responsible-for-12-7-of-global-bot-traffic-and-7-8-came-from-google" rel="noopener noreferrer"&gt;&lt;u&gt;Cloudflare reported&lt;/u&gt;&lt;/a&gt; that AWS was responsible for 12.7% of global bot traffic in 2024 and Fedora Linux &lt;a href="https://www.scrye.com/blogs/nirik/posts/2025/03/15/mid-march-infra-bits-2025/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;was forced to block all traffic from Brazil&lt;/u&gt;&lt;/a&gt; during a period of high abuse.&lt;/p&gt;
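
&lt;p&gt;To illustrate what such a lookup involves, here is a sketch that tests whether an IPv4 address falls inside a known datacenter CIDR range. The ranges below are placeholders - real lists come from the cloud providers or the commercial databases above:&lt;/p&gt;

```javascript
// Check whether an IPv4 address falls inside any known datacenter CIDR range.
// Placeholder ranges for illustration; use published provider lists in practice.
const DATACENTER_RANGES = ["3.0.0.0/9", "34.64.0.0/10"];

function ipToInt(ip) {
  // "3.5.1.1" -> 32-bit integer
  return ip.split(".").reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

function inCidr(ip, cidr) {
  const [network, prefixStr] = cidr.split("/");
  const shift = 32 - Number(prefixStr);
  // Compare only the network portion of each address; >>> keeps it unsigned
  return (ipToInt(ip) >>> shift) === (ipToInt(network) >>> shift);
}

function isDatacenterIp(ip) {
  return DATACENTER_RANGES.some((range) => inCidr(ip, range));
}
```

&lt;p&gt;A hit doesn’t prove abuse - plenty of legitimate API clients run in the cloud - which is why this signal is best combined with others rather than used to block outright.&lt;/p&gt;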

&lt;p&gt;IP data isn’t perfect though. IP geo-location is notorious for inaccuracies, especially for IP addresses linked to mobile or satellite networks. Bot operators cycle through large numbers of IP addresses across disparate networks, and &lt;a href="https://jan.wildeboer.net/2025/02/Blocking-Stealthy-Botnets/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;are buying access to residential proxies&lt;/u&gt;&lt;/a&gt; to mask their requests.&lt;/p&gt;

&lt;p&gt;IP-based decisions often cause false positives, so blocking solely on IP reputation is risky. &lt;a href="https://docs.arcjet.com/bot-protection/concepts?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Arcjet’s bot detection&lt;/u&gt;&lt;/a&gt; provides all the signals back into your code so you can decide how to handle suspicious requests e.g. flagging an online order for human review rather than immediately accepting it.&lt;/p&gt;

&lt;p&gt;The usual approach is to trigger a challenge, like a CAPTCHA. Early versions relied on distorted text that was difficult for OCR (Optical Character Recognition) software to parse. More recent iterations include "no-CAPTCHA reCAPTCHA" (which analyzes user behavior like mouse movements before presenting a challenge) and invisible CAPTCHAs that work in the background or require showing “proof of work”.&lt;/p&gt;

&lt;h3&gt;
  
  
  Proof of work
&lt;/h3&gt;

&lt;p&gt;Requiring all clients to spend some compute time completing a proof of work challenge introduces a cost to every request.&lt;/p&gt;

&lt;p&gt;This idea isn’t new. Bill Gates famously &lt;a href="https://www.cnet.com/tech/tech-industry/gates-reveals-his-magic-solution-to-spam/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;announced a similar idea back in 2004&lt;/u&gt;&lt;/a&gt; in response to huge volumes of email spam. The theory is that an individual human browser can afford to spend a micro-amount of time solving a challenge, whereas at crawling scale the cumulative cost makes mass scraping economically inefficient. The challenge difficulty can also increase with how suspicious the request looks, making it progressively more expensive for bots.&lt;/p&gt;

&lt;p&gt;There are several open source projects which implement this idea through a reverse proxy that will challenge all requests to your site: &lt;a href="https://github.com/TecharoHQ/anubis?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Anubis&lt;/u&gt;&lt;/a&gt;, &lt;a href="https://github.com/vaxerski/checkpoint?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Checkpoint&lt;/u&gt;&lt;/a&gt;, &lt;a href="https://git.gammaspectra.live/git/go-away?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;go-away&lt;/u&gt;&lt;/a&gt;, &lt;a href="https://zadzmo.org/code/nepenthes/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Nepenthes&lt;/u&gt;&lt;/a&gt;, &lt;a href="https://gitgud.io/fatchan/haproxy-protection/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;haproxy-protection&lt;/u&gt;&lt;/a&gt;, and &lt;a href="https://iocaine.madhouse-project.org/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Iocaine&lt;/u&gt;&lt;/a&gt; are all interesting implementations.&lt;/p&gt;

&lt;p&gt;AI crawlers are able to work around these though - &lt;a href="http://dx.doi.org/10.5281/zenodo.13318796?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;AI poisoning is a well-known technique, and many crawlers already evade such defenses&lt;/u&gt;&lt;/a&gt; - and there are downsides, &lt;a href="https://mastodon.social/@cks/114571090294492114?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;such as hurting accessibility&lt;/u&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The effectiveness of these types of proof of work or CAPTCHAs is an ongoing arms race. Modern AI can now solve many types of CAPTCHAs with increasing accuracy and speed. This means that while CAPTCHAs can deter simpler bots, determined attackers can bypass them, especially for high-value targets where the cost of solving the CAPTCHA is negligible compared to the potential profit. For example, paying a few cents (or even tens of dollars) to solve a challenge protecting sports or concert ticket purchases &lt;a href="https://behind.pretix.eu/2025/05/23/captchas-are-over/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;is worth it&lt;/u&gt;&lt;/a&gt; when the profits are in the hundreds of dollars.&lt;/p&gt;

&lt;h3&gt;
  
  
  HTTP message signatures
&lt;/h3&gt;

&lt;p&gt;Cloudflare &lt;a href="https://datatracker.ietf.org/doc/html/draft-meunier-web-bot-auth-architecture?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;has proposed a new signature verification technique&lt;/u&gt;&lt;/a&gt; where bots can identify themselves using request signing. Whilst there are some benefits like non-repudiation, it remains to be seen how this improves the existing approach to verifying bot IP addresses using reverse DNS.&lt;/p&gt;

&lt;p&gt;Whereas HTTP Message Signatures are focused on requests from automated bots, Apple introduced something similar with &lt;a href="https://developer.apple.com/news/?id=huqjyh7k&amp;amp;ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Private Access Tokens in 2022&lt;/u&gt;&lt;/a&gt; for browsers. Although &lt;a href="https://datatracker.ietf.org/doc/rfc9577/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;RFC 9577 (Privacy Pass HTTP Authentication Scheme)&lt;/u&gt;&lt;/a&gt; is progressing through the standardization process, it hasn’t had widespread adoption. It is &lt;a href="https://support.apple.com/en-us/102591?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;built into the Apple ecosystem&lt;/u&gt;&lt;/a&gt; and works in Safari, but no other browsers have adopted it (a &lt;a href="https://privacypass.github.io/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Chrome / Firefox extension&lt;/u&gt;&lt;/a&gt; is available).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F81q75rza9qefsqeesr79.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F81q75rza9qefsqeesr79.png" alt="Bot detection techniques for developers" width="800" height="253"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;HTTP Message Signatures for automated traffic Architecture.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  JA3/JA4 fingerprint
&lt;/h3&gt;

&lt;p&gt;The JA3 fingerprint &lt;a href="https://github.com/salesforce/ja3?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;was invented in 2017 at Salesforce&lt;/u&gt;&lt;/a&gt;. It’s based on hashing various characteristics of the SSL/TLS client negotiation metadata. The idea is that the same client will have the same fingerprint even if it is making requests across IP addresses and networks. JA3 has mostly been deprecated because of how easy it is to cause the hash to change just by making slight changes to network traffic e.g. reordering cipher suites. It has been &lt;a href="https://github.com/FoxIO-LLC/ja4?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;replaced by JA4&lt;/u&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The challenge with JA4 hashing is that &lt;a href="https://github.com/FoxIO-LLC/ja4/blob/main/technical_details/README.md?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;it's based on the TLS handshake metadata&lt;/u&gt;&lt;/a&gt;, such as the protocol version and number of ciphers. This is available if you run your own web servers, but not on modern platforms like Vercel, Netlify, and Fly.io because they run reverse proxy edge gateways for you (although Vercel calculates the JA3 and JA4 fingerprints for you and adds headers with the data).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F280vs8cdzakxij41pw8c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F280vs8cdzakxij41pw8c.png" alt="Bot detection techniques for developers" width="800" height="450"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;JA4: TLS Client Fingerprint&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;An alternative is &lt;a href="https://github.com/FoxIO-LLC/ja4/blob/main/technical_details/JA4H.md?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;JA4H&lt;/u&gt;&lt;/a&gt;, which is calculated from HTTP request metadata instead. However, JA4H is a proprietary algorithm, whereas JA4 is open source.&lt;/p&gt;

&lt;p&gt;Once you have the hash, there is still manual work to decide which fingerprints to block, just as with IP addresses. Fingerprints are best combined with IP reputation data when making automated decisions, to minimize the risk of false positives.&lt;/p&gt;
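
&lt;p&gt;As an illustrative sketch of combining the two signals (the fingerprint hash, reputation scores, and threshold below are made up for the example, and this is not Arcjet's algorithm):&lt;/p&gt;

```javascript
// Sketch: combine a JA4 fingerprint blocklist with IP reputation
// before blocking. All data here is illustrative.
const blockedFingerprints = new Set([
  "t13d1516h2_8daaf6152771_b0da82dd1658", // example JA4 hash
]);

function ipReputationScore(ip) {
  // Hypothetical reputation data: higher means more suspicious.
  const known = { "203.0.113.7": 0.9, "198.51.100.2": 0.1 };
  return known[ip] ?? 0.5;
}

function shouldBlock(ja4, ip) {
  // Require both signals to agree, reducing false positives from a
  // fingerprint that is shared by many different clients.
  if (blockedFingerprints.has(ja4)) {
    return ipReputationScore(ip) >= 0.7;
  }
  return false;
}
```

&lt;p&gt;Requiring agreement between signals means a popular browser fingerprint alone never triggers a block.&lt;/p&gt;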

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw1rgz6as3mmaty2ui6f2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw1rgz6as3mmaty2ui6f2.png" alt="Bot detection techniques for developers" width="800" height="450"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;JA4H: HTTP Client Fingerprint&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Rate limiting
&lt;/h3&gt;

&lt;p&gt;IP address based rate limiting is a basic defense which, like user agent blocking, can help against some of the simpler attacks. However, it is easily bypassed by rotating IP addresses. Keying the rate limit on different characteristics can help - a session or user ID is the best option if your site requires a login.&lt;/p&gt;

&lt;p&gt;Otherwise, using a fingerprint (like JA3/JA4) will help manage anonymous clients. Using compound keys that include features such as the IP address, path, and fingerprint as supported by &lt;a href="https://docs.arcjet.com/rate-limiting/configuration?ref=blog.arcjet.com#characteristics" rel="noopener noreferrer"&gt;&lt;u&gt;Arcjet’s rate limiting functionality&lt;/u&gt;&lt;/a&gt; can help create sophisticated protections.&lt;/p&gt;
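
&lt;p&gt;To make the idea concrete, here is a minimal in-memory sketch (not Arcjet's implementation) of a fixed-window rate limiter keyed on a compound of IP address, path, and fingerprint:&lt;/p&gt;

```javascript
// Minimal fixed-window rate limiter keyed on a compound key of IP
// address, path, and fingerprint. Illustrative only: production systems
// use sliding windows and shared storage across instances.
const windows = new Map();

function rateLimit(ip, path, fingerprint, max, windowMs, now = Date.now()) {
  const key = `${ip}:${path}:${fingerprint}`;
  const entry = windows.get(key);
  // Start a fresh window if none exists or the current one has expired.
  if (!entry || now - entry.start >= windowMs) {
    windows.set(key, { start: now, count: 1 });
    return { allowed: true };
  }
  entry.count += 1;
  return { allowed: !(entry.count > max) };
}
```

&lt;p&gt;Because the key includes several characteristics, two different clients behind the same IP address get separate buckets, which reduces false positives from shared NATs.&lt;/p&gt;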

&lt;h2&gt;
  
  
  Conclusions
&lt;/h2&gt;

&lt;p&gt;No one technique is enough because &lt;a href="https://blog.arcjet.com/bot-detection-isnt-perfect/" rel="noopener noreferrer"&gt;&lt;u&gt;bot detection isn’t perfect&lt;/u&gt;&lt;/a&gt;. Instead, a robust defense strategy relies on a multi-layered approach. Starting with &lt;code&gt;robots.txt&lt;/code&gt; to guide well-behaved bots is a good first step, but it must be augmented by more assertive techniques. These include verifying user agents, leveraging IP address reputation data, employing TLS and HTTP fingerprinting like JA3/JA4, implementing intelligent rate limiting, and considering proof-of-work challenges or CAPTCHAs where appropriate.&lt;/p&gt;

&lt;p&gt;The key is to remain vigilant and adapt. Understanding the different types of bots, their evasion techniques, and the array of available countermeasures allows site owners to make informed decisions. Or use a product like &lt;a href="https://arcjet.com/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Arcjet&lt;/u&gt;&lt;/a&gt; which takes away a lot of the hassle and means the protections are built right into the logic of your application.&lt;/p&gt;

</description>
      <category>botdetection</category>
    </item>
    <item>
      <title>Low latency global routing with AWS Global Accelerator</title>
      <dc:creator>David Mytton</dc:creator>
      <pubDate>Tue, 27 May 2025 09:16:10 +0000</pubDate>
      <link>https://dev.to/arcjet/low-latency-global-routing-with-aws-global-accelerator-1n60</link>
      <guid>https://dev.to/arcjet/low-latency-global-routing-with-aws-global-accelerator-1n60</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1t5ohv9g45euiund1kp3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1t5ohv9g45euiund1kp3.jpg" alt="Low latency global routing with AWS Global Accelerator" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://arcjet.com/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;Arcjet&lt;/a&gt; performs real-time security analysis in the critical path of API and authentication flows. That means latency isn’t just a consideration - it’s a core design constraint.&lt;/p&gt;

&lt;p&gt;To meet our end-to-end p50 latency SLA of 20–30ms, we deploy globally, use persistent HTTP/2 connections, and rely on AWS's network to route requests to the nearest healthy region.&lt;/p&gt;

&lt;p&gt;A core component of our architecture is &lt;a href="https://aws.amazon.com/global-accelerator/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;AWS Global Accelerator&lt;/a&gt;. This service routes traffic through a set of &lt;a href="https://en.wikipedia.org/wiki/Anycast?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;Anycast&lt;/a&gt; IPs via distributed AWS edge locations (points of presence) to the closest healthy AWS region, utilizing AWS's private global network for more stable and lower-latency pathways compared to the public internet.&lt;/p&gt;

&lt;p&gt;This post explains the details of how we use AWS Global Accelerator and other AWS services to achieve our SLA targets. While our context is delivering a low-overhead security product for developers, the principles and benefits discussed are applicable to any mission-critical, latency-sensitive global application.&lt;/p&gt;

&lt;h2&gt;
  
  
  The challenge - request analysis in the hot path
&lt;/h2&gt;

&lt;p&gt;Arcjet is delivered as an SDK so it can be tightly integrated into the logic of the application and benefit from the full request context. Security rules can be adjusted dynamically based on user and session characteristics, and the results can be incorporated into how the application behaves.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.arcjet.com/minimizing-latency-for-humans-machines/" rel="noopener noreferrer"&gt;Everything is downstream of latency&lt;/a&gt;. That’s why Arcjet takes a local-first approach: rules are evaluated in-process via a WebAssembly module bundled in the SDK. Many decisions can complete within 1-2ms, but others require a network call, such as when our dynamic IP reputation database is used as one of the security signals.&lt;/p&gt;

&lt;p&gt;In cases where our cloud decision API is involved, we set ourselves a p50 response time goal of 20ms, which leaves 5-10ms for the network round trip so that we can hit our &lt;a href="https://blog.arcjet.com/how-we-achieve-our-25ms-p95-response-time-sla/" rel="noopener noreferrer"&gt;end-to-end latency goal of 20-30ms&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This goal applies globally. Developers deploy their applications everywhere, so centralizing our API is not an option.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyisv3kpzuji95t7wmmyu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyisv3kpzuji95t7wmmyu.png" alt="Low latency global routing with AWS Global Accelerator" width="800" height="648"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Architecture diagram showing the request being processed by the Arcjet SDK within the application environment. An Arcjet API call may be used to enrich the decision. Read more about the Arcjet Architecture.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The solution - Global Accelerator + multi-region
&lt;/h2&gt;

&lt;p&gt;Arcjet’s cloud decision API is written in Go and provides both JSON and gRPC interfaces to support different environments, with gRPC preferred due to the more optimized Protocol Buffers message format. &lt;/p&gt;

&lt;p&gt;Our cloud decision API is containerized and deployed across multiple availability zones within each AWS region using AWS Elastic Kubernetes Service (EKS), fronted by an AWS Application Load Balancer (ALB). The ALB distributes incoming API traffic across our container instances; its cross-zone load balancing capability ensures that if one availability zone experiences an issue, traffic is automatically routed to instances in the other healthy zones within that region, enhancing resilience.&lt;/p&gt;

&lt;p&gt;From the developer’s perspective, Arcjet is a single endpoint. Behind the scenes, Global Accelerator ensures each request is routed to the closest healthy AWS region using latency-based routing and active health checks. This happens automatically - no configuration or region awareness required from the developer.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ej1witjuda9ob9p6je3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ej1witjuda9ob9p6je3.png" alt="Low latency global routing with AWS Global Accelerator" width="800" height="453"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;AWS Global Accelerator uses a global network of 119 Points of Presence in 94 cities across 51 countries (source).&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Application to edge to region
&lt;/h2&gt;

&lt;p&gt;Setting up a new connection is often the slowest part of communicating with our API. Establishing a TCP connection and performing a TLS handshake takes multiple round trips, adding many milliseconds to each new request. This can quickly eat into our 5-10ms network round trip budget and is particularly problematic in serverless environments where cold starts are common.&lt;/p&gt;

&lt;p&gt;To mitigate this, our SDK establishes a persistent HTTP/2 connection to our API, allowing multiple requests to be multiplexed over a single, long-lived connection. Global Accelerator helps minimize the round trip time because the initial TLS handshake can be completed by the closest network edge, which is usually much closer than the AWS region serving the request.&lt;/p&gt;

&lt;p&gt;The network path between the edge location and the AWS region uses the AWS global network, which is much more optimized compared to routing over the public internet. Global Accelerator helps eliminate network jitter and unpredictable routing by using AWS’s private backbone between edge locations and regions. This stability is critical for maintaining consistent performance, especially in bursty serverless environments.&lt;/p&gt;

&lt;p&gt;We run in &amp;gt;10 AWS regions, launching more based on customer demand. Each regional deployment is completely independent, a key design choice for maximizing availability and fault isolation, orchestrated seamlessly by AWS Global Accelerator's intelligent routing. If traffic is re-routed, this happens within AWS’s network without requiring our SDK client to reconnect, which avoids a cold start.&lt;/p&gt;

&lt;h2&gt;
  
  
  Performance in practice
&lt;/h2&gt;

&lt;p&gt;We measure both internal processing latency and end-to-end cold start latency from the SDK’s perspective. Over the last 7 days, the median internal API response time across all regions was 12ms (p95: 20ms).&lt;/p&gt;

&lt;p&gt;In a full cold start scenario where a new connection is initiated by the Arcjet SDK, we recorded a p50 of 25ms from request initiation to response delivery - including 2ms for TCP and 6ms for TLS handshakes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusions
&lt;/h2&gt;

&lt;p&gt;Low latency isn’t just a feature - it’s fundamental to Arcjet’s design for real-time application security. By leveraging AWS’s global edge network and services like Global Accelerator, we offload the hardest parts of distributed networking and stay focused on building developer-first security features.&lt;/p&gt;

&lt;p&gt;If you're building latency-sensitive APIs, especially in a multi-region or serverless world, Global Accelerator can probably help.&lt;/p&gt;

</description>
      <category>networking</category>
      <category>cloud</category>
      <category>engineering</category>
    </item>
    <item>
      <title>Next.js middleware bypasses: How to tell if you were affected?</title>
      <dc:creator>David Mytton</dc:creator>
      <pubDate>Fri, 11 Apr 2025 15:11:48 +0000</pubDate>
      <link>https://dev.to/arcjet/nextjs-middleware-bypasses-how-to-tell-if-you-were-affected-1ep6</link>
      <guid>https://dev.to/arcjet/nextjs-middleware-bypasses-how-to-tell-if-you-were-affected-1ep6</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdqy8k4c5ei13injwx6oe.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdqy8k4c5ei13injwx6oe.jpg" alt="Next.js middleware bypasses: How to tell if you were affected?" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At &lt;a href="https://arcjet.com/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;Arcjet&lt;/a&gt;, we found the recent Next.js middleware bypass vulnerabilities (&lt;a href="https://vercel.com/blog/postmortem-on-next-js-middleware-bypass?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;CVE-2025-29927&lt;/a&gt; &amp;amp; &lt;a href="https://github.com/vercel/next.js/security/advisories/GHSA-7gfc-8cq8-jh5f?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;CVE‑2024‑51479&lt;/a&gt;) especially relevant - not only are we Next.js users ourselves, but we wanted to see how our own security SDK could help Next.js users manage incident response.&lt;/p&gt;

&lt;p&gt;Authorization bypasses rank among the most critical security threats because they allow attackers to enter areas of an application that should remain off-limits. Given that Next.js is one of the most popular JavaScript frameworks and using middleware for authentication is a common pattern, there is a high chance of users being affected by these vulnerabilities.&lt;/p&gt;

&lt;p&gt;In this post, we’ll examine both of these vulnerabilities and walk through how to identify suspicious patterns to determine if you were affected.&lt;/p&gt;

&lt;p&gt;For those already using Arcjet, we’ll also discuss how you can leverage our platform to confirm whether your applications were exploited. Our security SDK integrates with your application’s middleware (or directly inside routes), providing real-time inspection of every incoming HTTP request. This gives us the ability to search for attack signatures in past requests.&lt;/p&gt;

&lt;h2&gt;
  
  
  CVE-2025-29927: Next.js Middleware bypass via &lt;code&gt;x‑middleware‑subrequest&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/vercel/next.js/security/advisories/GHSA-f82v-jwr5-mffw?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;GHSA-f82v-jwr5-mffw&lt;/a&gt; (&lt;a href="https://vercel.com/blog/postmortem-on-next-js-middleware-bypass?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;CVE-2025-29927&lt;/a&gt;) is a critical Next.js vulnerability that lets attackers bypass middleware authorization checks by providing a crafted &lt;code&gt;x‑middleware‑subrequest&lt;/code&gt; header. It affects all Next.js versions released after 11.1.4, with patches now available in 12.3.5, 13.5.9, 14.2.25, and 15.2.3.&lt;/p&gt;

&lt;p&gt;While outright blocking &lt;code&gt;x‑middleware‑subrequest&lt;/code&gt; can stop the exploit, many services legitimately rely on the header for internal requests, as Cloudflare discovered when they attempted a blanket block that &lt;a href="https://x.com/elithrar/status/1903526240847331362?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;had to be rolled back&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Any mitigation that drops this header risks breaking valid functionality, underscoring the need for a more targeted fix, and highlighting a classic problem with generic, network-level WAFs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Was I affected?
&lt;/h3&gt;

&lt;p&gt;Once you apply the patch (or block requests with the header), the question turns to whether or not you were affected. Because the vulnerability dates back to code introduced in 2022 (Next.js &amp;gt; 11.1.4), it’s crucial to search historical logs for requests containing a suspicious &lt;code&gt;x‑middleware‑subrequest&lt;/code&gt; header.&lt;/p&gt;

&lt;p&gt;Even though CVE‑2025‑29927 was only recently disclosed, there was a lag in releasing patches (and then announcing the vulnerability). Patch adoption takes time and real-world exploitation has been possible for several years - zero-day vulnerabilities often circulate in underground markets long before they’re officially revealed. This means it's possible to have been affected for some time.&lt;/p&gt;

&lt;h3&gt;
  
  
  What to look for in logs
&lt;/h3&gt;

&lt;p&gt;The exploit signature for CVE-2025-29927 is very simple and depends on the version of Next.js you’re using. As explained in &lt;a href="https://zhero-web-sec.github.io/research-and-things/nextjs-and-the-corrupt-middleware?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;the disclosure writeup&lt;/a&gt;, we can look for specific payloads:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Next.js prior to 12.2&lt;/strong&gt;: requests to known sensitive paths such as &lt;code&gt;/admin&lt;/code&gt; where the &lt;code&gt;x-middleware-subrequest&lt;/code&gt; header contains &lt;code&gt;pages/_middleware&lt;/code&gt; or &lt;code&gt;pages/admin/_middleware&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Next.js 12.2 onwards&lt;/strong&gt;: requests with an &lt;code&gt;x-middleware-subrequest&lt;/code&gt; header containing &lt;code&gt;middleware&lt;/code&gt; or &lt;code&gt;src/middleware&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Next.js 15 onwards&lt;/strong&gt;: requests with an &lt;code&gt;x-middleware-subrequest&lt;/code&gt; header containing repeated values, due to the introduction of a &lt;code&gt;MAX_RECURSION_DEPTH&lt;/code&gt; constant set to &lt;code&gt;5&lt;/code&gt;, e.g. &lt;code&gt;middleware:middleware:middleware:middleware:middleware&lt;/code&gt; or &lt;code&gt;src/middleware:src/middleware:src/middleware:src/middleware:src/middleware&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Any requests matching these signatures could indicate compromise.&lt;/p&gt;
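
&lt;p&gt;A log-scanning sketch of these signatures might look like the following (the matching is illustrative and should be adapted to your Next.js version and log format):&lt;/p&gt;

```javascript
// Sketch: test a logged x-middleware-subrequest header value against
// the published CVE-2025-29927 exploit signatures.
const signatures = [
  /^pages\/_middleware$/,                          // prior to 12.2
  /^pages\/.*\/_middleware$/,                      // prior to 12.2, nested
  /^(src\/)?middleware$/,                          // 12.2 onwards
  /^((src\/)?middleware:){4}(src\/)?middleware$/,  // 15 onwards
];

function isSuspicious(headerValue) {
  if (!headerValue) return false;
  return signatures.some((re) => re.test(headerValue));
}
```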

&lt;h2&gt;
  
  
  CVE-2024-51479: Next.js middleware pathname-based authorization bypass
&lt;/h2&gt;

&lt;p&gt;Looking further back, another critical middleware flaw, &lt;a href="https://github.com/vercel/next.js/security/advisories/GHSA-7gfc-8cq8-jh5f?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;GHSA-7gfc-8cq8-jh5f&lt;/a&gt; / CVE-2024-51479, emerged in December 2024. It affected versions from 9.5.5 up to 14.2.14 and was patched in Next.js 14.2.15.&lt;/p&gt;

&lt;p&gt;This issue occurs when the middleware’s authorization logic depends on the pathname: the middleware inadvertently allows access to the top‑level route (&lt;code&gt;/admin&lt;/code&gt;) even when sub-paths like &lt;code&gt;/admin/users&lt;/code&gt; are correctly matched.&lt;/p&gt;

&lt;h3&gt;
  
  
  Was I affected?
&lt;/h3&gt;

&lt;p&gt;In the snippet below (adapted from &lt;a href="https://www.herodevs.com/vulnerability-directory/cve-2024-51479?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;Herodevs&lt;/a&gt;), all requests to paths under &lt;code&gt;/admin&lt;/code&gt; that are missing the &lt;code&gt;authorization&lt;/code&gt; header should be blocked.&lt;/p&gt;

&lt;p&gt;However, this vulnerability meant that while requests to &lt;code&gt;/admin/users&lt;/code&gt; are denied, requests to &lt;code&gt;/admin&lt;/code&gt; would be allowed. If your middleware had similar logic, any sensitive pages at top-level URLs could be accessed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import { NextResponse } from 'next/server';

export function middleware(request) {
  const { pathname } = request.nextUrl;
  // Simulate authorization check: block access to /admin unless a condition is met
  if (pathname.startsWith('/admin') &amp;amp;&amp;amp; !request.headers.get('authorization')) {
    return new Response('Unauthorized', { status: 401 });
  }
  return NextResponse.next();
}

export const config = {
  matcher: ['/admin/:path*', '/admin'],
};

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  What to look for in logs
&lt;/h3&gt;

&lt;p&gt;This vulnerability hinges on two factors:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A request to a sensitive top-level path (e.g. &lt;code&gt;/admin&lt;/code&gt;) where the middleware is checking for the pathname. Only the exact root path is exposed, not its sub-routes (like &lt;code&gt;/admin/users&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;An authorization check in middleware alone. In the example, we rely on a header named &lt;code&gt;authorization&lt;/code&gt; (plus the pathname check above) but it could be any other header.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;As with CVE‑2025‑29927, any application that relied solely on middleware for authorization checks would have been vulnerable. If you run the check (or additional checks) in the routes themselves, you have another layer of security. Middleware is good for running initial checks and redirecting users to an authentication flow, but it is a good idea to perform further authorization checks in each route.&lt;/p&gt;
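
&lt;p&gt;For example, a route handler can repeat the check so authorization does not depend on middleware alone. This is a sketch: the header name and response shape are illustrative, and in a real Next.js App Router application this function would be exported from a &lt;code&gt;route.js&lt;/code&gt; file:&lt;/p&gt;

```javascript
// Defense in depth: repeat the authorization check inside the route
// handler itself, so a middleware bypass alone is not enough.
async function GET(request) {
  const auth = request.headers.get("authorization");
  if (!auth) {
    // Middleware should have caught this already, but check again.
    return new Response("Unauthorized", { status: 401 });
  }
  return Response.json({ ok: true });
}
```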

&lt;p&gt;To confirm whether you were exposed, find any requests to top-level protected paths (e.g., &lt;code&gt;/admin&lt;/code&gt;) that lacked the expected authorization header (like &lt;code&gt;authorization&lt;/code&gt; or &lt;code&gt;x-auth-token&lt;/code&gt;). If those requests bypassed your middleware without triggering an error, you were likely vulnerable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Request lookup using Arcjet
&lt;/h2&gt;

&lt;p&gt;When conducting an incident analysis, it’s important to be able to answer questions like: was I affected? How long was I affected for? What data was accessed? The answers to these determine what kind of legal disclosures you need to make and how you inform customers.&lt;/p&gt;

&lt;p&gt;If you don’t have request logs then how do you know if you were impacted?&lt;/p&gt;

&lt;p&gt;Arcjet analyzes and reports granular request metadata such as paths and headers (though sensitive headers like &lt;code&gt;authorization&lt;/code&gt; are redacted and the request body is never sent outside your environment). With Arcjet installed, you’ll have a detailed request history to check for potential exploit attempts if (or when) a new vulnerability surfaces. Request logs are available to all Arcjet users, although retention time depends on your pricing plan.&lt;/p&gt;

&lt;p&gt;If you need help with determining if you are affected (or have ever been affected) by these or other vulnerabilities, &lt;a href="https://docs.arcjet.com/support?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;reach out to our support team&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>nextjs</category>
      <category>security</category>
    </item>
    <item>
      <title>Secure local Node.js dev servers with OrbStack</title>
      <dc:creator>David Mytton</dc:creator>
      <pubDate>Fri, 07 Feb 2025 15:05:03 +0000</pubDate>
      <link>https://dev.to/arcjet/secure-local-nodejs-dev-servers-with-orbstack-10lo</link>
      <guid>https://dev.to/arcjet/secure-local-nodejs-dev-servers-with-orbstack-10lo</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpks11a0y3ilvckwrqdaw.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpks11a0y3ilvckwrqdaw.jpg" alt="Secure local Node.js dev servers with OrbStack" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At &lt;a href="https://arcjet.com/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;Arcjet&lt;/a&gt;, we use &lt;a href="https://orbstack.dev/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;OrbStack&lt;/a&gt; to manage our local development environment. We run various containers in production, so we use Docker Compose to mirror those services locally. This includes our &lt;a href="https://blog.arcjet.com/how-we-achieve-our-25ms-p95-response-time-sla/" rel="noopener noreferrer"&gt;low-latency security decision API&lt;/a&gt; used by our security as code SDKs, the dashboard webapp, website, docs, and backend processing components.&lt;/p&gt;

&lt;p&gt;OrbStack is a feature-rich alternative to (but compatible with) Docker Desktop. It has a significantly better UI, is much more performant, and comes with powerful features like &lt;a href="https://docs.orbstack.dev/docker/domains?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Custom Domains&lt;/u&gt;&lt;/a&gt; and &lt;a href="https://orbstack.dev/blog/orbstack-1.1-https?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Automatic HTTPS&lt;/u&gt;&lt;/a&gt;. These allow us to mirror our production environment very closely, particularly ensuring we have full SSL configured locally so we can properly test things like &lt;a href="https://blog.arcjet.com/nosecone-a-library-for-setting-security-headers-in-next-js-sveltekit-node-js-bun-and-deno/" rel="noopener noreferrer"&gt;&lt;u&gt;security headers&lt;/u&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Recently, &lt;a href="https://docs.orbstack.dev/release-notes?ref=blog.arcjet.com#v1-9-0" rel="noopener noreferrer"&gt;&lt;u&gt;OrbStack 1.9&lt;/u&gt;&lt;/a&gt; made improvements so SSL certificates are now trusted between containers. I decided to spend some time improving our development experience by replacing the self-signed certificates we previously used between containers with this new OrbStack functionality.&lt;/p&gt;

&lt;h2&gt;
  
  
  Working… Sometimes
&lt;/h2&gt;

&lt;p&gt;The first change I made was removing a custom HTTP transport workaround to allow the local certificates between our services written in Go. This was very straightforward and no other changes were needed—the Go services trusted the certificate and seamlessly communicated via HTTPS. This allowed me to remove the self-signed certificates generated by &lt;a href="https://github.com/FiloSottile/mkcert?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;mkcert&lt;/u&gt;&lt;/a&gt; and remove a setup step for our development environment.&lt;/p&gt;

&lt;p&gt;However, whenever I tried to make the same changes for our Node.js services, they would fail with an error: &lt;code&gt;self-signed certificate in certificate chain&lt;/code&gt;. This didn’t make sense to me because OrbStack claimed that these certificates were trusted between containers and the error went away in Go code with the 1.9 release.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bundled certificates
&lt;/h2&gt;

&lt;p&gt;In scouring the Node.js documentation, I discovered the &lt;a href="https://nodejs.org/api/cli.html?ref=blog.arcjet.com#--use-bundled-ca---use-openssl-ca" rel="noopener noreferrer"&gt;&lt;u&gt;&lt;code&gt;--use-openssl-ca&lt;/code&gt;&lt;/u&gt;&lt;/a&gt; command-line flag. The documentation explicitly calls out:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The bundled CA store, as supplied by Node.js, is a snapshot of Mozilla CA store that is fixed at release time. It is identical on all supported platforms.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This means that Node.js bundles a static snapshot of &lt;a href="https://wiki.mozilla.org/CA?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Mozilla’s Certificate Authority&lt;/u&gt;&lt;/a&gt; store, which it uses by default to validate certificates. Once I understood this, it made sense that each Node.js service still flagged the OrbStack certificates as untrusted—they aren’t included in Mozilla’s CA store!&lt;/p&gt;

&lt;p&gt;Luckily, Node.js accepts the &lt;code&gt;--use-openssl-ca&lt;/code&gt; flag to opt out of the bundled CA store and instead rely on OpenSSL’s CA store. The documentation even explains that changes to the CA require this flag:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Using OpenSSL store allows for external modifications of the store. For most Linux and BSD distributions, this store is maintained by the distribution maintainers and system administrators. OpenSSL CA store location is dependent on configuration of the OpenSSL library but this can be altered at runtime using environment variables.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Custom OpenSSL CA store location in a container
&lt;/h2&gt;

&lt;p&gt;We want this to apply to any instance of the &lt;code&gt;node&lt;/code&gt; command run inside our containers without needing to specify the flag each time. We can apply this globally by setting the &lt;code&gt;NODE_OPTIONS&lt;/code&gt; environment variable in our container. I set this in our &lt;code&gt;docker-compose.yml&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;node-service:
  build:
    context: .
    dockerfile: services/node-service/Dockerfile
  environment:
    - NODE_OPTIONS="--use-openssl-ca"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Root certificate
&lt;/h2&gt;

&lt;p&gt;Even with this option set, communication from our Node.js services was still failing. It &lt;a href="https://github.com/orbstack/orbstack/issues/1159?ref=blog.arcjet.com#issuecomment-2482646337" rel="noopener noreferrer"&gt;&lt;u&gt;turns out&lt;/u&gt;&lt;/a&gt; that OrbStack mounts the root certificate at &lt;code&gt;/usr/local/share/ca-certificates/orbstack-root.crt&lt;/code&gt; but the OpenSSL CA doesn’t know about it.&lt;/p&gt;

&lt;p&gt;To make Node.js aware of the root certificate, we need to set the &lt;a href="https://nodejs.org/api/cli.html?ref=blog.arcjet.com#ssl_cert_filefile" rel="noopener noreferrer"&gt;&lt;u&gt;&lt;code&gt;SSL_CERT_FILE&lt;/code&gt;&lt;/u&gt;&lt;/a&gt; environment variable. I updated our &lt;code&gt;docker-compose.yml&lt;/code&gt; file with this variable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;node-service:
  build:
    context: .
    dockerfile: services/node-service/Dockerfile
  environment:
    - NODE_OPTIONS="--use-openssl-ca"
    - SSL_CERT_FILE=/usr/local/share/ca-certificates/orbstack-root.crt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Perhaps we could apply some other commands to avoid setting the &lt;code&gt;SSL_CERT_FILE&lt;/code&gt; variable, but I wasn’t sure which magic incantation to apply.&lt;/p&gt;

&lt;h2&gt;
  
  
  OpenSSL headers are not in slim containers
&lt;/h2&gt;

&lt;p&gt;Be aware of your base containers when applying this technique. Using a &lt;code&gt;*-slim&lt;/code&gt; container will fail with &lt;code&gt;ERR_SSL_WRONG_VERSION_NUMBER&lt;/code&gt; because the OpenSSL headers are removed to slim down the container. Instead, use the full container image or reinstall the OpenSSL headers that were removed.&lt;/p&gt;

&lt;h2&gt;
  
  
  HTTPS dev servers
&lt;/h2&gt;

&lt;p&gt;With all of the above applied, we have seamless HTTPS communication in our entire development stack—our browser communicates over HTTPS with our applications which communicate over HTTPS to various other services.&lt;/p&gt;

&lt;p&gt;Development certificates have always frustrated me, so I am delighted that none of this required manually generating a certificate or committing a development certificate into the repository. Finally, we have a streamlined onboarding process for our development environment with full HTTPS support!&lt;/p&gt;

&lt;h2&gt;
  
  
  Bonus: Vite
&lt;/h2&gt;

&lt;p&gt;Recently, Vite &lt;a href="https://github.com/vitejs/vite/pull/19234?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;fixed&lt;/u&gt;&lt;/a&gt; a &lt;a href="https://github.com/advisories/GHSA-vg6x-rcgg-rjx6?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;CVE&lt;/u&gt;&lt;/a&gt; that would allow a malicious page to interact with the dev server. Their solution (more or less) was to disallow anything other than the localhost domain from communicating with the dev server.&lt;/p&gt;

&lt;p&gt;This is a problem with the &lt;a href="https://docs.orbstack.dev/docker/domains?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;OrbStack Custom Domains feature&lt;/u&gt;&lt;/a&gt; because you are accessing the Vite dev server via a domain such as &lt;a href="https://modest_bhaskara.orb.local/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;https://modest_bhaskara.orb.local&lt;/u&gt;&lt;/a&gt; instead of &lt;code&gt;localhost&lt;/code&gt;. You can solve this by setting your custom domain in &lt;a href="https://vite.dev/config/server-options.html?ref=blog.arcjet.com#server-allowedhosts" rel="noopener noreferrer"&gt;&lt;u&gt;&lt;code&gt;server.allowedHosts&lt;/code&gt;&lt;/u&gt;&lt;/a&gt; in your &lt;code&gt;vite.config.js&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export default {
  server: {
    allowedHosts: ["modest_bhaskara.orb.local"]
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Unfortunately, OrbStack doesn’t provide this value inside the container. If it were available as an environment variable, we could specify it as a generalized &lt;code&gt;process.env&lt;/code&gt; property access. However, the custom domain names are based on the container name or &lt;a href="https://docs.orbstack.dev/docker/domains?ref=blog.arcjet.com#custom" rel="noopener noreferrer"&gt;&lt;u&gt;customizable with labels&lt;/u&gt;&lt;/a&gt;, so we know what the domain will be for each container.&lt;/p&gt;
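&lt;p&gt;If OrbStack ever does expose the domain inside the container, the config could read it instead of hardcoding it. This sketch assumes a hypothetical &lt;code&gt;ORB_DOMAIN&lt;/code&gt; environment variable, which OrbStack does not currently set:&lt;/p&gt;

```javascript
// vite.config.js sketch. ORB_DOMAIN is hypothetical: OrbStack does not
// currently provide the custom domain as an environment variable.
export default {
  server: {
    // Fall back to the hardcoded domain when the variable is absent.
    allowedHosts: [process.env.ORB_DOMAIN ?? "modest_bhaskara.orb.local"],
  },
};
```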

&lt;p&gt;It’s great to see these regular improvements to OrbStack, which is a core part of our development setup. It means we can remove dev-only workarounds so our development environment is as close as possible to mirroring production.&lt;/p&gt;

</description>
      <category>engineering</category>
    </item>
    <item>
      <title>Does Next.js need a WAF?</title>
      <dc:creator>David Mytton</dc:creator>
      <pubDate>Thu, 16 Jan 2025 12:05:11 +0000</pubDate>
      <link>https://dev.to/arcjet/does-nextjs-need-a-waf-13ab</link>
      <guid>https://dev.to/arcjet/does-nextjs-need-a-waf-13ab</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F34euds47xqptx845i7cb.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F34euds47xqptx845i7cb.jpg" alt="Does Next.js need a WAF?" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The fact that Vercel, the developers of Next.js, enable their &lt;a href="https://vercel.com/docs/security/vercel-waf?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Web Application Firewall&lt;/u&gt;&lt;/a&gt; by default, free for all accounts, suggests that yes, Next.js needs a WAF!&lt;/p&gt;

&lt;p&gt;Throwing a network-level WAF in front of your application is an easy way to defend against attacks. Next.js is no different from any other web application in that sense. Although React provides some out-of-the-box protections against things like Cross Site Scripting (XSS), Next.js itself has had past vulnerabilities and unsafe coding can introduce others. You still need to follow a &lt;a href="https://blog.arcjet.com/next-js-security-checklist/" rel="noopener noreferrer"&gt;&lt;u&gt;security checklist for Next.js&lt;/u&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;But there are tradeoffs from using a WAF. Analyzing every request adds latency and different areas of your application may be higher risk than others. Arcjet tackles this differently through our &lt;a href="https://docs.arcjet.com/shield/quick-start?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Shield WAF&lt;/u&gt;&lt;/a&gt; feature - we still analyze requests for common threats, but do so in the background. This minimizes latency and helps avoid false positives because multiple requests can be analyzed.&lt;/p&gt;

&lt;p&gt;Arcjet is also more closely integrated with your code so you can dynamically adjust your application logic depending on the response. You might want to instantly block suspicious requests from anonymous users, but if you know the user is signed into an enterprise account and completed 2FA, you may just want to flag the request or trigger an alert.&lt;/p&gt;
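&lt;p&gt;As a conceptual sketch (not the Arcjet SDK API), that decision logic might look like this, where the session context picks between blocking outright and flagging for review:&lt;/p&gt;

```typescript
// Conceptual sketch only: choose a response to a suspicious request based on
// how much we trust the session. Not the actual Arcjet SDK API.
type Session = { signedIn: boolean; completedTwoFactor: boolean };

export function respondToSuspicion(session: Session): "block" | "flag" {
  const trusted = [session.signedIn, session.completedTwoFactor].every(Boolean);
  // Trusted enterprise users get flagged for review; anonymous users are blocked.
  return trusted ? "flag" : "block";
}
```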

&lt;p&gt;Network level proxy firewalls are a legacy architecture that doesn’t work well with modern applications. Choosing whether to add a WAF is now a more interesting discussion because it can be more deeply integrated with your code.&lt;/p&gt;

&lt;h2&gt;
  
  
  How can a WAF protect Next.js?
&lt;/h2&gt;

&lt;p&gt;A WAF can help defend against attacks by scanning all incoming requests, looking for known attack patterns, and proactively blocking suspicious requests.&lt;/p&gt;

&lt;p&gt;Arcjet includes Shield WAF as one of the features you can enable in our SDK. Through processing millions of requests, we see two main types of attacks in our WAF logs:&lt;/p&gt;

&lt;h3&gt;
  
  
  Passive scanning attacks
&lt;/h3&gt;

&lt;p&gt;A lot of malicious web traffic comes from passive scanners. This is usually high volume, low sophistication. For example, we see a lot of scanning for WordPress config files and Windows remote code execution, even though the application is clearly a Next.js app hosted on Vercel.&lt;/p&gt;

&lt;p&gt;We also see requests trying to take advantage of configuration mistakes, such as &lt;code&gt;.env&lt;/code&gt; or &lt;code&gt;.git&lt;/code&gt; files accidentally deployed, or plain text configuration files left behind.&lt;/p&gt;
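&lt;p&gt;A toy version of this kind of matching might look like the following. Real rulesets such as the OWASP Core Ruleset are far more extensive; these patterns are purely illustrative:&lt;/p&gt;

```typescript
// Illustrative only: a few probe paths commonly seen from passive scanners.
// Production WAF rulesets contain thousands of such signatures.
const SCANNER_PROBES = [
  /^\/wp-(login|admin)/, // WordPress probes against a non-WordPress app
  /\/\.env$/,            // accidentally deployed environment files
  /\/\.git(\/|$)/,       // exposed git repositories
];

export function looksLikeScannerProbe(path: string): boolean {
  return SCANNER_PROBES.some((pattern) => pattern.test(path));
}
```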

&lt;p&gt;If you’re using Next.js deployed to a modern platform like Vercel, Render, Railway, or Fly.io, then these requests are a minor annoyance. They can be easily blocked to reduce your infrastructure costs, but likely don’t represent much of a threat.&lt;/p&gt;

&lt;p&gt;More concerning are targeted, passive attacks, such as scanners enumerating your forms, login routes, and API endpoints for SQL injection or other input-based attacks. These rarely succeed on the first request because the attacker is really enumerating your application opportunistically to see if something strange happens. A WAF can detect those attempts and proactively block the client before the exploration discovers something.&lt;/p&gt;

&lt;p&gt;For these types of attacks, a WAF in front of Next.js is a nice bonus. It provides another layer of protection and can even reduce infrastructure costs because requests are blocked before your main code executes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Active direct attacks
&lt;/h3&gt;

&lt;p&gt;Like all popular software, Next.js has had past security vulnerabilities. Any dependencies you use may have security issues, and your own code can contain mistakes, such as incorrect input validation leading to &lt;a href="https://owasp.org/Top10/A03_2021-Injection/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Cross Site Scripting, SQL Injection&lt;/u&gt;&lt;/a&gt;, or &lt;a href="https://owasp.org/Top10/A10_2021-Server-Side_Request_Forgery_%28SSRF%29/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Server Side Request Forgery&lt;/u&gt;&lt;/a&gt; type issues.&lt;/p&gt;

&lt;p&gt;These can be detected by more sophisticated scanning - looking for specific vulnerabilities because you know that the site is using Next.js.&lt;/p&gt;

&lt;p&gt;For example, &lt;a href="https://github.com/advisories/GHSA-fr5h-rqp8-mj6g?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;CVE-2024-34351&lt;/u&gt;&lt;/a&gt; affected Next.js &amp;lt;14.1.1 where a specific Host header could result in a Server Side Request Forgery attack with Server Actions. This only affected self-hosted installations of Next.js due to how Vercel works.&lt;/p&gt;

&lt;p&gt;Another issue from 2024 was &lt;a href="https://github.com/advisories/GHSA-7gfc-8cq8-jh5f?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;CVE-2024-51479&lt;/u&gt;&lt;/a&gt; where a middleware authorization bypass in Next.js &amp;lt;14.2.15 affected Next.js hosted everywhere. Vercel deployed mitigations for all applications on their platform, but others were vulnerable to attack.&lt;/p&gt;

&lt;p&gt;Here the answer is clear: a WAF as a security layer in front of Next.js helps detect emerging vulnerabilities. These attacks have signatures that can be detected and blocked if the WAF is regularly updated. That matters because you might not be able to upgrade to a patched version immediately, so protections applied through a managed WAF ruleset will mitigate active attacks in the meantime. This is where a WAF can really make a difference.&lt;/p&gt;
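&lt;p&gt;For instance, the Host header issue mentioned above can be mitigated generically by validating the header against an allowlist, which is the shape of check a WAF rule automates. A simplified sketch, not the actual fix shipped in Next.js:&lt;/p&gt;

```typescript
// Simplified sketch of a Host header allowlist check, the kind of signature
// a WAF rule can apply while you wait to upgrade. Not the official Next.js fix.
export function isAllowedHost(hostHeader: string, allowedHosts: string[]): boolean {
  // Strip any port before comparing, e.g. "example.com:3000" to "example.com".
  const host = hostHeader.split(":")[0].toLowerCase();
  return allowedHosts.includes(host);
}
```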

&lt;h2&gt;
  
  
  PCI v4 makes WAFs mandatory
&lt;/h2&gt;

&lt;p&gt;If you are running any service that needs to comply with the PCI payment processing standards, then you will soon &lt;strong&gt;need&lt;/strong&gt; to enable a WAF in front of Next.js.&lt;/p&gt;

&lt;p&gt;On March 31, 2025, &lt;a href="https://east.pcisecuritystandards.org/document_library?category=pcidss&amp;amp;document=pci_dss&amp;amp;ref=blog.arcjet.com" rel="noopener noreferrer"&gt;PCI DSS 4.0&lt;/a&gt; will take effect. This includes changes that mean a WAF moves from recommended (in PCI DSS 3) to required. Section 6.4.2 says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;For public-facing web applications, an automated technical solution is deployed that continually detects and prevents web-based attacks&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Using a WAF is given as a specific example:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A web application firewall (WAF), which can be either on-premise or cloud-based, installed in front of public-facing web applications to check all traffic, is an example of an automated technical solution that detects and prevents web-based attacks&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Making a WAF for Next.js
&lt;/h2&gt;

&lt;p&gt;The big downside of network-level WAFs is they are generic. They have no understanding of your application so can’t customize rules based on things like tech stack (no point applying PHP detections for a Next.js application), the routes (should this endpoint return JSON or is it an HTML response?), or the session context (is a user logged into a paying account vs an anonymous user on your website?).&lt;/p&gt;

&lt;p&gt;This is why we’ve designed the &lt;a href="https://docs.arcjet.com/shield/quick-start?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Arcjet Shield WAF&lt;/u&gt;&lt;/a&gt; to run as part of our native SDK. When you add a Shield rule, we analyze the request in the background with rules customized based on an understanding of the tech stack being protected.&lt;/p&gt;

&lt;p&gt;As requests are processed, suspicious activity is monitored until a certain threshold is reached. This helps avoid false positives where legitimate traffic accidentally triggers a rule. &lt;/p&gt;
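&lt;p&gt;In pseudocode terms, this thresholding works like a per-client counter. A conceptual sketch of the idea, not Arcjet’s implementation:&lt;/p&gt;

```typescript
// Conceptual sketch: count suspicious signals per client and only deny once
// a threshold is crossed, so a single odd request is not enough to block.
export class SuspicionTracker {
  private counts = new Map();

  constructor(private threshold: number) {}

  record(clientId: string): "allow" | "deny" {
    const next = (this.counts.get(clientId) ?? 0) + 1;
    this.counts.set(clientId, next);
    return next >= this.threshold ? "deny" : "allow";
  }
}
```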

&lt;p&gt;We see this regularly when applying the rules from the open source &lt;a href="https://github.com/coreruleset/coreruleset?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;OWASP Core Ruleset&lt;/u&gt;&lt;/a&gt;, which is integrated into Arcjet Shield for Pro plan users. This is a great repository of mitigations against common attacks, but has to be broad to cover all types of web applications.&lt;/p&gt;

&lt;p&gt;We regularly monitor the results of Shield analysis and apply customizations based on our understanding of the type of traffic we expect to see. We already know which SDK and framework you’re using, so we can be much more targeted with the rules that get applied.&lt;/p&gt;

&lt;p&gt;And of course since Arcjet rules are all just code, and you can see the reasons behind our decisions, your application can adjust in real time. The flexibility is there to decide what’s best for your situation.&lt;/p&gt;

&lt;p&gt;Arcjet Shield is free for all users and the managed OWASP Core Ruleset is applied to Pro &amp;amp; Enterprise accounts. Enable it with just a few lines of code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const aj = arcjet({
  key: process.env.ARCJET_KEY!,
  rules: [
    // Shield protects your app from common attacks e.g. SQL injection
    // DRY_RUN mode logs only. Use "LIVE" to block
    shield({
      mode: "DRY_RUN",
    }),
  ],
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  So do you need a WAF for Next.js?
&lt;/h2&gt;

&lt;p&gt;If you need to be PCI compliant, then from March 31, 2025 the answer is yes; you need a WAF for Next.js.&lt;/p&gt;

&lt;p&gt;For everyone else, enabling a WAF will give you another layer of security. Defense in depth is important because no layer can ever be 100% secure, so you probably should put a WAF in front of Next.js.&lt;/p&gt;

</description>
      <category>nextjs</category>
      <category>security</category>
    </item>
    <item>
      <title>Test security rules without breaking production: Arcjet's DRY_RUN mode</title>
      <dc:creator>David Mytton</dc:creator>
      <pubDate>Tue, 07 Jan 2025 09:36:04 +0000</pubDate>
      <link>https://dev.to/arcjet/test-security-rules-without-breaking-production-arcjets-dryrun-mode-5cpk</link>
      <guid>https://dev.to/arcjet/test-security-rules-without-breaking-production-arcjets-dryrun-mode-5cpk</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh2qn93wqiha3r0vohpcq.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh2qn93wqiha3r0vohpcq.jpg" alt="Test security rules without breaking production: Arcjet's DRY_RUN mode" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Picture this: it’s well into the evening in the office, and you sit at your computer, moments away from altering the security configurations of your company’s critical software. You were urgently asked to tighten some things up, but right now the only thing on your mind is receiving an emergency call as soon as you’re about to go to bed. Who knew making changes could be this stressful?&lt;/p&gt;

&lt;p&gt;With the right tools, security changes don’t have to be this intimidating. It’s fear of the unknown that is the biggest cause for hesitation. The best thing you can do is to build confidence in your changes through a data-driven approach to the implementation - an approach that uses &lt;strong&gt;real environments&lt;/strong&gt;, &lt;strong&gt;detailed context&lt;/strong&gt;, and &lt;strong&gt;live activity&lt;/strong&gt; to build evidence that your change will not be disruptive.&lt;/p&gt;

&lt;p&gt;In this article, we’ll cover the challenges of configuring legacy WAFs and CDN security services, and how &lt;a href="https://arcjet.com/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;Arcjet&lt;/a&gt;’s different approach takes the nerves out of security rule changes.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Old Approach: &lt;em&gt;Cumbersome logging&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;When it comes to building confidence in our changes, traditional CDN security tools have limited options. Typically, you can turn on WAF logging and evaluate potential rule changes against those logs, but there are a few downsides to this approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The WAF is a separate system, meaning different configuration and unusual log formats. This level of logging usually has to be explicitly enabled and may not even be available as a feature of the product by default.&lt;/li&gt;
&lt;li&gt;Using a separate system means finding where the logs are and figuring out how to query them. Then when you get a result, you need to understand how to correlate it to the requests that triggered the log entries.&lt;/li&gt;
&lt;li&gt;Logging is often billed separately and by volume, so if you can only test in production your log volume may suddenly explode.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With so much context switching, it’s easy for something to slip through the cracks.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Old Approach: &lt;em&gt;Cross-environment limitations&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;Another way to build confidence in a change is to test it in a non-production environment first, but this is still a challenge for traditional web application security tools. Traditional WAFs operate as a reverse proxy and are usually only deployed to production, never in the development environment.&lt;/p&gt;

&lt;p&gt;The software development lifecycle begins on a developer’s workstation, but there isn’t a realistic way to test WAF rules on a workstation. You typically need to wait until you have deployed your application to a dedicated testing environment.&lt;/p&gt;

&lt;p&gt;And even if you do have security set up in your testing environment, it can be cumbersome to manage as an external system and your testing won’t include normal user traffic.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Arcjet Way: &lt;em&gt;Test everywhere&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;At this point, it’s clear that there’s some toil and doubt involved when making security rule changes with traditional WAFs. Now, let’s explore how &lt;a href="https://docs.arcjet.com/architecture?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;Arcjet’s architecture&lt;/a&gt; allows you to take an approach that removes doubt through simplicity.&lt;/p&gt;

&lt;h2&gt;
  
  
  Test locally
&lt;/h2&gt;

&lt;p&gt;In “&lt;em&gt;the old approach&lt;/em&gt;”, we discussed how it’s nearly impossible to evaluate your security rules while developing on your workstation because the security engine lives on another system.&lt;/p&gt;

&lt;p&gt;A benefit of &lt;a href="https://docs.arcjet.com/architecture?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Arcjet’s architecture&lt;/u&gt;&lt;/a&gt; is that your security functionality lives inside your application. That means you can build your application on your development workstation, and your security rules will act the same locally as if you were running them in production.&lt;/p&gt;

&lt;p&gt;Let’s say you are developing a Next.js application and want to add &lt;a href="https://docs.arcjet.com/shield/quick-start?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;Arcjet Shield WAF&lt;/a&gt; to one of your routes. Once you have Arcjet added to your &lt;code&gt;route.ts&lt;/code&gt; file with a Shield rule, you start your application locally.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import arcjet, { shield } from "@arcjet/next";
import { NextResponse } from "next/server";

const aj = arcjet({
  key: process.env.ARCJET_KEY!,
  rules: [
    shield({
      mode: "LIVE",
    }),
  ],
});

export async function GET(req: Request) {
  const decision = await aj.protect(req);

  for (const result of decision.results) {
    console.log("Rule Result", result);
  }

  console.log("Conclusion", decision.conclusion);

  if (decision.isDenied() &amp;amp;&amp;amp; decision.reason.isShield()) {
    return NextResponse.json(
      {
        error: "You are suspicious!",
      },
      { status: 403 },
    );
  }

  return NextResponse.json({
    message: "Hello world",
  });
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;/app/api/arcjet/route.ts&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;All you need to do to make sure the rule is working is send 5 &lt;code&gt;curl&lt;/code&gt; requests with the special header to cross the test threshold for malicious activity. You can do this with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;for i in {1..5}; do curl -v -H "x-arcjet-suspicious: true" http://localhost:3000; done
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After running the 5th curl request, you should receive a 403 error and see a blocked request in your Arcjet logs.&lt;/p&gt;

&lt;p&gt;Since the security engine is part of your application, you can do these simple sanity-checks for your rules anywhere your application can run.&lt;/p&gt;

&lt;h2&gt;
  
  
  Test in CI/CD
&lt;/h2&gt;

&lt;p&gt;Another area where it’s traditionally hard to test security rules is in your CI pipeline. Arcjet’s security-as-code architecture makes it easy to do automated testing, such as with the &lt;a href="https://github.com/postmanlabs/newman?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Newman&lt;/u&gt;&lt;/a&gt; framework. Let’s take a look at the following Express app example from &lt;a href="https://github.com/arcjet/arcjet-js/tree/main/examples/express-newman?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;GitHub&lt;/u&gt;&lt;/a&gt; that illustrates how this works:&lt;/p&gt;

&lt;p&gt;In our Express app, we have an API endpoint that is very sensitive to performance issues, so we’ll add an Arcjet rate limit rule to only allow 1 request per second.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import express from "express";
import arcjet, { fixedWindow } from "@arcjet/node";

const aj = arcjet({
  key: process.env.ARCJET_KEY,
  rules: [],
});

const app = express();

app.get("/api/low-rate-limit", async (req, res) =&amp;gt; {
  const decision = await aj
    // Only inline to self-contain the sample code.
    // Static rules should be defined outside the handler for performance.
    .withRule(fixedWindow({ mode: "LIVE", window: "1s", max: 1 }))
    .protect(req);

  if (decision.isDenied()) {
    res.status(429).json({ error: "rate limited" });
  } else {
    res.json({ hello: "world" });
  }
});

//...

const server = app.listen(8080);

// Export the server close function so we can shut it down in our tests
export const close = server.close.bind(server);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;index.js&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Now, we’ll define our test collection for Newman. To test the rate limiting, we will have Newman send two requests. We will expect the first request to succeed, and the second request should be denied by our Arcjet rate limit rule.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "variable": [{ "key": "baseUrl", "value": "http://localhost:8080" }],
  "item": [
    {
      "name": "/api/low-rate-limit",
      "item": [
        {
          "name": "Allowed",
          "request": {
            "url": "{{baseUrl}}/api/low-rate-limit",
            "header": [
              {
                "key": "Accept",
                "value": "application/json"
              }
            ],
            "method": "GET",
            "body": {},
            "auth": null
          },
          "event": [
            {
              "listen": "test",
              "script": {
                "type": "text/javascript",
                "exec": [
                  "pm.test('should be allowed', () =&amp;gt; pm.response.to.have.status(200))"
                ]
              }
            }
          ]
        },
        {
          "name": "Denied",
          "request": {
            "url": "{{baseUrl}}/api/low-rate-limit",
            "header": [
              {
                "key": "Accept",
                "value": "application/json"
              }
            ],
            "method": "GET",
            "body": {},
            "auth": null
          },
          "event": [
            {
              "listen": "test",
              "script": {
                "type": "text/javascript",
                "exec": [
                  "pm.test('should be rate limited', () =&amp;gt; pm.response.to.have.status(429))"
                ]
              }
            }
          ]
        }
      ]
    }
  ],
  "event": []
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;tests/low-rate-limit.json&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Lastly, we’ll create the JavaScript file to run our test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import { after, before, describe, test } from "node:test";
import assert from "node:assert";
import { fileURLToPath } from "node:url";
import { promisify } from "node:util";

import { run } from "newman";

// Promisify the `newman.run` API as `newmanRun` in the tests
const newmanRun = promisify(run);

describe("API Tests", async () =&amp;gt; {
  // Importing the server also starts it listening on port 8080
  const server = await import("../index.js");

  after((done) =&amp;gt; server.close(done));

  test("/api/low-rate-limit", async () =&amp;gt; {
    const summary = await newmanRun({
      collection: fileURLToPath(
        new URL("./low-rate-limit.json", import.meta.url),
      ),
    });

    assert.strictEqual(
      summary.run.failures.length,
      0,
      "expected suite to run without error",
    );
  });

//...

});

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;tests/api.test.js&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Now, you just need to set up a workflow to execute the automated tests within your CI environment and you can run the security rules as part of your test suite.&lt;/p&gt;

&lt;h2&gt;
  
  
  Test in staging or preview environments
&lt;/h2&gt;

&lt;p&gt;With traditional WAFs, staging environments are typically set up to simulate production and run integration tests that behave like they would for your users. Even though Arcjet’s ability to simulate production is similar to WAFs in dedicated environments, there are still two major benefits at this point.&lt;/p&gt;

&lt;p&gt;The first benefit is that &lt;strong&gt;you’ve already simulated production before you even got to the staging environment&lt;/strong&gt;. This means you may have run most of your security-specific tests in CI already and caught issues earlier in the development lifecycle.&lt;/p&gt;

&lt;p&gt;The second benefit is that &lt;strong&gt;setting up additional environments is less work with Arcjet&lt;/strong&gt;. You don’t need to configure a reverse proxy, and all of your security rules were configured when you wrote the code.&lt;/p&gt;

&lt;p&gt;When it comes to dedicated environments, you can stick to the old ways and run tests in staging or preview. However, it’s even more effective to test Arcjet rules in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  Test in production
&lt;/h2&gt;

&lt;p&gt;Admit it - you read the words “test in production” and cringed a little. With Arcjet rules, testing in production isn’t a bad thing. Any rule you create in Arcjet can be run in &lt;code&gt;DRY_RUN&lt;/code&gt; mode without affecting your users. Let’s break down what that looks like.&lt;/p&gt;

&lt;p&gt;When you are defining Arcjet security rules, each rule is deployed in either &lt;code&gt;LIVE&lt;/code&gt; or &lt;code&gt;DRY_RUN&lt;/code&gt; mode. &lt;code&gt;LIVE&lt;/code&gt; rules will actively block a request that matches the security rule, but &lt;code&gt;DRY_RUN&lt;/code&gt; rules will simply log the would-be block action in Arcjet. Here’s an example:&lt;/p&gt;
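&lt;p&gt;Conceptually, the difference between the two modes comes down to whether a matched rule blocks or only records. A minimal sketch of the idea, not the SDK internals:&lt;/p&gt;

```typescript
// Conceptual sketch of rule modes, not the Arcjet SDK internals.
type Mode = "LIVE" | "DRY_RUN";

export function applyRule(mode: Mode, ruleMatched: boolean) {
  if (!ruleMatched) {
    return { blocked: false, logged: false };
  }
  // A LIVE rule blocks the request; a DRY_RUN rule only logs the
  // would-be block so you can evaluate the rule against real traffic.
  return { blocked: mode === "LIVE", logged: true };
}
```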

&lt;p&gt;Let’s say you have an existing Next.js application with &lt;a href="https://docs.arcjet.com/shield/concepts?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Arcjet Shield&lt;/u&gt;&lt;/a&gt; for blocking attacks like SQL injection, but you’d also like to start &lt;a href="https://docs.arcjet.com/bot-protection/concepts?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;blocking automated bots&lt;/u&gt;&lt;/a&gt;. You simply add the &lt;code&gt;detectBot&lt;/code&gt; rule to your Arcjet object with &lt;code&gt;mode: "DRY_RUN"&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import arcjet, { createMiddleware, detectBot } from "@arcjet/next";
export const config = {
  // Matcher tells Next.js which routes to run the middleware on.
  // This runs the middleware on all routes except for static assets.
  matcher: ["/((?!_next/static|_next/image|favicon.ico).*)"],
};
const aj = arcjet({
  key: process.env.ARCJET_KEY!,
  rules: [
    shield({
      mode: "LIVE", // Will block requests.
    }),
    detectBot({
      mode: "DRY_RUN", // New rule, log only for evaluation.
      allow: [
        "CATEGORY:SEARCH_ENGINE", // Google, Bing, etc.
      ],
    }),
  ],
});
// Pass any existing middleware with the optional existingMiddleware prop.
export default createMiddleware(aj);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After deploying the new rule, your application’s traffic keeps flowing the same way it did before. After some time, you can check Arcjet’s logs and notice that your uptime monitor’s requests are being logged as would-be blocks. You add another category to your detectBot rule and redeploy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const aj = arcjet({
  key: process.env.ARCJET_KEY!,
  rules: [
    shield({
      mode: "LIVE",
    }),
    detectBot({
      mode: "DRY_RUN",
      allow: [
        "CATEGORY:SEARCH_ENGINE",
        "CATEGORY:MONITOR", // Uptime monitoring services.
      ],
    }),
  ],
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since the &lt;code&gt;DRY_RUN&lt;/code&gt; capability is built into the rule definition, the process of evaluating rule changes is as easy as actually making the change.&lt;/p&gt;

&lt;p&gt;With Arcjet, all your rules are just code, so you can do things like &lt;a href="https://docs.arcjet.com/blueprints/sampling?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;selectively sampling requests&lt;/a&gt; and applying rules to a subset of traffic. For example, if you wanted to trigger Arcjet Shield and bot detection rules in live mode on 10% of your traffic then you could write a sample function like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import arcjet, { detectBot, shield } from "@arcjet/next";
import { NextRequest, NextResponse } from "next/server";

export const config = {
  // matcher tells Next.js which routes to run the middleware on. This runs
  // the middleware on all routes except for static assets.
  matcher: ["/((?!_next/static|_next/image|favicon.ico).*)"],
};

const sampleRate = 0.1; // 10% of requests

const aj = arcjet({
  key: process.env.ARCJET_KEY!,
  // You could include one or more base rules to apply to all requests
  rules: [],
});

function shouldSampleRequest(sampleRate: number) {
  // sampleRate should be between 0 and 1, e.g., 0.1 for 10%, 0.5 for 50%
  return Math.random() &amp;lt; sampleRate;
}

// Shield and bot rules will be configured with live mode if the request is
// sampled, otherwise only Shield will be configured with dry run mode
function sampleSecurity() {
  if (shouldSampleRequest(sampleRate)) {
    console.log("Rule is LIVE");
    return aj
      .withRule(
        shield(
          { mode: "LIVE" }, // will block requests if triggered
        ),
      )
      .withRule(
        detectBot({
          mode: "LIVE",
          allow: [], // "allow none" will block all detected bots
        }),
      );
  } else {
    console.log("Rule is DRY_RUN");
    return aj.withRule(
      shield({
        mode: "DRY_RUN", // Only logs the result
      }),
    );
  }
}

export default async function middleware(request: NextRequest) {
  const decision = await sampleSecurity().protect(request);

  if (decision.isDenied()) {
    if (decision.reason.isBot()) {
      return NextResponse.json({ error: "You are a bot" }, { status: 403 });
    } else if (decision.reason.isShield()) {
      return NextResponse.json({ error: "Shields up!" }, { status: 403 });
    } else {
      return NextResponse.json({ error: "Forbidden" }, { status: 403 });
    }
  } else {
    return NextResponse.next();
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Next.js middleware.ts&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Maybe “evaluate in production” is the more accurate term for what Arcjet allows you to do, but the benefits are clear: &lt;em&gt;you can push a new rule to production with no worries and see what happens&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Making changes to security rules for your application doesn’t have to be intimidating or uncertain. We covered some of the ways traditional web application security tools fall short for change management, and how Arcjet provides a solution.&lt;/p&gt;

&lt;p&gt;Arcjet’s architecture delivers a lot of benefits for developer experience, and testing changes is one of them. By making use of &lt;code&gt;DRY_RUN&lt;/code&gt; mode, you can build confidence in your changes with no added complexity. With early sanity-checks and simple evidence from real traffic, you will have no fear of breaking production when using Arcjet to protect your application.&lt;/p&gt;

</description>
      <category>javascript</category>
    </item>
    <item>
      <title>Building a minimalist web server using the Go standard library + Tailwind CSS</title>
      <dc:creator>David Mytton</dc:creator>
      <pubDate>Thu, 02 Jan 2025 12:12:40 +0000</pubDate>
      <link>https://dev.to/arcjet/building-a-minimalist-web-server-using-the-go-standard-library-tailwind-css-39gj</link>
      <guid>https://dev.to/arcjet/building-a-minimalist-web-server-using-the-go-standard-library-tailwind-css-39gj</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4denxdu0k17kg0z4uxxe.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4denxdu0k17kg0z4uxxe.jpg" alt="Building a minimalist web server using the Go standard library + Tailwind CSS" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Dependencies pose a significant maintenance burden on software projects. Every package introduces risks by adding code outside your control, making them a &lt;a href="https://blog.arcjet.com/security-concepts-for-developers-dependency-confusion-attacks/" rel="noopener noreferrer"&gt;&lt;u&gt;common attack vector&lt;/u&gt;&lt;/a&gt;. Failing to stay up to date can force stressful upgrades when security patches are released. This is especially challenging in ecosystems like JavaScript, where breaking changes and dependency churn are common.&lt;/p&gt;

&lt;p&gt;At &lt;a href="https://arcjet.com/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Arcjet&lt;/u&gt;&lt;/a&gt;, we’re on a path to zero dependencies for our developer security SDK. &lt;a href="https://github.com/arcjet/arcjet-js/issues/44?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Our goal for the 1.0.0 release&lt;/u&gt;&lt;/a&gt; (coming soon!) is that when you install Arcjet, you only have to trust us because the only code you bring into your project is our SDK.&lt;/p&gt;

&lt;p&gt;I’ve also been thinking about how we can achieve this with our server-side Go code. Our API is &lt;a href="https://blog.arcjet.com/how-we-achieve-our-25ms-p95-response-time-sla/" rel="noopener noreferrer"&gt;&lt;u&gt;built for low-latency high-throughput&lt;/u&gt;&lt;/a&gt; security decisions, so performance is crucial. While the Go ecosystem experiences less dependency churn than JavaScript, where keeping up to date has become a serious chore, minimizing dependencies remains a goal across our entire codebase.&lt;/p&gt;

&lt;p&gt;Luckily, Go has an extensive standard library. Over the holidays I was inspired by &lt;a href="https://matthewsanabria.dev/posts/start-with-the-go-standard-library/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;several&lt;/u&gt;&lt;/a&gt; recent &lt;a href="https://threedots.tech/post/common-anti-patterns-in-go-web-applications/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;blog posts&lt;/u&gt;&lt;/a&gt; about &lt;a href="https://www.youtube.com/watch?v=H7tbjKFSg58&amp;amp;ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;using the standard library first&lt;/u&gt;&lt;/a&gt; before reaching for third-party modules. I decided to experiment with building a website using only the Go standard library, plus HTML and &lt;a href="https://tailwindcss.com/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Tailwind CSS&lt;/u&gt;&lt;/a&gt; for styling.&lt;/p&gt;

&lt;p&gt;In this post I’ll discuss how I built a minimalist web server using the Go standard library which dynamically generates HTML and CSS using Tailwind CSS. The only external dependency is the Tailwind CLI and optional use of &lt;a href="https://github.com/air-verse/air?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;Air&lt;/a&gt; for live reload, neither of which are in the production build artifacts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dependency churn
&lt;/h2&gt;

&lt;p&gt;How often do you &lt;a href="https://abdisalan.com/posts/tragedy-running-old-node-project?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;come back to an old project only to find it won’t build&lt;/u&gt;&lt;/a&gt; because some key dependencies have changed? Or you want to build a new feature only to find out there have been breaking changes to the core dependencies which must be upgraded first? Or a major API you relied on has been deprecated and the migration path is incomplete?&lt;/p&gt;

&lt;p&gt;Whether it’s a side project or a major application you work on, I bet every developer has experienced this. It’s frustrating because you then have to spend time on rebuilding, refactoring, and/or migrating to the “new” way of doing things.&lt;/p&gt;

&lt;p&gt;Using third-party libraries speeds up development because you don’t need to reinvent the wheel. However, they always introduce maintenance overhead and usually come without any guarantee of continued updates or backwards compatibility. Multiply this by every dependency you include and you have a real maintenance burden.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why rely on the Go standard library?
&lt;/h2&gt;

&lt;p&gt;The standard library of whichever programming language you’re using might not come with such guarantees either, but mature languages know that developers rely on it. There is an implied contract that things should rarely break.&lt;/p&gt;

&lt;p&gt;But in Go, there is an explicit contract:&lt;/p&gt;

&lt;p&gt;In &lt;a href="https://go.dev/doc/go1compat?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Go 1 and the Future of Go Programs&lt;/u&gt;&lt;/a&gt; (from 2012) the Go team states:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Go 1 defines two things: first, the specification of the language; and second, the specification of a set of core APIs, the "standard packages" of the Go library. The Go 1 release includes their implementation in the form of two compiler suites (gc and gccgo), and the core libraries themselves.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And in 2023 this was followed up by &lt;a href="https://go.dev/blog/compat?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Backward Compatibility, Go 1.21, and Go 2&lt;/u&gt;&lt;/a&gt;: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;when should we expect the Go 2 specification that breaks old Go 1 programs?   &lt;/p&gt;

&lt;p&gt;The answer is never. Go 2, in the sense of breaking with the past and no longer compiling old programs, is never going to happen. Go 2 in the sense of being the major revision of Go 1 we started toward in 2017 has already happened.   &lt;/p&gt;

&lt;p&gt;There will not be a Go 2 that breaks Go 1 programs. Instead, we are going to double down on compatibility, which is far more valuable than any possible break with the past. In fact, we believe that prioritizing compatibility was the most important design decision we made for Go 1.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;By relying solely on the Go standard library, we can effectively guarantee long-term compatibility and minimal breakage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Does Go have everything we need?
&lt;/h2&gt;

&lt;p&gt;Yes! Go 1.22 introduced some improvements to the built-in web server to &lt;a href="https://jvns.ca/blog/2024/09/27/some-go-web-dev-notes/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;make defining routes a lot easier&lt;/u&gt;&lt;/a&gt;, negating many of the ergonomic benefits of frameworks like Gin (and others &lt;a href="https://www.alexedwards.net/blog/which-go-router-should-i-use?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;to choose from&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;And if we combine this with the existing support for serving &lt;a href="https://pkg.go.dev/embed?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;embedded static files&lt;/u&gt;&lt;/a&gt;, compiling &lt;a href="https://go.dev/doc/articles/wiki/?ref=blog.arcjet.com#tmp_6" rel="noopener noreferrer"&gt;&lt;u&gt;dynamic HTML templates&lt;/u&gt;&lt;/a&gt;, &lt;a href="https://go.dev/blog/slog?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;structured logging&lt;/u&gt;&lt;/a&gt;, &lt;a href="https://go.dev/doc/tutorial/database-access?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;SQL drivers for common databases&lt;/u&gt;&lt;/a&gt;, &lt;a href="https://go.dev/blog/execution-traces-2024?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;runtime traces&lt;/u&gt;&lt;/a&gt;, and the ability to &lt;a href="https://go.dev/blog/generate?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;execute commands before the Go build step&lt;/u&gt;&lt;/a&gt;, we can easily build a single binary with no external dependencies ready to ship to production.&lt;/p&gt;

&lt;p&gt;Of course you can swap things out later - using an ORM like &lt;a href="https://gorm.io/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Gorm&lt;/u&gt;&lt;/a&gt; or &lt;a href="https://opentelemetry.io/docs/languages/go/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;exporting telemetry to Otel&lt;/u&gt;&lt;/a&gt;, for example - but Go has everything we need to get started in the standard library.&lt;/p&gt;

&lt;h2&gt;
  
  
  Start with the web server
&lt;/h2&gt;

&lt;p&gt;Developers familiar with frameworks like Next.js or Remix are accustomed to automatic static asset management and filesystem-based route definitions. With our minimalist Go server this is more manual, but can be implemented in a way that feels idiomatic with &lt;a href="https://go.dev/blog/routing-enhancements?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;the routing enhancements in Go 1.22&lt;/u&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We’ll follow the commonly used &lt;a href="https://github.com/golang-standards/project-layout?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Go Project Layout&lt;/u&gt;&lt;/a&gt; by defining the routes and server in main.go which will load the route handlers from an &lt;code&gt;internal/handlers&lt;/code&gt; package and keep the web content and templates in a &lt;code&gt;web/templates&lt;/code&gt; directory.&lt;/p&gt;

&lt;p&gt;For a basic website we want a static directory of assets like CSS and images (at &lt;code&gt;web/static&lt;/code&gt;), plus a favicon and &lt;code&gt;robots.txt&lt;/code&gt; hosted at the root. These are embedded in the Go binary. We also include a simple health check to indicate the server is running when it’s deployed.&lt;/p&gt;

&lt;p&gt;The server will be containerized and shipped to a modern hosting platform like Railway, Fly.io, Render, or one of the larger cloud providers. It’s standard to route requests through a proxy or load balancer which can handle SSL, so that’s another dependency avoided.&lt;/p&gt;

&lt;p&gt;However, if you wanted to just host this on a single VM then you could use something like &lt;a href="https://github.com/caddyserver/certmagic?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Certmagic&lt;/u&gt;&lt;/a&gt; to automatically generate a Let’s Encrypt certificate for you. It goes against our zero dependency philosophy, but dealing with issuing SSL certificates might not be something you want to write from scratch! This becomes more challenging when you have to sync certificates across multiple servers, which is why it's often delegated to the proxy frontend.&lt;/p&gt;

&lt;p&gt;Finally, we also set up &lt;a href="https://go.dev/blog/slog?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;structured logging using &lt;code&gt;log/slog&lt;/code&gt;&lt;/a&gt; and use an environment variable to configure plain text (default, for development) and JSON (for production).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package main

import (
    "embed"
    "fmt"
    "io/fs"
    "log/slog"
    "net/http"
    "os"
    "time"
)

//go:embed web/static/*
var static embed.FS

func init() {
    _, jsonLogger := os.LookupEnv("JSON_LOGGER")
    _, debug := os.LookupEnv("DEBUG")

    var programLevel slog.Level
    if debug {
        programLevel = slog.LevelDebug
    }

    if jsonLogger {
        jsonHandler := slog.NewJSONHandler(os.Stdout, &amp;amp;slog.HandlerOptions{
            Level: programLevel,
        })
        slog.SetDefault(slog.New(jsonHandler))
    } else {
        textHandler := slog.NewTextHandler(os.Stdout, &amp;amp;slog.HandlerOptions{
            Level: programLevel,
        })
        slog.SetDefault(slog.New(textHandler))
    }

    slog.Info("Logger initialized", slog.Bool("debug", debug))
}

func main() {
    port := os.Getenv("PORT")
    if port == "" {
        port = "8080"
    }
    addr := ":" + port

    mux := http.NewServeMux()

    // Use an embedded filesystem rooted at "web/static"
    fs, err := fs.Sub(static, "web/static")
    if err != nil {
        slog.Error("Failed to create sub filesystem", "error", err)
        return
    }

    // Serve files from the embedded /web/static directory at /static
    fileServer := http.FileServer(http.FS(fs))
    mux.Handle("GET /static/", http.StripPrefix("/static/", fileServer))

    mux.HandleFunc("GET /favicon.ico", func(w http.ResponseWriter, r *http.Request) {
        data, err := static.ReadFile("web/static/img/favicon.ico")
        if err != nil {
            http.NotFound(w, r)
            return
        }
        w.Header().Set("Content-Type", "image/x-icon")
        w.Write(data)
    })
    mux.HandleFunc("GET /robots.txt", func(w http.ResponseWriter, r *http.Request) {
        data, err := static.ReadFile("web/static/robots.txt")
        if err != nil {
            http.NotFound(w, r)
            return
        }
        w.Header().Set("Content-Type", "text/plain")
        w.Write(data)
    })

    mux.HandleFunc("GET /health", func(w http.ResponseWriter, r *http.Request) {
        w.Header().Set("Content-Type", "text/plain")
        w.Write([]byte(`OK`))
    })

    server := &amp;amp;http.Server{
        Addr: fmt.Sprintf(":%s", port),
        Handler: mux,
        // Recommended timeouts from
        // https://blog.cloudflare.com/exposing-go-on-the-internet/
        ReadTimeout: 5 * time.Second,
        WriteTimeout: 10 * time.Second,
        IdleTimeout: 120 * time.Second,
    }

    slog.Info("Server listening", "addr", addr)

    if err := server.ListenAndServe(); err != nil {
        slog.Error("Server failed to start", "error", err)
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;main.go web server.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Assuming &lt;code&gt;web/static/robots.txt&lt;/code&gt; exists, when you &lt;code&gt;go run main.go&lt;/code&gt; and &lt;code&gt;curl http://localhost:8080/robots.txt&lt;/code&gt; you’ll get the contents of that file served. The same goes for the favicon and the health check.&lt;/p&gt;

&lt;h2&gt;
  
  
  Middleware
&lt;/h2&gt;

&lt;p&gt;Running code on every request is useful for logging, error handling, authentication, etc. Go web frameworks like &lt;a href="https://github.com/gin-gonic/gin?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Gin&lt;/u&gt;&lt;/a&gt; offer &lt;a href="https://github.com/gin-contrib?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;a rich middleware ecosystem&lt;/a&gt;, which is an advantage. Figuring out &lt;a href="https://github.com/gin-contrib/cors?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;CORS&lt;/a&gt; is a pain! However, writing custom middleware for Go &lt;code&gt;net/http&lt;/code&gt; is straightforward.&lt;/p&gt;

&lt;p&gt;To make this web server more robust, we’ll include a simple panic handler so that if any of the routes panic, we don’t crash the server.&lt;/p&gt;

&lt;p&gt;Create a new package in &lt;code&gt;internal/middleware&lt;/code&gt; with &lt;code&gt;middleware.go&lt;/code&gt; defining the structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package middleware

import (
    "net/http"
)

// Middleware is a function that wraps an http.Handler with custom logic.
type Middleware func(http.Handler) http.Handler

// Chain is a helper to build up a pipeline of middlewares, then apply them to a
// final handler.
type Chain struct {
    middlewares []Middleware
}

// Use appends a middleware to the chain.
func (c *Chain) Use(m Middleware) {
    c.middlewares = append(c.middlewares, m)
}

// Then applies the entire chain of middlewares to the final handler in reverse
// order.
func (c *Chain) Then(h http.Handler) http.Handler {
    for i := len(c.middlewares) - 1; i &amp;gt;= 0; i-- {
        h = c.middlewares[i](h)
    }
    return h
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;The middleware definition in internal/middleware/middleware.go&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Then &lt;code&gt;internal/middleware/recover.go&lt;/code&gt; can be a new HTTP handler:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package middleware

import (
    "log/slog"
    "net/http"
)

func RecoverMiddleware(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        defer func() {
            if err := recover(); err != nil {
                slog.Error("Recovered from panic", "error", err)
                http.Error(w, "Internal Server Error", http.StatusInternalServerError)
            }
        }()
        next.ServeHTTP(w, r)
    })
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;The panic recovery middleware in internal/middleware/recover.go&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In &lt;code&gt;main.go&lt;/code&gt; we can wrap &lt;code&gt;mux&lt;/code&gt; with the new middleware and update the &lt;code&gt;http.Server&lt;/code&gt; to use the wrapped &lt;code&gt;mux&lt;/code&gt; as the &lt;code&gt;Handler&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;chain := &amp;amp;middleware.Chain{}
chain.Use(middleware.RecoverMiddleware)
wrappedMux := chain.Then(mux)

server := &amp;amp;http.Server{
    Addr: fmt.Sprintf(":%s", port),
    Handler: wrappedMux,
    // Recommended timeouts from
    // https://blog.cloudflare.com/exposing-go-on-the-internet/
    ReadTimeout: 5 * time.Second,
    WriteTimeout: 10 * time.Second,
    IdleTimeout: 120 * time.Second,
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Updated main.go with the new middleware.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Generating CSS with Tailwind
&lt;/h2&gt;

&lt;p&gt;Tailwind uses HTML class attributes to automatically generate the CSS needed to create the layout, but that means it needs a build step. It has to parse the HTML and then build the CSS, which we want to ship with the Go binary so everything is self-contained.&lt;/p&gt;

&lt;p&gt;I followed Xe Iaso’s blog &lt;a href="https://xeiaso.net/blog/using-tailwind-go/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;How to use Tailwind CSS in your Go programs&lt;/u&gt;&lt;/a&gt; to get this working. You have to set up a basic npm package in your root so that we can include the Tailwind CLI. The &lt;code&gt;build&lt;/code&gt; script in &lt;code&gt;package.json&lt;/code&gt; runs the Tailwind CLI, taking &lt;code&gt;web/static/css/main.css&lt;/code&gt; (any custom CSS you want included) as input and writing the generated output to &lt;code&gt;web/static/css/styles.css&lt;/code&gt;. You can remove the &lt;code&gt;main.css&lt;/code&gt; file if you don’t have anything extra to add.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "name": "example.com",
  "version": "1.0.0",
  "scripts": {
    "build": "tailwindcss build -i web/static/css/main.css -o web/static/css/styles.css"
  },
  "dependencies": {
    "tailwindcss": "3.4.17"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;package.json&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;tailwind.config.js&lt;/code&gt; file looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/** @type {import('tailwindcss').Config} */
module.exports = {
  darkMode: "media",
  content: ["./web/templates/*.html"],
  plugins: [],
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;tailwind.config.js&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Then in &lt;code&gt;main.go&lt;/code&gt; in the root we add a generate command so that we can trigger the npm build script as part of the Go build process.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package main

import (
    "embed"
    "fmt"
    "io/fs"
    "log/slog"
    "net/http"
    "os"
    "time"
)

//go:generate npm run build

//go:embed web/static/*
var static embed.FS

func init() {
    // ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Update top of the main.go file&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Running &lt;code&gt;go generate &amp;amp;&amp;amp; go run main.go&lt;/code&gt; triggers Tailwind to generate CSS in the &lt;code&gt;web/static&lt;/code&gt; directory, which Go then embeds into the build.&lt;/p&gt;

&lt;h2&gt;
  
  
  Web templates
&lt;/h2&gt;

&lt;p&gt;The final thing to do is set up a simple index page as our first route. Create an HTML file at &lt;code&gt;web/templates/index.html&lt;/code&gt; with anything you like. Then at &lt;code&gt;web/templates.go&lt;/code&gt; set up this file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package web

import (
    "embed"
)

//go:embed templates
var Templates embed.FS
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;web/templates.go&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This exposes the embedded templates as a Go package so they can be imported elsewhere in our code.&lt;/p&gt;

&lt;p&gt;In a new package at &lt;code&gt;internal/handlers/root.go&lt;/code&gt; we can define a root handler:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package handlers

import (
    "net/http"

    "html/template"
    "log/slog"

    "github.com/davidmytton/example/web"
)

type PageData struct {
    Title string
}

func RootHandler() http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        file, err := web.Templates.ReadFile("templates/index.html")
        if err != nil {
            http.Error(w, "Internal Server Error", http.StatusInternalServerError)
            slog.Error("Error reading template", "error", err)
            return
        }

        tmpl := template.Must(template.New("index.html").Parse(string(file)))

        data := PageData{
            Title: "Home",
        }
        if err := tmpl.Execute(w, data); err != nil {
            http.Error(w, "Internal Server Error", http.StatusInternalServerError)
            slog.Error("Error executing template", "error", err)
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;internal/handlers/root.go handler&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This reads the template from the embedded filesystem, parses it, and executes it with the page data. Our panic recovery middleware will ensure that any template compilation errors are handled gracefully. The template &lt;a href="https://pkg.go.dev/text/template?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;can include variables and other structures&lt;/a&gt;, such as the title we pass in.&lt;/p&gt;
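
&lt;p&gt;Re-parsing the template on every request works, but repeats the parse each time. A minimal alternative sketch parses once at startup with &lt;code&gt;template.ParseFS&lt;/code&gt; (an in-memory &lt;code&gt;testing/fstest&lt;/code&gt; filesystem stands in here so the sketch runs standalone; in the real handler you’d pass &lt;code&gt;web.Templates&lt;/code&gt; instead):&lt;/p&gt;

```go
package main

import (
	"fmt"
	"html/template"
	"strings"
	"testing/fstest"
)

// In the real project this would be web.Templates (the embed.FS); an
// in-memory filesystem is used here so the sketch is self-contained.
var templatesFS = fstest.MapFS{
	"templates/index.html": {Data: []byte("Title: {{.Title}}")},
}

// Parse once at package load instead of on every request. template.Must
// panics at boot if the template is invalid, surfacing errors early.
var indexTmpl = template.Must(template.ParseFS(templatesFS, "templates/index.html"))

type PageData struct {
	Title string
}

// renderIndex executes the pre-parsed template; the handler body shrinks
// to a single Execute call.
func renderIndex() string {
	b := new(strings.Builder)
	indexTmpl.Execute(b, PageData{Title: "Home"})
	return b.String()
}

func main() {
	fmt.Println(renderIndex()) // prints "Title: Home"
}
```

&lt;p&gt;Parsing at startup also means an invalid template fails the deploy at boot rather than on the first request.&lt;/p&gt;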

&lt;p&gt;By using &lt;a href="https://pkg.go.dev/html/template?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;Go's html/template we get injection-safety&lt;/a&gt; - the templates themselves are assumed to be safe, but the data injected is not, so Go handles it appropriately.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;!DOCTYPE html&amp;gt;
&amp;lt;html lang="en" class="h-screen"&amp;gt;
  &amp;lt;head&amp;gt;
    &amp;lt;meta charset="utf-8" /&amp;gt;
    &amp;lt;meta name="viewport" content="width=device-width,initial-scale=1" /&amp;gt;
    &amp;lt;title&amp;gt;{{.Title}}&amp;lt;/title&amp;gt;
    &amp;lt;!-- ... --&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;web/templates/index.html template file&lt;/em&gt;&lt;/p&gt;
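
&lt;p&gt;A quick standalone demonstration of that escaping (the &lt;code&gt;renderGreeting&lt;/code&gt; helper is hypothetical, purely for illustration):&lt;/p&gt;

```go
package main

import (
	"fmt"
	"html/template"
	"strings"
)

// renderGreeting runs untrusted data through an html/template. The package
// escapes the data automatically, so markup in it cannot inject HTML.
func renderGreeting(data string) string {
	tmpl := template.Must(template.New("t").Parse("Hello, {{.}}!"))
	b := new(strings.Builder)
	tmpl.Execute(b, data)
	return b.String()
}

func main() {
	// Build a hostile input without typing angle brackets literally:
	// rune 60 is the less-than sign and rune 62 the greater-than sign.
	unsafe := string(rune(60)) + "script" + string(rune(62))

	// Prints the entity-escaped form of the tag, never raw markup.
	fmt.Println(renderGreeting(unsafe))
}
```

&lt;p&gt;The injected tag comes out entity-escaped (&lt;code&gt;&amp;amp;lt;script&amp;amp;gt;&lt;/code&gt;) rather than as a live element, which is exactly the behaviour we want for untrusted page data.&lt;/p&gt;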

&lt;p&gt;Finally, in &lt;code&gt;main.go&lt;/code&gt; we can set up the handler:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mux.HandleFunc("GET /", handlers.RootHandler())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Add to the main.go route definitions&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Live reload with Air
&lt;/h2&gt;

&lt;p&gt;One nice feature of web frameworks like Next.js is the instant reload whenever you make code changes. To implement live reload for our Go web server we can launch it with &lt;a href="https://github.com/air-verse/air?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Air&lt;/u&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;After you have installed Air and generated the default config with &lt;code&gt;air init&lt;/code&gt; then you can adjust the build configuration with the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[build]
  args_bin = []
  bin = "./tmp/main"
  cmd = "go generate &amp;amp;&amp;amp; go build -o ./tmp/main ."
  delay = 1000
  exclude_dir = ["node_modules", "assets", "tmp", "vendor", "testdata"]
  exclude_file = []
  exclude_regex = ["_test.go"]
  exclude_unchanged = false
  follow_symlink = false
  full_bin = ""
  include_dir = []
  include_ext = ["go", "tpl", "tmpl", "html", "css"]
  include_file = []
  kill_delay = "0s"
  log = "build-errors.log"
  poll = false
  poll_interval = 0
  post_cmd = []
  pre_cmd = []
  rerun = false
  rerun_delay = 500
  send_interrupt = false
  stop_on_error = false
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;.air.toml&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Launching the dev environment
&lt;/h2&gt;

&lt;p&gt;How many times have you come back to a side project only to forget how to actually run it? &lt;code&gt;Make&lt;/code&gt; tends to be installed by default on most systems so if we always write a &lt;code&gt;make dev&lt;/code&gt; command then we don’t need to remember anything!&lt;/p&gt;

&lt;p&gt;Following the example of &lt;a href="https://github.com/Melkeydev/go-blueprint?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;go-blueprint&lt;/u&gt;&lt;/a&gt;, I also set up a &lt;code&gt;Makefile&lt;/code&gt; to easily start the web server. Here’s the contents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.PHONY: dev build

dev:
    @if command -v $(HOME)/go/bin/air &amp;gt; /dev/null; then \
        AIR_CMD="$(HOME)/go/bin/air"; \
    elif command -v air &amp;gt; /dev/null; then \
        AIR_CMD="air"; \
    else \
        read -p "air is not installed. Install it? [Y/n] " choice; \
        if [ "$$choice" != "n" ] &amp;amp;&amp;amp; [ "$$choice" != "N" ]; then \
            echo "Installing..."; \
            go install github.com/air-verse/air@latest; \
            AIR_CMD="$(HOME)/go/bin/air"; \
        else \
            echo "Exiting..."; \
            exit 1; \
        fi; \
    fi; \
    echo "Starting Air..."; \
    $$AIR_CMD

build:
    @echo "Installing Tailwind..."
    npm ci
    @echo "Generate Tailwind CSS..."
    go generate
    @echo "Building Go server..."
    go build -o tmp/server main.go
    @echo "Build complete."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;The project Makefile&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Running &lt;code&gt;make dev&lt;/code&gt; will launch &lt;code&gt;air&lt;/code&gt; watching files for any changes. If you edit a template or any of the Go files, the server will relaunch.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deployment
&lt;/h2&gt;

&lt;p&gt;We can extend our minimalist philosophy to the production builds as well. I like the &lt;a href="https://github.com/GoogleContainerTools/distroless?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Distroless&lt;/u&gt;&lt;/a&gt; project from Google which provides bare-minimum container images without any operating system and without running as root. &lt;a href="https://edu.chainguard.dev/open-source/wolfi/wolfi-with-dockerfiles/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Wolfi&lt;/u&gt;&lt;/a&gt; is an alternative if you need more choice over what's installed, but still want to go with a minimalist approach.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;Dockerfile&lt;/code&gt; looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FROM golang:1.23 AS builder

WORKDIR /app
COPY go.mod ./

ENV CGO_ENABLED=0
RUN go mod download

COPY . .
RUN go build -o server main.go

# Copy the server binary into a distroless container
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=builder /app/server /

CMD ["/server"]

USER nonroot
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This assumes the CSS has already been generated with &lt;code&gt;go generate&lt;/code&gt;, after which you can build the container with &lt;code&gt;docker build -t website --load .&lt;/code&gt; and run it with &lt;code&gt;docker run -t website&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;We’ve now built a self-contained website using Tailwind CSS and dynamic HTML templates, ready to deploy as a tiny ~10 MB Docker image. Contrast this with a typical Node.js container, which could easily exceed hundreds of MB. A smaller image isn’t a particularly important goal by itself, but it’s indicative of all the extra bloat you’re shipping.&lt;/p&gt;

&lt;p&gt;Unfortunately, we still have to rely on the Tailwind CSS CLI. To illustrate how crazy things have become even with just that single external dependency, take a look inside the &lt;code&gt;node_modules&lt;/code&gt; directory and see how many packages it requires! Thankfully, &lt;a href="https://tailwindcss.com/docs/v4-beta?ref=blog.arcjet.com#installing-the-cli" rel="noopener noreferrer"&gt;&lt;u&gt;the Tailwind v4 beta includes standalone CLI binaries&lt;/u&gt;&lt;/a&gt;, so hopefully we’ll be able to use that in the future.&lt;/p&gt;

&lt;p&gt;However, the server itself has zero external dependencies and since the default &lt;code&gt;go.mod&lt;/code&gt; contains the toolchain version, even if we come back to this in 10 years it should still build and run without any changes!&lt;/p&gt;

</description>
      <category>go</category>
    </item>
    <item>
      <title>Remix Security Checklist</title>
      <dc:creator>David Mytton</dc:creator>
      <pubDate>Mon, 23 Dec 2024 11:08:57 +0000</pubDate>
      <link>https://dev.to/arcjet/remix-security-checklist-2mi8</link>
      <guid>https://dev.to/arcjet/remix-security-checklist-2mi8</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1d5pamnfh0rxzuegt4x2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1d5pamnfh0rxzuegt4x2.jpg" alt="Remix Security Checklist" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://remix.run/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;Remix&lt;/a&gt; has been growing in popularity as a more lightweight framework that closely follows web standards. As an alternative to Next.js, it has tried to take a more minimalist path. Perhaps that’s why &lt;a href="https://www.youtube.com/watch?v=hHWgGfZpk00&amp;amp;ref=blog.arcjet.com" rel="noopener noreferrer"&gt;OpenAI recently migrated the ChatGPT UI from Next.js to Remix&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.arcjet.com/next-js-security-checklist/" rel="noopener noreferrer"&gt;&lt;u&gt;Like Next.js&lt;/u&gt;&lt;/a&gt;, building frontend UI with React includes basic security out of the box - vulnerabilities like cross site scripting are much less likely due to the design of the framework. However, you still need to think about security.&lt;/p&gt;

&lt;p&gt;Good security is built in layers, creating additional walls behind those that may be breached. Although total security is impossible, there are several measures you can take to mitigate the risk of attack. Defense in depth ensures that if one mechanism fails, others still protect the application.&lt;/p&gt;

&lt;p&gt;We recently released the &lt;a href="https://blog.arcjet.com/announcing-the-arcjet-nestjs-remix-adapters/" rel="noopener noreferrer"&gt;&lt;u&gt;Arcjet security as code SDK for Remix&lt;/u&gt;&lt;/a&gt; to bring bot detection, PII redaction, signup form spam protection and rate limiting to Remix. This article will cover some of the important areas from our research about how to improve your Remix app security.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Dependencies &amp;amp; Updates
&lt;/h2&gt;

&lt;p&gt;With the frequency and severity of supply chain attacks on the rise, one of the easiest ways to keep your website and user base safe is to stay current on the latest patches and updates. Though this vulnerability class has been gaining more attention and registries are making changes to minimize the threat level, third-party integrations will always pose a risk.&lt;/p&gt;

&lt;p&gt;At Arcjet we review non-critical dependency updates every week using &lt;a href="https://docs.github.com/en/code-security/dependabot?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Dependabot&lt;/u&gt;&lt;/a&gt; with &lt;a href="https://socket.dev/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Socket&lt;/u&gt;&lt;/a&gt; pull request analysis to help us keep an eye on what’s included in those updates. Alternatively, &lt;a href="https://docs.npmjs.com/cli/v9/commands/npm-audit?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;npm audit&lt;/u&gt;&lt;/a&gt; can be used to assist you in keeping up-to-date with the latest releases that have addressed publicly known vulnerabilities.&lt;/p&gt;

&lt;p&gt;Attacks such as &lt;a href="https://blog.arcjet.com/security-concepts-for-developers-dependency-confusion-attacks/" rel="noopener noreferrer"&gt;&lt;u&gt;dependency confusion&lt;/u&gt;&lt;/a&gt; and &lt;a href="https://blog.arcjet.com/security-concepts-for-developers-package-hijacking/" rel="noopener noreferrer"&gt;&lt;u&gt;package hijacking&lt;/u&gt;&lt;/a&gt; have been responsible for major disruptions. Socket helps us ensure we mitigate those risks by highlighting unusual updates. To minimize your attack surface, consider implementing the functionality provided by &lt;a href="https://blog.arcjet.com/security-concepts-for-developers-trivial-packages/" rel="noopener noreferrer"&gt;&lt;u&gt;trivial packages&lt;/u&gt;&lt;/a&gt; yourself.&lt;/p&gt;

&lt;p&gt;The JavaScript ecosystem has a relatively high churn rate, which can be challenging to keep up with, particularly when there are breaking changes. This is a pain, but it’s more painful to be forced through several major version changes if you don’t keep up and then a critical vulnerability is only addressed in the latest release!&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Module constraints
&lt;/h2&gt;

&lt;p&gt;Server-only code will be automatically removed from what gets sent to the browser by the Remix compiler. However, to ensure this works properly, you must avoid &lt;a href="https://remix.run/docs/en/main/guides/constraints?ref=blog.arcjet.com#no-module-side-effects" rel="noopener noreferrer"&gt;&lt;u&gt;module side effects&lt;/u&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Certain operations, such as logging or API calls, that occur immediately when a module is imported can expose sensitive information or cause errors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import { auth } from "../auth.server";
import { useLoaderData } from "@remix-run/react";
import type { LoaderFunctionArgs } from "@remix-run/node";

// DANGEROUS: Immediately tries to verify auth on module import.
const authStatus = auth.verifySession();
// DANGEROUS: Potential side effect, may expose auth logic.
console.log("Auth Status:", authStatus);

export async function loader({ request }: LoaderFunctionArgs) {
  return Response.json({
    users: await auth.getAuthorizedUsers(),
    status: authStatus // Using the problematic module-level variable.
  });
}

export default function Users() {
  const data = useLoaderData&amp;lt;typeof loader&amp;gt;();
  return &amp;lt;div&amp;gt;{/* render users */}&amp;lt;/div&amp;gt;;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Instead, the code responsible for the side effect should be wrapped in the &lt;code&gt;loader&lt;/code&gt; function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import { auth } from "../auth.server";
import { useLoaderData } from "@remix-run/react";
import type { LoaderFunctionArgs } from "@remix-run/node";

export async function loader({ request }: LoaderFunctionArgs) {
  // Side effect properly contained within the loader.
  const authStatus = await auth.verifySession();
  console.log("Auth Status:", authStatus); // Safe implementation.

  return Response.json({
    users: await auth.getAuthorizedUsers(),
    status: authStatus
  });
}

export default function Users() {
  const data = useLoaderData&amp;lt;typeof loader&amp;gt;();
  return &amp;lt;div&amp;gt;{/* render users */}&amp;lt;/div&amp;gt;;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Placing sensitive operations (e.g., database queries, verifying sessions) directly in an imported module can unintentionally leak this logic to the client bundle, or execute it too early.&lt;/p&gt;

&lt;p&gt;Always wrap such operations in a &lt;code&gt;loader&lt;/code&gt; or &lt;code&gt;action&lt;/code&gt; so that Remix’s server-only compilation can properly exclude them from client code. This approach also helps ensure that any credentials or personally identifiable information (PII) are only accessed by server functions, further reducing the likelihood of a security breach.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Environment variables
&lt;/h2&gt;

&lt;p&gt;All your environment variables, such as API keys or database URLs, should be kept server-side to avoid accidental exposure. Any file with the &lt;code&gt;.server.ts&lt;/code&gt; suffix will only ever be executed on the server.&lt;/p&gt;

&lt;p&gt;You can access server-side environment variables inside your loader because loaders only ever run on the server. However, any value that is returned by the loader will be available on the client. Use sensitive environment variables inside the loader, but do not return them.&lt;/p&gt;

&lt;p&gt;Anything placed into &lt;code&gt;window.ENV&lt;/code&gt; will be exposed to the browser. While it’s convenient to pass environment variables to the client via a context like &lt;code&gt;window.ENV&lt;/code&gt;, you must ensure only non-sensitive values are exposed - like a public-facing API base URL.&lt;/p&gt;
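&lt;p&gt;One way to enforce that distinction is an explicit allowlist in a &lt;code&gt;.server.ts&lt;/code&gt; file. The sketch below is illustrative (the module path, key names, and the &lt;code&gt;publicEnv&lt;/code&gt; helper are not from any particular codebase) but shows how a secret can only leak if it is deliberately added to the list:&lt;/p&gt;

```typescript
// app/env.server.ts -- hypothetical module; the .server suffix keeps it out
// of the client bundle. Only keys named here can ever reach window.ENV.
const PUBLIC_KEYS = ["PUBLIC_API_BASE_URL"] as const;

export function publicEnv(
  env: Record<string, string | undefined>
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const key of PUBLIC_KEYS) {
    const value = env[key];
    if (value !== undefined) out[key] = value;
  }
  // Secrets such as DATABASE_URL are simply never copied.
  return out;
}
```

&lt;p&gt;A root loader can then return &lt;code&gt;publicEnv(process.env)&lt;/code&gt; and assign the result to &lt;code&gt;window.ENV&lt;/code&gt;, knowing nothing outside the allowlist can be serialized.&lt;/p&gt;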

&lt;p&gt;The ideal situation is to avoid using environment variables for any secrets - this is &lt;a href="https://blog.arcjet.com/storing-secrets-in-env-vars-considered-harmful/" rel="noopener noreferrer"&gt;a common anti-pattern that should be avoided&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Authentication &amp;amp; Authorization
&lt;/h2&gt;

&lt;p&gt;Securing routes in Remix can be accomplished with its native utilities. Session cookies can be created either with the &lt;code&gt;createCookie&lt;/code&gt; utility or with a session storage object, which will be checked in a loader or action when reading or writing data.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;createCookie&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This utility creates a logical container to manage a browser cookie issued by the server. Any &lt;a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie?ref=blog.arcjet.com#attributes" rel="noopener noreferrer"&gt;&lt;u&gt;attributes&lt;/u&gt;&lt;/a&gt; can be set by adding to the options object or during &lt;code&gt;serialize()&lt;/code&gt; when the &lt;code&gt;Set-Cookie&lt;/code&gt; response header is generated.&lt;/p&gt;

&lt;p&gt;Since cookies can be easily tampered with, &lt;a href="https://remix.run/docs/en/main/utils/cookies?ref=blog.arcjet.com#signing-cookies" rel="noopener noreferrer"&gt;Remix will automatically sign a cookie&lt;/a&gt; to verify its contents and ensure its integrity. The secrets used for signing are sourced from the &lt;code&gt;secrets&lt;/code&gt; property, which stores an array of string values.&lt;/p&gt;

&lt;p&gt;If multiple secrets are provided, the one at index position &lt;code&gt;0&lt;/code&gt; will be used to sign all outgoing cookies. However, any cookies that were signed with older secrets will still successfully decode. This is useful when you want to rotate secrets. It is critical that any secrets used are complex enough to be unguessable, as cookies could be forged if a malicious attacker is aware of their correct value.&lt;/p&gt;

&lt;p&gt;It is recommended that all created cookies are stored in a &lt;code&gt;*.server.ts&lt;/code&gt; file and then imported into your route modules. Files with the &lt;code&gt;.server.ts&lt;/code&gt; suffix are never sent to the client.&lt;/p&gt;
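&lt;p&gt;For example, an &lt;code&gt;app/cookies.server.ts&lt;/code&gt; file might define a signed cookie like this (the secret environment variable names and attribute values are illustrative):&lt;/p&gt;

```typescript
// app/cookies.server.ts -- the .server suffix keeps this out of the client bundle.
import { createCookie } from "@remix-run/node";

export const userPrefs = createCookie("user-prefs", {
  // Index 0 signs all new cookies; older secrets still verify, enabling rotation.
  secrets: [process.env.COOKIE_SECRET_NEW!, process.env.COOKIE_SECRET_OLD!],
  httpOnly: true, // not readable from client-side JavaScript
  sameSite: "lax", // the Remix default; mitigates CSRF
  secure: process.env.NODE_ENV === "production",
  maxAge: 60 * 60 * 24 * 30, // 30 days
});
```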

&lt;h3&gt;
  
  
  Session Storage
&lt;/h3&gt;

&lt;p&gt;There are a variety of session storage strategies available in Remix, as well as the ability to create a custom one:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://remix.run/docs/zh/main/utils/sessions?ref=blog.arcjet.com#createsessionstorage" rel="noopener noreferrer"&gt;&lt;u&gt;&lt;code&gt;createSessionStorage()&lt;/code&gt;&lt;/u&gt;&lt;/a&gt; - Creates a custom strategy. Requires a cookie and CRUD methods to manage session data.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://remix.run/docs/zh/main/utils/sessions?ref=blog.arcjet.com#createcookiesessionstorage" rel="noopener noreferrer"&gt;&lt;u&gt;&lt;code&gt;createCookieSessionStorage()&lt;/code&gt;&lt;/u&gt;&lt;/a&gt; - This strategy stores all session data into the session cookie. With this method, additional backend services or databases are not required. A major drawback to this strategy is that every time the session changes due to a loader or action, it must be committed.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://remix.run/docs/zh/main/utils/sessions?ref=blog.arcjet.com#createfilesessionstorage-node" rel="noopener noreferrer"&gt;&lt;u&gt;&lt;code&gt;createFileSessionStorage()&lt;/code&gt;&lt;/u&gt;&lt;/a&gt; - Used with persistent file backed sessions.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://remix.run/docs/zh/main/utils/sessions?ref=blog.arcjet.com#createworkerskvsessionstorage-cloudflare-workers" rel="noopener noreferrer"&gt;&lt;u&gt;&lt;code&gt;createWorkersKVSessionStorage()&lt;/code&gt;&lt;/u&gt;&lt;/a&gt; - Used with Cloudflare Workers KV backed sessions.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://remix.run/docs/zh/main/utils/sessions?ref=blog.arcjet.com#createarctablesessionstorage-architect-amazon-dynamodb" rel="noopener noreferrer"&gt;&lt;u&gt;&lt;code&gt;createArcTableSessionStorage()&lt;/code&gt;&lt;/u&gt;&lt;/a&gt; - Used with Amazon DynamoDB backed sessions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Define your session storage object in &lt;code&gt;app/session.ts&lt;/code&gt; to act as a centralized location for routes to access session data.&lt;/p&gt;
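&lt;p&gt;A minimal &lt;code&gt;app/session.ts&lt;/code&gt; using the cookie strategy might look like the sketch below (the cookie name and secret variable are illustrative):&lt;/p&gt;

```typescript
// app/session.ts -- central session storage, imported by loaders and actions.
import { createCookieSessionStorage } from "@remix-run/node";

export const { getSession, commitSession, destroySession } =
  createCookieSessionStorage({
    cookie: {
      name: "__session",
      httpOnly: true,
      sameSite: "lax",
      secure: true,
      secrets: [process.env.SESSION_SECRET!],
    },
  });
```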

&lt;h3&gt;
  
  
  Using Cookies
&lt;/h3&gt;

&lt;p&gt;Once a cookie is generated the &lt;code&gt;.parse()&lt;/code&gt; method can be used to extract and return its value. Then conditionals can be defined in your &lt;code&gt;loader&lt;/code&gt; and &lt;code&gt;action&lt;/code&gt; functions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// app/routes/_index.tsx
// Remember the user's preference for banner visibility using a cookie.

import { json, redirect } from "@remix-run/node";
import { useLoaderData, Form } from "@remix-run/react";
import { userPrefs } from "~/cookies.server";

// Return showBanner value from userPrefs cookie.
export async function loader({ request }) {
  const cookieHeader = request.headers.get("Cookie");
  const cookie = (await userPrefs.parse(cookieHeader)) || {};
  return json({ showBanner: cookie.showBanner });
}

// Update showBanner value to false if bannerVisibility is set to hidden.
export async function action({ request }) {
  const cookieHeader = request.headers.get("Cookie");
  const cookie = (await userPrefs.parse(cookieHeader)) || {};
  const bodyParams = await request.formData();

  if (bodyParams.get("bannerVisibility") === "hidden") {
    cookie.showBanner = false;
  }

  // Serialize updated cookie and set in redirect response.
  return redirect("/", {
    headers: {
      "Set-Cookie": await userPrefs.serialize(cookie),
    },
  });
}

export default function Home() {
  const { showBanner } = useLoaderData();

  // Form to hide the banner.
  return (
    &amp;lt;div&amp;gt;
      {showBanner &amp;amp;&amp;amp; (
        &amp;lt;div&amp;gt;
          &amp;lt;Form method="post"&amp;gt;
            &amp;lt;input type="hidden" name="bannerVisibility" value="hidden" /&amp;gt;
            &amp;lt;button type="submit"&amp;gt;Hide&amp;lt;/button&amp;gt;
          &amp;lt;/Form&amp;gt;
        &amp;lt;/div&amp;gt;
      )}
      &amp;lt;h1&amp;gt;Welcome!&amp;lt;/h1&amp;gt;
    &amp;lt;/div&amp;gt;
  );
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Be aware that any data returned from a loader will be exposed to the client, even if it is not rendered in a component. So treat these with the same care as you would give a public API endpoint.&lt;/p&gt;

&lt;h3&gt;
  
  
  External Solutions
&lt;/h3&gt;

&lt;p&gt;Alternatively, third-party authentication libraries such as &lt;a href="https://clerk.com/docs/quickstarts/remix?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Clerk&lt;/u&gt;&lt;/a&gt; and &lt;a href="https://www.better-auth.com/docs/integrations/remix?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Better Auth&lt;/u&gt;&lt;/a&gt; provide an easy way to integrate robust identity management.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Cross-Site Request Forgery
&lt;/h2&gt;

&lt;p&gt;Cross-Site Request Forgery (CSRF) attacks trick victims into submitting requests using their authenticated session. Any functionality that can be executed by authenticated users can be exploited. This includes functionality such as updating the account password, changing the email address associated with the account, deleting the account, etc.&lt;/p&gt;

&lt;p&gt;The majority of browsers have built-in protection against CSRF attacks as they support the &lt;a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie?ref=blog.arcjet.com#strict" rel="noopener noreferrer"&gt;&lt;u&gt;SameSite&lt;/u&gt;&lt;/a&gt; cookie attribute which restricts the inclusion of cookies in requests initiated by another website. Remix cookies are set to use &lt;code&gt;SameSite=Lax&lt;/code&gt; by default.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;WARNING&lt;/strong&gt;: It is important that logout functions or any mutations are performed in an &lt;code&gt;action&lt;/code&gt; and not a &lt;code&gt;loader&lt;/code&gt;, or you will put users at risk of a CSRF attack. View the &lt;a href="https://remix.run/docs/en/main/utils/sessions?ref=blog.arcjet.com#using-sessions" rel="noopener noreferrer"&gt;&lt;u&gt;official documentation&lt;/u&gt;&lt;/a&gt; for more information.&lt;/p&gt;
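&lt;p&gt;A logout route following that advice might look like this sketch (it assumes a session module exposing &lt;code&gt;getSession&lt;/code&gt; and &lt;code&gt;destroySession&lt;/code&gt;, as in the Remix session utilities):&lt;/p&gt;

```typescript
// app/routes/logout.tsx -- the mutation lives in an action, never a loader.
import { redirect } from "@remix-run/node";
import type { ActionFunctionArgs } from "@remix-run/node";
import { destroySession, getSession } from "~/session.server";

// Only a POST to /logout destroys the session; with SameSite=Lax, a cross-site
// page cannot trigger this with the victim's cookie attached.
export async function action({ request }: ActionFunctionArgs) {
  const session = await getSession(request.headers.get("Cookie"));
  return redirect("/login", {
    headers: { "Set-Cookie": await destroySession(session) },
  });
}

// A plain GET to /logout mutates nothing -- it just redirects.
export async function loader() {
  return redirect("/");
}
```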

&lt;h3&gt;
  
  
  remix-utils
&lt;/h3&gt;

&lt;p&gt;To add an extra layer of protection against CSRF attacks, you can use the &lt;a href="https://github.com/sergiodxa/remix-utils?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;remix-utils&lt;/u&gt;&lt;/a&gt; library. This library also offers other security features to compensate for the lack of native security in Remix, including safe redirects and &lt;a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;CORS&lt;/u&gt;&lt;/a&gt; implementation.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Security Headers
&lt;/h2&gt;

&lt;p&gt;Each route in Remix can set its own HTTP headers with the &lt;code&gt;HeadersFunction&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import type { HeadersFunction } from "@remix-run/node";

export const headers: HeadersFunction = () =&amp;gt; ({
  "header-name-a": "value",
  "header-name-b": "value"
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;However, the headers returned depend on the nesting level of the route:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When a route defines headers, only those headers are used by default.&lt;/li&gt;
&lt;li&gt;Parent headers are only included if they are explicitly merged. If a child and parent share the same header, the value of the child's header overwrites the parent.&lt;/li&gt;
&lt;li&gt;If a child's &lt;code&gt;loader&lt;/code&gt; function throws an error and that error is handled by a parent route, then the parent's headers are used.&lt;/li&gt;
&lt;li&gt;When a route doesn't define headers, Remix will traverse up the route hierarchy one parent at a time until it finds headers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is important to be aware of, as a more aggressive &lt;code&gt;Cache-Control&lt;/code&gt; header on a child route may cache content for longer than intended. You can avoid any overwriting by only defining headers in your childless routes.&lt;/p&gt;
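&lt;p&gt;When a child route does need its own headers, it can merge the parent's explicitly. The sketch below shows the pattern; the &lt;code&gt;Cache-Control&lt;/code&gt; override is an illustrative choice, and in a real route you would type the function with &lt;code&gt;HeadersFunction&lt;/code&gt; from &lt;code&gt;@remix-run/node&lt;/code&gt;:&lt;/p&gt;

```typescript
// Hypothetical child route: start from the parent's headers so security
// headers survive, then override only what this route needs.
export const headers = ({ parentHeaders }: { parentHeaders: Headers }) => {
  const merged = new Headers(parentHeaders);
  merged.set("Cache-Control", "no-store"); // the child's value wins on conflict
  return merged;
};
```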

&lt;p&gt;As such, security headers in Remix should be set using the &lt;code&gt;entry.server.ts&lt;/code&gt; file so they can be applied to every request. For data requests, you must use the &lt;a href="https://remix.run/docs/en/main/file-conventions/entry.server?ref=blog.arcjet.com#handledatarequest" rel="noopener noreferrer"&gt;&lt;u&gt;&lt;code&gt;handleDataRequest&lt;/code&gt;&lt;/u&gt;&lt;/a&gt; function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export function handleDataRequest(
  response: Response,
  {
    request,
    params,
    context,
  }: LoaderFunctionArgs | ActionFunctionArgs
) {
  response.headers.set("X-Custom-Header", "value");
  return response;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this knowledge, there are several &lt;a href="https://www.darkrelay.com/post/http-security-headers?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;security headers and attributes&lt;/a&gt; that you should use to protect your application.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// app/entry.server.tsx

import type { AppLoadContext, EntryContext } from "@remix-run/node";
import { RemixServer } from "@remix-run/react";
import { renderToString } from "react-dom/server";

export default function handleRequest(
  request: Request,
  responseStatusCode: number,
  responseHeaders: Headers,
  remixContext: EntryContext,
  loadContext: AppLoadContext
) {
  const markup = renderToString(
    &amp;lt;RemixServer context={remixContext} url={request.url} /&amp;gt;
  );

  // Set security headers.
  // Interpret response as HTML.
  responseHeaders.set("Content-Type", "text/html");
  // Prevent clickjacking attacks.
  responseHeaders.set("X-Frame-Options", "SAMEORIGIN");
  // Enforces Content-Type.
  responseHeaders.set("X-Content-Type-Options", "nosniff");
  // Only include path for same-origin requests.
  responseHeaders.set("Referrer-Policy", "strict-origin-when-cross-origin");
  // Only allow same-origin resources.
  responseHeaders.set(
    "Content-Security-Policy",
    "default-src 'self'; script-src 'self'; style-src 'self';"
  );
  // Only use HTTPS.
  responseHeaders.set(
    "Strict-Transport-Security",
    "max-age=31536000; includeSubDomains"
  );
  // User device protection.
  responseHeaders.set("Permissions-Policy", "camera=(), microphone=(), geolocation=()");
  // Block cross-origin window access.
  responseHeaders.set("Cross-Origin-Opener-Policy", "same-origin");
  // Block cross-origin resource embedding.
  responseHeaders.set("Cross-Origin-Resource-Policy", "same-origin");
  // Enable origin-keyed agent clustering.
  responseHeaders.set("Origin-Agent-Cluster", "?1");
  // Prevent browsers from DNS prefetching.
  responseHeaders.set("X-DNS-Prefetch-Control", "off");
  // Block Adobe from loading domain data.
  responseHeaders.set("X-Permitted-Cross-Domain-Policies", "none");

  return new Response("&amp;lt;!DOCTYPE html&amp;gt;" + markup, {
    headers: responseHeaders,
    status: responseStatusCode,
  });
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In addition to setting them manually, you can also use &lt;a href="http://helmet.js/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Helmet.js&lt;/u&gt;&lt;/a&gt; as &lt;a href="https://remix.run/docs/en/main/start/quickstart?ref=blog.arcjet.com#bring-your-own-server" rel="noopener noreferrer"&gt;&lt;u&gt;Remix can be integrated with Express&lt;/u&gt;&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import express from "express";
import helmet from "helmet";
import { createRequestHandler } from "@remix-run/express";

const app = express();

// Use Helmet to set security headers.
app.use(helmet());

// Serve static files from the public directory.
app.use(express.static("public"));

// Handle all requests with Remix.
app.all(
  "*",
  createRequestHandler({
    getLoadContext() {
      // Whatever you return here will be passed as `context` to your loaders.
    },
  })
);

const port = process.env.PORT || 3000;
app.listen(port, () =&amp;gt; {
  console.log(`Server is listening on port ${port}`);
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  7. Validation
&lt;/h2&gt;

&lt;p&gt;React, and by extension Remix, automatically escapes strings used in dynamic content by HTML-encoding the characters used in injection attacks. This provides a base level of protection unless the input is used within an anchor tag's &lt;code&gt;href&lt;/code&gt; attribute, a &lt;code&gt;style&lt;/code&gt; attribute, or &lt;code&gt;dangerouslySetInnerHTML&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Additional bypasses have also been found when parsing &lt;a href="https://medium.com/dailyjs/exploiting-script-injection-flaws-in-reactjs-883fb1fe36c1?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;JSON-objects&lt;/u&gt;&lt;/a&gt; and &lt;a href="https://medium.com/javascript-security/avoiding-xss-via-markdown-in-react-91665479900?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;converting Markdown&lt;/u&gt;&lt;/a&gt; to HTML.&lt;/p&gt;
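&lt;p&gt;If you do have to render user-supplied HTML (rendered Markdown, for example), sanitize it first. One common approach, sketched here, is the DOMPurify library via the &lt;code&gt;isomorphic-dompurify&lt;/code&gt; package so the same code runs during server rendering; the component name is illustrative:&lt;/p&gt;

```typescript
// Sketch: strip scripts and event handlers before the HTML ever reaches
// dangerouslySetInnerHTML.
import DOMPurify from "isomorphic-dompurify";

export function SafeHtml({ html }: { html: string }) {
  const clean = DOMPurify.sanitize(html); // removes <script>, onerror=, etc.
  return <div dangerouslySetInnerHTML={{ __html: clean }} />;
}
```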

&lt;p&gt;At Arcjet, we recommend using &lt;a href="https://zod.dev/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Zod&lt;/u&gt;&lt;/a&gt; as it is designed to work seamlessly with TypeScript, allowing you to declare your schema once and use it for both static checking and runtime validation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// app/schemas/auth.ts

import { z } from 'zod';
import isAlphanumeric from 'validator/lib/isAlphanumeric';

export const loginSchema = z.object({
  username: z.string()
    .min(3, "Username must be at least 3 characters long.")
    .max(20, "Username cannot exceed 20 characters.")
    .refine((val) =&amp;gt; isAlphanumeric(val, "en-US"), // Sets language locale.
      "Username can only contain letters and numbers."),

  email: z.string()
    .min(1, "Email is required.")
    .email("Please enter a valid email address."),

  password: z.string()
    .min(8, "Password must be at least 8 characters long.")
    .regex(/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)/, 
      "Password must contain at least one uppercase letter, one lowercase letter, and one number."),

  confirmPassword: z.string()
    .min(8, "Please confirm your password."),
}).refine(
  (data) =&amp;gt; data.password === data.confirmPassword,
  {
    message: "Passwords do not match.",
    path: ["confirmPassword"], // Associates error with field.
  }
);

export type LoginInput = z.infer&amp;lt;typeof loginSchema&amp;gt;;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This schema can then be imported and used in an &lt;code&gt;action&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// app/routes/login.tsx

// ...imports

export async function action({ request }: ActionFunctionArgs) {
  const formData = await request.formData();
  const data = Object.fromEntries(formData);

  const result = loginSchema.safeParse(data);
  if (!result.success) {
    return Response.json(
      { errors: result.error.flatten().fieldErrors },
      { status: 400 }
    );
  }

  const { email, password } = result.data;
  const user = await login(email, password);
  if (!user) {
    return Response.json(
      { errors: { form: "Invalid credentials." } },
      { status: 401 }
    );
  }

  // Valid credentials create a session and redirects user to dashboard.
  return createUserSession(user.id, "/dashboard");
}

export default function Login() {
  const actionData = useActionData&amp;lt;typeof action&amp;gt;();
  const navigation = useNavigation();
  const { errors, validate } = useFormValidation(loginSchema);
  // Route action being called due to non GET form submission.
  const isSubmitting = navigation.state === "submitting";
  // ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the form, any fields that fail validation will display an error that is sent from the server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// app/routes/login.tsx

// Within &amp;lt;Form method="post"&amp;gt;.
&amp;lt;div&amp;gt;
  &amp;lt;input name="email" type="email" required /&amp;gt;
  {actionData?.errors?.email &amp;amp;&amp;amp; (
    &amp;lt;div&amp;gt;{actionData.errors.email[0]}&amp;lt;/div&amp;gt;
  )}
&amp;lt;/div&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The same Zod schema can also be used on the client via a hook, providing immediate validation before the form submission reaches the server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// app/hooks/useFormValidation.ts

import { useState } from "react";
import type { z } from "zod";

export function useFormValidation&amp;lt;T extends z.ZodType&amp;gt;(schema: T) {
  const [errors, setErrors] = useState&amp;lt;z.inferFlattenedErrors&amp;lt;T&amp;gt;['fieldErrors']&amp;gt;({});

  const validate = (formData: FormData) =&amp;gt; {
    const data = Object.fromEntries(formData);
    const result = schema.safeParse(data);

    if (!result.success) {
      setErrors(result.error.flatten().fieldErrors);
      return false;
    }

    setErrors({});
    return true;
  };

  return { errors, validate };
}

// app/routes/login.tsx

// Added to the Login() function.
const handleSubmit = (event: React.FormEvent&amp;lt;HTMLFormElement&amp;gt;) =&amp;gt; {
  const form = event.currentTarget;
  const formData = new FormData(form);

  if (!validate(formData)) {
    event.preventDefault();
  }
};

// app/routes/login.tsx

&amp;lt;Form method="post" onSubmit={handleSubmit}&amp;gt;
  &amp;lt;div&amp;gt;
    &amp;lt;input name="email" type="email" required /&amp;gt;
    // Client-side OR server-side errors.
    {(errors.email || actionData?.errors?.email) &amp;amp;&amp;amp; (
      &amp;lt;div&amp;gt;
        {errors.email?.[0] || actionData?.errors?.email?.[0]}
      &amp;lt;/div&amp;gt;
    )}
  &amp;lt;/div&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Validation can also be performed on &lt;code&gt;GET&lt;/code&gt; query parameters and cookies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import type { LoaderFunctionArgs } from "@remix-run/node";
import { z } from "zod";
import { getSession } from "~/session.server";

// Example query: ?page=2&amp;amp;sort=desc
const querySchema = z.object({
  // .coerce converts the page parameter string "2" to the number 2.
  // page must be &amp;gt;= 1.
  page: z.coerce.number().min(1).default(1),
  // sort must be either "asc" or "desc", defaults to "asc" if missing.
  sort: z.enum(["asc", "desc"]).default("asc"),
});

// userId must be non-empty string.
const sessionSchema = z.object({ userId: z.string().min(1) });

export async function loader({ request }: LoaderFunctionArgs) {
  try {
    const url = new URL(request.url);
    const queryParams = Object.fromEntries(url.searchParams);
    const validatedQuery = querySchema.parse(queryParams);

    const session = await getSession(request.headers.get("Cookie"));
    const sessionData = { userId: session.get("userId") };
    const validatedSession = sessionSchema.parse(sessionData);

    return Response.json({ query: validatedQuery });
  } catch (error) {
    if (error instanceof z.ZodError) {
      return Response.json(
        { error: error.flatten().fieldErrors },
        { status: 400 }
      );
    }
    throw error;
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  8. File Uploads
&lt;/h2&gt;

&lt;p&gt;If your application allows users to upload files, it is essential to implement security measures to mitigate the risk of attackers uploading and executing malicious code in the context of your application.&lt;/p&gt;

&lt;p&gt;In Remix, you can use the &lt;a href="https://remix.run/docs/en/main/utils/unstable-create-file-upload-handler?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;&lt;code&gt;unstable_createFileUploadHandler&lt;/code&gt;&lt;/u&gt;&lt;/a&gt; and &lt;a href="https://remix.run/docs/en/main/utils/unstable-create-memory-upload-handler?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;&lt;code&gt;unstable_createMemoryUploadHandler&lt;/code&gt;&lt;/u&gt;&lt;/a&gt; utilities to filter uploads based on file characteristics. However, configuring these restrictions to be both practical and secure can be overly complex.&lt;/p&gt;
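&lt;p&gt;As a minimal sketch of the kind of restriction logic involved, here is a dependency-free content-type allowlist and size cap. The specific types and limit are illustrative choices, not Remix defaults:&lt;/p&gt;

```typescript
// Hypothetical allowlist and size limit; tune these for your application.
const ALLOWED_TYPES = new Set(["image/png", "image/jpeg", "image/webp"]);
const MAX_SIZE_BYTES = 5 * 1024 * 1024; // 5 MB

// A predicate like this can back an upload handler's filter logic.
function isUploadAllowed(contentType: string, sizeBytes: number): boolean {
  return (
    ALLOWED_TYPES.has(contentType) &&
    sizeBytes > 0 &&
    sizeBytes <= MAX_SIZE_BYTES
  );
}

console.log(isUploadAllowed("image/png", 1024)); // true
console.log(isUploadAllowed("text/html", 1024)); // false
```

&lt;p&gt;Remember that the declared &lt;code&gt;Content-Type&lt;/code&gt; is client-controlled, so an allowlist like this reduces accidental misuse but is not a substitute for server-side content inspection.&lt;/p&gt;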

&lt;p&gt;Instead of dealing with securing upload functionality yourself, consider using services such as &lt;a href="https://docs.uploadthing.com/getting-started/remix?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;UploadThing&lt;/u&gt;&lt;/a&gt; or sending files directly to a cloud object storage service. Uploading files to disk on a server you operate is bad practice.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;These general guidelines will help you create a secure application for both you and your users. However, the specific configurations and implementations will need to be tailored to meet the needs of your specific application. There is no one-size-fits-all approach to security.&lt;/p&gt;

</description>
      <category>remix</category>
      <category>security</category>
      <category>javascript</category>
    </item>
    <item>
      <title>Bot spoofing and how to detect it with Arcjet</title>
      <dc:creator>David Mytton</dc:creator>
      <pubDate>Fri, 20 Dec 2024 14:01:10 +0000</pubDate>
      <link>https://dev.to/arcjet/bot-spoofing-and-how-to-detect-it-with-arcjet-1oeo</link>
      <guid>https://dev.to/arcjet/bot-spoofing-and-how-to-detect-it-with-arcjet-1oeo</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyu4wairf61yreuwk0riu.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyu4wairf61yreuwk0riu.jpg" alt="Bot spoofing and how to detect it with Arcjet" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://en.wikipedia.org/wiki/User-Agent_header?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;code&gt;User-Agent&lt;/code&gt; header&lt;/a&gt; is the name badge for web requests. Although it's&lt;a href="https://www.chromium.org/updates/ua-reduction/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;been deprecated by some browsers&lt;/a&gt;, it's still sent by well behaving clients and is commonly used to identify automated clients. It's &lt;a href="https://datatracker.ietf.org/doc/html/rfc9309?ref=blog.arcjet.com#name-the-user-agent-line" rel="noopener noreferrer"&gt;what &lt;code&gt;robots.txt&lt;/code&gt; is based on&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;But just like a name badge, clients can write whatever they like in the &lt;code&gt;User-Agent&lt;/code&gt; header. This is a problem if it's the only thing you use to set up rules for managing bots, and is one reason why Arcjet uses other fingerprinting techniques like IP address analysis as part of our bot detection features.&lt;/p&gt;

&lt;p&gt;Now we're giving developers more detailed verification options: every request is checked behind the scenes against published IP and reverse DNS data for common bots.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blog.arcjet.com/bot-detection-isnt-perfect/" rel="noopener noreferrer"&gt;Bot detection is never perfect&lt;/a&gt;, but this improvement helps protect against spoofed bots where clients pretend to be someone else. For example, we can detect if a client is really Googlebot by checking if the request IP is within &lt;a href="https://developers.google.com/search/docs/crawling-indexing/verifying-googlebot?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;Google’s published IP ranges&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The analysis happens automatically for all Arcjet Pro plan users. If we detect a spoofed bot (or successfully verify a bot), additional metadata will be added to the response decision so you can decide how to handle it.&lt;/p&gt;

&lt;p&gt;For example, to check for spoofed bots:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;if (decision.reason.isBot() &amp;amp;&amp;amp; decision.reason.isSpoofed()) {
  console.log("Detected spoofed bot", decision.reason.spoofed);
  // Return a 403 or similar response
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And to confirm whether a bot has been verified:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;if (decision.reason.isBot() &amp;amp;&amp;amp; decision.reason.isVerified()) {
  console.log("Verified bot", decision.reason.verified);
  // Allow the request
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Right now we support verification for Google, Bing, ChatGPT, and Datadog. &lt;a href="https://github.com/arcjet/well-known-bots?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;Our bot list&lt;/a&gt; is open source and we'll be adding more over time.&lt;/p&gt;

&lt;p&gt;So if you're having trouble with bot traffic, try out &lt;a href="https://docs.arcjet.com/bot-protection/reference?ref=blog.arcjet.com#bot-verification" rel="noopener noreferrer"&gt;verified bot detection in Arcjet&lt;/a&gt; by &lt;a href="https://app.arcjet.com/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;signing up for free today&lt;/a&gt;. When you're ready to go to production, &lt;a href="mailto:sales@arcjet.com"&gt;reach out&lt;/a&gt; to upgrade to Pro (&lt;a href="https://arcjet.com/pricing?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;pricing&lt;/a&gt;).&lt;/p&gt;

</description>
      <category>changelog</category>
      <category>botdetection</category>
    </item>
    <item>
      <title>The Wasm Component Model and idiomatic codegen</title>
      <dc:creator>David Mytton</dc:creator>
      <pubDate>Tue, 17 Dec 2024 11:08:32 +0000</pubDate>
      <link>https://dev.to/arcjet/the-wasm-component-model-and-idiomatic-codegen-54ml</link>
      <guid>https://dev.to/arcjet/the-wasm-component-model-and-idiomatic-codegen-54ml</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flbr78c6q1g42jhsd96fa.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flbr78c6q1g42jhsd96fa.jpg" alt="The Wasm Component Model and idiomatic codegen" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://arcjet.com/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;Arcjet&lt;/u&gt;&lt;/a&gt; bundles WebAssembly with our security as code SDK. This helps developers implement common security functionality like PII detection and bot detection directly in their code. Much of the logic is embedded in Wasm, which gives us a secure sandbox with near-native performance and is part of our philosophy around &lt;a href="https://blog.arcjet.com/how-we-achieve-our-25ms-p95-response-time-sla/" rel="noopener noreferrer"&gt;&lt;u&gt;local-first security&lt;/u&gt;&lt;/a&gt;&lt;u&gt;.&lt;/u&gt;&lt;/p&gt;

&lt;p&gt;The ability to run the same code across platforms is also helpful as we expand support beyond JavaScript to other tech stacks, but it requires an important abstraction to translate between languages (our Wasm is compiled from Rust).&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://component-model.bytecodealliance.org/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;WebAssembly Component Model&lt;/u&gt;&lt;/a&gt; is the powerful construct which enables this, but a construct can only be as good as the implementations and tooling surrounding it. For the Component Model, this is most evident in the code generation for Hosts (environments that execute WebAssembly Component Model) and Guests (WebAssembly modules written in any language and compiled to the Component Model; Rust in our case).&lt;/p&gt;

&lt;p&gt;The Component Model defines a language for communication between Hosts and Guests which is primarily composed of types, functions, imports and exports. It tries to define a broad language, but some types, such as variants, tuples, and resources, might not exist in a given general purpose programming language.&lt;/p&gt;
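&lt;p&gt;For example, a short WIT definition (the package and type names here are invented for illustration) can declare a variant and a result type, neither of which maps directly onto many general purpose languages:&lt;/p&gt;

```wit
package arcjet:example;

interface detection {
  // A variant: exactly one case is set, like a tagged union.
  variant bot-config {
    allowed(list<string>),
    denied(list<string>)
  }

  // result carries either a success value or an error value.
  detect: func(request: string, config: bot-config) -> result<string, string>;
}
```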

&lt;p&gt;When a tool tries to generate code for one of these languages, the authors often need to get creative to map Component Model types to that general purpose language. For example, we use &lt;a href="https://github.com/bytecodealliance/jco?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;jco&lt;/u&gt;&lt;/a&gt; for generating JS bindings and this implements variants using a JavaScript object in the shape of &lt;code&gt;{ tag: string, value: string }&lt;/code&gt;. It even has a special case for the &lt;code&gt;result&amp;lt;_, _&amp;gt;&lt;/code&gt; type where the error variant is turned into an &lt;code&gt;Error&lt;/code&gt; and thrown.&lt;/p&gt;
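&lt;p&gt;As a sketch, a jco-style variant can be modeled in TypeScript as a discriminated union. The names below are illustrative rather than Arcjet's actual bindings:&lt;/p&gt;

```typescript
// Hypothetical shape of a jco-generated variant (names are illustrative).
type BotConfig =
  | { tag: "allowed"; value: string[] }
  | { tag: "denied"; value: string[] };

function describe(config: BotConfig): string {
  // Narrowing on `tag` recovers the concrete case.
  return config.tag === "allowed"
    ? `allow ${config.value.length} entities`
    : `deny ${config.value.length} entities`;
}

console.log(describe({ tag: "allowed", value: ["GOOGLE_CRAWLER"] }));
// "allow 1 entities"
```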

&lt;p&gt;This post explores how the Wasm Component Model enables cross-language integrations, the complexities of code generation for Hosts and Guests, and the trade-offs we make to achieve idiomatic code in languages like Go.&lt;/p&gt;

&lt;h2&gt;
  
  
  Host code generation for Go
&lt;/h2&gt;

&lt;p&gt;At Arcjet, we have had to build a tool to generate code for Hosts written in the Go programming language. Although our SDK attempts to analyze everything locally, that is not always possible and so we have &lt;a href="https://blog.arcjet.com/how-we-achieve-our-25ms-p95-response-time-sla/" rel="noopener noreferrer"&gt;&lt;u&gt;an API written in Go&lt;/u&gt;&lt;/a&gt; which augments local decisions with additional metadata.&lt;/p&gt;

&lt;p&gt;Go has a very minimal syntax and type system by design. It didn’t even have generics until recently (added in Go 1.18) and they still have significant limitations. This makes codegen from the Component Model to Go complex in various ways. &lt;/p&gt;

&lt;p&gt;For example, we could generate a &lt;code&gt;result&amp;lt;_, _&amp;gt;&lt;/code&gt; as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;type Result[V any] struct {
    value V
    err error
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;However, this limits the type that can be provided in the error position. So we’d need to codegen it as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;type Result[V any, E any] struct {
    value V
    err E
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This works but becomes cumbersome to use with other idiomatic Go, which often uses the &lt;code&gt;val, err := doSomething()&lt;/code&gt; convention to indicate the same semantics as the &lt;code&gt;Result&lt;/code&gt; type we’ve defined above. &lt;/p&gt;

&lt;p&gt;Additionally, constructing this &lt;code&gt;Result&lt;/code&gt; is cumbersome: &lt;code&gt;Result[int, string]{value: 1, err: ""}&lt;/code&gt;. Instead of providing the &lt;code&gt;Result&lt;/code&gt; type, we probably want to match idiomatic patterns so Go users feel natural consuming our generated bindings.&lt;/p&gt;
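&lt;p&gt;For contrast, the idiomatic mapping surfaces &lt;code&gt;result&amp;lt;_, _&amp;gt;&lt;/code&gt; as Go's familiar two-value return. The function below is a hypothetical stand-in for a generated binding:&lt;/p&gt;

```go
package main

import (
    "errors"
    "fmt"
)

// Hypothetical generated binding: the result type maps onto Go's
// conventional (value, error) return instead of a Result wrapper struct.
func detect(request string) (string, error) {
    if request == "" {
        return "", errors.New("empty request")
    }
    return "NOT_A_BOT", nil
}

func main() {
    // Callers consume it like any other Go function.
    verdict, err := detect("GET / HTTP/1.1")
    if err != nil {
        panic(err)
    }
    fmt.Println(verdict)
}
```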

&lt;h2&gt;
  
  
  Idiomatic vs Direct Mapping
&lt;/h2&gt;

&lt;p&gt;Code can be generated to feel more natural to the language or it can be a more direct mapping to the Component Model types. Neither option fits 100% of use cases so it is up to the tool authors to decide which makes the most sense.&lt;/p&gt;

&lt;p&gt;For the Arcjet tooling, we chose the idiomatic Go approach for &lt;code&gt;option&amp;lt;_&amp;gt;&lt;/code&gt; and &lt;code&gt;result&amp;lt;_, _&amp;gt;&lt;/code&gt; types, which map to &lt;code&gt;val, ok := doSomething()&lt;/code&gt; and &lt;code&gt;val, err := doSomething()&lt;/code&gt; respectively. For variants, we create an interface that each variant needs to implement, such as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;type BotConfig interface {
    isBotConfig()
}

func (AllowedBotConfig) isBotConfig() {}

func (DeniedBotConfig) isBotConfig() {}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This strikes a good balance between type safety and unnecessary wrapping. Of course, there are situations where the wrapping is required, but those can be handled as edge cases.&lt;/p&gt;
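&lt;p&gt;Consumers recover the concrete case with a standard type switch. Here is a self-contained sketch (the struct fields are illustrative, not our generated code):&lt;/p&gt;

```go
package main

import "fmt"

// Variant cases from the post; the Entities field is illustrative.
type BotConfig interface{ isBotConfig() }

type AllowedBotConfig struct{ Entities []string }
type DeniedBotConfig struct{ Entities []string }

func (AllowedBotConfig) isBotConfig() {}
func (DeniedBotConfig) isBotConfig() {}

// describe recovers the concrete variant case with a type switch.
func describe(config BotConfig) string {
    switch c := config.(type) {
    case AllowedBotConfig:
        return fmt.Sprintf("allow %d entities", len(c.Entities))
    case DeniedBotConfig:
        return fmt.Sprintf("deny %d entities", len(c.Entities))
    default:
        return "unknown"
    }
}

func main() {
    fmt.Println(describe(AllowedBotConfig{Entities: []string{"GOOGLE_CRAWLER"}}))
}
```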

&lt;p&gt;Developers may struggle with non-idiomatic patterns, leading to verbose, less maintainable code. Using established conventions makes the code feel more familiar, but does require some additional effort to implement.&lt;/p&gt;

&lt;p&gt;We decided to take the idiomatic path to minimize friction and make it easier for our team to know what to expect when moving around the codebase.&lt;/p&gt;

&lt;h2&gt;
  
  
  Calling conventions
&lt;/h2&gt;

&lt;p&gt;One of the biggest decisions tooling authors need to make is the calling convention of the bindings. This includes deciding how and when imports are compiled, whether the Wasm module is compiled during setup or at instantiation, and how cleanup is handled.&lt;/p&gt;

&lt;p&gt;In the Arcjet codebase, we chose the factory/instance pattern to optimize performance. Compiling a WebAssembly module is expensive, so we do it once in the &lt;code&gt;NewBotFactory()&lt;/code&gt; constructor. Subsequent &lt;code&gt;Instantiate()&lt;/code&gt; calls are then fast and cheap, allowing for high throughput in production workloads.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func NewBotFactory(
    ctx context.Context,
) (*BotFactory, error) {
    runtime := wazero.NewRuntime(ctx)

    // ... Imports are compiled here if there are any

    // Compiling the module takes a LONG time, so we want to do it once and hold
    // onto it with the Runtime
    module, err := runtime.CompileModule(ctx, wasmFileBot)
    if err != nil {
            return nil, err
    }

    return &amp;amp;BotFactory{runtime, module}, nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Consumers construct this &lt;code&gt;BotFactory&lt;/code&gt; once by calling &lt;code&gt;NewBotFactory(ctx)&lt;/code&gt; and use it to create multiple instances via the &lt;code&gt;Instantiate&lt;/code&gt; method.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func (f *BotFactory) Instantiate(ctx context.Context) (*BotInstance, error) {
    if module, err := f.runtime.InstantiateModule(ctx, f.module, wazero.NewModuleConfig()); err != nil {
            return nil, err
    } else {
            return &amp;amp;BotInstance{module}, nil
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Instantiation is very fast if the module has already been compiled, like we do with &lt;code&gt;runtime.CompileModule()&lt;/code&gt; when constructing the factory.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;BotInstance&lt;/code&gt; has functions which were exported from the Component Model definition.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func (i *BotInstance) Detect(
    ctx context.Context,
    request string,
    options BotConfig,
) (BotResult, error) {
   // ... Lots of generated code for binding to Wazero
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Generally, after using a &lt;code&gt;BotInstance&lt;/code&gt;, we want to clean it up to ensure we’re not leaking memory. For this we provide the &lt;code&gt;Close&lt;/code&gt; function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func (i *BotInstance) Close(ctx context.Context) error {
    if err := i.module.Close(ctx); err != nil {
            return err
    }

    return nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want to clean up the entire &lt;code&gt;BotFactory&lt;/code&gt;, that can be closed too:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func (f *BotFactory) Close(ctx context.Context) {
    f.runtime.Close(ctx)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can put all these APIs together to call functions on this WebAssembly module:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ctx := context.Background()
factory, err := NewBotFactory(ctx)
if err != nil {
  panic(err)
}
defer factory.Close(ctx)

instance, err := factory.Instantiate(ctx)
if err != nil {
    panic(err)
}
defer instance.Close(ctx)

result, err := instance.Detect(
  ctx,
  request,
  AllowedBotConfig{
    Entities:         []BotEntity{"GOOGLE_CRAWLER"},
    SkipCustomDetect: true,
  },
)
if err != nil {
    panic(err)
}
fmt.Printf("%+v", result)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This pattern of factory and instance construction takes more code to use, but it was chosen to achieve as much performance as possible in the hot paths of the Arcjet service.&lt;/p&gt;

&lt;p&gt;By front-loading &lt;a href="https://wazero.io/docs/how_the_optimizing_compiler_works/?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;the compilation cost&lt;/u&gt;&lt;/a&gt;, we ensure that in the hot paths of the Arcjet service - where latency matters most - request handling is as efficient as possible. This trade-off does add some complexity to initialization code, but it pays off with substantially lower overhead per request - &lt;a href="https://blog.arcjet.com/lessons-from-running-webassembly-in-production-with-go-wazero/" rel="noopener noreferrer"&gt;&lt;u&gt;see our discussion of the tradeoffs&lt;/u&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Trade-offs
&lt;/h2&gt;

&lt;p&gt;Integrating two or more languages is fraught with trade-offs, whether &lt;a href="https://blog.arcjet.com/calling-rust-ffi-libraries-from-go/" rel="noopener noreferrer"&gt;&lt;u&gt;using native FFI&lt;/u&gt;&lt;/a&gt; or the Component Model. &lt;/p&gt;

&lt;p&gt;This post discussed a few of the challenges we’ve encountered at Arcjet and the reasoning behind our decisions. If we all build on the same primitives, such as the Component Model and WIT, we can all leverage the same high-quality tooling, such as &lt;a href="https://crates.io/crates/wit-bindgen?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;wit-bindgen&lt;/u&gt;&lt;/a&gt; or &lt;a href="https://crates.io/crates/wit-component?ref=blog.arcjet.com" rel="noopener noreferrer"&gt;&lt;u&gt;wit-component&lt;/u&gt;&lt;/a&gt;, and build tooling to suit every use case. This is why working towards standards helps everyone. &lt;/p&gt;

&lt;p&gt;The WebAssembly Component Model offers a powerful abstraction for cross-language integration, but translating its types into languages like Go introduces subtle design challenges. By choosing idiomatic patterns and selectively optimizing for performance - such as using a factory/instance pattern - we can provide a natural developer experience while maintaining efficiency. &lt;/p&gt;

&lt;p&gt;As tooling around the Component Model evolves, we can look forward to more refined codegen approaches that further simplify these integrations.&lt;/p&gt;

</description>
      <category>webassembly</category>
      <category>go</category>
      <category>javascript</category>
      <category>engineering</category>
    </item>
  </channel>
</rss>
