Achraf

Posted on Mar 30, 2022 • Originally published at blog.escape.tech

Getting Started with GraphQL Security

#graphql #security

GraphQL has been adopted by some of the biggest platforms out there (Facebook, Twitter, GitHub, Pinterest, Walmart): big businesses that can’t compromise on security.
But even though GraphQL can be a very secure option for your API, it does not come secure out of the box. It’s actually quite the opposite: all doors are open even for the most novice hackers.
Plus, GraphQL has its own set of new considerations that, if you come from REST, you might have missed!

What’s the risk

The primary categories of attacks that you must absolutely protect your application from are:

injections (SQL, XSS, CCS, etc.): using unexpected or malicious inputs to crash your application or access private data
access control: overly loose restrictions on queries and mutations, allowing anybody to take actions without the necessary role
brute-force attacks: submitting a huge number of (leaked) credentials in the hope of guessing correctly
DoS (Denial of Service): flooding your API to make it crash
CSRF: inducing users to perform unwanted actions by simply clicking a malicious link to your API

Note that these attacks are so basic that they can be performed automatically and at scale. In other words, regardless of the success or domain of your application, you are prone to be the target of a script running on thousands of scraped API endpoints.

Fortunately, there are some very simple strategies you can adopt to protect your API. Read on to learn how you can implement them 👇

1. Introspection

Most attacks specific to GraphQL start with an introspection query: a built-in query that returns your whole data schema. Anyone can learn exactly which queries and mutations your API accepts and send attacks accordingly until they find a breach.
This feature is enabled by default. If you can (for example, with a private API), make sure you disable it.
You can do it with your GraphQL framework (e.g. Apollo):

const server = new ApolloServer({
  typeDefs,
  resolvers,
  introspection: process.env.NODE_ENV !== 'production'
});

Or you can use a plugin like graphql-disable-introspection:

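import NoIntrospection from 'graphql-disable-introspection';

app.use("/graphql", bodyParser.json(), graphqlExpress({
    schema: myGraphQLSchema,
    // Reject any query that asks for the schema itself
    validationRules: [NoIntrospection]
}));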

The same goes for GraphiQL: turn it off in production.
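
To make concrete what gets blocked: introspection queries are the ones that select the meta-fields __schema or __type. The snippet below is a naive string-level illustration only, not how the plugin above works (validation rules operate on the parsed query), and the function name looksLikeIntrospection is made up for this example.

```javascript
// Naive illustration: flags queries that mention the introspection meta-fields.
// `__typename` is deliberately not matched, since regular clients rely on it.
function looksLikeIntrospection(query) {
  return /\b__(schema|type)\b/.test(query);
}

console.log(looksLikeIntrospection('{ __schema { types { name } } }')); // true
console.log(looksLikeIntrospection('{ user { __typename } }'));         // false
```

A check like this can be handy in an automated test that confirms your production endpoint refuses such queries.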

2. Limit Access control with Authorization and Authentication

Another classic API security concern is access control. In most applications, features are accessible based on your authentication status and your role (user, admin).
Not having the right authorization-check layer will expose private data and higher access features to unauthorized users, e.g. deleting an asset without the admin role.
In REST, we can use a simple middleware approach to protect all sub-routes of an API:

app.use('/api/admin', isAdmin())

In GraphQL, we can perform the same thing with the context hook:

const server = new ApolloServer({
    typeDefs,
    resolvers,
    context: async ({ req, res }) => {
        // Check authorization from the headers
        const authHeader = req.headers.authorization || '';
        const token = authHeader.replace('Bearer ', '');
        if (!token) {
            throw new Error('Unauthorized');
        }
        let user = null;
        try {
            user = await verifyJwt(token);
        } catch (e) {
            throw new Error('JWT verification failed');
        }
        // Add user to the context object
        return { user };
    }
});

But throwing an error at this stage would block EVERY unauthenticated request! You might have different queries/mutations with different levels of authentication/authorization.
You guessed it, it’s gonna happen at the resolver level. We can use a cleaner approach with a resolver middleware:

const server = new ApolloServer({
    typeDefs,
    resolvers,
    context: async ({ req, res }) => {
        // Check authorization from the headers
        const authHeader = req.headers.authorization || '';
        const token = authHeader.replace('Bearer ', '');
        let user = null;
        if (!token) {
            // No token: continue as an anonymous user and let the resolvers decide
            return { user };
        }
        try {
            user = await verifyJwt(token);
        } catch (e) {
            throw new Error('JWT verification failed');
        }
        // Add user to the context object
        return { user };
    }
});

// Authentication middleware
const authMiddleware = (next) => (parent, args, context) => {
    if(!context.user) {
        throw new Error("Unauthenticated")
    }
    return next(parent, args, context)
}

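const { ForbiddenError } = require('apollo-server-express');

const resolvers = {
    Mutation: {
        // Assumes the mutation takes the id of the asset to delete
        deleteAsset: authMiddleware(async (parent, { id }, context) => {
            // authMiddleware already guarantees that context.user is set
            if (context.user.role !== "admin") {
                throw new ForbiddenError("Missing role")
            }
            const index = context.db.assets.findIndex((asset) => asset.id === id)
            if (index === -1) {
                throw new Error("Asset not found")
            }
            const [deletedAsset] = context.db.assets.splice(index, 1)
            return deletedAsset
        })
    }
}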

Now that we’ve built the first layer of security (access control), the next type of attack is death by overload (DoS). Let’s go over different strategies that tackle the threat from different angles and, combined, can make your API unshakable 💪

3. Timeout

The easiest strategy to defend your API against large (stress-inducing 😣) queries is to simply configure a maximum time to process a query.
This is actually not specific to GraphQL so it happens in your backend framework (example with Express):

const server = new ApolloServer({ typeDefs, resolvers });

const app = express();
server.applyMiddleware({ app });

app.listen(5000, () => {
    console.log(`Server running at http://localhost:5000`)
}).setTimeout(1000*60*10)

The advantage is that it doesn’t require prior knowledge about the incoming queries.
But, damage can already be done by the time you reach that limit. This is the final layer of protection if the strategies below fail to block evil queries 👹

4. Query Whitelisting

Another very generic - but efficient - strategy is to whitelist allowed queries. This way you know exactly what you’re going to get. As with the timeout, this is a simple strategy that lets you constrain the possibilities for hackers.
You can use persistgraphql by Apollo to auto-generate a list of approved queries at build time.
This might not work for you if you have a complex API, so let’s get fancier!
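
To show the idea behind whitelisting, here is a minimal sketch that does not use persistgraphql: it hashes a hand-written list of approved operations and rejects anything whose hash is unknown. The queries, the names approvedQueries and isAllowed, and the whitespace normalization are all assumptions made for illustration.

```javascript
const crypto = require('node:crypto');

// Hypothetical operations the client is known to send (normally collected at build time).
const approvedQueries = [
  'query Me { me { id name } }',
  'mutation Login($email: String!, $password: String!) { login(email: $email, password: $password) { token } }',
];

// Collapse whitespace so formatting differences do not change the hash.
const normalize = (query) => query.replace(/\s+/g, ' ').trim();
const hash = (query) =>
  crypto.createHash('sha256').update(normalize(query)).digest('hex');

const allowlist = new Set(approvedQueries.map(hash));

// Returns true only for operations that were approved ahead of time.
function isAllowed(query) {
  return allowlist.has(hash(query));
}

console.log(isAllowed('query Me {\n  me { id name }\n}'));  // true
console.log(isAllowed('{ users { email password } }'));     // false
```

Comparing hashes rather than raw strings keeps the lookup cheap and means the server never has to store or log the full text of rejected queries.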

4. Depth Limiting

One of the specificities of GraphQL is nested queries. They can be awesome, but used excessively they can demand enormous processing resources and become a prison of your own making...
The advantage of this strategy - as opposed to timeout - is that a deep query will not even be executed, and thus will not put any load on your server.
You can use the very light graphql-depth-limit library to easily limit the depth of queries. First, check how deep you expect queries to be, and then set a maximum depth accordingly:

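const depthLimit = require('graphql-depth-limit')

// Queries nested more than 10 levels deep are rejected during validation
app.use("/api", graphqlServer({validationRules: [depthLimit(10)]}))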
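
To see what "depth" means here, the sketch below measures it on a simplified stand-in for a parsed query: nested plain objects are sub-selections and true marks a leaf field. This is an illustrative assumption, not the library's implementation, and the library may count levels differently; selectionDepth and assertMaxDepth are made-up names.

```javascript
// Stand-in for a parsed selection set: nested objects are sub-selections,
// `true` marks a leaf field.
function selectionDepth(selection) {
  if (selection === true) return 0;
  const childDepths = Object.values(selection).map((child) => selectionDepth(child));
  return 1 + Math.max(0, ...childDepths);
}

// Throws before execution if the query is nested too deeply.
function assertMaxDepth(selection, maxDepth) {
  const depth = selectionDepth(selection);
  if (depth > maxDepth) {
    throw new Error(`Query depth ${depth} exceeds the maximum of ${maxDepth}`);
  }
  return depth;
}

// Equivalent of: { user { friends { friends { name } } } }
const nested = { user: { friends: { friends: { name: true } } } };
console.log(selectionDepth(nested)); // 4
```

Because the check only walks the shape of the query, it runs before any resolver is called.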

Now, it can’t be perfect, right?

Indeed, this strategy does not consider domain-specific queries that might be too expensive without using excessive depth.

So we are still exposed to DoS attacks...

5. Resource Limitations with Complexity Analysis

Resolving even a simple query can, in some cases, be very expensive. Maybe it requires running a big machine learning model or complex algorithms. The timeout above will help limit this effect but, as we said, the damage may already be done by the time it kicks in.
The best way to approach this challenge is to run a complexity analysis (sometimes also called cost analysis).
The general idea is to allocate a number to each object returned as well as a maximum complexity limit for queries (see how GitHub is doing it for their GraphQL API).

There are some packages that might help you in case you need to go down that route. You can start with something simple like graphql-validation-complexity. If you need more control for different use cases, you can check out graphql-cost-analysis.

The limit of this strategy is the difficulty of implementing it. How do you even estimate the complexity of a query? And how do you keep these numbers up to date (scaling/changing infrastructures)?
This is a strategy that usually makes sense once you have implemented all of the above and need a more advanced approach to limiting expensive queries.
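
As a rough sketch of the idea, assume each field has a cost (1 by default, more for fields known to be expensive) and a query's complexity is the sum over every field it selects. The field names, the costs, and the nested-object representation of the query are all hypothetical; the packages above derive this from the real schema instead.

```javascript
// Hypothetical per-field costs; anything not listed costs DEFAULT_COST.
const fieldCosts = new Map([
  ['search', 10],
  ['recommendations', 50], // e.g. backed by a machine learning model
]);
const DEFAULT_COST = 1;

// Stand-in for a parsed selection set: nested objects are sub-selections,
// `true` marks a leaf field.
function selectionCost(selection) {
  if (selection === true) return 0;
  return Object.entries(selection).reduce(
    (total, [field, sub]) =>
      total + (fieldCosts.get(field) ?? DEFAULT_COST) + selectionCost(sub),
    0
  );
}

// Equivalent of: { me { name recommendations { title } } }
const query = { me: { name: true, recommendations: { title: true } } };
console.log(selectionCost(query)); // 53
```

A server would compare this number against a maximum before executing anything, so a shallow but expensive query can still be refused.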

We’ve now dealt with large, dirty queries. But what if our attacker comes with a load of small ones, also known as a brute-force attack?

6. Rate Limiting to block Brute Force attacks

Aaah Brute Force, the old kid on the block...
How it works: attackers use a list of 700M+ emails and 20M+ passwords (real ones that leaked, reported by Troy Hunt 😱) to hit your login mutation.

Fortunately, this also has an easy way out called rate limiting. If you limit the number of possible login submissions, this attack is (almost) obsolete.

Use the graphql-limit-plugin to specify this limit on your queries and mutations.
The best way to set it up is to set a large time window between queries/mutations when they are highly vulnerable (like a sign in) and a shorter one for less vulnerable queries/mutations. That way you only limit attackers and not your users.
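
To illustrate the mechanism, here is a minimal in-memory fixed-window limiter. It is a sketch under simplifying assumptions (a single process, counters kept in a Map, clients identified by a key such as their IP address) and does not reflect any plugin's API; createRateLimiter and loginLimiter are made-up names.

```javascript
// Allows at most `max` calls per `windowMs` for each key (e.g. an IP address).
function createRateLimiter({ windowMs, max }) {
  const hits = new Map(); // key -> { count, windowStart }
  return function isAllowed(key, now = Date.now()) {
    const entry = hits.get(key);
    if (!entry || now - entry.windowStart >= windowMs) {
      // First call, or the previous window has expired: start a new window
      hits.set(key, { count: 1, windowStart: now });
      return true;
    }
    entry.count += 1;
    return entry.count <= max;
  };
}

// Sensitive mutation: 5 login attempts per minute and per client
const loginLimiter = createRateLimiter({ windowMs: 60_000, max: 5 });

const ip = '203.0.113.7';
for (let attempt = 1; attempt <= 6; attempt++) {
  // attempts 1 to 5 print true, attempt 6 prints false
  console.log(attempt, loginLimiter(ip));
}
```

In a resolver, the login mutation would call the limiter first and throw an error as soon as it returns false. With several server instances, the counters would need to live in a shared store instead of process memory.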

7. Verbose Errors

Alright, we’ve gone a long way already. But don’t lower your guard, we’re not done yet!

By default, GraphQL is very talkative (like me 🤗). As developers, we love that! It helps us debug errors easily. But in production, it can be a bit too quick to give up sensitive information, for example:
Uniqueness violation. duplicate key value violates unique constraint user_provider_token

The message might give out information about your database or, worse, your backend services like Elasticsearch. Again, this is really helpful if you’re the developer trying to figure out what’s going on, but pretty dangerous if it falls into the hands of someone more ill-intended.
The strategy here is not specific to GraphQL. You can use a middleware to wrap your whole server and intercept these overly expressive error messages:
write them to your logs instead
replace them with more appropriate messages:

app.use("/api", graphqlServer({validationRules: [depthLimit(10)]}))
// at the very end
app.use((err, req, res, next) => {
    // log the errors for debugging
    console.error(err.stack)
    // return a more user-friendly error
    res.status(500).send('Something broke!')
})
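
The same log-then-replace idea can be expressed as a small function. This is a sketch: the UserFacingError class and maskError are made-up names, and wiring it into a hook such as Apollo Server's formatError option would need adapting to the error objects that hook receives.

```javascript
// Errors of this (hypothetical) class are considered safe to show to clients.
class UserFacingError extends Error {}

// Logs the full error server-side and returns only what the client may see.
function maskError(err, log = console.error) {
  if (err instanceof UserFacingError) {
    return { message: err.message };
  }
  log(err.stack || err); // keep the details for debugging
  return { message: 'Internal server error' };
}

const dbError = new Error(
  'duplicate key value violates unique constraint user_provider_token'
);
console.log(maskError(dbError, () => {}).message);
// Internal server error
console.log(maskError(new UserFacingError('Email already in use')).message);
// Email already in use
```

Opting errors in (rather than filtering known-bad ones out) means a new kind of internal error stays hidden by default.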

8. Injections

Another boomer attack that also threatens GraphQL is the injection category.
A simple SQL injection hidden in the arguments of an otherwise allowed query can be enough for someone to bypass your login and get all your users’ details.

The best way to protect your API from injections is to validate the input of all incoming requests, and to write custom validators for domain-specific and more complex validations.
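
As a minimal sketch of input validation, the resolver-side helper below rejects anything that does not look like an email and passes the value to the database as a parameter instead of concatenating it into the SQL string. The table and column names, the deliberately simple email pattern, and the function names are assumptions; the { text, values } shape mirrors what drivers such as node-postgres accept.

```javascript
// Deliberately simple pattern: something@something.something, no spaces.
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

function validateEmail(input) {
  if (typeof input !== 'string' || input.length > 254 || !EMAIL_PATTERN.test(input)) {
    throw new Error('Invalid email');
  }
  return input.toLowerCase();
}

// The value travels separately from the SQL text, so it is never parsed as SQL.
function buildUserLookup(email) {
  return {
    text: 'SELECT id, name FROM users WHERE email = $1',
    values: [validateEmail(email)],
  };
}

console.log(buildUserLookup('Jane@Example.com').values); // [ 'jane@example.com' ]

try {
  buildUserLookup("' OR 1=1 --");
} catch (e) {
  console.log(e.message); // Invalid email
}
```

Validation and parameterized queries complement each other: the first rejects obviously malformed input early, the second keeps whatever gets through from being interpreted as code.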

9. POST vs GET requests

Phew! Alright, last one!
GraphQL can be sent over GET and POST HTTP requests.
If you think it’s the same thing, that’s bad!!!
Here is a valid mutation over a GET request:
http://badgraphqlapi.com/graphql?query=mutation{...}&variables={...}&operationName=...

Imagine I’m a bad guy: I can build a link that changes the email of an account, put my own email as the new value, and post that link on the web. Even though the mutation requires the user to be authenticated, because this is a GET request an authenticated user might just click without knowing the consequences... and BAM, I have control over their account!

In some GraphQL frameworks, like Apollo, mutations can’t be performed over GET requests. But, as they say, trust but verify: use automated tests to make sure mutations can’t be performed over GET requests (learn how to test a GraphQL API).
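
If your framework does not enforce this for you, the rule itself is small. The sketch below is a conservative, string-level approximation: it refuses any GET request whose query text contains the mutation keyword, which can also reject a harmless query that merely has a field with that name. A real server should decide from the parsed operation; shouldRejectRequest is a made-up name.

```javascript
// Strip GraphQL comments, then look for the `mutation` keyword anywhere.
function containsMutation(query) {
  const withoutComments = query.replace(/#[^\n]*/g, '');
  return /\bmutation\b/.test(withoutComments);
}

// Mutations are only acceptable over POST.
function shouldRejectRequest(method, query) {
  return method.toUpperCase() === 'GET' && containsMutation(query);
}

const changeEmail = 'mutation { updateEmail(email: "evil@example.com") { id } }';
console.log(shouldRejectRequest('GET', changeEmail));     // true
console.log(shouldRejectRequest('POST', changeEmail));    // false
console.log(shouldRejectRequest('GET', '{ me { id } }')); // false
```

The same three calls translate directly into the automated tests mentioned above: send each request to your endpoint and check that only the first one is refused.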

Conclusion

Because of the paradigm shift introduced with GraphQL, it’s easy to forget basic security practices when moving over to it, from familiar patterns (authorization/authentication) to new considerations (disabling introspection, rate limiting, etc.). Here is your checklist:

✅ Disable introspection (and GraphiQL)
✅ Authorization and authentication to limit access to your API
✅ Set a timeout
✅ Create a whitelist of queries
✅ Set a maximum depth limit of queries
✅ Set up a cost analysis to limit complex/expensive queries
✅ Set a maximum rate limit for requests
✅ Stop default verbose errors
✅ Add input validations

And if you want to run a security check of your GraphQL API, check out graphql.security: a free quick scan looking for a dozen vulnerabilities.
