Steve Sewell for Builder.io

Posted on Jan 10, 2023 • Edited on Jan 13, 2023 • Originally published at builder.io

Safer URL reading and writing in modern JavaScript

#javascript #webdev #programming

You might unknowingly be writing URLs in an unsafe way

Can you spot the bug in this code?

const url = `https://builder.io/api/v2/content
  ?model=${model}&locale=${locale}?query.text=${text}`

const res = await fetch(url)

There are at least 3...

We will break them down below:

Common issue #1 - incorrect separator characters

Doh, this is certainly a newbie mistake, but one so easy to miss that I’ve caught it in my own code even after 10 years of JS development.

In my experience, a common culprit is editing or moving code. You have a correctly structured URL, then copy a piece of it from one place to another, and miss that the param separators are now in the wrong order.

This can also happen when concatenating, for instance:

url = url + '?foo=bar'

But wait, the original url may already have a query param in it. OK, so this should be:

url = url + '&foo=bar'

But wait, if the original url didn’t have query params then this is now wrong. Argh.
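To make the failure mode concrete, here is a small sketch that parses a URL containing the misplaced `?` from the opening example, using the URL parser covered later in this post. The param values are made up for illustration.

```javascript
// Hypothetical values, chosen only for illustration
const badUrl =
  'https://builder.io/api/v2/content?model=page&locale=en-US?query.text=hello'
const parsed = new URL(badUrl).searchParams

// The second `?` is not treated as a separator, so it is swallowed
// into the value of the `locale` param
const localeValue = parsed.get('locale')   // 'en-US?query.text=hello'
const queryText = parsed.get('query.text') // null
```

Nothing throws here, which is part of why this bug is easy to miss: the request simply goes out with the wrong params.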

Common issue #2 - forgetting to encode

Gah. model and locale likely don’t need to be encoded, as they are URL-safe values, but I didn’t stop to think that text can be all kinds of text, including whitespace and special characters, which will cause us problems.
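As a rough illustration of what goes wrong, the sketch below compares an unencoded and an encoded value. The sample text is an assumption, picked because it contains an `&`.

```javascript
const text = 'cats & dogs' // hypothetical user input

// Unencoded: the `&` inside the value is read as a param separator
const unsafe = new URL(`https://builder.io/api/v2/content?query.text=${text}`)
const unsafeValue = unsafe.searchParams.get('query.text') // 'cats '

// Encoded: the whole value survives the round trip
const safe = new URL(
  `https://builder.io/api/v2/content?query.text=${encodeURIComponent(text)}`
)
const safeValue = safe.searchParams.get('query.text') // 'cats & dogs'
```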

So maybe we’ll overcorrect and play things extra safe:

const url = `https://builder.io/api/v2/content
  ?model=${
    encodeURIComponent(model)
  }&locale=${
    encodeURIComponent(locale)
  }&query.text=${
    encodeURIComponent(text)
  }`

But things are feeling a little… uglier.

Common issue #3 - accidental whitespace characters

Oof. In order to break this long URL into multiple lines, we accidentally included the newline character and extra spaces in the URL, so fetching it will no longer work as expected.
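A quick way to convince yourself, sketched with a made-up `model` value: a template literal keeps every character between the backticks, including the line break and the indentation.

```javascript
const model = 'page' // hypothetical value

const multiline = `https://builder.io/api/v2/content
  ?model=${model}`

// The line break and the two-space indent are part of the string
const hasNewline = multiline.includes('\n') // true
const hasIndent = multiline.includes('  ?') // true
```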

We can break the string up properly now, but the code is getting even messier and harder to read:

const url = `https://builder.io/api/v2/content`
  + `?model=${
    encodeURIComponent(model)
  }&locale=${
    encodeURIComponent(locale)
  }&query.text=${
    encodeURIComponent(text)
  }`

That was a lot of work just to construct one URL correctly. And are we going to remember all this next time, especially when a deadline is rapidly approaching and we need to ship that new feature or fix ASAP?

There has to be a better way…

The `URL` constructor to the rescue

A cleaner and safer solution to the above challenge is to use the URL constructor:

const url = new URL('https://builder.io/api/v2/content')

url.searchParams.set('model', model)
url.searchParams.set('locale', locale)
url.searchParams.set('query.text', text)

const res = await fetch(url.toString())

This solves several things for us:

Separator characters will always be correct (? for the first param, & thereafter)
All params are automatically encoded
No risk of additional whitespace chars when breaking across multiple lines for long URLs
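The points above can be seen in a small sketch. `buildUrl` is a hypothetical helper, not part of any API, and the param values are invented to include characters that need encoding.

```javascript
// Hypothetical helper: builds a URL string from a base and a plain object of params
function buildUrl(base, params) {
  const url = new URL(base)
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.set(name, String(value))
  }
  return url.toString()
}

const contentUrl = buildUrl('https://builder.io/api/v2/content', {
  model: 'page',
  locale: 'en-US',
  'query.text': 'cats & dogs?',
})
// https://builder.io/api/v2/content?model=page&locale=en-US&query.text=cats+%26+dogs%3F
```

Note that searchParams serializes spaces as `+` rather than `%20`. Both forms decode to a space when the query string is read back through searchParams.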

Modifying URLs

It is also incredibly helpful in situations where we are modifying a URL whose current state we don’t know.

For instance, instead of having this problem:

url += (url.includes('?') ? '&' : '?') + 'foo=bar'

We can instead just do:

// Assuming `url` is a URL
url.searchParams.set('foo', 'bar')

// Or, if `url` is a string
const structuredUrl = new URL(url)
structuredUrl.searchParams.set('foo', 'bar')
url = structuredUrl.toString()
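As a sketch of why this is more robust than string concatenation, the hypothetical `withParam` helper below handles URLs with no query, with an existing query, and with a hash fragment. The last case is one the `includes('?')` approach would get wrong, since the param would land after the `#`.

```javascript
// Hypothetical helper: adds or replaces one query param on a URL string
function withParam(urlString, name, value) {
  const url = new URL(urlString)
  url.searchParams.set(name, value)
  return url.toString()
}

const noQuery = withParam('https://builder.io/blog', 'foo', 'bar')
// https://builder.io/blog?foo=bar

const existingQuery = withParam('https://builder.io/blog?page=2', 'foo', 'bar')
// https://builder.io/blog?page=2&foo=bar

const withHash = withParam('https://builder.io/blog#top', 'foo', 'bar')
// https://builder.io/blog?foo=bar#top
```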

Similarly, you can also write other parts of the URL:

const url = new URL('https://builder.io')

url.pathname = '/blog'      // Update the path
url.hash = '#featured'      // Update the hash
url.host = 'www.builder.io' // Update the host

url.toString()              // https://www.builder.io/blog#featured

Reading URL values

Now, the age-old problem of “I just want to read a query param from the current URL without a library” is solved.

const pageParam = new URL(location.href).searchParams.get('page')

Or, for instance, update the current URL with:

const url = new URL(location.href)
const currentPage = Number(url.searchParams.get('page'))
url.searchParams.set('page', String(currentPage + 1))
location.href = url.toString()
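One thing to keep in mind is that searchParams.get returns null when the param is missing, and Number(null) is 0. The sketch below is one possible way to guard against missing or non-numeric values. `nextPageUrl` is a hypothetical helper, and treating a missing page as page 1 is an assumption about the app, not something the API dictates.

```javascript
// Hypothetical helper: returns the URL of the next page for a given URL string
function nextPageUrl(href) {
  const url = new URL(href)
  const parsed = Number.parseInt(url.searchParams.get('page') ?? '', 10)
  // Assumption: a missing or invalid `page` param means we are on page 1
  const currentPage = Number.isInteger(parsed) && parsed > 0 ? parsed : 1
  url.searchParams.set('page', String(currentPage + 1))
  return url.toString()
}

const fromPageTwo = nextPageUrl('https://builder.io/blog?page=2')
// https://builder.io/blog?page=3

const fromNoPage = nextPageUrl('https://builder.io/blog')
// https://builder.io/blog?page=2
```

In the browser, this could be called as `location.href = nextPageUrl(location.href)`.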

But this is not just limited to the browser. It can also be used in Node.js:

const http = require('node:http');

const server = http.createServer((req, res) => {
  const url = new URL(req.url, `https://${req.headers.host}`)
  // Read path, query, etc...
});

As well as Deno:

import { serve } from "https://deno.land/std/http/mod.ts";
async function reqHandler(req: Request) {
  const url = new URL(req.url)
  // Read path, query, etc...
  return new Response();
}
serve(reqHandler, { port: 8000 });
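To flesh out the "read path, query, etc." step for the Node.js case, here is a minimal sketch. `parseRequestUrl` is a hypothetical helper, and the request object is a plain stand-in with only the two fields the helper reads, so the snippet can run without starting a server.

```javascript
// Hypothetical helper: extracts the path and query params from a Node.js request.
// In Node.js, req.url is only the path and query, so a base is needed to parse it.
function parseRequestUrl(req) {
  const url = new URL(req.url, `http://${req.headers.host}`)
  return {
    pathname: url.pathname,
    // Note: repeated params collapse to the last value here
    query: Object.fromEntries(url.searchParams),
  }
}

// Stand-in for an incoming request, for illustration only
const fakeReq = { url: '/blog?page=2&tag=js', headers: { host: 'localhost:3000' } }
const info = parseRequestUrl(fakeReq)
// { pathname: '/blog', query: { page: '2', tag: 'js' } }
```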

URL properties to know

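URL instances support the properties you are already used to in the browser, such as those on window.location or anchor elements. Most of them can be both read and written; origin and searchParams are read-only, though you can still change the params through the methods on searchParams.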

const url = new URL('https://builder.io/blog?page=1');

url.protocol // https:
url.host     // builder.io
url.pathname // /blog
url.search   // ?page=1
url.href     // https://builder.io/blog?page=1
url.origin   // https://builder.io
url.searchParams.get('page') // '1'

URLSearchParams methods to know

The URLSearchParams object, accessible on a URL instance as url.searchParams, supports a number of handy methods:

`searchParams.has(name)`

Check if the search params contain a given name

url.searchParams.has('page') // true

`searchParams.get(name)`

Get the value of a given param

url.searchParams.get('page') // '1'

`searchParams.getAll(name)`

Get all values provided for a param. This is handy if you allow multiple values for the same name, like &page=1&page=2

url.searchParams.getAll('page') // ['1']

`searchParams.set(name, value)`

Set the value of a param

url.searchParams.set('page', '1')

`searchParams.append(name, value)`

Append a param. Useful if you potentially support the same param multiple times, like &page=1&page=2

url.searchParams.append('page', '2')

`searchParams.delete(name)`

Deletes a param

url.searchParams.delete('page')
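Here is a short walkthrough of how these methods interact when a param appears more than once, written as a sketch with an example URL. The key difference is that append adds another entry, while set replaces every existing entry for that name.

```javascript
const url = new URL('https://builder.io/blog?page=1')

url.searchParams.append('page', '2')
const allPages = url.searchParams.getAll('page') // ['1', '2']
const firstPage = url.searchParams.get('page')   // '1' (get returns the first value)

// set replaces every existing `page` entry with a single one
url.searchParams.set('page', '3')
const afterSet = url.toString()                  // https://builder.io/blog?page=3

url.searchParams.delete('page')
const stillHasPage = url.searchParams.has('page') // false
const afterDelete = url.toString()                // https://builder.io/blog
```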

Pitfalls

The one big pitfall to know is that all URLs passed to the URL constructor must be absolute.

For instance, this will throw an error:

new URL('/blog') // ERROR!

You can resolve that by providing a base URL as the second argument, like so:

new URL('/blog', 'https://builder.io')
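When the input comes from users or other untrusted sources, one option is to wrap the constructor so that invalid input does not throw. This is a minimal sketch; `tryParseUrl` is a hypothetical helper, and returning null on failure is just one possible convention.

```javascript
// Hypothetical helper: returns a URL instance, or null if the input cannot be parsed
function tryParseUrl(input, base) {
  try {
    return new URL(input, base)
  } catch {
    return null
  }
}

const relativeOnly = tryParseUrl('/blog')                     // null
const withBase = tryParseUrl('/blog', 'https://builder.io')   // URL for https://builder.io/blog
const absolute = tryParseUrl('https://builder.io/blog?page=1') // URL instance
```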

If you only need to work with the query params of a relative URL, you can alternatively use URLSearchParams directly:

const params = new URLSearchParams('page=1')
params.set('page', '2')
params.toString()

URLSearchParams has one other nicety, which is that it can take an object of key-value pairs as its input:

const params = new URLSearchParams({
  page: 1,
  text: 'foobar',
})
params.set('page', '2')
params.toString()
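To tie this back to relative URLs, here is a sketch of building a relative link without ever constructing a full URL. The `/blog` path and the param values are examples only.

```javascript
const params = new URLSearchParams({ page: '1', text: 'foo bar' })
params.set('page', '2')

// Interpolating the object calls params.toString(), which handles the encoding
const relativeUrl = `/blog?${params}` // '/blog?page=2&text=foo+bar'
```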

Browser and runtime support

The URL constructor is supported in all modern browsers, as well as in Node.js and Deno! (source)

Conclusion

At Builder.io, we've been able to make the URL logic for our Visual CMS safer and more robust on both our frontend and backend, without any dependencies, by embracing this modern API.

Just watch out for accidentally passing a relative URL to the URL constructor, and consider falling back to URLSearchParams in those cases if that suits your needs. I hope you find this to be a useful tool for your toolkit as well!