Savannah Copland 👋

Posted on Mar 23, 2021 • Originally published at fingerprintjs.com

How the Web Audio API is used for browser fingerprinting

#webdev #javascript #typescript #tutorial

Did you know that you can identify web browsers without using cookies or asking for permissions?

This is known as “browser fingerprinting” and it works by reading browser attributes and combining them together into a single identifier. This identifier is stateless and works well in normal and incognito modes.

When generating a browser identifier, we can read browser attributes directly or use attribute processing techniques first. One of the creative techniques that we’ll discuss today is audio fingerprinting.

Audio fingerprinting is a valuable technique because it is relatively unique and stable. Its uniqueness comes from the internal complexity and sophistication of the Web Audio API. The stability is achieved because the audio source that we’ll use is a sequence of numbers, generated mathematically. Those numbers will later be combined into a single audio fingerprint value.

Before we dive into the technical implementation, we need to understand a few ideas from the Web Audio API and its building blocks.

A brief overview of the Web Audio API

The Web Audio API is a powerful system for handling audio operations. It is designed to work inside an AudioContext by linking together audio nodes and building an audio graph. A single AudioContext can handle multiple types of audio sources that plug into other nodes and form chains of audio processing.

A source can be an audio element, a stream, or an in-memory source generated mathematically with an Oscillator. We’ll be using the Oscillator for our purposes and then connecting it to other nodes for additional processing.

Before we dive into the audio fingerprint implementation details, it’s helpful to review all of the building blocks of the API that we’ll be using.

AudioContext

AudioContext represents an entire chain built from audio nodes linked together. It controls the creation of the nodes and execution of the audio processing. You always start by creating an instance of AudioContext before you do anything else. It’s a good practice to create a single AudioContext instance and reuse it for all future processing.

AudioContext has a destination property that represents the destination of all audio from that context.

There also exists a special type of AudioContext: OfflineAudioContext. The main difference is that it does not render the audio to the device hardware. Instead, it generates the audio as fast as possible and saves it into an AudioBuffer. Thus, the destination of the OfflineAudioContext will be an in-memory data structure, while with a regular AudioContext, the destination will be an audio-rendering device.

When creating an instance of OfflineAudioContext, we pass 3 arguments: the number of channels, the total number of samples and a sample rate in samples per second.

const AudioContext = 
  window.OfflineAudioContext ||
  window.webkitOfflineAudioContext
const context = new AudioContext(1, 5000, 44100)

AudioBuffer

An AudioBuffer represents an audio snippet, stored in memory. It’s designed to hold small snippets. The data is represented internally in Linear PCM with each sample represented by a 32-bit float between -1.0 and 1.0. It can hold multiple channels, but for our purposes we’ll use only one channel.

Oscillator

When working with audio, we always need a source. An oscillator is a good candidate, because it generates samples mathematically, as opposed to playing an audio file. In its simplest form, an oscillator generates a periodic waveform with a specified frequency.

The default shape is a sine wave.

We made a live demo of this! You can play with the real deal on our blog.

It’s also possible to generate other types of waves, such as square, sawtooth, and triangle.

The default frequency is 440 Hz, which is a standard A4 note.

Compressor

The Web Audio API provides a DynamicsCompressorNode, which lowers the volume of the loudest parts of the signal and helps prevent distortion or clipping.

DynamicsCompressorNode has many interesting properties that we’ll use. These properties will help create more variability between browsers.

Threshold - value in decibels above which the compressor will start taking effect.
Knee - value in decibels representing the range above the threshold where the curve smoothly transitions to the compressed portion.
Ratio - amount of input change, in dB, needed for a 1 dB change in the output.
Reduction - float representing the amount of gain reduction currently applied by the compressor to the signal.
Attack - the amount of time, in seconds, required to reduce the gain by 10 dB. This value can be a decimal.
Release - the amount of time, in seconds, required to increase the gain by 10 dB.

We made a live demo of this! You can play with the real deal on our blog.

How the audio fingerprint is calculated

Now that we have all the concepts we need, we can start working on our audio fingerprinting code.

Safari doesn’t support unprefixed OfflineAudioContext, but does support
webkitOfflineAudioContext, so we’ll use this method to make it work in Chrome and Safari:

const AudioContext =
  window.OfflineAudioContext ||
  window.webkitOfflineAudioContex

Now we create an AudioContext instance. We’ll use one channel, a 44,100 sample rate and 5,000 samples total, which will make it about 113 ms long.

const context = new AudioContext(1, 5000, 44100)

Next let’s create a sound source - an oscillator instance. It will generate a triangular-shaped sound wave that will fluctuate 1,000 times per second (1,000 Hz).

const oscillator = context.createOscillator()
oscillator.type = "triangle"
oscillator.frequency.value = 1000

Now let’s create a compressor to add more variety and transform the original signal. Note that the values for all these parameters are arbitrary and are only meant to change the source signal in interesting ways. We could use other values and it would still work.

const compressor = context.createDynamicsCompressor()
compressor.threshold.value = -50
compressor.knee.value = 40
compressor.ratio.value = 12
compressor.reduction.value = 20
compressor.attack.value = 0
compressor.release.value = 0.2

Let’s connect our nodes together: oscillator to compressor, and compressor to the context destination.

oscillator.connect(compressor)
compressor.connect(context.destination);

It is time to generate the audio snippet. We’ll use the oncomplete event to get the result when it’s ready.

oscillator.start()
context.oncomplete = event => {
  // We have only one channel, so we get it by index
  const samples = event.renderedBuffer.getChannelData(0)
};
context.startRendering()

Samples is an array of floating-point values that represents the uncompressed sound. Now we need to calculate a single value from that array.

Let’s do it by simply summing up a slice of the array values:

function calculateHash(samples) {
  let hash = 0
  for (let i = 0; i < samples.length; ++i) {
    hash += Math.abs(samples[i])
  }
  return hash
}

console.log(getHash(samples))

Now we are ready to generate the audio fingerprint. When I run it on Chrome on MacOS I get the value:

101.45647543197447

That’s all there is to it. Our audio fingerprint is this number!

You can check out a production implementation in our open source browser fingerprinting library.

If I try executing the code in Safari, I get a different number:

79.58850509487092

And get another unique result in Firefox:

80.95458510611206

Every browser we have on our testing laptops generate a different value. This value is very stable and remains the same in incognito mode.

This value depends on the underlying hardware and OS, and in your case may be different.

Why the audio fingerprint varies by browser

Let’s take a closer look at why the values are different in different browsers. We’ll examine a single oscillation wave in both Chrome and Firefox.

First, let’s reduce the duration of our audio snippet to 1/2000th of a second, which corresponds to a single wave and examine the values that make up that wave.

We need to change our context duration to 23 samples, which roughly corresponds to a 1/2000th of a second. We’ll also skip the compressor for now and only examine the differences of the unmodified oscillator signal.

const context = new AudioContext(1, 23, 44100)

Here is how a single triangular oscillation looks in both Chrome and Firefox now:

However the underlying values are different between the two browsers (I’m showing only the first 3 values for simplicity):

`Chrome:`	`Firefox:`
`0.08988945186138153`	`0.09155717492103577`
`0.18264609575271606`	`0.18603470921516418`
`0.2712443470954895`	`0.2762767672538757`

Let’s take a look at this demo to visually see those differences.

We made a live demo of this! You can play with the real deal on our blog.

Historically, all major browser engines (Blink, WebKit, and Gecko) based their Web Audio API implementations on code that was originally developed by Google in 2011 and 2012 for the WebKit project.

Examples of Google contributions to the Webkit project include:
creation of OfflineAudioContext,
creation of OscillatorNode, creation of DynamicsCompressorNode.

Since then browser developers have made a lot of small changes. These changes, compounded by the large number of mathematical operations involved, lead to fingerprinting differences. Audio signal processing uses floating point arithmetic, which also contributes to discrepancies in calculations.

You can see how these things are implemented now in the three major browser engines:

Additionally, browsers use different implementations for different CPU architectures and OSes to leverage features like SIMD. For example, Chrome uses a separate fast Fourier transform implementation on macOS (producing a different oscillator signal) and different vector operation implementations on different CPU architectures (which are used in the DynamicsCompressor implementation). These platform-specific changes also contribute to differences in the final audio fingerprint.

Fingerprint results also depend on the Android version (it’s different in Android 9 and 10 on the same devices for example).

According to browser source code, audio processing doesn’t use dedicated audio hardware or OS features—all calculations are done by the CPU.

Pitfalls

When we started to use audio fingerprinting in production, we aimed to achieve good browser compatibility, stability and performance. For high browser compatibility, we also looked at privacy-focused browsers, such as Tor and Brave.

OfflineAudioContext

As you can see on caniuse.com, OfflineAudioContext works almost everywhere. But there are some cases that need special handling.

The first case is iOS 11 or older. It does support OfflineAudioContext, but the rendering only starts if triggered by a user action, for example by a button click. If context.startRendering is not triggered by a user action, the context.state will be suspended and rendering will hang indefinitely unless you add a timeout. There are not many users who still use this iOS version, so we decided to disable audio fingerprinting for them.

The second case are browsers on iOS 12 or newer. They can reject starting audio processing if the page is in the background. Luckily, browsers allow you to resume the processing when the page returns to the foreground.
When the page is activated, we attempt calling context.startRendering() several times until the context.state becomes running. If the processing doesn’t start after several attempts, the code stops. We also use a regular setTimeout on top of our retry strategy in case of an unexpected error or freeze. You can see a code example here.

Tor

In the case of the Tor browser, everything is simple. Web Audio API is disabled there, so audio fingerprinting is not possible.

Brave

With Brave, the situation is more nuanced. Brave is a privacy-focused browser based on Blink. It is known to slightly randomize the audio sample values, which it calls “farbling”.

Farbling is Brave’s term for slightly randomizing the output of semi-identifying browser features, in a way that’s difficult for websites to detect, but doesn’t break benign, user-serving websites. These “farbled” values are deterministically generated using a per-session, per-eTLD+1 seed so that a site will get the exact same value each time it tries to fingerprint within the same session, but that different sites will get different values, and the same site will get different values on the next session. This technique has its roots in prior privacy research, including the PriVaricator (Nikiforakis et al, WWW 2015) and FPRandom (Laperdrix et al, ESSoS 2017) projects.

Brave offers three levels of farbling (users can choose the level they want in settings):

Disabled — no farbling is applied. The fingerprint is the same as in other Blink browsers such as Chrome.
Standard — This is the default value. The audio signal values are multiplied by a fixed number, called the “fudge” factor, that is stable for a given domain within a user session. In practice it means that the audio wave sounds and looks the same, but has tiny variations that make it difficult to use in fingerprinting.
Strict — the sound wave is replaced with a pseudo-random sequence.

The farbling modifies the original Blink AudioBuffer by transforming the original audio values.

Reverting Brave standard farbling

To revert the farbling, we need to obtain the fudge factor first. Then we can get back the original buffer by dividing the farbled values by the fudge factor:

async function getFudgeFactor() {
  const context = new AudioContext(1, 1, 44100)
  const inputBuffer = context.createBuffer(1, 1, 44100)
  inputBuffer.getChannelData(0)[0] = 1

  const inputNode = context.createBufferSource()
  inputNode.buffer = inputBuffer
  inputNode.connect(context.destination)
  inputNode.start()

  // See the renderAudio implementation 
  // at https://git.io/Jmw1j
  const outputBuffer = await renderAudio(context)
  return outputBuffer.getChannelData(0)[0]
}

const [fingerprint, fudgeFactor] = await Promise.all([
  // This function is the fingerprint algorithm described
  // in the “How audio fingerprint is calculated” section
  getFingerprint(),
  getFudgeFactor(),
])
const restoredFingerprint = fingerprint / fudgeFactor

Unfortunately, floating point operations lack the required precision to get the original samples exactly. The table below shows restored audio fingerprint in different cases and shows how close they are to the original values:

OS, browser	Fingerprint	Absolute difference between the target fingerprint
macOS 11, Chrome 89 (the target fingerprint)	124.0434806260746	n/a
macOS 11, Brave 1.21 (same device and OS)	Various fingerprints after browser restarts: 124.04347912294482 124.0434832855703 124.04347889351203 124.04348024313667	0.00000014% – 0.00000214%
Windows 10, Chrome 89	124.04347527516074	0.00000431%
Windows 10, Brave 1.21	Various fingerprints after browser restarts: 124.04347610535537 124.04347187270707 124.04347220244154 124.04347384813703	0.00000364% – 0.00000679%
Android 11, Chrome 89	124.08075528279005	0.03%
Android 9, Chrome 89	124.08074500028306	0.03%
ChromeOS 89	124.04347721464	0.00000275%
macOS 11, Safari 14	35.10893232002854	71.7%
macOS 11, Firefox 86	35.7383295930922	71.2%

As you can see, the restored Brave fingerprints are closer to the original fingerprints than to other browsers’ fingerprints. This means that you can use a fuzzy algorithm to match them. For example, if the difference between a pair of audio fingerprint numbers is more than 0.0000022%, you can assume that these are different devices or browsers.

Performance

Web Audio API rendering

Let’s take a look at what happens under the hood in Chrome during audio fingerprint generation. In the screenshot below, the horizontal axis is time, the rows are execution threads, and the bars are time slices when the browser is busy. You can learn more about the performance panel in this Chrome article. The audio processing starts at 809.6 ms and completes at 814.1 ms:

The main thread, labeled as “Main” on the image, handles user input (mouse movements, clicks, taps, etc) and animation. When the main thread is busy, the page freezes. It’s a good practice to avoid running blocking operations on the main thread for more than several milliseconds.

As you can see on the image above, the browser delegates some work to the OfflineAudioRender thread, freeing the main thread.
Therefore the page stays responsive during most of the audio fingerprint calculation.

Web Audio API is not available in web workers, so we cannot calculate audio fingerprints there.

Performance summary in different browsers

The table below shows the time it takes to get a fingerprint on different browsers and devices. The time is measured immediately after the cold page load.

Device, OS, browser	Time to fingerprint
MacBook Pro 2015 (Core i7), macOS 11, Safari 14	5 ms
MacBook Pro 2015 (Core i7), macOS 11, Chrome 89	7 ms
Acer Chromebook 314, Chrome OS 89	7 ms
Pixel 5, Android 11, Chrome 89	7 ms
iPhone SE1, iOS 13, Safari 13	12 ms
Pixel 1, Android 7.1, Chrome 88	17 ms
Galaxy S4, Android 4.4, Chrome 80	40 ms
MacBook Pro 2015 (Core i7), macOS 11, Firefox 86	50 ms

Audio fingerprinting is only a small part of the larger identification process.

Audio fingerprinting is one of the many signals our open source library uses to generate a browser fingerprint. However, we do not blindly incorporate every signal available in the browser. Instead we analyze the stability and uniqueness of each signal separately to determine their impact on fingerprint accuracy.

For audio fingerprinting, we found that the signal contributes only slightly to uniqueness but is highly stable, resulting in a small net increase to fingerprint accuracy.

You can learn more about stability, uniqueness and accuracy in our beginner’s guide to browser fingerprinting.

Try Browser Fingerprinting for Yourself

Browser fingerprinting is a useful method of visitor identification for a variety of anti-fraud applications. It is particularly useful to identify malicious visitors attempting to circumvent tracking by clearing cookies, browsing in incognito mode or using a VPN.

You can try implementing browser fingerprinting yourself with our open source library. FingerprintJS is the most popular browser fingerprinting library available, with over 12K GitHub stars.

For higher identification accuracy, we also developed the FingerprintJS Pro API, which uses machine learning to combine browser fingerprinting with additional identification techniques. You can try FingerprintJS Pro free for 10 days with no usage limits.

Get in touch

Star, follow or fork our GitHub project
Email us your questions at oss@fingerprintJS.com
Sign up to our newsletter for updates

Top comments (25)

Nathaniel • Mar 23 '21

That's super interesting. But also deeply troubling.

What measures does FingerprintJS put in place to stop it's users from abusing it?

What's to stop someone using FingerprintJS for extortion? Or a oppressive regime using it to identify disidents?

Savannah Copland 👋 • Mar 23 '21

Hi Nathaniel - I answered some of this in my response to Pankaj - dev.to/savannahjs/comment/1ck9e

More specifically to your question though: browser fingerprinting aims to uniquely identify identifies browsers, but it is not able to identify individual people. In that way, this technology behaves very similarly to cookies, though is a little more difficult to spoof.

We do try to ensure our customers use the technology for anti-fraud, and we never do cross-domain tracking.

Nathaniel • Mar 23 '21

Could you clarify that statement. In what sense does it identify the browser but not the individual using the browser?

Savannah Copland 👋 • Mar 23 '21

A browser fingerprinting script generates a hash using signals collected via the browser. This hash serves as a "fingerprint" of that a specific site visitor's browser that remains stable between browsing sessions. If you were generating and storing browser fingerprints for your website, you would be able to tell if a visitor returned and associate multiple browsing sessions with the same browser.

It's tricky to ever know exactly who is visiting on a specific browser. You could associate the fingerprint with account information if the visitor has ever logged in, but that's probably as close as you can get. As we don't do cross-domain tracking, a website would only be able to associate browsing information for users of their site only.

Hopefully that answers your question - forgive me if I'm on the wrong track!

Nathaniel • Mar 23 '21

So the distinction is that you can identify the device but not the person using the device?

Either I'm fundamentally misunderstanding, or that's misleading thing to say.

I'm sure the vast majority of devices are used by a single individual — and with the exception of libraries and internet cafes, are used by a close knit group.

If it has the same capability of indentifying users as cookies then it definitely can 100% identify an individual person.

So is this statement true or false?

it is not able to identify individual people.

Just because it can't identify everybody 100% of the time it doesn't mean it can't identify an individual.

I hope you can appreciate why people find this disturbing.

Savannah Copland 👋 • Mar 23 '21

The distinction I'm trying to make is that even if you assume a device is used by a single individual, you still need to associate that device with additional data sources (like user data) to know that person's name, email, or phone number (to tie back to your dissident example).

I totally understand your concern though. To your argument, while there's clearly a difference between a hashed ID and a user's name or address, GDPR considers cookies and fingerprints 'personal' data, which allows it to extend protections around how this information is stored, when consent is required, and the conditions under which personal data must be deleted. We are 100% on board with this type of governance as it ensures a healthy balance between privacy and security.

Nathaniel • Mar 23 '21

Okay, I think I understand.

In a sense it's the same as cookies, but it's for people who have explicitely taken steps to avoid being tracked online.

If one of fingerprintJS's users breaks the law and invades my privacy with it, who is held responsible?

Is there a list of organizations that use FingerprintJS?
I couldn't find any on the site.

Savannah Copland 👋 • Mar 23 '21

To the cookies comment - yes that's right.

For breaking laws (as it pertains to GDPR and the EU), there are different rules for 'data processors' and 'data controllers'. We have responsibilities as a data processor that include data encryption, ensuring proper authorization access and confidentiality of data, and security incident reports and auditing. The data controller also has its own set of requirements, including asking for consent to track for marketing purposes. The Information Commissioner’s Office (who enforces GDPR) can levy significant fines against either the processor, the controller, or both, depending on who is breaking the rules. So in short, it depends, but we take our end of upholding privacy laws very seriously.

For organizations using us - we have some logos on our homepage but other than that we don't provide a full list!

Nathaniel • Mar 23 '21

I'm sorry to belabour the point.

The privacy and security implications of this go beyond legal questions into ethical ones. Tools like this are always abused — and it's often the most vulnerable people who pay the price.

I'm sure you take all kinds of strict security and legal measures, but in my opinion this is going to hurt people. I hope I'm wrong.

Pankaj Patel • Mar 23 '21

Good explanation on how Audio can be used for Fingerprinting

Though now I am wondering why do we need this extensive Fingerprinting, can you please elaborate?

Savannah Copland 👋 • Mar 23 '21

Hi Pankaj!

Our company (FingerprintJS) focuses on using browser fingerprinting as one tool of many to fight online fraud. Generally, a very small percentage of a website's traffic is responsible for the lion's share of fraudulent activity - cracking account logins, testing stolen credit cards, etc. By identifying fraudulent visitors via first-party tracking, websites can require additional authentication or other security workflows without gumming up the user experience for everyone else.

As far as ensuring that our paid product is used for anti-fraud reasons, we do work to ensure our customers are GDPR compliant, as sites using browser fingerprinting need to follow the same rules as cookies. Our pricing model also makes our solution not particularly viable for advertising use cases, which requires a very high volume of tracking.

For our open-source project, we can't control how our solution is used (and browser fingerprinting is already endemic on the web), but we feel that being transparent about the technology is better for the developer community at large.

Sergey Fetiskin • Mar 23 '21

can you see different browser versions with that approach?

with your charts, I don't see the difference between Firefox Windows and my Firefox MacOs. Is it because of visual representation or it's intended

Savannah Copland 👋 • Mar 23 '21

Yes, it should be able to distinguish between Firefox instances (and Chrome and Safari too) if the underlying OS or hardware is different. It might be just that the visualization doesn't show enough detail to see it!

Lars Gyrup Brink Nielsen • Mar 23 '21

Darn, this is scary! As Pankaj Patel said, please explain what you do at FingerprintJS.

Savannah Copland 👋 • Mar 23 '21

Hey Lars - I wrote a more in-depth response to Pankaj's comment. Hopefully answers your question! dev.to/savannahjs/comment/1ck9e

Lars Gyrup Brink Nielsen • Mar 23 '21

Thanks for your explanation, Savannah.

Jan Küster • Mar 25 '21

So we all know the ethical implications of this but this tool is not more/less ethical than wireshark/nmap.

It is a dual use software and we should treat it like such:

it can be used for tracking und de-anonymization
companies, governments, militaries and intelligence agencies can use it for unethical reasons
however, especially military and intelligence agencies have much better options to track you down on a lower level or via metada
company fingerprinting is still covered by gdpr
there is definitely white use as well
for example: in our public research project we provide a tool for functional illiterates to assess and improve their fundamental literacy skills. Their only identifier is a code for login. If they forget the code they need to generate a new one. We use the fingerprinting to connect the sessions in such case and by thus ensure their anonymity
this contradicts the view we have on fingerprinting as de-anonymization tool
you see the world is not as dualistic as we might perceive it

Archonic • Mar 24 '21

It's hard to see this as anything but "tracking for users that don't want to be tracked". The privacy implications of being able to uniquely identify browsers anonymously is precisely why GDPR was created. Adherence to GDPR is a good thing but this library is an exploit of it. Given others' responses it seems I'm not alone. I think it's a matter of time before privacy conscious browsers thwart this. Being transparent about this technology is better than not being transparent about it, so thanks for that I guess, but I just can't put FingerprintJS in my good books.

Nathaniel • Mar 24 '21

Thanks for the link to Blacklight — that's a really interesting website.

Looking at the Github for FingerprintJS there's an thread about the ethical implications of the project: #430.

The authors defend themselves by saying the library helps defend users from privacy violations by being open source, and therefore bringing to light these issues.

They build weapons so world can better defend itself against people who use their weapons.

@savannahjs Does fingerprintJS also maintain an opensource library for protecting users?