Shubhendra Singh Chauhan

Supercharge Your Web Automation with BrowserQL: Logging in and Scraping Private Data Seamlessly

Scraping data behind a login is often one of the hardest parts of browser automation.

  • Traditional tools like Puppeteer or Playwright can do it — but they require heavy setup, CAPTCHA bypass tricks, and constant maintenance to handle anti-bot defenses.
  • Browserless.io changes the game with BrowserQL — their GraphQL-based automation API, designed to make stealth-first browser control easier than ever.

In this guide, we’ll:

  • Log in to Hacker News
  • Access the user's private profile page
  • Extract hidden data (the user's email address)
  • Learn why BrowserQL makes this process effortless compared to traditional approaches

✅ And we’ll do it with clean Node.js code and just a few simple mutations.

Find the entire code in this GitHub repo


🚀 Why Browserless and BrowserQL?

Before we dive into code, let’s understand why Browserless is a major upgrade for web automation:

| Problem with Traditional Automation | How Browserless/BrowserQL Solves It |
| --- | --- |
| Fingerprinting (easily detected bots) | Browserless automatically applies human-like fingerprints |
| Proxies and session management | Built-in session handling and proxy support |
| Complex async code to wait for pages | GraphQL-style mutations like waitForNavigation and goto handle waits intuitively |
| Heavy setup and browser maintenance | Fully cloud-hosted; no local browsers needed |
| Managing CAPTCHAs and consent modals | BrowserQL options handle consent modals, stealth scripts minimize CAPTCHA triggers |

Browserless essentially lets you focus on your business logic, not fighting web defenses.

No browser binaries.

No network headaches.

No bot detection flags.


Prerequisites

Before following along, you'll need:

  • A Browserless.io account and an API token
  • Node.js installed locally

The BrowserQL Query: Logging In and Scraping

BrowserQL uses a GraphQL-style syntax where every interaction with the browser is expressed as a mutation.

Our flow will involve the following steps:

  • Navigate to Hacker News login page
  • Type the username and password
  • Click the login button
  • Wait for navigation to confirm login
  • Navigate to the private profile page
  • Scrape the email field from the form

Here’s the BrowserQL mutation we’ll use:

mutation LoginAndScrape {
  goto(url: "https://news.ycombinator.com/login", waitUntil: networkIdle) {
    status
  }
  enterUsername: type(text: "yourUsername", selector: "input[name='acct']") {
    time
  }
  enterPassword: type(text: "yourPassword", selector: "input[name='pw']") {
    time
  }
  clickLogin: click(selector: "input[type='submit'][value='login']") {
    time
  }
  afterLogin: waitForNavigation {
    url
    status
  }
  postLoginScrape: goto(url: "https://news.ycombinator.com/user?id=yourUsername", waitUntil: networkIdle) {
    status
  }
  emailAddress: mapSelector(selector: "input[name='email']") {
    email: attribute(name: "value") {
      value
    }
  }
}

To test this visually, we will run this inside the BrowserQL IDE:

  • Log in to Browserless.io
  • Select the BrowserQL Editor from the left navigation
  • Paste the mutation above and update the credentials

💡 Once tested in the BrowserQL IDE, you can use the Export Query as Code feature to turn any of your queries into code in a number of supported languages.



How BrowserQL Mutations Work

| Mutation | What it does |
| --- | --- |
| goto | Navigates to a page |
| type | Types into an input field |
| click | Clicks a button |
| waitForNavigation | Waits until page navigation completes |
| mapSelector + attribute | Extracts data from HTML elements |

Note: While the text mutation extracts visible text, input fields (like the email field on the profile page) require reading the value attribute instead.
That's why we use mapSelector + attribute(name: "value") to scrape the email address cleanly.

You can find the full list of mutations in the Browserless docs.


Setting Up the Project

Let’s get started with a simple Node.js project.

1. Initialize a new project and install dependencies

npm init -y
npm install node-fetch dotenv
  • node-fetch: To make API requests
  • dotenv: To securely manage API tokens
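
One thing to note: the script we'll end up with uses ES module import syntax and top-level await, so the project needs to be treated as an ES module. Assuming npm 7.24+, one quick way to add "type": "module" to package.json is:

npm pkg set type=module

Alternatively, save the script with a .mjs extension.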

2. Create a .env file for your API key

Never hardcode sensitive information like API tokens.

Create a .env file:

BROWSERLESS_TOKEN=your-browserless-api-token

And add .env to your .gitignore file to avoid committing secrets.
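
Optionally, you can fail fast if the token is missing. Here's a small sketch of such a guard (it isn't part of the code exported later, just a defensive extra):

import dotenv from "dotenv";

dotenv.config();

// Load the token from .env and fail early if it isn't set
const token = process.env.BROWSERLESS_TOKEN;
if (!token) {
  throw new Error("BROWSERLESS_TOKEN is not set; add it to your .env file.");
}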


Node.js Code

Now let’s export the code from the BrowserQL IDE:

  • Open the BrowserQL IDE
  • Select the Export Query as Code option on the right side of the editor
  • Select JavaScript (fetch) from the list of available languages
  • Click the copy icon at the top of the code block

Note: The copied code will include the hardcoded API token. You'll need to remove the token from the code and put it in the .env file instead. You can also find your API token on the Browserless dashboard.

  • Add the following lines at the top of the copied code:
import fetch from "node-fetch";
import dotenv from "dotenv";

dotenv.config();
const token = process.env.BROWSERLESS_TOKEN;

Your final code will look like this:

import fetch from "node-fetch";
import dotenv from "dotenv";

dotenv.config();

const endpoint = "https://production-sfo.browserless.io/chrome/bql";
const token = process.env.BROWSERLESS_TOKEN;

const optionsString = "&blockConsentModals=true";

const options = {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    query: `
mutation LoginAndScrape {
  goto(url: "https://news.ycombinator.com/login", waitUntil: networkIdle) {
    status
  }

  enterUsername: type(
    text: "yourUsername", 
    selector: "input[name='acct']"
  ) {
    time
  }

  enterPassword: type(
    text: "yourPassword", 
    selector: "input[name='pw']"
  ) {
    time
  }

  clickLogin: click(
    selector: "input[type='submit'][value='login']"
  ) {
    time
  }

  afterLogin: waitForNavigation {
    url
    status
  }

  postLoginScrape: goto(
    url: "https://news.ycombinator.com/user?id=yourUsername",
    waitUntil: networkIdle
  ) {
    status
  }

  emailAddress: mapSelector(selector: "input[name='email']") {
    email: attribute(name: "value") {
      value
    }
  }
}

    `,
    operationName: "LoginAndScrape",
  }),
};

const url = `${endpoint}?token=${token}${optionsString}`;
const response = await fetch(url, options);
const data = await response.json();

console.log(JSON.stringify(data, null, 2));

✅ Browserless handles the entire browser session, fingerprinting, navigation, and anti-bot measures behind the scenes.
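
Save the script (the file name index.js below is just an assumption) and run it with Node:

node index.js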


Expected Output

Running the script returns clean JSON like this:

{
  "data": {
    "emailAddress": [
      {
        "email": {
          "value": "username@email.com"
        }
      }
    ]
  }
}
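
The script above prints the full payload. If you only want the scraped value, you can pick it out of the response shape shown above; here's a small sketch using optional chaining in case the selector matched nothing:

// "data" is the parsed JSON returned by the fetch call in the script above
const email = data?.data?.emailAddress?.[0]?.email?.value;
if (email) {
  console.log(`Scraped email: ${email}`);
} else {
  console.log("No email found; check your credentials and selectors.");
}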



Why BrowserQL Shines Here

| Browserless Advantage | Impact |
| --- | --- |
| Cloud-hosted stealth browsers | No local installation |
| Built-in human-like behavior | Bypass simple bot detection automatically |
| Consent modals auto-accepted | No annoying pop-ups |
| GraphQL language | No need to learn complex browser APIs |
| Session management handled | Clean, lightweight scripts |

✅ In just a few dozen lines, you've built a real-world login automation and data-extraction workflow!


Final Thoughts

BrowserQL isn't just a new API — it’s a new way of thinking about browser automation:

  • Stealth-first by design
  • Cloud-native from day one
  • Intuitive GraphQL-style queries instead of procedural code

If you want to stop worrying about fingerprints, proxies, CAPTCHAs, and flaky browsers, Browserless and BrowserQL are your secret weapons.


Try BrowserQL Yourself

You can start experimenting with BrowserQL today:

👉 Create your free Browserless account here
