Thomas Hansen

Originally published at hyperlambda.dev

A practical guide to headless browser automation in Hyperlambda

A lot of browser automation tooling feels like it was designed for its own ecosystem first and for your application second.

You end up learning a separate mental model, a separate lifecycle, and usually some kind of invisible context that makes simple things harder to reason about than they should be.

I did not want that.

When I added headless browser support to Hyperlambda, I wanted it to feel like the rest of the language. I wanted a small set of explicit operations that I could combine in predictable ways. Connect to a browser. Navigate somewhere. Wait for the page to become usable. Click, type, inspect, screenshot, close.

That is basically it.

This article is a practical walkthrough of how I use the headless browser slots in Hyperlambda.

The core idea

The model is intentionally simple.

You start by opening a browser session. That returns a session_id. Then you pass that session_id into every other browser-related slot.

I like this because there is no hidden browser object, no ambient scope, and no guessing about where state lives.

The flow is visible.

  1. Connect
  2. Navigate
  3. Wait
  4. Interact
  5. Read state
  6. Save a screenshot if needed
  7. Close

Here is the smallest possible example.

.session_id
set-value:x:@.session_id
   puppeteer.connect

puppeteer.goto:x:@.session_id
   url:"https://ainiro.io"

puppeteer.title:x:@.session_id

puppeteer.close:x:@.session_id

If you understand that pattern, everything else builds on top of it.

How I start a browser session

The first slot I use is puppeteer.connect.

Minimal version:

.session_id
set-value:x:@.session_id
   puppeteer.connect

That launches Chromium and returns a session identifier.

If I need more control, I can add configuration such as headless mode, executable path, launch timeout, extra Chromium flags, or lifetime settings.

For example:

.session_id
set-value:x:@.session_id
   puppeteer.connect
      headless:true
      timeout:30000
      args
         .:--no-sandbox
         .:--disable-dev-shm-usage
      timeout-minutes:30
      max-lifetime-minutes:120

Most of the time, I do not need anything beyond the default call. But it is useful to know I can tune launch behavior when I need to.

How I navigate to a page

Once I have a session_id, I can send the browser somewhere with puppeteer.goto.

puppeteer.goto:x:@.session_id
   url:"https://ainiro.io"

That is enough for simple flows.

If I want more deterministic behavior, I add a timeout and a wait strategy.

puppeteer.goto:x:@.session_id
   url:"https://ainiro.io"
   timeout:30000
   wait-until:networkidle2

The wait-until argument matters more than people think.

Some pages are usable as soon as the DOM exists.
Some keep loading assets after initial render.
Some populate important UI elements after async JavaScript completes.

That is why I usually treat goto as navigation, not as proof that the page is ready for the next action.

How I wait for the page to become usable

In practice, I rarely rely on navigation alone.

If I know I need a specific element before continuing, I wait for that element explicitly.

puppeteer.wait-for-selector:x:@.session_id
   selector:"#name"
   visible:true
   timeout:10000

That makes the automation more stable because I am waiting for the thing I actually care about.

If I expect the page URL itself to change after some action, I can wait for that too.

puppeteer.wait-for-url:x:@.session_id
   url:"https://ainiro.io/contact-us"
   timeout:10000

I think this is one of the biggest differences between browser automation that mostly works and browser automation that keeps breaking in annoying ways.

Do not wait for abstract readiness if your next step depends on something concrete.

Wait for the concrete thing.
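
As a rough sketch of that advice, the snippet below chains a click with both kinds of wait: first the URL change, then the element the next step depends on. It only uses slots shown in this article. The `#contact_link` selector is hypothetical, and the sketch assumes the click actually triggers a navigation to the contact page.

```hyperlambda
.session_id
set-value:x:@.session_id
   puppeteer.connect

puppeteer.goto:x:@.session_id
   url:"https://ainiro.io"

// Hypothetical selector for a link that leads to the contact page.
puppeteer.click:x:@.session_id
   selector:"#contact_link"

// First wait for the navigation itself to complete.
puppeteer.wait-for-url:x:@.session_id
   url:"https://ainiro.io/contact-us"
   timeout:10000

// Then wait for the element the next step actually needs.
puppeteer.wait-for-selector:x:@.session_id
   selector:"#name"
   visible:true
   timeout:10000

puppeteer.close:x:@.session_id
```

Each wait names one concrete condition, so a timeout points at the step that did not happen rather than at a vague "page not ready" state.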

How I click buttons and links

For clicks, I use puppeteer.click.

puppeteer.click:x:@.session_id
   selector:"#submit_contact_form_button"

If I need to adjust how the click happens, I can add options.

puppeteer.click:x:@.session_id
   selector:"#submit_contact_form_button"
   click-count:2

It is intentionally straightforward.

Find the selector. Click the selector. Move on.

That is exactly the kind of browser automation API I prefer.

How I type and fill form fields

There are two useful variants here.

If I want to type text into a field without clearing it first, I use puppeteer.type.

puppeteer.type:x:@.session_id
   selector:"#name"
   text:"Thomas Hansen"

If I want to replace whatever is already there, I use puppeteer.fill.

puppeteer.fill:x:@.session_id
   selector:"#email"
   text:"thomas@gaiasoul.com"

If I want a more human-looking typing pace, I can add delay.

puppeteer.type:x:@.session_id
   selector:"#info"
   text:"Hello from Hyperlambda"
   delay:25

That covers most form automation I need.

And if I need keyboard-level interaction rather than button clicks, I can use puppeteer.press.

puppeteer.press:x:@.session_id
   selector:"#submit_contact_form_button"
   key:Enter

How I work with selects

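Dropdowns use puppeteer.select. The values node takes a list, so a multi-select control can receive more than one value in a single call.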

puppeteer.select:x:@.session_id
   selector:"#plan"
   values
      .:basic
      .:pro

That is especially useful when automating admin panels, onboarding forms, or internal tools where select controls are common.

How I inspect the page

Not every browser session is about clicking through forms.

Sometimes I just want to inspect what the browser sees after the page has fully rendered.

For that, I typically use these slots:

Read the title

puppeteer.title:x:@.session_id

Read the current URL

puppeteer.url:x:@.session_id

Read the rendered HTML

puppeteer.content:x:@.session_id

That last one is particularly useful because it gives me the page as the browser sees it after JavaScript execution, not just the original raw server response.

If I need something even more targeted, I can evaluate JavaScript directly in the page.

puppeteer.evaluate:x:@.session_id
   expression:"document.title"

Or:

puppeteer.evaluate:x:@.session_id
   expression:"typeof window.mcaptcha"

That gives me a quick way to inspect runtime state without building a bigger extraction flow.
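
One way to act on such a result is to branch on it. The sketch below assumes that puppeteer.evaluate leaves the evaluated result as its own node value, so it can be read back with `@puppeteer.evaluate`; the article does not confirm this, so treat it as an assumption. The `if`, `eq`, `get-value`, and `log.info` slots are standard Hyperlambda and Magic slots, not part of the browser slots.

```hyperlambda
.session_id
set-value:x:@.session_id
   puppeteer.connect

puppeteer.goto:x:@.session_id
   url:"https://ainiro.io"
   wait-until:networkidle2

puppeteer.evaluate:x:@.session_id
   expression:"typeof window.mcaptcha"

// Assumption: the slot's node value now holds the string returned by typeof.
if
   eq
      get-value:x:@puppeteer.evaluate
      .:undefined
   .lambda
      log.info:window.mcaptcha is not defined on this page

puppeteer.close:x:@.session_id
```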

How I save screenshots

Screenshots are useful for debugging, documentation, and simple verification.

Here is the basic PNG example.

puppeteer.screenshot:x:@.session_id
   filename:"/etc/tmp/example.png"
   full-page:true

And here is a JPEG version.

puppeteer.screenshot:x:@.session_id
   filename:"/etc/tmp/example.jpg"
   type:jpeg
   quality:85

This is one of those features I end up using more than I expect.

If something fails in an automated flow, a screenshot often gives me the answer much faster than logs alone.
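
A minimal sketch of that debugging pattern, assuming a failing step (for instance a wait that times out) throws an exception that Hyperlambda's standard `try` and `.catch` can intercept. The filename `failure.png` is hypothetical, and `log.error` is Magic's logging slot rather than a browser slot.

```hyperlambda
.session_id
set-value:x:@.session_id
   puppeteer.connect

try

   puppeteer.goto:x:@.session_id
      url:"https://ainiro.io/contact-us"
      timeout:30000
      wait-until:networkidle2

   puppeteer.wait-for-selector:x:@.session_id
      selector:"#name"
      visible:true
      timeout:10000

.catch

   // Capture what the browser showed when the step failed.
   puppeteer.screenshot:x:@.session_id
      filename:"/etc/tmp/failure.png"
      full-page:true

   // Log the exception message alongside the screenshot.
   log.error:x:@.arguments/*/message

puppeteer.close:x:@.session_id
```

Note that `.catch` swallows the exception here, so execution continues to the close call. If the caller should also see the failure, the error would need to be rethrown after the screenshot is saved.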

How I close the session

When I am done, I close the browser.

puppeteer.close:x:@.session_id

I know that sounds obvious, but I still think it matters to treat connect and close as part of the actual program design.

Open the resource.
Use the resource.
Close the resource.

That keeps the flow readable and avoids unnecessary browser sessions hanging around.
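
One way to make the close step robust is to put it in a `.finally` block, so it runs even if an earlier step throws. This is a sketch using Hyperlambda's standard `try` and `.finally` slots, not something the browser slots require.

```hyperlambda
.session_id
set-value:x:@.session_id
   puppeteer.connect

try

   puppeteer.goto:x:@.session_id
      url:"https://ainiro.io"

   puppeteer.title:x:@.session_id

.finally

   // Runs whether or not the steps above threw an exception.
   puppeteer.close:x:@.session_id
```

The session is opened outside the `try` block on purpose: if connecting fails, there is no session to close.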

A full example

Here is a simple end-to-end example.

It opens Chromium, goes to a page, waits for network activity to calm down, reads the title and URL, takes a screenshot, and closes the session.

.session_id
set-value:x:@.session_id
   puppeteer.connect

puppeteer.goto:x:@.session_id
   url:"https://ainiro.io"
   timeout:30000
   wait-until:networkidle2

puppeteer.title:x:@.session_id
puppeteer.url:x:@.session_id

puppeteer.screenshot:x:@.session_id
   filename:"/etc/tmp/ainiro-homepage.png"
   full-page:true

puppeteer.close:x:@.session_id

And here is a form-oriented example.

.session_id
set-value:x:@.session_id
   puppeteer.connect

puppeteer.goto:x:@.session_id
   url:"https://ainiro.io/contact-us"
   timeout:30000
   wait-until:networkidle2

puppeteer.wait-for-selector:x:@.session_id
   selector:"#name"
   visible:true
   timeout:10000

puppeteer.fill:x:@.session_id
   selector:"#name"
   text:"Thomas Hansen"

puppeteer.fill:x:@.session_id
   selector:"#email"
   text:"thomas@gaiasoul.com"

puppeteer.type:x:@.session_id
   selector:"#info"
   text:"Hello from Hyperlambda"

puppeteer.click:x:@.session_id
   selector:"#submit_contact_form_button"

puppeteer.close:x:@.session_id

What I think makes this approach nice

What I like about these slots is not that they are flashy.

It is that they are predictable.

They do not try to invent a second programming model inside Hyperlambda.
They do not hide the browser lifecycle behind clever abstractions.
They do not require me to guess where state comes from.

Everything important is explicit.

You can read the flow from top to bottom and understand exactly what the browser is doing.

That is a huge advantage when the automation grows from a toy example into something that actually matters.

Final thoughts

I think browser automation should feel boring in the best possible way.

You should be able to connect, navigate, wait, interact, inspect, screenshot, and close without learning a separate philosophy just to click a button on a web page.

That is why I like the headless browser slots in Hyperlambda.

They give me enough power to automate real workflows, but they stay small and direct enough that the code remains readable.

And for me, that is usually the difference between a browser automation API I try once and a browser automation API I actually keep using.

If you want to get started with Hyperlambda you can clone Magic Cloud and Hyperlambda here.
