I would like to present Lambdalet.AI, my submission for the AWS Lambda Hackathon.
Lambdalet.AI (Lambda + bookmarklet) is an AI-powered bookmarking and read-it-later service. It uses a dynamic JavaScript bookmark — a so-called bookmarklet — to send the current page's HTML to an AWS Lambda function. The Lambda function invokes a Large Language Model on Bedrock to extract the page's main content — ignoring headers, footers, and other irrelevant elements — and saves it to a Notion database. The Notion database stores all our bookmarks and allows us to find bookmarks by title, URL, and even their content.
What’s a Bookmarklet?
A bookmarklet is a snippet of JavaScript stored in a bookmark; it runs in the context of the page you are viewing by using the javascript: URL scheme. For instance, this bookmarklet makes any website editable:
javascript:(() => {
document.body.contentEditable = 'true';
document.designMode = 'on';
void 0;
})();
Add it as a bookmark, click it on any site (even on this article), and you can freely edit the page.
Bookmarklets were once hugely popular because services such as Instapaper, Evernote, and Pocket could extend browsers without requiring a full extension. The rise of Content Security Policy (CSP) has broken many bookmarklets, even though the spec explicitly states CSP shouldn’t interfere with them. In practice, behavior varies by browser and site, but many pages still allow bookmarklets — or we can often work around these limitations, as we’ll see later.
Why Save Bookmarks to Notion?
Notion is my command center for notes, tasks, and even my blog zirkelc.dev (be sure to give it a star!). Notion databases excel at structured data, and their search is world-class. It always bothered me that browsers save only a title and URL, which often leaves me hunting for an article I know I read but just can’t find again. Capturing the full content solves that.
Try It
You can try Lambdalet.AI right now — no deployment required:
Open the Notion database: Open my shared Notion database in a new tab. New pages will appear here automatically.
Create the bookmarklet: Add a new bookmark and paste the bookmarklet code as its URL.
Pick any website: For example, this article. Requests are deduplicated by URL, so multiple clicks on the same page simply update the entry. Note that if you select text on the page, only that selection is saved to Notion.
Click the bookmarklet: If the site's CSP blocks fetch, the code falls back to a form submission in a temporary window.
Check Notion: A new entry appears. Content extraction is asynchronous, so text may take a moment to appear.
How It Works
Let’s follow the request as it travels from your browser to Notion. For more depth, see the repo’s README and the code.
On any website you want to save, click the bookmarklet in your browser. It sends the current page's HTML, title, and URL in a POST request to a REST API in API Gateway. The bookmarklet first tries a simple fetch call. Because that is often blocked by a restrictive Content Security Policy (CSP), it falls back on a form submission that temporarily opens a new window (to avoid navigating away from the current page).
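The fetch-then-form fallback can be sketched roughly as follows. This is an illustration, not the actual bookmarklet code: the endpoint URL, the `apiKey` parameter, the field names, and the helper names `sendPage` and `submitViaForm` are all assumptions. The transport functions are passed in so the fallback logic can be exercised outside a browser.

```javascript
// Hypothetical endpoint; the real URL and key come from your own deployment.
const ENDPOINT = "https://example.invalid/bookmarks?apiKey=YOUR_KEY";

// Try fetch first; if it rejects (for example because the page's CSP
// blocks the connection), fall back to a form submission.
async function sendPage(page, { fetchImpl, submitForm }) {
  try {
    await fetchImpl(ENDPOINT, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(page),
    });
    return "fetch";
  } catch {
    submitForm(ENDPOINT, page);
    return "form";
  }
}

// Browser-only fallback: post the fields through a temporary form that
// targets a new window, so the current page is not navigated away.
function submitViaForm(endpoint, page) {
  const form = document.createElement("form");
  form.method = "POST";
  form.action = endpoint;
  form.target = "_blank";
  for (const [name, value] of Object.entries(page)) {
    const input = document.createElement("input");
    input.type = "hidden";
    input.name = name;
    input.value = value;
    form.appendChild(input);
  }
  document.body.appendChild(form);
  form.submit();
  form.remove();
}

// In a browser, the call could look like this:
// sendPage(
//   { html: document.documentElement.outerHTML, title: document.title, url: location.href },
//   { fetchImpl: window.fetch.bind(window), submitForm: submitViaForm }
// );
```

Note that the two paths deliver different encodings (JSON versus form fields), so a backend following this sketch would have to accept both.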
Why POST instead of GET?
A POST request lets the client send the fully rendered HTML — no server-side scraping required. That avoids running a headless browser in Lambda and sidesteps localization issues that would arise if the Lambda IP came from a different region than the user.
The REST API is protected by an API key with usage quotas. Because form submissions can't set custom headers such as x-api-key, the key is passed as a query-string parameter and verified by a custom Lambda authorizer.
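A request-based Lambda authorizer that reads the key from the query string might look like the sketch below. The parameter name `apiKey`, the principal ID, and the `createAuthorizer` factory are assumptions made for illustration; the repo may do this differently. The `usageIdentifierKey` field is what lets API Gateway apply a usage plan when the API key source is set to the authorizer.

```javascript
// Builds an authorizer handler around a set of accepted keys.
// How the keys are stored and loaded is left open here.
function createAuthorizer(validKeys) {
  return async function handler(event) {
    // Hypothetical parameter name: ...?apiKey=<key>
    const apiKey = event.queryStringParameters?.apiKey;
    const allowed = typeof apiKey === "string" && validKeys.has(apiKey);

    return {
      principalId: "bookmarklet-user", // illustrative value
      policyDocument: {
        Version: "2012-10-17",
        Statement: [
          {
            Action: "execute-api:Invoke",
            Effect: allowed ? "Allow" : "Deny",
            Resource: event.methodArn,
          },
        ],
      },
      // Only hand a key to API Gateway when the request is allowed.
      ...(allowed ? { usageIdentifierKey: apiKey } : {}),
    };
  };
}
```

Keep in mind that keys in a query string tend to end up in access logs and browser history, which is the trade-off for supporting the form fallback.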
Once past the API Gateway, the request reaches the first Lambda function, whose only job is deduplication. Rather than processing every click immediately, the function writes a message to an SQS FIFO queue keyed by URL. Modern pages easily exceed the 256 KB SQS payload limit, so the raw HTML is uploaded to S3 and the queue message contains only the S3 URI. Multiple clicks on the same URL therefore collapse into a single queued job.
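To make the deduplication step concrete, here is a minimal sketch of how the queue message could be built. SQS FIFO queues drop messages that reuse a MessageDeduplicationId within the five-minute deduplication interval, so deriving that ID from the URL collapses repeated clicks. Hashing the URL is my assumption (it keeps the ID within the 128-character limit); the function name and message fields are hypothetical.

```javascript
const { createHash } = require("node:crypto");

// Builds the parameters for an SQS SendMessage call (for example via
// SendMessageCommand from @aws-sdk/client-sqs). The HTML itself is not
// included; only the S3 URI of the uploaded document travels in the message.
function buildQueueMessage({ url, title, s3Uri, queueUrl }) {
  // A SHA-256 hex digest is 64 characters, which fits the ID limits.
  const urlHash = createHash("sha256").update(url).digest("hex");
  return {
    QueueUrl: queueUrl,
    MessageBody: JSON.stringify({ url, title, s3Uri }),
    MessageGroupId: urlHash,
    MessageDeduplicationId: urlHash,
  };
}
```

Using the same hash for the message group keeps jobs for one URL in order while letting different URLs be processed in parallel.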
A second Lambda polls the queue. Each message is handled in its own invocation to avoid timeouts while it waits for Bedrock. The function downloads the HTML from S3, then checks whether a Notion page for that URL already exists. If it does, the old page is archived and a new empty page is created. Archiving is cheaper than deleting the page's content block-by-block; the Notion API’s rate limits make large-scale deletes expensive.
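The archive-and-recreate step could look roughly like this. It is a sketch under several assumptions: the `notion` argument is a client shaped like the official JavaScript SDK (`databases.query`, `pages.update`, `pages.create`), the database has properties named `Name` and `URL`, and the helper name `replaceExistingPage` is made up. Newer Notion API versions query data sources instead of databases, so the query call may need adjusting.

```javascript
// Archives any existing page for the URL, then creates a fresh page that
// holds only the title and URL. Content is appended later.
async function replaceExistingPage(notion, databaseId, { url, title }) {
  const existing = await notion.databases.query({
    database_id: databaseId,
    filter: { property: "URL", url: { equals: url } }, // hypothetical property name
  });

  // One update call per page, instead of deleting its blocks one by one.
  for (const page of existing.results) {
    await notion.pages.update({ page_id: page.id, archived: true });
  }

  return notion.pages.create({
    parent: { database_id: databaseId },
    properties: {
      Name: { title: [{ text: { content: title } }] }, // hypothetical property name
      URL: { url },
    },
  });
}
```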
The new Notion page initially contains only the title and URL. The raw HTML is first converted to Markdown to reduce token count, then passed to Amazon Bedrock (Claude 3.7 Sonnet) to extract only the main content — no headers, nav bars, cookie banners, or ads.
Prompt
const prompt = `
Here is the content from the URL converted from HTML to markdown:
<url>${url}</url>
<markdown>
${markdown}
</markdown>
Your task is to extract the main content from the given markdown.
Wrap your response in <content> tags.
<content>
[Your markdown content here]
</content>`;
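Because the prompt asks the model to wrap its answer in <content> tags, the response has to be unwrapped before it can be used. A minimal sketch of that step, with a hypothetical `extractContent` helper that also flags answers whose closing tag is missing (which can happen if the model runs into its output limit):

```javascript
// Returns the Markdown between <content> and </content>, or null when
// the opening tag is absent. `truncated` is true if no closing tag follows.
function extractContent(responseText) {
  const open = "<content>";
  const close = "</content>";

  const start = responseText.indexOf(open);
  if (start === -1) return null;

  const bodyStart = start + open.length;
  const end = responseText.lastIndexOf(close);
  const truncated = end < bodyStart;

  const markdown = responseText
    .slice(bodyStart, truncated ? undefined : end)
    .trim();
  return { markdown, truncated };
}
```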
The model's Markdown output is converted into Notion block objects and appended to the page. Finally, the status of the Notion page is updated to indicate the completion of processing.
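As a rough illustration of the conversion, the sketch below turns Markdown into Notion paragraph blocks and batches them for appending. It is deliberately simplified: every blank-line-separated chunk becomes a plain paragraph, whereas a real converter would also map headings, lists, and code. The helper names are hypothetical; the limits (2,000 characters per text object, 100 child blocks per append request) are the Notion API's documented request limits.

```javascript
const MAX_TEXT_LENGTH = 2000; // characters per rich text object
const MAX_CHILDREN = 100;     // blocks per append request

// Splits Markdown on blank lines and emits one paragraph block per piece,
// breaking up paragraphs that exceed the text length limit.
function toParagraphBlocks(markdown) {
  return markdown
    .split(/\n{2,}/)
    .map((paragraph) => paragraph.trim())
    .filter(Boolean)
    .flatMap((paragraph) => {
      const pieces = [];
      for (let i = 0; i < paragraph.length; i += MAX_TEXT_LENGTH) {
        pieces.push(paragraph.slice(i, i + MAX_TEXT_LENGTH));
      }
      return pieces.map((content) => ({
        object: "block",
        type: "paragraph",
        paragraph: { rich_text: [{ type: "text", text: { content } }] },
      }));
    });
}

// Groups blocks into batches that fit into a single append request.
function chunk(items, size = MAX_CHILDREN) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Each batch would then go into its own append call, which is also where the rate limits mentioned earlier come into play.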
All Claude models support a 200K-token context window (~150K words), but the practical ceiling is the output limit. Only the latest generations, Claude 3.7 Sonnet and Claude Sonnet 4, have raised that limit from 8K to 64K tokens (~48K words), making such use cases possible.
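The figures above imply roughly 0.75 words per token. Using that ratio, a quick back-of-the-envelope check can tell whether an article is likely to fit into the output limit before calling the model. This is only a heuristic sketch; the real token count depends on the tokenizer and the text, and the function names are made up.

```javascript
const WORDS_PER_TOKEN = 0.75;     // derived from 200K tokens ≈ 150K words
const MAX_OUTPUT_TOKENS = 64000;  // output limit quoted above

// Estimates the token count from a whitespace-based word count.
function estimateTokens(text) {
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  return Math.ceil(words / WORDS_PER_TOKEN);
}

// True if the extracted content is likely to fit into a single response.
function fitsOutputLimit(markdown, limit = MAX_OUTPUT_TOKENS) {
  return estimateTokens(markdown) <= limit;
}
```

With these numbers, an article of 48,000 words sits right at the limit, and anything longer would need to be split or truncated.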
And that’s it: a click of the bookmarklet becomes a clean, searchable page in Notion.
What’s Next?
I built earlier prototypes in the pre-LLM era, but only recent models, with much larger contexts and outputs, make this practical. Here are a few things I'd like to implement in future iterations.
Richer Metadata
Great bookmarks must be easy to rediscover. Beyond full text, I want tags, cover images, and auto-generated summaries.
Better CSP Handling
The current fallback uses form submission, but some sites (I'm looking at you, GitHub) block even that. I'm exploring additional fallback mechanisms like window.open to open a new tab, though that forces a GET request, so we'd need to fetch HTML server-side, likely with a headless browser.
Refactor with Step Functions
Today the second Lambda idles while waiting for the Bedrock response. Converting the flow to Step Functions would let us use the native Bedrock integration, reducing cost and avoiding timeouts.
Deploy Your Own
Setting up your own Lambdalet.AI instance is easy — full setup instructions are in the repo. Be sure to give it a star and happy bookmarking! 🔖 📑