Azamat Safarov

Automating NotebookLM with Hermes Agent: From Research to Multi-Platform Content

NotebookLM made the writing smarter — analyzing sources, generating podcasts, building visuals. But getting those artifacts out of Google and into my publishing pipeline still took 30 minutes of manual clicking and downloading. I fixed that by wiring NotebookLM directly into Hermes Agent. Now research turns into publish-ready content in about two minutes.


The Problem I Was Solving

NotebookLM is one of the best research tools Google has shipped. Drop in a PDF, a few URLs, or a research query and it builds a notebook — a living summary you can chat with, quiz, or turn into artifacts.

But here is where it breaks for daily publishing. After NotebookLM generates a podcast, a mind map, or a slide deck, you are back to manual mode. Visit the site. Click each artifact. Download. Rename. Move to the right folder. Format for the platform you are posting to. I was spending nearly half an hour on this for every article. The AI did the creative work; I did the file management.

Worse, the standard approach for automating this — browser automation via Playwright — hits a wall. Google detects headless Chromium and Firefox instantly. The login page throws "browser not secure" and you are done. Every automation script I tried failed at authentication.

I needed a way to run NotebookLM from code, download artifacts programmatically, and route them into my publishing pipeline. Hermes Agent already handled the platform distribution. I just needed the bridge.

The Cookie Export Hack That Works

The only authentication method that survived Google's bot detection was exporting cookies from my real Chrome browser. No headless browsers, no Puppeteer tricks. Just real cookies from a real session.

Here is what works reliably:

  1. Log into notebooklm.google.com in your normal Chrome or Edge.
  2. Export the cookies for google.com and notebooklm.google.com in Netscape format using any cookie-export extension.
  3. Convert that text file to Playwright's storage_state.json format.
  4. Point the notebooklm-py wrapper at that state file; it reads the cookies and obtains a valid token.

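# convert_netscape_cookies.py: one-time setup, reusable
import json
import sys

HTTPONLY_PREFIX = "#HttpOnly_"  # Netscape-format marker for HttpOnly cookies

cookies = []
with open(sys.argv[1], encoding="utf-8") as f:
    for line in f:
        line = line.rstrip("\r\n")
        # HttpOnly cookies look like comments, so detect them before skipping "#" lines
        http_only = line.startswith(HTTPONLY_PREFIX)
        if http_only:
            line = line[len(HTTPONLY_PREFIX):]
        if not line.strip() or line.startswith("#"):
            continue
        parts = line.split("\t")
        if len(parts) < 7:
            continue
        # Columns: domain, include-subdomains flag, path, secure, expiry, name, value
        domain, _subdomains, path, secure, expiry, name, value = parts[:7]
        cookies.append({
            "name": name, "value": value,
            "domain": domain,  # keep the leading dot so the cookie still covers subdomains
            "path": path,
            # Playwright marks session cookies with -1; Netscape files use 0 or blank
            "expires": int(expiry) if expiry.isdigit() and int(expiry) > 0 else -1,
            "httpOnly": http_only,
            "secure": secure.upper() == "TRUE",
            "sameSite": "Lax",
        })

storage = {"cookies": cookies, "origins": []}
with open("storage_state.json", "w", encoding="utf-8") as f:
    json.dump(storage, f, indent=2)
print(f"Converted {len(cookies)} cookies, saved to storage_state.json")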

Usage: python convert_netscape_cookies.py exported_cookies.txt

I run this once per session. My export contains 23 cookies, which is enough for full API access. Until Google changes its auth flow, this is the only path I have found reliable.
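Because exported cookies expire, it can help to check the state file before kicking off a run. The following is a minimal sketch, assuming the storage_state.json layout written by the converter (a top-level "cookies" list whose entries carry "name" and "expires"). Both `expired_cookies` and `check_state_file` are hypothetical helpers for illustration, not part of notebooklm-py.

```python
import json
import time


def expired_cookies(state, now=None):
    """Return names of cookies in a Playwright-style storage state that have expired."""
    now = time.time() if now is None else now
    expired = []
    for cookie in state.get("cookies", []):
        expires = cookie.get("expires")
        # None, -1 or 0 mark a session cookie with no fixed expiry
        if expires is None or expires <= 0:
            continue
        if expires < now:
            expired.append(cookie["name"])
    return expired


def check_state_file(path="storage_state.json"):
    """Load the state file and return (cookie count, list of expired cookie names)."""
    with open(path, encoding="utf-8") as f:
        state = json.load(f)
    return len(state.get("cookies", [])), expired_cookies(state)
```

If the second value is not empty, re-exporting the cookies from the browser is probably quicker than debugging a failed API call.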

The NotebookLM CLI That Does Everything

Once authenticated, notebooklm-py exposes a clean CLI and Python API. I wrapped this into Hermes skills so the agent can create notebooks, add sources, generate artifacts, and poll for completion without me touching the browser.

Creating a Notebook and Adding Sources

# Create a notebook
python -m notebooklm create "Philosophy Notes"
# → returns notebook ID

# Add sources — URLs, files, research queries
python -m notebooklm source add <notebook-id> https://example.com/article.pdf
python -m notebooklm source add-research <notebook-id> "Stoicism in modern productivity"

Sources can be URLs, local files, Google Drive URLs, or research queries. NotebookLM processes them and builds an indexed knowledge base you can chat with.

Generating Artifacts

Each artifact has its own command and output format.

| Artifact | Command | Output | Best For |
| --- | --- | --- | --- |
| Audio Overview | generate audio | MP3 | Podcasts, Telegram voice messages |
| Infographic | generate infographic "cover..." | PNG | Article covers, diagrams, flowcharts |
| Mind Map | generate mind-map --instructions "..." | JSON | Architecture overviews, concept maps |
| Slide Deck | generate slide-deck "5 slides..." | PDF | Presentations, LinkedIn carousels |
| Quiz | generate quiz | Text | Telegram polls, engagement |

Note the positional argument for prompts. --prompt does not exist for most artifacts. Infographic, slide-deck, and report accept the description as a positional string. Mind-map only accepts --instructions. Audio accepts a positional topic or nothing — it builds the podcast from your sources directly.
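The prompt rules above are easy to get wrong by hand, so one option is to encode them once. This is a sketch under the rules stated in this section only: `build_generate_args` is a hypothetical helper, and it deliberately refuses artifact types (such as quiz) whose prompt syntax is not described here rather than guessing.

```python
import sys

# Types that take the description as a positional string, per the notes above
POSITIONAL = {"infographic", "slide-deck", "report", "audio"}


def build_generate_args(artifact, prompt=None):
    """Build the argument list for `python -m notebooklm generate <artifact> ...`."""
    args = [sys.executable, "-m", "notebooklm", "generate", artifact]
    if not prompt:
        return args  # e.g. audio with no topic builds from the sources directly
    if artifact == "mind-map":
        args += ["--instructions", prompt]
    elif artifact in POSITIONAL:
        args.append(prompt)
    else:
        raise ValueError(f"prompt syntax for {artifact!r} is not covered here")
    return args
```

The resulting list can be passed straight to `subprocess.run`, which avoids shell-quoting problems with long descriptions.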

One gotcha: --wait can time out on slow generations (over 60 seconds). I always launch without waiting, get the task ID, then poll separately:

# Launch generation without waiting
python -m notebooklm generate infographic "Dark-themed cover for article about AI research tools, ocean palette" --orientation landscape --style professional --json
# → {"task_id": "...", "status": "pending"}

# Poll in a loop until done
python -m notebooklm artifact poll <task_id>

When the status flips to completed, a Google Cloud Storage URL is returned. I download that asset and route it to the right platform folder.
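The launch-then-poll approach can be wrapped in a small loop with its own timeout. This is a generic sketch, not the notebooklm-py API: `poll_until_done` is a hypothetical helper, and `fetch_status` stands for any callable that returns a dict with a "status" key, for example the parsed JSON from the poll command. The "failed" and "error" states are assumed names for the failure case.

```python
import time


def poll_until_done(fetch_status, interval=10.0, timeout=900.0, sleep=time.sleep):
    """Call fetch_status() until it reports completion, failure, or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch_status()
        status = result.get("status")
        if status == "completed":
            return result
        if status in ("failed", "error"):
            raise RuntimeError(f"Generation failed: {result}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {status!r} after {timeout} seconds")
        sleep(interval)
```

Passing `sleep` as a parameter keeps the loop testable without real delays. The 15-minute default is a guess meant to leave headroom over slow generations.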

Wiring It Into the Publishing Pipeline

The full flow in Hermes Agent looks like this:

Source material (markdown, PDFs, URLs)
        ↓
NotebookLM notebook created
        ↓
Sources added → NotebookLM indexes and summarizes
        ↓
Artifacts generated (infographic, mind-map, audio, slides)
        ↓
Hermes downloads, compresses, renames assets
        ↓
Platform-specific routing:
   Dev.to      → full article with embedded infographic
   Telegram    → voice message (audio compressed to 48kbps mono)
   LinkedIn    → carousel from slide deck PDF
   Medium      → essay with timeline / quote cards
   Bluesky     → dense teaser + link to full article
   Mastodon    → threaded summary with cover image

The whole sequence runs as a single Hermes skill. I trigger it with one line: generate notebooklm article for <topic>. The agent creates the notebook, waits for artifacts, downloads them, compresses audio for Telegram's 20 MB limit, and writes platform drafts in parallel.
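The routing step is essentially a lookup from platform to the artifact it needs. Here is a minimal sketch of that idea, assuming the mapping in the flow above; `PLATFORM_ASSETS` and `plan_routes` are hypothetical names, not Hermes Agent APIs. Medium is left out because the flow does not say which artifact feeds its timeline and quote cards.

```python
# Which artifact each platform needs, following the flow above.
# None means the post is text-only.
PLATFORM_ASSETS = {
    "devto": "infographic",
    "telegram": "audio",
    "linkedin": "slide-deck",
    "bluesky": None,
    "mastodon": "infographic",
}


def plan_routes(available):
    """Split platforms into ready (asset path or None) and blocked (asset missing).

    `available` maps artifact type to the local path of the downloaded file.
    """
    ready, blocked = {}, []
    for platform, needed in PLATFORM_ASSETS.items():
        if needed is None:
            ready[platform] = None
        elif needed in available:
            ready[platform] = available[needed]
        else:
            blocked.append(platform)
    return ready, blocked
```

Keeping the mapping in one table means a failed artifact blocks only the platforms that depend on it, and the rest can still be drafted.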

No more clicking through Google UI. No more drag-and-drop. The AI generates; the pipeline publishes.

A Working Integration Script

Here is the core pattern I use inside the Hermes skill:

from notebooklm import NotebookLMClient
import os, subprocess, time
import requests

OUT_DIR = "assets/generated"
os.makedirs(OUT_DIR, exist_ok=True)
client = NotebookLMClient()

def wait_for(task_id, interval=10, timeout=900):
    """Poll until the artifact completes; raise on failure or after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = client.poll_artifact(task_id)
        if status.status == "completed":
            return status
        if status.status in ("failed", "error"):
            raise RuntimeError(f"Artifact failed: {status.error}")
        time.sleep(interval)
    raise TimeoutError(f"Artifact {task_id} not ready after {timeout} seconds")

def download(url, path):
    """Save the generated asset to disk, failing loudly on HTTP errors."""
    r = requests.get(url, timeout=60)
    r.raise_for_status()
    with open(path, "wb") as f:
        f.write(r.content)

# 1. Create notebook
nb = client.create_notebook("Stoic Productivity Analysis")

# 2. Add sources: a mix of URLs and research queries
client.add_source(nb.id, source_url="https://example.com/seneca-essay.pdf")
client.add_source(nb.id, source_research="modern applications of Stoic principles")

# 3. Generate the cover image (infographic), wait for it, download it
cover = client.generate_artifact(
    nb.id,
    artifact_type="infographic",
    instructions="Dark-themed cover image for a technical article about AI-powered research tools. Ocean color palette."
)
download(wait_for(cover.task_id).url, f"{OUT_DIR}/cover.png")

# 4. Generate the audio podcast, wait for it, download it
audio = client.generate_artifact(nb.id, artifact_type="audio")
download(wait_for(audio.task_id).url, f"{OUT_DIR}/podcast.mp3")

# 5. Compress for Telegram (20 MB limit): 48 kbps, mono
subprocess.run(
    ["ffmpeg", "-y", "-i", f"{OUT_DIR}/podcast.mp3", "-b:a", "48k", "-ac", "1",
     f"{OUT_DIR}/podcast_tg.mp3"],
    check=True,
)

The NotebookLMClient reads the same storage_state.json I built from Chrome cookies. No login step, no browser puppetry. Just API calls against a valid Google session.

Where This Fits in the Larger Stack

NotebookLM is not where content starts — it is where research gets processed. My pipeline already had source material (markdown drafts, PDFs, web pages). NotebookLM adds three things other tools do not:

  1. Audio Overview — two-voice podcast generated from your sources. I drop this straight into Telegram as a voice message. Compression takes a 30 MB raw file down to 7-10 MB. Voices stay clear at 48 kbps mono.

  2. Infographics — article covers and process diagrams generated from a text description. Before this I used Python Pillow locally. It worked, but every cover looked like the same template with slightly different text. NotebookLM generates unique styles per article. I just describe what I want.

  3. Mind Maps — architecture and concept maps that I embed in Dev.to and Medium articles. The output includes node coordinates in JSON, so I can render it or let NotebookLM produce the visual directly.
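The 48 kbps figure in the first item can be sanity-checked with simple arithmetic: file size is roughly bitrate times duration. The sketch below assumes constant-bitrate audio and ignores container overhead, so treat the numbers as estimates; both function names are made up for illustration.

```python
def estimated_size_mb(bitrate_kbps, minutes):
    """Approximate size of a constant-bitrate audio file in megabytes (10^6 bytes)."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return bytes_per_second * minutes * 60 / 1_000_000


def max_minutes(bitrate_kbps, limit_mb):
    """Longest duration that stays under limit_mb at the given bitrate."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return limit_mb * 1_000_000 / bytes_per_second / 60
```

At 48 kbps a 20-minute episode comes to about 7.2 MB, which lines up with the 7-10 MB range above, and a 20 MB cap leaves room for roughly 55 minutes of audio.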

I still write the original text. I still decide the angle and the argument. But the repetitive work — cover design, audio production, diagram drawing — moved from my desk to the agent's queue.

What Broke Along the Way

Playwright login is dead. Every Chromium and Firefox instance I tried was detected by Google. "Browser not secure" at the login screen, every time. The cookie export method is the only approach that still works.

Wrong prompt syntax. I spent an hour debugging generate infographic --prompt "..." before realizing the flag does not exist. Positional arguments only. For mind-map, use --instructions.

Audio timing. NotebookLM audio generation takes 5-10 minutes. If I use --wait, the CLI sometimes times out after 60 seconds. I now always launch with --json to get the task ID, then poll in a loop until completed.

Image size limits. Bluesky caps images at 2 MB. Raw NotebookLM PNGs are often 2.3-2.8 MB. I run them through ffmpeg for JPEG compression at 800px width before uploading. Yes, ffmpeg — it handles image scaling too.
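The image step can be scripted in the same way as the audio compression. This is a sketch, not the exact command from the pipeline: it assumes ffmpeg is on the PATH, uses the standard `scale` filter and `-q:v` option, and takes the 2 MB cap mentioned above as the default limit. `jpeg_command` and `compress_under_limit` are hypothetical helper names.

```python
import os
import subprocess


def jpeg_command(src, dst, width=800, quality=4):
    """Build an ffmpeg command that scales an image to `width` px and writes a JPEG."""
    return [
        "ffmpeg", "-y",                # overwrite the output if it exists
        "-i", src,
        "-vf", f"scale={width}:-2",    # -2 keeps the aspect ratio with an even height
        "-q:v", str(quality),          # JPEG quality scale: 2 is best, 31 is smallest
        dst,
    ]


def compress_under_limit(src, dst, limit_bytes=2_000_000, run=subprocess.run):
    """Re-encode at progressively lower quality until the file fits the limit."""
    for quality in (4, 8, 12, 16):
        run(jpeg_command(src, dst, quality=quality), check=True)
        if os.path.getsize(dst) <= limit_bytes:
            return quality
    raise ValueError(f"{dst} is still over {limit_bytes} bytes")
```

At 800px wide the first pass is usually enough; the retry loop is there for covers with a lot of fine detail.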

Results and What Changed

Content production used to mean research plus design plus publishing. Now it is research plus about two minutes of pipeline time. The bottleneck is no longer tooling; it is deciding what to write. NotebookLM handles visuals and audio. Hermes handles routing and platform formatting. I handle ideas.

The full pipeline code is open-source:

https://github.com/AzamatSafarov/hermes-notebooklm-bridge

It includes the cookie converter, the one-command pipeline, a Chrome bookmarklet for one-click cookie export, and a working example of Hermes Agent integration. If you are already using Hermes, you can slot it in with two CLI commands and a cookie file.

This whole pipeline runs on the LLM Wiki pattern by Andrej Karpathy — a persistent wiki that the LLM builds and maintains as you go, instead of RAG that re-discovers everything from scratch on every query.

https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f#file-llm-wiki-md


Dev.to
https://dev.to/azamat_safarov_119e17602f/

Medium
https://medium.com/@akutagorasava777

Paragraph
https://paragraph.com/@azamatsafarov

X
https://x.com/Azamat__Safarov
