DEV Community: southy404

Mindscape: A Live Cognitive Graph for Hermes Agent

southy404 — Wed, 27 May 2026 13:10:22 +0000

This is a submission for the Hermes Agent Challenge: Build With Hermes Agent

What I Built

I built Mindscape, a Hermes Agent plugin that turns agent activity into a live, visual cognitive graph.

The idea started from a simple fascination: Hermes Agent can plan, use tools, reason across multiple steps, and build context over time — but most of that process is normally hidden inside logs, chat messages, or terminal output.

I wanted to see the agent’s thinking structure visually.

So I built a Mindscape view for Hermes: an Obsidian-like graph where sessions, tool calls, reasoning snapshots, decisions, memory nodes, errors, and project architecture can become connected nodes.

Instead of treating an agent run as a flat chat history, Mindscape treats it as a growing map of cognition.

The plugin adds a new dashboard section inside Hermes where you can explore:

a live graph view
a timeline of cognitive events
clustered concepts and tags
searchable nodes
an inspector panel with metadata, relations, timestamps, and node type

Mindscape is meant to answer questions like:

What did the agent do?
Which tools were involved?
What decisions were made?
Which concepts are connected?
What reasoning or errors happened during a session?
How does a project or task evolve over time?

It is both a visualization tool and a lightweight graph memory layer for Hermes Agent.

Demo

Here are a few screenshots from the current version.

Graph View

The graph view shows concepts as connected nodes. Different node types use different colors, and selecting a node opens the inspector on the right.

Example node types include:

reasoning
tool call
decision
session
memory
error
manual nodes

Timeline View

The timeline shows the same graph as a chronological event stream. This makes it easier to understand what happened first, what came later, and how a session evolved.

Inspector

When a node is selected, Mindscape shows:

title
type
content
tags
metadata
timestamps
relations to other nodes

This is useful when debugging an agent workflow or when trying to understand why a particular decision or tool call happened.

Code

GitHub repository:

https://github.com/southy404/hermes-mindscape

Install directly from GitHub:

hermes plugins install southy404/hermes-mindscape --enable

Then start the Hermes dashboard:

hermes dashboard

Open the Mindscape tab in the dashboard sidebar.

Mindscape also includes an optional example graph for quick testing:

curl -X POST "http://127.0.0.1:9119/api/plugins/mindscape/seed-demo?force=true"

The plugin works without the demo seed. The seed is only there to help users quickly see what the graph looks like.

My Tech Stack

Hermes Agent
Hermes plugin system
Python
FastAPI-style plugin API routes
Hermes lifecycle hooks
Hermes dashboard plugin extension
Vanilla JavaScript
SVG force-directed graph rendering
CSS
Local persistent JSON graph storage
WebSocket-ready update flow

The plugin is structured as a standard Hermes plugin:

hermes-mindscape/
├── plugin.yaml
├── manifest.json
├── __init__.py
├── README.md
├── dashboard/
│   ├── plugin_api.py
│   └── dist/
│       ├── index.js
│       └── style.css
├── graph/
│   ├── store.py
│   └── events.py
└── hooks/
    └── graph_hooks.py
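
To make "local persistent JSON graph storage" more concrete, here is a minimal sketch of a file-backed graph store. The class name `JsonGraphStore`, its methods, and the on-disk layout are assumptions for illustration only. They are not taken from the plugin's actual graph/store.py.

```python
import json
import os
import tempfile
import time
import uuid
from pathlib import Path


class JsonGraphStore:
    """Hypothetical JSON-file-backed graph store (not the plugin's real store.py)."""

    def __init__(self, path):
        self.path = Path(path)
        self.data = {"nodes": {}, "edges": []}
        if self.path.exists():
            self.data = json.loads(self.path.read_text(encoding="utf-8"))

    def add_node(self, node_type, title, content="", tags=None):
        node_id = uuid.uuid4().hex
        self.data["nodes"][node_id] = {
            "id": node_id,
            "type": node_type,
            "title": title,
            "content": content,
            "tags": list(tags or []),
            "created_at": time.time(),
        }
        self._save()
        return node_id

    def add_edge(self, source, target, relation="related"):
        # Refuse edges that point at unknown nodes so the graph stays consistent.
        for node_id in (source, target):
            if node_id not in self.data["nodes"]:
                raise KeyError(f"unknown node: {node_id}")
        self.data["edges"].append(
            {"source": source, "target": target, "relation": relation}
        )
        self._save()

    def _save(self):
        # Write to a temp file first, then swap it in, so a crash mid-write
        # cannot leave a half-written graph file behind.
        self.path.parent.mkdir(parents=True, exist_ok=True)
        fd, tmp = tempfile.mkstemp(dir=self.path.parent, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(self.data, fh, indent=2)
        os.replace(tmp, self.path)
```

A single JSON file keeps the graph easy to inspect and back up by hand, which suits a local-first plugin. The trade-off is that every write rewrites the whole file, which may be one reason SQLite storage is on the list of future ideas.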

How I Used Hermes Agent

Hermes Agent is not just used as a theme or wrapper here. Mindscape is built around Hermes itself.

The plugin connects to Hermes in three main ways:

1. Hermes as the agentic source

Mindscape is designed around the kind of work Hermes does: planning, tool use, multi-step reasoning, session management, and project execution.

Instead of only showing the final answer, Mindscape tries to preserve the structure around the agent’s process.

A tool call can become a node.
A decision can become a node.
A reasoning snapshot can become a node.
A session can become a node.
An error can become a node.
A relation between two ideas can become an edge.

That makes the agent’s work easier to inspect after the fact.

2. Hermes plugin hooks

Mindscape uses Hermes plugin hooks to observe agent activity passively and defensively, so the plugin keeps working even when a hook is unavailable.

The goal is not to interrupt Hermes or change its behavior. The goal is to listen, capture useful events, and convert them into graph data.

For example, hooks can capture:

session start
tool calls
reasoning snapshots
subagent events
errors or failed actions

Those events are normalized into Mindscape nodes with metadata, tags, timestamps, and relations.
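
The normalization step could look roughly like the sketch below. The event shape (`kind`, `name`, `detail`), the `EVENT_TYPES` mapping, and the `Node` fields are hypothetical and chosen to mirror the inspector fields described earlier. The real hook payloads in Hermes may look different.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Node:
    type: str
    title: str
    content: str = ""
    tags: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)
    timestamp: str = ""
    relations: list = field(default_factory=list)


# Hypothetical mapping from hook event kinds to Mindscape node types.
EVENT_TYPES = {
    "session_start": "session",
    "tool_call": "tool_call",
    "reasoning": "reasoning",
    "subagent": "session",
    "error": "error",
}

# Keys that are promoted to dedicated Node fields instead of metadata.
_RESERVED = {"kind", "name", "detail", "timestamp"}


def normalize_event(event, session_id=None):
    """Turn a loosely shaped hook payload into a Node, or None if unusable."""
    if not isinstance(event, dict):
        return None
    kind = str(event.get("kind", "")).lower()
    node_type = EVENT_TYPES.get(kind)
    if node_type is None:
        return None  # unknown events are ignored rather than raising
    node = Node(
        type=node_type,
        title=str(event.get("name") or kind)[:120],
        content=str(event.get("detail", "")),
        tags=[kind],
        metadata={k: v for k, v in event.items() if k not in _RESERVED},
        timestamp=event.get("timestamp") or datetime.now(timezone.utc).isoformat(),
    )
    if session_id:
        # Link the event back to the session it happened in.
        node.relations.append({"target": session_id, "relation": "part_of"})
    return node
```

Returning `None` for unknown or malformed events, instead of raising, fits the goal of listening without interrupting the agent.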

3. Hermes Dashboard integration

The project also extends the Hermes dashboard with a full custom UI.

I wanted the plugin to feel native inside Hermes instead of being a separate external app. The dashboard plugin provides:

graph navigation
zoom controls
node creation
editing and deletion
timeline view
cluster view
search
inspector panel

This makes Mindscape useful not only as a backend logging system but as an interactive visual workspace.

Why I Built It

I have always liked visual knowledge tools like mind maps, graph views, and Obsidian-style note networks.

When working with AI agents, I often found myself wondering:

What would it look like if the agent’s reasoning and tool usage became a living map?

Hermes Agent was a good fit for this idea because it already has the important pieces:

tool use
sessions
plugins
hooks
dashboard extensions
local-first execution
multi-step agent workflows

Mindscape connects those pieces into a visual layer.

The long-term idea is that agents should not only produce answers — they should leave behind inspectable cognitive traces.

What Makes It Useful

Mindscape can help with:

debugging agent workflows
understanding which tools were used
reviewing reasoning paths
visualizing project architecture
tracking decisions over time
exploring agent memory
showing non-technical users what an agent is doing
turning hidden logs into visible structure

For example, if Hermes works on a complex project, Mindscape can show the related components, decisions, tools, errors, and sessions as a graph instead of scattered text.

That makes the agent feel less like a black box.

What Was Challenging

The hardest part was not building the graph itself.

It was making Mindscape feel like a real Hermes plugin:

installable from GitHub
visible in the Hermes dashboard
safe to load
persistent across sessions
not dependent on personal demo data
defensive when hooks are unavailable
usable even with an empty graph
clean enough for public release
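
As one way to picture "defensive when hooks are unavailable", here is a small sketch. The `ctx` object and its `register_hook` method are hypothetical stand-ins for whatever a plugin host passes in. They are not the documented Hermes plugin API.

```python
import logging

log = logging.getLogger("mindscape.sketch")


def _guard(handler):
    """Wrap a handler so its failures never propagate into the agent run."""
    def wrapped(*args, **kwargs):
        try:
            return handler(*args, **kwargs)
        except Exception:
            log.warning("hook handler failed", exc_info=True)
            return None
    return wrapped


def register_hooks_safely(ctx, handlers):
    """Register handlers only where the host exposes a hook API.

    Returns the hook names that were actually registered, so the caller
    can decide whether live capture is available at all.
    """
    register = getattr(ctx, "register_hook", None)  # hypothetical method name
    if not callable(register):
        log.info("hook API unavailable; running without live capture")
        return []
    registered = []
    for name, handler in handlers.items():
        try:
            register(name, _guard(handler))
            registered.append(name)
        except Exception:
            log.warning("could not register hook %r", name, exc_info=True)
    return registered
```

With this shape, a missing hook API or a failing handler degrades to "no live capture" instead of breaking the dashboard or the agent.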

I also had to make sure the plugin did not require manual local copying. The final version is structured so it can be installed directly through Hermes:

hermes plugins install southy404/hermes-mindscape --enable

What I Learned

This project made me appreciate Hermes Agent’s plugin architecture much more.

The plugin system makes it possible to build something that is not just a tool, but a new interface for understanding agent behavior.

I also learned that visualizing agent activity changes how you think about agents.

A chat transcript is linear.
A log file is technical.
A graph is explorable.

That makes a big difference.

Future Ideas

I would like to continue improving Mindscape with:

richer live WebSocket updates
better automatic semantic linking
graph snapshots
reasoning replay
import/export
SQLite storage
3D graph mode
multi-agent views
memory confidence scoring
export to Markdown or Obsidian
automatic project architecture mapping

The bigger vision is a Hermes-native cognitive map that grows as the agent works.

Final Thoughts

Mindscape is my attempt to make agentic behavior visible.

Hermes Agent can already plan, use tools, and reason through tasks. Mindscape adds a visual layer on top of that — a way to inspect the shape of the agent’s work.

For me, the exciting part is not only the graph itself.

It is the idea that open source agents should be understandable, inspectable, and explorable.

That is what Mindscape tries to make possible.

Google I/O 2026 Made the Search Box Feel Like an Agent Layer

southy404 — Wed, 20 May 2026 12:21:55 +0000

This is a submission for the Google I/O Writing Challenge

A few weeks ago, after Google Cloud NEXT ’26, I wrote that the agentic shift was already rewriting how we build software.

At Google I/O 2026, that same idea clicked again — but from a different angle. This time, it was not mainly about cloud infrastructure, enterprise workflows, or multi-agent orchestration platforms. It was not only about a new Gemini model, a redesigned app, or another polished demo of AI doing something impressive on stage.

It was the search box.

That sounds almost too simple, but I think that is exactly why it matters. For more than 25 years, the Google search box has been one of the most stable interfaces in computing: type a few words, get a list of links, open pages, and do the work yourself.

At I/O 2026, that familiar box started to look like something else.

Not just an input field. Not just a gateway to links. But a place where users can express intent, attach context, generate interfaces, start agents, monitor information, and continue tasks over time.

Google Cloud NEXT made agents feel like infrastructure.

Google I/O made them feel like the interface.

And honestly, that might be the bigger shift.

From Agent Infrastructure to Everyday Interface

What stood out to me at Google Cloud NEXT was that AI was no longer presented as just a smarter API call. It was becoming a layer that could operate across tools, keep context, coordinate work, and continue over time. As a developer, that changed how I looked at software.

Instead of thinking only in endpoints, requests, and responses, we suddenly had to think about responsibilities, permissions, state, handoffs, and behavior. The system was no longer just reacting to input. It was starting to act, coordinate, and evolve.

But at NEXT, that shift still felt mostly like something happening inside platforms: cloud tools, enterprise workflows, agent orchestration systems, internal automation, and developer environments.

Google I/O 2026 made that same idea feel much closer to everyday users.

Search, Gemini, Gmail, Chrome, Android, AI Studio, developer tools, and even the browser itself are becoming surfaces where agentic behavior can show up. That is a different kind of shift, because it moves agents from the backend of software into the place where people already begin their digital lives.

If Search can start tasks, generate custom interfaces, monitor information, reason over personal context, and connect into tools, then Search is no longer just a page of results.

It starts to look like a task layer.

The Search Box Is No Longer Just a Search Box

For a long time, the search box trained us to compress what we wanted into keywords.

We did not write full thoughts. We did not attach messy context. We did not explain our situation in detail. We learned to search in fragments:

best laptop 2026
weather Berlin
React form validation
cheap flights Tokyo
coffee machine red eco friendly

That was the rhythm of search. You compressed your intent into the shortest useful phrase, Google returned links, and then you did the real work yourself.

The new direction feels different.

The search box is becoming more like a command surface: a place where a user can bring a longer question, a document, an image, a video, a browser tab, or a more complex goal. Instead of only returning links, the system can answer, visualize, summarize, compare, generate a custom UI, or hand work off to an agent.

That is subtle, but huge.

Because once the interface can understand long-running intent, accept different kinds of context, generate interactive results, and let agents continue work over time, it becomes less like a search engine and more like a lightweight operating layer above the web.

Not an operating system in the classic sense.

But an operating layer for intent.

The user expresses what they want, and the system figures out what information, tools, apps, or agents are needed to move forward.

Source: Google — “What’s New in Search”

What stood out to me in that Search session was not only the AI answer layer. It was the way Search is being redesigned around longer questions, richer context, follow-up behavior, generated interfaces, and tasks that can continue beyond a single query.

That is the part that makes it feel less like a search feature and more like an agent interface.

This Did Not Start in 2026

What makes this shift even more interesting is that it did not come out of nowhere.

Back in 2023, Google introduced the Search Generative Experience. At that point, the idea was still relatively easy to understand: AI could summarize answers directly inside Search and let users ask follow-up questions. That was already a major change, but it still felt like search with an AI layer on top.

You typed a question. Google generated an answer. Links still sat nearby as supporting sources.

The risk was already visible back then: if Google starts summarizing more of the web directly, users may rely more on Google itself instead of visiting the sources behind the answer. It raised questions about publishers, source visibility, hallucinations, ads, and how much trust people should place in an AI-generated answer.

But in 2026, the direction feels bigger than summarization.

The search box is not just becoming a place where AI writes answers. It is becoming a place where users can bring files, images, videos, browser tabs, and long-running intent. That moves Search from answering questions toward operating on tasks.

And that changes the meaning of Search.

From Keywords to Intent

The old search box asked users to compress intent into keywords.

The new one asks them to express intent more fully — and then lets AI systems decide what should happen next.

That is a massive change in interface design. The user is no longer only searching for a page. They may be starting a workflow, asking for a comparison, generating a custom UI, creating a tracker, delegating a task, or asking an agent to keep watching something in the background.

That means the search box is no longer just an input field.

It becomes a boundary between human intent and machine action.

And boundaries like that need careful design.

This is where the shift becomes more serious. A search query used to be relatively low-risk. If the result was bad, you clicked something else. But once search becomes a place where agents can act, monitor, remember, and coordinate across tools, the design problem becomes much deeper.

The question is no longer only:

Did Search find the right information?

It becomes:

Did the system understand the intent correctly?

Did it use the right context?

Did it act within the right permissions?

Can the user understand what happened?

That is a very different product problem.

Developers Are Not Just Building Apps for Humans Anymore

This is where the developer impact gets interesting.

For a long time, we built software mainly for human users. We designed buttons, forms, APIs, pages, dashboards, onboarding flows, and settings screens. Even when we built APIs, we usually imagined another deterministic system calling them.

In an agentic environment, that changes.

We also need to build software that agents can understand and operate reliably. That means the structure of software matters in a new way. An app is no longer only judged by how it looks to a human. It may also be judged by how clearly it exposes actions, state, permissions, and consequences to an AI system trying to help the user.

Can an agent understand what actions are possible?

Can it tell the difference between previewing an action and executing it?

Can it call the right tool safely?

Can it recover if something goes wrong?

Can it explain what it did?

Can it respect permissions?

Can it operate without accidentally crossing a boundary?

This is not only a UX problem.

It is an architecture problem.

WebMCP Feels Like a Signal

One announcement that stood out to me was WebMCP.

The idea of exposing structured tools on the web so browser-based agents can execute tasks more reliably feels like a clear signal of where things are going. The web was built for humans clicking around, reading pages, filling forms, and interpreting visual layouts.

Agents can technically click around too, but that is fragile.

They can misread layouts, click the wrong thing, hallucinate state, or fail when the UI changes. A human can often recover from a confusing interface. An agent might confidently do the wrong thing.

Structured tools change that. They give agents a more reliable way to interact with web apps, and that means developers may need to think about a new layer of web compatibility.

Not just:

Does this page work in Chrome?
Does it pass accessibility checks?
Is it responsive?

But also:

Can an agent understand this?
Are actions clearly exposed?
Are permissions explicit?
Can the system explain what happened?
Can dangerous actions be separated from harmless ones?

That could become a real design constraint.

And honestly, it probably should.
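
To illustrate what "separating dangerous actions from harmless ones" could mean in code, here is a small sketch. It is not the WebMCP API. The `Tool` type, the `destructive` flag, and the example tools are hypothetical and only show the general idea of making side effects explicit.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    run: Callable[[dict], str]
    destructive: bool = False  # True for actions with lasting side effects


class ConfirmationRequired(Exception):
    """Raised when a destructive tool is called without user approval."""


def call_tool(tools, name, args, confirmed=False):
    tool = tools.get(name)
    if tool is None:
        raise KeyError(f"no such tool: {name}")
    if tool.destructive and not confirmed:
        # The agent may preview freely, but must ask before acting.
        raise ConfirmationRequired(f"{name} needs explicit user confirmation")
    return tool.run(args)


TOOLS = {
    "preview_order": Tool(
        "preview_order",
        "Show the order total without buying.",
        lambda a: f"total for {a['item']}: 12.00",
    ),
    "place_order": Tool(
        "place_order",
        "Buy the item. Charges the user.",
        lambda a: f"ordered {a['item']}",
        destructive=True,
    ),
}
```

The point of the sketch is that previewing and executing are different tools with different rules, so an agent cannot cross that boundary by accident.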

AI Studio and Antigravity Show the Other Side

The developer tooling side is just as important.

Google’s announcements around AI Studio, Antigravity, managed agents, Android CLI, and Chrome DevTools for agents point to a future where agents are not only using software. They are helping build it.

That creates a strange loop.

Agents help developers build apps. Those apps expose tools and structured interfaces. Other agents then use those apps. Over time, the developer is no longer just building a product for a person sitting in front of a screen.

They are building part of an agent ecosystem.

That means good software design becomes more than clean code and nice UI. It becomes about making systems legible, controllable, observable, and safe for both humans and agents.

This also changes what “developer experience” means. It is not only about documentation, SDKs, and nice error messages anymore. It may also be about whether an agent can understand your system well enough to extend it, debug it, test it, or operate it safely.

The Old Problems Did Not Go Away

This is where my thinking connects back to my previous post.

The same problems are still there.

Context is still the bottleneck. Memory still changes the nature of the system. Governance is still not optional. Debugging is still becoming more about decisions than code.

But at I/O, those problems moved closer to the user.

If a cloud agent makes a wrong decision in a controlled enterprise workflow, that is already serious. But if a personal agent acts across Gmail, Calendar, Search, shopping, documents, browser tabs, and third-party tools, the trust problem becomes much more personal.

The question is no longer only:

Can the model do the task?

The question becomes:

Should it?

With which data?

Under which permissions?

With what audit trail?

And how does the user stay in control?

That is the part I think developers should pay close attention to. The more agentic systems become, the less useful it is to think only in terms of prompts and outputs. The real challenge moves into the surrounding system: context, permissions, memory, identity, traces, and recovery.

The Trust Layer Becomes the Real Platform

This is probably my biggest takeaway from Google I/O 2026.

The next platform layer is not just the model.

It is not just the agent runtime.

It is trust.

Identity, permissions, memory boundaries, audit logs, sandboxing, confirmation flows, data minimization, and clear user control will matter more and more. Because the more useful agents become, the more dangerous vague authority becomes.

A chatbot that gives a bad answer is annoying.

An agent that takes the wrong action is a different problem.

And an always-on agent with access to personal context, tools, files, money, and communication channels is not something we can treat like a normal app feature.

That needs a different level of design.

It also needs a different level of humility. Autonomy is not automatically good. More access is not automatically better. A system that can do more is only useful if the user can understand it, constrain it, correct it, and trust it.

Where Most Teams Are Still Thinking Too Small

A lot of teams still treat AI as something they add to an existing product.

A chat window. A summarization button. A smarter search field. A content generator. Those things are useful, but they are not the full shift.

The bigger change is that software itself is becoming more agentic.

It acts over time. It coordinates tools. It remembers. It makes decisions. It generates interfaces. It can be delegated tasks. It can operate across products.

That means the UI is no longer the whole product. Sometimes the product is the behavior behind the UI. Sometimes the user may never touch the UI directly at all.

This is where I think many teams will underestimate the change. They will ask, “Where should we add AI?”

But the better question might be:

What parts of our system are safe, structured, and understandable enough for an agent to operate?

That question leads to very different design decisions.

What Developers Should Pay Attention To

After I/O, I think developers should pay attention to a few things.

First, make your systems understandable to agents. If agents are going to operate software, vague interfaces and hidden side effects become a bigger problem.

Second, treat permissions as product design, not just security configuration. Users need to understand what an agent can do, not only what data it can read.

Third, build for observability. If an agent acts, there should be a trace of what it saw, what it decided, what tool it used, and what happened after.

Fourth, think carefully about memory. Memory makes agents useful, but it also makes systems harder to debug and harder to trust if it is not transparent.

And finally, do not confuse autonomy with usefulness. The best agent is not always the one that does the most. Sometimes the best agent is the one that knows when to ask, when to stop, and when to hand control back to the human.
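
The observability point above can be sketched in a few lines. This is only an illustration of the idea: the field names and the JSON Lines format are my own assumptions, not something announced at I/O.

```python
import json
from datetime import datetime, timezone


def record_step(stream, observed, decision, tool, outcome):
    """Append one agent step to a JSON Lines trace and return the entry."""
    entry = {
        "at": datetime.now(timezone.utc).isoformat(),
        "observed": observed,   # what the agent saw
        "decision": decision,   # what it decided
        "tool": tool,           # which tool it used
        "outcome": outcome,     # what happened after
    }
    stream.write(json.dumps(entry) + "\n")
    return entry
```

One line per step is enough to answer "what did it see, what did it decide, and what happened" after the fact.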

Final Thought

Google I/O 2026 did not just show better AI features.

It showed a different interface model.

Not software that waits for clicks, but software that understands intent, coordinates tools, generates interfaces, and keeps working over time.

At NEXT, agents looked like infrastructure.

At I/O, they started looking like the interface.

That is a big deal.

Because the future may not be:

AI inside every app.

It may be:

Agents operating across every app.

And if that is where things are going, developers have a new responsibility. We are not just designing screens anymore. We are designing behavior, trust, and control in systems that can act.

Sketch Judge: Draw Fast, Match Right, Let Gemma 4 Decide

southy404 — Tue, 19 May 2026 15:59:45 +0000

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

What I Built

I started this challenge with a very different idea.

For a while, I thought about building something more serious. I already have several ongoing projects where I experiment with Gemma 4, especially in local-first setups. One of them is an offline agent that can collect ideas, notes, and memories locally, then synchronize them with a main system once the device is online again. The prototype for that direction already exists.

But for this challenge, I wanted to build something lighter.

Something playful.

Something colorful.

Something that could make people smile.

Something that adults, artists, and even kids could try.

That was the beginning of Sketch Judge — a mobile-first AI drawing game where Gemma chooses what you have to draw, and then judges how well you matched the motif.

The game loop is simple:

Add players.
Choose rounds and draw time.
Let Gemma choose a motif.
Reveal the motif to the current player.
Draw it before the timer ends.
Let the AI judge the result.
Continue until the final leaderboard reveals the winner.

The goal is not only to draw something beautiful. The goal is to draw something that clearly matches the target motif.

So if the target is book, drawing a beautiful apple should not win.

That made the project more interesting than a normal drawing app. Gemma is not only generating text in the background. It becomes the game master, the prompt creator, and the judge.

Sketch Judge has two modes:

Casual Mode is forgiving and family-friendly. A simple drawing can score well if the motif is clearly recognizable.

Artist Mode is stricter and aimed at adults, artists, and competitive players. In this mode, Gemma chooses more difficult and expressive motifs, and the scoring gives more weight to proportion, details, creativity, color, polish, and effort.

A simple outline can still be recognized, but it should not beat a detailed, colorful, polished drawing.

This app will not change the world or save it.

But maybe it can help you forget your problems for a moment and just have some fun.

Demo

Live frontend preview:

https://sketch-judge.vercel.app/

The hosted Vercel version currently renders the frontend only.

The local Ollama / Gemma 4 AI connector is not enabled in this deployment yet.

Code

Repository:

https://github.com/southy404/sketch-judge

The repository contains:

the mobile-first game UI,
the drawing canvas,
local Ollama integration,
Gemma-powered motif generation,
Gemma-powered judging prompts,
server-side score guards,
Casual Mode and Artist Mode,
fallback motif pools,
final leaderboard logic,
and a colorful sketchbook-style visual design.

How I Used Gemma 4

Gemma 4 powers the main creative loop of Sketch Judge.

I used the E4B model through Ollama for this prototype.

This model choice was intentional. I wanted to build something that fits a mobile-first and local-first direction. A huge cloud-only model could be powerful, but it would not match the future version I have in mind for this project.

Sketch Judge is a small web prototype now, but the idea points toward a game that could eventually run much closer to the user’s own device. Using a smaller Gemma model through Ollama felt like the right step because:

it keeps the prototype local-first,
it avoids making the game depend on a cloud API from the beginning,
it fits the idea of private, casual gameplay,
it makes experimentation fast,
and it keeps the door open for a future mobile/on-device version.

The model does not only sit behind the app as a chatbot. It directly controls important parts of the gameplay.

Gemma chooses the motif

At the start of each round, Gemma chooses the drawing motif.

The app asks Gemma for a structured JSON response:

{
  "name": "Floating Island",
  "hint": "small island with a tree in the sky",
  "difficulty": "artist",
  "category": "scene"
}

The motif system has two goals:

The prompt should be drawable in a short time.
The prompt should not become repetitive.

Early versions of the game picked too many basic words like:

apple
book
key
banana

That made the game feel too limited.

So I added recent motif memory, category tracking, fallback motif pools, and stricter prompt instructions. In Casual Mode, Gemma chooses simple and fun prompts. In Artist Mode, it prefers harder and more expressive ideas.

Examples of Artist Mode motifs:

Clockwork Bird
Floating Library
Dragon Teapot
Rainy Neon Alley
Underwater Castle
Crystal Fox
Tiny Robot Café
Moon Garden
Mechanical Garden

This made Gemma feel more like a creative game master instead of a random word picker.
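
The combination of recent motif memory and fallback pools can be sketched like this. The class name `MotifPicker`, the pool contents, and the memory size are assumptions for illustration. The repository's actual logic may differ.

```python
import json
import random
from collections import deque

# Hypothetical fallback pools; the artist entries are taken from the list above.
FALLBACK_MOTIFS = {
    "casual": ["Smiling Sun", "Sleepy Cat", "Rocket", "Ice Cream"],
    "artist": ["Clockwork Bird", "Dragon Teapot", "Moon Garden"],
}


class MotifPicker:
    def __init__(self, memory_size=10, rng=None):
        self.recent = deque(maxlen=memory_size)  # lowercase names of recent motifs
        self.rng = rng or random.Random()

    def fallback(self, mode):
        pool = FALLBACK_MOTIFS.get(mode, FALLBACK_MOTIFS["casual"])
        # Prefer motifs not seen recently; reuse the pool only if all were seen.
        fresh = [m for m in pool if m.lower() not in self.recent] or pool
        return {
            "name": self.rng.choice(fresh),
            "hint": "",
            "difficulty": mode,
            "category": "fallback",
            "fallback": True,
        }

    def accept(self, raw_json, mode):
        """Use the model's motif if it is valid and not a repeat, else fall back."""
        motif = None
        try:
            parsed = json.loads(raw_json)
            name = parsed["name"].strip()
            if name and name.lower() not in self.recent:
                motif = {**parsed, "name": name}
        except (ValueError, KeyError, AttributeError, TypeError):
            pass  # malformed reply: handled by the fallback below
        if motif is None:
            motif = self.fallback(mode)
        self.recent.append(motif["name"].lower())
        return motif
```

Marking fallback motifs with a `fallback` flag lets later steps, such as scoring, know that the motif did not come from the model.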

Gemma judges the drawing

After a player submits a drawing, the app sends the image and target motif to the judging system.

Gemma returns a structured result like:

{
  "detectedObject": "book",
  "targetMatch": true,
  "score": 74,
  "recognition": 82,
  "shape": 4,
  "proportion": 3,
  "creativity": 2,
  "effort": 4,
  "feedback": "The book is recognizable, but Artist Mode expects stronger proportions and more detail."
}

The most important rule is:

The score should depend on whether the drawing matches the motif.

A beautiful drawing of the wrong object should not win.

That was one of the biggest lessons while building the project. AI can be very friendly, sometimes too friendly. In early tests, the model could say something like:

Nice apple, but the target was a book.

And still give a high score.

That is not fair for a game, so the backend now adds scoring guardrails around the model output.

Score guards and fairness

The backend validates and normalizes the AI response before showing it to the player.

It checks things like:

Is the target motif actually matched?
Is the score on the expected 0–100 scale?
Are category ratings consistent with the final score?
Is the feedback overpraising a weak drawing?
Is Artist Mode active?
Is the drawing sparse, tiny, or only a simple outline?
Is this a fallback score?

For example:

Wrong motif: capped low.
Empty or tiny drawing: capped low.
Artist Mode simple outline: cannot score like polished art.
Casual Mode: more forgiving.
Fallback mode: never pretends to fully understand the motif.

This keeps Gemma creative, but the game rules stay fair.
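
Here is a minimal sketch of what such score guards might look like. The cap values, the `ink_ratio` input (a 0 to 1 estimate of how much of the canvas was drawn on), and the function name are illustrative assumptions, not the thresholds used in the repository.

```python
import math


def guard_score(result, mode="casual", ink_ratio=1.0):
    """Clamp a parsed judge result to the game rules.

    All thresholds below are hypothetical examples.
    """
    try:
        score = float(result.get("score", 0))
    except (TypeError, ValueError):
        score = 0.0
    if math.isnan(score):
        score = 0.0
    score = max(0.0, min(100.0, score))  # force the 0-100 scale

    caps = []
    if not result.get("targetMatch", False):
        caps.append(25)  # wrong motif: capped low
    if ink_ratio < 0.01:
        caps.append(15)  # empty or tiny drawing: capped low
    if mode == "artist" and ink_ratio < 0.05:
        caps.append(70)  # sparse outline cannot score like polished art
    if result.get("fallback"):
        caps.append(60)  # fallback judging never claims full understanding

    final = min([score] + caps)
    return {**result, "score": round(final), "capped": final < score}
```

For example, `guard_score({"score": 95, "targetMatch": False})` comes back with a score of 25, no matter how friendly the model's feedback was. The model still writes the feedback, but the rules decide the ceiling.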

Artist Mode

Artist Mode changes the role of Gemma.

In Casual Mode, the main question is:

Is the motif recognizable?

In Artist Mode, the question becomes:

Is the motif recognizable, and is the drawing actually strong?

Artist Mode judges more seriously:

shape
proportion
detail
color
composition
creativity
effort
polish

This means a simple black outline of a book can still score okay, but it should not receive 90+ points unless the drawing really has detail and finish.

Local-first AI through Ollama

The prototype uses Ollama locally.

This local-first approach is part of the concept. If this project grows further, I would like to explore:

browser-to-local-Ollama mode,
a packaged local app,
a mobile version,
or eventually an on-device Gemma setup.

The game is simple, but the architecture is intentionally pointing toward local AI gameplay.
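
For readers curious what a local judging request might look like, here is a rough sketch against Ollama's `/api/generate` endpoint on its default port. The model tag is a placeholder, the prompt wording is invented for this example, and the helper names are hypothetical. This is not the code from the repository.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434/api/generate"  # Ollama's default local address
MODEL_TAG = "your-gemma-tag-here"  # placeholder: use the tag shown by `ollama list`


def build_judge_payload(png_bytes, motif, mode="casual"):
    """Build the request body for one judging call."""
    prompt = (
        f"You are the judge of a drawing game in {mode} mode. "
        f"The target motif is: {motif}. "
        "Reply with JSON containing detectedObject, targetMatch, score and feedback."
    )
    return {
        "model": MODEL_TAG,
        "prompt": prompt,
        "images": [base64.b64encode(png_bytes).decode("ascii")],
        "format": "json",   # ask Ollama to constrain the reply to JSON
        "stream": False,    # one complete reply instead of a token stream
    }


def judge(png_bytes, motif, mode="casual", timeout=60):
    """Send the drawing to a locally running Ollama and parse the reply."""
    body = json.dumps(build_judge_payload(png_bytes, motif, mode)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        reply = json.load(resp)
    # With stream disabled, the generated text arrives in the "response" field.
    return json.loads(reply["response"])
```

`judge` needs a running Ollama instance with a vision-capable model pulled. `build_judge_payload` can be inspected without one. The parsed reply would still go through the score guards before reaching the player.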

Thanks for reading.

Sketch Judge is not a serious productivity tool. It is not trying to automate work. It is not trying to save the world.

It is just a colorful little AI game.

But maybe that is exactly why I liked building it.

Sometimes technology should also be allowed to be playful. Sometimes an AI model does not need to write reports or solve tasks. Sometimes it can just choose a silly motif, watch your chaotic drawing, and decide who wins.

Draw fast. Match right. AI decides.

I rebuilt AFTER HUMAN — an open home for AI tools, experiments, and future products

southy404 — Sun, 10 May 2026 11:10:33 +0000

Over the last few weeks, I rebuilt my website:

👉 https://www.afterhuman.online/

But this is not just a portfolio page.

I want Afterhuman to become the home for the open source AI products, tools, experiments, and ideas I am building next.

What is AFTER HUMAN?

Afterhuman is my independent AI development forge.

The idea is simple:

Build practical, open, developer-friendly AI systems that people can actually use, extend, break, improve, and learn from.

Not just demos.

Not just landing pages.

Not just “AI wrappers”.

I want to build tools that feel useful, local-first where possible, transparent, and shaped together with a community.

Why I rebuilt the site

I needed a place that can grow with the projects.

Right now, the site introduces the direction:

open source first
modular systems
sovereign AI
tools for developers, builders, inventors, and explorers

The current version is still early, but it already gives Afterhuman a clearer identity and a place where future releases can live.

The first bigger project: Argus

Because of a DEV challenge that is currently running, I started working more seriously on Argus.

Argus is not public yet, but it is one of the main things I want to release under the Afterhuman umbrella.

The vision:

A modular AI system for real-time observation, memory, tools, automation, and intelligent response.

Think less “chatbot” and more:

a personal AI operating layer that can understand context, use tools, remember what matters, and help you act.

It is still early, and I do not want to overpromise. But the foundation is being built, and I will share more soon.

Why community matters

One thing I learned while building OpenBlob and writing here on DEV:

Projects become better when people can see them early.

Not when everything is polished.

Not when all features are finished.

But when other developers can ask questions, challenge ideas, suggest use cases, and maybe even build with you.

That is why I want Afterhuman to become an open community for:

AI developers
open source builders
creative technologists
indie hackers
researchers
inventors
curious people who want to explore what AI tools can become

What comes next

The next steps are:

Publish more information about Argus
Open the Discord community more publicly
Share progress on the website
Release first open source components and tools
Keep documenting the build process here on DEV

For now, the site is live:

👉 https://www.afterhuman.online/

And if you are interested in open source AI tools, local AI systems, agents, automation, or experimental developer tools, I would love to have you around.

This is still the beginning.

But that is usually the most interesting part.

The Agentic Shift Isn’t Coming. It’s Already Rewriting How We Build Software.

southy404 — Fri, 24 Apr 2026 10:27:13 +0000

This is a submission for the Google Cloud NEXT Writing Challenge

At Google Cloud NEXT ’26, something clicked for me — and it honestly wasn’t what I expected.

It wasn’t a new model, a faster API, or one of those polished demos that look great but don’t really change how you build things.

It was the realization that I was still thinking in the wrong abstraction.

While Google was showing systems that operate over time, coordinate across tools, and make decisions with context, I caught myself still thinking in endpoints, requests, and features.

That gap is where the real shift is happening.

We Didn’t Just Get Better AI — We Got a Different Layer of Software

For years, even as AI got better, our mental model didn’t really change.

Most systems still worked like this: user sends a request, system processes it, returns a response. Even with LLMs, we mostly just swapped out deterministic logic for probabilistic outputs and called it a day.

But what was presented at NEXT doesn’t really fit that anymore.

These systems don’t just respond. They keep context over time, coordinate multiple agents, and keep doing things even when no one is actively interacting with them.

That doesn’t feel like “AI inside your app.”

It feels more like something that’s just… running.

Source: Google Cloud NEXT ’26 — Official Announcement

From Output to Execution

The biggest shift is easy to miss, but once you notice it, you can’t unsee it.

We’re moving away from systems that are judged by how good their output looks, toward systems that are judged by what they actually do.

Generating a nice answer is one thing.

Actually executing a task across multiple systems — with permissions, constraints, and changing context — is a completely different problem.

And you feel that difference immediately when you try to build something like this.

Because suddenly it’s not about “did the response look right?”

It’s about “did the system actually do the right thing?”

You’re Not Just Writing Code Anymore

This is the part that hit me the most.

If you take this seriously, your role as a developer shifts.

You’re not mainly writing endpoints, functions, or UI flows anymore.

You’re defining responsibilities. You’re deciding who (or what) is allowed to do what, how decisions move through the system, and what should happen over time when different parts interact.

At some point it stops feeling like assembling logic…

…and starts feeling like designing behavior under constraints.

Multi-Agent Systems Look Clean — Until You Build Them

On paper, multi-agent systems look almost too clean.

You split things up nicely: one agent plans, another evaluates, another executes. Each has a clear role, everything is modular, everything makes sense.

Until you actually try it.

Because then you realize: complexity didn’t go away. It just moved.

Instead of one complex system, you now have multiple smaller systems that need to agree with each other.

And they don’t always do that.

You can easily end up in situations where:

one agent thinks something is ready to execute
another thinks it still needs clarification

Both are “right” in isolation. The result is still wrong.

No crash. No error. Just weird behavior.

That’s a very different kind of problem.

Context Is the Real Bottleneck Now

For a long time, we all focused on models. Bigger, smarter, faster.

But lately it feels like the bottleneck is somewhere else.

Context.

Not just having data, but having the right data, in the right shape, shared consistently across everything involved.

Because if different parts of the system operate on slightly different context, things start drifting fast.

Without a solid context layer, agents don’t really “understand” anything. They just make reasonable guesses.

With it, they start to behave in a way that actually feels grounded.

Memory Changes the Nature of the System

Stateless systems are simple. Every request is its own thing.

Stateful systems are… not.

As soon as you introduce memory, everything changes a bit. The system starts carrying history. Decisions are influenced by things that happened before, sometimes in ways that aren’t obvious anymore.

That’s powerful, but also a bit uncomfortable.

Because now you’re not debugging a single execution anymore.

You’re trying to understand a chain of decisions that led to a certain outcome.

Governance Becomes a Core Design Problem

Another thing that becomes obvious pretty quickly: once systems can act, control becomes critical.

Not just “secure your API” kind of control.

Actual decision control.

Who is allowed to do what?

Which actions are valid?

What happens if something goes wrong?

This is where identity, permissions, and traceability stop being “enterprise stuff” and become core to the system.

Without that, autonomous systems aren’t just powerful — they’re kind of dangerous.

Debugging Becomes About Decisions, Not Code

This is probably the weirdest shift.

In normal systems, something breaks and you trace it back to a line of code.

Here, everything can technically work — and still be wrong.

The issue isn’t that something failed. It’s that different parts of the system interpreted the situation differently or acted on slightly different context.

So you’re not really debugging code anymore.

You’re debugging decisions.

Where Most Teams Are Still Thinking Too Small

Right now, a lot of implementations still treat AI as a feature.

Something behind an endpoint. Something inside a UI.

But that framing feels… outdated.

Because the real shift is deeper.

The system itself becomes the AI. The UI is just one surface.

What actually matters is what’s happening behind it — how agents coordinate, how context flows, how decisions are made over time.

That’s where things get interesting.

Final Thought

Google Cloud NEXT ’26 didn’t just introduce new tools.

It introduced a different way of thinking about software.

Not as something that reacts to input…

…but as something that acts, coordinates, and evolves over time.

The real question isn’t whether you’ll use AI in your system.

It’s whether you’re ready to build systems where behavior — not just code — is the main thing you design.

I Tried OpenClaw on Windows with Ollama. I was hyped… until I wasn’t.

southy404 — Thu, 23 Apr 2026 11:14:40 +0000

This is a submission for the OpenClaw Writing Challenge

The Beginning

Today was the day.

For the first time, I cloned OpenClaw on my Windows machine.

My mission was simple: build something for the OpenClaw Challenge using my local Ollama setup.

At first, everything felt smooth. I cloned the repo, read the README, checked the well-written docs, followed the Windows setup instructions, and ran the install command.

Then I saw this in the terminal:

Windows detected - OpenClaw runs great on WSL2.
Native Windows might be trickier.

That was the first moment I got a little skeptical.

Still, the setup looked clean. I was guided through onboarding, picked QuickStart, selected Ollama as the provider, chose local only, set the base URL, selected my model… and then:

boom.

Error: Cannot find module '@larksuiteoapi/node-sdk'

Alright. Not great — but maybe just a one-off.

I installed the package manually and ran the setup again.

Then again:

Windows detected - OpenClaw runs great on WSL2.
Native Windows might be trickier.

And slowly, I started to understand why.

I went through the setup again — model, base URL, everything — and then:

boom again.

Error: Cannot find module 'nostr-tools'

Second missing module. And this time for something I wasn’t even using.

Fine. Installed it.

Ran setup again.

And then:

boom.

Error: Cannot find module '@slack/web-api'

At that point, the warning from the terminal stopped feeling like advice — and started feeling like a prediction.

The Windows Attempt

To be fair, OpenClaw never hid it. It told me early that native Windows might be tricky.

And for my setup, it absolutely was.

The loop looked like this:

run setup
hit missing module
install manually
repeat

What made it frustrating wasn’t just the errors — it was that they were tied to integrations I didn’t even need. I was just trying to run OpenClaw locally with Ollama.

So I Switched to WSL2

At that point, I did what the tool had been suggesting all along: switch to WSL2.

And honestly — that part did feel better.

No random module errors. Cleaner setup. Everything looked more stable.

But then I hit the next issue.

My local Ollama setup wasn’t really there anymore.

My models didn’t show up properly, and instead of a clean local flow, I ended up in a setup that expected an OpenAI- or Anthropic-style API.

After digging a bit, the issue became obvious:

Ollama wasn’t installed in that environment.

So yes — WSL2 solved one problem, but it also disconnected me from the setup I actually wanted.

The Turnaround: `ollama launch openclaw`

Then I remembered something.

A few weeks ago, I got an email from Ollama that said:

ollama launch openclaw

So I went back to my Windows environment and tried exactly that.

And suddenly:

it worked.

Not perfectly. Not magically. But it worked.

The dashboard opened. The system was running.

The First Thing OpenClaw Did Right for Me

I typed “hi” into the chat.

Instead of replying like a normal chatbot, OpenClaw pointed me to BOOTSTRAP.md.

And that was the first moment where I was genuinely impressed.

Instead of just chatting, you actually create your assistant.

Through conversation, you define:

IDENTITY.md
USER.md
SOUL.md

You give it a name, a personality, a tone — even an emoji.

That felt different.

It didn’t feel like configuring software.
It felt like shaping a system.

That idea alone is incredibly strong.

And Then… Boom Again

After going through the bootstrap, everything looked promising.

Then:

Ollama timeout.

Frustrating — but manageable.

I switched to a faster model in the dashboard, retried, and it worked.

Then I got a well-structured response suggesting behavior defaults, memory handling, and skill setup:

“Perfect. Both saved. 🦞

Now for SOUL.md — I suggest the following defaults:

Behavior & Collaboration:

Be proactive: Don’t wait for commands — check emails, calendar, and projects

Maintain memory: Update MEMORY.md every few days

Language: German/English mix

Heartbeats: 2–4x per day, but don’t respond on every poll (HEARTBEAT_OK if nothing important)

External actions (email, social): Always ask first

Git projects: Automatically commit/push (when requested)

Group chats: Only respond to real inputs, not every time

Use emoji reactions for Discord

Configure skills:

Weather (wttr.in for weather)

Healthcheck (for security checks)

Skill-creator (for creating new skills)

taskflow (for complex workflows)

What do you think? Should I write SOUL.md with these defaults, or do you want to adjust anything? 🦞”

Honestly?

It was good.

So I confirmed it.

And then…

boom. Timeout again.

That was the moment where the experience broke for me.

What I Think After All This

I don’t think OpenClaw is bad.

Actually, I think it’s one of the most interesting directions in this space right now.

There are ideas here that stand out:

agent-based workflows
identity + memory as first-class concepts
a real attempt at building a personal AI, not just a chat interface
a huge and fast-growing open-source community pushing it forward
an ecosystem of plugins, integrations, and channels that goes far beyond a single use case

This is not just “another AI tool.”

That’s rare.

But at the same time, the experience still feels very experimental.

Not just in performance — but in reliability.

Things work… until they don’t.

And when they break, it’s not always obvious why.

The Part That Makes Me Careful

OpenClaw isn’t just a chatbot.

It’s an agent that can:

run commands
access files
act in the background

That’s powerful.

But that also means trust matters a lot more.

And right now, I personally don’t feel comfortable giving that level of control to a system that still feels this unstable.

Conclusion

Right now, my opinion is simple:

OpenClaw is fascinating — but not ready for me yet.

I didn’t end up building my challenge project with it.

But I’m still glad I tried it.

Because the direction is genuinely exciting.

And to be fair:

If I had invested more time, I’m pretty sure I could have gotten everything running properly.

But that’s also part of the point.

For me, the current setup effort, combined with the limitations of today’s local models, just doesn’t feel worth it yet.

And here’s the important part:

This space is moving fast.

Local models are improving rapidly.
Hardware is getting better.
Tooling is evolving almost weekly.

Which means:

The exact same setup could feel completely different in a few months.

So while it didn’t work for me today…

I don’t think that will be true for long.

And that’s exactly why I’ll keep watching OpenClaw.

What about you?

Have you tried OpenClaw yet?

Whether locally, with cloud models, or in a completely different setup — I’m genuinely curious how your experience has been.

Did it feel smooth and powerful…
or more like something that’s still finding its footing?

And more importantly:

Do you see yourself actually using something like this in your daily workflow — or are we not quite there yet?

Let me know 👇

I just gave my local AI desktop companion access to the outside world (Telegram, Discord, Email…)

southy404 — Sun, 19 Apr 2026 10:16:49 +0000

For the last few weeks, I’ve been building a local-first AI desktop companion that lives on your screen.

It can:

see your screen
understand your context
execute actions on your system

But it had one big limitation:

It only lived on your desktop

So I changed that.

🌐 Introducing: Blob Connectors

I just added a new layer to OpenBlob:

👉 Blob Connectors

A lightweight Python bridge that connects your local AI to the outside world:

Telegram
Discord
Slack
Email

🧠 What this actually means

You can now do things like:

send “open spotify” via Telegram → Spotify opens on your PC
ask a question in Discord → your local model answers
send an email → get a contextual AI reply
control your desktop from anywhere

And the important part:

It’s still local-first

⚙️ How it works

All channels go through the same pipeline:

Telegram / Discord / Slack / Email
              │
        Blob Connectors (Python)
              │
    ┌─────────┴─────────┐
    │                   │
OpenBlob running?    Ollama fallback
(localhost)         (local model)
              │
        Command Router
              │
      Desktop action

Everything becomes a normalized Message object.

No matter where it comes from.
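
As a rough illustration, a normalized message could look something like the sketch below. The field names (`channel`, `sender`, `text`, `received_at`) and the `from_telegram` helper are assumptions made up for this example, not the actual OpenBlob schema. The input shape follows the `message` object of a Telegram Bot API update.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Message:
    """Channel-agnostic message (hypothetical fields, not OpenBlob's real schema)."""
    channel: str           # e.g. "telegram", "discord", "slack", "email"
    sender: str            # platform-specific sender id, kept as a string
    text: str              # message body with surrounding whitespace removed
    received_at: datetime  # timezone-aware timestamp in UTC


def from_telegram(update: dict) -> Message | None:
    """Map a Telegram update dict to a Message; return None for non-text updates."""
    msg = update.get("message") or {}
    text = msg.get("text")
    if not text:
        return None
    return Message(
        channel="telegram",
        sender=str(msg["from"]["id"]),
        text=text.strip(),
        received_at=datetime.fromtimestamp(msg["date"], tz=timezone.utc),
    )
```

Each connector would have its own small mapping function like this, so everything after that point only ever deals with `Message`.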

🔌 Why this matters

This is not just “adding integrations”.

This is the first real step towards:

an AI system that exists beyond a single interface

Now OpenBlob is:

not just UI-bound
not just voice-bound
not just desktop-bound

It becomes a distributed interface to your own system

🧩 Built for extension

Each connector implements the same interface:

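class MyConnector(BlobConnector):
    # Turn a raw platform event into a normalized Message (or None)
    async def receive_message(self, raw) -> Message | None: ...
    # Send the reply for the given original message
    async def send_response(self, original: Message, response: str) -> None: ...
    # Start the connector
    async def start(self) -> None: ...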

So adding a new platform mostly comes down to implementing these three methods. Possible candidates:

WhatsApp
Matrix
iMessage (maybe 👀)
anything with an API
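
To make the interface a bit more concrete, here is a toy connector that reads from an in-memory list instead of a real platform. `BlobConnector` and `Message` are redefined as minimal stand-ins so the snippet runs on its own. `InMemoryConnector`, its constructor arguments, and the `shout` handler are hypothetical and only exist for this example.

```python
from __future__ import annotations

import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Message:                      # stand-in for the real Message type
    channel: str
    sender: str
    text: str


class BlobConnector(ABC):           # stand-in mirroring the three methods above
    @abstractmethod
    async def receive_message(self, raw) -> Message | None: ...
    @abstractmethod
    async def send_response(self, original: Message, response: str) -> None: ...
    @abstractmethod
    async def start(self) -> None: ...


class InMemoryConnector(BlobConnector):
    """Toy connector: processes raw dicts from a list instead of a network API."""

    def __init__(self, inbox: list[dict], handler):
        self.inbox = inbox          # pretend these events arrived from a platform
        self.handler = handler      # async callable: Message -> reply text
        self.sent: list[tuple[str, str]] = []

    async def receive_message(self, raw) -> Message | None:
        text = (raw.get("text") or "").strip()
        if not text:
            return None             # ignore empty or non-text events
        return Message("memory", str(raw.get("from", "unknown")), text)

    async def send_response(self, original: Message, response: str) -> None:
        self.sent.append((original.sender, response))

    async def start(self) -> None:
        for raw in self.inbox:
            message = await self.receive_message(raw)
            if message is not None:
                await self.send_response(message, await self.handler(message))


async def shout(message: Message) -> str:
    return message.text.upper()


connector = InMemoryConnector([{"from": "alice", "text": "hello"}, {"text": ""}], shout)
asyncio.run(connector.start())
print(connector.sent)  # [('alice', 'HELLO')]
```

A real connector would replace the list with a polling loop or a webhook, but the three methods would keep the same shape.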

🔒 Still local-first

Important:

runs on your machine
uses your local models (Ollama)
no required cloud backend
transparent behavior

If OpenBlob is offline:

→ it automatically falls back to local reasoning
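
The fallback idea can be sketched in a few lines. This is a simplified illustration, not the real Blob Connectors code: the two backends are passed in as plain callables, and `offline_openblob` and `fake_ollama` are made-up stand-ins. A real HTTP client raises its own exception types, which would need to be caught as well.

```python
from __future__ import annotations

from typing import Callable

Backend = Callable[[str], str]


def answer(text: str, openblob: Backend, ollama: Backend) -> tuple[str, str]:
    """Return (backend_used, reply), preferring OpenBlob and falling back to Ollama."""
    try:
        return "openblob", openblob(text)
    except (ConnectionError, TimeoutError):
        # OpenBlob is not reachable, so let the local model answer instead
        return "ollama", ollama(text)


def offline_openblob(text: str) -> str:
    # Simulates OpenBlob not running on localhost
    raise ConnectionRefusedError("OpenBlob is not running")


def fake_ollama(text: str) -> str:
    return f"(local model) {text}"


print(answer("what time is it?", offline_openblob, fake_ollama))
# ('ollama', '(local model) what time is it?')
```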

🚧 Current state

works across all channels
still early
structure is stabilizing
lots of room for improvement

🔮 What this unlocks next

This connector layer enables things like:

shared memory across all channels
persistent conversations
multi-agent systems
calendar / tool integrations
real remote control of your system

🤝 If you want to build with me

This is probably the best moment to jump in.

You can:

build new connectors
improve routing / memory
design better UX
experiment with AI behaviors

👉 https://github.com/southy404/openblob

💡 Final thoughts

This is mainly an infrastructure update.

By introducing a connector layer and a normalized message interface, OpenBlob becomes:

easier to extend
easier to integrate
less tied to a single UI

It’s a small surface change — but a significant internal shift.

Gemini Footprint Tracker — See the Real Cost of Every AI Prompt

southy404 — Sat, 18 Apr 2026 08:03:11 +0000

This is a submission for Weekend Challenge: Earth Day Edition

What I Built

Every time you send a message to an AI, it consumes water and energy and emits CO₂. Most people have no idea how much. Gemini Footprint Tracker makes that cost visible: in real time, per request, with full transparency about how the numbers are calculated.

You bring your own Gemini API key, pick a model, and start chatting. After every response, the tracker shows how much water and CO₂ that exchange cost, scaled by token count and a per-model multiplier. A community panel aggregates anonymous footprint data from all users via Supabase, so you can see the collective impact grow in real time.

Important: this is an awareness and transparency project, not an official measurement tool. The estimates are based on Google's publicly published baseline for a median Gemini Apps text prompt, combined with transparent app-side scaling logic. Every assumption is documented — what comes from Google, what is estimated, and where the model falls short. The /learn page inside the app explains the full methodology.

The goal is simple: make something invisible a little more visible.

Demo

🔗 Live: gemini-footprint-tracker.vercel.app

You'll need a free Google AI Studio API key to send messages. The key stays in your browser and is never sent anywhere except directly to Gemini.

Code

southy404 / gemini-footprint-tracker

🌍 Gemini Footprint Tracker

An awareness project that makes the environmental cost of AI visible — tracking water, CO₂, and energy usage per Gemini API request in real time.

Built for the DEV Earth Day Challenge 2026.

→ Live Demo

What it does

Every prompt you send to Gemini uses water and energy and emits CO₂. This tracker uses Gemini's usage metadata (token counts) combined with Google's official published baseline values to estimate the environmental footprint of each request, and aggregates it anonymously across all users via Supabase.

💧 Water consumption per request (mL)
☁️ CO₂ emissions per request (gCO₂e)
⚡ Token-based scaling per model (Flash-Lite / Flash / Pro)
📊 Community stats across all sessions
🔒 Your API key stays local — never sent anywhere except directly to Gemini

Stack

Framework	React 19 + TypeScript + Vite
Styling	Tailwind CSS v4
Animation	Framer Motion
Backend	Supabase (anonymous footprint

…


How I Built It

Stack: React 19 + TypeScript + Vite, Tailwind CSS v4, Framer Motion, Supabase, Gemini API

The estimation model

Google publicly reports that a median Gemini Apps text prompt uses 0.26 mL of water, emits 0.03 gCO₂e, and consumes 0.24 Wh of energy. Those are the only official figures available. From there I built a token-based scaling model:

WeightedTokens  = PromptTokens + ResponseTokens × 3.5
TokenScale      = max(0.2, WeightedTokens / 775)
WaterEstimate   = 0.26 × TokenScale × ModelMultiplier
CO₂Estimate     = 0.03 × TokenScale × ModelMultiplier

The 3.5× output weight reflects that autoregressive decoding is significantly more compute-intensive than input prefill. The reference prompt (250 input + 150 output tokens) works out to 250 + 150 × 3.5 = 775 weighted tokens, which is where the divisor in TokenScale comes from. That reference prompt and the model multipliers (Flash-Lite: 0.85×, Flash: 1.0×, Pro: 1.35×) are documented approximations, not official Google values. The /learn page inside the app makes this separation explicit: what is official, what is estimated, and where the numbers can't be trusted.
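
Here is the same model as a small Python sketch, mainly to show how the numbers behave. The app itself is written in TypeScript, so this is a re-statement of the formulas above rather than the app's code. The constant names and the model keys (`"flash-lite"`, `"flash"`, `"pro"`) are my own labels for this example.

```python
BASELINE_WATER_ML = 0.26          # Google's published median per text prompt
BASELINE_CO2_G = 0.03             # Google's published median per text prompt
OUTPUT_WEIGHT = 3.5               # app-side assumption: output tokens cost more
REFERENCE_WEIGHTED_TOKENS = 775   # 250 input + 150 output * 3.5
MIN_SCALE = 0.2                   # floor so tiny prompts never scale to zero
MODEL_MULTIPLIER = {"flash-lite": 0.85, "flash": 1.0, "pro": 1.35}


def estimate_footprint(prompt_tokens: int, response_tokens: int, model: str):
    """Return (water_ml, co2_g) for one request using the scaling model above."""
    weighted = prompt_tokens + response_tokens * OUTPUT_WEIGHT
    scale = max(MIN_SCALE, weighted / REFERENCE_WEIGHTED_TOKENS)
    factor = scale * MODEL_MULTIPLIER[model]
    return BASELINE_WATER_ML * factor, BASELINE_CO2_G * factor


# The reference prompt on Flash reproduces Google's baseline exactly
print(estimate_footprint(250, 150, "flash"))  # (0.26, 0.03)
```

A request twice the weighted size of the reference prompt on Pro (500 input + 300 output tokens) comes out at roughly 0.70 mL of water and 0.08 gCO₂e, while a very short exchange is clamped by the 0.2 floor.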

Community stats

Each request anonymously logs water and CO₂ to Supabase. The topbar shows live community totals — water consumed, CO₂ emitted, unique users tracked. The numbers update in real time across all sessions.

UX decisions

The interface is intentionally built to feel like a normal AI chat — familiar composer, clean response layout, no dashboard clutter. That was a deliberate choice: AI resource usage is a topic that matters for everyone who uses these tools, not just people who go looking for environmental data. If it looks like a tracker, most people close it. If it looks like a chat, they stay.

The footprint numbers appear quietly after each response — present, but not in your face. The community stats in the topbar give a sense of collective scale without being alarming. Transparency about estimates is built into the UI from the start: the helper text, the suggestion chips, and the /learn page all reinforce that these are informed approximations, not ground truth.

Other decisions:

API key stored in localStorage only, never transmitted anywhere except directly to Gemini
Voice input via Web Speech API
Animated transition between hero and chat state using Framer Motion's layoutId
Mobile-responsive throughout, including the KaTeX methodology page
Earth background video from NASA-Imagery via Pixabay

Prize Categories

Best use of Google Gemini — The entire app is built around the Gemini API. Every message goes through generateContent, and the response's usageMetadata — prompt and candidate token counts — directly drives the footprint calculation. The model selector supports gemini-2.5-flash-lite, gemini-2.5-flash, and gemini-2.5-pro, each with a distinct environmental multiplier. Gemini isn't a feature bolted on — it's the thing being measured.

OpenBlob is evolving: better architecture, modern UI, and real-time transcripts

southy404 — Wed, 15 Apr 2026 16:16:29 +0000

Over the past few days, OpenBlob changed a lot.

Not just visually — but fundamentally.

This is a proper progress update on where things are heading 👇

🧠 Quick recap

OpenBlob is a local-first desktop AI companion that:

lives on your desktop
understands your context
can see your screen (via vision models)
reacts in real-time
executes actions directly on your system

👉 Repo: https://github.com/southy404/openblob

🔧 Rebuilding the core (this was the big one)

The biggest update isn’t something you see. It’s how everything works underneath. OpenBlob now has a much cleaner and more scalable structure:

Core pipeline

input (voice / text / screen)
→ intent detection
→ command router
→ execution (local first)
→ AI fallback if needed

What changed

Clear separation of responsibilities
Proper command routing system
Modular capabilities instead of chaos
Easier to extend without breaking everything

This turns OpenBlob into something bigger than a chatbot: a runtime layer for your desktop.
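
To give a feel for the pipeline, here is a heavily simplified sketch in Python. It is not OpenBlob's implementation. `CommandRouter`, the prefix-based intent detection, and the example handlers are all assumptions for illustration. The only point is the order: try a local command first, and hand the input to the model only when nothing matches.

```python
from __future__ import annotations

from typing import Callable

Handler = Callable[[str], str]


class CommandRouter:
    def __init__(self, ai_fallback: Handler):
        self._handlers: dict[str, Handler] = {}
        self._ai_fallback = ai_fallback

    def register(self, intent: str, handler: Handler) -> None:
        self._handlers[intent] = handler

    def detect_intent(self, text: str) -> tuple[str, str] | None:
        """Naive prefix match on lowercased input; returns (intent, argument)."""
        lowered = text.lower().strip()
        for intent in self._handlers:
            if lowered.startswith(intent + " "):
                return intent, lowered[len(intent) + 1:]
        return None

    def handle(self, text: str) -> str:
        match = self.detect_intent(text)
        if match is None:
            return self._ai_fallback(text)        # nothing local matched
        intent, argument = match
        return self._handlers[intent](argument)   # local execution comes first


router = CommandRouter(ai_fallback=lambda text: f"[model] {text}")
router.register("open", lambda app: f"opening {app}")

print(router.handle("Open Spotify"))          # opening spotify
print(router.handle("why is the sky blue?"))  # [model] why is the sky blue?
```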

🧩 Open-source friendly structure

One goal became very clear: this needs to be hackable. So the architecture is moving towards a module system like this:

📁 modules/
↳ 📁 discord/
↳ 📁 spotify/
↳ 📁 browser/
↳ 📁 system/

Each module:

exposes commands
runs locally
can be extended independently

This makes it much easier to:

build plugins
integrate APIs
experiment without touching the core
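
One possible shape for such a module contract, sketched in Python: each module has a name and exposes a dictionary of commands, and the core only builds a registry from whatever modules are loaded. The `Module` protocol, the `name.command` key format, and the two example modules are hypothetical and not taken from the OpenBlob codebase.

```python
from __future__ import annotations

from typing import Callable, Protocol

Command = Callable[[str], str]


class Module(Protocol):
    name: str

    def commands(self) -> dict[str, Command]: ...


class SpotifyModule:
    name = "spotify"

    def commands(self) -> dict[str, Command]:
        return {"play": lambda query: f"playing {query}", "pause": lambda _: "paused"}


class SystemModule:
    name = "system"

    def commands(self) -> dict[str, Command]:
        return {"volume": lambda level: f"volume set to {level}"}


def build_registry(modules: list[Module]) -> dict[str, Command]:
    """Collect every module's commands under namespaced keys like 'spotify.play'."""
    registry: dict[str, Command] = {}
    for module in modules:
        for command, handler in module.commands().items():
            key = f"{module.name}.{command}"
            if key in registry:
                raise ValueError(f"duplicate command: {key}")
            registry[key] = handler
    return registry


registry = build_registry([SpotifyModule(), SystemModule()])
print(sorted(registry))                  # ['spotify.pause', 'spotify.play', 'system.volume']
print(registry["spotify.play"]("lofi"))  # playing lofi
```

Because the core only sees the registry, a new module can be added or removed without touching the routing code.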

🎨 New UI (cleaner, faster, more alive)

The UI got a big upgrade:

Floating bubble interface
Glassmorphism style
Smoother, more organic animations
Faster interaction

Interaction now feels like:

CTRL + SPACE → instant open
Global voice toggle
Minimal friction

Less “tool”. More presence.

💬 NEW: Just Chatting mode

Sometimes you don’t want commands. You just want to talk. So OpenBlob now has a Just Chatting mode:

Pure conversation with your AI companion
No command routing
No execution layer
Just dialogue

This matters because the companion shouldn’t only do things. It should also be there.

Use cases:

Thinking out loud
Asking questions
Casual conversation
Testing personality / tone

🖼 Screenshot assistant (more usable now)

The screen pipeline is getting more solid:

screenshot
→ OCR
→ context extraction
→ reasoning
→ answer

Already useful for:

Debugging
UI understanding
Games
Quick research

Still improving — but getting reliable.

🎙️ NEW: real-time transcript system

This is one of the biggest new additions. OpenBlob can now:

Listen to system audio
Listen to microphone input
Generate live transcripts
Store structured sessions

Pipeline

audio (system / mic)
→ transcription
→ segmented timeline
→ structured session
→ saved as text
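
The last three steps of that pipeline can be pictured with a small data model. This is only a sketch under my own assumptions: the `Segment` and `Session` classes, their fields, and the `[mm:ss]` text layout are invented for this example and say nothing about how OpenBlob stores transcripts internally.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Segment:
    start: float   # seconds since the session started
    end: float
    source: str    # "system" or "mic"
    text: str


@dataclass
class Session:
    title: str
    segments: list[Segment] = field(default_factory=list)

    def add(self, segment: Segment) -> None:
        self.segments.append(segment)

    def to_text(self) -> str:
        """Render the session as plain text, one line per segment in time order."""
        lines = [self.title]
        for seg in sorted(self.segments, key=lambda s: s.start):
            minutes, seconds = divmod(int(seg.start), 60)
            lines.append(f"[{minutes:02d}:{seconds:02d}] ({seg.source}) {seg.text}")
        return "\n".join(lines)


session = Session("Standup")
session.add(Segment(65.2, 70.0, "mic", "Second point"))
session.add(Segment(0.0, 4.5, "system", "First point"))
print(session.to_text())
# Standup
# [00:00] (system) First point
# [01:05] (mic) Second point
```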

What it already works for

Meetings (Meet, Zoom, etc.)
YouTube / podcasts
Lectures
General audio capture

🧪 Current prototype

Live text appearing in real-time
Segmented transcript blocks
Session tracking
Simple overlay UI

It’s still early. But it works.

🔮 Where transcripts are going

This is not just speech-to-text. Next steps:

📝 Meeting assistant

Summaries
Key points
Action items

🧠 Memory layer

Link transcripts to context
Searchable history

⚡ Real-time help

Explain while listening
Highlight important info
Suggest responses

⚡ Philosophy (still the same)

Local-first
Context > Prompt
System-level AI
Playful + useful

🧪 Current state

Still experimental
Still buggy sometimes
Evolving very fast

But now the structure is much better, the direction is clearer, and contributing is easier.

🤝 If you want to join

Now is actually a great time. You can:

Build modules (Discord, Spotify, browser, etc.)
Improve transcription
Design UI
Experiment with AI

👉 Join here: https://github.com/southy404/openblob

💡 Final thought

I’m starting to believe the future of AI is not a chat window in a browser.

But something that lives on your system, understands your context, and can both act and talk.

OpenBlob is slowly getting there.

I’m building a local AI desktop companion that sees your screen — and you can help shape it

southy404 — Thu, 09 Apr 2026 17:57:15 +0000

Most AI tools feel disconnected.

They don’t see your screen.
They don’t understand what you're doing.

So I built one that does.

Meet OpenBlob

An open-source, local-first desktop AI companion for Windows that doesn’t just respond — it lives on your desktop.

👉 GitHub: https://github.com/southy404/openblob

It can:

understand what app you’re using
analyze screenshots
help inside games, apps, and browsers
react visually with an animated companion
and yes… even play hide and seek with you

The problem with current AI assistants

Most tools today are:

cloud-dependent
context-blind
static
not fun to use

They don’t feel like part of your system.

🧠 It understands context

OpenBlob looks at:

active window
app name
window title

So if you’re in a game, it knows.
If you're debugging, it adapts.

This is where things start to feel different.

🖼 It can see your screen

You can take a screenshot and it will:

extract visible text
detect what you're looking at
generate a real search query
explain what's going on

Screenshot → OCR → context → reasoning → answer

Still a bit rough — but already very usable.

🎮 It actually helps inside games

Instead of:

alt-tab → google → guess

You can:

screenshot
let it detect the game
get a real answer

This alone changes how you play.

🤖 Multi-model AI (local-first)

Runs via Ollama with:

text models
vision models
fallback system

No cloud required.

🎨 It feels alive

The companion:

has moods (idle, thinking, love, sleepy)
reacts to interaction
can be “petted”
dances when music is playing

Small details, big difference.

🎮 The weird part (my favorite)

Hide and Seek mode

You can literally say:

“let’s play hide and seek”

And it will:

hide somewhere on your screen
peek occasionally
wait until you find it

Sounds dumb.

Feels surprisingly real.

⚡ New UI (WIP)

CTRL + SPACE to open
floating companion
instant interaction

Inspired by tools like Raycast / Arc — but alive.

⚠️ still slightly buggy

🧪 Screenshot assistant (work in progress)

fast snipping
instant processing
contextual answers

Works — but not perfect yet.

Why open source?

Because this shouldn’t belong to one company.

This kind of system should be:

transparent
hackable
community-built

Philosophy

local-first
context > prompt
playful + useful
build in public

Current state

Early stage.

evolving fast
sometimes buggy
lots of experiments

If you want to join

This project is wide open.

You can:

contribute features
improve UI
experiment with AI
build plugins

👉 https://github.com/southy404/openblob

Final thought

I don’t think the future of AI is chat.

I think it’s something that:

lives with you, understands your environment, and evolves

That’s what I’m trying to build.

I built a CAPTCHA that never lets you leave

southy404 — Sat, 04 Apr 2026 18:20:10 +0000

This is a submission for the DEV April Fools Challenge

What I Built

I built a fake CAPTCHA game called I'm Not a Robot.

It starts like a normal human verification flow:

click the checkbox
solve the image challenge
verify and move on with your life

Except it never really lets you move on.

The main joke is based on one of the most annoying real CAPTCHA experiences: you click all the correct image tiles, and then more tiles keep loading. Sometimes the new tile also contains the thing you were supposed to click. Sometimes it does not. Sometimes you think you are finally done, but the system decides you are absolutely not done.

So I turned that tiny moment of internet frustration into the entire product.

The project is intentionally useless, mildly hostile, and completely committed to wasting your time in the most familiar way possible.

Demo

Live demo: CodePen demo

Try it yourself and see how long it takes before the CAPTCHA starts feeling personal.

Code

The whole project is built as a lightweight front-end-only prototype and hosted on CodePen.

CodePen: View the code here

How I Built It

I wanted it to feel recognizable first and ridiculous second.

So instead of making it look overly stylized or futuristic, I designed it to resemble the familiar CAPTCHA flow people already know:

a simple checkbox start
a blue challenge header
a 3x3 image grid
a verify button
repeated image replacement after selecting the correct tiles

From there, I made the interaction slowly become absurd.

Tech used

HTML
CSS
Vanilla JavaScript
CodePen for hosting and sharing

The core idea

The most important interaction in the whole project is this:

When you click a correct tile, it does not just stay solved.

It gets replaced with a new tile immediately, just like those real image CAPTCHAs that seem determined to test your patience instead of your humanity.

That replacement loop is the joke.

To make it feel a little more believable, I built it so that:

only the clicked tile gets replaced
some replacement tiles contain another hydrant
some replacement tiles do not
the prompt slowly becomes more absurd over time
the challenge keeps pretending you are almost done
the final screen punishes you for sticking with it
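
The replacement rules above can be sketched in a few lines of vanilla JavaScript. This is a minimal illustration, not the actual CodePen source: the function names, the tile shape, and the `HYDRANT_CHANCE` value are all assumptions made up for the example.

```javascript
// Hypothetical sketch of the "replace only the clicked tile" loop.
// Names and constants are illustrative assumptions, not taken from the project.

const GRID_SIZE = 9;         // 3x3 grid, as described in the post
const HYDRANT_CHANCE = 0.4;  // assumed chance that a new tile shows another hydrant

// Build one tile. `rng` is injectable so the behaviour can be tested deterministically.
function makeTile(rng = Math.random) {
  return { hasHydrant: rng() < HYDRANT_CHANCE };
}

function createGrid(rng = Math.random) {
  return Array.from({ length: GRID_SIZE }, () => makeTile(rng));
}

// Clicking a correct tile swaps in a fresh tile at that index and nothing else.
// Clicking a wrong tile leaves the grid untouched.
function clickTile(grid, index, rng = Math.random) {
  if (!grid[index].hasHydrant) return grid;
  return grid.map((tile, i) => (i === index ? makeTile(rng) : tile));
}

// "Verify" can only pass when this reaches zero, which the replacement
// loop keeps postponing.
function remainingHydrants(grid) {
  return grid.filter((tile) => tile.hasHydrant).length;
}
```

Because each replacement may or may not contain another hydrant, the number of remaining hydrants drifts up and down instead of steadily shrinking, which is what makes the fake progress feel familiar.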

I also created pseudo-photo tile images directly in code so the project stays self-contained and easy to run without external assets.
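
One way to generate tiles in code is to build a small SVG string and turn it into a data URI. The post does not say which technique the project uses (canvas drawing would work as well), so treat this as an assumed approach; `makeTileImage` and its colors are hypothetical.

```javascript
// Hypothetical helper: builds a rough "street photo" tile as an SVG data URI,
// so no external image files are needed.
function makeTileImage({ hasHydrant, size = 120 }) {
  const sky = `<rect width="${size}" height="${size * 0.6}" fill="#9cc4e4"/>`;
  const ground = `<rect y="${size * 0.6}" width="${size}" height="${size * 0.4}" fill="#7a7a72"/>`;
  // The hydrant is just a rounded red rectangle standing on the ground.
  const hydrant = hasHydrant
    ? `<rect x="${size * 0.42}" y="${size * 0.45}" width="${size * 0.16}" height="${size * 0.3}" rx="4" fill="#c62828"/>`
    : "";
  const svg =
    `<svg xmlns="http://www.w3.org/2000/svg" width="${size}" height="${size}">` +
    `${sky}${ground}${hydrant}</svg>`;
  return `data:image/svg+xml;charset=utf-8,${encodeURIComponent(svg)}`;
}
```

In a browser, the returned string can be assigned directly to an image element, for example `img.src = makeTileImage({ hasHydrant: true })`.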

Prize Category

I’m mainly submitting this for Best Ode to Larry Masinter and hopefully also Community Favorite.

Why Best Ode to Larry Masinter:

it is intentionally useless
it turns a familiar internet standard-ish experience into something absurd
it fully commits to the bit
it feels like the kind of thing nobody needed, but the internet somehow deserved

Why Community Favorite:

the joke is immediate
the frustration is universal
almost everyone has suffered through an image CAPTCHA before
it is very easy to understand, click, and share

Final Thoughts

I liked the idea of building something that feels normal for about five seconds and then slowly reveals that it exists only to trap you in an endless loop of fake progress.

That felt extremely appropriate for an April Fools challenge.

If the best useless software is software that technically works while emotionally making things worse, then I think this qualifies.

Thanks for reading, and good luck proving you are human.

🚀 I built a Chrome Extension to manage AI prompts properly (Prompt Vault)

southy404 — Mon, 30 Mar 2026 11:31:43 +0000

If you work with tools like ChatGPT, Claude, Gemini, or Midjourney every day, you've probably run into the same problem I did:

👉 Your best prompts are scattered everywhere.
Notes. Docs. Random chats. Lost forever.

So I built something simple — but actually useful.

🔐 Introducing Prompt Vault

👉 https://chromewebstore.google.com/detail/prompt-vault/njpfhfjoofkflbkfepckeepojbmfmocm

Prompt Vault is a lightweight Chrome extension to save, organize, search, and instantly reuse your AI prompts — without friction, without clutter, and without relying on external tools.

💡 Why I built this

I kept rewriting the same prompts over and over again.

Or worse:
I knew I had a perfect prompt somewhere… but couldn’t find it when I needed it.

Most tools out there felt:

Overcomplicated
Slow
Or required accounts / cloud sync

I wanted something:
👉 Fast
👉 Local
👉 Reliable

So I built it myself.

⚡ Core Features

🔒 Failsafe 1-Click Copy

Clipboard copy just works.
No silent failures — it uses 3 fallback methods to guarantee success.
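
A fallback chain like this can be sketched as a loop over copy strategies. The post does not name the three methods, so the two browser strategies below (the async Clipboard API and the older `document.execCommand("copy")`) are assumptions, and `copyWithFallbacks` is a hypothetical name rather than the extension's real code.

```javascript
// Tries each copy strategy in order and reports which one worked.
// `strategies` is an array of { name, run }, where run(text) may throw or reject.
async function copyWithFallbacks(text, strategies) {
  const errors = [];
  for (const { name, run } of strategies) {
    try {
      await run(text);
      return { ok: true, method: name };
    } catch (err) {
      // Record the failure instead of swallowing it, then try the next method.
      errors.push(`${name}: ${String((err && err.message) || err)}`);
    }
  }
  return { ok: false, errors };
}

// Assumed browser strategies; only evaluated when actually called in a page.
const browserStrategies = [
  {
    name: "clipboard-api",
    run: (text) => navigator.clipboard.writeText(text),
  },
  {
    name: "execCommand",
    run: (text) => {
      const textarea = document.createElement("textarea");
      textarea.value = text;
      document.body.appendChild(textarea);
      textarea.select();
      const copied = document.execCommand("copy");
      textarea.remove();
      if (!copied) throw new Error("execCommand returned false");
    },
  },
];
```

Returning the list of errors when every strategy fails is what lets the UI show a clear message instead of failing silently.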

🏷️ Smart Tags & Filtering

Organize your prompts with custom tags and instantly filter them.

No more chaos. Just structure.

🔍 Live Search (with Highlights)

Search across:

Title
Content
Tags

Results update in real time and highlight matches.
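
As a rough illustration of searching across the three fields and highlighting matches, here is a minimal sketch. The prompt shape (`title`, `content`, `tags`) and the function names are assumptions for the example, not the extension's actual data model.

```javascript
// Escape characters that have a special meaning in regular expressions.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Escape text before it is inserted into HTML.
function escapeHtml(s) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return s.replace(/[&<>"']/g, (c) => map[c]);
}

// Case-insensitive match against title, content, and every tag.
function searchPrompts(prompts, query) {
  const q = query.trim().toLowerCase();
  if (!q) return prompts;
  return prompts.filter((p) =>
    [p.title, p.content, ...p.tags].some((field) => field.toLowerCase().includes(q))
  );
}

// Wrap each match in <mark>, escaping everything else.
function highlight(text, query) {
  const q = query.trim();
  if (!q) return escapeHtml(text);
  const re = new RegExp(`(${escapeRegExp(q)})`, "gi");
  // split() with a capture group puts the matches at the odd indices.
  return text
    .split(re)
    .map((part, i) => (i % 2 === 1 ? `<mark>${escapeHtml(part)}</mark>` : escapeHtml(part)))
    .join("");
}
```

Calling `searchPrompts` from an `input` event handler on the search box would give the live-update behaviour; escaping before highlighting keeps prompt text that contains HTML from being rendered as markup.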

📊 Flexible Sorting

Everyone thinks differently.

Sort your prompts by:

Most Recent
A–Z
Most Used
Tags
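
The four sort modes map naturally onto comparator functions. This is a sketch under assumptions: the field names `updatedAt`, `useCount`, and `tags` are hypothetical, and sorting by tag here simply compares each prompt's first tag.

```javascript
// One comparator per sort mode. Field names are illustrative assumptions.
const sorters = {
  recent: (a, b) => b.updatedAt - a.updatedAt,           // newest first
  alpha: (a, b) => a.title.localeCompare(b.title),       // A to Z
  mostUsed: (a, b) => b.useCount - a.useCount,           // highest use count first
  tags: (a, b) => (a.tags[0] ?? "").localeCompare(b.tags[0] ?? ""),
};

// Returns a sorted copy so the stored list is never mutated.
function sortPrompts(prompts, mode) {
  const compare = sorters[mode];
  if (!compare) throw new Error(`Unknown sort mode: ${mode}`);
  return [...prompts].sort(compare);
}
```

Incrementing `useCount` on every copy is also enough to drive the usage tracking described further down.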

📤 JSON Import / Export

Your data is yours.

Back up everything
Share prompt packs
Move between devices
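
A minimal sketch of what export and import could look like. The file layout (a `version` field plus a `prompts` array) and the validation rules are assumptions for illustration; the real export format may differ.

```javascript
const EXPORT_VERSION = 1; // assumed format version, not taken from the extension

// Serialize prompts into a human-readable JSON string.
function exportPrompts(prompts) {
  return JSON.stringify({ version: EXPORT_VERSION, prompts }, null, 2);
}

// Parse an exported pack and keep only well-formed prompts.
// JSON.parse throws on malformed input, so callers should wrap this in try/catch.
function importPrompts(json) {
  const data = JSON.parse(json);
  if (!data || !Array.isArray(data.prompts)) {
    throw new Error("Invalid prompt pack: missing prompts array");
  }
  return data.prompts
    .filter((p) => p && typeof p.title === "string" && typeof p.content === "string")
    .map((p) => ({
      title: p.title,
      content: p.content,
      tags: Array.isArray(p.tags) ? p.tags.filter((t) => typeof t === "string") : [],
      useCount: Number.isInteger(p.useCount) ? p.useCount : 0,
    }));
}
```

Validating on import matters because shared prompt packs come from other people's machines and may be edited by hand.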

📈 Usage Tracking

See which prompts you actually use.

Optimize your workflow based on real usage — not guesswork.

🌙 Dark & Light Mode

Clean dark UI by default.
Switch anytime — preference is saved.

⌨️ Keyboard Shortcuts (for power users)

Ctrl + N → New prompt
Ctrl + F → Search
Esc → Close

Fast. Minimal. No mouse needed.

🧠 Who this is for

Heavy ChatGPT / Claude / Gemini users
Prompt engineers & AI devs
Writers, marketers, SEO people
Anyone tired of repeating the same instructions

🛡️ Privacy First

This was non-negotiable.

✔ 100% local storage (Chrome storage)
✔ No accounts
✔ No tracking
✔ No servers
✔ No ads

Your prompts never leave your machine.

📦 Lightweight by design

No bloat
No subscriptions
No setup

Install → click → start saving prompts.

🔥 Try it

👉 https://chromewebstore.google.com/detail/prompt-vault/njpfhfjoofkflbkfepckeepojbmfmocm

Thanks for reading 🙌
