Aman Sachan

Posted on Jun 8 • Originally published at github.com

I built a 81-tool, fully local AI desktop assistant with PySide6 and Ollama (here is the architecture)

#python #ai #ollama #opensource

Why a desktop app, not another chat UI

VS Code gives you an editor. Cursor gives you an editor + chat bolted on. Sentience is a different shape of animal: a native desktop window where the AI is the primary surface, 81 first-class tools hang off a ReAct loop, and the whole thing runs offline against Ollama with no telemetry, no extension marketplace, and no monthly bill.

It is about 6,200 lines of Python. The license is MIT. The window opens in under a second on a cold start.

The shape: a window, an agent, and a tool registry

[Sentience window (PySide6)]
  - Sidebar (file tree)
  - Editor (QScintilla, syntax highlight)
  - Chat panel (ReAct loop entry point)
  - Embedded terminal (QTermWidget subprocess)
        |
        v
  jarvis_agent.py
    - tool_registry[81 entries]
    - provider (Ollama / OpenAI / Anthropic / Groq)
    - memory (SQLite + sqlite-vec)

Three widgets in the main window: a file tree, a code editor, and a chat panel. The chat panel is the entry point to the agent. Below them, an embedded terminal you can shell into; the agent uses the same terminal internally when it runs shell-scoped tools.

The ReAct loop with strict tool schemas

The heart of the agent is a Reason -> Act -> Observe loop, but with a twist: every tool is a Pydantic model with a JSON schema the model has to fill. No free-form string parsing.

class ToolRegistry:
    def __init__(self):
        self.tools: Dict[str, Tool] = {}
        self.schemas: Dict[str, dict] = {}

    def register(self, name: str, fn: Callable, schema: type[BaseModel]):
        self.tools[name] = fn
        self.schemas[name] = schema.model_json_schema()

class ReadFileTool(BaseModel):
    path: str = Field(..., description="Absolute path to file")
    start_line: int = Field(0, ge=0)
    end_line: int = Field(-1, ge=-1)

@registry.register("read_file", ReadFileTool)
def read_file(path: str, start_line: int = 0, end_line: int = -1) -> str:
    p = Path(path)
    if not p.is_file():
        return f"ERROR: {path} not found"
    lines = p.read_text().splitlines()
    return "\n".join(lines[start_line: end_line or None])

The model is told about all 81 tools in a single system message. On each turn it must return a JSON object like {"thought": "...", "tool": "read_file", "args": {...}} or {"thought": "...", "final_answer": "..."}. We parse with pydantic.ValidationError caught and re-prompted, so malformed tool calls never crash the loop.

Multi-provider without a rewrite

The same loop works against Ollama, OpenAI, Anthropic, or Groq. The abstraction is one method:

class Provider(Protocol):
    def chat(self, messages: list, tools: list[dict]) -> ModelResponse: ...

class OllamaProvider:
    def chat(self, messages, tools):
        r = requests.post("http://localhost:11434/api/chat",
                          json={"model": "llama3.2:3b",
                                "messages": messages,
                                "format": tools}, timeout=120)
        return ModelResponse.parse_obj(r.json())

class OpenAIProvider:
    def chat(self, messages, tools):
        return self.client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            tools=[{"type": "function",
                    "function": {"name": t["name"],
                                 "parameters": t["schema"]}} for t in tools])

Same Provider protocol, different wire format. The agent does not care which model is on the other end; the JSON tool schema is the contract.

The self-modify tool, kept honest

There is a tool literally called edit_own_source. It is dangerous, so it is gated:

@registry.register("edit_own_source", EditOwnSourceTool)
def edit_own_source(file: str, old: str, new: str, confirm: bool) -> str:
    if not confirm:
        return "REFUSED: pass confirm=True only after showing the diff to the user"
    if not file.startswith(str(ROOT / "src")):
        return f"REFUSED: {file} is outside src/"
    p = Path(file)
    backup = p.with_suffix(p.suffix + f".bak.{int(time.time())}")
    shutil.copy(p, backup)
    p.write_text(p.read_text().replace(old, new))
    return f"OK; backup at {backup}"

Two guard rails. The first forces the model to show the diff to the user before flipping confirm=True. The second refuses any path outside src/. A backup is written before any write. This is the kind of thing that needs to be loud, not silent.

Why the terminal lives inside the agent

When the agent runs a shell command, it does not spawn a hidden subprocess; it pushes the command into the embedded terminal widget. The user can see exactly what is running, can Ctrl-C it, and can scroll back. The agent reads the output as if it were any other tool.

That single design choice killed a whole category of "agent ran rm -rf in the background" bugs. Visibility first, autonomy second.

Memory: SQLite + a dumb but fast vector

Persistent memory is a SQLite table with a vec0 virtual table for embeddings. No Pinecone, no server. Embeddings are computed locally with nomic-embed-text via Ollama. The remember tool writes a row; the recall tool does a cosine top-k and returns the joined text.

Stack

Layer	Tech
Window	PySide6 (Qt 6.5+)
Editor	QScintilla with LSP over stdin
Terminal	QTermWidget wrapping bash
Agent	Custom ReAct loop, Pydantic tool schemas
Providers	Ollama, OpenAI, Anthropic, Groq
Storage	SQLite + sqlite-vec
Packaging	PyInstaller (Windows .exe, Linux AppImage)

What is next

A proper streaming mode for the chat panel (right now it is chunky), a visual diff tool so edit_own_source is reviewable in-app, and a "skills" registry; small JSON manifests that bundle 3-5 tools into a named capability the model can opt into. Cursor and Claude have skills; Sentience should too.

Repo: github.com/AmSach/sentience
Run it: pip install pyside6 openai anthropic aiohttp requests && python sentience.py

If you find a tool the agent should have, open an issue with a Pydantic schema and a one-line description. I merge PRs in 24 hours.

Python #PySide6 #LocalAI #Ollama #OpenSource #BuildInPublic

Top comments (1)

Josh Green • Jun 8

Cold start under a second with 81 tools in a ReAct loop is genuinely impressive. Most local AI setups I've tried either start fast or do a lot -- not both. What model are you running against it by default? I've had good results with qwen3.6 for tool calling but curious if smaller models can keep up with that many tool definitions in the context widnow.