DEV Community

Devansh Sharma


I’m Building a Real “Jarvis” in Python — Here’s What’s Working (and What’s Not)

Most AI tools today are just chat interfaces.

You open them → type → get a response → close them.

That’s not an assistant.

So I started building VISAR Edge — a system-level AI assistant that’s always present, not something you have to “open.”


🧠 The Core Idea

Instead of:

«User → Prompt → Response»

I’m building:

«Environment → Context → AI → Action»
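
To make that flow concrete, here is a minimal sketch of one pass through the loop. It is not VISAR Edge's actual code: `Context`, `Action`, `decide`, and `fake_model` are hypothetical names, and the model is a stand-in function so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Context:
    """Snapshot of the environment (hypothetical fields)."""
    active_window: str
    clipboard: str


@dataclass
class Action:
    """Something the assistant decides to do (hypothetical shape)."""
    name: str
    payload: str


def decide(ctx: Context, model: Callable[[str], str]) -> Optional[Action]:
    """Turn a context snapshot into an action, or None if nothing is worth doing."""
    if not ctx.clipboard.strip():
        return None  # nothing interesting in the environment right now
    prompt = f"Window: {ctx.active_window}\nClipboard: {ctx.clipboard}"
    return Action(name="suggest", payload=model(prompt))


# Stand-in for a real LLM call (assumption, not the Gemini API).
def fake_model(prompt: str) -> str:
    return f"summary of {len(prompt)} chars"


action = decide(Context("editor", "def add(a, b): return a + b"), fake_model)
idle = decide(Context("editor", ""), fake_model)
```

The key difference from a chat interface is that nothing here waits for a typed prompt: the environment snapshot is the input, and "do nothing" is a valid result.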


⚙️ Current Architecture

  • Python (core system)
  • PySide6 (UI layer experiments)
  • SpeechRecognition (voice input)
  • Gemini API (LLM)
  • System hooks (clipboard monitoring, active window tracking)
  • Chroma (vector DB), and more

🚀 What It Can Do (So Far)

  • Floating UI overlay (works across apps)
  • Voice-triggered interactions
  • Clipboard-aware responses
  • Modular feature system (plug-and-play architecture)
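
A plug-and-play feature system can be as simple as a registry that maps triggers to handlers. The sketch below is one possible shape, not the project's real design; `FEATURES`, `feature`, and `dispatch` are hypothetical names.

```python
from typing import Callable, Dict

# Registry mapping a trigger name to its handler (hypothetical design).
FEATURES: Dict[str, Callable[[str], str]] = {}


def feature(trigger: str):
    """Decorator that registers a handler under a trigger name."""
    def register(func: Callable[[str], str]) -> Callable[[str], str]:
        FEATURES[trigger] = func
        return func
    return register


@feature("clipboard")
def explain_clipboard(text: str) -> str:
    return f"Clipboard has {len(text)} characters"


@feature("voice")
def handle_voice(text: str) -> str:
    return f"Heard: {text}"


def dispatch(trigger: str, payload: str) -> str:
    """Route an input to its feature; unknown triggers return an empty string."""
    handler = FEATURES.get(trigger)
    return handler(payload) if handler else ""
```

Adding a feature then means writing one decorated function; the core loop never needs to change.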

⚠️ What’s Harder Than Expected

  1. Real-Time Context Awareness

Capturing meaningful context without overwhelming the system is tricky.

Too little → useless assistant
Too much → performance nightmare
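
One way to sit between those two extremes is a bounded buffer: keep only recent events, drop consecutive duplicates, and cap how much text reaches the model. This is a sketch under assumed limits; `ContextBuffer` and its defaults are hypothetical.

```python
from collections import deque


class ContextBuffer:
    """Keeps the most recent events within a character budget (hypothetical)."""

    def __init__(self, max_events: int = 20, max_chars: int = 2000):
        self.events = deque(maxlen=max_events)  # oldest events drop automatically
        self.max_chars = max_chars

    def add(self, source: str, text: str) -> bool:
        """Store an event; skip it if it repeats the previous event exactly."""
        event = (source, text)
        if self.events and self.events[-1] == event:
            return False
        self.events.append(event)
        return True

    def render(self) -> str:
        """Build prompt context from newest to oldest until the budget is spent."""
        lines, used = [], 0
        for source, text in reversed(self.events):
            line = f"[{source}] {text}"
            if used + len(line) > self.max_chars:
                break
            lines.append(line)
            used += len(line)
        return "\n".join(reversed(lines))  # restore chronological order
```

Both limits are tuning knobs: raising them gives the model more to work with, lowering them keeps prompts small and cheap.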


  2. Latency

Cloud APIs are powerful, but every request adds a network round trip, which feels slow for an “always-on” experience.

This is pushing me toward:

  • Local models
  • Hybrid inference (local + cloud fallback)
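
A hedged sketch of the hybrid idea: try the local model first with a short time budget, and fall back to the cloud if it is too slow or fails. The backends here are plain stand-in functions, and `hybrid_generate` and `local_timeout` are hypothetical names, not part of any real SDK.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable, Tuple

_pool = ThreadPoolExecutor(max_workers=2)


def hybrid_generate(
    prompt: str,
    local: Callable[[str], str],
    cloud: Callable[[str], str],
    local_timeout: float = 0.5,
) -> Tuple[str, str]:
    """Try the local model first; use the cloud if it is slow or fails.

    Returns (backend_name, text).
    """
    future = _pool.submit(local, prompt)
    try:
        return "local", future.result(timeout=local_timeout)
    except FutureTimeout:
        future.cancel()  # only has an effect if the call has not started yet
    except Exception:
        pass  # local backend crashed; fall through to the cloud
    return "cloud", cloud(prompt)


# Stand-in backends so the sketch runs without any model installed.
def fast_local(prompt: str) -> str:
    return "L:" + prompt


def slow_local(prompt: str) -> str:
    time.sleep(0.3)
    return "late"


def cloud_model(prompt: str) -> str:
    return "C:" + prompt
```

One caveat with this approach: a timed-out local call keeps running in its worker thread, so a real version would also need a way to cancel or cap in-flight local work.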

  3. System Design > AI

Big realization:

«This is more of a systems engineering problem than an AI problem.»

Managing:

  • State
  • Memory
  • Context flow
  • Event triggers

…is harder than calling an LLM.
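
To illustrate how state and event triggers can be kept manageable, here is a tiny synchronous publish/subscribe hub where shared state is only changed by event handlers. It is a sketch with hypothetical names (`EventBus`, the topic strings, the `state` keys), not the project's implementation.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

Handler = Callable[[Dict[str, Any]], None]


class EventBus:
    """Tiny synchronous publish/subscribe hub (hypothetical)."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, payload: Dict[str, Any]) -> int:
        """Deliver payload to every subscriber; return how many were called."""
        handlers = list(self._handlers.get(topic, []))
        for handler in handlers:
            handler(payload)
        return len(handlers)


# Shared state lives in one place and is only changed by event handlers.
state: Dict[str, Any] = {"active_window": None, "clipboard_history": []}
bus = EventBus()
bus.subscribe("window.changed", lambda e: state.update(active_window=e["title"]))
bus.subscribe("clipboard.changed", lambda e: state["clipboard_history"].append(e["text"]))

bus.publish("window.changed", {"title": "Terminal"})
bus.publish("clipboard.changed", {"text": "pip install chromadb"})
```

Because every change flows through `publish`, it becomes possible to log, replay, or throttle events in one spot instead of chasing state updates across modules.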


🧩 What I’m Exploring Next

  • Local LLM integration (7B–13B parameter range)
  • Vector memory systems
  • Event-driven architecture
  • Background processing optimization
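
For the vector memory item, the core mechanic is "store text with a vector, recall by similarity." The toy below uses bag-of-words counts and cosine similarity purely for illustration. A real system would use learned embeddings and a store such as Chroma; `embed`, `cosine`, and `Memory` are hypothetical names and do not reflect Chroma's API.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would use a learned model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Memory:
    """In-memory store that recalls the entries most similar to a query."""

    def __init__(self) -> None:
        self.items: List[Tuple[str, Counter]] = []

    def remember(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def recall(self, query: str, k: int = 1) -> List[str]:
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


mem = Memory()
mem.remember("user prefers dark theme in the editor")
mem.remember("meeting with the team at noon")
best = mem.recall("which theme does the user like")
```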

❓ Where I Need Feedback

If you’ve built anything similar, I’d love your thoughts:

  • Is Python a bad long-term choice for this?
  • Best way to handle continuous context streams?
  • How would you design memory for this kind of system?

🔗 Full Project Context

I’ve documented the vision, UI, and direction here:
👉 https://www.linkedin.com/in/devanshsharma987


💬 Final Thought

Everyone is building smarter chatbots.

I’m trying to build something you don’t have to think about using.

If this fails, it’ll fail by aiming too high — not too low.

I’d appreciate honest feedback.
