People are increasingly giving AI personal context, from journals and documents to screen recordings and audio captures from their computers. The more AI knows about you, the more powerful it can become.
But AI still has no window into our real-world activities unless we describe them manually. That feels like a limitation.
So I tried a different approach: what if I captured a full day of real life and passed it to AI?
I built a smart glasses recording system around Rokid Glasses, wore it through a normal day, and captured roughly 13 hours of footage. Then I built a pipeline to make that data browsable and queryable using Reka AI.
Why I used smart glasses
For this purpose, smart glasses felt like the most natural capture device.
Rokid Glasses have a camera, mics, a display, and speakers. They run an Android-based system, are lightweight, and feel close enough to normal glasses that I can wear them naturally all day. I already wear glasses for vision correction anyway, so this fits my life very well.
For a project like this, giving AI real-world context, the smart glasses form factor is hard to beat.
The system I built
At a high level, the system has four parts:
- Smart glasses for recording
- A smartphone for local storage and relay
- A backend for ingest, orchestration, and AI-related processing
- A web UI for browsing footage and interacting with AI
While capturing, the glasses app and phone app work together. Later, at home, the phone uploads videos to the backend for processing. After that, the data becomes usable through the web UI.
The practical challenges
Two things matter immediately: storage and battery.
Storage
The glasses do not have enough local storage for a full day of recording.
So instead of trying to keep everything on-device, the glasses app automatically splits recordings into chunks and regularly transfers them to the phone over the local network. Once a chunk has safely landed on the phone, the glasses free up the space. No public network access is needed while capturing.
The goal is information capture and AI analysis, so I prioritize smaller files over social-media-quality footage.
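The transfer logic can be sketched roughly like this. This is a minimal illustration, not the actual glasses app code; `upload` is a placeholder for whatever pushes one file to the phone over the local network:

```python
from pathlib import Path

def transfer_chunks(chunk_dir: str, upload) -> list[str]:
    """Send finished recording chunks to the phone, then free the space.

    `upload` is any callable that pushes one file to the phone (e.g. an
    HTTP POST to the phone app) and returns True on success. Chunks are
    sent oldest-first, and a chunk is only deleted once the phone has it,
    so the glasses never lose data while staying within local storage.
    """
    sent = []
    chunks = sorted(Path(chunk_dir).glob("*.mp4"),
                    key=lambda p: p.stat().st_mtime)
    for chunk in chunks:
        if upload(chunk):      # delete only after a confirmed transfer
            chunk.unlink()
            sent.append(chunk.name)
    return sent
```

The key design point is the ordering: transfer first, confirm, then delete, so a dropped connection never costs footage.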
Battery
The small glasses form factor also limits battery life, and actively using the camera makes that constraint obvious very quickly.
My solution is very simple: I pair the glasses with a neck power bank, connected via a short charging cable. That makes all-day use possible without thinking about the remaining battery. You do notice something around your neck, but it is not bothersome.
About privacy
Privacy obviously matters here.
I made recording easy to turn on and off with a single tap on the glasses, and an indicator LED shows while recording. If a situation should not be recorded, I can turn it off immediately.
This can eventually go further. Automatic shutoff, filtering, redaction, and context-aware privacy defaults would all make a lot of sense. I previously built a privacy filter for video streams, so I can integrate that here.
What recording a whole day actually felt like
I put on the glasses right after waking up and started recording. Then I mostly left them on until night.
The footage covered a pretty normal day: house chores, going to a co-working space, working, shopping, walking around, and eating.
In total, the captured recording time ended up being roughly 13 hours that day.
This setup quickly changed the feeling from "I'm collecting context" to simply living my life.
Turning raw footage into something usable
A pile of video files is not useful by itself.
I built a web-based UI to browse the captured data as a timeline, and even that alone was interesting. Seeing an entire day of POV footage laid out visually gives me a different perspective on everyday life and leads to new observations.
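The timeline view boils down to grouping clips by when they were captured. A tiny sketch of that grouping, assuming each chunk's start time is stored in its metadata (the pair format here is illustrative):

```python
from collections import defaultdict
from datetime import datetime

def bucket_clips_by_hour(clips):
    """Group clips into hourly buckets for a timeline view.

    `clips` is a list of (start_iso_timestamp, filename) pairs.
    Returns {"HH:00": [filenames...]}, ordered by start time
    within each bucket.
    """
    buckets = defaultdict(list)
    for start, name in sorted(clips):
        hour = datetime.fromisoformat(start).strftime("%H:00")
        buckets[hour].append(name)
    return dict(buckets)
```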
But a timeline is only the beginning, and this is where the fun part starts.
You do not want to manually scrub through 13 hours of your life clip by clip just to answer simple questions.
This is where multimodal AI becomes the important layer.
I used the Reka Vision API to index the recorded videos so they could be searched and queried. That changed the data from "a folder full of clips" into something much closer to a usable dataset.
Then I used the Reka Flash LLM as the chat layer. The web UI takes a natural-language question, the LLM calls Reka Vision as a tool to retrieve relevant footage, and the model generates a response.
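The shape of that chat layer is roughly the following. This is a structural sketch only: `vision_search` and `llm` are placeholders standing in for the actual Reka Vision search call and Reka Flash completion call, not real client functions:

```python
def answer_about_my_day(question: str, vision_search, llm) -> str:
    """Retrieval-style chat over indexed POV footage.

    `vision_search(query)` stands in for a Reka Vision video search and
    returns matching moments, e.g. {"time": "08:12", "summary": ...}.
    `llm(prompt)` stands in for a Reka Flash chat completion.
    The retrieved moments are packed into the prompt as evidence, so the
    model answers from the footage instead of guessing.
    """
    moments = vision_search(question)
    evidence = "\n".join(f"- {m['time']}: {m['summary']}" for m in moments)
    prompt = (
        "Answer questions about the wearer's day using moments "
        "retrieved from their POV footage.\n"
        f"Retrieved moments:\n{evidence}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return llm(prompt)
```

This is the standard tool-use pattern: the index does the finding, the LLM does the explaining.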
Asking AI about my day
On the UI, I can ask questions like:
- "What supplements did I take this morning?"
- "When was I coding the most today?"
It can retrieve things I had forgotten, or things I would never have thought to look for manually.
Instead of navigating files, I could interact with the day at the level of intent.
We are already at the point where using computers without AI feels like a disadvantage. I believe real life will move in the same direction.
What I think this points to
This project made one thing feel much more concrete to me:
AI gets much more powerful and interesting once it has access to real-world context, not just information on computers.
And smart glasses could become an ideal real-world interface between human and AI. They are always with you, can see and hear what is around you, can display information in front of you, and can speak to you.
The obvious next step is to go beyond simple chat. A few directions I want to explore:
- automatic daily journals and work logs
- productivity feedback based on what I actually did
- automatically generating a short vlog video (Reka Vision has a highlight clip generation API)
- AI that takes actions by itself, not just answers questions
Watch the demo
Also
Projects like this are part of why I am building GlassKit, an open-source dev suite for smart glasses apps. The goal is to make it much easier to build smart glasses AI apps.
This space is still early, and that is exactly why it is exciting. AI is moving incredibly fast, and I do not think it will stay confined to the computer screen for long.
Would love to hear what other people think about real-world context and AI!