
Jennifer Davis

Reflexes, Cognition, and Thought

In my previous posts, I focused on sharing the basics—making LEDs blink and understanding wiring. Today’s adventure was about expanding on what my droid will actually need to function.

The droid will have a multi-layered "brain." I’ve been working on the Reflex Layer with the Arduino Uno for prototyping. In this post, I’ll review what I’ve learned there and explore the Cognition Layer using computer vision and local AI.


The Reflex Layer: Data Types & Geometry

Before the droid can walk, it needs a way to keep track of how it moves through the world (or at least my home). I used the Arduino to test the "logic" of movement before ever plugging in a motor.

Visual Odometer

I built a Visual Odometer using 4 LEDs to represent 4 bits of a signed char. I wanted to visualize an "integer overflow." By starting the counter at 120 (near the 127 limit of a 1-byte signed integer), I could watch the moment the droid "lost its mind." The instant it tried to count past 127, the odometer wrapped around to -128.

Seeing the LEDs jump and the Serial Monitor report a negative distance was a tactile lesson: pick the right "storage box" (data type) for your sensors, or your droid will think it's traveling backward just because it reached a limit.
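Here's a minimal sketch of that experiment. I'm assuming the four LEDs sit on pins 2 through 5 (low bit on pin 2); the exact pins don't matter as long as they match your wiring.

// Visual odometer: watch a signed char overflow in real time.
// Assumes four LEDs on pins 2-5, low bit on pin 2.
signed char odometer = 120;          // start near the 127 ceiling

const int ledPins[4] = {2, 3, 4, 5};

void setup() {
  Serial.begin(9600);
  for (int i = 0; i < 4; i++) {
    pinMode(ledPins[i], OUTPUT);
  }
}

void loop() {
  // Show the low 4 bits of the counter on the LEDs
  for (int i = 0; i < 4; i++) {
    digitalWrite(ledPins[i], bitRead(odometer, i));
  }

  Serial.print("Distance: ");
  Serial.println((int)odometer);     // watch this jump from 127 to -128

  odometer++;                        // on the Uno, 127 + 1 wraps to -128
  delay(500);
}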

Simulating Motion with Light

Since I don’t have something that physically moves yet, I used a photoresistor (light sensor) to simulate "steps." Every time I flashed my phone light, the Arduino registered a movement. I also had the LED change color based on the light being detected so I could see really quickly whether my code was working.
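That status light is just a threshold check. Here's a rough sketch of the idea, assuming the photoresistor divider is on A0 (wired so brighter light reads lower) and a two-color LED on pins 9 and 10, which may not match my exact wiring:

// Quick visual check: green when a flash is detected, red otherwise.
// Assumes a photoresistor divider on A0 (brighter = lower reading)
// and a two-color LED on pins 9 and 10 -- adjust to your wiring.
const int GREEN_PIN = 9;
const int RED_PIN = 10;

void setup() {
  pinMode(GREEN_PIN, OUTPUT);
  pinMode(RED_PIN, OUTPUT);
  Serial.begin(9600);
}

void loop() {
  int sensorValue = analogRead(A0);
  bool flashDetected = (sensorValue < 400);   // same threshold as the step logic

  digitalWrite(GREEN_PIN, flashDetected ? HIGH : LOW);
  digitalWrite(RED_PIN, flashDetected ? LOW : HIGH);

  Serial.println(sensorValue);   // handy for picking the threshold
  delay(50);
}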

I used the Pythagorean Theorem ($a^2 + b^2 = h^2$) to calculate the "as-the-crow-flies" distance from the starting point. Using the Serial Plotter, I could see the $X$ and $Y$ coordinates stair-stepping while the distance tracked a smooth, calculated curve.

#include <math.h>
// ... logic to detect light pulse ...
// A bright flash (reading below 400) counts as one "step".
// "triggered" latches so a single flash isn't counted as several steps.
if (sensorValue < 400 && !triggered) { 
    xPos += 5;   // each step moves 5 units along X
    yPos += 3;   // and 3 units along Y
    // h = sqrt(x^2 + y^2): straight-line distance from the start
    hypotenuse = sqrt(pow(xPos, 2) + pow(yPos, 2));
    triggered = true;
} else if (sensorValue >= 400) {
    triggered = false;   // light is gone; ready for the next flash
}

Adding Some Motion

At this point, I figured I could just add some hardware and watch a motor spin based on the distance traveled. I was a bit surprised when I opened the servos' packaging and discovered that I didn't know what to do with them.

Close-up of a GeekServo with a 2-pin connector. A hand holds a red and black wire, highlighting the connection confusion.

I unpacked the Arduino motor shield figuring it would be obvious where everything would plug in, but nope. While the shield itself stacked on easily, the wiring wasn't straightforward.

Side profile of an Arduino Uno with an L298N motor shield stacked on top, showing the various header pins and screw terminals.

I could not figure out where the wires on the GeekServos were supposed to go.

I tried guessing at the connections and got my status LED to light up, but there was no spinning motor.

GeekServo connected to a breadboard using male-to-male jumper wires. A status LED is lit, but the motor is stationary.

That's when I realized my "servo" was actually just a plain DC motor.
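A plain DC motor doesn't take a servo control signal at all; the shield just sets a direction and a PWM speed. Once the wiring and power are sorted, driving it should look roughly like this sketch. I'm assuming the official Arduino Motor Shield R3 pin mapping here (channel A direction on pin 12, speed on pin 3, brake on pin 9), which my L298N board may not follow:

// Spin a plain DC motor on motor shield channel A.
// Pin numbers assume the official Arduino Motor Shield R3 mapping:
// direction = 12, PWM speed = 3, brake = 9. Verify against your shield!
const int DIR_A = 12;
const int PWM_A = 3;
const int BRAKE_A = 9;

void setup() {
  pinMode(DIR_A, OUTPUT);
  pinMode(BRAKE_A, OUTPUT);
}

void loop() {
  digitalWrite(BRAKE_A, LOW);    // release the brake
  digitalWrite(DIR_A, HIGH);     // forward
  analogWrite(PWM_A, 180);       // roughly 70% speed
  delay(2000);

  digitalWrite(BRAKE_A, HIGH);   // stop for a moment
  analogWrite(PWM_A, 0);
  delay(1000);

  digitalWrite(DIR_A, LOW);      // reverse
  digitalWrite(BRAKE_A, LOW);
  analogWrite(PWM_A, 180);
  delay(2000);

  digitalWrite(BRAKE_A, HIGH);
  analogWrite(PWM_A, 0);
  delay(1000);
}

The big difference from a real servo: there's no position feedback, just speed and direction.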

I also realized that I likely needed an external power source to support this hardware. I have other servos to try, but I really want these to work since they are LEGO-compatible. To keep the momentum, I decided to look at the second layer of the brain.

The Cognition Layer: Enter the Raspberry Pi 5

I set up my sparkly new Raspberry Pi 5 from a CanaKit. This is the "High-Functioning" brain. The kit was super easy to assemble, and the included video was very straightforward to follow, a great "intro to building a computer" experience. After a quick setup and package update, I dove straight into Edge AI.

Sidequest: The Screenshot Struggle

I spent way too long trying to automate screenshots on the Pi for this blog. I learned the hard way that scrot only produces black screens on the new Pi OS (Wayland). After fighting with grim and slurp, I realized I'd figure that part out later. No screenshots for now!

"I See You"

I hooked up an ELP 2.0 Megapixel camera and installed Ollama to run a local Vision Language Model (openbmb/minicpm-v4.5). I wrote a Python script using OpenCV (cv2) to grab a frame and feed it to the model.

The Result: Success! The Pi analyzed the camera feed locally and described me and the room.

DROID SAYS: 
Observing: A human with glasses and purple attire occupies the center of an indoor space; 
ceiling fan whirs above while wall decor and doorframes frame background elements—a 
truly multifaceted environment!

It took about 3 minutes to process one frame. My droid currently has the processing speed of a very deep philosopher—it’s not going to win any races yet, but it is truly thinking about its surroundings.

The Vision Script (vision_test.py)

Here is the bridge between the camera and the AI:

import cv2
import ollama
import os
import time

def capture_and_analyze():
    # Initialize USB Camera
    cam = cv2.VideoCapture(0)

    if not cam.isOpened():
        print("Error: Could not access /dev/video0. Check USB connection.")
        return
    print("--- Droid Vision Active ---")

    # Warm-up: Skip a few frames so the auto-exposure adjusts
    for _ in range(5):
        cam.read()
        time.sleep(0.1)

    ret, frame = cam.read()

    if ret:
        img_path = 'droid_snapshot.jpg'
        cv2.imwrite(img_path, frame)
        print("Image captured! Sending to MiniCPM-V-4.5...")
        try:
            # Querying the local Ollama model
            response = ollama.chat(
                model='openbmb/minicpm-v:4.5',
                messages=[{
                    'role': 'user',
                    'content': 'Act as a helpful LEGO droid. Describe what you see in one short, robotic sentence.',
                    'images': [img_path]
                }]
            )
            print("\nDROID SAYS:", response['message']['content'])
        except Exception as e:
            print(f"Ollama Error: {e}")

        # Clean up the photo after analysis
        if os.path.exists(img_path):
            os.remove(img_path)
    else:
        print("Error: Could not grab a frame.")
    cam.release()

if __name__ == "__main__":
    capture_and_analyze()

What's Next

I still need to figure out the motors, but for now I'm going to focus on refining the Vision + AI pieces. I plan to try the opencv-face-recognition library and experiment with different, smaller models to see if I can speed up that 3-minute "thought" process!
