Dilek Karasoy for Picovoice

Voice Enabled Motion Detection Camera with Raspberry Pi

Day 21 of this tutorial series covers building a voice-enabled motion detection camera with a Raspberry Pi.

We'll be using a Raspberry Pi 3 and a small USB microphone. However, you can choose another board, as Picovoice supports other Raspberry Pi variants as well.

Let's configure the microphone first. List the available input audio devices:

arecord -L | grep plughw

The output should be similar to the following:

plughw:CARD=PCH,DEV=0

Copy this line and create a .asoundrc file in your home folder with these options:

pcm.!default {
   type asym
   capture.pcm "mic"
}
pcm.mic {
   type plug
   slave {
      pcm "${INPUT_AUDIO_DEVICE}"
   }
}

Replace ${INPUT_AUDIO_DEVICE} with what you copied earlier. You may need to reboot the system for these settings to take effect.
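
For example, with the device name from the sample output above (plughw:CARD=PCH,DEV=0), the finished .asoundrc would read:

pcm.!default {
   type asym
   capture.pcm "mic"
}
pcm.mic {
   type plug
   slave {
      pcm "plughw:CARD=PCH,DEV=0"
   }
}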

After that, grab the Picovoice Python package:

sudo pip3 install picovoice picovoicedemo

The picovoicedemo package lets us test our models rapidly. The package requires a valid AccessKey, which you can get for free from the Picovoice Console.

We'll also design a voice interface using Picovoice Console.

For this tutorial we'll focus on three main command types (intents):

  1. Turning cameras on/off.
  2. Cleaning old files to make space for new pictures.
  3. Emailing the log files.

Creating a simple VUI on Picovoice Console

To design our Voice User Interface (VUI), we use Rhino Speech-to-Intent for custom voice commands. You can start with the YAML file below and then enrich or change it as you wish. Use the Import YAML functionality on Picovoice Console if you decide to use it.

context:
  expressions:
    changeCameraState:
      - "[switch, turn] $state:state (all, the) cameras"
      - "[switch, turn] (all, the) cameras $state:state"
      - "[switch, turn] $state:state (the) $location:location (camera, cameras)"
      - "[switch, turn] (the) $location:location [camera, cameras] $state:state"
      - "[switch, turn] $state:state (the) [light, lights] [at, in] (the)
        $location:location"
      - "[switch, turn] (the) [light, lights] [in, at] the $location:location
        $state:state"
    cleanCameraHistory:
      - "[delete, remove, clean] all (the) [files, videos] older than
        $pv.TwoDigitInteger:day [day, days]"
      - "[delete, remove, clean] all (the) [files, videos] older than
        $pv.TwoDigitInteger:month [month, months]"
    emailLog:
      - "[email, mail] (me) (all) the [log, logs]"
      - "[email, mail] (me) a report"
  slots:
    state:
      - off
      - on
    location:
      - garage
      - entrance
      - front door
      - back door
      - driveway
      - yard
      - stairway
      - hallway
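
With this context, an utterance like "turn off the garage camera" should produce an inference along these lines (illustrative values; the field names are those listed in the code further below):

# Illustrative: what the `inference` object passed to the callback carries
# after hearing "turn off the garage camera"
inference.is_understood  # True
inference.intent         # 'changeCameraState'
inference.slots          # {'state': 'off', 'location': 'garage'}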

Train, download, and extract the model into your home folder. We used “Computer” as the wake word; however, you can train another one with Porcupine Wake Word Detection on Picovoice Console.
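
At this point you can sanity-check the models with the picovoicedemo package installed earlier. At the time of writing it provides a picovoice_demo_mic console script; run picovoice_demo_mic --help to confirm the exact flags on your version, and replace the placeholders below with your AccessKey and model paths:

picovoice_demo_mic \
  --access_key ${ACCESS_KEY} \
  --keyword_path ${KEYWORD_FILE_PATH} \
  --context_path ${CONTEXT_FILE_PATH}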

The next step is integrating the voice interface into the existing motion detection camera project. Picovoice eases this step by offering a Python SDK. We only need to modify the wake_word_callback and inference_callback functions based on the context model’s intents:

from picovoice import Picovoice

access_key = "${ACCESS_KEY}"  # your AccessKey from Picovoice Console

keyword_path = ...  # path to the Porcupine wake-word model (.ppn)

def wake_word_callback():
    # invoked when the wake word ("Computer") is detected
    pass

context_path = ...  # path to the Rhino context model (.rhn)

def inference_callback(inference):
    # `inference` exposes three immutable fields:
    # (1) `is_understood`
    # (2) `intent`
    # (3) `slots`
    pass

handle = Picovoice(
        access_key=access_key,
        keyword_path=keyword_path,
        wake_word_callback=wake_word_callback,
        context_path=context_path,
        inference_callback=inference_callback)

while True:
    handle.process(get_next_audio_frame())  # feed single-channel, 16 kHz audio frames

You just need to fill in your AccessKey and set the paths to the speech models, wherever you saved them (e.g., your Downloads or Desktop folder).
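
To connect the inference to actual camera logic, inference_callback can dispatch on the intents defined in the context above. Here is a minimal sketch; camera_on, camera_off, clean_history, and email_log are hypothetical placeholders for the functions in your existing motion-camera code:

def inference_callback(inference):
    if not inference.is_understood:
        return

    if inference.intent == 'changeCameraState':
        # `location` is absent for "turn off all cameras"-style commands
        location = inference.slots.get('location')
        if inference.slots['state'] == 'on':
            camera_on(location)   # hypothetical helper
        else:
            camera_off(location)  # hypothetical helper
    elif inference.intent == 'cleanCameraHistory':
        days = int(inference.slots.get('day', 0))
        months = int(inference.slots.get('month', 0))
        clean_history(days + 30 * months)  # hypothetical helper
    elif inference.intent == 'emailLog':
        email_log()  # hypothetical helper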

For more detailed information, you can refer to the Python API documentation.
