Dilek Karasoy for Picovoice

Day 20: On-device Voice Assistant with Flutter

Today we will build a voice-enabled timer with Flutter and Picovoice. Since Picovoice processes voice data on the device, no voice data ever leaves your phone. Remember the last time you waited 10 seconds for Siri to set a 10-second timer? That's not going to happen here!


1. Let's start with the UI:
It is pretty straightforward: three different widgets that are switched between using the bottom navigation bar.
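To make that concrete, here is a minimal sketch of the navigation scaffolding (the page widget names are illustrative, not necessarily those used in the demo project):

int _currentIndex = 0;

final List<Widget> _pages = [
  ClockWidget(), // hypothetical names for the three pages
  TimerWidget(),
  AlarmWidget(),
];

@override
Widget build(BuildContext context) {
  return Scaffold(
    body: _pages[_currentIndex],
    bottomNavigationBar: BottomNavigationBar(
      currentIndex: _currentIndex,
      onTap: (index) => setState(() => _currentIndex = index),
      items: const [
        BottomNavigationBarItem(icon: Icon(Icons.access_time), label: 'Clock'),
        BottomNavigationBarItem(icon: Icon(Icons.timer), label: 'Timer'),
        BottomNavigationBarItem(icon: Icon(Icons.alarm), label: 'Alarm'),
      ],
    ),
  );
}

The main screen also runs a short update loop that checks whether the timer is up or the alarm should sound, and refreshes the on-screen clock: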

void _updateTime() {
  // check if timer is up
  if (_timerStopwatch.isRunning &&
      _timerStopwatch.elapsed >= _timerDuration) {
    _timerComplete();
  }

  // check if alarm should sound (the `!` is needed because Dart
  // does not promote nullable fields after a null check)
  DateTime now = DateTime.now();
  if (_alarmTime != null &&
      !_alarmSounding &&
      (_alarmTime!.isBefore(now) || _alarmTime!.isAtSameMomentAs(now))) {
    _alarmComplete();
  }

  // update clock and schedule the next tick
  setState(() {
    _clockTime = now;
    _updateTimer = Timer(
      const Duration(milliseconds: 50),
      _updateTime,
    );
  });
}

2. Let's import the Picovoice Flutter SDK
The Picovoice Platform Flutter SDK is what we need to integrate a wake word and custom voice commands. Add it as a dependency in the project's pubspec.yaml file:

picovoice_flutter: ${LATEST_VERSION}
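Then run flutter pub get to fetch the package.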

Now let's set up the PicovoiceManager, which captures audio and passes it to the Picovoice platform.

import 'package:picovoice_flutter/picovoice_manager.dart';
import 'package:picovoice_flutter/picovoice_error.dart';
import 'package:rhino_flutter/rhino.dart';

late PicovoiceManager _picovoiceManager;
final String accessKey = "..."; // your Picovoice AccessKey

void _initPicovoice() async {
  try {
    _picovoiceManager = await PicovoiceManager.create(
        accessKey,
        keywordPath, 
        _wakeWordCallback, 
        contextPath, 
        _inferenceCallback);

    // start audio processing
    _picovoiceManager.start();        
  } on PicovoiceException catch (ex) {
    print(ex);
  }
}

void _wakeWordCallback() {
  print("wake word detected!");
}

void _inferenceCallback(RhinoInference inference) {
  print(inference);
}

A couple of things to unpack: we're providing the PicovoiceManager with an AccessKey, two files, and two callbacks.

  1. accessKey is your Picovoice AccessKey; get it from the Picovoice Console and paste it in.
  2. keywordPath and _wakeWordCallback relate to the Porcupine Wake Word engine.
  3. contextPath and _inferenceCallback relate to the Rhino Speech-to-Intent engine.

If we launch the app at this point, it will fail to initialize because the keyword and context files are missing. That's our next task: let's solve the problem by creating a custom wake word and command context on the Picovoice Console.

3. Design a Custom Voice Interface
We have two options: you can create custom wake words and contexts on the self-service, intuitive Picovoice Console, or you can use the demo code we open-sourced, which already includes wake word and context files to start with.

Training a Custom Wake Word on Picovoice Console:
[animation: training a custom wake word]

Training a Context (Custom Voice Commands) on Picovoice Console:
[animation: training custom voice commands]

4. Import Model Files as Flutter Assets
Now we have both keywordPath (the path to the Porcupine model file, which has a .ppn extension) and contextPath (the path to the Rhino model file, which has a .rhn extension). We'll drop them into the asset folder of our Flutter project and add them to the pubspec.yaml file as assets.
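The assets section of pubspec.yaml might look like the following (a sketch; the file names match the paths used in the code below, but yours will depend on what you exported from the Console):

flutter:
  assets:
    - assets/android/pico_clock_android.ppn
    - assets/android/flutter_clock_android.rhn
    - assets/ios/pico_clock_ios.ppn
    - assets/ios/flutter_clock_ios.rhn

Thanks to Flutter's asset bundle, we can add this file logic, plus some platform detection, to our _initPicovoice function from earlier: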

import 'dart:io' show Platform;

late PicovoiceManager _picovoiceManager;
final String accessKey = "..."; // your Picovoice AccessKey

void _initPicovoice() async {
  // pick the model files that match the current platform
  String platform = Platform.isAndroid ? "android" : "ios";
  String keywordPath = "assets/$platform/pico_clock_$platform.ppn";
  String contextPath = "assets/$platform/flutter_clock_$platform.rhn";

  try {
    _picovoiceManager = await PicovoiceManager.create(
        accessKey,
        keywordPath,
        _wakeWordCallback,
        contextPath,
        _inferenceCallback);
    // start audio processing
    _picovoiceManager.start();
  } on PicovoiceException catch (ex) {
    print(ex);
  }
}

When you launch the app, the PicovoiceManager will initialize and start streaming audio. If you say the wake word, followed by any of the commands from the context, you'll see them printed to the debug console.
Now we need them to actually control the app!

5. Integrate the Voice Controls
Next, we replace the print statements in _wakeWordCallback and _inferenceCallback with logic that handles the objects we receive from the PicovoiceManager. The good news is that _wakeWordCallback is called whenever the wake word is detected, and the inference is a class with the following simple structure:

{
    isUnderstood: true,
    intent: 'setTimer',
    slots: {
        hours: 2,
        seconds: 31
    }
}

This is in contrast to the speech-to-text approach where one has to parse a completely unknown and unstructured string. After filling out these functions, we have callbacks that look like the following:

void _wakeWordCallback() {
  setState(() {
    _listeningForCommand = true;
  });
}

void _inferenceCallback(RhinoInference inference) {
  // isUnderstood and slots are nullable on RhinoInference
  if (inference.isUnderstood == true) {
    Map<String, String> slots = inference.slots ?? {};
    if (inference.intent == 'clock') {
      _performClockCommand(slots);
    } else if (inference.intent == 'timer') {
      _performTimerCommand(slots);
    } else if (inference.intent == 'setTimer') {
      _setTimer(slots);
    } else if (inference.intent == 'alarm') {
      _performAlarmCommand(slots);
    } else if (inference.intent == 'setAlarm') {
      _setAlarm(slots);
    } else if (inference.intent == 'stopwatch') {
      _performStopwatchCommand(slots);
    } else if (inference.intent == 'availableCommands') {
      _showAvailableCommands();
    }
  } else {
    _rejectCommand();
  }
  setState(() {
    _listeningForCommand = false;
  });
}
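For a sense of what a control function does with its slots, here is a minimal sketch of _setTimer. It assumes slot values arrive as number strings; the hours and seconds slots appear in the example inference above, while minutes is an assumption about the context:

// A sketch of _setTimer: parse the slot strings, defaulting missing
// slots to zero, then arm the timer checked by _updateTime.
void _setTimer(Map<String, String> slots) {
  int hours = int.tryParse(slots['hours'] ?? '') ?? 0;
  int minutes = int.tryParse(slots['minutes'] ?? '') ?? 0;
  int seconds = int.tryParse(slots['seconds'] ?? '') ?? 0;

  setState(() {
    _timerDuration =
        Duration(hours: hours, minutes: minutes, seconds: seconds);
    _timerStopwatch.reset();
    _timerStopwatch.start();
  });
}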

Once we've filled in all the control functions, we have a completely hands-free, cross-platform Flutter clock.

Feel free to browse and reuse the source code, or raise an issue in the repo to get technical help.

The tutorial was originally published on Medium.
