Learn to Use Custom Wake Word and Text-to-Speech on a Raspberry Pi

One of the primary motivations for working on spokestack-python was to let our models run on embedded devices like the Raspberry Pi. We are excited to show you how easy it is to use wake word and text-to-speech (TTS) models on these devices.

Spokestack Account

If you do not already have a Spokestack account, please create one. Then log in and grab your API keys; you will need them later in this tutorial.

Hardware

This tutorial is geared toward the Raspberry Pi 4B and Zero W. Technically, the minimum hardware requirements are a device that runs Python, a microphone, and a speaker. The recommended hardware is listed below. In addition, we have created a wishlist on Adafruit containing exactly the parts we used for this tutorial. If you have issues with other hardware or want to show us what you made, feel free to contact us. We are working on more hardware guides, so stay tuned!

Parts List

See the Adafruit wishlist linked above for the exact parts we used.

Raspberry Pi Setup

For the initial setup of the Raspberry Pi, we recommend following the Adafruit Voice Bonnet tutorial. That guide walks you through everything from OS installation to sound configuration. In addition to the Adafruit instructions, there are a few Spokestack-specific tips in the sections that follow. Run these steps while connected to your Raspberry Pi over SSH.

Audio

PulseAudio and the Adafruit Voice Bonnet do not interact well, so you will want to disable PulseAudio with the following:

systemctl --user stop pulseaudio.socket pulseaudio.service

If you would like to re-enable PulseAudio afterward, you can restart the service with:

systemctl --user start pulseaudio.socket pulseaudio.service
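
Once Spokestack is installed later in this guide (it pulls in PyAudio), a quick way to confirm the bonnet's microphone is visible is to list the input devices Python can see. This is just a sanity check, not part of the Spokestack API:

import pyaudio

# List every audio device with input channels so you can confirm
# the Voice Bonnet's microphone shows up.
pa = pyaudio.PyAudio()
for index in range(pa.get_device_count()):
    info = pa.get_device_info_by_index(index)
    if info.get("maxInputChannels", 0) > 0:
        print(index, info["name"], int(info["defaultSampleRate"]))
pa.terminate()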

System Dependencies

Install the following system dependencies before installing spokestack on the Raspberry Pi.

sudo apt-get -y install portaudio19-dev libblas-dev libmp3lame-dev
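
If you want to double-check that the shared libraries landed where Python's loader can find them, a short ctypes check works. The library names below are assumptions based on the Debian package names above:

from ctypes.util import find_library

# portaudio backs PyAudio (microphone/speaker I/O); blas and mp3lame back
# the numerical and MP3 pieces of the stack.
for name in ("portaudio", "blas", "mp3lame"):
    print(name, "->", find_library(name) or "not found")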

Install Rust for Tokenizers

The command below is taken directly from the official rustup instructions. On a Raspberry Pi 4, I didn't run into any compilation issues, but if you are using the Zero, you may need to cross-compile. We are currently working on an easier solution for the smaller embedded devices. After the installer finishes, restart your shell or run source $HOME/.cargo/env so that cargo and rustc are on your PATH.

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
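
pip will need cargo and rustc on your PATH when it builds the tokenizers wheel. If you'd rather check from Python than from the shell, something like this works:

import shutil

# rustup installs into ~/.cargo/bin; if these come back as "not found",
# see the note above about restarting your shell.
for tool in ("rustc", "cargo"):
    print(tool, "->", shutil.which(tool) or "not found")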

TFLite Interpreter

For this, we can go with TensorFlow's recommended apt package. We've used the pip versions in the past, but the apt package is easier to install on the Pi. These commands come directly from the official TensorFlow Lite Python installation instructions.

echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
sudo apt-get update
sudo apt-get install python3-tflite-runtime
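
A one-line import is enough to confirm the runtime is visible to the Python interpreter you'll run Spokestack with:

# If this prints a class, the apt-installed TFLite runtime is importable.
from tflite_runtime.interpreter import Interpreter
print(Interpreter)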

Installing Spokestack

Spokestack should be installed through pip. We are currently using v0.0.20 for this tutorial.

pip install spokestack==0.0.20
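
Before cloning the demo project, you can optionally smoke-test the install, your speaker, and your API keys with a short TTS-only script. The module paths below follow the spokestack-python README for the 0.0.x releases, so treat them as assumptions and double-check them against the version you installed:

from spokestack.io.pyaudio import PyAudioOutput
from spokestack.tts.clients.spokestack import TextToSpeechClient
from spokestack.tts.manager import TextToSpeechManager

# Replace the placeholders with the key ID and secret from your Spokestack account.
manager = TextToSpeechManager(
    TextToSpeechClient("your-key-id", "your-key-secret"),
    PyAudioOutput(),
)
manager.synthesize("Hello from Spokestack on a Raspberry Pi.")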

Testing with a Project

We will test with our “Hello, World!” project. Keep in mind we installed the dependencies in the previous sections, so you will not need to follow that project’s README.

git clone https://github.com/spokestack/python-hello-world.git
cd python-hello-world

Add your API keys to const.py as KEY_ID and KEY_SECRET. Once that's done, you should be able to run the app. On startup it will automatically download the default wake word models, and the default text-to-speech voice will respond when you say "Hey, Spokestack!"
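
For reference, the two values in const.py look roughly like this (placeholder values; paste in the ID and secret from your Spokestack account):

# const.py (excerpt) -- replace the placeholders with your own credentials
KEY_ID = "your-spokestack-key-id"
KEY_SECRET = "your-spokestack-key-secret"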

python app.py

Wrapping Up

In this tutorial, we covered how to set up Spokestack on an embedded device. This should get you started using Spokestack in your own projects. If you run into any trouble, be sure to reach out through our support channels.

Originally posted June 23, 2021