VOICE AND WAKEWORD COMBOS FOR OVOS/NEON
Neon.AI and OpenVoice OS (OVOS) both offer out-of-the-box smart speaker/voice assistant platforms. Neon has spent much of 2023 focused on the Mycroft Mark 2 smart speaker, although they have recently branched out into Orange Pi and Raspberry Pi offerings. OVOS can run headless (without a GUI) on almost any platform that can run Docker: most flavors of Linux, Windows 10/11 (with WSL2), and macOS. Both organizations have recently been thrust into the position of carrying the torch for the now-defunct Mycroft.
CONFIGURING LOCAL TTS
On the journey towards creating a personal, private, open source voice assistant, one of the most important components is text-to-speech (TTS). This is the component that takes the text that the voice assistant wants to speak and turns it into an audio file that can be played through the assistant’s speakers. TTS has come a long way over the years and is much less compute-intensive than speech-to-text (STT), but there are still benefits to running your TTS off the actual assistant hardware.
OVOS ON A MAC
I’ve written quite a few posts already about configuring different aspects of Neon.AI, a private, open source voice assistant built on top of OpenVoiceOS (OVOS). However, all of those posts assume you already have a device running one of those platforms. In this post, I’ll walk through the process of setting up a Mac to run OVOS. This is a great way to get started with OVOS if you don’t have a Raspberry Pi or other device available, or you just want to take advantage of the great processing power available on a Mac.
PRIVATE DOCKER REGISTRY IN K8S
Mirroring package repositories has been an option available to Linux users for a long time. It’s a great way to save bandwidth and speed up package installation. There’s a lot of value in doing the same with Docker images, particularly for any that are private and only in active use in your homelab. This is especially true if you’re using Kubernetes, where you’ll be pulling images from a registry many times a day.
USING EXTERNAL (PRIVATE OR PUBLIC) TTS ON NEON AND OVOS
Neon.AI and OpenVoice OS (OVOS) both offer out-of-the-box smart speaker/voice assistant platforms, using a combination of their own aggregated text-to-speech (TTS) and speech-to-text (STT) hosted options as well as low-power open source engines in case the internet is not available. Recently, a community member was asking about ways to improve the performance of the Neon software on the Mycroft Mark 2 smart speaker. I realized that I’d done a post on configuring Coqui-TTS for Neon, but not how to configure another external TTS system.
SNAPSHOT TESTING IN AWS CDK PYTHON
At Defiance Digital, we use the AWS CDK for almost everything. Generally we use TypeScript because it’s the original language for the CDK, everything using JSII transpiles back to TypeScript, and it has the most compatibility with the CDK. However, we have a few projects that use Python, and on those I’ve really been missing Jest snapshot testing. For most CDK projects, snapshot tests are the perfect way to make sure that your stacks don’t have unintended changes.
MARK2 DEV KIT
If you’ve gotten involved with Neon.AI doing their developer bounties, you may have requested or received a Mark 2 Dev Kit instead of a production Mark 2. The original Mark 2 dev kits had an acrylic housing and some 3D printed parts. The newer Mark 2 dev kits generally come in an entirely 3D printed housing. The directions that get sent out with dev kits are a bit out of date.
HOME ASSISTANT INTEGRATION FOR OVOS
Home Assistant has a Mycroft integration available, but it is quite old, and as Neon and OVOS continue to separate from the original codebase it will become less functional. Additionally, it is not following Home Assistant integration development best practices. As of the 2023.6 series of releases, the integration does not work with Neon or OVOS running on the Mycroft Mark 2. I’ve made a preliminary integration for OVOS (which also works with Neon, being based on the same codebase).
RUNNING A FASTERWHISPER STT SERVER
Piggybacking off my post about running a local, private STT server, we’re going to take a look at running a FasterWhisper STT Docker image. FasterWhisper is an improvement on the open source Whisper STT project that adds some interesting features, including the ability to run exclusively on CPU and lower memory usage. The large model is multilingual and does an excellent job of capturing punctuation. FasterWhisper is the current preferred STT solution for the OVOS team.
CONFIGURING LOCAL STT FOR OVOS/NEON
Neon.AI and OpenVoice OS (OVOS) both offer out-of-the-box smart speaker/voice assistant platforms, using a combination of their own aggregated text-to-speech (TTS) and speech-to-text (STT) hosted options as well as low-power open source engines in case the internet is not available. While both companies go out of their way to be as privacy-respecting as possible, ultimately I don’t want my voice assistant to be sending my voice to a server somewhere else on the internet or sharing the text that will be spoken aloud.