AIVoice

Supported SoCs

SoC	RTL8721Dx	RTL8720E	RTL8713E	RTL8726E	RTL8730E
AFE single mic (ASR mode)	Y	N	Y	Y	Y
AFE single mic (COM mode)	N	N	Y	Y	Y
AFE dual mic (ASR mode)	N	N	Y	Y	Y
KWS fixed keyword	Y	N	Y	Y	Y
KWS user-defined keyword	N	N	Y	Y	Y
VAD	Y	N	Y	Y	Y
ASR	N	N	Y	Y	Y
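
To check which SoCs cover a given feature set, the table above can be captured as a small lookup. The following is only an illustrative Python sketch, not part of the AIVoice SDK: the names `SOCS`, `SUPPORT`, and `socs_supporting` are hypothetical, and the data simply mirrors the table.

```python
# Column order of the support table above.
SOCS = ("RTL8721Dx", "RTL8720E", "RTL8713E", "RTL8726E", "RTL8730E")

# One Y/N flag per SoC, in the same order as SOCS (copied from the table).
SUPPORT = {
    "AFE single mic (ASR mode)": "YNYYY",
    "AFE single mic (COM mode)": "NNYYY",
    "AFE dual mic (ASR mode)":   "NNYYY",
    "KWS fixed keyword":         "YNYYY",
    "KWS user-defined keyword":  "NNYYY",
    "VAD":                       "YNYYY",
    "ASR":                       "NNYYY",
}


def socs_supporting(*features):
    """Return the SoCs on which every requested feature is marked Y."""
    return [
        soc
        for i, soc in enumerate(SOCS)
        if all(SUPPORT[feature][i] == "Y" for feature in features)
    ]
```

For example, `socs_supporting("KWS fixed keyword", "VAD")` returns every SoC except RTL8720E, while adding `"ASR"` to the request also drops RTL8721Dx.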

Overview

AIVoice is an offline AI solution developed by Realtek. It includes local algorithm modules such as Audio Front End (signal processing), Keyword Spotting, Voice Activity Detection, and Speech Recognition, and can be used to build voice-related applications on Realtek Ameba SoCs.

AIVoice can be used on its own as a purely offline solution, or combined with cloud services such as voice recognition and LLMs to create a hybrid online/offline voice interaction solution.

Applications

Smart voice systems are widely used across many fields and products, making human-computer interaction more efficient and convenient. Applications include:

Smart Home: Smart speakers like Amazon Echo and Google Nest, or home appliances with built-in voice control features, allow users to control home lighting, temperature, and other smart devices through voice commands, enhancing convenience and comfort in living.
Smart Toys: Intelligent voice systems are being integrated into interactive toys such as AI story machines, voice-enabled educational robots, and companion robots. These toys can hold natural conversations with users, answer their questions, tell personalized stories, or provide bilingual education.
In-Car Systems: Many modern vehicles are equipped with voice recognition systems that let drivers navigate, make calls, and play music using voice commands, which improves driving safety and makes the driving experience more enjoyable.
Wearable Products: Many products, including smartwatches, smart headphones, and health monitoring devices, come equipped with voice assistants. Users can use voice control to check and send messages, control the music player, answer calls, and so on, which improves the user experience and adds new ways to interact.
Meeting Scenarios: Voice recognition technology can transcribe meeting content in real time, helping participants record and review discussion points.

File Path

Chip	OS	aivoice_lib_dir	aivoice_example_dir
RTL8730E	Linux	{LINUXSDK}/apps/aivoice	{LINUXSDK}/apps/aivoice/example
RTL8721Dx/RTL8730E	FreeRTOS	{RTOSSDK}/component/aivoice	{RTOSSDK}/component/example/aivoice
RTL8713E/RTL8726E	FreeRTOS	{DSPSDK}/lib/aivoice	{DSPSDK}/example/aivoice

Modules

Modules	Functions
AFE (Audio Front End)	Enhancing speech signals and reducing noise
KWS (Keyword Spotting)	Detecting specific wakeup words, such as `Hey Siri` or `Alexa`, to trigger voice assistants
VAD (Voice Activity Detection)	Detecting speech segments or noise segments
ASR (Automatic Speech Recognition)	Recognizing offline voice control commands

Flows

The following algorithm flows are already implemented to simplify application development:

Full Flow: An offline flow including AFE, KWS, and ASR. AFE and KWS are always on. When KWS detects the keyword, ASR turns on and supports continuous recognition; ASR exits after a timeout.
AFE+KWS: An offline flow including AFE and KWS, both always on.
AFE+KWS+VAD: An offline flow including AFE, KWS, and VAD. AFE and KWS are always on. When KWS detects the keyword, VAD turns on and supports continuous activity detection; VAD exits after a timeout.
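
The Full Flow described above behaves like a two-state machine, which can be sketched as follows. This is a hypothetical Python model for illustration only, not the AIVoice API: the class name, event strings, and the 10-second default timeout are assumptions, as is the choice to restart the timeout after each recognized command.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    WAIT_KEYWORD = auto()  # AFE + KWS running, ASR off
    RECOGNIZING = auto()   # AFE + KWS running, ASR on


@dataclass
class FullFlowSketch:
    """Hypothetical model of the Full Flow; not the AIVoice API."""
    asr_timeout_s: float = 10.0  # assumed value, for illustration only
    state: State = State.WAIT_KEYWORD
    deadline: float = 0.0

    def on_frame(self, now, keyword_detected=False, command=None):
        """Advance the flow by one audio frame; return an event name or None."""
        if self.state is State.WAIT_KEYWORD:
            if keyword_detected:
                # KWS fired: switch ASR on and arm the timeout.
                self.state = State.RECOGNIZING
                self.deadline = now + self.asr_timeout_s
                return "wakeup"
            return None

        # State.RECOGNIZING: ASR is on.
        if command is not None:
            # Continuous recognition: stay in ASR and restart the timeout
            # (restarting here is an assumption of this sketch).
            self.deadline = now + self.asr_timeout_s
            return f"command:{command}"
        if now >= self.deadline:
            # No command before the deadline: ASR exits.
            self.state = State.WAIT_KEYWORD
            return "asr_timeout"
        return None
```

The AFE+KWS+VAD flow has the same shape, with VAD taking the place of ASR in the second state.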
