AIVoice
Supported SoCs
| SoC | RTL8721Dx | RTL8720E | RTL8713E | RTL8726E | RTL8730E |
|---|---|---|---|---|---|
| AFE single mic (ASR mode) | Y | N | Y | Y | Y |
| AFE single mic (COM mode) | N | N | Y | Y | Y |
| AFE dual mic (ASR mode) | N | N | Y | Y | Y |
| KWS fixed keyword | Y | N | Y | Y | Y |
| KWS user-defined keyword | N | N | Y | Y | Y |
| VAD | Y | N | Y | Y | Y |
| ASR | N | N | Y | Y | Y |
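When deciding which features to enable for a given board, it can help to hold the support matrix as data. The following is a minimal sketch that transcribes the table above into Python; the names `SOCS`, `SUPPORT`, `is_supported` and `supported_features` are illustrative and are not part of the AIVoice SDK.

```python
# Support matrix transcribed from the table above (True = "Y", False = "N").
# All names here are hypothetical helpers, not AIVoice SDK identifiers.
SOCS = ["RTL8721Dx", "RTL8720E", "RTL8713E", "RTL8726E", "RTL8730E"]

SUPPORT = {
    "AFE single mic (ASR mode)": [True, False, True, True, True],
    "AFE single mic (COM mode)": [False, False, True, True, True],
    "AFE dual mic (ASR mode)": [False, False, True, True, True],
    "KWS fixed keyword": [True, False, True, True, True],
    "KWS user-defined keyword": [False, False, True, True, True],
    "VAD": [True, False, True, True, True],
    "ASR": [False, False, True, True, True],
}


def is_supported(feature: str, soc: str) -> bool:
    """Return True if the table marks `feature` as "Y" for `soc`."""
    return SUPPORT[feature][SOCS.index(soc)]


def supported_features(soc: str) -> list[str]:
    """List every feature marked "Y" for `soc`, in table order."""
    return [feature for feature in SUPPORT if is_supported(feature, soc)]
```

For example, `is_supported("ASR", "RTL8721Dx")` is false while `is_supported("VAD", "RTL8721Dx")` is true, matching the RTL8721Dx column.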
Overview
AIVoice is an offline AI solution developed by Realtek. It includes local algorithm modules such as Audio Front End (signal processing), Keyword Spotting, Voice Activity Detection, and Speech Recognition, and can be used to build voice-related applications on Realtek Ameba SoCs.
AIVoice can be used on its own as a purely offline solution, or combined with cloud services such as voice recognition and LLMs to create a hybrid online and offline voice interaction solution.
Applications
Smart voice systems are widely used across many fields and products, making human-computer interaction more efficient and convenient. Applications include:
Smart Home: Smart speakers like Amazon Echo and Google Nest, or home appliances with built-in voice control features, allow users to control home lighting, temperature, and other smart devices through voice commands, enhancing convenience and comfort in living.
Smart Toys: Intelligent voice systems are being integrated into interactive toys such as AI story machines, voice-enabled educational robots, and companion robots. These toys can hold natural conversations with users, answer questions, tell personalized stories, or provide bilingual education.
In-Car Systems: Many modern vehicles are equipped with voice recognition systems that let drivers navigate, make calls, and play music using voice commands, improving driving safety and making the driving experience more enjoyable.
Wearable Products: Many products, including smartwatches, smart headphones, and health monitoring devices, come equipped with voice assistants. Users can use voice control to check and send messages, control the music player, answer calls, and so on, improving the user experience and offering new ways to interact.
Meeting Scenarios: Voice recognition technology can transcribe meeting content in real-time, helping participants better record and review discussion points.
File Path
| Chip | OS | aivoice_lib_dir | aivoice_example_dir |
|---|---|---|---|
| RTL8730E | Linux | {LINUXSDK}/apps/aivoice | {LINUXSDK}/apps/aivoice/example |
| RTL8721Dx/RTL8730E | FreeRTOS | {RTOSSDK}/component/aivoice | {RTOSSDK}/component/example/aivoice |
| RTL8713E/RTL8726E | FreeRTOS | {DSPSDK}/lib/aivoice | {DSPSDK}/example/aivoice |
Modules
| Modules | Functions |
|---|---|
| AFE (Audio Front End) | Enhancing speech signals and reducing noise |
| KWS (Keyword Spotting) | Detecting specific wakeup words to trigger voice assistants, such as |
| VAD (Voice Activity Detection) | Detecting speech segments or noise segments |
| ASR (Automatic Speech Recognition) | Detecting offline voice control commands |
Flows
Some algorithm flows have been implemented to facilitate user development.
Full Flow: An offline full flow including AFE, KWS and ASR. AFE and KWS are always on. When KWS detects the keyword, ASR turns on and supports continuous recognition. ASR exits after a timeout.
AFE+KWS: An offline flow including AFE and KWS, both always on.
AFE+KWS+VAD: An offline flow including AFE, KWS and VAD. AFE and KWS are always on. When KWS detects the keyword, VAD turns on and supports continuous activity detection. VAD exits after a timeout.
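The gating behaviour of the Full Flow can be pictured as a small two-state machine. The sketch below is only an illustration of the description above, not the SDK implementation: the class `FullFlowSketch`, its method names, and the default timeout value are hypothetical. It also assumes that each recognized command restarts the timeout and that a repeated keyword restarts it too, neither of which the description specifies.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    LISTENING = "AFE+KWS"        # only the always-on modules are running
    RECOGNIZING = "AFE+KWS+ASR"  # ASR enabled after a keyword hit


@dataclass
class FullFlowSketch:
    """Hypothetical model of the Full Flow; not an AIVoice SDK type."""

    asr_timeout_s: float = 10.0  # illustrative value, not taken from the SDK
    stage: Stage = Stage.LISTENING
    deadline: float = 0.0        # time at which ASR exits

    def on_keyword(self, now: float) -> None:
        """KWS detected the keyword: turn ASR on and (re)start the timeout."""
        self.stage = Stage.RECOGNIZING
        self.deadline = now + self.asr_timeout_s

    def tick(self, now: float) -> None:
        """ASR exits once the timeout elapses; AFE and KWS keep running."""
        if self.stage is Stage.RECOGNIZING and now >= self.deadline:
            self.stage = Stage.LISTENING

    def on_command(self, now: float) -> bool:
        """Return True if a command spoken at `now` reaches ASR."""
        self.tick(now)
        if self.stage is not Stage.RECOGNIZING:
            return False
        # Assumption: continuous recognition restarts the timeout per command.
        self.deadline = now + self.asr_timeout_s
        return True
```

With `asr_timeout_s=5.0`, a command spoken before any keyword is ignored, commands spoken within five seconds of the keyword or of the previous command are accepted, and the flow falls back to `Stage.LISTENING` once five seconds pass without one. The AFE+KWS+VAD flow follows the same pattern with VAD in place of ASR.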