AIVoice Examples

AIVoice Full Flow Offline Example

This example shows how to use the AIVoice full flow with pre-recorded 3-channel audio. It runs only once after the EVB is reset. Audio functions such as recording and playback are not integrated.

Example code is under ${aivoice_example_dir}/full_flow_offline.

Steps of Using AIVoice

Select the aivoice flow or modules you need. If you use the full flow, the afe_kws flow, or the afe_kws_vad flow, also set the KWS mode to multi or single.

/* step 1:
 * Select the aivoice flow you want to use.
 * Refer to the end of aivoice_interface.h to see which flows are supported.
 */
const struct rtk_aivoice_iface *aivoice = &aivoice_iface_full_flow_v1;
rtk_aivoice_set_multi_kws_mode();

Build configuration.

/* step 2:
 * Modify the default configuration if needed.
 * You can modify zero or more of the afe/vad/kws/... configurations.
 */
struct aivoice_config config;
memset(&config, 0, sizeof(config));

/*
 * Here we use afe_res_2mic50mm as an example.
 * You can change these configurations according to the afe resource you use.
 * Refer to aivoice_afe_config.h for details.
 *
 * afe_config.mic_array MUST match the afe resource you linked.
 */
struct afe_config afe_param = AFE_CONFIG_ASR_DEFAULT_2MIC50MM; // change this according to the linked afe resource.
config.afe = &afe_param;

/*
 * ONLY turn on these settings when you are sure about what you are doing.
 * It is recommended to use the default configuration
 * if you do not know the meaning of these parameters.
 */
struct vad_config vad_param = VAD_CONFIG_DEFAULT();
vad_param.left_margin = 300; // you can change the configuration if needed
config.vad = &vad_param;    // can be NULL

struct kws_config kws_param = KWS_CONFIG_DEFAULT();
config.kws = &kws_param;    // can be NULL

struct asr_config asr_param = ASR_CONFIG_DEFAULT();
config.asr = &asr_param;    // can be NULL

struct aivoice_sdk_config aivoice_param = AIVOICE_SDK_CONFIG_DEFAULT();
aivoice_param.no_cmd_timeout = 10;
config.common = &aivoice_param; // can be NULL

Use create() to create and initialize an aivoice instance with the given configuration.

/* step 3:
 * Create the aivoice instance.
 */
void *handle = aivoice->create(&config);
if (!handle) {
    return;
}

Register a callback function.

/* step 4:
 * Register a callback function.
 * You may only receive some of the aivoice_out_event_type in this example,
 * depending on the flow you use.
 * */

rtk_aivoice_register_callback(handle, aivoice_callback_process, NULL);

The callback function can be adapted to your use case:

static int aivoice_callback_process(void *userdata,
                                    enum aivoice_out_event_type event_type,
                                    const void *msg, int len)
{

    (void)userdata;
    struct aivoice_evout_vad *vad_out;
    struct aivoice_evout_afe *afe_out;

    switch (event_type) {
    case AIVOICE_EVOUT_VAD:
            vad_out = (struct aivoice_evout_vad *)msg;
            printf("[user] vad. status = %d, offset = %d\n", vad_out->status, vad_out->offset_ms);
            break;

    case AIVOICE_EVOUT_WAKEUP:
            printf("[user] wakeup. %.*s\n", len, (char *)msg);
            break;

    case AIVOICE_EVOUT_ASR_RESULT:
            printf("[user] asr. %.*s\n", len, (char *)msg);
            break;

    case AIVOICE_EVOUT_ASR_REC_TIMEOUT:
            printf("[user] asr timeout\n");
            break;

    case AIVOICE_EVOUT_AFE:
            afe_out = (struct aivoice_evout_afe *)msg;

            // afe will output audio each frame.
            // in this example, we only print it once to make log clear
            static int afe_out_printed = false;
            if (!afe_out_printed) {
                    afe_out_printed = true;
                    printf("[user] afe output %d channels raw audio, others: %s\n",
                               afe_out->ch_num, afe_out->out_others_json ? afe_out->out_others_json : "null");
            }

            // process afe output raw audio as needed
            break;

    default:
            break;
    }

    return 0;
}
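
The callback above prints msg with `%.*s` and an explicit len, which suggests the payload should not be treated as a NUL-terminated string. If the text needs to be kept or passed to code that expects a C string, one option is to copy it into a bounded buffer first. The helper below is a minimal sketch; `copy_event_msg` is a hypothetical name and not part of the AIVoice API, and whether msg remains valid after the callback returns is not stated in the source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the AIVoice SDK).
 * Copies at most dst_size - 1 bytes of a length-delimited message into dst
 * and always NUL-terminates it. Returns the number of bytes copied. */
static size_t copy_event_msg(char *dst, size_t dst_size, const void *msg, int len)
{
    size_t n = 0;

    assert(dst != NULL && dst_size > 0);

    if (msg != NULL && len > 0) {
        n = (size_t)len;
        if (n > dst_size - 1) {
            n = dst_size - 1;   /* truncate instead of overflowing dst */
        }
        memcpy(dst, msg, n);
    }
    dst[n] = '\0';
    return n;
}
```

Inside the AIVOICE_EVOUT_WAKEUP or AIVOICE_EVOUT_ASR_RESULT case, such a helper could be called with a local buffer before the text is handed to other code.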

Use feed() to input audio data to aivoice.

/* When running on a chip, audio arrives as a live stream.
 * Here we use a fixed, pre-recorded audio buffer instead.
 * */

const char *audio = (const char *)get_test_wav();
int len = get_test_wav_len();
int audio_offset = 44;
int mics_num = 2;
int afe_frame_bytes = (mics_num + afe_param.ref_num) * afe_param.frame_size * sizeof(short);
while (audio_offset <= len - afe_frame_bytes) {
        /* step 5:
         * Feed the audio to the aivoice instance.
         * */

        aivoice->feed(handle,
                      (char *)audio + audio_offset,
                      afe_frame_bytes);

        audio_offset += afe_frame_bytes;
}
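
The frame arithmetic in the loop above can be checked in isolation. The sketch below assumes that the offset of 44 skips a canonical PCM WAV header and uses a frame size of 256 samples purely as an illustrative value; the real value comes from afe_param.frame_size. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the test audio starts with a canonical 44-byte PCM WAV header. */
#define WAV_HEADER_BYTES 44

/* Bytes in one AFE input frame: each channel (microphones plus reference
 * channels) contributes frame_size 16-bit samples. */
static size_t afe_frame_bytes(int mics_num, int ref_num, int frame_size)
{
    assert(mics_num > 0 && ref_num >= 0 && frame_size > 0);
    return (size_t)(mics_num + ref_num) * (size_t)frame_size * sizeof(int16_t);
}

/* Number of complete frames the feed loop consumes from a WAV buffer.
 * A trailing partial frame is dropped, as in the loop above. */
static size_t count_full_frames(size_t wav_len, size_t frame_bytes)
{
    if (frame_bytes == 0 || wav_len < WAV_HEADER_BYTES) {
        return 0;
    }
    return (wav_len - WAV_HEADER_BYTES) / frame_bytes;
}
```

With 2 microphones and 1 reference channel, which matches the 3-channel test audio, a 256-sample frame would occupy 1536 bytes.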

(Optional) Use reset() if status reset is needed.

Use destroy() to destroy the instance if aivoice is no longer needed.

/* step 6:
* Destroy the aivoice instance */

aivoice->destroy(handle);
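
The create, feed, and destroy steps above all go through a table of function pointers, which is why switching flows only changes the line that selects the interface. The sketch below illustrates that pattern with a mock interface so it can run without the SDK. Every name in it (`demo_iface`, `demo_flow`, `run_once`) is hypothetical, and the return types of feed and destroy are assumptions rather than the real rtk_aivoice_iface signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rtk_aivoice_iface. */
struct demo_iface {
    void *(*create)(const void *config);
    int (*feed)(void *handle, const char *data, int len);
    int (*destroy)(void *handle);
};

/* Mock instance state: only counts the bytes it was fed. */
struct demo_state {
    int bytes_fed;
};

static void *demo_create(const void *config)
{
    (void)config;
    return calloc(1, sizeof(struct demo_state));
}

static int demo_feed(void *handle, const char *data, int len)
{
    (void)data;
    ((struct demo_state *)handle)->bytes_fed += len;
    return 0;
}

static int demo_destroy(void *handle)
{
    free(handle);
    return 0;
}

static const struct demo_iface demo_flow = { demo_create, demo_feed, demo_destroy };

/* Runs create -> feed (one frame at a time) -> destroy.
 * Returns the number of frames fed, or -1 if create fails. */
static int run_once(const struct demo_iface *iface, const char *audio,
                    int len, int frame_bytes)
{
    int frames = 0;
    void *handle;

    assert(iface != NULL && frame_bytes > 0);

    handle = iface->create(NULL);
    if (!handle) {
        return -1;
    }
    for (int off = 0; off <= len - frame_bytes; off += frame_bytes) {
        iface->feed(handle, audio + off, frame_bytes);
        frames++;
    }
    iface->destroy(handle);
    return frames;
}
```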

Expected Result

Download the image to the EVB. After the example runs, the log should show the algorithm results as follows:

[AIVOICE] set multi kws mode
---------------------SPEECH COMMANDS---------------------
Command ID1, 打开空调
Command ID2, 关闭空调
Command ID3, 制冷模式
Command ID4, 制热模式
Command ID5, 加热模式
Command ID6, 送风模式
Command ID7, 除湿模式
Command ID8, 调到十六度
Command ID9, 调到十七度
Command ID10, 调到十八度
Command ID11, 调到十九度
Command ID12, 调到二十度
Command ID13, 调到二十一度
Command ID14, 调到二十二度
Command ID15, 调到二十三度
Command ID16, 调到二十四度
Command ID17, 调到二十五度
Command ID18, 调到二十六度
Command ID19, 调到二十七度
Command ID20, 调到二十八度
Command ID21, 调到二十九度
Command ID22, 调到三十度
Command ID23, 开高一度
Command ID24, 开低一度
Command ID25, 高速风
Command ID26, 中速风
Command ID27, 低速风
Command ID28, 增大风速
Command ID29, 减小风速
Command ID30, 自动风
Command ID31, 最大风量
Command ID32, 中等风量
Command ID33, 最小风量
Command ID34, 自动风量
Command ID35, 左右摆风
Command ID36, 上下摆风
Command ID37, 播放音乐
Command ID38, 暂停播放
Command ID39, 接听电话
Command ID40, 挂断电话
---------------------------------------------------------

[AIVOICE] rtk_aivoice version: v1.5.0#S0825120#N1ed33d6#A6c25e38
[AIVOICE] rtk_aivoice_model afe version: afe_2mic_asr_v1.3.1_AfePara_2mic50_v2.0_bf_v0.0_20250401
[AIVOICE] rtk_aivoice_model vad version: vad_v7_opt
[AIVOICE] rtk_aivoice_model kws version: kws_xqxq_v4.1_opt
[AIVOICE] rtk_aivoice_model asr version: asr_cn_v8_opt
[AIVOICE] rtk_aivoice_log_format version: v2
[user] afe output 1 channels raw audio, others: {"abnormal_flag":0,"ssl_angle":-10}
[AIVOICE] [KWS] result: {"id":2,"keyword":"ni-hao-xiao-qiang","score":0.7746397852897644}
[user] wakeup. {"id":2,"keyword":"ni-hao-xiao-qiang","score":0.7746397852897644}
[user] voice angle 90.0
[user] vad. status = 1, offset = 385
[user] vad. status = 0, offset = 1865
[AIVOICE] [ASR] result: {"type":0,"commands":[{"rec":"打开空调","id":1}]}
[user] asr. {"type":0,"commands":[{"rec":"打开空调","id":1}]}
[user] voice angle 90.0
[user] vad. status = 1, offset = 525
[AIVOICE] [KWS] result: {"id":2,"keyword":"ni-hao-xiao-qiang","score":0.750707507133484}
[user] wakeup. {"id":2,"keyword":"ni-hao-xiao-qiang","score":0.750707507133484}
[user] voice angle 90.0
[user] vad. status = 1, offset = 445
[user] vad. status = 0, offset = 1765
[AIVOICE] [ASR] result: {"type":0,"commands":[{"rec":"播放音乐","id":37}]}
[user] asr. {"type":0,"commands":[{"rec":"播放音乐","id":37}]}
[user] voice angle 90.0
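
The wakeup and ASR messages in this log are small JSON objects, so application code usually needs the numeric id from them. The sketch below pulls out the first integer that follows a given key using only the C standard library. It is an illustration under the assumption that the payload looks like the log lines above and has been copied into a NUL-terminated string; `json_first_int` is a hypothetical helper, and a real JSON parser would be more robust.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: finds the first occurrence of "key": in json and
 * parses the integer that follows. Returns 0 on success, or -1 if the key
 * is missing or is not followed by an integer. */
static int json_first_int(const char *json, const char *key, long *out)
{
    char pattern[32];
    const char *p;
    char *end;
    long value;
    int n;

    assert(json != NULL && key != NULL && out != NULL);

    n = snprintf(pattern, sizeof pattern, "\"%s\":", key);
    if (n < 0 || (size_t)n >= sizeof pattern) {
        return -1;              /* key too long for the pattern buffer */
    }
    p = strstr(json, pattern);
    if (p == NULL) {
        return -1;              /* key not present */
    }
    p += n;
    value = strtol(p, &end, 10);
    if (end == p) {
        return -1;              /* value is not an integer */
    }
    *out = value;
    return 0;
}
```

Applied to the second ASR result in the log, looking up "id" would yield 37, the command ID of 播放音乐 in the table printed at startup.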

Build Example

Switch to GCC project directory in SDK
```
cd {SDK}/amebadplus_gcc_project
```
Run menuconfig.py to enter the configuration interface
```
./menuconfig.py
```

Navigate through menu path to enable TFLM Library and AIVoice

--------MENUCONFIG FOR General---------
CONFIG TrustZone  --->
...
CONFIG APPLICATION  --->
   GUI Config  --->
   ...
   AI Config  --->
      [*] Enable TFLITE MICRO
      [*] Enable AIVoice

Build image
```
./build.py -a full_flow_offline
```

Build the TFLM Library for DSP by following Build TFLM, or use the prebuilt TFLM Library in {DSPSDK}/lib/aivoice/prebuilts.

Import the {DSPSDK}/example/aivoice/full_flow_offline source in Xtensa Xplorer.

Set software configurations and modify libraries such as AFE resource, KWS resource if needed.

Add include path (-I)

${workspace_loc}/../lib/aivoice/include

Add library search path (-L)

${workspace_loc}/../lib/aivoice/prebuilts/$(TARGET_CONFIG)
${workspace_loc}/../lib/xa_nnlib/v1.8.1/bin/$(TARGET_CONFIG)/Release
${workspace_loc}/../lib/lib_hifi5/project/hifi5_library/bin/$(TARGET_CONFIG)/Release
${workspace_loc}/../lib/tflite_micro/project/bin/$(TARGET_CONFIG)/Release

Add libraries (-l)

-laivoice -lafe_kernel -lafe_res_2mic50mm -lkernel -lvad -lkws -lasr -lfst -lcJSON -ltomlc99  -ltflite_micro -lxa_nnlib -lhifi5_dsp

Build the image by following the steps in DSP Build.

FreeRTOS

Switch to GCC project directory in SDK
```
cd {SDK}/amebasmart_gcc_project
```
Run menuconfig.py to enter the configuration interface
```
./menuconfig.py
```

Navigate through menu path to enable TFLM Library and AIVoice

--------MENUCONFIG FOR General---------
CONFIG TrustZone  --->
...
CONFIG APPLICATION  --->
   GUI Config  --->
   ...
   AI Config  --->
      [*] Enable TFLITE MICRO
      [*] Enable AIVoice

Select the AFE resource that matches your hardware. The default is afe_res_2mic50mm.

AI Config  --->
   [*] Enable TFLITE MICRO
   [*] Enable AIVoice
      Select AFE Resource
         ( ) afe_res_1mic
         ( ) afe_res_2mic30mm
         (X) afe_res_2mic50mm
         ( ) afe_res_2mic70mm

Select the KWS resource. The default is the fixed keywords xiao-qiang-xiao-qiang and ni-hao-xiao-qiang.

AI Config  --->
   [*] Enable TFLITE MICRO
   [*] Enable AIVoice
      Select AFE Resource
      Select KWS Resource
         (X) kws_res_xqxq
         ( ) kws_res_custom

Build image
```
./build.py -a full_flow_offline
```

Linux

(Optional) Modify the yocto recipe {LINUXSDK}/yocto/meta-realtek/meta-sdk/recipes-rtk/aivoice/rtk-aivoice-algo.bb to change libraries such as the AFE resource or KWS resource if needed.
Compile the aivoice algo image using bitbake:
```
bitbake rtk-aivoice-algo
```
