Overview

Memory leaks and memory fragmentation are two core issues in memory management. Memory leaks gradually reduce the available memory, while memory fragmentation can lead to allocation failures and degraded system performance. In RTOS environments especially, these problems can severely impact system stability and runtime efficiency.

Memory overruns and heap corruption are further risks in memory management, and both are often difficult to detect and debug.

FreeRTOS provides developers with an efficient and flexible memory management mechanism. The FreeRTOS kernel supports coalescing of adjacent free memory blocks in the heap_4 and heap_5 memory management schemes, effectively reducing the risk of memory fragmentation. However, the system does not natively offer effective methods to detect memory leaks or heap corruption problems.

We provide a set of heap monitoring functions for FreeRTOS heap_5, including:

  1. Heap status statistics

  2. Task-level heap usage analysis

  3. Heap integrity check (heap corruption detection)

  4. Memory leak detection

Related API header file path: heap_trace/heap_trace.h

Heap Status Statistics

Enable Method: Enabled by default

API Name: heap_get_stats()

Example Output:

********** Heap Usage Status **********
Total Number Of Successful Allocations:    13
Total Number Of Successful Frees:          2
Current Available Heap Space:              96576 bytes
Minimum Ever Free Space Remaining:         96448 bytes
Number Of Free Blocks:                     3
Size Of Largest Free Block:                65520 bytes
Size Of Smallest Free Block:               31056 bytes

Description of Output Items:

  • Total Number Of Successful Allocations : The number of times malloc has been successfully called and a valid memory block returned.

  • Total Number Of Successful Frees : The number of times free has been successfully called and a memory block released.

  • Current Available Heap Space : The total currently available heap space.

  • Minimum Ever Free Space Remaining : The minimum amount of free heap space remaining since system startup (sum of all free blocks).

  • Number Of Free Blocks : The number of free memory blocks in the heap.

  • Size Of Largest Free Block : The size of the largest free block in the heap.

  • Size Of Smallest Free Block : The size of the smallest free block in the heap.
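
The counters above can be combined into two quick health indicators. The following is a minimal host-side sketch, not part of the heap_trace API: the struct heap_stats_snapshot_t and both helper functions are hypothetical names that only mirror the fields printed by heap_get_stats(), and the values would have to be copied in from the printed output.

```c
#include <stddef.h>

/* Hypothetical snapshot mirroring fields printed by heap_get_stats(). */
typedef struct {
    size_t successful_allocs;   /* Total Number Of Successful Allocations */
    size_t successful_frees;    /* Total Number Of Successful Frees */
    size_t available_bytes;     /* Current Available Heap Space */
    size_t largest_free_block;  /* Size Of Largest Free Block */
} heap_stats_snapshot_t;

/* Number of blocks that are currently allocated and not yet freed. */
size_t heap_outstanding_blocks(const heap_stats_snapshot_t *s)
{
    return s->successful_allocs - s->successful_frees;
}

/* Rough fragmentation estimate in percent.
 * 0 means all free space sits in one contiguous block. */
unsigned heap_fragmentation_percent(const heap_stats_snapshot_t *s)
{
    if (s->available_bytes == 0) {
        return 0;
    }
    return (unsigned)(100u - (100u * s->largest_free_block) / s->available_bytes);
}
```

For the example output above, this gives 11 outstanding blocks and a fragmentation estimate of 33 percent. An outstanding-block count that keeps rising between calls is a hint to look at the task-level analysis next.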

Task-level Heap Usage Analysis

Obtain the heap size used by each task. Task-level heap usage analysis can be used for memory leak detection: if the heap usage of a task keeps increasing, that task may have a memory leak.

Enable Method: Enabled by default. Set CONFIG_HEAP_TRACE_MAX_TASK_NUMBER in Kconfig to the maximum number of traced tasks that your application requires. If a task (e.g., task-A) is deleted before it frees all the memory it allocated with malloc, task-A permanently occupies one heap_task_info entry. It is recommended to set this value greater than the number of resident tasks; otherwise, once resident tasks occupy all heap_task_info entries, malloc calls from new tasks will cause an overflow error.

API Name: heap_get_per_task_info()

Example Output:

/* 1st call to heap_get_per_task_info() */
Deleted Task: main thread - task heap usage: 0x64b0
Task: TX - task heap usage: 0x180

/* 2nd call to heap_get_per_task_info() */
Deleted Task: main thread - task heap usage: 0x64b0
Task: TX - task heap usage: 0x1C0

/* 3rd call to heap_get_per_task_info() */
Deleted Task: main thread - task heap usage: 0x64b0
Task: TX - task heap usage: 0x200

Comparing the three results shows that the heap usage of the “TX” task keeps increasing, which indicates a potential memory leak. In addition, the “main thread” still shows heap usage after being deleted, which may also indicate a memory leak. At this point, you can use set_tracing() to narrow down the investigation scope, or manually review the code of “TX” and “main thread” to confirm whether any memory is left unreleased.

In this example, the “TX” task does have a memory leak. The “main thread”, in contrast, is deleted immediately after creating the TX task. Its heap usage comes from allocating the task stack for the TX task, which is expected. The code segment causing the memory leak in this example is shown below:

static void prvQueueSendTask( void *pvParameters )
{
   for( ;; )
   {
      /* Allocation but no free */
      uint32_t *p = malloc(8);
      uint32_t *q = malloc(8);
      vTaskDelay(50);
      heap_get_per_task_info();
   }
}

/* Main thread is used to create all app tasks */
int main( void )
{
   xTaskCreate( prvQueueSendTask,
                "TX",
                configMINIMAL_STACK_SIZE,
                NULL,
                mainQUEUE_SEND_TASK_PRIORITY,
                NULL );

   vTaskDelete(NULL);
   for( ;; );
}
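
The comparison of successive reports can also be automated. The sketch below is a hypothetical host-side helper, not part of the heap_trace API: it assumes the per-task usage values have already been collected (for example by hand from the printed output) into an array, and it flags a task whose usage rises on every sample.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when every sample is strictly larger than the previous one,
 * which is the pattern a steadily leaking task produces.
 * Fewer than two samples are not enough to decide, so the result is false. */
bool heap_usage_is_growing(const size_t *samples, size_t count)
{
    if (count < 2) {
        return false;
    }
    for (size_t i = 1; i < count; i++) {
        if (samples[i] <= samples[i - 1]) {
            return false;
        }
    }
    return true;
}
```

With the samples 0x180, 0x1C0, 0x200 of the “TX” task the helper returns true, while the constant 0x64b0 of the “main thread” yields false. A strictly growing series is only a hint: a task that is still filling a cache or queue shows the same pattern for a while.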

Caution

  • Users should manage heap memory within the lifecycle of a single task: memory allocated with malloc in task-1 should be released with free before task-1 is deleted.

  • It is not recommended to release task-1 memory in task-2 after task-1 has been deleted, because this causes incorrect task heap usage calculations. When task-1 is deleted, both its task name and its TCB (Task Control Block) are no longer valid.

Heap Integrity Check

Checks the integrity of heap data structures and content. We provide three protection modes:

  • Default Mode (Heap Header Data Structure Protection)

  • Lightweight Impact Mode (Boundary Guard)

  • Comprehensive Mode (Content Protector)

API Name: heap_check_integrity()

Default Mode (Heap Header Data Structure Protection)

The heap protector, which is enabled by default, detects most out-of-bounds writes and heap structure damage. This helps prevent hard-to-locate crashes caused by heap structure corruption, with almost no performance impact. For more details about the heap protector, refer to https://mcuoneclipse.com/2024/01/28/freertos-with-heap-protector/

Enable Method: Enabled by default. If you need to disable it, you can turn off CONFIG_HEAP_PROTECTOR in Kconfig.

Timing and Content of Checks:

  • Default Check: Each time the heap data structure is used, it checks whether the addresses in the heap data structure are valid values.

  • Enhanced Check: After enabling CONFIG_HEAP_INTEGRITY_CHECK_IN_TASK_SWITCHED_OUT in Kconfig, an integrity check will be performed each time a task is switched out.

  • Manual Check: Users can call heap_check_integrity() to check the integrity of all heap block structures.
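
The address validity check can be pictured with the following minimal sketch. It assumes the protector follows the approach described in the linked article, where block links are XOR-ed with a random canary and every decoded link must fall inside the heap region. The type heap_guard_t and both functions are hypothetical names for illustration and are not the kernel's own code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the heap bounds and the random canary. */
typedef struct {
    uintptr_t heap_start;
    uintptr_t heap_end;   /* one past the last valid heap address */
    uintptr_t canary;     /* random value chosen when the heap is initialised */
} heap_guard_t;

/* Obfuscate or recover a block link. XOR is its own inverse,
 * so the same function is used in both directions. */
uintptr_t heap_guard_xor(const heap_guard_t *g, uintptr_t link)
{
    return link ^ g->canary;
}

/* A stored link is valid only if it decodes to an address inside the heap. */
bool heap_guard_link_is_valid(const heap_guard_t *g, uintptr_t stored_link)
{
    uintptr_t decoded = heap_guard_xor(g, stored_link);
    return decoded >= g->heap_start && decoded < g->heap_end;
}
```

An overflow that overwrites a stored link with plain data will almost always decode to an address outside the heap, which is why this check catches most structure damage at very low cost.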

Lightweight Impact Mode (Boundary Guard)

The lightweight impact mode provides more precise detection of memory overflows. In memory management, a “canary” refers to a technique used for detecting memory overflows. Canary areas are filled with specific values during program execution and placed at the end of the stack or memory buffer. In lightweight impact mode, two canary areas are placed at the head and tail of each allocated heap block. The head canary area is filled with xHeadCanaryValue, and the tail canary area is filled with xTailCanaryValue.

#define xHeadCanaryValue 0xDEADBEEF     // Canary value at the head of each allocated heap block
#define xTailCanaryValue 0xCAFEBABE     // Canary value at the tail of each allocated heap block

Enable Method: Before enabling, please make sure CONFIG_HEAP_PROTECTOR is set to 1, then enable CONFIG_HEAP_CORRUPTION_DETECT_LITE in Kconfig.

Timing and Content of Checks:

  • Default Check: Each time free is called in lightweight impact mode, it verifies whether the canary bytes at the head and tail match the expected values.

  • Enhanced Check: After enabling CONFIG_HEAP_INTEGRITY_CHECK_IN_TASK_SWITCHED_OUT in Kconfig, an integrity check will be performed each time a task is switched out.

  • Manual Check: heap_check_integrity() is provided as a user API to verify whether the canary bytes of all allocated heap memory blocks match their expected values.

An overflow occurs when a program writes past the end of an allocated block. This usually causes corruption of the next contiguous chunk in the heap, whether or not it is allocated. An underflow occurs when a program writes before the start of an allocated block. This often corrupts the header of the block itself.

Negative Impact: Enabling the lightweight impact check increases memory usage. Each allocation uses an additional 2 * portBYTE_ALIGNMENT bytes of memory, so the exact overhead depends on the configured alignment.
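
The canary technique can be illustrated with a small host-side sketch built on the standard malloc and free. The wrapper names guarded_malloc, guarded_check and guarded_free are hypothetical, and the block layout (size and head canary in a header, tail canary directly after the payload) is an assumption chosen for readability rather than the layout used by the heap_5 implementation.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HEAD_CANARY 0xDEADBEEFu
#define TAIL_CANARY 0xCAFEBABEu

/* Hypothetical header placed directly in front of the payload. */
typedef struct {
    size_t size;         /* payload size, needed to locate the tail canary */
    size_t head_canary;  /* sits immediately before the payload */
} guard_header_t;

/* Allocate size bytes surrounded by a head and a tail canary. */
void *guarded_malloc(size_t size)
{
    guard_header_t *hdr = malloc(sizeof(*hdr) + size + sizeof(uint32_t));
    if (hdr == NULL) {
        return NULL;
    }
    hdr->size = size;
    hdr->head_canary = HEAD_CANARY;

    uint8_t *payload = (uint8_t *)(hdr + 1);
    const uint32_t tail = TAIL_CANARY;
    memcpy(payload + size, &tail, sizeof(tail)); /* payload end may be unaligned */
    return payload;
}

/* Returns true while both canaries still hold their expected values. */
bool guarded_check(const void *ptr)
{
    const guard_header_t *hdr = (const guard_header_t *)ptr - 1;
    uint32_t tail;
    memcpy(&tail, (const uint8_t *)ptr + hdr->size, sizeof(tail));
    return hdr->head_canary == HEAD_CANARY && tail == TAIL_CANARY;
}

/* Release a block obtained from guarded_malloc(). */
void guarded_free(void *ptr)
{
    if (ptr != NULL) {
        free((guard_header_t *)ptr - 1);
    }
}
```

Writing even one byte past the end of the payload changes the tail canary, so the next guarded_check() on that block returns false. This mirrors the check performed on free and in heap_check_integrity().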

Comprehensive Mode (Content Protector)

On top of the previous two features, comprehensive mode adds error checks for uninitialized access and use-after-free. In this mode, newly allocated blocks are initialized with the xFillAlocated pattern, while freed memory is filled with the xFillFreed pattern.

#define xFillAlocated   0xCC        // Newly allocated blocks are initialized with xFillAlocated
#define xFillFreed      0xDD        // All freed memory is filled with xFillFreed

Enable Method: Comprehensive mode is implemented on top of the lightweight impact mode. Before enabling, make sure that CONFIG_HEAP_CORRUPTION_DETECT_LITE is set to 1, and then enable CONFIG_HEAP_CORRUPTION_DETECT_COMPREHENSIVE in Kconfig.

Timing and Content of Checks:

  • Default Check: Each time malloc is called, the canary regions are verified and, in addition, every byte of each free block is checked against xFillFreed.

  • Manual Check: heap_check_integrity() is provided as a user API.

Negative Impact: Although comprehensive mode makes it easier to detect memory corruption and illegal access errors, it significantly affects runtime performance. This is because memory must be filled with a specific value every time malloc or free is called, and each check examines all memory spaces thoroughly. It is therefore recommended to enable this mode only while debugging memory errors, not in general or production environments. In addition, once comprehensive mode is enabled, the integrity check cannot be run in the task switch hook: the check is too time-consuming, and executing it in the hook would affect normal system scheduling.
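
The free-block check can be sketched as a plain byte scan. The example below is a hypothetical host-side illustration, not the heap_5 implementation: the function names are invented, and FILL_FREED simply reuses the 0xDD value of xFillFreed shown above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FILL_FREED 0xDD  /* same value as xFillFreed */

/* Fill a region with the freed pattern, as done when a block is released. */
void region_mark_freed(void *region, size_t len)
{
    memset(region, FILL_FREED, len);
}

/* Returns the offset of the first byte that no longer holds the freed
 * pattern, or len when the whole region is intact. */
size_t region_find_freed_violation(const void *region, size_t len)
{
    const uint8_t *bytes = region;
    for (size_t i = 0; i < len; i++) {
        if (bytes[i] != FILL_FREED) {
            return i;
        }
    }
    return len;
}
```

A write through a stale pointer changes at least one byte of the pattern, so the scan reports a use-after-free at that offset. The scan touches every free byte, which is why this mode is so much slower than the canary check.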

Memory Leak Detection

How to Diagnose Memory Leaks:

It is recommended to first use the heap_get_stats() or heap_get_per_task_info() APIs to narrow the leak down to a specific task or function sequence in which the available memory continuously decreases and never recovers. Then use the heap trace function within this smaller scope. The heap trace function is enabled and used as follows:

Enable Method:

  • Before enabling, make sure CONFIG_HEAP_PROTECTOR is set to 1, and then enable CONFIG_HEAP_TRACE in Kconfig.

  • In Kconfig, set CONFIG_HEAP_TRACE_STACK_DEPTH to the maximum depth of the malloc backtrace stack.

  • In Kconfig, set CONFIG_HEAP_TRACE_MAX_TASK_NUMBER to the maximum number of tasks. If a task allocates memory with malloc but does not free all of it before being deleted, it permanently occupies one entry in heap_task_info. It is recommended to set this value larger than the number of resident tasks; otherwise, once resident tasks occupy all heap_task_info entries, malloc calls from new tasks will report an overflow.

API Names:

  • heap_trace_init(): Initializes the structure for malloc/free logs.

  • heap_trace_start(): Starts leak tracing.

  • heap_trace_stop(): Stops leak tracing.

  • heap_trace_record_dump(): Displays the allocations made between start and stop that have no corresponding free record.

The num_records parameter passed to heap_trace_init() is the size of the record pool, that is, the maximum number of memory blocks that can be tracked as “currently malloc’ed but not freed.” Suppose the user retrieves the following data using heap_get_stats():

Total Number Of Successful Allocations:    371
Total Number Of Successful Frees:          314

Setting num_records to 371 - 314 = 57 is generally sufficient, except in extreme cases. For example, if more than 60 mallocs happen consecutively before any free, records might be lost. To cover bursts of consecutive mallocs that occur when the pool is nearly full, it is recommended to add a few more records on top of 57.
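
The sizing rule is plain arithmetic and can be written down as a small helper. The function name and the margin parameter are hypothetical; the counters are the two values reported by heap_get_stats().

```c
#include <stddef.h>

/* Suggest a record pool size: blocks currently outstanding plus a safety
 * margin for bursts of consecutive mallocs. */
size_t heap_trace_suggest_num_records(size_t allocs, size_t frees, size_t margin)
{
    size_t outstanding = (allocs > frees) ? (allocs - frees) : 0;
    return outstanding + margin;
}
```

With 371 allocations, 314 frees and a margin of 8, the helper suggests 65 records. The margin is a judgment call that depends on how bursty the allocation pattern of the traced code is.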

heap_trace_record_dump() will display the leak records as shown below, including the malloc address, size, and malloc call stack:

====== Heap Trace Log Count: 2 records (log buffer capacity: 8) ======
560 bytes (@ 0x101086c0) allocated, caller bakctrace: 0x0e02b544 0x0e039274 0x0e03a630 0x0e032ac2 0x0e035900
1072 bytes (@ 0x10108aa0) allocated, caller bakctrace: 0x0e02943e 0x0e02b5a6 0x0e045b70 0x0e03a664 0x0e032ac2 0x0e035900
========================= Heap Trace Summary =========================
Mode: Heap Trace Leaks
1632 bytes 'leaked' in trace (2 allocations)
records: 2 (8 capacity, 3 high water mark)
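
The record, capacity and high water mark figures in the summary can be understood with a small model of the record pool. The sketch below is a host-side illustration under stated assumptions: the types and functions are hypothetical and do not reproduce the heap_trace implementation. It only shows the bookkeeping idea of adding a record on malloc and removing it on the matching free.

```c
#include <stdbool.h>
#include <stddef.h>

#define TRACE_CAPACITY 8  /* plays the role of num_records */

typedef struct {
    const void *address;
    size_t size;
    bool in_use;
} trace_record_t;

typedef struct {
    trace_record_t records[TRACE_CAPACITY];
    size_t count;            /* records currently held */
    size_t high_water_mark;  /* largest count seen so far */
} trace_pool_t;

/* Log an allocation. Returns false when the pool is full and the record is lost. */
bool trace_on_malloc(trace_pool_t *pool, const void *address, size_t size)
{
    for (size_t i = 0; i < TRACE_CAPACITY; i++) {
        if (!pool->records[i].in_use) {
            pool->records[i] = (trace_record_t){ address, size, true };
            pool->count++;
            if (pool->count > pool->high_water_mark) {
                pool->high_water_mark = pool->count;
            }
            return true;
        }
    }
    return false;
}

/* Drop the record that matches a freed address, if that address was traced. */
void trace_on_free(trace_pool_t *pool, const void *address)
{
    for (size_t i = 0; i < TRACE_CAPACITY; i++) {
        if (pool->records[i].in_use && pool->records[i].address == address) {
            pool->records[i].in_use = false;
            pool->count--;
            return;
        }
    }
}

/* Sum of the sizes of all records that were never freed. */
size_t trace_leaked_bytes(const trace_pool_t *pool)
{
    size_t total = 0;
    for (size_t i = 0; i < TRACE_CAPACITY; i++) {
        if (pool->records[i].in_use) {
            total += pool->records[i].size;
        }
    }
    return total;
}
```

Three traced allocations of 560, 64 and 1072 bytes followed by a free of the 64-byte block leave 2 records, a high water mark of 3 and 1632 bytes reported as leaked in this model, which has the same shape as the summary above.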

Memory Leak False Positives:

  • During the trace period, if a free is immediately followed by a malloc at the same address, a leak record might be left behind. Checking with heap_get_stats() will nevertheless show that the remaining heap size is unchanged. For example, in WIFI lwip, the sys_check_timeouts function uses six memory blocks and often triggers this report. The addresses or sizes of such false positive records are usually the same from one dump to the next.

  • RTOS tasks or components that are created dynamically after heap_trace_start() but not deleted before heap_trace_stop() are reported as leaks, because their memory is still allocated when tracing stops.

  • Try to use the heap_trace API within a single task. If the scope is too large, leaked records will be hard to analyze.