mm/dump:add command to support dump memory usage of all pids #18468
xiaoqizhan wants to merge 1 commit into apache:master from
For the memdump command, when CONFIG_MM_BACKTRACE >= 0 is enabled, it currently only supports dumping all leaked memory nodes and memory usage for a specific PID. It does not support dumping memory usage of all processes in one go.
In some scenarios, it is necessary to dump memory usage of all processes at once. For example, when the memory pressure monitoring system detects that memory falls below a certain threshold, it needs to obtain the memory usage of all processes. Similarly, in some automation scripts, it is necessary to periodically collect memory data of each process for memory leak analysis.
Signed-off-by: zhanxiaoqi <zhanxiaoqi@bytedance.com>
acassis left a comment
@xiaoqizhan thank you for extending memdump. I noticed we don't have any documentation about memdump, only about dumpstack and similar commands. Could you please add an initial Documentation page about it (just basic info)?
Or, if you want to include more info, consider adding things like the response from @anjiahao1 in issue #9504 to make it more complete and helpful for developers.
unsigned long seqmin;
unsigned long seqmax;
Why not use uintptr_t instead of "unsigned long"? It should be more adaptable to different architectures and data types.
@acassis, thank you for your review and feedback. The parameters of the memdump command are passed down via the memdump_write method; the relevant code in this method is as follows:
struct mm_memdump_s dump =
{
  PID_MM_ALLOC,
#if CONFIG_MM_BACKTRACE >= 0
  0,
  ULONG_MAX
#endif
};
The type of mm_memdump_s is struct malltask, whose definition is shown below:
struct malltask
{
  pid_t pid;                    /* Process id */
#if CONFIG_MM_BACKTRACE >= 0
  unsigned long seqmin;         /* The minimum sequence */
  unsigned long seqmax;         /* The maximum sequence */
#endif
};
To maintain consistency with the previous data types, I also use unsigned long for seqmin and seqmax here.
Thank you @xiaoqizhan, so let's keep it for now, but I suggest fixing it later (in another PR) by using "uintptr_t" so that it adapts better to each architecture.
@acassis Thank you for your valuable suggestion. I will try to optimize it in another PR as you recommended.
tcb_arg.heap = heap;
tcb_arg.seqmin = dump->seqmin;
tcb_arg.seqmax = dump->seqmax;
nxsched_foreach(memdump_tcb_handler, &tcb_arg);
Why not reuse memdump_handler? The current approach has two loops (tcb + mem), which is much slower than one loop.
@xiaoxiang781216 Thank you for your review and feedback. If memdump_handler were reused, it would in turn call memdump_allocnode, which dumps detailed information for each node (including call stacks and other logs); that does not meet the current requirement. My patch instead calls mm_foreach(heap, mallinfo_task_handler, &handle) to traverse all memory nodes of the heap and accumulate statistics, without recording or printing anything else.
Regarding the number of loops, three are required in total: nxsched_foreach iterates over the tasks, mm_foreach iterates over the mm_nregions regions of the heap, and each region's memory nodes are then traversed to run the statistics callback. The implementations of memdump_handler and mallinfo_task_handler are largely similar.
In the existing implementation, the memdump pid command traverses the heap via mm_foreach(heap, memdump_handler, &priv), which already involves two loops: one over the mm_nregions regions of the heap and one over the memory nodes of each region. Obtaining memory information for all PIDs adds one more loop level, so the total is three.
OK, but why not pass dump to nxsched_foreach and remove memdump_tcb_arg_s?
BTW, it would be better to integrate this feature into meminfo_read behind a Kconfig option (e.g. CONFIG_FS_PROCFS_MEMINFO_DETAIL) by extending procfs_meminfo_entry_s::mallinfo to procfs_meminfo_entry_s::mallinfo_task.
}

#if CONFIG_MM_BACKTRACE >= 0
struct memdump_tcb_arg_s
Please put the type definition in this section:
/****************************************************************************
 * Private Types
 ****************************************************************************/
break;

#if CONFIG_MM_BACKTRACE >= 0
case 'a':
Suggested change:
- case 'a':
+ case 'a':
/* Special PID to query the info about alloc, free and mempool */

#if CONFIG_MM_BACKTRACE >= 0
Suggested change:
- #if CONFIG_MM_BACKTRACE >= 0
Remove this; the definition does not need a macro guard.
}

#if CONFIG_MM_BACKTRACE >= 0
struct memdump_tcb_arg_s
Move this to the beginning of the file and add a Private Types section.
task.seqmin = tcb_arg->seqmin;
task.seqmax = tcb_arg->seqmax;
info = mm_mallinfo_task(tcb_arg->heap, &task);
syslog(LOG_INFO, "pid:%5d, used:%10d, nused:%10d\n",
Suggested change:
- syslog(LOG_INFO, "pid:%5d, used:%10d, nused:%10d\n",
+ syslog(LOG_INFO, "PID: %5d, Used: %10d, NUsed: %10d\n",
task.seqmin = tcb_arg->seqmin;
task.seqmax = tcb_arg->seqmax;
info = mm_mallinfo_task(tcb_arg->heap, &task);
syslog(LOG_INFO, "pid:%5d, used:%10d, nused:%10d\n",
Could we filter out PIDs that don't use any memory?
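One way such a filter might look is sketched below. It is an assumption for illustration only, not the code in this PR: `demo_format_usage` is a hypothetical helper that formats into a buffer instead of calling syslog, uses the format string from the suggested change above, and emits nothing when both counters are zero.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format one per-PID line, or produce an empty
 * string when the task owns no allocations.  Returns the number of
 * characters snprintf() would write (0 when the entry is filtered out).
 */

int demo_format_usage(char *buf, size_t len, int pid, int used, int nused)
{
  if (used == 0 && nused == 0)
    {
      if (len > 0)
        {
          buf[0] = '\0';
        }

      return 0;
    }

  return snprintf(buf, len, "PID: %5d, Used: %10d, NUsed: %10d\n",
                  pid, used, nused);
}
```

A caller would log the buffer only when the return value is greater than zero, so idle PIDs do not appear in the output.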
Summary
This change extends the functionality of the memdump command to support dumping memory usage of all processes at once. Previously, when CONFIG_MM_BACKTRACE >= 0 was enabled, the memdump command only supported dumping leaked memory nodes and memory usage of a specific PID, but not all processes in one operation.
The enhancement addresses practical use cases such as:
Obtaining full process memory usage when the memory pressure monitoring system detects memory below a certain threshold;
Collecting periodic memory data of all processes in automation scripts for memory leak analysis.
New subcommands are added:
memdump a / memdump allpid: Dump memory usage of all processes;
memdump allpid [seqmin] [seqmax]: Dump memory usage of all processes, restricted to the sequence-number range [seqmin, seqmax].
This feature is only active when CONFIG_MM_BACKTRACE >= 0; it is not available if CONFIG_MM_BACKTRACE < 0.
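The argument handling for the new subcommands could be sketched as follows. This is a hedged illustration, not the parser used by this PR: `demo_allpid_args_s`, `demo_parse_ulong` and `demo_parse_allpid` are hypothetical names, and the default range of 0 to ULONG_MAX is assumed from the `mm_memdump_s` initializer quoted earlier in the conversation.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

struct demo_allpid_args_s
{
  unsigned long seqmin;
  unsigned long seqmax;
};

/* Parse one unsigned number; reject empty strings and trailing junk. */

static int demo_parse_ulong(const char *str, unsigned long *value)
{
  char *end;

  errno = 0;
  *value = strtoul(str, &end, 0);
  if (errno != 0 || end == str || *end != '\0')
    {
      return -EINVAL;
    }

  return 0;
}

/* Parse "a" or "allpid" followed by optional seqmin and seqmax.
 * argv[0] is the subcommand itself.  Returns 0 on success or -EINVAL.
 */

int demo_parse_allpid(int argc, char *const argv[],
                      struct demo_allpid_args_s *out)
{
  if (argc < 1 || argc > 3 ||
      (strcmp(argv[0], "a") != 0 && strcmp(argv[0], "allpid") != 0))
    {
      return -EINVAL;
    }

  /* Assumed defaults: cover the whole sequence range */

  out->seqmin = 0;
  out->seqmax = ULONG_MAX;

  if (argc >= 2 && demo_parse_ulong(argv[1], &out->seqmin) < 0)
    {
      return -EINVAL;
    }

  if (argc == 3 && demo_parse_ulong(argv[2], &out->seqmax) < 0)
    {
      return -EINVAL;
    }

  return out->seqmin <= out->seqmax ? 0 : -EINVAL;
}
```

With this sketch, `allpid` alone selects the full range, while `allpid 0 1000` narrows it to sequence numbers 0 through 1000.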
Impact
User Impact
New subcommands (allpid/a) are added to the memdump command, which are fully backward-compatible (existing memdump functionalities for leaked nodes and specific PID remain unchanged);
Users need to ensure CONFIG_MM_BACKTRACE >= 0 (e.g., 0 or 3) to access the new features; the new commands are not available when CONFIG_MM_BACKTRACE < 0.
Compatibility
No breaking changes to existing code or build process;
The new logic is wrapped under the existing CONFIG_MM_BACKTRACE macro, so no additional configuration dependencies are introduced.
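The compile-time gating described here can be pictured with the minimal sketch below. `demo_allpid_enabled` is a hypothetical function, and the fallback definition of CONFIG_MM_BACKTRACE exists only so the snippet compiles on its own; in a real build the value comes from the board configuration.

```c
#include <string.h>

#ifndef CONFIG_MM_BACKTRACE
#  define CONFIG_MM_BACKTRACE 0  /* Assumed value for this sketch only */
#endif

/* Returns 1 if cmd names the new subcommand and the feature is built in,
 * 0 otherwise.
 */

int demo_allpid_enabled(const char *cmd)
{
#if CONFIG_MM_BACKTRACE >= 0
  return strcmp(cmd, "a") == 0 || strcmp(cmd, "allpid") == 0;
#else
  (void)cmd;  /* Subcommand compiled out when backtrace is disabled */
  return 0;
#endif
}
```

Building the same file with `-DCONFIG_MM_BACKTRACE=-1` makes the function return 0 for every input, mirroring the negative test case.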
Documentation
Relevant documentation (e.g., command usage docs) needs to be updated to include the new memdump subcommands and their usage constraints (macro dependency).
Testing
Build Configuration:
Test 1: CONFIG_MM_BACKTRACE = 0
Test 2: CONFIG_MM_BACKTRACE = 3
Test 3: CONFIG_MM_BACKTRACE = -1 (negative value for negative case verification)
Test Steps & Results
Step 1: Configure CONFIG_MM_BACKTRACE = 0 in the build config, compile and flash the firmware to the target board.
Step 2: Execute memdump a (alias for memdump allpid) on the target shell; verify the output format and content:
pid: 853, used: 0, nused: 0
pid: 22, used: 104336, nused: 77
pid: 23, used: 340856, nused: 160
pid: 24, used: 0, nused: 0
Step 3: Execute memdump allpid 0 1000; verify that the output uses the same format as above and is filtered by the 0-1000 seq range.
Step 4: Repeat Step 1-3 with CONFIG_MM_BACKTRACE = 3; confirm the same output behavior (new commands work normally).
Step 5: Configure CONFIG_MM_BACKTRACE = -1, recompile and flash the firmware.
Step 6: Execute memdump a / memdump allpid on the target shell; verify that the commands are not recognized (no response for the new subcommands) and that only the original memdump functionality is retained.
Step 7: Verify that existing memdump features (dumping leaked memory nodes, memory usage of a specific PID) work as expected after the change, with no regression.
Test Logs (Key Snippets)
Log for CONFIG_MM_BACKTRACE=0 + memdump allpid
root@board:/# memdump allpid
pid: 22, used: 104336, nused: 77
pid: 23, used: 340856, nused: 160
Log for CONFIG_MM_BACKTRACE=0 + memdump allpid 0 1000
root@board:/# memdump allpid 0 1000
pid: 22, used: 0, nused: 0
pid: 23, used: 340856, nused: 160
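For the automation-script use case mentioned in the summary, log lines in the format shown above could be collected with a parser along these lines. This is a sketch under assumptions: `demo_usage_line_s` and `demo_parse_usage_line` are hypothetical names, and the lowercase `pid:`/`used:`/`nused:` labels are taken from the test logs in this description (a review suggestion proposes different capitalization).

```c
#include <stdio.h>

struct demo_usage_line_s
{
  int pid;
  int used;
  int nused;
};

/* Parse one "pid: N, used: N, nused: N" line.  Returns 0 on success and
 * -1 if the line does not match (for example a shell prompt line).
 */

int demo_parse_usage_line(const char *line, struct demo_usage_line_s *out)
{
  int pid;
  int used;
  int nused;

  /* Whitespace in the format matches any amount of padding */

  if (sscanf(line, " pid: %d, used: %d, nused: %d",
             &pid, &used, &nused) != 3)
    {
      return -1;
    }

  out->pid = pid;
  out->used = used;
  out->nused = nused;
  return 0;
}
```

A periodic collector could feed each captured line through this function and ignore lines that return -1.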