update readme
302
README.md
@@ -1,76 +1,83 @@
|
|||||||
## Variable Monitor
|
## Variable Monitor
|
||||||
|
|
||||||
Monitors numerical variables (given an address and length) and prints system stack information when a configured threshold is exceeded.
|
changelog
|
||||||
|
|
||||||
Limits on simultaneous monitoring
|
|
||||||
- Watches with the same timer interval are grouped together and share one timer.
|
|
||||||
- A group holds up to 32 variables; beyond that, a new timer is allocated.
|
|
||||||
- The global maximum number of timers is 128.
|
|
||||||
- These limits are defined as macros in the `watch_module.h` header.
|
|
||||||
|
|
||||||
Monitoring is currently limited to a single application; simultaneous calls from multiple applications are not supported.
|
|
||||||
- Multiple applications can work normally if only one program calls `cancel_all_watch();`.
|
|
||||||
|
|
||||||
## Usage
|
|
||||||
|
|
||||||
Example: helloworld.c
|
|
||||||
- Add `#include "watch.h"`
|
|
||||||
- Set each variable that needs to be monitored: name && address && length, set threshold, comparison method, timer interval (ns), etc.
|
|
||||||
- `start_watch(watch_arg);` Start monitoring
|
|
||||||
- Call `cancel_all_watch();` when you need to cancel monitoring
|
|
||||||
|
|
||||||
When a threshold is exceeded, the system stack information is printed to the kernel log and can be viewed with `dmesg`, as in the following example:
|
|
||||||
- Within one timer, if several variables exceed their thresholds, the stack information is printed only once;
|
|
||||||
- After printing the stack, the timer restarts after 1 s, and the next round of monitoring begins then.
|
|
||||||
|
|
||||||
```log
|
```log
|
||||||
[86245.364861] -------------------------------------
|
11.9 Support for monitoring multiple variables
|
||||||
[86245.364864] -------------watch monitor-----------
|
11.10 Kernel structures are separated by pid, so each process can register and cancel its own monitoring.
|
||||||
[86245.364865] Threshold reached:
|
11.13 User interface cancel_all_watch -> cancel_watch; processes no longer interfere with each other.
|
||||||
name: temp0, threshold: 150
|
11.28 Complete refactor; documentation updated.
|
||||||
[86245.364866] Timestamp (ns): 1699589000606300743
|
|
||||||
[86245.364867] Recent Load: 116.65, 126.83, 151.17
|
|
||||||
[86245.365669] task: name lcore-worker-4, pid 803327
|
|
||||||
[86245.365672] task: name lcore-worker-5, pid 803328
|
|
||||||
[86245.365673] task: name lcore-worker-6, pid 803329
|
|
||||||
[86245.365674] task: name lcore-worker-7, pid 803330
|
|
||||||
[86245.365676] task: name lcore-worker-8, pid 803331
|
|
||||||
[86245.365677] task: name lcore-worker-9, pid 803332
|
|
||||||
[86245.365679] task: name lcore-worker-10, pid 803333
|
|
||||||
[86245.365681] task: name lcore-worker-11, pid 803334
|
|
||||||
[86245.365682] task: name lcore-worker-68, pid 803335
|
|
||||||
[86245.365683] task: name lcore-worker-69, pid 803336
|
|
||||||
[86245.365684] task: name lcore-worker-70, pid 803337
|
|
||||||
[86245.365685] task: name lcore-worker-71, pid 803338
|
|
||||||
[86245.365686] task: name lcore-worker-72, pid 803339
|
|
||||||
[86245.365687] task: name lcore-worker-73, pid 803340
|
|
||||||
[86245.365688] task: name lcore-worker-74, pid 803341
|
|
||||||
[86245.365689] task: name lcore-worker-75, pid 803342
|
|
||||||
[86245.365694] task: name pkt:worker-0, pid 803638
|
|
||||||
[86245.365702] hrtimer_nanosleep+0x8d/0x120
|
|
||||||
[86245.365709] __x64_sys_nanosleep+0x96/0xd0
|
|
||||||
[86245.365711] do_syscall_64+0x37/0x80
|
|
||||||
[86245.365716] entry_SYSCALL_64_after_hwframe+0x44/0xae
|
|
||||||
[86245.365718] task: name pkt:worker-1, pid 803639
|
|
||||||
[86245.365721] hrtimer_nanosleep+0x8d/0x120
|
|
||||||
[86245.365724] __x64_sys_nanosleep+0x96/0xd0
|
|
||||||
[86245.365726] do_syscall_64+0x37/0x80
|
|
||||||
[86245.365728] entry_SYSCALL_64_after_hwframe+0x44/0xae
|
|
||||||
[86245.365730] task: name pkt:worker-2, pid 803640
|
|
||||||
[86245.365732] hrtimer_nanosleep+0x8d/0x120
|
|
||||||
[86245.365734] __x64_sys_nanosleep+0x96/0xd0
|
|
||||||
[86245.365737] do_syscall_64+0x37/0x80
|
|
||||||
[86245.365739] entry_SYSCALL_64_after_hwframe+0x44/0xae
|
|
||||||
[86245.365740] task: name pkt:worker-3, pid 803641
|
|
||||||
[86245.365743] hrtimer_nanosleep+0x8d/0x120
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### Parameter Description
|
## Description
|
||||||
|
|
||||||
`start_watch` takes a `watch_arg` structure. The fields are as follows:
|
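Monitors numerical variables (given an address and length) and, when a configured condition is met, prints information on tasks in the system (user-space stack / kernel stack / call chain).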
|
||||||
- `name` is limited to `MAX_NAME_LEN` (15) valid characters
|
- Multiple processes are supported; when a process exits, all of its monitoring is cancelled.
|
||||||
|
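- Watches with the same timer interval are assigned to the same timer; one timer monitors at most 32 variables, and there are at most 128 timers globally.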
|
||||||
|
- These limits are defined in `source/module/monitor_timer.h`.
|
||||||
|
- `testcase/helloworld.c` has been tested with 2049 variables in a single process.
|
||||||
|
|
||||||
|
File structure
|
||||||
|
|
||||||
|
```log
|
||||||
|
├── build // output
|
||||||
|
├── source // all source code
|
||||||
|
│ ├── buffer // buffer for communication between the module and user space
|
||||||
|
│ ├── module // kernel module code
|
||||||
|
│ ├── uapi // user-space interface
|
||||||
|
│ ├── ucli // user-space command-line tool
|
||||||
|
│ └── ucli_py // user-space command-line tool in Python (for testing only, unfinished)
|
||||||
|
│ └── libunwind // library ported for parsing stack information from Python
|
||||||
|
├── testcase // test cases
|
||||||
|
└── tools // test tools
|
||||||
|
```
|
||||||
|
|
||||||
|
## Usage
|
||||||
|
|
||||||
|
There are two ways to set up variable monitoring: use the macros, or define a `watch_arg` structure.
|
||||||
|
- Both require the header under `source/uapi`: `#include "monitor_user.h"`
|
||||||
|
|
||||||
|
Call `cancel_watch();` to cancel monitoring; variable_monitor then cancels all monitoring for the calling process.
|
||||||
|
- The same cleanup runs when the process exits, cancelling all of its monitoring.
|
||||||
|
- Calling `cancel_watch();` is therefore optional, but it is still recommended to avoid possible memory leaks.
|
||||||
|
|
||||||
|
Collecting task information is time-consuming, so it is handled in a workqueue; after each dump, the timer restarts after an interval that defaults to 5 s.
|
||||||
|
- This value can be viewed and modified via `/proc/variable_monitor/dump_reset_sec`.
|
||||||
|
|
||||||
|
### Loading the driver
|
||||||
|
|
||||||
|
From the project root directory:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Build and load the module
|
||||||
|
make && insmod source/variable_monitor.ko
|
||||||
|
# Unload the module and clean build files
|
||||||
|
# rmmod source/variable_monitor.ko && make clean
|
||||||
|
# Only tested on `kernel 5.17.15-1.el8.x86_64`; other kernel versions are untested.
|
||||||
|
```
|
||||||
|
|
||||||
|
### Macros
|
||||||
|
|
||||||
|
See `testcase/helloworld.c` for an example. Macros are provided for common numeric types for convenience:
|
||||||
|
- For other types, see `source/uapi/monitor_user_sw.h`
|
||||||
|
```c
|
||||||
|
// arguments: variable name | address | threshold
|
||||||
|
START_WATCH_INT("temp", &temp, 150);
|
||||||
|
START_WATCH_INT_LESS("temp", &temp, 150);
|
||||||
|
```
|
||||||
|
|
||||||
|
By default, watches created with the macros use a timer interval of 10 us; this value can be viewed and modified via `/proc/variable_monitor/def_interval_ns`.
|
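The tunables under `/proc/variable_monitor/` can also be read or set from a program. The following is a minimal sketch, assuming each file holds a single decimal integer (the README does not specify the file format); `read_ulong_file` and `write_ulong_file` are hypothetical helper names, not part of the project.

```c
#include <stdio.h>

/* Hypothetical helper: read one unsigned decimal value from a file such as
 * /proc/variable_monitor/def_interval_ns. Assumes the file contains a single
 * integer. Returns 0 on success, -1 on failure. */
static int read_ulong_file(const char *path, unsigned long *out)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    int matched = fscanf(f, "%lu", out);
    fclose(f);
    return matched == 1 ? 0 : -1;
}

/* Hypothetical helper: write one unsigned decimal value followed by a newline.
 * Returns 0 on success, -1 on failure. */
static int write_ulong_file(const char *path, unsigned long value)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    int ok = fprintf(f, "%lu\n", value) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Under that assumption, writing 20000 to `def_interval_ns` would change the default macro interval from 10 us to 20 us; writing to files under `/proc` normally requires root.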
||||||
|
|
||||||
|
### watch_arg structure
|
||||||
|
|
||||||
|
For more control over the timer interval and other settings, define a `watch_arg` structure and start monitoring with `start_watch`:
|
||||||
|
- For each variable to monitor, set the name, address, and length, plus the threshold, comparison method, timer interval (ns), and so on.
|
||||||
|
- `start_watch(watch_arg);` starts monitoring
|
||||||
|
- Call `cancel_watch();` when monitoring should be cancelled
|
||||||
|
|
||||||
```c
|
```c
|
||||||
|
// start_watch takes a watch_arg structure. The fields are described below
|
||||||
|
// - name is limited to `MAX_NAME_LEN` (15) valid characters
|
||||||
typedef struct
|
typedef struct
|
||||||
{
|
{
|
||||||
pid_t task_id; // current process id
|
pid_t task_id; // current process id
|
||||||
@@ -82,63 +89,156 @@ typedef struct
|
|||||||
unsigned char greater_flag; // reverse flag (true: >, false: <)
|
unsigned char greater_flag; // reverse flag (true: >, false: <)
|
||||||
unsigned long time_ns; // timer interval (ns)
|
unsigned long time_ns; // timer interval (ns)
|
||||||
} watch_arg;
|
} watch_arg;
|
||||||
```
|
|
||||||
|
|
||||||
An initialization example
|
// An initialization example
|
||||||
|
|
||||||
```c
|
|
||||||
watch_args = (watch_arg){
|
watch_args = (watch_arg){
|
||||||
.task_id = getpid(),
|
.task_id = getpid(),
|
||||||
.ptr = &temp,
|
.ptr = &temp,
|
||||||
.name = "temp",
|
.name = "temp",
|
||||||
.length_byte = sizeof(int),
|
.length_byte = sizeof(int),
|
||||||
.threshold = 150 + i,
|
.threshold = 150,
|
||||||
.unsigned_flag = 0,
|
.unsigned_flag = 0,
|
||||||
.greater_flag = 1,
|
.greater_flag = 1,
|
||||||
.time_ns = 2000 + (i / 33) * 5000
|
.time_ns = 2000 + 5000
|
||||||
};
|
};
|
||||||
|
start_watch(watch_args);
|
||||||
|
```
|
||||||
|
|
||||||
|
### Output
|
||||||
|
|
||||||
|
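The timer polls the variables at the configured interval. When a condition is met, information on the matching tasks in the system is collected (user-space stack / kernel stack / call chain).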
|
||||||
|
- `dmesg` shows which variables met their conditions;
|
||||||
|
- Task information is written to a buffer and can be viewed with the ucli tool.
|
||||||
|
|
||||||
|
Example `dmesg` output:
|
||||||
|
|
||||||
|
```log
|
||||||
|
[42865.640988] -------------------------------------
|
||||||
|
[42865.640992] -----------variable monitor----------
|
||||||
|
[42865.640993] 超出阈值:1701141698684973655
|
||||||
|
[42865.640994] : pid: 63936, name: temp0, ptr: 00000000bade6e61, threshold:110
|
||||||
|
[42865.648068] -------------------------------------
|
||||||
|
[42875.640703] -------------------------------------
|
||||||
|
[42875.640706] -----------variable monitor----------
|
||||||
|
[42875.640706] 超出阈值:1701141708684881779
|
||||||
|
[42875.640708] : pid: 63936, name: temp0, ptr: 00000000bade6e61, threshold:110
|
||||||
|
[42875.640710] : pid: 63936, name: temp1, ptr: 00000000ee645b96, threshold:111
|
||||||
|
[42875.640711] : pid: 63936, name: temp2, ptr: 00000000f62b7afe, threshold:112
|
||||||
|
[42875.640711] : pid: 63936, name: temp3, ptr: 00000000d100fa3c, threshold:113
|
||||||
|
[42875.640712] : pid: 63936, name: temp4, ptr: 000000006d31cae1, threshold:114
|
||||||
|
[42875.640712] : pid: 63936, name: temp5, ptr: 00000000723c7a2a, threshold:115
|
||||||
|
[42875.640713] : pid: 63936, name: temp6, ptr: 0000000026ef6e83, threshold:116
|
||||||
|
[42875.640714] : pid: 63936, name: temp7, ptr: 00000000fc1e5d5e, threshold:117
|
||||||
|
[42875.640714] : pid: 63936, name: temp8, ptr: 0000000069b2666e, threshold:118
|
||||||
|
[42875.640715] : pid: 63936, name: temp9, ptr: 000000000176263d, threshold:119
|
||||||
|
[42875.648023] -------------------------------------
|
||||||
|
```
|
||||||
|
|
||||||
|
By default, `ucli` is placed in the build folder after compilation.
|
||||||
|
|
||||||
|
`ucli > output`
|
||||||
|
- ucli parses the buffer contents and writes them to the `output` file.
|
||||||
|
- **This operation clears the buffer**
|
||||||
|
|
||||||
|
Example output of the `ucli` tool (see output_example for details):
|
||||||
|
- userstack is the stack-information test program under testcase.
|
||||||
|
|
||||||
|
```log
|
||||||
|
##CGROUP:[/] 51666 [510] 采样命中[D]
|
||||||
|
进程信息: [/ / userstack], PID: 51666 / 51666
|
||||||
|
##C++ pid 51666
|
||||||
|
用户态堆栈SP:7ffcd5822298, BP:2, IP:7f071c720838
|
||||||
|
#~ 0x7f071c720838 __GI___nanosleep ([symbol])
|
||||||
|
#~ 0x7f071c72076e __sleep ([symbol])
|
||||||
|
#~ 0x400a08 customFunction1 ([symbol])
|
||||||
|
#~ 0x400a64 customFunction3 ([symbol])
|
||||||
|
#~ 0x400a42 customFunction2 ([symbol])
|
||||||
|
#~ 0x400a21 customFunction1 ([symbol])
|
||||||
|
#~ 0x400a64 customFunction3 ([symbol])
|
||||||
|
#~ 0x400a42 customFunction2 ([symbol])
|
||||||
|
#~ 0x400a21 customFunction1 ([symbol])
|
||||||
|
#~ 0x400a64 customFunction3 ([symbol])
|
||||||
|
#~ 0x400a42 customFunction2 ([symbol])
|
||||||
|
#~ 0x400a21 customFunction1 ([symbol])
|
||||||
|
#~ 0x400a64 customFunction3 ([symbol])
|
||||||
|
#~ 0x400a42 customFunction2 ([symbol])
|
||||||
|
#~ 0x400a21 customFunction1 ([symbol])
|
||||||
|
#~ 0x400a64 customFunction3 ([symbol])
|
||||||
|
#~ 0x400a42 customFunction2 ([symbol])
|
||||||
|
#~ 0x400a21 customFunction1 ([symbol])
|
||||||
|
#~ 0x400a64 customFunction3 ([symbol])
|
||||||
|
#~ 0x400a42 customFunction2 ([symbol])
|
||||||
|
#~ 0x400a21 customFunction1 ([symbol])
|
||||||
|
#~ 0x400a64 customFunction3 ([symbol])
|
||||||
|
#~ 0x400a42 customFunction2 ([symbol])
|
||||||
|
#~ 0x400a21 customFunction1 ([symbol])
|
||||||
|
#~ 0x400a64 customFunction3 ([symbol])
|
||||||
|
#~ 0x400a42 customFunction2 ([symbol])
|
||||||
|
#~ 0x400a21 customFunction1 ([symbol])
|
||||||
|
#~ 0x400a64 customFunction3 ([symbol])
|
||||||
|
#~ 0x400a42 customFunction2 ([symbol])
|
||||||
|
#~ 0x400a21 customFunction1 ([symbol])
|
||||||
|
#~ 0x400a64 customFunction3 ([symbol])
|
||||||
|
#~ 0x400a42 customFunction2 ([symbol])
|
||||||
|
#~ 0x400a21 customFunction1 ([symbol])
|
||||||
|
#~ 0x400a75 main ([symbol])
|
||||||
|
#~ 0x7f071c661d85 __libc_start_main ([symbol])
|
||||||
|
#~ 0x40081e _start ([symbol])
|
||||||
|
内核态堆栈:
|
||||||
|
#@ 0xffffffff811730dd hrtimer_nanosleep ([kernel.kallsyms])
|
||||||
|
#@ 0xffffffff811733a6 __x64_sys_nanosleep ([kernel.kallsyms])
|
||||||
|
#@ 0xffffffff819fa117 do_syscall_64 ([kernel.kallsyms])
|
||||||
|
#@ 0xffffffff81c0007c entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
|
||||||
|
#@ 0xffffffff819fa117 do_syscall_64 ([kernel.kallsyms])
|
||||||
|
#@ 0xffffffff81c0007c entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
|
||||||
|
#@ 0xffffffff819fa117 do_syscall_64 ([kernel.kallsyms])
|
||||||
|
#@ 0xffffffff81c0007c entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
|
||||||
|
#* 0xffffffffffffff userstack (UNKNOWN)
|
||||||
|
进程链信息:
|
||||||
|
#^ 0xffffffffffffff ./build/userstack (UNKNOWN)
|
||||||
|
#^ 0xffffffffffffff /bin/bash --init-file /root/.vscode-server-insiders/cli/servers/Insiders-ca9da6c177fc4cf7429e1d0c1c52f710d6d953c6/server/out/vs/workbench/contrib/terminal/browser/media/shellIntegration-bash.sh (UNKNOWN)
|
||||||
|
#^ 0xffffffffffffff /root/.vscode-server-insiders/cli/servers/Insiders-ca9da6c177fc4cf7429e1d0c1c52f710d6d953c6/server/node /root/.vscode-server-insiders/cli/servers/Insiders-ca9da6c177fc4cf7429e1d0c1c52f710d6d953c6/server/out/bootstrap-fork --type=ptyHost --logsPath /root/ (UNKNOWN)
|
||||||
|
#^ 0xffffffffffffff /root/.vscode-server-insiders/cli/servers/Insiders-ca9da6c177fc4cf7429e1d0c1c52f710d6d953c6/server/node /root/.vscode-server-insiders/cli/servers/Insiders-ca9da6c177fc4cf7429e1d0c1c52f710d6d953c6/server/out/server-main.js --connection-token=remotessh --a (UNKNOWN)
|
||||||
|
#^ 0xffffffffffffff sh /root/.vscode-server-insiders/cli/servers/Insiders-ca9da6c177fc4cf7429e1d0c1c52f710d6d953c6/server/bin/code-server-insiders --connection-token=remotessh --accept-server-license-terms --start-server --enable-remote-auto-shutdown --socket-path=/tmp/code (UNKNOWN)
|
||||||
|
#^ 0xffffffffffffff /root/.vscode-server-insiders/code-insiders-ca9da6c177fc4cf7429e1d0c1c52f710d6d953c6 command-shell --cli-data-dir /root/.vscode-server-insiders/cli --on-port --require-token b5a047063eb7 (UNKNOWN)
|
||||||
|
#^ 0xffffffffffffff /usr/lib/systemd/systemd --switched-root --system --deserialize 17 (UNKNOWN)
|
||||||
|
##
|
||||||
```
|
```
|
||||||
|
|
||||||
## demo
|
## demo
|
||||||
|
|
||||||
In the main project directory:
|
In the usercase folder:
|
||||||
|
- `helloworld.c`: tests monitoring a large number of variables
|
||||||
|
- `userstack.c`: tests user-space stack output
|
||||||
|
- `hptest.c`: tests hugePage mounting
|
||||||
|
|
||||||
```bash
|
## Other
|
||||||
make && insmod watch_module.ko
|
|
||||||
./watch
|
|
||||||
```
|
|
||||||
|
|
||||||
The printed stack information can then be seen in `dmesg`.
|
The program is divided into two parts, a character device and a user-space interface, which communicate through ioctl.
|
||||||
|
|
||||||
```bash
|
User-space address access
|
||||||
# Unload module and clean compile files
|
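- For the variable's virtual address passed in by the user program, `get_user_pages_remote` obtains the memory page containing the address, and `kmap` maps it into the kernel.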
|
||||||
rmmod watch_module.ko && make clean
|
- In the 192.168.40.204 environment, the HugeTLB Pages mount test passes.
|
||||||
```
|
- The mapped page address plus the offset is stored in the timer's `kernel_watch_arg`; when polling, the hrTimer reads through `kernel_watch_arg` to get the actual value.
|
||||||
|
|
||||||
Only tested on kernel 5.17.15-1.el8.x86_64.
|
Timer grouping
|
||||||
|
- The hrTimer data structures are stored in the global array `kernel_wtimer_list`. When a timer is allocated, `kernel_wtimer_list` is traversed and the timer intervals are compared.
|
||||||
|
- Watches with the same timer interval are assigned to the same group, which corresponds to one hrTimer.
|
||||||
|
- If the number of variables monitored by one timer exceeds `TIMER_MAX_WATCH_NUM` (32), a new hrTimer is created.
|
||||||
|
- The total number of hrTimers (the length of the `kernel_wtimer_list` array) is limited to `MAX_TIMER_NUM` (128).
|
||||||
|
|
||||||
## Other
|
Memory page mount/unmount
|
||||||
|
- `get_user_pages_remote`/`kmap` increment the corresponding reference counts, so each needs a matching `put_page`/`kunmap`.
|
||||||
|
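- A module-global linked list, `watch_local_memory_list`, stores the page and kt for each successfully mounted variable; when the character device is closed, the list is traversed and every entry is unmounted.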
|
||||||
|
|
||||||
The program is divided into two parts, a character device and a user-space interface, which communicate through ioctl.
|
Adding/removing variable monitors
|
||||||
|
- The `kernel_watch_arg` structure has a pid member, but watches are not separated by process when they are added.
|
||||||
|
- On removal, all monitored variables are traversed and their pid is compared.
|
||||||
|
- A gap left by a removal is filled by moving the last variable into it, then sentinel--; hrTimers are handled the same way.
|
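The removal strategy above (compare pid, move the last element into the gap, decrement the sentinel) can be sketched as follows. This is an illustration only: `struct watch_entry` is a simplified stand-in for `kernel_watch_arg`, and `remove_by_pid` is a hypothetical name, not a function from the module.

```c
#include <sys/types.h>

/* Simplified stand-in for kernel_watch_arg; only the pid matters here.
 * The id field is an assumption added to tell entries apart. */
struct watch_entry {
    pid_t pid;
    int id;
};

/* Remove every entry owned by pid from list[0..sentinel).
 * Each gap is filled with the current last entry, so order is not preserved.
 * Returns the new sentinel (number of remaining entries). */
static int remove_by_pid(struct watch_entry *list, int sentinel, pid_t pid)
{
    int i = 0;
    while (i < sentinel) {
        if (list[i].pid == pid) {
            list[i] = list[sentinel - 1]; /* move last entry into the gap */
            sentinel--;
            /* do not advance i: the moved entry still has to be checked */
        } else {
            i++;
        }
    }
    return sentinel;
}
```

Filling the gap with the last element keeps the array dense in constant time per removal, at the cost of reordering the remaining watches.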
||||||
|
|
||||||
User space address access
|
Stack output conditions: the conditions are taken from [diagnose-tools::load.c](https://github.com/alibaba/diagnose-tools/blob/e285bc4626a7d207eabd4a69cb276e1a3b1b7c76/SOURCE/module/kernel/load.c#L209)
|
||||||
- For the variable's virtual address passed in by the user program, `get_user_pages_remote` obtains the memory page containing the address, and `kmap` maps it into the kernel.
|
- A `TASK` must match one of TASK_RUNNING, `__task_contributes_to_load`, or `TASK_IDLE` (the last may include blocked processes).
|
||||||
- In the 192.168.40.204 environment, the HugeTLB Pages mount test passes.
|
- `__task_contributes_to_load` corresponds to the kernel macro `task_contributes_to_load`.
|
||||||
- The mapped page address plus the offset is stored in the timer's `kernel_watch_arg`; when polling, the hrTimer reads through `kernel_watch_arg` to get the actual value.
|
|
||||||
|
|
||||||
Timer grouping
|
|
||||||
- The hrTimer data structures are stored in the global array `kernel_wtimer_list`. When a timer is allocated, `kernel_wtimer_list` is traversed and the timer intervals are compared.
|
|
||||||
- Watches with the same timing interval are assigned to the same group and correspond to the same hrTimer.
|
|
||||||
- If the number of variables monitored by a timer exceeds `TIMER_MAX_WATCH_NUM` (32), a new hrTimer will be created.
|
|
||||||
- The total number of hrTimers (the length of the `kernel_wtimer_list` array) is limited to `MAX_TIMER_NUM` (128).
|
|
||||||
|
|
||||||
Memory page mount/unmount
|
|
||||||
- `get_user_pages_remote`/ `kmap` will increase the corresponding count and requires the equivalent `put_page`/`kunmap`.
|
|
||||||
- A global linked list in the module `watch_local_memory_list` stores the page and kt corresponding to each successfully mounted variable. When performing the close operation of the character device, it is traversed and unloaded.
|
|
||||||
|
|
||||||
Stack output conditions: The conditions are referenced from [diagnose-tools::load.c](https://github.com/alibaba/diagnose-tools/blob/e285bc4626a7d207eabd4a69cb276e1a3b1b7c76/SOURCE/module/kernel/load.c#L209)
|
|
||||||
- A `TASK` must be in the TASK_RUNNING state or satisfy `__task_contributes_to_load`.
|
|
||||||
- `__task_contributes_to_load` corresponds to the kernel macro `task_contributes_to_load`.
|
|
||||||
|
|
||||||
```c
|
```c
|
||||||
// https://www.spinics.net/lists/kernel/msg3582022.html
|
// https://www.spinics.net/lists/kernel/msg3582022.html
|
||||||
|
|||||||
150
README_en.md
Normal file
@@ -0,0 +1,150 @@
|
|||||||
|
## Variable Monitor
|
||||||
|
|
||||||
|
Monitors numerical variables (given an address and length) and prints system stack information when a configured threshold is exceeded.
|
||||||
|
|
||||||
|
Limits on simultaneous monitoring
|
||||||
|
- Watches with the same timer interval are grouped together and share one timer.
|
||||||
|
- A group holds up to 32 variables; beyond that, a new timer is allocated.
|
||||||
|
- The global maximum number of timers is 128.
|
||||||
|
- These limits are defined as macros in the `watch_module.h` header.
|
||||||
|
|
||||||
|
Monitoring is currently limited to a single application; simultaneous calls from multiple applications are not supported.
|
||||||
|
- Multiple applications can work normally if only one program calls `cancel_all_watch();`.
|
||||||
|
|
||||||
|
## Usage
|
||||||
|
|
||||||
|
Example: helloworld.c
|
||||||
|
- Add `#include "watch.h"`
|
||||||
|
- Set each variable that needs to be monitored: name && address && length, set threshold, comparison method, timer interval (ns), etc.
|
||||||
|
- `start_watch(watch_arg);` Start monitoring
|
||||||
|
- Call `cancel_all_watch();` when you need to cancel monitoring
|
||||||
|
|
||||||
|
When a threshold is exceeded, the system stack information is printed to the kernel log and can be viewed with `dmesg`, as in the following example:
|
||||||
|
- Within one timer, if several variables exceed their thresholds, the stack information is printed only once;
|
||||||
|
- After printing the stack, the timer restarts after 1 s, and the next round of monitoring begins then.
|
||||||
|
|
||||||
|
```log
|
||||||
|
[86245.364861] -------------------------------------
|
||||||
|
[86245.364864] -------------watch monitor-----------
|
||||||
|
[86245.364865] Threshold reached:
|
||||||
|
name: temp0, threshold: 150
|
||||||
|
[86245.364866] Timestamp (ns): 1699589000606300743
|
||||||
|
[86245.364867] Recent Load: 116.65, 126.83, 151.17
|
||||||
|
[86245.365669] task: name lcore-worker-4, pid 803327
|
||||||
|
[86245.365672] task: name lcore-worker-5, pid 803328
|
||||||
|
[86245.365673] task: name lcore-worker-6, pid 803329
|
||||||
|
[86245.365674] task: name lcore-worker-7, pid 803330
|
||||||
|
[86245.365676] task: name lcore-worker-8, pid 803331
|
||||||
|
[86245.365677] task: name lcore-worker-9, pid 803332
|
||||||
|
[86245.365679] task: name lcore-worker-10, pid 803333
|
||||||
|
[86245.365681] task: name lcore-worker-11, pid 803334
|
||||||
|
[86245.365682] task: name lcore-worker-68, pid 803335
|
||||||
|
[86245.365683] task: name lcore-worker-69, pid 803336
|
||||||
|
[86245.365684] task: name lcore-worker-70, pid 803337
|
||||||
|
[86245.365685] task: name lcore-worker-71, pid 803338
|
||||||
|
[86245.365686] task: name lcore-worker-72, pid 803339
|
||||||
|
[86245.365687] task: name lcore-worker-73, pid 803340
|
||||||
|
[86245.365688] task: name lcore-worker-74, pid 803341
|
||||||
|
[86245.365689] task: name lcore-worker-75, pid 803342
|
||||||
|
[86245.365694] task: name pkt:worker-0, pid 803638
|
||||||
|
[86245.365702] hrtimer_nanosleep+0x8d/0x120
|
||||||
|
[86245.365709] __x64_sys_nanosleep+0x96/0xd0
|
||||||
|
[86245.365711] do_syscall_64+0x37/0x80
|
||||||
|
[86245.365716] entry_SYSCALL_64_after_hwframe+0x44/0xae
|
||||||
|
[86245.365718] task: name pkt:worker-1, pid 803639
|
||||||
|
[86245.365721] hrtimer_nanosleep+0x8d/0x120
|
||||||
|
[86245.365724] __x64_sys_nanosleep+0x96/0xd0
|
||||||
|
[86245.365726] do_syscall_64+0x37/0x80
|
||||||
|
[86245.365728] entry_SYSCALL_64_after_hwframe+0x44/0xae
|
||||||
|
[86245.365730] task: name pkt:worker-2, pid 803640
|
||||||
|
[86245.365732] hrtimer_nanosleep+0x8d/0x120
|
||||||
|
[86245.365734] __x64_sys_nanosleep+0x96/0xd0
|
||||||
|
[86245.365737] do_syscall_64+0x37/0x80
|
||||||
|
[86245.365739] entry_SYSCALL_64_after_hwframe+0x44/0xae
|
||||||
|
[86245.365740] task: name pkt:worker-3, pid 803641
|
||||||
|
[86245.365743] hrtimer_nanosleep+0x8d/0x120
|
||||||
|
```
|
||||||
|
|
||||||
|
### Parameter Description
|
||||||
|
|
||||||
|
`start_watch` takes a `watch_arg` structure. The fields are as follows:
|
||||||
|
- `name` is limited to `MAX_NAME_LEN` (15) valid characters
|
||||||
|
|
||||||
|
```c
|
||||||
|
typedef struct
|
||||||
|
{
|
||||||
|
pid_t task_id; // current process id
|
||||||
|
char name[MAX_NAME_LEN + 1]; // name (15+1)
|
||||||
|
void *ptr; // virtual address
|
||||||
|
int length_byte; // byte
|
||||||
|
long long threshold; // threshold value
|
||||||
|
unsigned char unsigned_flag; // unsigned flag (true: unsigned, false: signed)
|
||||||
|
unsigned char greater_flag; // reverse flag (true: >, false: <)
|
||||||
|
unsigned long time_ns; // timer interval (ns)
|
||||||
|
} watch_arg;
|
||||||
|
```
|
||||||
|
|
||||||
|
An initialization example
|
||||||
|
|
||||||
|
```c
|
||||||
|
watch_args = (watch_arg){
|
||||||
|
.task_id = getpid(),
|
||||||
|
.ptr = &temp,
|
||||||
|
.name = "temp",
|
||||||
|
.length_byte = sizeof(int),
|
||||||
|
.threshold = 150 + i,
|
||||||
|
.unsigned_flag = 0,
|
||||||
|
.greater_flag = 1,
|
||||||
|
.time_ns = 2000 + (i / 33) * 5000
|
||||||
|
};
|
||||||
|
```
|
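How `length_byte`, `unsigned_flag`, `greater_flag`, and `threshold` might combine into one check can be sketched in plain user-space C. This is an assumption about the semantics described by the struct comments, not the module's actual code; `load_value` and `threshold_exceeded` are hypothetical names.

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: load a 1/2/4/8-byte value from ptr and widen it to
 * long long, honouring unsigned_flag. Unsigned 8-byte values above LLONG_MAX
 * are clamped to LLONG_MAX. Returns 0 on success, -1 on unsupported length. */
static int load_value(const void *ptr, int length_byte,
                      unsigned char unsigned_flag, long long *out)
{
    switch (length_byte) {
    case 1:
        if (unsigned_flag) { uint8_t v; memcpy(&v, ptr, sizeof v); *out = v; }
        else               { int8_t v;  memcpy(&v, ptr, sizeof v); *out = v; }
        return 0;
    case 2:
        if (unsigned_flag) { uint16_t v; memcpy(&v, ptr, sizeof v); *out = v; }
        else               { int16_t v;  memcpy(&v, ptr, sizeof v); *out = v; }
        return 0;
    case 4:
        if (unsigned_flag) { uint32_t v; memcpy(&v, ptr, sizeof v); *out = v; }
        else               { int32_t v;  memcpy(&v, ptr, sizeof v); *out = v; }
        return 0;
    case 8:
        if (unsigned_flag) {
            uint64_t v;
            memcpy(&v, ptr, sizeof v);
            *out = v > (uint64_t)LLONG_MAX ? LLONG_MAX : (long long)v;
        } else {
            int64_t v;
            memcpy(&v, ptr, sizeof v);
            *out = v;
        }
        return 0;
    default:
        return -1;
    }
}

/* Returns 1 when the watch condition holds, 0 when it does not,
 * and -1 when length_byte is not supported by this sketch. */
static int threshold_exceeded(const void *ptr, int length_byte,
                              long long threshold,
                              unsigned char unsigned_flag,
                              unsigned char greater_flag)
{
    long long value;
    if (load_value(ptr, length_byte, unsigned_flag, &value) != 0)
        return -1;
    return greater_flag ? (value > threshold) : (value < threshold);
}
```

With the initialization example above (`sizeof(int)`, signed, `greater_flag = 1`), the check would fire once `temp` rises above the threshold.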
||||||
|
|
||||||
|
## demo
|
||||||
|
|
||||||
|
In the main project directory:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
make && insmod watch_module.ko
|
||||||
|
./watch
|
||||||
|
```
|
||||||
|
|
||||||
|
The printed stack information can then be seen in `dmesg`.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Unload module and clean compile files
|
||||||
|
rmmod watch_module.ko && make clean
|
||||||
|
```
|
||||||
|
|
||||||
|
Only tested on kernel 5.17.15-1.el8.x86_64.
|
||||||
|
|
||||||
|
## Other
|
||||||
|
|
||||||
|
The program is divided into two parts, a character device and a user-space interface, which communicate through ioctl.
|
||||||
|
|
||||||
|
User space address access
|
||||||
|
- For the variable's virtual address passed in by the user program, `get_user_pages_remote` obtains the memory page containing the address, and `kmap` maps it into the kernel.
|
||||||
|
- In the 192.168.40.204 environment, the HugeTLB Pages mount test passes.
|
||||||
|
- The mapped page address plus the offset is stored in the timer's `kernel_watch_arg`; when polling, the hrTimer reads through `kernel_watch_arg` to get the actual value.
|
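The "page address + offset" arithmetic can be illustrated in user space. The sketch below assumes 4 KiB pages and a variable that does not straddle a page boundary; the helper names and `SKETCH_PAGE_SIZE` are hypothetical, and in the module the base address would come from `kmap` rather than from a literal.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumption: 4 KiB pages */

/* Offset of a user virtual address inside its page. */
static uintptr_t page_offset_of(uintptr_t user_addr)
{
    return user_addr & (SKETCH_PAGE_SIZE - 1);
}

/* Start address of the page that contains user_addr. */
static uintptr_t page_base_of(uintptr_t user_addr)
{
    return user_addr & ~(uintptr_t)(SKETCH_PAGE_SIZE - 1);
}

/* Address the poller would read: mapped page base plus in-page offset. */
static uintptr_t mapped_address(uintptr_t mapped_base, uintptr_t user_addr)
{
    return mapped_base + page_offset_of(user_addr);
}

/* 1 if a value of len bytes at user_addr stays within one page, else 0. */
static int fits_in_page(uintptr_t user_addr, unsigned long len)
{
    return len != 0 && page_offset_of(user_addr) + len <= SKETCH_PAGE_SIZE;
}
```

A value that crosses a page boundary would need both pages pinned and mapped, which this sketch does not attempt.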
||||||
|
|
||||||
|
Timer grouping
|
||||||
|
- The hrTimer data structures are stored in the global array `kernel_wtimer_list`. When a timer is allocated, `kernel_wtimer_list` is traversed and the timer intervals are compared.
|
||||||
|
- Watches with the same timing interval are assigned to the same group and correspond to the same hrTimer.
|
||||||
|
- If the number of variables monitored by a timer exceeds `TIMER_MAX_WATCH_NUM` (32), a new hrTimer will be created.
|
||||||
|
- The total number of hrTimers (the length of the `kernel_wtimer_list` array) is limited to `MAX_TIMER_NUM` (128).
|
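The grouping rules above can be sketched as a small allocation routine. `TIMER_MAX_WATCH_NUM` and `MAX_TIMER_NUM` are the limits named in the text; the struct layout and the function `assign_timer` are assumptions made for illustration and do not reproduce the module's `kernel_wtimer_list` code.

```c
#define TIMER_MAX_WATCH_NUM 32 /* per-timer limit quoted in the text */
#define MAX_TIMER_NUM 128      /* global timer limit quoted in the text */

/* Simplified stand-in for one kernel_wtimer_list entry (fields assumed). */
struct timer_slot {
    unsigned long time_ns; /* interval shared by every watch in the group */
    int watch_count;       /* watches currently attached to this timer */
};

struct timer_table {
    struct timer_slot slots[MAX_TIMER_NUM];
    int timer_count;
};

/* Pick the timer for a new watch with the given interval.
 * Reuses a timer with the same interval that still has room; otherwise
 * allocates a new one. Returns the timer index, or -1 if all timers are used. */
static int assign_timer(struct timer_table *t, unsigned long time_ns)
{
    for (int i = 0; i < t->timer_count; i++) {
        struct timer_slot *s = &t->slots[i];
        if (s->time_ns == time_ns && s->watch_count < TIMER_MAX_WATCH_NUM) {
            s->watch_count++;
            return i;
        }
    }
    if (t->timer_count >= MAX_TIMER_NUM)
        return -1;
    t->slots[t->timer_count].time_ns = time_ns;
    t->slots[t->timer_count].watch_count = 1;
    return t->timer_count++;
}
```

In this sketch, the 33rd watch with the same interval lands on a second timer, which matches the behaviour described for exceeding `TIMER_MAX_WATCH_NUM`.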
||||||
|
|
||||||
|
Memory page mount/unmount
|
||||||
|
- `get_user_pages_remote`/`kmap` increment the corresponding reference counts, so each needs a matching `put_page`/`kunmap`.
|
||||||
|
- A module-global linked list, `watch_local_memory_list`, stores the page and kt for each successfully mounted variable; when the character device is closed, the list is traversed and every entry is unmounted.
|
||||||
|
|
||||||
|
Stack output conditions: the conditions are taken from [diagnose-tools::load.c](https://github.com/alibaba/diagnose-tools/blob/e285bc4626a7d207eabd4a69cb276e1a3b1b7c76/SOURCE/module/kernel/load.c#L209)
|
||||||
|
- A `TASK` must be in the TASK_RUNNING state or satisfy `__task_contributes_to_load`.
|
||||||
|
- `__task_contributes_to_load` corresponds to the kernel macro `task_contributes_to_load`.
|
||||||
|
|
||||||
|
```c
|
||||||
|
// https://www.spinics.net/lists/kernel/msg3582022.html
|
||||||
|
// removed in 5.8-rc3, but it still works
|
||||||
|
// whether the task contributes to the load
|
||||||
|
#define __task_contributes_to_load(task) \
|
||||||
|
((READ_ONCE(task->__state) & TASK_UNINTERRUPTIBLE) != 0 && (task->flags & PF_FROZEN) == 0 && \
|
||||||
|
(READ_ONCE(task->__state) & TASK_NOLOAD) == 0)
|
||||||
|
```
|
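The bit test in the macro can be tried outside the kernel. In the sketch below the constants are copied by hand to mirror Linux 5.x headers and should be treated as assumptions to check against the target kernel; `struct fake_task`, `contributes_to_load`, and `should_dump` are hypothetical stand-ins, and `should_dump` assumes the selection rule is "running, or contributing to load".

```c
/* Values mirrored from Linux 5.x headers for illustration only (assumption). */
#define TASK_RUNNING         0x0000
#define TASK_UNINTERRUPTIBLE 0x0002
#define TASK_NOLOAD          0x0400
#define TASK_IDLE            (TASK_UNINTERRUPTIBLE | TASK_NOLOAD)
#define PF_FROZEN            0x00010000

/* Minimal stand-in for the two task_struct fields the macro reads. */
struct fake_task {
    unsigned int state;
    unsigned int flags;
};

/* Same logic as __task_contributes_to_load: uninterruptible sleep,
 * not frozen, and not marked TASK_NOLOAD. */
static int contributes_to_load(const struct fake_task *t)
{
    return (t->state & TASK_UNINTERRUPTIBLE) != 0 &&
           (t->flags & PF_FROZEN) == 0 &&
           (t->state & TASK_NOLOAD) == 0;
}

/* Assumed selection rule: dump tasks that are running or contribute to load. */
static int should_dump(const struct fake_task *t)
{
    return t->state == TASK_RUNNING || contributes_to_load(t);
}
```

Because `TASK_IDLE` includes `TASK_NOLOAD`, an idle kernel thread fails the load test even though it sleeps uninterruptibly.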
||||||
250
README_zh.md
@@ -1,250 +0,0 @@
|
|||||||
## Variable Monitor
|
|
||||||
|
|
||||||
changelog
|
|
||||||
|
|
||||||
```log
|
|
||||||
11.9 Support for monitoring multiple variables
|
|
||||||
11.10 Kernel structures are separated by pid, so each process can register and cancel its own monitoring.
|
|
||||||
11.13 User interface cancel_all_watch -> cancel_watch; processes no longer interfere with each other.
|
|
||||||
11.28 Complete refactor; documentation updated.
|
|
||||||
```
|
|
||||||
|
|
||||||
## Description
|
|
||||||
|
|
||||||
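Monitors numerical variables (given an address and length) and, when a configured condition is met, prints information on tasks in the system (user-space stack / kernel stack / call chain).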
|
|
||||||
- Multiple processes are supported; when a process exits, all of its monitoring is cancelled.
|
|
||||||
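- Watches with the same timer interval are assigned to the same timer; one timer monitors at most 32 variables, and there are at most 128 timers globally.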
|
|
||||||
- These limits are defined in `source/module/monitor_timer.h`.
|
|
||||||
- `testcase/helloworld.c` has been tested with 2049 variables in a single process.
|
|
||||||
|
|
||||||
File structure
|
|
||||||
|
|
||||||
```log
|
|
||||||
├── build // output
|
|
||||||
├── source // all source code
|
|
||||||
│ ├── buffer // buffer for communication between the module and user space
|
|
||||||
│ ├── module // kernel module code
|
|
||||||
│ ├── uapi // user-space interface
|
|
||||||
│ ├── ucli // user-space command-line tool
|
|
||||||
│ └── ucli_py // user-space command-line tool in Python (for testing only, unfinished)
|
|
||||||
│ └── libunwind // library ported for parsing stack information from Python
|
|
||||||
├── testcase // test cases
|
|
||||||
└── tools // test tools
|
|
||||||
```
|
|
||||||
|
|
||||||
## Usage
|
|
||||||
|
|
||||||
There are two ways to set up variable monitoring: use the macros, or define a `watch_arg` structure.
|
|
||||||
- Both require the header under `source/uapi`: `#include "monitor_user.h"`
|
|
||||||
|
|
||||||
Call `cancel_watch();` to cancel monitoring; variable_monitor then cancels all monitoring for the calling process.
|
|
||||||
- The same cleanup runs when the process exits, cancelling all of its monitoring.
|
|
||||||
- Calling `cancel_watch();` is therefore optional, but it is still recommended to avoid possible memory leaks.
|
|
||||||
|
|
||||||
Collecting task information is time-consuming, so it is handled in a workqueue; after each dump, the timer restarts after an interval that defaults to 5 s.
|
|
||||||
- This value can be viewed and modified via `/proc/variable_monitor/dump_reset_sec`.
|
|
||||||
|
|
||||||
### Loading the driver
|
|
||||||
|
|
||||||
From the project root directory:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Build and load the module
|
|
||||||
make && insmod source/variable_monitor.ko
|
|
||||||
# Unload the module and clean build files
|
|
||||||
# rmmod source/variable_monitor.ko && make clean
|
|
||||||
# Only tested on `kernel 5.17.15-1.el8.x86_64`; other kernel versions are untested.
|
|
||||||
```
|
|
||||||
|
|
||||||
### Macros
|
|
||||||
|
|
||||||
See `testcase/helloworld.c` for an example. Macros are provided for common numeric types for convenience:
|
|
||||||
- For other types, see `source/uapi/monitor_user_sw.h`
|
|
||||||
```c
|
|
||||||
// arguments: variable name | address | threshold
|
|
||||||
START_WATCH_INT("temp", &temp, 150);
|
|
||||||
START_WATCH_INT_LESS("temp", &temp, 150);
|
|
||||||
```
|
|
||||||
|
|
||||||
By default, watches created with the macros use a timer interval of 10 us; this value can be viewed and modified via `/proc/variable_monitor/def_interval_ns`.
|
|
||||||
|
|
||||||
### watch_arg structure
|
|
||||||
|
|
||||||
For more control over the timer interval and other settings, define a `watch_arg` structure and start monitoring with `start_watch`:
|
|
||||||
- For each variable to monitor, set the name, address, and length, plus the threshold, comparison method, timer interval (ns), and so on.
|
|
||||||
- `start_watch(watch_arg);` starts monitoring
|
|
||||||
- Call `cancel_watch();` when monitoring should be cancelled
|
|
||||||
|
|
||||||
```c
|
|
||||||
// start_watch takes a watch_arg structure. The fields are described below
|
|
||||||
// - name is limited to `MAX_NAME_LEN` (15) valid characters
|
|
||||||
typedef struct
|
|
||||||
{
|
|
||||||
pid_t task_id; // current process id
|
|
||||||
char name[MAX_NAME_LEN + 1]; // name (15+1)
|
|
||||||
void *ptr; // virtual address
|
|
||||||
int length_byte; // byte
|
|
||||||
long long threshold; // threshold value
|
|
||||||
unsigned char unsigned_flag; // unsigned flag (true: unsigned, false: signed)
|
|
||||||
unsigned char greater_flag; // reverse flag (true: >, false: <)
|
|
||||||
unsigned long time_ns; // timer interval (ns)
|
|
||||||
} watch_arg;
|
|
||||||
|
|
||||||
// An initialization example
|
|
||||||
watch_args = (watch_arg){
|
|
||||||
.task_id = getpid(),
|
|
||||||
.ptr = &temp,
|
|
||||||
.name = "temp",
|
|
||||||
.length_byte = sizeof(int),
|
|
||||||
.threshold = 150,
|
|
||||||
.unsigned_flag = 0,
|
|
||||||
.greater_flag = 1,
|
|
||||||
.time_ns = 2000 + 5000
|
|
||||||
};
|
|
||||||
start_watch(watch_args);
|
|
||||||
```
|
|
||||||
|
|
||||||
### Output
|
|
||||||
|
|
||||||
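The timer polls the variables at the configured interval. When a condition is met, information on the matching tasks in the system is collected (user-space stack / kernel stack / call chain).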
|
|
||||||
- `dmesg` shows which variables met their conditions;
|
|
||||||
- Task information is written to a buffer and can be viewed with the ucli tool.
|
|
||||||
|
|
||||||
Example `dmesg` output:
|
|
||||||
|
|
||||||
```log
[42865.640988] -------------------------------------
[42865.640992] -----------variable monitor----------
[42865.640993] 超出阈值:1701141698684973655
[42865.640994] : pid: 63936, name: temp0, ptr: 00000000bade6e61, threshold:110
[42865.648068] -------------------------------------
[42875.640703] -------------------------------------
[42875.640706] -----------variable monitor----------
[42875.640706] 超出阈值:1701141708684881779
[42875.640708] : pid: 63936, name: temp0, ptr: 00000000bade6e61, threshold:110
[42875.640710] : pid: 63936, name: temp1, ptr: 00000000ee645b96, threshold:111
[42875.640711] : pid: 63936, name: temp2, ptr: 00000000f62b7afe, threshold:112
[42875.640711] : pid: 63936, name: temp3, ptr: 00000000d100fa3c, threshold:113
[42875.640712] : pid: 63936, name: temp4, ptr: 000000006d31cae1, threshold:114
[42875.640712] : pid: 63936, name: temp5, ptr: 00000000723c7a2a, threshold:115
[42875.640713] : pid: 63936, name: temp6, ptr: 0000000026ef6e83, threshold:116
[42875.640714] : pid: 63936, name: temp7, ptr: 00000000fc1e5d5e, threshold:117
[42875.640714] : pid: 63936, name: temp8, ptr: 0000000069b2666e, threshold:118
[42875.640715] : pid: 63936, name: temp9, ptr: 000000000176263d, threshold:119
[42875.648023] -------------------------------------
```
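
By default, the compiled `ucli` binary is placed in the `build` folder.

`ucli > output`
- `ucli` parses the buffer contents and writes them to the `output` file.
- **This operation clears the buffer.**

Example output of the `ucli` tool (see `output_example` for details). In the log, `采样命中` means "sample hit", `进程信息` "process info", `用户态堆栈` "user-mode stack", `内核态堆栈` "kernel-mode stack", and `进程链信息` "process chain info".
- `userstack` is the stack-information test program under `testcase`.
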
```log
##CGROUP:[/] 51666 [510] 采样命中[D]
进程信息: [/ / userstack], PID: 51666 / 51666
##C++ pid 51666
用户态堆栈SP:7ffcd5822298, BP:2, IP:7f071c720838
#~ 0x7f071c720838 __GI___nanosleep ([symbol])
#~ 0x7f071c72076e __sleep ([symbol])
#~ 0x400a08 customFunction1 ([symbol])
#~ 0x400a64 customFunction3 ([symbol])
#~ 0x400a42 customFunction2 ([symbol])
#~ 0x400a21 customFunction1 ([symbol])
#~ 0x400a64 customFunction3 ([symbol])
#~ 0x400a42 customFunction2 ([symbol])
#~ 0x400a21 customFunction1 ([symbol])
#~ 0x400a64 customFunction3 ([symbol])
#~ 0x400a42 customFunction2 ([symbol])
#~ 0x400a21 customFunction1 ([symbol])
#~ 0x400a64 customFunction3 ([symbol])
#~ 0x400a42 customFunction2 ([symbol])
#~ 0x400a21 customFunction1 ([symbol])
#~ 0x400a64 customFunction3 ([symbol])
#~ 0x400a42 customFunction2 ([symbol])
#~ 0x400a21 customFunction1 ([symbol])
#~ 0x400a64 customFunction3 ([symbol])
#~ 0x400a42 customFunction2 ([symbol])
#~ 0x400a21 customFunction1 ([symbol])
#~ 0x400a64 customFunction3 ([symbol])
#~ 0x400a42 customFunction2 ([symbol])
#~ 0x400a21 customFunction1 ([symbol])
#~ 0x400a64 customFunction3 ([symbol])
#~ 0x400a42 customFunction2 ([symbol])
#~ 0x400a21 customFunction1 ([symbol])
#~ 0x400a64 customFunction3 ([symbol])
#~ 0x400a42 customFunction2 ([symbol])
#~ 0x400a21 customFunction1 ([symbol])
#~ 0x400a64 customFunction3 ([symbol])
#~ 0x400a42 customFunction2 ([symbol])
#~ 0x400a21 customFunction1 ([symbol])
#~ 0x400a75 main ([symbol])
#~ 0x7f071c661d85 __libc_start_main ([symbol])
#~ 0x40081e _start ([symbol])
内核态堆栈:
#@ 0xffffffff811730dd hrtimer_nanosleep ([kernel.kallsyms])
#@ 0xffffffff811733a6 __x64_sys_nanosleep ([kernel.kallsyms])
#@ 0xffffffff819fa117 do_syscall_64 ([kernel.kallsyms])
#@ 0xffffffff81c0007c entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
#@ 0xffffffff819fa117 do_syscall_64 ([kernel.kallsyms])
#@ 0xffffffff81c0007c entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
#@ 0xffffffff819fa117 do_syscall_64 ([kernel.kallsyms])
#@ 0xffffffff81c0007c entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
#* 0xffffffffffffff userstack (UNKNOWN)
进程链信息:
#^ 0xffffffffffffff ./build/userstack (UNKNOWN)
#^ 0xffffffffffffff /bin/bash --init-file /root/.vscode-server-insiders/cli/servers/Insiders-ca9da6c177fc4cf7429e1d0c1c52f710d6d953c6/server/out/vs/workbench/contrib/terminal/browser/media/shellIntegration-bash.sh (UNKNOWN)
#^ 0xffffffffffffff /root/.vscode-server-insiders/cli/servers/Insiders-ca9da6c177fc4cf7429e1d0c1c52f710d6d953c6/server/node /root/.vscode-server-insiders/cli/servers/Insiders-ca9da6c177fc4cf7429e1d0c1c52f710d6d953c6/server/out/bootstrap-fork --type=ptyHost --logsPath /root/ (UNKNOWN)
#^ 0xffffffffffffff /root/.vscode-server-insiders/cli/servers/Insiders-ca9da6c177fc4cf7429e1d0c1c52f710d6d953c6/server/node /root/.vscode-server-insiders/cli/servers/Insiders-ca9da6c177fc4cf7429e1d0c1c52f710d6d953c6/server/out/server-main.js --connection-token=remotessh --a (UNKNOWN)
#^ 0xffffffffffffff sh /root/.vscode-server-insiders/cli/servers/Insiders-ca9da6c177fc4cf7429e1d0c1c52f710d6d953c6/server/bin/code-server-insiders --connection-token=remotessh --accept-server-license-terms --start-server --enable-remote-auto-shutdown --socket-path=/tmp/code (UNKNOWN)
#^ 0xffffffffffffff /root/.vscode-server-insiders/code-insiders-ca9da6c177fc4cf7429e1d0c1c52f710d6d953c6 command-shell --cli-data-dir /root/.vscode-server-insiders/cli --on-port --require-token b5a047063eb7 (UNKNOWN)
#^ 0xffffffffffffff /usr/lib/systemd/systemd --switched-root --system --deserialize 17 (UNKNOWN)
##
```

## Demo

In the `usercase` folder:
- `helloworld.c`: tests monitoring a large number of variables
- `userstack.c`: tests user-mode stack output
- `hptest.c`: tests hugePage mounting
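
## Other

The program has two parts, a character device and a user-space interface, which communicate through ioctl.

User-space address access
- For the virtual address of a variable passed in by the user program, `get_user_pages_remote` obtains the memory page that contains the address, and `kmap` maps that page into the kernel.
- HugeTLB Pages were tested in the 192.168.40.204 environment and mount correctly.
- The page address plus the offset is stored in the `kernel_watch_arg` that belongs to the timer; when the hrTimer polls, it reads through `kernel_watch_arg` to get the actual value.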
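
The "page address + offset" arithmetic can be illustrated in user space. This sketch does not call `get_user_pages_remote` or `kmap`; it only shows how a virtual address splits into a page base and an in-page offset, and how the offset is reapplied to a mapped page. `SKETCH_PAGE_SIZE`, `struct page_loc`, `split_address`, and `mapped_address` are hypothetical names, and the 4 KiB page size is an assumption (huge pages use a larger size).

```c
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096UL

/* Where a virtual address lives: start of its page and offset inside it. */
struct page_loc {
    uintptr_t page_base;
    unsigned long offset;
};

/* Split a virtual address into page base and in-page offset. */
static struct page_loc split_address(uintptr_t vaddr)
{
    struct page_loc loc;
    loc.offset = (unsigned long)(vaddr & (SKETCH_PAGE_SIZE - 1));
    loc.page_base = vaddr - loc.offset;
    return loc;
}

/*
 * Given the address at which the page was mapped (in the module this would
 * be the kmap result), return the address at which the variable can be read.
 */
static const void *mapped_address(const void *mapped_page, unsigned long offset)
{
    return (const unsigned char *)mapped_page + offset;
}
```

For example, with 4 KiB pages the address `0x7ffd1234` splits into page base `0x7ffd1000` and offset `0x234`; the same offset is then added to the kernel-side mapping of that page.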
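
Timer grouping
- The hrTimer data structures are stored in the global array `kernel_wtimer_list`. When a timer is allocated, `kernel_wtimer_list` is traversed and the timer intervals are compared.
- Watches with the same timer interval are assigned to the same group, which corresponds to a single hrTimer.
- If the number of variables monitored by one timer would exceed `TIMER_MAX_WATCH_NUM` (32), a new hrTimer is created.
- The total number of hrTimers (the length of the `kernel_wtimer_list` array) is limited to `MAX_TIMER_NUM` (128).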
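
The grouping rule above can be sketched as a small user-space simulation. Only `TIMER_MAX_WATCH_NUM` and `MAX_TIMER_NUM` come from the text; `struct sketch_timer`, `timers`, `timer_count`, and `assign_timer` are hypothetical, and the real module also has to set up the hrTimer itself, which is omitted here.

```c
#define TIMER_MAX_WATCH_NUM 32  /* max variables per timer */
#define MAX_TIMER_NUM 128       /* max timers overall */

/* Hypothetical stand-in for one entry of kernel_wtimer_list. */
struct sketch_timer {
    unsigned long long time_ns; /* polling interval of this group */
    int watch_count;            /* number of variables in this group */
};

static struct sketch_timer timers[MAX_TIMER_NUM];
static int timer_count;

/*
 * Assign a new watch with the given interval to a timer.
 * Returns the timer index, or -1 when no timer slot is left.
 */
static int assign_timer(unsigned long long time_ns)
{
    /* Reuse a timer with the same interval that still has room. */
    for (int i = 0; i < timer_count; i++) {
        if (timers[i].time_ns == time_ns &&
            timers[i].watch_count < TIMER_MAX_WATCH_NUM) {
            timers[i].watch_count++;
            return i;
        }
    }
    /* Otherwise open a new timer, if the global limit allows it. */
    if (timer_count >= MAX_TIMER_NUM)
        return -1;
    timers[timer_count].time_ns = time_ns;
    timers[timer_count].watch_count = 1;
    return timer_count++;
}
```

In this sketch, the first 32 watches with the same interval share timer 0, the 33rd opens timer 1, and a watch with a different interval always gets its own group.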
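
Memory page mount/unmount
- `get_user_pages_remote` and `kmap` increase the corresponding reference counts, so each needs a matching `put_page` / `kunmap`.
- A module-wide global linked list, `watch_local_memory_list`, stores the page and kt of every successfully mounted variable. When the character device's close operation runs, the list is traversed and every entry is unmounted.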
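
Adding/removing a variable monitor
- The `kernel_watch_arg` data structure has a pid member, but processes are not distinguished when a variable monitor is added.
- On removal, all monitored variables are traversed and their pid is compared.
- The gap left by a removal is filled by moving the last variable into it, followed by `sentinel--`; hrTimers are handled the same way.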
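
The "move the last element into the gap" removal can be sketched as below. `struct sketch_watch` and `remove_by_pid` are hypothetical names used only for illustration; `sentinel` is taken to be the number of slots in use, which is an assumption about the module's bookkeeping.

```c
/* Hypothetical stand-in for a monitored variable entry. */
struct sketch_watch {
    int pid;          /* owner process */
    const char *name; /* variable name */
};

/*
 * Remove every entry that belongs to pid from w[0..sentinel-1].
 * Each gap is filled with the current last entry, so the array stays
 * dense but the order of the remaining entries is not preserved.
 * Returns the new sentinel (number of entries still in use).
 */
static int remove_by_pid(struct sketch_watch *w, int sentinel, int pid)
{
    int i = 0;
    while (i < sentinel) {
        if (w[i].pid == pid) {
            w[i] = w[sentinel - 1]; /* move the last entry into the gap */
            sentinel--;
            /* i is not advanced: the moved entry must be checked too */
        } else {
            i++;
        }
    }
    return sentinel;
}
```

This keeps removal O(n) without shifting elements. The detail worth noting is that the index is not advanced after a move, because the entry pulled in from the end may belong to the same pid.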

Stack output conditions: the conditions are taken from [diagnose-tools::load.c](https://github.com/alibaba/diagnose-tools/blob/e285bc4626a7d207eabd4a69cb276e1a3b1b7c76/SOURCE/module/kernel/load.c#L209)
- Stacks are output for tasks in `TASK_RUNNING`, tasks for which `__task_contributes_to_load` holds, and tasks in `TASK_IDLE` (these may include blocked processes).
- `__task_contributes_to_load` corresponds to the kernel macro `task_contributes_to_load`.

```c
// https://www.spinics.net/lists/kernel/msg3582022.html
// Removed from the kernel in 5.8-rc3, but the definition still works.
// Whether the task contributes to the load.
#define __task_contributes_to_load(task)                          \
    ((READ_ONCE((task)->__state) & TASK_UNINTERRUPTIBLE) != 0 && \
     ((task)->flags & PF_FROZEN) == 0 &&                          \
     (READ_ONCE((task)->__state) & TASK_NOLOAD) == 0)
```
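
To see why `TASK_IDLE` is listed separately, the conditions can be modelled in user space. The constant values below are copied from the kernel's `include/linux/sched.h` as found in 5.x kernels and are meant as illustration only; `struct sketch_task`, `contributes_to_load`, and `should_dump` are hypothetical names, and combining the three conditions with a logical OR is an assumption about how the module applies them.

```c
/* Illustrative copies of kernel constants (5.x kernels); verify against
 * the headers of the kernel actually in use. */
#define TASK_RUNNING         0x0000
#define TASK_UNINTERRUPTIBLE 0x0002
#define TASK_NOLOAD          0x0400
#define TASK_IDLE            (TASK_UNINTERRUPTIBLE | TASK_NOLOAD)
#define PF_FROZEN            0x00010000

/* Hypothetical stand-in for the two task_struct fields the check reads. */
struct sketch_task {
    unsigned int state;
    unsigned int flags;
};

/* Same logic as __task_contributes_to_load, without READ_ONCE. */
static int contributes_to_load(const struct sketch_task *t)
{
    return (t->state & TASK_UNINTERRUPTIBLE) != 0 &&
           (t->flags & PF_FROZEN) == 0 &&
           (t->state & TASK_NOLOAD) == 0;
}

/* Assumed combination: any one of the three conditions selects the task. */
static int should_dump(const struct sketch_task *t)
{
    return t->state == TASK_RUNNING ||
           contributes_to_load(t) ||
           t->state == TASK_IDLE;
}
```

A task in `TASK_IDLE` has `TASK_NOLOAD` set, so `contributes_to_load` is false for it; without the explicit `TASK_IDLE` check such tasks would be skipped. An interruptibly sleeping task (state `0x0001`) matches none of the conditions in this model.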