zhangyang-variable-monitor/README_zh.md
2023-11-16 13:17:49 +08:00

Variable Monitor

changelog

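11.9  Added support for monitoring multiple variables.
11.10 Kernel structures are distinguished by pid; each process can register and cancel its own watches independently.
11.13 User interface renamed cancel_all_watch -> cancel_watch; processes no longer interfere with one another.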

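Description

Monitors numeric variables (given an address and a length) and prints system stack information when a configured condition is exceeded.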

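Number of simultaneous watches

  • Watches with the same timer interval are grouped together and share one timer.
  • A group holds at most 32 variables; beyond that, a new timer is allocated.
  • At most 128 timers exist globally.
  • These limits are defined as macros at the top of watch_module.h.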

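Usage

See helloworld.c for an example.

  • Add #include "watch.h".
  • For each variable to monitor, set its name, address, and length, along with the threshold, comparison direction, timer interval (ns), and so on.
  • Call start_watch(watch_arg); to start monitoring.
  • Call cancel_watch(); when monitoring is no longer needed.

When a configured condition is exceeded, system stack information is printed. View it with dmesg, as in the sample output below:

  • Within one timer, if several variables exceed their thresholds, the stack information is printed only once.
  • After a stack dump is printed, the timer restarts after 1 s, and the next monitoring round begins then.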
[  713.225894] -------------------------------------
[  713.225900] -------------watch monitor-----------
[  713.225900] Threshold reached:
[  713.225901]  name: temp0, threshold: 150, pid: 4261
[  713.225902]  name: temp1, threshold: 151, pid: 4261
[  713.225903]  name: temp2, threshold: 152, pid: 4261
[  713.225904]  name: temp3, threshold: 153, pid: 4261
[  713.225904]  name: temp4, threshold: 154, pid: 4261
[  713.225905]  name: temp5, threshold: 155, pid: 4261
[  713.225905]  name: temp6, threshold: 156, pid: 4261
[  713.225906]  name: temp7, threshold: 157, pid: 4261
[  713.225906]  name: temp8, threshold: 158, pid: 4261
[  713.225907]  name: temp9, threshold: 159, pid: 4261
[  713.225907]  name: temp10, threshold: 160, pid: 4261
[  713.225908]  name: temp11, threshold: 161, pid: 4261
[  713.225908]  name: temp12, threshold: 162, pid: 4261
[  713.225909]  name: temp13, threshold: 163, pid: 4261
[  713.225909]  name: temp14, threshold: 164, pid: 4261
[  713.225910]  name: temp15, threshold: 165, pid: 4261
[  713.225910]  name: temp16, threshold: 166, pid: 4261
[  713.225911]  name: temp17, threshold: 167, pid: 4261
[  713.225911]  name: temp18, threshold: 168, pid: 4261
[  713.225912]  name: temp19, threshold: 169, pid: 4261
[  713.225912]  name: temp20, threshold: 170, pid: 4261
[  713.225913]  name: temp21, threshold: 171, pid: 4261
[  713.225913]  name: temp22, threshold: 172, pid: 4261
[  713.225914]  name: temp23, threshold: 173, pid: 4261
[  713.225914]  name: temp24, threshold: 174, pid: 4261
[  713.225915]  name: temp25, threshold: 175, pid: 4261
[  713.225915]  name: temp26, threshold: 176, pid: 4261
[  713.225916]  name: temp27, threshold: 177, pid: 4261
[  713.225916]  name: temp28, threshold: 178, pid: 4261
[  713.225916]  name: temp29, threshold: 179, pid: 4261
[  713.225917]  name: temp30, threshold: 180, pid: 4261
[  713.225917]  name: temp31, threshold: 181, pid: 4261
[  713.225918] Timestamp (ns): 1699846710299420862
[  713.225919] Recent Load: 0.05, 0.12, 0.08
[  713.225921] task: name rcu_gp, pid 3, state 1026
[  713.225926]  rescuer_thread+0x290/0x390
[  713.225931]  kthread+0xd7/0x100
[  713.225932]  ret_from_fork+0x1f/0x30
[  713.225935] task: name rcu_par_gp, pid 4, state 1026
[  713.225936]  rescuer_thread+0x290/0x390
[  713.225937]  kthread+0xd7/0x100
[  713.225938]  ret_from_fork+0x1f/0x30
[  713.225940] task: name netns, pid 5, state 1026
[  713.225941]  rescuer_thread+0x290/0x390
[  713.225942]  kthread+0xd7/0x100

Parameters

start_watch takes a watch_arg struct. Its fields are as follows:

  • name is limited to MAX_NAME_LEN (15) significant characters.
typedef struct
{
    pid_t task_id;               // current process id
    char name[MAX_NAME_LEN + 1]; // name (15+1)
    void *ptr;                   // virtual address
    int length_byte;             // byte
    long long threshold;         // threshold value
    unsigned char unsigned_flag; // unsigned flag (true: unsigned, false: signed)
    unsigned char greater_flag;  // reverse flag (true: >, false: <)
    unsigned long time_ns;       // timer interval (ns)
} watch_arg;
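
To make the interplay of length_byte, threshold, unsigned_flag, and greater_flag concrete, here is a minimal user-space sketch of how such a check could be evaluated. The helper threshold_reached is hypothetical and is not taken from the module source; it assumes lengths of 1, 2, 4, or 8 bytes and treats any other length as "not reached".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the module): returns 1 when the value
 * stored at ptr crosses the threshold, 0 otherwise. Only lengths of
 * 1, 2, 4 and 8 bytes are handled in this sketch. */
static int threshold_reached(const void *ptr, int length_byte,
                             long long threshold,
                             unsigned char unsigned_flag,
                             unsigned char greater_flag)
{
    assert(ptr != NULL);

    if (unsigned_flag) {
        unsigned long long v;
        switch (length_byte) {
        case 1: { uint8_t t;  memcpy(&t, ptr, sizeof(t)); v = t; break; }
        case 2: { uint16_t t; memcpy(&t, ptr, sizeof(t)); v = t; break; }
        case 4: { uint32_t t; memcpy(&t, ptr, sizeof(t)); v = t; break; }
        case 8: { uint64_t t; memcpy(&t, ptr, sizeof(t)); v = t; break; }
        default: return 0; /* unsupported length in this sketch */
        }
        unsigned long long th = (unsigned long long)threshold;
        return greater_flag ? (v > th) : (v < th);
    }

    long long v;
    switch (length_byte) {
    case 1: { int8_t t;  memcpy(&t, ptr, sizeof(t)); v = t; break; }
    case 2: { int16_t t; memcpy(&t, ptr, sizeof(t)); v = t; break; }
    case 4: { int32_t t; memcpy(&t, ptr, sizeof(t)); v = t; break; }
    case 8: { int64_t t; memcpy(&t, ptr, sizeof(t)); v = t; break; }
    default: return 0; /* unsupported length in this sketch */
    }
    return greater_flag ? (v > threshold) : (v < threshold);
}
```

With greater_flag set, a signed int holding 200 would trip a threshold of 150; with greater_flag cleared, the check fires when the value drops below the threshold instead.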

An initialization example

watch_args = (watch_arg){
    .task_id = getpid(),
    .ptr = &temp,
    .name = "temp",
    .length_byte = sizeof(int),
    .threshold = 150 + i,
    .unsigned_flag = 0,
    .greater_flag = 1,
    .time_ns = 2000 + (i / 33) * 5000
};
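
When many variables are registered, a small constructor can keep the field setup in one place. The sketch below is an illustration only: make_watch_arg is a hypothetical helper, the struct is a local copy of the one shown above (real code would include "watch.h" instead), and the defaults of signed comparison with ">" are assumptions chosen to match the example.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_NAME_LEN 15 /* value stated in this README */

/* Local copy of the struct documented above, for a self-contained sketch. */
typedef struct
{
    pid_t task_id;
    char name[MAX_NAME_LEN + 1];
    void *ptr;
    int length_byte;
    long long threshold;
    unsigned char unsigned_flag;
    unsigned char greater_flag;
    unsigned long time_ns;
} watch_arg;

/* Hypothetical helper: fills a watch_arg for a signed variable that should
 * trigger when its value becomes greater than the threshold. */
static watch_arg make_watch_arg(const char *name, void *ptr, int length_byte,
                                long long threshold, unsigned long time_ns)
{
    watch_arg arg;
    size_t n;

    assert(name != NULL && ptr != NULL && length_byte > 0);

    memset(&arg, 0, sizeof(arg));
    arg.task_id = getpid();

    /* Copy at most MAX_NAME_LEN characters; memset left the terminator. */
    n = strlen(name);
    if (n > MAX_NAME_LEN)
        n = MAX_NAME_LEN;
    memcpy(arg.name, name, n);

    arg.ptr = ptr;
    arg.length_byte = length_byte;
    arg.threshold = threshold;
    arg.unsigned_flag = 0; /* signed comparison */
    arg.greater_flag = 1;  /* trigger on value > threshold */
    arg.time_ns = time_ns;
    return arg;
}
```

The returned struct would then be handed to start_watch in the same way as in helloworld.c.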

Demo

In the project root directory:

  • helloworld.c: tests monitoring a large number of variables
  • hptest.c: tests HugePage mounting
# Build and load the module
make && insmod variable_monitor.ko
./helloworld

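The printed stack information is visible in dmesg.

# Unload the module and clean up build artifacts
rmmod variable_monitor.ko && make clean

Tested only on kernel 5.17.15-1.el8.x86_64; other kernel versions are untested.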

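Other notes

The program has two parts: a character device and a user-space interface, which communicate via ioctl.

User-space address access

  • The virtual address of the variable passed in by the user program is resolved with get_user_pages_remote to obtain the memory page that contains it, and kmap maps that page into the kernel.
    • In the 192.168.40.204 environment, mounting HugeTLB Pages tested normally.
  • The page address plus offset is stored in the kernel_watch_arg belonging to the timer; when the hrTimer polls, it reads kernel_watch_arg to obtain the actual value.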
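
The "page address + offset" arithmetic can be illustrated in plain user-space C. This is only a sketch of the address split, not the module's kernel code: the 4096-byte page size, the page_split type, and both helper functions are assumptions made for the example, and a byte buffer stands in for a kmap-ed page.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed page size for this example */

typedef struct
{
    uintptr_t page_base;  /* address of the start of the containing page */
    unsigned long offset; /* byte offset of the variable inside the page */
} page_split;

/* Split a virtual address into its page base and in-page offset. */
static page_split split_address(uintptr_t addr)
{
    page_split s;
    s.offset = (unsigned long)(addr & (SKETCH_PAGE_SIZE - 1));
    s.page_base = addr - s.offset;
    return s;
}

/* Read an int from a mapped page at the stored offset, which is roughly
 * what a polling timer would do with the saved page address + offset. */
static int read_int_at(const unsigned char *mapped_page, unsigned long offset)
{
    int value;
    assert(mapped_page != NULL);
    assert(offset + sizeof(value) <= SKETCH_PAGE_SIZE);
    memcpy(&value, mapped_page + offset, sizeof(value));
    return value;
}
```

For the address 0x12345678 the split yields a page base of 0x12345000 and an offset of 0x678; the timer side then only needs the mapped page and that offset to read the current value.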

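Timer grouping

  • The hrTimer data structures live in the global array kernel_wtimer_list. When a timer is allocated, kernel_wtimer_list is traversed and the timer intervals are compared.
  • Watches with the same timer interval are assigned to the same group and share one hrTimer.
  • If the number of variables monitored by one timer exceeds TIMER_MAX_WATCH_NUM (32), a new hrTimer is created.
  • The total number of hrTimers (the length of the kernel_wtimer_list array) is limited to MAX_TIMER_NUM (128).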
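
The grouping rules above can be sketched as a small allocation routine. The two limits come from this README; the sketch_timer types and assign_timer are hypothetical stand-ins for kernel_wtimer_list and the module's real allocation logic, which may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

#define TIMER_MAX_WATCH_NUM 32 /* watches per timer, from this README */
#define MAX_TIMER_NUM 128      /* timers in total, from this README */

typedef struct
{
    unsigned long time_ns; /* interval shared by every watch in the group */
    int watch_count;       /* number of watches currently in the group */
} sketch_timer;

typedef struct
{
    sketch_timer timers[MAX_TIMER_NUM];
    int timer_count;
} sketch_timer_list;

/* Assign one watch with the given interval to a timer group.
 * Returns the timer index, or -1 when all timer slots are in use. */
static int assign_timer(sketch_timer_list *list, unsigned long time_ns)
{
    int i;
    assert(list != NULL);

    /* Reuse a timer with the same interval that still has room. */
    for (i = 0; i < list->timer_count; i++) {
        sketch_timer *t = &list->timers[i];
        if (t->time_ns == time_ns && t->watch_count < TIMER_MAX_WATCH_NUM) {
            t->watch_count++;
            return i;
        }
    }

    /* Otherwise open a new timer, if the global limit allows it. */
    if (list->timer_count >= MAX_TIMER_NUM)
        return -1;

    i = list->timer_count++;
    list->timers[i].time_ns = time_ns;
    list->timers[i].watch_count = 1;
    return i;
}
```

Under these rules, 33 watches with an identical interval occupy two timers: the first 32 share one, and the 33rd opens a second.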

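Memory page mount/unmount

  • get_user_pages_remote and kmap increment the corresponding reference counts, so each call needs a matching put_page / kunmap.
  • A module-wide global linked list, watch_local_memory_list, stores the page and kt for every successfully mounted variable; when the character device's close operation runs, the list is traversed and each entry is unmounted.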

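Adding/removing variable monitors

  • The kernel_watch_arg structure has a pid member, but adding a variable monitor does not distinguish by process.
  • Removal traverses all monitored variables and compares pid.
  • The gap left by a removal is filled by moving the last variable into it, followed by sentinel--; hrTimer entries are handled the same way.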
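
The removal strategy described above (fill the hole with the last element, then decrement sentinel) can be sketched as follows. The sketch_watch type and remove_by_pid are hypothetical; the real code operates on kernel_watch_arg entries.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

typedef struct
{
    pid_t task_id; /* owning process */
    int id;        /* stand-in for the rest of the watch data */
} sketch_watch;

/* Remove every watch owned by pid from watches[0..sentinel).
 * Each hole is filled with the last live entry, so the array stays
 * dense but its order is not preserved. Returns the new sentinel. */
static int remove_by_pid(sketch_watch *watches, int sentinel, pid_t pid)
{
    int i = 0;
    assert(watches != NULL && sentinel >= 0);

    while (i < sentinel) {
        if (watches[i].task_id == pid) {
            watches[i] = watches[sentinel - 1];
            sentinel--;
            /* Do not advance i: the entry just moved in must be checked. */
        } else {
            i++;
        }
    }
    return sentinel;
}
```

Each removal is O(1) because nothing is shifted; the cost is that the order of the surviving entries changes, which is acceptable when order carries no meaning.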

Stack output conditions (adapted from diagnose-tools::load.c)

  • A task's stack is printed when it is TASK_RUNNING, satisfies __task_contributes_to_load, or is TASK_IDLE (blocked processes may be present).
  • __task_contributes_to_load corresponds to the kernel macro task_contributes_to_load.
// https://www.spinics.net/lists/kernel/msg3582022.html
// removed from the kernel in 5.8-rc3, but the logic still works
// whether the task contributes to the load
#define __task_contributes_to_load(task)                                                                               \
    ((READ_ONCE((task)->__state) & TASK_UNINTERRUPTIBLE) != 0 && ((task)->flags & PF_FROZEN) == 0 &&                   \
     (READ_ONCE((task)->__state) & TASK_NOLOAD) == 0)
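
As a worked example, the tasks in the sample dmesg output report state 1026. The user-space sketch below reproduces the bit arithmetic. The SK_-prefixed constants mirror values believed to match the 5.x kernel headers (an assumption to verify against your kernel's include/linux/sched.h), and should_dump_stack is one reading of the condition described above, not the module's actual code.

```c
#include <assert.h>

/* Assumed values, mirroring 5.x kernel headers; verify for your kernel. */
#define SK_TASK_RUNNING         0x0000u
#define SK_TASK_UNINTERRUPTIBLE 0x0002u
#define SK_TASK_NOLOAD          0x0400u
#define SK_TASK_IDLE            (SK_TASK_UNINTERRUPTIBLE | SK_TASK_NOLOAD)
#define SK_PF_FROZEN            0x00010000u

/* Same logic as the __task_contributes_to_load macro, on plain integers. */
static int contributes_to_load(unsigned int state, unsigned int flags)
{
    return (state & SK_TASK_UNINTERRUPTIBLE) != 0 &&
           (flags & SK_PF_FROZEN) == 0 &&
           (state & SK_TASK_NOLOAD) == 0;
}

/* Hypothetical combined condition: running, contributing to load, or idle. */
static int should_dump_stack(unsigned int state, unsigned int flags)
{
    return state == SK_TASK_RUNNING ||
           contributes_to_load(state, flags) ||
           state == SK_TASK_IDLE;
}
```

With these values, 1026 equals TASK_UNINTERRUPTIBLE | TASK_NOLOAD, that is TASK_IDLE. Such a task does not contribute to load because the NOLOAD bit is set, which would explain why TASK_IDLE is listed as a separate condition and why kernel threads such as rcu_gp appear in the dump.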