## Variable Monitor

Monitor numeric variables (specified by address and length) and print system stack information when a configured threshold condition is met.

Number of variables monitored simultaneously
- Watches with the same timer interval are grouped together and share one timer.
- A group holds up to 32 variables; beyond that, a new timer is allocated.
- The global maximum number of timers is 128.
- These limits are defined as macros in the `watch_module.h` header.

Currently, monitoring is limited to a single application; simultaneous calls from multiple applications are not supported.
- Multiple applications can work normally if only one program calls `cancel_all_watch();`.

## Usage

Example: helloworld.c
- Add `#include "watch.h"`.
- For each variable to monitor, set its name, address, and length, along with the threshold, comparison method, timer interval (ns), and so on.
- Call `start_watch(watch_arg);` to start monitoring.
- Call `cancel_all_watch();` when you need to cancel monitoring.

When a configured condition is met, the system stack information is printed to the kernel log, where it can be viewed with `dmesg`, as shown in the following example:
- Within a single timer, if multiple variables exceed their thresholds, the stack information is printed only once.
- After the stack is printed, the timer restarts after 1 s, and the next round of monitoring begins at that point.

```log
[86245.364861] -------------------------------------
[86245.364864] -------------watch monitor-----------
[86245.364865] Threshold reached:
name: temp0, threshold: 150
[86245.364866] Timestamp (ns): 1699589000606300743
[86245.364867] Recent Load: 116.65, 126.83, 151.17
[86245.365669] task: name lcore-worker-4, pid 803327
[86245.365672] task: name lcore-worker-5, pid 803328
[86245.365673] task: name lcore-worker-6, pid 803329
[86245.365674] task: name lcore-worker-7, pid 803330
[86245.365676] task: name lcore-worker-8, pid 803331
[86245.365677] task: name lcore-worker-9, pid 803332
[86245.365679] task: name lcore-worker-10, pid 803333
[86245.365681] task: name lcore-worker-11, pid 803334
[86245.365682] task: name lcore-worker-68, pid 803335
[86245.365683] task: name lcore-worker-69, pid 803336
[86245.365684] task: name lcore-worker-70, pid 803337
[86245.365685] task: name lcore-worker-71, pid 803338
[86245.365686] task: name lcore-worker-72, pid 803339
[86245.365687] task: name lcore-worker-73, pid 803340
[86245.365688] task: name lcore-worker-74, pid 803341
[86245.365689] task: name lcore-worker-75, pid 803342
[86245.365694] task: name pkt:worker-0, pid 803638
[86245.365702] hrtimer_nanosleep+0x8d/0x120
[86245.365709] __x64_sys_nanosleep+0x96/0xd0
[86245.365711] do_syscall_64+0x37/0x80
[86245.365716] entry_SYSCALL_64_after_hwframe+0x44/0xae
[86245.365718] task: name pkt:worker-1, pid 803639
[86245.365721] hrtimer_nanosleep+0x8d/0x120
[86245.365724] __x64_sys_nanosleep+0x96/0xd0
[86245.365726] do_syscall_64+0x37/0x80
[86245.365728] entry_SYSCALL_64_after_hwframe+0x44/0xae
[86245.365730] task: name pkt:worker-2, pid 803640
[86245.365732] hrtimer_nanosleep+0x8d/0x120
[86245.365734] __x64_sys_nanosleep+0x96/0xd0
[86245.365737] do_syscall_64+0x37/0x80
[86245.365739] entry_SYSCALL_64_after_hwframe+0x44/0xae
[86245.365740] task: name pkt:worker-3, pid 803641
[86245.365743] hrtimer_nanosleep+0x8d/0x120
```

### Parameter Description

`start_watch` takes a `watch_arg` structure. The fields have the following meanings:
- `name` is limited to `MAX_NAME_LEN` (15) valid characters.

```c
typedef struct
{
    pid_t task_id;               // current process id
    char name[MAX_NAME_LEN + 1]; // name (15 characters + terminating NUL)
    void *ptr;                   // virtual address of the variable
    int length_byte;             // length of the variable in bytes
    long long threshold;         // threshold value
    unsigned char unsigned_flag; // unsigned flag (true: unsigned, false: signed)
    unsigned char greater_flag;  // comparison direction (true: >, false: <)
    unsigned long time_ns;       // timer interval (ns)
} watch_arg;
```
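
The interplay of `length_byte`, `unsigned_flag`, and `greater_flag` can be illustrated in plain user-space C. The following is a minimal sketch, not the module's actual code: `watch_cond`, `read_value`, and `threshold_crossed` are hypothetical names, and the sketch assumes only 1-, 2-, 4-, and 8-byte variables are monitored.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of the watch_arg fields that matter for comparison. */
typedef struct {
    const void *ptr;
    int length_byte;
    long long threshold;
    unsigned char unsigned_flag;
    unsigned char greater_flag;
} watch_cond;

/* Read length_byte bytes at ptr and widen them to long long.
 * Returns false for widths this sketch does not handle. */
static bool read_value(const watch_cond *w, long long *out)
{
    switch (w->length_byte) {
    case 1:
        if (w->unsigned_flag) { uint8_t v; memcpy(&v, w->ptr, 1); *out = v; }
        else                  { int8_t v;  memcpy(&v, w->ptr, 1); *out = v; }
        return true;
    case 2:
        if (w->unsigned_flag) { uint16_t v; memcpy(&v, w->ptr, 2); *out = v; }
        else                  { int16_t v;  memcpy(&v, w->ptr, 2); *out = v; }
        return true;
    case 4:
        if (w->unsigned_flag) { uint32_t v; memcpy(&v, w->ptr, 4); *out = v; }
        else                  { int32_t v;  memcpy(&v, w->ptr, 4); *out = v; }
        return true;
    case 8: {
        int64_t v;
        memcpy(&v, w->ptr, 8); /* reinterpreted as unsigned below if needed */
        *out = v;
        return true;
    }
    default:
        return false;
    }
}

/* True when the current value is beyond the threshold in the chosen direction. */
static bool threshold_crossed(const watch_cond *w)
{
    long long v;
    if (!read_value(w, &v))
        return false;
    if (w->length_byte == 8 && w->unsigned_flag) {
        /* 64-bit unsigned values must be compared as unsigned. */
        unsigned long long uv = (unsigned long long)v;
        unsigned long long ut = (unsigned long long)w->threshold;
        return w->greater_flag ? uv > ut : uv < ut;
    }
    return w->greater_flag ? v > w->threshold : v < w->threshold;
}
```

With an `int` holding 151, a threshold of 150, and `greater_flag = 1`, `threshold_crossed` returns true. The same byte value 200 crosses a threshold of 150 when read as unsigned but not when read as a signed 8-bit value, which is why `unsigned_flag` has to be set correctly.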

An initialization example:

```c
watch_args = (watch_arg){
    .task_id = getpid(),
    .ptr = &temp,
    .name = "temp",
    .length_byte = sizeof(int),
    .threshold = 150 + i,
    .unsigned_flag = 0,
    .greater_flag = 1,
    .time_ns = 2000 + (i / 33) * 5000
};
```
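
The `time_ns` expression relies on integer division. Assuming `i` is a loop index over the watches being registered (the snippet does not show the loop), indexes 0 to 32 get 2000 ns, 33 to 65 get 7000 ns, and so on. That puts 33 watches on each interval, one more than the 32-per-timer limit quoted earlier, so each interval would presumably need a second timer. The helper names below are hypothetical and only reproduce the arithmetic.

```c
#define TIMER_MAX_WATCH_NUM 32 /* per-timer limit quoted in this README */

/* Interval produced by the initialization example for loop index i. */
static unsigned long example_interval_ns(int i)
{
    return 2000UL + (unsigned long)(i / 33) * 5000UL;
}

/* Timers needed for n watches that share one interval (ceiling division). */
static int timers_needed(int n)
{
    return (n + TIMER_MAX_WATCH_NUM - 1) / TIMER_MAX_WATCH_NUM;
}
```

For instance, `example_interval_ns(32)` and `example_interval_ns(33)` differ by 5000 ns, and `timers_needed(33)` is 2.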

## Demo

In the main project directory:

```bash
make && insmod watch_module.ko
./watch
```

The printed stack information can then be viewed with `dmesg`.

```bash
# Unload module and clean compile files
rmmod watch_module.ko && make clean
```

Only tested on kernel 5.17.15-1.el8.x86_64.

## Other

The program consists of two parts, a character device and a user-space interface, which communicate through `ioctl`.

User-space address access
- The user program passes in the variable's virtual address. The module uses `get_user_pages_remote` to obtain the memory page containing that address, and `kmap` to map the page into the kernel.
- In the 192.168.40.204 environment, HugeTLB pages were tested and mount normally.
- The mapped page address plus the in-page offset is stored in the `kernel_watch_arg` belonging to the timer. On each poll, the hrtimer reads through `kernel_watch_arg` to get the current value.

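The "page address + offset" split can be shown with ordinary address arithmetic. This is a user-space sketch only: it assumes 4 KiB pages (HugeTLB pages are larger), and the helper names are hypothetical. The single-page check is an illustration; the README does not say how a variable that straddles a page boundary is handled.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE ((uintptr_t)4096) /* assumption: 4 KiB pages */

/* Start address of the page that contains addr. */
static uintptr_t page_base(uintptr_t addr)
{
    return addr & ~(SKETCH_PAGE_SIZE - 1);
}

/* Offset of addr inside its page. */
static uintptr_t page_offset(uintptr_t addr)
{
    return addr & (SKETCH_PAGE_SIZE - 1);
}

/* True if [addr, addr + len) lies within one page, so one mapping is enough. */
static int fits_in_one_page(uintptr_t addr, uintptr_t len)
{
    return len > 0 && page_offset(addr) + len <= SKETCH_PAGE_SIZE;
}
```

Once the page is mapped, the kernel-side address of the variable would be the mapped page address plus `page_offset(addr)`.
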
Timer grouping
- The hrtimer data structures are kept in the global array `kernel_wtimer_list`. When a timer is allocated, `kernel_wtimer_list` is traversed and the timer intervals are compared.
- Watches with the same timer interval are assigned to the same group and share one hrtimer.
- If the number of variables monitored by a timer exceeds `TIMER_MAX_WATCH_NUM` (32), a new hrtimer is created.
- The total number of hrtimers (the length of the `kernel_wtimer_list` array) is limited to `MAX_TIMER_NUM` (128).

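The grouping rule above can be sketched as a small user-space simulation. Only `TIMER_MAX_WATCH_NUM` and `MAX_TIMER_NUM` come from this README; `sketch_timer`, `timer_list`, `timer_count`, and `assign_timer` are hypothetical, and the real module's bookkeeping may differ.

```c
#define TIMER_MAX_WATCH_NUM 32 /* max watches per timer */
#define MAX_TIMER_NUM 128      /* max timers overall */

/* Hypothetical stand-in for one entry of kernel_wtimer_list. */
typedef struct {
    unsigned long time_ns; /* interval shared by every watch in the group */
    int watch_count;       /* watches currently attached to this timer */
} sketch_timer;

static sketch_timer timer_list[MAX_TIMER_NUM];
static int timer_count;

/* Attach a watch with the given interval to a timer.
 * Returns the timer index, or -1 when every timer slot is in use. */
static int assign_timer(unsigned long time_ns)
{
    /* Reuse an existing timer with the same interval if it has room. */
    for (int i = 0; i < timer_count; i++) {
        if (timer_list[i].time_ns == time_ns &&
            timer_list[i].watch_count < TIMER_MAX_WATCH_NUM) {
            timer_list[i].watch_count++;
            return i;
        }
    }
    /* Otherwise allocate a new timer, subject to the global limit. */
    if (timer_count >= MAX_TIMER_NUM)
        return -1;
    timer_list[timer_count].time_ns = time_ns;
    timer_list[timer_count].watch_count = 1;
    return timer_count++;
}
```

Registering 33 watches at 2000 ns fills timer 0 with 32 of them and places the last one on timer 1. A watch with a different interval then gets timer 2.
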
Memory page mount/unmount
- `get_user_pages_remote` and `kmap` each increase the corresponding reference count, so every call requires a matching `put_page` or `kunmap`.
- A global linked list in the module, `watch_local_memory_list`, stores the page and kt for each successfully mounted variable. When the character device is closed, the list is traversed and every entry is unmounted.

Stack output conditions: The conditions are taken from [diagnose-tools::load.c](https://github.com/alibaba/diagnose-tools/blob/e285bc4626a7d207eabd4a69cb276e1a3b1b7c76/SOURCE/module/kernel/load.c#L209)
- A task's stack is printed if the task is in `TASK_RUNNING` or satisfies `__task_contributes_to_load`. The two cannot both hold, because the macro requires `TASK_UNINTERRUPTIBLE` to be set.
- `__task_contributes_to_load` corresponds to the kernel macro `task_contributes_to_load`.

```c
// https://www.spinics.net/lists/kernel/msg3582022.html
// Removed from the kernel in 5.8-rc3, but the check still works.
// Whether the task contributes to the load.
#define __task_contributes_to_load(task) \
    ((READ_ONCE((task)->__state) & TASK_UNINTERRUPTIBLE) != 0 && \
     ((task)->flags & PF_FROZEN) == 0 && \
     (READ_ONCE((task)->__state) & TASK_NOLOAD) == 0)
```
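
To see which states the macro accepts, the same predicate can be exercised in user space. The sketch below is illustrative only: the `SK_`-prefixed constants are assumed to mirror the values in `include/linux/sched.h` for 5.x kernels and should be checked against the target kernel, and `sk_task` is a hypothetical stand-in for `struct task_struct`.

```c
#include <stdbool.h>

/* Assumed values, mirroring include/linux/sched.h in 5.x kernels. */
#define SK_TASK_UNINTERRUPTIBLE 0x0002u
#define SK_TASK_NOLOAD          0x0400u
#define SK_PF_FROZEN            0x00010000u

/* Hypothetical stand-in for the two task_struct fields the macro reads. */
typedef struct {
    unsigned int state;
    unsigned int flags;
} sk_task;

/* Same three conditions as __task_contributes_to_load. */
static bool sk_contributes_to_load(const sk_task *t)
{
    return (t->state & SK_TASK_UNINTERRUPTIBLE) != 0 &&
           (t->flags & SK_PF_FROZEN) == 0 &&
           (t->state & SK_TASK_NOLOAD) == 0;
}
```

Under these assumptions, a task in uninterruptible sleep counts toward the load. A task with both `TASK_UNINTERRUPTIBLE` and `TASK_NOLOAD` set, a frozen task, and a task whose state is 0 (running) do not satisfy the macro.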