# [XJ-NPM site] NEZHA 22.02: Loki status repeatedly abnormal
| ID | Creation Date | Assignee | Status |
|----|----------------|----------|--------|
| OMPUB-688 | 2022-11-07T17:12:44.000+0800 | 史振东 | Resolved |
---
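The steps taken to troubleshoot nezha were as follows:

1. Charts report the error `No PromServer available` (the prometheus service is unavailable).
   !image-2022-11-07-16-59-33-332.png|width=692,height=286!
2. In Explore in the Nezha web UI, running a metric query reports that the prometheus service is unavailable.
   !image-2022-11-07-16-59-48-757.png|width=683,height=293!
3. Opening the prometheus UI in a browser and running the metric query returns:
   1. Warning: unexpected response status when getting the server time: Service Unavailable.
   2. Error executing query: the query timed out in the query queue.

   Metric data cannot be displayed normally.
   !image-2022-11-07-17-00-03-215.png|width=691,height=204!
4. Check the nz-agent-error log:
   /var/log/nezha/nz-agent/nz-agent-error-2022-11-06.0.log
   The error log shows a `Read timed out` when connecting to the application on port 13100.
   !image-2022-11-07-17-01-24-057.png|width=702,height=27!
5. The port plan in the conf space shows that port 13100 belongs to the loki component.
   !image-2022-11-07-17-02-33-101.png|width=351,height=82!
6. The loki status shows as stopped.
   !image-2022-11-07-17-02-59-339.png|width=686,height=153!
7. Check the loki process information. The loki process state is `Dsl`, where `D` means uninterruptible sleep.
   !image-2022-11-07-17-03-30-219.png|width=696,height=28!

---
**shizhendong** commented on *2022-11-09T17:04:46.162+0800*: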
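Cause of the bug: the loki process was in an uninterruptible sleep state (disk sleep), which left nz-agent in an abnormal state and broke the monitoring functions.

Troubleshooting process:

1. The loki.service log shows that systemd failed to restart loki because the port was already in use:
   `level=error msg="error running loki" err="listen tcp :13100: bind: address already in use`
2. Checking the port usage shows that the port is held by the loki process itself.
   !1.png|thumbnail!
3. Checking that process's state confirms that the loki process is hung and cannot be killed.
   !2.png|thumbnail!
4. Suspected causes: (1) NIC sleep, (2) abnormal disk I/O. The disk checks and the NIC status were all normal, and no anomaly was found.
5. The bug was worked around temporarily by rebooting the server.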
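
The process-state check described above can be repeated without screenshots. The following is a minimal sketch, not the procedure used on site: it assumes a Linux host with `/proc` mounted, and the helper names `parse_stat_line` and `find_uninterruptible` are hypothetical. It lists every process whose state field in `/proc/<pid>/stat` is `D`.

```python
from pathlib import Path


def parse_stat_line(line: str) -> tuple[int, str, str]:
    """Parse the pid, command name and state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')' characters, so split on the last ')'.
    left, _, right = line.rpartition(")")
    pid_text, _, comm = left.partition("(")
    state = right.split()[0]
    return int(pid_text), comm, state


def find_uninterruptible(proc_root: str = "/proc") -> list[tuple[int, str]]:
    """Return (pid, comm) for every process currently in state D."""
    stuck = []
    for stat_file in Path(proc_root).glob("[0-9]*/stat"):
        try:
            pid, comm, state = parse_stat_line(stat_file.read_text())
        except (OSError, ValueError, IndexError):
            continue  # the process exited, or the line was not parseable
        if state == "D":
            stuck.append((pid, comm))
    return stuck


if __name__ == "__main__":
    for pid, comm in find_uninterruptible():
        print(f"{pid}\t{comm}\tD (uninterruptible sleep)")
```

A process can pass through state `D` briefly during normal I/O, so a single hit is not conclusive. The condition in this ticket would show up as the same pid being reported across repeated runs.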
---
**shizhendong** commented on *2022-11-09T17:08:37.513+0800*:
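This bug remains under observation, and the following monitoring has been added for NEZHA in this environment:

1. Basic host monitoring for 10.111.231.101 (CPU, Disk, Memory, ...), with the corresponding monitoring charts added to asset info.
2. NZ component monitoring (Dashboard = NEZHA monitoring), covering the program components (Prometheus, loki, redis, mariadb, etc.).
3. Nz alert rule, covering the NEZHA components and the basic host metrics on device 10.111.231.101.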
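
As an illustration of what a component check for this failure mode might look like, here is a minimal sketch. It is not the Nz alert rule itself. The error seen by nz-agent was a read timeout rather than a refused connection, so a check that only opens the port would likely miss it; this one issues an HTTP request and waits for a response. The function name `check_http` is hypothetical, and the URL in the usage line assumes that loki serves HTTP on port 13100 of 10.111.231.101 and exposes its `/ready` endpoint there.

```python
import http.client
import socket
import urllib.error
import urllib.request


def check_http(url: str, timeout: float = 5.0) -> str:
    """Return 'ok', 'unhealthy', 'timeout' or 'unreachable' for an HTTP endpoint."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return "ok" if 200 <= resp.status < 300 else "unhealthy"
    except urllib.error.HTTPError:
        return "unhealthy"  # the server answered, but with a 4xx/5xx status
    except (socket.timeout, TimeoutError):
        return "timeout"  # connected, but no response arrived in time
    except urllib.error.URLError as exc:
        # Timeouts while connecting are wrapped in URLError.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "timeout"
        return "unreachable"
    except (OSError, http.client.HTTPException):
        return "unreachable"


if __name__ == "__main__":
    # Assumed URL: host and port come from this ticket, the path is loki's
    # readiness endpoint.
    print(check_http("http://10.111.231.101:13100/ready"))
```

A `timeout` result matches the symptom in this ticket (the port is held but nothing answers), while `unreachable` would point to loki not listening at all.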
---
# Attachments
Attachment: 1.png
![1.png](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/32643/1.png)
Attachment: 2.png
![2.png](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/32644/2.png)
Attachment: image-2022-11-07-16-59-33-332.png
![image-2022-11-07-16-59-33-332.png](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/32604/image-2022-11-07-16-59-33-332.png)
Attachment: image-2022-11-07-16-59-48-757.png
![image-2022-11-07-16-59-48-757.png](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/32603/image-2022-11-07-16-59-48-757.png)
Attachment: image-2022-11-07-17-00-03-215.png
![image-2022-11-07-17-00-03-215.png](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/32602/image-2022-11-07-17-00-03-215.png)
Attachment: image-2022-11-07-17-01-24-057.png
![image-2022-11-07-17-01-24-057.png](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/32601/image-2022-11-07-17-01-24-057.png)
Attachment: image-2022-11-07-17-02-17-148.png
![image-2022-11-07-17-02-17-148.png](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/32600/image-2022-11-07-17-02-17-148.png)
Attachment: image-2022-11-07-17-02-33-101.png
![image-2022-11-07-17-02-33-101.png](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/32599/image-2022-11-07-17-02-33-101.png)
Attachment: image-2022-11-07-17-02-59-339.png
![image-2022-11-07-17-02-59-339.png](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/32598/image-2022-11-07-17-02-59-339.png)
Attachment: image-2022-11-07-17-03-30-219.png
![image-2022-11-07-17-03-30-219.png](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/32597/image-2022-11-07-17-03-30-219.png)