# P19 environment NZ: prometheus wal directory on server 10.10.20.159 uses too much disk space
| ID | Creation Date | Assignee | Status |
|----|----------------|----------|--------|
| OMPUB-967 | 2023-07-17T14:16:54.000+0800 | 史振东 | Closed |
---
!image-2023-07-17-11-16-07-018.png!
!image-2023-07-17-11-16-30-474.png!
 **shizhendong** commented on *2023-07-18T16:01:03.467+0800*:
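Configuration mode: metrics data = local storage, federation disabled. 10.159 and 20.159 each run a Global Nz-agent with horizontal scaling and write metrics data to each other.
Root cause: when prometheus on 10.159 pushed data to 20.159 via remote write, it wrote dirty data.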
Resolution:
* Tune the prometheus remote write configuration
* Delete the wal directory and restart the prometheus service
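To check that deleting the wal directory actually reclaimed disk space, its size can be measured before and after the restart. The following is a minimal sketch, not the procedure used on site: the helper names `dir_size_bytes` and `human` are illustrative, and the wal path is passed on the command line because the ticket does not state where prometheus stores its data on this server.

```python
import os
import sys


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                # Skip symlinks; tolerate files that disappear mid-walk,
                # since WAL segments can be rotated while we scan.
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                pass
    return total


def human(n):
    """Format a byte count using binary units."""
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if n < 1024 or unit == "TiB":
            return f"{n:.1f} {unit}"
        n /= 1024


if __name__ == "__main__":
    # The wal directory path is an assumption supplied by the operator.
    wal_dir = sys.argv[1]
    print(wal_dir, human(dir_size_bytes(wal_dir)))
```

Running it once before the cleanup and once after the restart gives a simple before/after comparison.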
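Investigation:
* Checked the prometheus service logs and found a large number of "out of order sample" errors
* Suspected that the "out of order sample" errors were causing the sharp growth of data in the wal directory
* Reproduced the issue by simulating the P site deployment mode in a test; this confirmed that the data sent by prometheus remote write was incorrect and that this was the cause of the wal directory growth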
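The log check in the first investigation step can be sketched as a small script that counts how many log lines mention "out of order sample". This is an illustrative helper rather than the tool used in the investigation: the function name `count_matches` is hypothetical, and the file name in the usage section assumes the attached prometheus-20.159.log has been downloaded to the working directory.

```python
def count_matches(lines, needle="out of order sample"):
    """Return (matching_lines, total_lines) for a case-insensitive substring search."""
    needle = needle.lower()
    total = 0
    matched = 0
    for line in lines:
        total += 1
        if needle in line.lower():
            matched += 1
    return matched, total


if __name__ == "__main__":
    # Assumed local copy of the attachment; errors="replace" guards
    # against stray non-UTF-8 bytes in the log.
    with open("prometheus-20.159.log", encoding="utf-8", errors="replace") as f:
        matched, total = count_matches(f)
    print(f"{matched} of {total} lines mention 'out of order sample'")
```

A high share of matching lines supports the suspicion that rejected samples are tied to the wal growth, but it does not prove it on its own; the reproduction test in the last step is what confirmed the cause.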
---
# Attachments
Attachment: image-2023-07-17-11-16-07-018.png
![image-2023-07-17-11-16-07-018.png](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/41394/image-2023-07-17-11-16-07-018.png)
Attachment: image-2023-07-17-11-16-30-474.png
![image-2023-07-17-11-16-30-474.png](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/41393/image-2023-07-17-11-16-30-474.png)
Attachment: prometheus-20.159.log
[prometheus-20.159.log](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/41475/prometheus-20.159.log)