[E21 site] Alert raised: OLAP Asset Open too many files

ID Creation Date Assignee Status
OMPUB-1401 2024-08-05T16:51:26.000+0800 张洪庆 Closed

No description


zhanghongqing commented on 2024-08-05T17:41:05.485+0800:

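Cause:

  • Inspecting the open file handles on the server shows that most of them are held by the Druid historical process. !image-2024-08-05-17-12-22-298.png|width=848,height=146!
  • As the number of Druid segment files grows, the number of file handles opened by the {{historical}} process grows with it. Segment files are added every day; once they reach the end of their retention period, the number of segment files levels off and so does the number of open file handles.
  • The monitoring alert threshold for open file handles is currently 60000. The number of open handles on the Druid server exceeded this 60000 limit, which triggered the alert.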
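
For reference, the per-process handle check described above can be approximated with a short script. This is only a minimal sketch, assuming a Linux host where `/proc/<pid>/fd` is readable (usually this requires root to see other users' processes). The function names `count_open_fds` and `top_fd_holders` are hypothetical and are not part of any monitoring tool used on site. Druid runs on the JVM, so the historical process would most likely show up with the command name `java`.

```python
from pathlib import Path


def count_open_fds(pid, proc_root="/proc"):
    """Return the number of open file descriptors for pid, or None if unreadable."""
    fd_dir = Path(proc_root) / str(pid) / "fd"
    try:
        # Each entry under /proc/<pid>/fd is one open descriptor.
        return sum(1 for _ in fd_dir.iterdir())
    except OSError:
        # Process exited, or we lack permission to inspect it.
        return None


def top_fd_holders(proc_root="/proc", limit=5):
    """Return up to `limit` (pid, command, fd_count) tuples, largest count first."""
    rows = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # only numeric directories are processes
        count = count_open_fds(entry.name, proc_root)
        if count is None:
            continue
        try:
            command = (entry / "comm").read_text().strip()
        except OSError:
            command = "?"
        rows.append((int(entry.name), command, count))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]
```

Calling `top_fd_holders()` on the OLAP server and printing the rows should list the processes holding the most descriptors, which is the same information the screenshot shows.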

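Resolution:

  • Raise the operating-system file handle limit on the OLAP server to 1048576.
  • Raise the monitoring alert threshold for open file handles to 100000.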
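
To illustrate how the two values relate, here is a minimal sketch that classifies a handle count against the alert threshold (100000) and the operating-system limit (1048576). It assumes the system-wide count is taken from `/proc/sys/fs/file-nr`, whose three fields are allocated handles, allocated-but-unused handles, and the maximum. The helper names and the sample count of 65000 are illustrative assumptions, not values or code from the site.

```python
def parse_file_nr(text):
    """Parse /proc/sys/fs/file-nr: allocated, allocated-but-unused, maximum."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return allocated, unused, maximum


def check_handles(open_handles, alert_threshold=100_000, os_limit=1_048_576):
    """Classify an open-handle count against the alert threshold and OS limit."""
    if alert_threshold >= os_limit:
        # An alert that can only fire once the limit is hit gives no warning.
        raise ValueError("alert threshold should sit below the OS limit")
    if open_handles >= os_limit:
        return "exhausted"
    if open_handles > alert_threshold:
        return "alert"
    return "ok"


# Illustrative sample only: 65000 allocated handles, limit already raised.
allocated, _unused, maximum = parse_file_nr("65000\t0\t1048576\n")
print(check_handles(allocated, os_limit=maximum))                          # ok
print(check_handles(allocated, alert_threshold=60_000, os_limit=maximum))  # alert
```

With the old threshold of 60000 the sample count raises an alert; with the new threshold of 100000 it does not, while the OS limit leaves ample headroom above the alert level.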

songlongkun commented on 2024-08-05T20:13:23.224+0800:

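[~zhanghongqing] Please provide a problem statement (explanation) that can be given to the customer. The division chief is asking for the problem statement and the resolution.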


zhanghongqing commented on 2024-08-06T16:11:36.415+0800:

[~songlongkun] The problem statement is as follows: In TSG 24.02, a new feature called Object Statistics Metrics was introduced, which allows the system to compute statistics on objects and output the results to OLAP. However, the system may trigger an "OLAP Asset Open too many files" alert. This alert is typically caused by the number of open file descriptors exceeding the configured threshold. To resolve this issue, the file descriptor parameter fs.file-max in the server's operating system has been raised to 1,048,576, and the file descriptor alert threshold has been set to 100,000.


Attachments

Attachment: alert-message-2024-08-05+08-59-02.xlsx

Attachment: image-2024-08-05-17-12-22-298.png