# 【XJ-NPM】IDC environment: HBase and Hadoop service monitoring port anomalies
| ID | Creation Date | Assignee | Status |
|----|----------------|----------|--------|
| OMPUB-1026 | 2023-10-08T19:08:24.000+0800 | 戚岱杰 | Resolved |
---
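1. During a routine inspection, an alert reported the HBase and Hadoop services as down, as shown in attachment 1-1告警信息.png (alert information).
2. This was reported to the local big data team. Their investigation found that the HBase and Hadoop services were running normally, but that their monitoring ports had changed.
3. Current situation:
   1. The HBase monitoring chart shows the service restarting frequently, more than 10 times within 24 hours, as shown in attachments 1-2 and 1-3 (Hbase status). The HBase monitoring port also changes frequently. Does this situation pose a risk, and how should it be handled?
   2. After the Hadoop monitoring port changed, Nezha (哪吒) still monitors the original port, so one Hadoop host is permanently shown as down. Should the monitoring port in Nezha be changed to the port Hadoop currently uses, or is there another solution?

---
**qidaijie** commented on *2023-10-09T18:01:07.276+0800*: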
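Current status:
1. The issue is unrelated to the monitoring port. The HRegionServer and DataNode processes on server 231.103 are in an abnormal state.
2. Data node 103 has been removed from the cluster, so normal use is not affected.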
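
The status above rules out the monitoring port as the cause. One way to double-check whether a given monitoring port still accepts connections is a plain TCP connect probe. The following is a minimal sketch, not the check that was actually run: `port_is_open` is a hypothetical helper name, and the ticket does not state the real host or port values, so the caller has to supply them.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection completes the TCP handshake; the context manager
        # closes the socket again immediately afterwards.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A successful probe only shows that something is listening on the port. It does not show that the process behind it is healthy, which matters here because the affected processes turned out to be abnormal even though the master web UIs were reachable.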
 
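Details:
1. The HRegionServer and DataNode processes were started on March 15/16, 2023, and there is no record of a restart since then.
!进程启动时间.png|thumbnail!
2. The most recent HRegionServer log entry is dated September 22, 2023. In addition, running `ll` inside the HBase log directory hangs.
3. The Hadoop and HBase master nodes running on 231.103 both provide service normally, and their web UIs are accessible.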
 
In summary, the HRegionServer and DataNode processes are suspected to be hung or zombie processes. This needs further confirmation.
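
The suspicion above can be checked on a Linux host by reading the process state from `/proc/<pid>/stat`. The sketch below is illustrative only and is not taken from the ticket: the helper names are hypothetical, and the caller would have to look up the PIDs of the HRegionServer and DataNode processes. State `Z` means zombie, and `D` means uninterruptible sleep, which is usually a process blocked on I/O.

```python
from pathlib import Path


def parse_state(stat_line: str) -> str:
    """Extract the one-letter state from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain spaces
    or parentheses, so split on the LAST ')' instead of on whitespace.
    """
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def process_state(pid: int) -> str:
    """Read the state letter of a live process (Linux only)."""
    return parse_state(Path(f"/proc/{pid}/stat").read_text())


def looks_stuck(state: str) -> bool:
    """Flag zombie (Z) and uninterruptible-sleep (D) states as suspicious."""
    return state in ("Z", "D")
```

A process can pass through `D` briefly during normal I/O, so a single sample is weak evidence. A state that stays `D` or `Z` across repeated samples is a stronger sign of the kind of stuck process suspected here.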
---
**qidaijie** commented on *2023-10-11T11:39:26.548+0800*:
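Follow-up investigation confirmed that the issue was caused by HRegionServer and DataNode zombie processes, consistent with the symptoms seen when zombie processes occurred in GAL-248.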
!僵尸进程.png|thumbnail!
 
Resolution: the Linux kernel was upgraded from 3.10.0-693.el7.x86_64 to kernel-3.10.0-1160.el7.x86_64. After the reboot, the processes returned to normal. Monitoring will continue.
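
To confirm that other nodes are not still on the older kernel, the release string can be compared against the upgraded version named above. This is a simplified sketch with hypothetical helper names: it compares only the leading `X.Y.Z-N` numbers and ignores the dist tag, the architecture, and any further build numbers, so it is not a full RPM version comparison.

```python
import re


def kernel_key(release: str) -> tuple:
    """Turn '3.10.0-1160.el7.x86_64' into (3, 10, 0, 1160) for comparison.

    An optional 'kernel-' package prefix is accepted and ignored.
    """
    match = re.match(r"(?:kernel-)?(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def is_at_least(release: str, minimum: str) -> bool:
    """Return True if `release` is the same as or newer than `minimum`."""
    return kernel_key(release) >= kernel_key(minimum)
```

On a running host, the release string can be obtained from `uname -r` or from Python's `platform.release()` and passed as the first argument, with `3.10.0-1160.el7.x86_64` as the minimum.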
 
---
# Attachments
Attachment: 1-1告警信息.png
![1-1告警信息.png](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/45414/1-1告警信息.png)
Attachment: 1-2+Hbase+status.png
![1-2+Hbase+status.png](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/45413/1-2+Hbase+status.png)
Attachment: 1-3+Hbase+status.png
![1-3+Hbase+status.png](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/45412/1-3+Hbase+status.png)
Attachment: HBase界面.jpg
![HBase界面.jpg](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/45419/HBase界面.jpg)
Attachment: HDFS界面.jpg
![HDFS界面.jpg](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/45420/HDFS界面.jpg)
Attachment: 僵尸进程.png
![僵尸进程.png](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/45625/僵尸进程.png)
Attachment: 进程启动时间.png
![进程启动时间.png](https://gfwleak.exec.li/admin/geedge-jira/raw/branch/master/attachment/45475/进程启动时间.png)