[E21 site] Multiple NPBs at BOLE-IGW raise tsg_9140_packet_io_rxdrop packet-drop alerts
| ID | Creation Date | Assignee | Status |
|---|---|---|---|
| OMPUB-1290 | 2024-05-15T16:33:59.000+0800 | 杨威 | In Progress |
2024.05.13 09:00--2024.05.14 09:00: BOL-IGW-T9K002-NPB02, 11 times
2024.05.14 09:00--2024.05.15 09:00: BOL-IGW-T9K001-NPB03, 1 time; BOL-IGW-T9K002-NPB02, 16 times; BOL-IGW-T9K002-NPB03, 1 time; BOL-IGW-T9K002-NPB04, 1 time
yangwei commented on 2024-05-16T09:55:57.372+0800:
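The monitoring data returned from the site shows the following two cases:
- 5.13~5.14: BOL-IGW-T9K002-NPB02 drops packets continuously
- Drops start at 10:00 and last until 22:00~23:00, and the drop volume grows with traffic
- Overload Protection was triggered at 20:00~21:00, at the start of the traffic peak
** This indicates that no single-core CPU usage overrun occurred during the daytime drops
Conclusion: the cause of the continuous packet drops on Bole-IGW NPB02 requires further observation of the daytime drops. Based on the on-site situation, determine whether they are caused by traffic being distributed to a single core or by high traffic-processing latency.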
!image-2024-05-16-09-43-59-546.png|width=278,height=350!!image-2024-05-16-09-43-25-976.png|width=307,height=356!
!image-2024-05-16-09-45-17-408.png|width=287,height=300!!image-2024-05-16-09-45-36-112.png|width=302,height=327!
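- 5.14: around 18:20, the relevant devices at the Bole-IGW site all showed packet drops of 10K~15K pps
- The drop volume and the time of occurrence are fairly consistent across devices, and monitoring shows no spike in new TCP sessions
Conclusion: abnormal UDP traffic is suspected. We plan to add UDP session metrics to the 2307 monitoring dashboard to confirm whether abnormal traffic was present at that time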
!image-2024-05-16-09-40-51-957.png|width=296,height=345!!image-2024-05-16-09-40-25-601.png|width=271,height=352!!image-2024-05-16-09-41-26-969.png|width=280,height=351!
liuxueli commented on 2024-05-16T21:37:21.940+0800:
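- Beijing time 2024/5/16 21:30:00: BOL-IGW-T9K002-NPB02 was dropping packets continuously. Logging in to the device showed that the thread dropping packets had a high rate of UDP flow creation/deletion (20000~35000 per second). Adjusted the SAPP parameters to limit the UDP creation and eviction rates to 5000 per second each; continuing to observe.
** [^20240516213221.BOL-IGW-T9K002-NPB02.sysinfo.log.txt]
** !20240516213221.BOL-IGW-T9K002-NPB02.png!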
liuxueli commented on 2024-05-17T10:37:36.454+0800:
- Beijing time 2024/5/16 22:00:00: captured packets on BOL-IGW-T9K002-NPB02. Analysis of the capture found DNS packets with a fixed three-tuple but an incrementing source port (the packets are well-formed DNS); a DNS flood is suspected.
** Packets:
*** NAS: E21_pcap/Bole-IGW02-NPB02
** ip.addr==196.188.52.10 && ip.addr==208.87.242.217 && udp.port==53
*** [^dns.query.196.188.52.10-208.87.242.217.53.pcap]
** ip.addr==108.181.2.147 && ip.addr==213.55.125.42 && udp.port==53
*** [^dns.query.213.55.125.42-108.181.2.147.53.pcap]
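The pattern in this comment (same source IP, destination IP and destination port, with the source port climbing from packet to packet) can be checked mechanically once the packet headers are exported from the capture. The following is a minimal sketch under stated assumptions, not the tooling used on site: the function name `find_incrementing_port_flows`, the `min_run` threshold, and the input format (a list of `(src_ip, src_port, dst_ip, dst_port)` tuples in capture order) are all hypothetical, "incrementing" is interpreted as strictly increasing, and the sample addresses are documentation-range placeholders rather than data from the pcap files.

```python
from collections import defaultdict


def find_incrementing_port_flows(packets, min_run=5):
    """Return {(src_ip, dst_ip, dst_port): longest_run} for three-tuples whose
    source port strictly increases for at least `min_run` consecutive packets.

    `packets` is an iterable of (src_ip, src_port, dst_ip, dst_port) tuples in
    capture order (hypothetical input format).
    """
    last_port = {}            # last source port seen per three-tuple
    run = defaultdict(int)    # length of the current increasing run
    best = defaultdict(int)   # longest increasing run seen so far

    for src_ip, src_port, dst_ip, dst_port in packets:
        key = (src_ip, dst_ip, dst_port)
        prev = last_port.get(key)
        if prev is not None and src_port > prev:
            run[key] += 1
        else:
            run[key] = 1      # first packet, or the sequence was broken
        last_port[key] = src_port
        best[key] = max(best[key], run[key])

    return {key: length for key, length in best.items() if length >= min_run}


# Synthetic example: one client sweeping source ports, one ordinary client.
sample = [("192.0.2.10", 40000 + i, "198.51.100.53", 53) for i in range(10)]
sample += [("192.0.2.20", 5353, "198.51.100.53", 53),
           ("192.0.2.20", 1200, "198.51.100.53", 53)]
print(find_incrementing_port_flows(sample))
# {('192.0.2.10', '198.51.100.53', 53): 10}
```

A long increasing run on one three-tuple is only a hint, since a busy resolver can also allocate ports sequentially; it is meant to narrow down which flows to inspect in the capture.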
liuxueli commented on 2024-05-17T14:52:24.760+0800:
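- Beijing time 2024/5/17 01:15:00~01:25:00: BOL-IGW-T9K002-NPB02 had rxdrop packet drops, peaking at about 9 Kpps. Further adjusted the SAPP parameters to limit the UDP creation and eviction rates to 3000 per second each; continuing to observe.
** max_opening_per_sec=3000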
** max_timeouts_per_sec=3000
** !image-2024-05-17-14-52-13-953.png|width=1297,height=682!
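To illustrate what a per-second cap such as `max_opening_per_sec=3000` does to a burst of new UDP flows, here is a minimal sketch of a fixed-window limiter. It is an assumption for illustration only: the ticket does not describe how SAPP enforces these limits, and the class name `PerSecondLimiter` and its `allow` method are hypothetical.

```python
class PerSecondLimiter:
    """Admit at most `max_per_sec` events in each one-second window."""

    def __init__(self, max_per_sec):
        self.max_per_sec = max_per_sec
        self.window = None   # the whole second currently being counted
        self.count = 0       # events admitted in that second

    def allow(self, now):
        """Return True if an event at time `now` (seconds) is admitted."""
        second = int(now)
        if second != self.window:
            # A new one-second window starts: reset the counter.
            self.window = second
            self.count = 0
        if self.count < self.max_per_sec:
            self.count += 1
            return True
        return False


# A burst of 35000 new-flow attempts within the same second, capped at 3000.
opening = PerSecondLimiter(max_per_sec=3000)
admitted = sum(opening.allow(0.5) for _ in range(35000))
print(admitted)  # 3000
```

With a cap like this, flow creations beyond the limit in any given second are refused rather than processed, which bounds the per-second work a flood of short-lived UDP flows can impose on one thread.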
yangwei commented on 2024-05-17T15:19:07.001+0800:
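- Looked up the two server IPs in the packets returned from the site, 208.87.242.217 and 108.181.2.147: both belong to ASN “AS40676 Psychz Networks”, a company that provides CDN and DDoS mitigation services
- Sent DNS queries to these servers; no response was received
The corresponding DNS traffic is presumed to be DoS attack traffic targeting this provider, or DoS attack traffic diverted to it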
Attachments
Attachment: 1715775533880.jpg
Attachment: 20240516213221.BOL-IGW-T9K002-NPB02.png
Attachment: 20240516213221.BOL-IGW-T9K002-NPB02.sysinfo.log.txt
Attachment: dns.query.196.188.52.10-208.87.242.217.53.pcap
Attachment: dns.query.213.55.125.42-108.181.2.147.53.pcap
Attachment: image-2024-05-16-09-38-50-887.png
Attachment: image-2024-05-16-09-40-25-601.png
Attachment: image-2024-05-16-09-40-51-957.png
Attachment: image-2024-05-16-09-41-26-969.png
Attachment: image-2024-05-16-09-43-25-976.png
Attachment: image-2024-05-16-09-43-59-546.png
Attachment: image-2024-05-16-09-45-17-408.png
Attachment: image-2024-05-16-09-45-36-112.png
Attachment: image-2024-05-17-14-52-13-953.png










