# 【XJ-TEST】TSG-OS at the trial site drops packets when processing more than 25 Gbps of traffic

| ID | Creation Date | Assignee | Status |
|----|----------------|----------|--------|
| OMPUB-1039 | 2023-10-19T18:07:53.000+0800 | 刘洋 | Closed |

---
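
TSG-OS is running version v23.07.19-9dc3a7e. When nighttime traffic reaches 25 Gbps, SAPP drops packets (CPU usage around 50%, memory around 30%). The traffic of all 4 devices has now been steered onto a single device (3.17), about 60 Gbps in total. In that setup SAPP drops half of the traffic, CPU hard-interrupt load is high, and CPU usage does not climb (it stays around 30% at both 25 Gbps and 60 Gbps).
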
**luqiuwen** commented on *2023-10-19T18:19:13.697+0800*:
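
Investigation shows that when sapp drops packets, its packet-processing thread is waiting on a lock and is not receiving packets. The receive queue fills up, which causes the drops.
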
{code:java}
658.234 ( 1.660 ms): futex(uaddr: 0x7f1ac4cfcd88, op: WAIT_BITSET|PRIVATE_FLAG|CLOCK_REALTIME, val3: MATCH_ANY) = 0
syscall (/usr/lib64/libc-2.28.so)
[0xb62227] (/opt/tsg/sapp/plug/business/tsg_vulpes/libonnxruntime.so.1.10.0)
[0xb616d0] (/opt/tsg/sapp/plug/business/tsg_vulpes/libonnxruntime.so.1.10.0)
[0xb617be] (/opt/tsg/sapp/plug/business/tsg_vulpes/libonnxruntime.so.1.10.0)
[0x8206ac] (/opt/tsg/sapp/plug/business/tsg_vulpes/libonnxruntime.so.1.10.0)
[0x8a82f1] (/opt/tsg/sapp/plug/business/tsg_vulpes/libonnxruntime.so.1.10.0)
[0x82cd71] (/opt/tsg/sapp/plug/business/tsg_vulpes/libonnxruntime.so.1.10.0)
[0x1857a1] (/opt/tsg/sapp/plug/business/tsg_vulpes/libonnxruntime.so.1.10.0)
[0x1895d0] (/opt/tsg/sapp/plug/business/tsg_vulpes/libonnxruntime.so.1.10.0)
auto_label_call_ML_c (/opt/tsg/sapp/plug/business/tsg_vulpes/tsg_vulpes.so)
traffic_process (/opt/tsg/sapp/plug/business/tsg_vulpes/tsg_vulpes.so)
plugin_call_streamentry (/opt/tsg/sapp/sapp)
call_streamentry (/opt/tsg/sapp/sapp)
stream_process (/opt/tsg/sapp/sapp)
stream_process_udp (/opt/tsg/sapp/sapp)
udp_free_stream (/opt/tsg/sapp/sapp)
streamaddlist (/opt/tsg/sapp/sapp)
[0x43948] (/opt/tsg/sapp/sapp)
dealipv4udppkt (/opt/tsg/sapp/sapp)
ipv4_entry (/opt/tsg/sapp/sapp)
eth_entry (/opt/tsg/sapp/sapp)
[0x2e9e1] (/opt/tsg/sapp/sapp)
[0x107c66] (/opt/tsg/sapp/sapp)
[0x108051] (/opt/tsg/sapp/sapp)
start_thread (/usr/lib64/libpthread-2.28.so)
__GI___clone (inlined)
{code}
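
A trace in this format can be attributed to its owning shared objects mechanically. The following is a minimal Python sketch, not part of the original investigation, that extracts the futex wait duration and the innermost resolved caller outside the C runtime. The helper names (`parse_frames`, `wait_ms`, `first_named_caller`) are hypothetical, and the sample is a shortened excerpt of the trace above.

```python
import os
import re

# Matches frame lines such as "syscall (/usr/lib64/libc-2.28.so)" or
# "[0xb62227] (/opt/.../libonnxruntime.so.1.10.0)".
FRAME_RE = re.compile(r"^(\S+)\s+\((/[^)]+)\)$")
# Matches the syscall duration, e.g. "( 1.660 ms)".
DURATION_RE = re.compile(r"\(\s*([\d.]+)\s*ms\)")


def parse_frames(trace_text):
    """Return (symbol, object_path) pairs, innermost frame first."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((match.group(1), match.group(2)))
    return frames


def wait_ms(trace_text):
    """Return the first syscall duration in milliseconds, or None."""
    match = DURATION_RE.search(trace_text)
    return float(match.group(1)) if match else None


def first_named_caller(frames, runtime_prefixes=("libc-", "libpthread-")):
    """Return the innermost frame with a resolved symbol outside the C runtime."""
    for symbol, obj in frames:
        if symbol.startswith("["):  # unresolved address, no symbol name
            continue
        if os.path.basename(obj).startswith(runtime_prefixes):
            continue
        return symbol, obj
    return None


# Shortened excerpt of the trace from this issue.
SAMPLE = """\
658.234 ( 1.660 ms): futex(uaddr: 0x7f1ac4cfcd88, op: WAIT_BITSET|PRIVATE_FLAG|CLOCK_REALTIME, val3: MATCH_ANY) = 0
syscall (/usr/lib64/libc-2.28.so)
[0xb62227] (/opt/tsg/sapp/plug/business/tsg_vulpes/libonnxruntime.so.1.10.0)
[0x1895d0] (/opt/tsg/sapp/plug/business/tsg_vulpes/libonnxruntime.so.1.10.0)
auto_label_call_ML_c (/opt/tsg/sapp/plug/business/tsg_vulpes/tsg_vulpes.so)
traffic_process (/opt/tsg/sapp/plug/business/tsg_vulpes/tsg_vulpes.so)
plugin_call_streamentry (/opt/tsg/sapp/sapp)
"""

frames = parse_frames(SAMPLE)
print(wait_ms(SAMPLE))
print(first_named_caller(frames))
```

On this excerpt the innermost named caller is `auto_label_call_ML_c` in `tsg_vulpes.so`, with the unresolved frames between it and the futex call all belonging to `libonnxruntime.so.1.10.0`.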

This lock is used by tsg_vulpes. After the feature was disabled via tsg-os-cli, packet loss stopped and the system ran normally.

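
As a rough illustration of why a blocked processing thread turns into packet loss, the sketch below estimates how many packets overflow a bounded receive queue during a single stall. Only the 1.660 ms stall comes from the trace above. The per-thread arrival rate and the queue depth are assumptions chosen for illustration, not measured values from this deployment, and `drops_during_stall` is a hypothetical helper.

```python
def drops_during_stall(arrival_pps, stall_seconds, queue_capacity, already_queued=0):
    """Estimate packets dropped while the consumer thread is blocked.

    Assumes a constant arrival rate and that nothing is dequeued during the stall.
    """
    arrived = round(arrival_pps * stall_seconds)
    free_slots = max(queue_capacity - already_queued, 0)
    return max(arrived - free_slots, 0)


STALL_SECONDS = 1.660e-3    # futex wait duration taken from the trace
ASSUMED_PPS = 1_000_000     # hypothetical per-thread arrival rate
ASSUMED_QUEUE_SLOTS = 1024  # hypothetical receive-queue depth

print(drops_during_stall(ASSUMED_PPS, STALL_SECONDS, ASSUMED_QUEUE_SLOTS))
```

Under these assumptions, a single stall already overflows the queue, and at lower arrival rates the same stall is absorbed without loss. That matches the observation that drops appear only above a traffic threshold.
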
---
**yangwei** commented on *2023-10-19T18:35:57.760+0800*:

Temporary workaround: disable the encrypted voice recognition feature by applying the following setting in tsg-os-cli:

`set template name tsg_traffic_engine_default encrypt_traffic_identify voice_bahavior_engine no`

---
**xiapeng** commented on *2023-10-20T12:20:30.973+0800*:

After upgrading TSG-OS to tsg-os-v23.07.22-c92d517 and disabling the encrypted voice recognition feature, no severe packet loss occurred while single-device traffic stayed below 75 Gbps. Above 75 Gbps, packet loss began, and it was highest when traffic peaked at 98 Gbps. While traffic was above 75 Gbps, memory usage stayed below 35% and CPU usage fluctuated frequently between 70% and 95%.

!image-2023-10-20-12-52-09-665.png|width=394,height=141!

!image-2023-10-20-12-52-33-703.png|width=394,height=139!

!image-2023-10-20-12-53-12-337.png|width=157,height=178!

---
**yangwei** commented on *2023-10-20T12:36:33.918+0800*:

Please post the on-site monitoring graphs. The text description does not show the magnitude of the packet loss or the resource usage.

---
**yangwei** commented on *2023-10-20T12:36:59.957+0800*:

How is the on-site test environment wired up? Is there a topology diagram?

---
**xiapeng** commented on *2023-10-20T13:04:53.878+0800*:
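
Design topology diagram of the test environment: [https://docs.geedge.net/pages/viewpage.action?pageId=94778025]

The actual environment differs from the design as follows:

1. The inline devices, the return-traffic switch, and the RCP switch were removed, along with the links they were on.

2. The direct link between the ATCA general traffic access device and the optical protection device was removed and replaced with the path: optical protection device -> optical amplifier -> ATCA general traffic access device.
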
---
**yangwei** commented on *2023-10-20T13:09:08.065+0800*:

Please upload the complete monitoring for the device as shown on NZ.

---
**yangwei** commented on *2023-10-20T18:31:45.663+0800*:

The cause of the packet loss above 25 Gbps described in this issue has been identified and resolved. Closing for now; please open a separate bug for any other problems.

---
# Attachments

Attachment: 1697705757545.png

Attachment: 1697705784889.png

Attachment: image-2023-10-20-12-52-09-665.png

Attachment: image-2023-10-20-12-52-33-703.png

Attachment: image-2023-10-20-12-53-12-337.png