[WMS-UTR] msh01: marsio segmentation fault causes pod restart
| ID | Creation Date | Assignee | Status |
|---|---|---|---|
| OMPUB-1278 | 2024-05-09T16:28:00.000+0800 | 宋延超 | Resolved |
Time: 2024-05-08T14:15:23
Device: MSH TSGX01
Version: mrzcpd v4.6.71
On-site screenshots:
!anydesk00001.png|width=537,height=302!
!anydesk00000.png|width=508,height=286!
songyanchao commented on 2024-05-10T10:41:54.428+0800:
The coredump shows that the crash occurred in the CQE processing path inside 'mlx5_rx_burst_vec'. A Google search turned up a similar report, but in that case the crash was triggered with the VF feature enabled and the PF MTU larger than the VF MTU. On this device, 'rxq_cqe_comp_en' is enabled. This issue needs further follow-up.
https://www.mail-archive.com/users@dpdk.org/msg07151.html
https://bugs.dpdk.org/show_bug.cgi?id=334
!image-2024-05-10-10-44-13-537.png|thumbnail! !image-2024-05-10-10-43-27-154.png|thumbnail! !image-2024-05-10-10-43-37-359.png|thumbnail!
luqiuwen commented on 2024-05-10T15:55:22.750+0800:
Suggest adjusting the mlx5 firmware parameters: change the CQE_COMPRESS level from BALANCED to AGGRESSIVE.
songyanchao commented on 2024-05-17T10:17:40.950+0800:
The CQE_COMPRESS level on devices msh-tsgx01 and pcap-tsgx06 has been changed from BALANCED to AGGRESSIVE.
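For reference, a minimal sketch of how this firmware change is typically applied with mlxconfig. The ticket does not record the exact commands used, so everything here is an assumption: that the mlxconfig parameter is named CQE_COMPRESSION, that BALANCED and AGGRESSIVE map to 0 and 1, and that the PCI address and the helper function names are purely hypothetical. The script only builds and prints the commands; it does not run them.

```python
import shlex
from typing import List

# Assumption: mlxconfig exposes the setting as CQE_COMPRESSION,
# with 0 = BALANCED and 1 = AGGRESSIVE.
CQE_COMPRESSION_LEVELS = {"BALANCED": 0, "AGGRESSIVE": 1}


def build_query_cmd(device: str) -> List[str]:
    """Return the argv that queries the current firmware configuration."""
    return ["mlxconfig", "-d", device, "query"]


def build_set_cqe_compression_cmd(device: str, level: str) -> List[str]:
    """Return the argv that sets the CQE compression level on one device."""
    value = CQE_COMPRESSION_LEVELS[level.upper()]  # KeyError on unknown level
    return ["mlxconfig", "-d", device, "set", f"CQE_COMPRESSION={value}"]


if __name__ == "__main__":
    # Hypothetical PCI address; replace with the real NIC on the target host.
    device = "0000:3b:00.0"
    print(shlex.join(build_query_cmd(device)))
    print(shlex.join(build_set_cqe_compression_cmd(device, "AGGRESSIVE")))
```

Firmware configuration changes made this way normally take effect only after a firmware reset or a host reboot, so it is worth running the query again afterwards to confirm the new level.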
Attachments
57081/anydesk00000.png
57080/anydesk00001.png
57139/image-2024-05-10-10-42-26-990.png
57137/image-2024-05-10-10-43-27-154.png
57136/image-2024-05-10-10-43-37-359.png
57141/image-2024-05-10-10-44-13-537.png