[WMS-UTR Project] TSG-OS commit step waits a long time during hotfix
| ID | Creation Date | Assignee | Status |
|---|---|---|---|
| OMPUB-1239 | 2024-04-18T18:15:12.000+0800 | 付明卫 | Closed |
The device was upgraded by following this document: https://docs.geedge.net/pages/viewpage.action?pageId=129089806&src=contextnavpagetreemode
During the upgrade, the commit step waited for a long time and never completed.
fu_mingwei commented on 2024-04-18T18:49:11.148+0800:
A query in NZ shows that the OS disk usage has reached the Kubernetes (k3s) GC threshold (disk used 85%), as shown in the screenshot below:
!image-2024-04-18-18-48-55-917.png!
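The same check can be done on the device itself. Below is a minimal sketch, assuming the 85% figure quoted above is the threshold of interest and that reaching it (rather than strictly exceeding it) counts as a trigger. The function names are hypothetical, and the percentage is computed as used/total, which can differ slightly from what df -h reports.

```python
import shutil

# Threshold quoted in this issue; the comparison rule (>=) is an assumption.
GC_THRESHOLD_PERCENT = 85.0


def disk_used_percent(path: str = "/") -> float:
    """Return the used share of the filesystem holding `path`, in percent."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def reaches_gc_threshold(used_percent: float,
                         threshold: float = GC_THRESHOLD_PERCENT) -> bool:
    """Return True when disk usage is at or above the GC threshold."""
    return used_percent >= threshold


if __name__ == "__main__":
    percent = disk_used_percent("/")
    print(f"disk used: {percent:.1f}%, "
          f"at or above threshold: {reaches_gc_threshold(percent)}")
```

On an affected device the path of interest would be the filesystem that holds the container images, which may not be /.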
fu_mingwei commented on 2024-04-18T19:41:07.722+0800:
Root cause of the issue:
When disk usage reaches the k3s GC threshold, k3s removes images that are not currently in use. In this case, k3s GC removed the rancher/klipper-helm image, which is the image used when running kubectl install/delete helm-chart.
With the rancher/klipper-helm image removed, after clixon runs {{set service_function name tsg-traffic-engine-vsys-1 enable no}} followed by commit, it performs a helm delete helm-chart operation. That operation needs to fetch the rancher/klipper-helm image; when the image is not available, the helm delete helm-chart operation blocks indefinitely.
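A pre-check for this condition could look for the image before a commit is issued. The sketch below is an illustration only: it assumes the image listing comes from a crictl-style table (for example the output of k3s crictl images), and the sample rows, tags, and IDs are made up rather than taken from the affected device.

```python
def image_present(images_listing: str, repository: str) -> bool:
    """Return True if `repository` appears in the IMAGE column of the listing.

    The first line is treated as the table header and skipped. A registry
    prefix such as docker.io/ in front of the repository is accepted.
    """
    for line in images_listing.splitlines()[1:]:
        fields = line.split()
        if not fields:
            continue
        name = fields[0]
        if name == repository or name.endswith("/" + repository):
            return True
    return False


# Hypothetical listing, for illustration only.
SAMPLE_LISTING = """\
IMAGE                              TAG      IMAGE ID       SIZE
docker.io/rancher/mirrored-pause   3.6      aaaaaaaaaaaa   301kB
docker.io/rancher/klipper-helm     v0.0.0   bbbbbbbbbbbb   90MB
"""

if __name__ == "__main__":
    print(image_present(SAMPLE_LISTING, "rancher/klipper-helm"))
```

If the check fails, the image would have to be restored before the commit, otherwise the helm delete helm-chart step is expected to block as described above.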
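Reference documentation:
Question 1: Why did disk usage grow so quickly in a short time?
Cause 1: Around April 10, the firewall in this environment produced a large volume of maat-related logs. These log files take up roughly 100G of disk space per day, so disk usage kept growing.
Question 2: Log files are limited to 50% of total disk space (see task https://jira.geedge.net/browse/TSG-17006), so why did disk usage exceed 85%?
Cause 2: Log files are stored in the log volume, which resides under /data. When log files are deleted, the space is released inside the log volume (as seen with df -h), but the corresponding space on /data is not released (as seen with df -h). It is only reclaimed when the fstrim timer runs, and the timer period is 1 week. As a result, the log files never reach the capacity limit of the log volume, yet disk usage on /data keeps growing.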
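A rough estimate shows how much space can pile up between trims. This is a simplified model, not a measurement: it assumes a steady state in which about as much log data is deleted per day as is written (roughly 100G, per Cause 1), and that none of the deleted space returns to /data until fstrim runs. The function name is hypothetical.

```python
def worst_case_untrimmed_gb(deleted_per_day_gb: float,
                            trim_interval_days: int) -> float:
    """Upper bound on deleted-but-unreclaimed space just before a trim.

    Assumes deletions are spread evenly and nothing is reclaimed between
    two consecutive fstrim runs.
    """
    if deleted_per_day_gb < 0 or trim_interval_days < 1:
        raise ValueError("expected a non-negative rate and an interval >= 1 day")
    return deleted_per_day_gb * trim_interval_days


if __name__ == "__main__":
    # 100G/day comes from the issue; 7 days is the fstrim timer period.
    print(worst_case_untrimmed_gb(100, 7))
```

Under these assumptions, the trim interval is the only factor that scales the backlog, which is why the timer period matters for /data usage.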
fu_mingwei commented on 2024-04-23T11:38:23.039+0800:
Fix: change the fstrim timer period to 1 day.
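One way to apply such a change on a running system is a systemd drop-in for fstrim.timer. The sketch below only generates the drop-in file and is not necessarily how the tsg-os-buildimage commits implement it; the override.conf file name and the helper names are assumptions. The empty OnCalendar= line clears the schedule inherited from the stock unit before the new one is set.

```python
import tempfile
from pathlib import Path


def render_fstrim_override(on_calendar: str = "daily") -> str:
    """Render a [Timer] drop-in that replaces the inherited OnCalendar value."""
    return "[Timer]\nOnCalendar=\nOnCalendar=" + on_calendar + "\n"


def write_override(root: Path, on_calendar: str = "daily") -> Path:
    """Write the drop-in under <root>/etc/systemd/system/fstrim.timer.d/."""
    dropin_dir = root / "etc" / "systemd" / "system" / "fstrim.timer.d"
    dropin_dir.mkdir(parents=True, exist_ok=True)
    target = dropin_dir / "override.conf"  # hypothetical file name
    target.write_text(render_fstrim_override(on_calendar))
    return target


if __name__ == "__main__":
    # Demonstrate against a scratch directory instead of the real /etc.
    with tempfile.TemporaryDirectory() as scratch:
        path = write_override(Path(scratch))
        print(path.read_text(), end="")
```

On a live system the drop-in takes effect only after systemctl daemon-reload and a restart of fstrim.timer.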
fu_mingwei commented on 2024-04-23T11:40:41.360+0800:
OS fix versions: v24.02, v24.04
gitlab commented on 2024-04-23T11:59:53.800+0800:
[付明卫|https://git.mesalab.cn/fumingwei] mentioned this issue in [a commit|b9556b8b20] of [TSG / tsg-os-buildimage|https://git.mesalab.cn/tsg/tsg-os-buildimage] on branch [update-fstrim-timer-OnCalendar-to-daily|https://git.mesalab.cn/tsg/tsg-os-buildimage/-/tree/update-fstrim-timer-OnCalendar-to-daily]:{quote}bugfix:OMPUB-1239:Set fstrim.timer oncalendar to daily.{quote}
gitlab commented on 2024-04-23T12:01:14.508+0800:
[付明卫|https://git.mesalab.cn/fumingwei] mentioned this issue in [a commit|5651823ac3] of [TSG / tsg-os-buildimage|https://git.mesalab.cn/tsg/tsg-os-buildimage] on branch [update-fstrim-to-v24.02|https://git.mesalab.cn/tsg/tsg-os-buildimage/-/tree/update-fstrim-to-v24.02]:{quote}bugfix:OMPUB-1239:Set fstrim.timer oncalendar to daily.{quote}
gitlab commented on 2024-04-23T12:02:20.854+0800:
[付明卫|https://git.mesalab.cn/fumingwei] mentioned this issue in [a merge request|https://git.mesalab.cn/tsg/tsg-os-buildimage/-/merge_requests/2414] of [TSG / tsg-os-buildimage|https://git.mesalab.cn/tsg/tsg-os-buildimage] on branch [update-fstrim-to-v24.02|https://git.mesalab.cn/tsg/tsg-os-buildimage/-/tree/update-fstrim-to-v24.02]:{quote}bugfix:OMPUB-1239:Set fstrim.timer oncalendar to daily.{quote}
gitlab commented on 2024-04-23T14:23:01.417+0800:
[付明卫|https://git.mesalab.cn/fumingwei] mentioned this issue in [a merge request|https://git.mesalab.cn/tsg/tsg-os-buildimage/-/merge_requests/2415] of [TSG / tsg-os-buildimage|https://git.mesalab.cn/tsg/tsg-os-buildimage] on branch [update-fstrim-timer-OnCalendar-to-daily|https://git.mesalab.cn/tsg/tsg-os-buildimage/-/tree/update-fstrim-timer-OnCalendar-to-daily]:{quote}bugfix:OMPUB-1239:Set fstrim.timer oncalendar to daily.{quote}
gitlab commented on 2024-04-23T14:23:04.794+0800:
[付明卫|https://git.mesalab.cn/fumingwei] mentioned this issue in [a commit|89751448f9] of [TSG / tsg-os-buildimage|https://git.mesalab.cn/tsg/tsg-os-buildimage] on branch [update-fstrim-timer-OnCalendar-to-daily|https://git.mesalab.cn/tsg/tsg-os-buildimage/-/tree/update-fstrim-timer-OnCalendar-to-daily]:{quote}bugfix:OMPUB-1239:Set fstrim.timer oncalendar to daily.{quote}
caoshanfeng commented on 2024-08-29T17:33:47.932+0800:
Updated; no issues observed during testing.
Attachments
55666/image-2024-04-18-18-48-55-917.png
55657/截图2.png
55658/问题截图1.png