BugZero | VMware BugID 78155 - Tail input plugin in Fluent-bit is reporting "No s...

VMware - Defect ID: 78155

Tail input plugin in Fluent-bit is reporting "No space left on device"

Last updated on 2/18/2021

Symptoms

You see errors similar to the following from the Tail input plugin in Fluent-bit:

    No space left on device

When you check the Fluent-bit pods by running kubectl logs <fluent-bit-pod> -n pks-system, you see entries similar to:

    [2020/03/04 20:16:17] [error] [in_tail] could not register file into fs_events
    [2020/03/04 20:16:17] [error] [plugins/in_tail/tail_fs.c:219 errno=28] No space left on device
    [2020/03/04 20:16:17] [error] [in_tail] could not register file into fs_events
    [2020/03/04 20:16:17] [error] [plugins/in_tail/tail_fs.c:219 errno=28] No space left on device
    [2020/03/04 20:16:17] [error] [in_tail] could not register file into fs_events

There is enough free space on /var/log inside the pod, and the worker nodes also have enough free space.
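To gauge how often the symptom occurs, the log output can be filtered for the errno=28 lines. The following is a minimal sketch, not part of the vendor article; the function name count_enospc_errors is hypothetical, and the pattern assumes the log format shown in the entries above.

```shell
#!/bin/sh
# Count tail-plugin errors that carry errno=28 (ENOSPC).
# Reads log text on stdin and prints the number of matching lines.
# count_enospc_errors is a hypothetical helper name.
count_enospc_errors() {
    # grep -c prints 0 and exits 1 when nothing matches; "|| true"
    # keeps the function usable in scripts that run under "set -e".
    grep -c 'tail_fs\.c.*errno=28' || true
}
```

It could be used as: kubectl logs <fluent-bit-pod> -n pks-system | count_enospc_errors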

Impact / Risks

If this situation occurs, the underlying log files are not lost or deleted; they remain on disk. However, Fluent-bit no longer monitors them once the limit is reached. The error occurs because the kernel has reached its limit on inotify watches (fs.inotify.max_user_watches) at that time, not because storage space has run out.
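The limit that the workaround raises is fs.inotify.max_user_watches. A rough way to see how many inotify watches are in use on a node is to count the "inotify wd:" lines that Linux exposes under /proc/<pid>/fdinfo. The sketch below is an illustration only: count_inotify_watches is a hypothetical helper, it must run as root on the worker node to see other processes, and it sums watches across all users, whereas the kernel enforces the limit per user.

```shell
#!/bin/sh
# Sum the inotify watches held by all processes visible under a proc root.
# $1: proc root (defaults to /proc); the parameter exists only so the
#     function can be pointed at a test directory.
# count_inotify_watches is a hypothetical helper name.
count_inotify_watches() {
    proc_root="${1:-/proc}"
    # Each watch appears as one line starting with "inotify wd:" in the
    # fdinfo entry of an inotify file descriptor.
    for dir in "$proc_root"/[0-9]*/fdinfo; do
        [ -d "$dir" ] || continue
        cat "$dir"/* 2>/dev/null
    done | grep -c '^inotify wd:' || true
}
```

The result can be compared with the configured limit, which is readable from /proc/sys/fs/inotify/max_user_watches.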

Resolution

This is a known issue affecting VMware Enterprise PKS / VMware Tanzu Kubernetes Grid Integrated Edition. There is currently no resolution.

Workaround

Note: This workaround will not persist across PKS upgrades or node recreation.

As a workaround, you can increase the fs.inotify.max_user_watches sysctl parameter to 16384 to start with and see if this resolves the issue. For more information, see https://github.com/fluent/fluent-bit/issues/1018

To increase the sysctl parameter fs.inotify.max_user_watches on all the worker nodes:

1. Check the current value:
    sysctl -a | grep fs.inotify.max_user_watches
2. Increase the value to 16384:
    sysctl -w fs.inotify.max_user_watches=16384
3. Update the new value to the kernel:
    sysctl -p
4. Check the updated value:
    sysctl -a | grep fs.inotify.max_user_watches

You can also edit the file /etc/sysctl.conf, search for this parameter, overwrite the existing value, and then perform the kernel update by running sysctl -p.
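The /etc/sysctl.conf edit can also be scripted so that it behaves the same way on every worker node. The following is a minimal sketch under stated assumptions, not a vendor-provided tool: set_sysctl_conf_value is a hypothetical helper that replaces any existing assignment of a key in a sysctl-style file, or appends one if none exists, and leaves commented-out lines alone.

```shell
#!/bin/sh
# Set "key = value" in a sysctl-style config file. Existing assignments of
# the key are dropped and a single new assignment is appended.
# $1: file path, $2: key, $3: value
# set_sysctl_conf_value is a hypothetical helper name.
set_sysctl_conf_value() {
    conf_file="$1"; key="$2"; value="$3"
    # Escape dots so the key is matched literally in the regular expression.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    tmp_file=$(mktemp) || return 1
    if [ -f "$conf_file" ]; then
        # Keep every line except active assignments of this key.
        grep -v "^[[:space:]]*${pattern}[[:space:]]*=" "$conf_file" > "$tmp_file" || true
    fi
    printf '%s = %s\n' "$key" "$value" >> "$tmp_file"
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp_file" > "$conf_file" && rm -f "$tmp_file"
}
```

Run as root on a worker node, it could be used as: set_sysctl_conf_value /etc/sysctl.conf fs.inotify.max_user_watches 16384 && sysctl -p. As with the manual steps, the change does not survive PKS upgrades or node recreation.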
