Beware: Silent data corruption discovered in Linux kernels 4.10-4.17

Vasil Kolev Blog

While debugging an issue in a new StorPool deployment, we rediscovered a kernel bug that exists in some of the latest Linux kernels and causes silent data corruption. The setup is a virtualized environment in which the kernel containing the bug was running on the host. The guests were KVM virtual machines with io=threads (the default in standard libvirt/qemu packages) and …

The weird interactions of cgroups and Linux page cache in hypervisor environments

Vasil Kolev Blog

Here’s one case we’ve spent some time debugging, related to StorPool, VMs, cgroups, the OOM (out-of-memory) killer and caching. Some of it should be useful to a lot of sysadmins out there. The issue started with a customer complaining:

> It’s happening again, on this hypervisor VMs are being killed
> by the OOM killer. This doesn’t happen on hypervisors where …