Silent data corruption Linux Kernel

Beware: Silent data corruption discovered in Linux kernels 4.10-4.17

Vasil Kolev

While debugging an issue in a new StorPool deployment, we rediscovered a kernel bug that exists in some of the latest Linux kernels and causes silent data corruption. The setup was a virtualized environment where the kernel containing the bug was running on the host. The guests were KVM virtual machines with io=threads (the default in standard libvirt/qemu packages) and …

The weird interactions of cgroups and Linux page cache in hypervisor environments

Vasil Kolev

Here’s one case we’ve spent some time debugging, related to StorPool, VMs, cgroups, the OOM (out-of-memory) killer and caching. Some of it should be useful to a lot of sysadmins out there. The issue started with a customer complaining:

> It’s happening again, on this hypervisor VMs are being killed
> by the OOM killer. This doesn’t happen on hypervisors where …

Serial Console Pros and Cons

Why does StorPool recommend against serial console on production installations?

Vasil Kolev

TL;DR: try ‘echo t > /proc/sysrq-trigger’ and see what happens. The serial console and the frame buffer are two separate ways for the kernel to display text, either to a remote machine or on the local display. Both suffer from a very similar problem: they block the CPU they run on for long periods of time, which in …