An Introduction to CPU Steal, I/O wait and the top command

3 min read

CPU steal can be highly problematic. Fortunately it’s not common place on the providers we integrate with, but we do see this from time to time, and you may have seen us mention “noisy neighbours” in the Facebook group or directly on support. In this article we’ll give you a brief overview of what CPU steal is, and also give you a quick intro to the top command.

What is CPU Steal?

In a nutshell, CPU steal is CPU time that is stolen from a VPS by the hypervisor (AKA the virtual machine monitor / VMM).

Your VPS shares resources with other instances that also exist same server and CPU cycles are one of those shared resources. Unlike RAM, which has a specific, set limitation, CPU is more flexible. CPU cycles are shared across other instances that exist on the same server.

For example, if your VPS was one of 10 equally sized instances on one server, it’s CPU isn’t capped at 10%. You could potentially go above your allocation and steal from other instances on your server, and the same is true for them having the potential to steal from you.

If your VPS displays a high %st in top, this means CPU cycles are being taken away from you by other instances/hypervisor processes (more info in the next section). The more steal you experience, the more your servers performance will suffer.

If you’re experiencing server steal then you need to contact you IaaS provider directly to let them know what’s going on so they can take action accordingly.

If they’re unable (or unwilling) to solve the problem, you may need to move your site/s to a new server to ensure better, more reliable performance.

What is I/O wait

Far less prominent than CPU steal is I/O wait time. This is where your CPU is idle because there are no tasks ready to run, and it’s waiting on I/O. We rarely see it, but it’s good to know.

The following is copied and pasted from the sar manpage:

%iowait: Percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request.

I/O wait is simply idle time where no tasks could be scheduled. 

Running the top command

For this you’ll need to SSH into your server. Please see the following articles to get started:

Generate your SSH Key:

Generate SSH Key on Mac

Generate SSH Key on Windows with Putty

Generate SSH Key on Windows with Windows Subsystem for Linux

Generate SSH Key on Windows with Windows CMD/PowerShell

Add your SSH Key to GridPane:

Add default SSH Keys

Add/Remove an SSH Key to/from an Active GridPane Server

Connect to your server:

Connect to a GridPane server by SSH as Root user.

 

Checking for CPU steal with top

The top command will let you know if other VPS’s on your server are stealing your CPU usage, or if your CPU is waiting on IO/drives. Run:

top

And you’ll see a table appear that looks as follows. We’re interested in the parts highlighted in red:

Steal “st”
Here you can see there’s no steal, but sometimes neighbours on your server can potentially steal a lot of your CPU – sometimes over 50% or more.

I/O Wait “wa”
If this is high then the CPU is waiting on the IO/drives. You can exit top and install run iotop to see which process is causing the most I/O utilization on your server.

To exit top hit Ctrl+C.

You can install iotop with:

apt install iotop

Then run:

iotop

iotop will list all of the processes running with the heaviest process at the very top.

To exit top/iotop hit Ctrl+C.