Interpreting vmstat's wait stats on linux 2.6
Posted by dkg on Fri 9 Feb 2007 at 17:19
I'm trying to figure out if the linux 2.6 kernel's wait cycles include cycles waiting on network I/O or specifically just waiting on disk access.

If network I/O is included, it would change my interpretation of a system with spiking wait percentages. Maybe the disk controllers aren't saturated; instead, it could be that processes are connecting to remote hosts which are not responding, or are delaying their responses.

Does anyone know? What would be the best way to go about finding the answer to this? I can imagine a handful of different meanings. The wait figure could be the time the CPU is idle while at least one process is:

  • waiting on any I/O, regardless of subsystem (including stdin/stdout for interactive processes?)
  • waiting on I/O that is eventually served by the disks (i.e. no network)
  • waiting on I/O from the filesystem (this might be different from the above: more in the case of NFS, tmpfs, sshfs, etc, or less in the case of swap or other non-filesystem disk use)
I'm sure there are other possible meanings too. Some AIX vmstat notes seem to imply it is most like the third option above, but AIX and linux are different enough (and this question is probably kernel-specific enough) that i want to be sure about my particular O/S.

Background for folks who aren't clear what i'm asking about: vmstat (and other tools) report CPU time broken out into us, sy, id, and wa. The wa (wait) percentage is described in man vmstat this way:

wa: Time spent waiting for IO. Prior to Linux 2.5.41, included in idle.
Unfortunately, this doesn't go into enough detail for me, hence this post...
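
For anyone who wants to look at the raw numbers: on Linux, vmstat takes these figures from the aggregate cpu line of /proc/stat, where the fifth counter is iowait. Below is a minimal Python sketch, assuming a 2.6-or-later kernel. The function names (parse_cpu_line, percentages, sample) are made up for illustration. It only reproduces the number; it says nothing about which kinds of I/O the kernel counts toward it.

```python
import time

# Order of the counters on the aggregate "cpu" line of /proc/stat.
# Older 2.6 kernels expose fewer columns; zip() below simply stops early.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat text as a dict."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def percentages(before, after):
    """Turn two counter snapshots into per-category percentages."""
    deltas = {name: after[name] - before[name] for name in after}
    total = sum(deltas.values())
    if total <= 0:
        return {name: 0.0 for name in deltas}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def sample(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, 'interval' seconds apart, and compare."""
    with open(path) as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.read())
    return percentages(before, after)
```

On a Linux host, print(sample(5.0)["iowait"]) should roughly track the wa column of "vmstat 5", though the two may differ slightly in rounding and in how the remaining columns are grouped.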

 

Comments on this Entry

Re: Interpreting vmstat's wait stats on linux 2.6
Posted by Anonymous (66.190.xx.xx) on Wed 14 Feb 2007 at 05:26
Start top, hit 'f' (add field), 'y' (WCHAN: "sleeping in function")

I'm not sure exactly what the names refer to these days; when I first discovered this, it was always what *kernel* function the process was blocking in.

This is per-process, and I'm not exactly sure how it reflects into the cpu stats, because a process that is "waiting" is not eligible to be active on the cpu.
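
The same per-process information can be read straight from /proc instead of through top: the state letter is in /proc/<pid>/stat and the kernel function name is in /proc/<pid>/wchan. Here is a rough Python sketch that lists processes in state D (uninterruptible sleep) along with their wchan. The function names are invented for this example, and treating state D as "the processes behind the wa number" is only an assumption, not something established here.

```python
import os


def parse_stat(stat_text):
    """Split the start of /proc/<pid>/stat into (pid, comm, state).

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the last ')' rather than on spaces.
    """
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    comm = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, comm, state


def read_wchan(proc_dir):
    """Return the contents of the wchan file, or '?' if it is unreadable."""
    try:
        with open(os.path.join(proc_dir, "wchan")) as f:
            return f.read().strip() or "-"
    except OSError:
        return "?"


def blocked_processes(proc_root="/proc"):
    """Return (pid, comm, wchan) for every process currently in state D."""
    results = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        proc_dir = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(proc_dir, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited, or the file was unreadable
        if state == "D":
            results.append((pid, comm, read_wchan(proc_dir)))
    return sorted(results)
```

Running blocked_processes() in a loop while the wait percentage is spiking would at least show which processes are stuck and in which kernel function, which might narrow down whether disk or something else is involved.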

Hopefully, this is helpful. If you do figure out what it means, please post the info.


Re: Interpreting vmstat's wait stats on linux 2.6
Posted by dkg (216.254.xx.xx) on Wed 14 Feb 2007 at 06:43
When i do this, the processes which appear to be hogging the machine during heavy I/O seem to show up as though they're either in "wait" or "rest_init", which isn't very helpful to me (i'm not a kernel hacker). What should i make of this info?

Also, i've only run this just now, and now isn't the time of heaviest load on the machine, so i might be missing something. Any other ideas?
