I won’t go into a lot of detail about sar, since it has been documented elsewhere many times. Basically, SAR stands for System Activity Report and, as its name suggests, the sar command is used to collect, report, and save CPU, memory, and I/O usage on Unix-like operating systems. The sar command produces reports on the fly and can also save them to log files. The sar man page states:
The sar command writes to standard output the contents of selected cumulative activity counters in the operating system. The accounting system, based on the values in the count and interval parameters, writes information the specified number of times spaced at the specified intervals in seconds. If the interval parameter is set to zero, the sar command displays the average statistics for the time since the system was started. If the interval parameter is specified without the count parameter, then reports are generated continuously.
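To make the interval and count parameters from the man page concrete, here is a minimal sketch that assembles a sar command line. The helper name build_sar_command is hypothetical (it is not part of sar or sysstat), and the sketch only builds the argument list rather than running anything; -u is the CPU utilization flag.

```python
def build_sar_command(flag, interval=None, count=None):
    """Build the argument list for a sar call (hypothetical helper).

    interval only      -> reports are generated continuously
    interval and count -> `count` reports, `interval` seconds apart
    interval of 0      -> averages since the system was started
    """
    if interval is None and count is not None:
        raise ValueError("count only makes sense together with an interval")
    if interval is not None and interval < 0:
        raise ValueError("interval must be zero or positive")

    argv = ["sar", flag]
    if interval is not None:
        argv.append(str(interval))
        if count is not None:
            argv.append(str(count))
    return argv


# Five CPU utilization reports, two seconds apart.
print(" ".join(build_sar_command("-u", 2, 5)))  # sar -u 2 5
```

The resulting list could be handed to subprocess.run on a machine that has sar installed; that step is left out here so the sketch stays runnable anywhere.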
Depending on the flag used, a sar report is populated with fields that look like this (the first two lines are CPU utilization from sar -u, the last two are queue length and load averages from sar -q):
12:00:01 AM CPU %user %nice %system %iowait %steal %idle
12:00:01 AM all 30.52 0.04 1.57 0.28 7.84 59.75
12:00:01 AM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15 blocked
12:00:01 AM 14 293 0.41 0.49 0.56 0
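If you want to work with these numbers in a script, the sample report above can be turned into structured data. This is only a rough sketch: parse_sar_block is a hypothetical helper, and it assumes the 12-hour timestamp format shown above (time plus AM/PM, so two tokens) and one header line followed by its data rows.

```python
SAMPLE = """\
12:00:01 AM CPU %user %nice %system %iowait %steal %idle
12:00:01 AM all 30.52 0.04 1.57 0.28 7.84 59.75
"""


def parse_sar_block(text):
    """Parse one header line plus its data rows into a list of dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 3:
            continue  # skip blank or truncated lines
        # Assumption: the timestamp takes two tokens ("12:00:01" and "AM").
        fields = tokens[2:]
        if header is None:
            header = fields  # the first usable line holds the column names
            continue
        row = {"time": " ".join(tokens[:2])}
        for name, value in zip(header, fields):
            try:
                row[name] = float(value)
            except ValueError:
                row[name] = value  # non-numeric values such as "all"
        rows.append(row)
    return rows


rows = parse_sar_block(SAMPLE)
print(rows[0]["%idle"])  # 59.75
```

Systems that print 24-hour timestamps would need the timestamp handling adjusted, since the time would then be a single token.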
Again depending on the flag, the report contains the following metrics, defined as:
- %user: The percentage of CPU utilization that occurred while executing at the user level (this is the application usage).
- %nice: The percentage of CPU utilization that occurred while executing at the user level with “nice” priority.
- %system: The percentage of CPU utilization that occurred while executing at the system level (kernel).
- %iowait: The percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request.
- %steal: The percentage of time spent in involuntary wait by the virtual CPU or CPUs while the hypervisor was servicing another virtual processor.
- %idle: The percentage of time that the CPU or CPUs were idle and the system did not have an outstanding disk I/O request.
- runq-sz: The number of kernel threads in memory that are waiting for a CPU to run. Typically, this value should be less than 2. Consistently higher values mean that the system might be CPU-bound.
- plist-sz: The number of tasks in the task list.
- rrqm/s: The number of read requests merged per second that were queued to the device.
- wrqm/s: The number of write requests merged per second that were queued to the device.
- r/s: The number of read requests that were issued to the device per second.
- w/s: The number of write requests that were issued to the device per second.
- rMB/s: The number of megabytes read from the device per second.
- wMB/s: The number of megabytes written to the device per second.
- avgrq-sz: The average size (in sectors) of the requests that were issued to the device.
- avgqu-sz: The average queue length of the requests that were issued to the device.
- await: The average time (milliseconds) for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
- svctm: The average service time (milliseconds) for I/O requests that were issued to the device.
- %util: Percentage of CPU time during which I/O requests were issued to the device (bandwidth utilization for the device). Device saturation occurs when this value is close to 100%.
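A few of the definitions above lend themselves to quick sanity checks. The sketch below uses the CPU line from the sample report earlier in this post; the function names are hypothetical, the await and svctm figures in the usage lines are made-up values for illustration, and the threshold of 2 is the rule of thumb quoted for runq-sz.

```python
CPU_FIELDS = ("%user", "%nice", "%system", "%iowait", "%steal", "%idle")

# CPU line taken from the sample report earlier in the post.
cpu = {"%user": 30.52, "%nice": 0.04, "%system": 1.57,
       "%iowait": 0.28, "%steal": 7.84, "%idle": 59.75}


def total_cpu_percent(row):
    """The six CPU states together should account for about 100% of the time."""
    return sum(row[name] for name in CPU_FIELDS)


def queue_wait_ms(await_ms, svctm_ms):
    """await covers queue time plus service time, so the gap is time spent queued."""
    return await_ms - svctm_ms


def looks_cpu_bound(runq_samples, threshold=2):
    """True when every runq-sz sample is at or above the rule-of-thumb threshold."""
    return len(runq_samples) > 0 and all(s >= threshold for s in runq_samples)


print(round(total_cpu_percent(cpu), 2))  # 100.0
print(queue_wait_ms(8.5, 2.0))           # 6.5 (made-up await and svctm values)
print(looks_cpu_bound([14, 12, 9]))      # True
```

Requiring every sample to exceed the threshold is one simple way to read "consistently higher"; a real check might average over a longer window instead.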
I had a little difficulty locating the meanings of these values, so I figured I’d drop them in here!
I also found valuable info about sar flags on https://linux.die.net/man/1/sar