Librato Agent: Default Metrics

This article documents the metrics that the Librato Agent tracks by default.

CPU (8 CPU + 3 load average)

| Metric | Description |
| --- | --- |
| Percent System | The percentage of time the CPU spends processing system (kernel-space) instructions |
| Percent User | The percentage of time the CPU spends processing user-space instructions |
| Percent Idle | The percentage of time the CPU is idle (not processing instructions of any kind, or entering/leaving low-power mode) |
| Percent Wait | The percentage of time the CPU spends waiting on I/O |
| Percent Steal | (Virtualized hardware only) The percentage of time the CPU wanted to work but was not allowed to by the hypervisor |
| Percent Softirq | The percentage of time the CPU spends servicing software interrupts (softirqs), including tasklets |
| Percent Nice | The percentage of time the CPU spends on user-space processes that have a positive nice value |
| Percent Interrupt | The percentage of time the CPU spends processing and context-switching for hardware interrupts |
| Load One | The 1-minute system load average |
| Load Five | The 5-minute system load average |
| Load Fifteen | The 15-minute system load average |
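
To make the percentage metrics concrete, here is a minimal sketch of how such values can be derived from two samples of cumulative per-state tick counters. It assumes a Linux-style source (the first eight fields of the `cpu` line in /proc/stat); the function name `cpu_percentages` and the sample numbers are hypothetical, and this is not necessarily how the agent computes them.

```python
# Order of the first eight fields on the "cpu" line of /proc/stat (Linux),
# renamed to match the metric names used in this article.
STATES = ["user", "nice", "system", "idle", "wait", "interrupt", "softirq", "steal"]


def cpu_percentages(prev, curr):
    """Return the percentage of time spent in each CPU state between two samples.

    prev and curr are dicts mapping a state name to a cumulative tick count.
    """
    deltas = {state: curr[state] - prev[state] for state in STATES}
    total = sum(deltas.values())
    if total <= 0:
        # No ticks elapsed between the samples; report zeros rather than divide by zero.
        return {state: 0.0 for state in STATES}
    return {state: 100.0 * delta / total for state, delta in deltas.items()}


# Two made-up samples taken some interval apart.
prev = {"user": 1000, "nice": 0, "system": 500, "idle": 8000,
        "wait": 100, "interrupt": 10, "softirq": 20, "steal": 0}
curr = {"user": 1300, "nice": 0, "system": 600, "idle": 8500,
        "wait": 150, "interrupt": 20, "softirq": 60, "steal": 0}

pct = cpu_percentages(prev, curr)
print(pct["user"], pct["idle"])  # 30.0 50.0
```

Because every state is expressed as a share of the same elapsed ticks, the eight percentages sum to 100 for any interval in which time advanced.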

Memory (11)

| Metric | Description |
| --- | --- |
| Used | The amount of memory in use by the kernel and memory-resident processes |
| Free | The amount of memory that is not in use |
| Buffered | The amount of memory used by the kernel to buffer writes to disk (for performance reasons, when a process writes to disk, the disk driver actually writes to a memory buffer, which the kernel then periodically flushes) |
| Cached | The amount of memory used for kernel caches. Uses include filesystem caches (to make reads appear faster), virtual machine slabs, and memory-mapped files. |
| Slab Reclaimable | The amount of memory reclaimable from slab space (slabs hold caches; see: slab allocation) |
| Slab non-Reclaimable | The amount of memory in slab space that cannot be reclaimed. Slab reclaimable + slab non-reclaimable == total slab size. |
| Swap Free | The amount of unused swap space |
| Swap Used | The amount of used swap space |
| Swap Cached | The amount of swap space occupied by memory pages that exist both in memory and in swap space. Swap cache is a write optimization for pages that have been swapped out to disk and then read back into memory. In this situation, the kernel consults the swap cache before writing the page back out to swap: if there is already a valid entry for the page (i.e., the memory-resident version is unmodified), the write is unnecessary and can be skipped. |
| Swap I/O In | Rate of bytes read from disk as memory pages are swapped in |
| Swap I/O Out | Rate of bytes written to disk as memory pages are swapped out |
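
The slab identity noted above can be checked against raw kernel data. The following is a minimal sketch, assuming a Linux host where these values come from /proc/meminfo-style text; the field names `Slab`, `SReclaimable`, and `SUnreclaim` are the kernel's, while `parse_meminfo` and the sample figures are hypothetical and not taken from the agent.

```python
# Made-up excerpt in the format of /proc/meminfo.
SAMPLE_MEMINFO = """\
MemTotal:        8000000 kB
MemFree:         2000000 kB
Buffers:          300000 kB
Cached:          1500000 kB
Slab:             400000 kB
SReclaimable:     250000 kB
SUnreclaim:       150000 kB
"""


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> bytes."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if not fields:
            continue
        amount = int(fields[0])
        # Most fields are reported in kB; convert those to bytes.
        if len(fields) > 1 and fields[1] == "kB":
            amount *= 1024
        values[name.strip()] = amount
    return values


mem = parse_meminfo(SAMPLE_MEMINFO)
slab_total = mem["SReclaimable"] + mem["SUnreclaim"]
print(slab_total == mem["Slab"])  # True
```

On a real Linux system, the same function could be given the contents of /proc/meminfo instead of the sample string.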

Network (8 x n_interfaces)

| Metric | Description |
| --- | --- |
| RX Octets | Rate of octets read from the interface |
| RX Packets | Rate of packets read from the interface |
| RX Errors | Rate of read errors recorded on the interface |
| RX Dropped | Rate of packets dropped from the read buffer before they could be processed (indicating more inbound traffic than could be processed) |
| TX Octets | Rate of octets written to the interface |
| TX Packets | Rate of packets written to the interface |
| TX Errors | Rate of write errors recorded on the interface |
| TX Dropped | Rate of packets dropped from the write buffer before they could be processed (indicating more outbound traffic than could be processed) |
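
Interface statistics are usually exposed by the operating system as cumulative counters, so a rate has to be computed from two readings. Here is a minimal sketch of that conversion; the helper name `counter_rate` and the sample readings are hypothetical, and the agent's own handling of counter resets may differ.

```python
def counter_rate(prev_value, curr_value, interval_seconds):
    """Convert two readings of a cumulative counter into a per-second rate.

    Returns None when the counter went backwards (for example after a reboot
    or an interface reset), since no meaningful rate can be computed.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if curr_value < prev_value:
        return None
    return (curr_value - prev_value) / interval_seconds


# Made-up RX octet counter readings taken 60 seconds apart.
rx_octets_rate = counter_rate(1_000_000, 1_600_000, 60)
print(rx_octets_rate)  # 10000.0
```

The same calculation applies to any of the eight per-interface metrics, each of which is tracked separately for every interface.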

Disk

Mountpoints (6 x n_mountpoints)

| Metric | Description |
| --- | --- |
| df complex used | Bytes used on disk |
| df complex free | Bytes free on disk |
| df complex reserved | Bytes reserved for root (Linux filesystems often reserve a small percentage of total disk capacity for the root user, to protect the system from non-root users filling up the filesystem) |
| df inodes used | Inodes used on disk |
| df inodes free | Available inodes on disk |
| df inodes reserved | Inodes reserved for root (see df complex reserved above) |
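
The split between used, free, and reserved space can be illustrated with the POSIX `statvfs` fields. This is a sketch under the assumption that "free" means space available to non-root users and "reserved" is the difference between total free space and that amount; the function name `usage_from_statvfs` and the sample values are hypothetical.

```python
import os
from types import SimpleNamespace


def usage_from_statvfs(st):
    """Derive the six mountpoint metrics from a statvfs-style result."""
    block = st.f_frsize  # fundamental block size in bytes
    return {
        "used": (st.f_blocks - st.f_bfree) * block,
        "free": st.f_bavail * block,                    # available to non-root users
        "reserved": (st.f_bfree - st.f_bavail) * block,  # free, but held back for root
        "inodes_used": st.f_files - st.f_ffree,
        "inodes_free": st.f_favail,
        "inodes_reserved": st.f_ffree - st.f_favail,
    }


# Made-up filesystem: 1000 blocks of 4096 bytes, 400 free, 350 of them
# available to non-root users.
fake = SimpleNamespace(f_frsize=4096, f_blocks=1000, f_bfree=400, f_bavail=350,
                       f_files=500, f_ffree=200, f_favail=200)
usage = usage_from_statvfs(fake)
print(usage["reserved"])  # 204800

# On a Unix-like system the same function accepts a real result.
if hasattr(os, "statvfs"):
    root_usage = usage_from_statvfs(os.statvfs("/"))
```

With this definition, used + free + reserved equals the total size of the filesystem.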

Disk I/O (11 x n_disks)

| Metric | Description |
| --- | --- |
| I/O Time | Amount of time (in milliseconds) a disk spends processing reads or writes |
| Weighted I/O Time | Weighted number of milliseconds spent doing I/O. This counter is incremented at each I/O start, I/O completion, I/O merge, or read of these statistics by the number of I/Os in progress (field 9 of the kernel's disk statistics) multiplied by the number of milliseconds spent doing I/O since the last update. It provides an easy measure of both I/O completion time and any backlog that may be accumulating. |
| Merged Read | The number of read operations that were merged together by the kernel (because they were adjacent) |
| Merged Write | The number of write operations that were merged together by the kernel (because they were adjacent) |
| Octets Read | Rate of octets read from the disk |
| Octets Write | Rate of octets written to the disk |
| Operations Read | Total number of read operations performed by the disk |
| Operations Write | Total number of write operations performed by the disk |
| Operations Pending | Total number of operations waiting to be processed by the disk |
| Time Read | Amount of time (in milliseconds) the disk spent reading |
| Time Write | Amount of time (in milliseconds) the disk spent writing |
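
As an illustration of why the time-based counters are useful, the sketch below derives three familiar figures from two samples: utilization, average queue depth, and average read latency. It assumes the inputs are raw cumulative counters as the kernel exposes them (the agent may already report some of these as rates); the function name `disk_derived`, the dictionary keys, and the sample values are hypothetical.

```python
def disk_derived(prev, curr, interval_ms):
    """Derive utilization, queue depth, and read latency from two counter samples."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    delta = {key: curr[key] - prev[key] for key in prev}
    reads = delta["ops_read"]
    return {
        # Share of the interval during which the disk was busy.
        "utilization_pct": 100.0 * delta["io_time_ms"] / interval_ms,
        # Weighted I/O time divided by wall time gives the average number
        # of I/Os in flight over the interval.
        "avg_queue_depth": delta["weighted_io_time_ms"] / interval_ms,
        # Average time per completed read; 0.0 when no reads completed.
        "avg_read_ms": delta["time_read_ms"] / reads if reads else 0.0,
    }


# Two made-up samples taken 10 seconds (10,000 ms) apart.
prev = {"io_time_ms": 10_000, "weighted_io_time_ms": 40_000,
        "ops_read": 5_000, "time_read_ms": 20_000}
curr = {"io_time_ms": 13_000, "weighted_io_time_ms": 55_000,
        "ops_read": 5_600, "time_read_ms": 23_000}

stats = disk_derived(prev, curr, interval_ms=10_000)
print(stats)
```

In this example the disk was busy for 30% of the interval, averaged 1.5 I/Os in flight, and completed each read in about 5 ms.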