This chapter describes tuning information that you can use to improve I/O throughput and latency.
It is useful to place an application on the same node as its I/O resource. For graphics applications, for example, this can improve performance by up to 30 percent.
For example, for an Altix UV system with the following devices:
# gfxtopology
Serial number: UV-00000021
Partition number: 0
8 Blades
248 CPUs
283.70 Gb Memory Total
5 I/O Risers

Blade  Location    NASID  PCI Address   X Server Display  Device
----------------------------------------------------------------------
    0  r001i01b08      0  0000:05:00.0  -                 Matrox Pilot
    4  r001i01b12      8  0001:02:01.0  -                 SGI Scalable
                                                          Graphics Capture
    6  r001i01b14     12  0003:07:00.0  Layout0.0         nVidia Quadro
                                                          FX 5800
                           0003:08:00.0  Layout0.1        nVidia Quadro
                                                          FX 5800
    7  r001i01b15     14  0004:03:00.0  Layout0.2         nVidia Quadro
                                                          FX 5800
To run an OpenGL graphics program, such as glxgears(1), on the third graphics processing unit, a numactl(8) command similar to the following could be used:

% numactl -N 14 -m 14 /usr/bin/glxgears -display :0.2
This example assumes the X server was started with :0 == Layout0.
You could also use the dplace(1) command to place the application; see "dplace Command" in Chapter 5.
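As a rough sketch of the dplace(1) alternative, assuming CPU 112 resides on the node nearest the third graphics device (the CPU number here is hypothetical; verify the actual CPU-to-node mapping on your system first):

```shell
# Hypothetical: bind glxgears to CPU 112, assumed to be on the node
# local to the Layout0.2 display device. The CPU number depends on
# your system's CPU-to-node mapping.
% dplace -c 112 /usr/bin/glxgears -display :0.2
```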
There can be latency spikes in the responses from a RAID unit, and such spikes can in effect slow down all of the RAID units, because one I/O completion waits for all of the striped pieces to complete.
The impact of these latency spikes on throughput may be to stall all the I/O or to delay a few I/Os while others continue; it depends on how the I/O is striped across the devices. If the volumes are constructed as stripes that span all devices, and the I/Os are sized to full stripes, all I/Os stall, since every I/O has to touch every device. If the I/Os can be completed by touching a subset of the devices, then those that do not touch a high-latency device can continue at full speed, while the stalled I/Os complete and catch up later.
In large storage configurations, it is possible to lay out the volumes to maximize the opportunity for the I/Os to proceed in parallel, masking most of the effect of a few instances of high latency.
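The effect described above can be illustrated with a minimal sketch, using hypothetical per-device latencies: a full-stripe I/O completes no faster than the slowest device it touches, while a narrow I/O that touches only fast devices proceeds at full speed.

```shell
# Hypothetical latencies (in ms) for eight devices in a striped volume;
# device 4 is experiencing a transient 200 ms delay.
latencies="5 5 5 200 5 5 5 5"

# A full-stripe I/O touches every device, so its completion time is the
# maximum latency across all of them -- it stalls at 200 ms.
full_stripe=$(echo $latencies | tr ' ' '\n' | sort -n | tail -1)
echo "full-stripe I/O completes in ${full_stripe} ms"

# An I/O touching only devices 1-2 completes at the maximum latency of
# that subset, so it continues at full speed (5 ms).
narrow=$(echo $latencies | cut -d' ' -f1-2 | tr ' ' '\n' | sort -n | tail -1)
echo "narrow I/O completes in ${narrow} ms"
```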
There are at least three classes of events that cause high-latency I/O operations, as follows:
Transient disk delays - one disk pauses
Slow disks
Transient RAID controller delays
The first two events affect a single logical unit number (LUN). The third event affects all the LUNs on a controller. The first and third events appear to happen at random. The second event is repeatable.