The Challenge of Managing Storage Performance
Maintaining peak performance is a critical task for the modern storage administrator. More and more mission-critical applications rely on the shared storage infrastructure for real-time activity, and enterprise-level businesses are moving live workstations and production servers onto SAN systems. Meanwhile, resource bottlenecks become harder to anticipate and diagnose because so many software and hardware elements must interoperate.
Admins have a variety of weapons in their arsenal to combat performance issues once capacity requirements have been met. Commonly, logs from servers and storage arrays can direct attention to timeouts, I/O retries, or failing elements in the array. I/O metrics give a direct view of performance, and it is essential to investigate, at a granular level, what happens to I/O requests as they travel down the stack.
The reasons behind performance loss can be numerous. Lack of multipathing, insufficient cache, and inefficient virtual load balancing can all be contributing factors, and they are certainly not mutually exclusive. However, there is one universal but often overlooked cause of performance loss, and even of the physical defects that may be cropping up: severe I/O inefficiencies that undermine both performance and reliability.