Storage Performance Monitoring: 3 Metrics Needed for Holistic Visibility
Justin Baker
January 23, 2013
The Service Level Agreement is always the first casualty in the war to assign blame.
Most IT organizations monitor their applications and infrastructure in a disjointed manner, with each specialist team relying on tools that provide visibility into a single technology silo: network tools for the network engineers, database profilers for the DBAs, agent-based APM tools for developers, and so forth. This fractured approach to monitoring contributes to high IT costs, poor user experience, wasted capacity, and an IT organization that reacts to issues instead of anticipating them.
SAN and NAS Performance Visibility
ExtraHop offers a much-needed new approach that provides holistic visibility across the entire application delivery chain. This cross-tier view enables IT teams to easily understand how applications are impacting the database, network, and storage tiers. With shared operational intelligence, IT teams can collaborate to solve problems faster and identify interrelated issues that would otherwise go undetected.
This month's Performance Metric of the Month highlights the importance of CIFS, NFS, and iSCSI transaction metrics in the context of other application and infrastructure performance. The three real-world examples below demonstrate the value of this correlated visibility.
Case #1 – Tiered Storage vs. the Rogue Application
This customer used the ExtraHop system to inspect a list of all transactions hitting the DataDomain system during the periods of slow performance and identified a single system that was aggressively reading from the storage system. Because the backup storage system was optimized for writes rather than reads, this activity had a serious impact on overall performance. The ExtraHop system made this diagnosis easy by showing all the read and write transactions on a per-client basis. The same capability can be applied to monitoring OLAP database applications, or data warehouses, which are optimized for reads.
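The per-client breakdown described above can be sketched in a few lines of Python. This is only an illustration, not the ExtraHop system's interface: the `Transaction` record, its fields, and the helper names are hypothetical, and the sketch assumes transaction records have already been exported in some form.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Transaction(NamedTuple):
    """Hypothetical storage transaction record (not an ExtraHop type)."""
    client: str      # client identifier, e.g. an IP address or hostname
    op: str          # "read" or "write"
    num_bytes: int   # payload size of the transaction


def per_client_totals(transactions: Iterable[Transaction]) -> dict:
    """Sum read and write bytes separately for each client."""
    totals = defaultdict(lambda: {"read": 0, "write": 0})
    for t in transactions:
        if t.op in ("read", "write"):  # ignore any other operation types
            totals[t.client][t.op] += t.num_bytes
    return dict(totals)


def top_readers(transactions: Iterable[Transaction], n: int = 1) -> list:
    """Return the n clients with the most read bytes, heaviest first."""
    totals = per_client_totals(transactions)
    ranked = sorted(totals.items(), key=lambda kv: kv[1]["read"], reverse=True)
    return [client for client, _ in ranked[:n]]
```

On a write-optimized backup target, a client that dominates the `top_readers` ranking during a slow period would be a natural first suspect.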
Case #2 – iSCSI Connectivity Issues and the Confused SAN
Figure 1. Mapping iSCSI connections helped identify misconfigured servers.
During a proof-of-concept demonstration, the IT manager at the company and an ExtraHop systems engineer confirmed the iSCSI connectivity issues and then pinpointed the specific servers experiencing these problems out of the entire pool of Xen and VMware servers. By generating an application activity map that visually mapped all devices using the iSCSI protocol (see Figure 1), the IT manager confirmed that the two suspect servers were connecting to the SAN in two different ways: they were using the Microsoft iSCSI Software Initiator in Windows in addition to host-bus adapters (HBAs). As the SAN tried to load-balance requests across all available interfaces and controllers, it would sometimes send a response meant for the HBA to the Microsoft iSCSI Software Initiator on that same server, which would then drop the response.
The ExtraHop system helped to solve this obscure issue by providing the necessary context. With the problem identified, the IT manager turned off the Microsoft iSCSI Software Initiator on those servers, and the iSCSI connectivity issues disappeared.
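The check that the activity map made visual, spotting servers that reach the SAN through more than one kind of initiator, can be approximated as follows. This is a minimal sketch under stated assumptions: the `IscsiSession` record and its `initiator` labels ("hba", "software") are hypothetical, and real session data would need to be mapped into this shape first.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class IscsiSession(NamedTuple):
    """Hypothetical iSCSI session record (illustrative only)."""
    server: str      # server that opened the session
    initiator: str   # assumed label for the initiator kind: "hba" or "software"
    target: str      # SAN target the session connects to


def servers_with_mixed_initiators(sessions: Iterable[IscsiSession]) -> list:
    """Return servers that connect through more than one initiator kind."""
    kinds = defaultdict(set)
    for s in sessions:
        kinds[s.server].add(s.initiator)
    # A server using both an HBA and a software initiator is a candidate
    # for the dropped-response problem described above.
    return sorted(server for server, k in kinds.items() if len(k) > 1)
```

Servers returned by this function are the ones worth inspecting for a redundant software initiator.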
Case #3 – The Bandwidth-Hog Logging System
Figure 2. The ExtraHop system analyzes L7 application protocols.
A bug in the log archive script caused large files to be copied across the network repeatedly, and five million files were rewritten unnecessarily. The network team was unfamiliar with the logging system and had assumed that this growth was organic. In fact, they were preparing a forklift upgrade of the network infrastructure to handle the increased traffic, at a cost of hundreds of thousands of dollars. Once the archive script was fixed, however, network utilization dropped by 70 percent, which allowed the company to defer a significant and unnecessary capital expense.
Legacy network-monitoring tools would not have helped in this case. Only the ExtraHop system, with its ability to analyze L7 application-level details, is able to distinguish CIFS traffic (see Figure 2) and list the filenames for each transaction.
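Once filenames are available per transaction, finding files that are copied over and over is a simple aggregation. The sketch below is illustrative only: it assumes write events have been exported as hypothetical `(filename, num_bytes)` pairs, and it treats every write after the first to the same filename as redundant, which is a simplification.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def redundant_writes(write_events: Iterable[Tuple[str, int]]) -> dict:
    """Return {filename: (write_count, redundant_bytes)} for files written more than once.

    redundant_bytes counts every byte written after the first write of a file.
    """
    counts = defaultdict(int)
    total_bytes = defaultdict(int)
    first_write = {}
    for filename, num_bytes in write_events:
        counts[filename] += 1
        total_bytes[filename] += num_bytes
        first_write.setdefault(filename, num_bytes)  # remember only the first write
    return {
        name: (counts[name], total_bytes[name] - first_write[name])
        for name in counts
        if counts[name] > 1
    }
```

Summing the redundant bytes across all files gives a rough estimate of how much traffic a fix to the offending script could remove.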
What's Needed: An Operational Intelligence Solution
If you have your own networked-storage tales to tell, please leave a comment below. Or, if you're interested in finding out how the capabilities of the ExtraHop system can help you, try the free, interactive ExtraHop demo.