Storage Troubleshooting in a 3-2-1 Hardware Stack

by Chuck Petrie on March 4, 2019

This blog post first appeared on VMware Cloud Management Blog. Read the full blog post here.

Original posted on: May 13, 2016
Updated on: March 4, 2019 by Tim George

As the storage market moves further toward all-flash and hybrid arrays, storage providers work to accelerate performance and utilization. In doing so, storage troubleshooting becomes more difficult, because it now has to account for categorizing hot data, dynamic and agile storage capacities, and other newer concepts.

Today we’re following up on a previous blog where we investigated network troubleshooting in a 3-2-1 hardware stack. As a reminder, the term “3-2-1” refers to a redundant architecture of three servers, two switches and one storage array, or some derivative of that layout (2-2-1, etc.). In this post, we dive into vRealize Operations to determine the root cause of an issue at the storage layer of a 3-2-1 stack and expand on using a robust management console as a data aggregator.

Relationships of the 3-2-1 stack
To begin troubleshooting, we look at the relationships of the 3-2-1 stack using the Management Pack for HPE 3PAR to determine which storage array(s) are associated with the environment. After understanding the relationships, we investigate performance and capacity metrics that point to issues caused by poor planning and the overutilization of resources.

Custom Dashboards

Figure 1 – Custom 3-2-1 (Derivative) Stack Dashboard

To show how quickly a 3-2-1 architecture can become complex, we’ve built a custom dashboard, shown in Figure 1. A simple concept becomes complex as we add objects such as storage pools, virtual volumes, users, etc. Each of these objects can express a symptom or be the root cause of an issue. Looking at this relationship map, we can see dependencies through the stack and understand the health of each object based on its color. As standard, vRealize Operations Manager associates a yellow object with a warning, an orange object with an immediate need, and a red object with a critical alert.
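The idea of reading health through the dependencies of the stack can be sketched in a few lines of Python. This is an illustration only, not the vRealize Operations API: the object names, the `HEALTH_RANK` mapping, and the `worst_dependencies` helper are all hypothetical, and green is assumed to mean healthy. The sketch walks down from one object and reports which dependencies carry the worst health color, which is roughly what the eye does on the relationship map.

```python
# Hypothetical model of a relationship map; not the vRealize Operations API.
# Higher rank means a more severe health color (green assumed healthy).
HEALTH_RANK = {"green": 0, "yellow": 1, "orange": 2, "red": 3}

# Each object maps to the objects it depends on, walking down the stack.
CHILDREN = {
    "vm-01": ["datastore-01"],
    "datastore-01": ["virtual-volume-01"],
    "virtual-volume-01": ["storage-pool-01"],
    "storage-pool-01": ["array-01"],
    "array-01": [],
}

# Made-up health colors for the example.
HEALTH = {
    "vm-01": "yellow",
    "datastore-01": "yellow",
    "virtual-volume-01": "orange",
    "storage-pool-01": "red",
    "array-01": "green",
}


def worst_dependencies(start, children, health):
    """Return the objects at or below `start` that share the worst health color."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, []))
    worst = max(HEALTH_RANK[health[name]] for name in seen)
    return sorted(name for name in seen if HEALTH_RANK[health[name]] == worst)


if __name__ == "__main__":
    print(worst_dependencies("vm-01", CHILDREN, HEALTH))
```

In this made-up data the VM only shows a warning, while the object with the most severe color sits further down at the storage pool, which is the kind of candidate root cause the relationship map helps surface.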

Out-of-the-box Dashboards

Figure 2 – HPE 3PAR Overview Dashboard

To quickly understand the alerts associated with the storage and virtual layers of the 3-2-1 stack, we can use out-of-the-box dashboards. As shown in Figure 2, we can see each object associated with the storage layer and quickly see all alerts associated with it, down to the virtual layer.

Figure 3 – HPE 3PAR Performance Dashboard

Using another dashboard, we can understand the current performance of the virtual layer and how the underlying storage is affecting it. At a glance, the HPE 3PAR Performance Dashboard (Figure 3) associates the array with the datastore, down to the underlying virtual volume and controller node. As the relationships are displayed, we can also see the performance of each object and quickly determine how one object impacts another.

Figure 4 – HPE 3PAR Capacity

Drilling down further into the storage layer, we can see capacity trends. As shown in Figure 4, we can look deeper into capacity utilization to investigate Fast Class, Nearline and SSD ratios compared to total space, as well as compaction ratios and throughput.
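As a rough illustration of what a tier ratio compared to total space means, the arithmetic can be sketched as follows. The `tier_ratios` helper and the capacity figures are hypothetical and are not taken from the dashboard or the management pack.

```python
def tier_ratios(tier_gb):
    """Return each tier's share of total space as a fraction between 0 and 1."""
    total = sum(tier_gb.values())
    if total == 0:
        # Avoid dividing by zero when no capacity is reported.
        return {name: 0.0 for name in tier_gb}
    return {name: size / total for name, size in tier_gb.items()}


if __name__ == "__main__":
    # Made-up capacities in GB for the three drive tiers named above.
    capacities = {"Fast Class": 40_000, "Nearline": 50_000, "SSD": 10_000}
    for name, ratio in tier_ratios(capacities).items():
        print(f"{name}: {ratio:.0%} of total space")
```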

Assessing Performance for Storage Troubleshooting

Figure 5 – HPE Virtual Volume Relationships

One of the most important tasks within an environment is assessing performance and identifying performance-based issues. Using the virtual volume relationship widget on the capacity dashboard, we can investigate performance metrics to determine latencies and associate them with performance issues. As shown in Figure 5, we can see free and reserved space, read/write IO, read/write throughput, and busy time.

Figure 6 – HPE 3PAR Change in Capacity

Another great feature inside vRealize Operations is the ability to understand change rates within our environment. Figure 6 allows us to select a VM and see the growth trend as a percentage or in GB. We can then track the change rate in the datastore and virtual volume, or look at all of our datastores with thin provisioning, all vSAN objects, all non-vSAN objects, or only our HPE 3PAR objects. The management console can understand our data trends based on where we were and where we are, and then systematically calculate where we will be going.
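To make the "where we were, where we are, where we will be" idea concrete, here is a minimal sketch that fits a straight line to past usage samples and projects when a datastore would fill. It is an assumption for illustration: the post does not describe how vRealize Operations computes its trends, so a simple least-squares fit is swapped in, and the function names and sample values are hypothetical.

```python
def linear_trend(samples):
    """Least-squares slope and intercept for (day, used_gb) samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must span more than one day")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_full(samples, capacity_gb):
    """Days after the last sample until the fitted line reaches capacity.

    Returns None when usage is flat or shrinking.
    """
    slope, intercept = linear_trend(samples)
    if slope <= 0:
        return None
    last_day = samples[-1][0]
    full_day = (capacity_gb - intercept) / slope
    return max(0.0, full_day - last_day)


if __name__ == "__main__":
    # Made-up weekly samples: (day, used GB) for one datastore.
    history = [(0, 500.0), (7, 535.0), (14, 570.0), (21, 605.0)]
    growth_gb_per_day, _ = linear_trend(history)
    print(f"growth: {growth_gb_per_day:.1f} GB/day")
    print(f"days until full: {days_until_full(history, capacity_gb=1000.0)}")
```

A straight line is the simplest possible model; real growth is often seasonal or bursty, so a projection like this is better treated as a first estimate than as a forecast.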

Troubleshooting a 3-2-1 stack can be simplified by extending the capabilities of vRealize Operations. Using third-party management packs to build relationships through the stack and provide key performance and capacity indicators gives us a centralized management console. Walking through vRealize Operations shows the value of coupling an analytical engine with third-party management packs to streamline troubleshooting within a 3-2-1 infrastructure stack.
