This blog post first appeared on the VMware Cloud Management Blog.
Original posted on: May 13, 2016
Updated on: March 4, 2019 by Tim George
As the storage market moves further toward all-flash and hybrid arrays, storage providers work to accelerate performance and improve utilization. As a result, storage troubleshooting becomes more difficult, because it now has to account for hot data, dynamic and agile storage capacities, and other newer concepts.
Today we’re following up on a previous blog post in which we investigated network troubleshooting in a 3-2-1 hardware stack. As a reminder, the term “3-2-1” refers to a redundant architecture of three servers, two switches, and one storage array, or some derivative of it (2-2-1, etc.). This time we dive into vRealize Operations to determine the root cause of an issue at the storage layer of a 3-2-1 stack, and expand on using a robust management console as a data aggregator.
Relationships of the 3-2-1 stack
To begin troubleshooting, we look at the relationships of the 3-2-1 stack using the Management Pack for HPE 3PAR to determine which storage array(s) are associated with the environment. Once we understand the relationships, we investigate performance and capacity metrics that point to issues caused by poor planning and the overutilization of resources.
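The idea of walking relationships through the stack can be illustrated with a small sketch. This is not how the management pack stores or queries relationships; it is a minimal model that assumes a plain parent-to-children map, and every object name in it is hypothetical.

```python
from collections import deque

# Hypothetical parent -> children relationships for a small 3-2-1 stack.
# All object names are illustrative, not taken from a real environment.
RELATIONSHIPS = {
    "array-1": ["pool-1"],
    "pool-1": ["vv-1", "vv-2"],
    "vv-1": ["datastore-1"],
    "vv-2": ["datastore-2"],
    "datastore-1": ["vm-a", "vm-b"],
    "datastore-2": ["vm-c"],
}


def ancestors(target, relationships):
    """Return every object that `target` depends on, nearest first."""
    # Invert the parent -> children map into child -> parents.
    parents = {}
    for parent, children in relationships.items():
        for child in children:
            parents.setdefault(child, []).append(parent)

    # Breadth-first walk up the stack from the target object.
    found, queue, seen = [], deque([target]), {target}
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                found.append(parent)
                queue.append(parent)
    return found


print(ancestors("vm-c", RELATIONSHIPS))
# ['datastore-2', 'vv-2', 'pool-1', 'array-1']
```

Starting from a VM and walking upward answers the first troubleshooting question in this section: which datastore, virtual volume, storage pool, and array sit underneath the workload.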
Custom Dashboards
To show how quickly a 3-2-1 architecture can become complex, we’ve built a custom dashboard, shown in Figure 1. As we can see, a simple concept becomes complex as we add objects such as storage pools, virtual volumes, users, etc. Each of these objects can express a symptom or be the root cause of an issue. Looking at this relationship map, we can see dependencies through the stack and understand the health of each object based on its color. By convention in vRealize Operations Manager, a yellow object indicates a warning, an orange object an immediate need, and a red object a critical alert.
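To make the color convention concrete, here is a small sketch that ranks objects by the severity their color implies. It only mirrors the yellow/orange/red ordering described above; the "green" healthy state, the `triage` helper, and the object names are assumptions made for the example and are not part of any vRealize Operations API.

```python
# Severity order follows the color convention described in the text.
# "green" for a healthy object is an assumption added for this example.
SEVERITY = {"green": 0, "yellow": 1, "orange": 2, "red": 3}
LABEL = {
    "green": "healthy",
    "yellow": "warning",
    "orange": "immediate",
    "red": "critical",
}


def triage(objects):
    """Drop healthy objects and sort the rest, most severe first."""
    flagged = [(name, color) for name, color in objects if SEVERITY[color] > 0]
    # Negating the severity sorts descending while keeping ties in input order.
    return sorted(flagged, key=lambda item: -SEVERITY[item[1]])


# Hypothetical objects from a relationship map.
stack = [
    ("vm-a", "green"),
    ("datastore-1", "yellow"),
    ("vv-1", "red"),
    ("switch-1", "orange"),
]

for name, color in triage(stack):
    print(f"{name}: {LABEL[color]}")
# vv-1: critical
# switch-1: immediate
# datastore-1: warning
```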
Out-of-the-box Dashboards
To quickly understand the alerts associated with the storage and virtual layers of the 3-2-1 stack, we can use out-of-the-box dashboards. As shown in Figure 2, we can see each object in the storage layer and all alerts associated with it, down to the virtual layer.
Using another dashboard, we can understand the current performance of the virtual layer and how the underlying storage is affecting it. At a glance, the HPE 3PAR Performance Dashboard (Figure 3) associates the array with the datastore, down to the underlying virtual volume and controller node. Because the relationships are displayed alongside the performance of each object, we can quickly determine how one object impacts another.
Drilling down further into the storage layer, we can see capacity trends. As shown in Figure 4, we can look deeper into capacity utilization to investigate Fast Class, Nearline, and SSD ratios compared to total space, along with compaction ratios and throughput.
Assessing Performance for Storage Troubleshooting
One of the most important tasks we face within an environment is assessing performance and identifying performance-based issues. Using the virtual volume relationship widget on the capacity dashboard, we can investigate performance metrics to determine latencies and associate them with performance issues. As shown in Figure 5, we can see free space and reserved space, read/write IO and read/write throughput, as well as busy time.
Another great feature of vRealize Operations is the ability to understand change rates within our environment. Figure 6 allows us to select a VM and see its growth trend as a percentage or in GB. We can then track the change rate in the datastore and virtual volume, or look at all of our datastores with thin provisioning, all vSAN objects, all non-vSAN objects, or only our HPE 3PAR objects. The management console can work out our data trends based on where we were and where we are, and then systematically calculate where we are heading.
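The "where we were, where we are, where we will be" idea can be sketched as a simple projection. The post does not describe how vRealize Operations computes its trends, so the ordinary least-squares line below is a stand-in chosen for illustration only; the `days_until_full` helper, the sample values, and the 1000 GB capacity are all hypothetical.

```python
def days_until_full(samples, capacity_gb):
    """Estimate days from the last sample until capacity is reached.

    samples: list of (day, used_gb) pairs, oldest first.
    Returns None when usage is flat or shrinking.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")

    mean_day = sum(day for day, _ in samples) / n
    mean_used = sum(used for _, used in samples) / n
    sxx = sum((day - mean_day) ** 2 for day, _ in samples)
    if sxx == 0:
        raise ValueError("samples must span more than one day")
    sxy = sum((day - mean_day) * (used - mean_used) for day, used in samples)

    slope = sxy / sxx  # growth rate in GB per day
    if slope <= 0:
        return None
    intercept = mean_used - slope * mean_day

    # Day on which the fitted line crosses the capacity limit.
    full_day = (capacity_gb - intercept) / slope
    return full_day - samples[-1][0]


# Hypothetical weekly usage samples for one datastore: (day, used GB).
history = [(0, 500), (7, 535), (14, 570), (21, 605)]
print(days_until_full(history, capacity_gb=1000))  # 79.0
```

A straight-line fit is the simplest possible model and ignores seasonality or sudden provisioning events, which is one reason to rely on the analytics built into the management console rather than a hand-rolled estimate.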
Troubleshooting a 3-2-1 stack can be simplified by extending the capabilities of vRealize Operations. Using third-party management packs to build relationships through the stack and provide key performance and capacity indicators gives us a centralized management console. Walking through vRealize Operations, we have shown the value of coupling an analytical engine with third-party management packs to streamline troubleshooting within a 3-2-1 infrastructure stack.