Demo: How to monitor a hybrid cloud in Azure Monitor

So in this example, we also have a hybrid cloud that we’ve set up. Our private cloud is set up using some automation around VMware vCenter. With each of these dashboards, we provide you with what we call an “Environment Dashboard,” kind of an overview dashboard. It gives you an idea of what this environment looks like, what’s key, and what kind of metrics we should be looking at. For example, in VMware vCenter we’re talking about looking at things like CPU, memory, and disk, and then digging into each of these. So with each of these dashboards that we’re looking at here, we start off with just a health score. This health score is for each of our ESXi hosts. Luckily for us, right now all these hosts are healthy. But very quickly I want to be able to show you when something’s unhealthy, not only currently but also over time.

Now if something is unhealthy, you know, we need to dig in a little bit deeper. Obviously, we’re going to start looking at what metrics are making it unhealthy. We’ll look at things like CPU utilization and memory utilization on these hosts. We’re not only going to look at things over time across the hosts; we’re also going to give you that average. So, in general, CPU utilization on our ESXi hosts is fine. We even break it down one by one. At peak, our worst host is around 42% and our least utilized host is at 27%. All within reason, and that explains some of those 100% health scores.
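To make the "average plus worst and least utilized host" idea concrete, here is a minimal sketch in Python. It is not the query behind the dashboard; the host names and the middle reading are hypothetical, and only the 42% and 27% figures come from the demo.

```python
from statistics import mean

# Hypothetical per-host CPU utilization (percent). Host names are made up;
# 42 and 27 echo the peak and low values mentioned in the demo.
host_cpu = {
    "esxi-01": 42.0,
    "esxi-02": 35.0,
    "esxi-03": 27.0,
}


def summarize_cpu(utilization):
    """Return (average, busiest host, least utilized host) for a name -> percent map."""
    average = mean(utilization.values())
    busiest = max(utilization, key=utilization.get)
    least_utilized = min(utilization, key=utilization.get)
    return average, busiest, least_utilized


average, busiest, least_utilized = summarize_cpu(host_cpu)
print(f"average={average:.1f}% busiest={busiest} least_utilized={least_utilized}")
```

The same summary could be applied to memory or disk readings by passing a different name-to-percent map.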

And as we look a bit deeper, we’ll go another step down from the host and look at things like virtual machines. Again, this shows you some of the virtual machines that we have here. We’ll look at CPU, memory, and disk. The nice part here is that in general we’re not very CPU constrained and we’re not very memory constrained, but we actually do have a couple of offenders here that are pegging the CPU at 100%. These are VMs that we’re going to want to go investigate, especially when we’re talking about something like virtualization, where a VM that starts to consume too many resources could actually be affecting its neighbors as well. So we want to go find out why these VMs are taking so much CPU. Maybe these are the VMs that we’re going to have to provision more resources to, or maybe even turn off.
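Picking out the worst offenders can be sketched as a simple threshold filter. This is an illustration only: the VM names, the readings, and the 95% cutoff are all assumptions, not values from the dashboard.

```python
# Hypothetical per-VM CPU readings (percent); names and values are illustrative.
vm_cpu = {
    "vm-web-01": 12.0,
    "vm-db-01": 100.0,
    "vm-batch-02": 100.0,
    "vm-test-03": 8.5,
}


def find_cpu_offenders(readings, threshold=95.0):
    """Return VM names at or above the threshold, highest first, ties by name."""
    flagged = [(name, pct) for name, pct in readings.items() if pct >= threshold]
    flagged.sort(key=lambda item: (-item[1], item[0]))
    return [name for name, _ in flagged]


offenders = find_cpu_offenders(vm_cpu)
print("VMs to investigate:", offenders)
```

A threshold a little under 100% is used here so that a VM hovering just below saturation is still flagged; the right cutoff depends on the environment.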

So we want to be able to identify not only what the average is but also the worst offenders. We’ll do that not only for CPU but also for memory and disk. The last table here that I want to show you is something we include with a lot of these overview dashboards: a bunch of additional pieces of information. So maybe we didn’t focus so much on network information for the virtual machines and we want to see more of it. We give you these links that show you the full queries, and we’ll actually give you that data as well.

So if I click into the network information here, you’ll see the query that we wrote to gather this information. I’ll run that query live here and we’ll start to populate the results. This can take just a minute. What it’s going to end up doing is take a look at our network throughput information. We’re going to see, OK, what’s our total network throughput like for all these virtual machines. Now obviously some of these are going to be blank, because these are virtual machines that are turned off. The nice part about these charts that we’re looking at here is that we can actually take the data from these charts and implement it into the dashboards as well.
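The handling of blank rows for powered-off VMs can be sketched like this. It is a rough illustration in Python rather than the actual query shown in the demo; the VM names, the throughput values, and the use of `None` to stand for a blank result are assumptions.

```python
# Hypothetical network throughput per VM (KB/s). None stands in for a blank
# row, which in the demo corresponds to a VM that is turned off.
vm_network_kbps = {
    "vm-web-01": 120.5,
    "vm-db-01": 310.0,
    "vm-archive-01": None,
    "vm-test-03": None,
}


def total_throughput(samples):
    """Return (total over reporting VMs, sorted names of VMs with no data)."""
    reporting = [value for value in samples.values() if value is not None]
    blank = sorted(name for name, value in samples.items() if value is None)
    return sum(reporting), blank


total_kbps, blank_vms = total_throughput(vm_network_kbps)
print(f"total={total_kbps} KB/s, no data for: {blank_vms}")
```

Keeping the blank VMs in a separate list, rather than counting them as zero, makes it clear that they reported nothing instead of reporting no traffic.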

