As a Solutions Architect for Blue Medora I have helped many end users of VMware’s vRealize Operations achieve operational visibility of their entire VMware infrastructure. Today I wanted to address a very common question I receive during deployment. “How can I track Windows Service availability or Linux Process usage?”
While Blue Medora can extend to many aspects of the environment, such as Server, Networking, Storage, Databases, and Applications, we also want to see the Operating Systems. Having performance metrics, capacity information, and alerting for your operating systems is critical to building the complete view.
The good news is that this feature is readily available in the platform and is known formally as the End Point Operations Management Solution in vRealize Operations Manager. This tool is also known as the EP Ops agent for vROps. I’ve even heard them called “EPO agents” and “E-Pops”. Before we begin to tackle service and process tracking at the OS level, let’s first familiarize ourselves with the EP Ops agent.
This is what I was looking for, but what is the End Point Agent, and what can it do?
“You configure End Point Operations Management to gather operating system metrics and to monitor availability of remote platforms and applications. This solution is installed with vRealize Operations Manager.”
“You can map your virtual machines to an operating system to provide additional information to assist you to determine the root cause of why an alert was triggered for a virtual machine.
vRealize Operations Manager monitors your ESXi hosts and the virtual machines located on them. When you deploy an End Point Operations Management agent, it discovers the virtual machines and the objects that are running on them. By correlating the virtual machines discovered by the End Point Operations Management agent with the operating systems monitored by vRealize Operations Manager you have more details to determine the exact cause of an alert being triggered.”
That sounds straightforward, but what are the requirements, and what operating systems are compatible?
In general, we need to have Oracle Java version 8 installed, with a Java home path configured. As for operating systems, most deployments are supported, including Solaris, AIX, Red Hat Linux, SUSE Linux, and Windows.
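A quick way to sanity-check the Java prerequisite before installing an agent is sketched below. It assumes the Java home path is exposed through the conventional JAVA_HOME environment variable, which the guide does not confirm for every platform, and find_java is a hypothetical helper name, not part of any VMware tooling.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_java(env: Mapping[str, str]) -> Optional[Path]:
    """Return the java executable under JAVA_HOME, or None if it is missing."""
    # Assumption: the Java home path is published as JAVA_HOME.
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return None
    bin_dir = Path(java_home) / "bin"
    # Windows ships java.exe; Linux and Unix ship a plain "java" binary.
    for candidate in ("java", "java.exe"):
        path = bin_dir / candidate
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    java = find_java(os.environ)
    print(f"java found at {java}" if java else "JAVA_HOME is unset or has no bin/java")
```

The function takes the environment as a parameter so it can be checked against a test mapping as easily as against os.environ.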
Supported operating systems and their requirements can be easily located in the Supported Operating Systems section of the guide.
Now that we are familiar with the use of the EP Ops agents, and have any needed agents deployed, we can move on to our main objective: tracking services and processes within the operating system. If you need help deploying the agent, follow the Install Guide or watch this Deployment Video from VMware on the subject.
According to the documentation, the agent discovers some of the objects to monitor. We can then manually add other objects, such as files, scripts or processes, and specify the details so that the agent can monitor them. This is detailed in the Manually Create Operating System Objects section of the guide.
To get started, we first need to find our EP Ops Agent object. One easy way to do this is to select the Environment tab in the top bar, then select Other Inventories in the side bar, and then select Operating Systems for a full list.
After finding your OS Object, look for the Actions drop-down. From there we can select Monitor OS Object and then select the appropriate option. For the example today, we are selecting Monitor Process, as this is a Linux host. As for the other options, Monitor Windows Service can track the availability of any Windows service, and Add Monitor Script will execute a custom data-gathering script for you.
We are now prompted with the Monitor Process dialogue box. For our first process, let’s start with something simple, the SSHD process. If you are not familiar, SSHD is the OpenSSH server process. It listens for incoming connections using the SSH protocol and acts as the server for the protocol. It handles user authentication, encryption, terminal connections, file transfers, and tunneling.
There are a few ways to target specific processes in the operating system, but the easiest is with the process name. To view a list of active processes, I suggest starting with the top or htop commands. For most processes, we can simply note the name and append it to the statement in the process.query field. For the SSHD process, we will be using State.Name.eq=sshd.
We can also use the pgrep, ps, and pidof commands to look through the currently running processes and list the process IDs associated with a specific process or running location. For more help on this topic, check out this blog article that covers how to find a Linux process by name.
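If you would rather script this lookup, here is a minimal sketch that groups process IDs by name from ps output and prints a matching State.Name.eq query for each name. It assumes a Linux host with ps on the PATH, and the helper names pids_by_name and name_query are hypothetical, chosen only for this illustration.

```python
import subprocess
from typing import Dict, List


def pids_by_name(ps_output: str) -> Dict[str, List[int]]:
    """Group PIDs by command name from `ps -e -o pid= -o comm=` style output."""
    found: Dict[str, List[int]] = {}
    for line in ps_output.splitlines():
        parts = line.split(None, 1)  # PID first, then the command name
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip blank lines and any header line
        found.setdefault(parts[1].strip(), []).append(int(parts[0]))
    return found


def name_query(process_name: str) -> str:
    """Build the process.query value used in this article for a process name."""
    return f"State.Name.eq={process_name}"


if __name__ == "__main__":
    # Empty "=" headers ask ps to omit the header row.
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    for name, pids in sorted(pids_by_name(result.stdout).items()):
        print(name_query(name), pids)
```

Parsing is kept separate from the ps call so the same function can be run against saved output from another host.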
Here are the detailed instructions provided by VMware:
Supply the PTQL query in the form: Class.Attribute.operator=value
Delimit queries with a comma.
For example, Pid.PidFile.eq=/var/run/sshd.pid
Class is the name of the Sigar class without the Proc prefix.
Attribute is an attribute of the given Class, index into an array or key in a Map class.
operator is one of the following (for String values):
eq Equal to value
ne Not Equal to value
ew Ends with value
sw Starts with value
ct Contains value (substring)
re Regular expression value matches
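To make the query form and the string operators above concrete, here is a small Python model of them. This is only an illustration of the documented syntax, not the agent's or Sigar's actual implementation: the names Clause, parse_query, and clause_matches are hypothetical, the regular expression match is assumed to be unanchored, and values that themselves contain a comma are not handled.

```python
import re
from typing import Callable, Dict, List, NamedTuple


class Clause(NamedTuple):
    cls: str        # Sigar class without the Proc prefix, e.g. "State"
    attribute: str  # attribute, array index, or map key, e.g. "Name"
    operator: str   # one of the string operators listed above
    value: str


# String operators from the list above, modeled as plain Python predicates.
STRING_OPERATORS: Dict[str, Callable[[str, str], bool]] = {
    "eq": lambda actual, value: actual == value,
    "ne": lambda actual, value: actual != value,
    "ew": lambda actual, value: actual.endswith(value),
    "sw": lambda actual, value: actual.startswith(value),
    "ct": lambda actual, value: value in actual,
    # Assumption: unanchored search; the agent's exact regex semantics may differ.
    "re": lambda actual, value: re.search(value, actual) is not None,
}


def parse_query(query: str) -> List[Clause]:
    """Split a comma-delimited query into Class.Attribute.operator=value clauses."""
    clauses = []
    for part in query.split(","):
        left, sep, value = part.strip().partition("=")
        pieces = left.split(".")
        if not sep or len(pieces) != 3 or pieces[2] not in STRING_OPERATORS:
            raise ValueError(f"not in Class.Attribute.operator=value form: {part!r}")
        clauses.append(Clause(pieces[0], pieces[1], pieces[2], value))
    return clauses


def clause_matches(clause: Clause, actual: str) -> bool:
    """Apply one clause's operator to an attribute value read from a process."""
    return STRING_OPERATORS[clause.operator](actual, clause.value)
```

For example, parse_query("State.Name.eq=sshd") yields a single clause, and clause_matches on that clause is true for the process name "sshd" and false for "bash".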
Continuing with the configuration, select OK to save the Monitor Process configuration. We set the collection interval to one minute for more timely notification if the process becomes unavailable. If you enter invalid details when you create an OS object, the object is created but the agent cannot discover it, and metrics are not collected. Assuming this is configured correctly, you will get a new object listed in vROps with a new relationship to your Operating System.
Looking at the object relationship view for our Operating System we can now see the SSHD Process as configured. It has a green health badge and has no alerts associated.
This is a great way to track the availability of a service. We can even use this new data to create custom alerts for availability, or performance thresholds.
While availability status is critical operational information, it is never enough; we would also like to dig into actual usage. To highlight this ability, I went ahead and set up monitoring for a custom process titled Hot Process in my environment. As you can see, the health badge is red for the Hot Process and there is a grey number one in the upper right corner. This is due to an alert that I manually created to notify me of over 90% CPU usage and to degrade the health of the object.
Taking action based on my alert, I can pull up the All Metrics view and see at a glance that not only are the Memory Size and usage quite high for this process, but the CPU Usage is stuck at 98%. It is fair to say that I might want to restart the process or address the issue as soon as possible.
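The rule behind an alert like this is a simple threshold comparison. The sketch below shows that logic in isolation; it is not how vROps evaluates alert definitions internally, and the function name and sample readings are hypothetical.

```python
from typing import Iterable, List


def over_threshold(cpu_samples: Iterable[float], threshold: float = 90.0) -> List[float]:
    """Return the CPU usage samples (in percent) that exceed the threshold."""
    return [sample for sample in cpu_samples if sample > threshold]


if __name__ == "__main__":
    samples = [42.0, 91.5, 98.0]  # hypothetical readings, in percent
    breaches = over_threshold(samples)
    if breaches:
        print(f"CPU above 90% in {len(breaches)} of {len(samples)} samples")
```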
Thank you for following along with this effort to track Windows Services and Linux Processes using VMware vRealize Operations Manager 7.0. Use this tool in combination with Blue Medora’s True Visibility Suite to accomplish end-to-end monitoring from your physical infrastructure to cloud workloads. To get started, be sure to check out the True Visibility Suite page on the Blue Medora website and find a package that is right for you!