2011
06.07

My name is Ryan Abel and I am a Tivoli Common Reporting (TCR) developer at Blue Medora.

What is TCR?

Tivoli Common Reporting is a business reporting tool used by Tivoli products to show key metrics, such as CPU, memory, disk usage, and network I/O, for a product. For more information on TCR, visit: https://www.ibm.com/developerworks/mydeveloperworks/groups/service/html/communityview?communityUuid=9caf63c9-15a1-4a03-96b3-8fc700f3a364

This post highlights the TCR reports just released to complement the Agent for Citrix XenServer v6.2.3. These reports can be downloaded here:

http://www-304.ibm.com/software/brandcatalog/ismlibrary/details?catalog.label=1TW10TM9U

Why Use TCR?

Although Tivoli Monitoring Agents do a great job of representing data, they can be more powerful with a complement like TCR. The XenServer Agent collects data and represents it in the Tivoli Enterprise Portal Desktop (TEPD) in a number of ways (e.g., graphs and tables), but creating custom workspaces can become an advanced task. Tivoli Common Reporting is a tool a customer can use to create reports that have the same capabilities as a TEPD workspace, or even more. The reports for the Agent for Citrix XenServer v6.2.3 aim to increase the power of the agent by extending its capabilities.

XenServer 6.2.3 TCR Highlights

One of the reports included in the TCR package is named “XenServer Hypervisor Heat Chart”. This report shows percentages of memory and CPU utilization in a cross-tab. The cells of the cross-tab are colored according to threshold numbers that the user enters on the prompt page. If the metric in a cell is below a threshold, the cell appears in that threshold's color. As the threshold status rises toward ‘critical’, the color of the cell becomes red. In an actual environment, this report alerts a user to hypervisors that may be running critically high on memory or that are being under-used. See Figure 1 for an example report.

Another report bundled with the TCR package is named “XenServer Top Or Bottom Workload Consumers”. This report shows the top or bottom hypervisors in terms of key metrics such as CPU utilization, memory utilization, and network I/O. One feature of this report is the top/bottom option, which lets a user choose whether to view the key metrics by most utilized or least utilized. Additionally, a user can choose whether to display percentages or real numbers. For example, memory used can be viewed as a number (MB) or as a percentage. Another feature of this report is that it allows a user to limit the number of results that are displayed. For example, a user can see the top five hypervisors by CPU utilization or the bottom ten hypervisors by percentage of memory used. Together, these features make the report flexible. It is essentially over ten reports packaged into one. See Figure 2 for an example report.

Reports can also be run against virtual machines. One of the reports packaged with the 6.2.3 reports is “XenServer Top N VM CPU Utilization Report”. A user may select one, many, or all of the available hypervisors. A list of all the virtual machines on those hypervisors is then generated, and the user may again select any number of those virtual machines. Next, the user must enter the number (n) of virtual machines to display. The report will run and display the n virtual machines that use the most CPU. The unique feature of this report is that it contains a “drill-through”. A drill-through is a relationship between two reports: it runs another report based on information passed from the current report. The drill-through for this report occurs when a user clicks on a virtual machine name. Another packaged report, “XenServer Virtual Machine Daily CPU Trend”, is then run for the virtual machine that was clicked. This allows a customer to zoom in on a virtual machine’s CPU utilization. It gives a user the option to view the general trends of a virtual machine or to focus on one week or even one day.

All of these reports contain prompt pages where a user can specify the parameters used to run the report, and some parameters are shared by all of the reports. One is the date range: a user can run the report for a number of date ranges including today, yesterday, last week, last month, or a specified date range, down to the hour.

The Future

Currently, reports are being developed for the Citrix XenServer 7.1 agent. Some of the features being added to these reports:

  • Pool-centered reports
  • Running reports for only peak hours
  • More advanced filtering
  • What-If Analysis reporting
  • Workload Forecasting

Figure 1:

Figure 2:

If you have any questions/comments, please e-mail me at: ryan.abel@bluemedora.com

2011
05.27

Recently, Blue Medora has developed a set of TCR reports for the ITM Ping Probe Agent. Previous Ping Probe reports were developed with BIRT (Business Intelligence and Reporting Tools), whereas these new reports were created using Cognos Report Studio in TCR v1.3. The switch from BIRT to Cognos has proven beneficial. For instance, creating a report is more “drag and drop” than it was with the previous BIRT methods. This simplification of reporting allows customers to easily create their own reports.

TCR reports can be beneficial to a company in many ways. For example, if a company would like to view a trend of response time by host over the last week, it can run a report with the desired parameters (see figure 1). Another situation where a report may be helpful is in identifying a down host (see figure 2). Once this report is run, the date on which the host went down can be discovered. The user can then click on the date in the table to run another report that dynamically filters the data into an hourly trend report for that date (see figure 3). Now, a company can see exactly which host went down and the hour it went down.

FIGURE 1:

FIGURE 2:

FIGURE 3:

Overall, Cognos Reporting can be a very powerful tool in the business world. It can present information in a way that is both aesthetically pleasing and helpful. Additionally, it has a practical use case and is easily deployable. The reports can be installed in minutes and the customer always has the option of creating their own reports. This truly makes the tool as powerful as a business needs it to be.

Additional information on the Ping Probe Agent can be found at:

http://www.bluemedora.com/itm-agent-ping-probe

Added by: Ryan Abel

If you have any questions, concerns, or comments, feel free to contact me at ryan.abel@bluemedora.com

2011
05.25

One of the features of the powerful lsof tool allows for the discovery of a daemon’s port bindings. To examine them, simply issue the following command in the terminal.

lsof -nP -i

One may use the grep command to filter the output, as in the example below.

lsof -nP -i | grep cupsd

Example output:

cupsd   2865 root 4u  IPv4  8075   TCP 127.0.0.1:631 (LISTEN)
cupsd   2865 root 6u  IPv4  8078   UDP *:631

Above we see that the cupsd process (PID 2865) is bound to TCP port 631 on the loopback address (127.0.0.1) and to UDP port 631 on all addresses.
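
When the bindings need to be consumed by a script rather than read by eye, the lsof lines can be reduced to just the process, protocol, and port. The following is a minimal sketch under the assumption that the input looks like the example output above; the function name socket_ports is made up for illustration and is not part of lsof.

```shell
#!/bin/sh
# socket_ports (hypothetical helper): read "lsof -nP -i" style lines on stdin
# and print "COMMAND PID PROTO PORT" for each listening or unconnected socket.
socket_ports() {
    awk '{
        n = NF
        # Ignore a trailing state column such as "(LISTEN)" if present.
        if ($n ~ /^\(.*\)$/) n--
        addr = $n              # e.g. 127.0.0.1:631 or *:631
        proto = $(n - 1)       # TCP or UDP
        sub(/^.*:/, "", addr)  # keep only the text after the last colon
        print $1, $2, proto, addr
    }'
}

# Example use against a live system:
#   lsof -nP -i | grep cupsd | socket_ports
```

For the two example lines above this prints "cupsd 2865 TCP 631" and "cupsd 2865 UDP 631". Lines for established connections carry a local and a remote address, so they would need extra handling that this sketch does not attempt.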

2011
05.25

Finding the location from which a process is executing is often helpful. It can be done simply on a Linux kernel newer than 2.0 by using the /proc virtual file system.

Discover your kernel version:

uname -a

The /proc/<PID>/exe soft link points to the executable file that the process is running; the directory containing that file is the directory from which the process is executing. If a process’s PID is unknown, it can be discovered using the ps tool:

ps awxx | grep <Process Name>

Combining the above logic with some other common Linux tools could yield an example bash command as follows:

ls -l /proc/$(ps awxx | grep cupsd | grep -v grep \
| awk -F' ' '{print $1}')/exe | awk -F' -> ' '{print $2}'

This simply parses the long-format listing of the soft link, printing the full path of the cupsd executable.

Example output:

/usr/sbin/cupsd
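
The same lookup can be written more compactly with readlink, which prints the target of a symbolic link directly. The following is a minimal sketch rather than a drop-in replacement: the function names and the optional second argument (an alternative proc root, present only so the functions can be tried against a scratch directory) are assumptions made for illustration.

```shell
#!/bin/sh
# exe_of_pid PID [PROC_ROOT]  (hypothetical helper)
# Print the target of PROC_ROOT/PID/exe, i.e. the executable the process runs.
# PROC_ROOT defaults to /proc.
exe_of_pid() {
    pid=$1
    proc_root=${2:-/proc}
    readlink "$proc_root/$pid/exe"
}

# exe_dir_of_pid PID [PROC_ROOT]  (hypothetical helper)
# Print the directory that contains that executable.
exe_dir_of_pid() {
    exe=$(exe_of_pid "$@") || return 1
    dirname "$exe"
}

# Example: inspect the current shell itself.
#   exe_of_pid $$
#   exe_dir_of_pid $$
```

Reading the exe link of a process owned by another user generally requires root privileges, so a daemon such as cupsd may need the call to be run with sudo.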

references:

http://linux.about.com/od/commands/l/blcmdl5_proc.htm

2011
05.25

TADDM “L1” Sensors

TADDM Level 1 discoveries terminate the Result <-> Seed chain reaction with the Stack Scan Sensor.

  • The Stack Scan Sensor detects open ports and creates database objects based on mappings listed in the “%COLLATION_HOME%\osgi\plugins\com.ibm.cdb.discover.sensor.idd.stackscan_7.1.0\etc\PortAppScanSensor.properties” file.
  • The Stack Scan Sensor guesses which operating system is running on each host using mappings listed in the “%COLLATION_HOME%\osgi\plugins\com.ibm.cdb.discover.sensor.idd.stackscan_7.1.0\etc\fingerprints.conf” file. It does this using ICMP ping response signatures. Operating systems are not confirmed, and the guesses are often inaccurate.

Though it may be possible, attempting to create sensors for L1 discoveries is not recommended (as little information is available at L1). Instead, one should submit a request for comments to IBM, proposing a modification of the Stack Scan Sensor’s port mappings.

2011
05.25

TADDM Sensor Icon Creation

Though TADDM includes a collection of icons, additional icons may be desired to make CIs easier to identify.

Required Icons

There are 5 required icons. The examples below use ‘Postgres’ as an example sensor.

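  • 3 topology icons – Used in topology graphs
    • Format: SVG
    • Path: ‘%COLLATION_HOME%\images\custom’
    • Filename convention: icon_postgres_default.svg, icon_postgres_state.svg, icon_postgres_dormant.svg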
  • 1 topology tooltip icon – Used in topology graphs when the mouse hovers over the node
    • Format: PNG
    • Size: 130 x 65 pixels
    • Path: ‘%COLLATION_HOME%\images\custom\tooltip’
    • Filename convention: tooltip_postgres.png
  • 1 tree icon – Used in the Discovered Components tree
    • Format: PNG
    • Size: 20 x 20 pixels
    • Path: ‘%COLLATION_HOME%\images\custom\tree’
    • Filename convention: postgres.png

Icon Deployment

The 5 icons described above must be placed into each of the locations below:

‘%COLLATION_HOME%\images\custom\icon_<name>_state.svg’
‘%COLLATION_HOME%\images\custom\icon_<name>_dormant.svg’
‘%COLLATION_HOME%\images\custom\tree\<name>.png’
‘%COLLATION_HOME%\images\custom\tooltip\tooltip_<name>.png’
‘%COLLATION_HOME%\images\custom\icon_<name>_default.svg’
‘%COLLATION_HOME%\deploy-tomcat\cdm\images\custom\tree\<name>.png’
‘%COLLATION_HOME%\deploy-tomcat\images\custom\icon_<name>_state.svg’
‘%COLLATION_HOME%\deploy-tomcat\images\custom\icon_<name>_dormant.svg’
‘%COLLATION_HOME%\deploy-tomcat\images\custom\tree\<name>.png’
‘%COLLATION_HOME%\deploy-tomcat\images\custom\tooltip\tooltip_<name>.png’
‘%COLLATION_HOME%\deploy-tomcat\images\custom\icon_<name>_default.svg’
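
Copying the same files into this many locations by hand is easy to get wrong, so the copy can be scripted. The sketch below assumes a Unix-style TADDM installation in which COLLATION_HOME is set as an environment variable and paths use forward slashes; the function name deploy_icons and the idea of a staging directory holding the five source files are illustrative assumptions, not part of TADDM.

```shell
#!/bin/sh
# deploy_icons NAME SRC_DIR [HOME_DIR]  (hypothetical helper)
# Copy the five icons for sensor NAME from SRC_DIR into every location
# listed above. HOME_DIR defaults to $COLLATION_HOME.
# Assumption: SRC_DIR holds icon_NAME_default.svg, icon_NAME_state.svg,
# icon_NAME_dormant.svg, NAME.png (tree icon) and tooltip_NAME.png.
deploy_icons() {
    name=$1
    src=$2
    home=${3:-$COLLATION_HOME}

    for base in "$home/images/custom" "$home/deploy-tomcat/images/custom"; do
        mkdir -p "$base/tree" "$base/tooltip"
        for state in default state dormant; do
            cp "$src/icon_${name}_${state}.svg" "$base/" || return 1
        done
        cp "$src/$name.png" "$base/tree/" || return 1
        cp "$src/tooltip_$name.png" "$base/tooltip/" || return 1
    done

    # The list above also names a tree icon under deploy-tomcat/cdm.
    mkdir -p "$home/deploy-tomcat/cdm/images/custom/tree"
    cp "$src/$name.png" "$home/deploy-tomcat/cdm/images/custom/tree/"
}

# Example (staging directory is hypothetical):
#   deploy_icons postgres /tmp/postgres-icons
```

On a Windows installation the same layout applies with %COLLATION_HOME% and backslashes, but the commands would differ.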


2011
05.25

TADDM Sensor Logging

Sensor Logging

There is a property in the “%COLLATION_HOME%\etc\collation.properties” file that improves readability of the logs by separating the logging into per-sensor log files. To enable this, set the following property:

com.collation.discover.engine.SplitSensorLog=true

If you do not set this property to true, logging for all sensors is written to “%COLLATION_HOME%\log\services\DiscoveryManager.log” by default. When the property is set, the logs are separated into files of the form “%COLLATION_HOME%\log\sensors\<runid>\sensorName-IP.log”, for example: sensors/20070621131259/SessionSensor-10.199.21.104.log
The runid includes the date of the discovery run, and the log file name includes the sensor name and the IP address of the target. When this option is used, the logs are not cleared automatically; this must be done manually, if required.
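
Because the per-sensor logs accumulate, it can help to list the run directories that are older than some cutoff before deleting anything. The following is a minimal sketch under two assumptions that go beyond the text: that the runid is a 14-digit YYYYMMDDhhmmss timestamp (as the example 20070621131259 suggests) and that TADDM is installed on a Unix-style system with COLLATION_HOME set. The function name old_sensor_runs is hypothetical.

```shell
#!/bin/sh
# old_sensor_runs CUTOFF [SENSOR_LOG_DIR]  (hypothetical helper)
# Print run directories whose name (assumed to be a YYYYMMDDhhmmss runid)
# is older than CUTOFF, which is given in the same 14-digit format.
# SENSOR_LOG_DIR defaults to $COLLATION_HOME/log/sensors.
old_sensor_runs() {
    cutoff=$1
    dir=${2:-$COLLATION_HOME/log/sensors}
    for run in "$dir"/*/; do
        [ -d "$run" ] || continue
        id=$(basename "$run")
        # Skip anything that is not exactly 14 digits.
        case $id in
            *[!0-9]*) continue ;;
        esac
        [ "${#id}" -eq 14 ] || continue
        # Equal-length timestamps compare correctly as plain integers.
        if [ "$id" -lt "$cutoff" ]; then
            printf '%s\n' "${run%/}"
        fi
    done
}

# Example: list runs from before 1 January 2011.
#   old_sensor_runs 20110101000000
```

The function only prints candidates. Reviewing the list before passing it to rm keeps the cleanup a deliberate, manual step, in line with the note above.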

Logging levels

The following logging levels are valid:

  • FATAL (log only fatal messages -> MINIMUM LOGGING)
  • ERROR
  • WARN
  • INFO (default)
  • DEBUG
  • TRACE (log every message -> MAXIMUM LOGGING)

Setting the logging level for the Topology JVM to DEBUG can cause performance problems on some systems. You should set the level for this JVM to DEBUG only when you need to debug storage errors or other topology issues. If a problem takes more than a few minutes to reproduce, more space should be allocated for the Topology logs. Review the following properties in the collation.properties file, and increase the values of these properties as needed:

# File size of a rollover log file
com.collation.log.filesize=20MB
# Number of logfiles before rollover
com.collation.log.filecount=5

When all of the space allocated by these properties has been used, the oldest data is deleted.
A common configuration for most issues is to set the global logging level to DEBUG and the logging level for the Topology JVM to INFO, as shown in the following example:

com.collation.log.level=DEBUG
com.collation.log.level.vm.Topology=INFO
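
Before reproducing a problem, it is worth confirming which logging properties are actually active in the file. The snippet below is a small sketch that assumes a Unix-style installation with COLLATION_HOME set; the function name show_log_settings is hypothetical.

```shell
#!/bin/sh
# show_log_settings [FILE]  (hypothetical helper)
# Print the active (uncommented) com.collation.log.* properties.
# FILE defaults to $COLLATION_HOME/etc/collation.properties.
show_log_settings() {
    file=${1:-$COLLATION_HOME/etc/collation.properties}
    grep -E '^[[:space:]]*com\.collation\.log\.' "$file"
}

# Example:
#   show_log_settings
```

Lines that begin with # are skipped, so a commented-out level is not mistaken for the one in effect.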

General Heuristics:

  • Trace – Only when one would be “tracing” the code and trying to find one part of a function specifically.
  • Debug – Information that is diagnostically helpful to people other than developers (IT, sysadmins, etc).
  • Info – Generally useful information to log (service start/stop, configuration assumptions, etc). This is TADDM’s default logging level.
  • Warn – Anything that can potentially cause application oddities, but from which the program can automatically recover (switching from a primary to backup server, retrying an operation, missing secondary data, etc).
  • Error – Any error which is fatal to the operation but not the service or application (cannot open a required file, missing data, etc). These errors will force user (administrator, or direct user) intervention. These are usually reserved for incorrect connection strings, missing services, etc.
  • Fatal – Any error that is forcing a shutdown of the service or application to prevent data loss (or further data loss). Reserve these only for the most heinous errors and situations where there is guaranteed to have been data corruption or loss.

2011
05.25

Generic Jython Script

We use a MySQL Custom Server for this example.


2011
05.25

Custom Server Template

A Custom Server Template (CST) is probably the simplest variant of a TADDM sensor. It requires the ‘CustomAppServerSensor’ to execute and thus is only available for Level 2 and Level 3 discoveries by default. One may create a Custom Server Template as follows:

Open the TADDM ‘Discovery Management Console’ (prior to TADDM 7.2.1 this interface was referred to as the ‘Product Console’) and select ‘Custom Servers’ from the ‘Discovery’ drawer. A list of existing Custom Servers is displayed.


2011
05.25

Template Validation

There is an XML schema definition for the overall format of a template.  It will not validate the common data model statements but it will catch most errors that will stop the TMS DLA program from processing.  (This includes XML syntax errors.) The tmsdla.xsd file should be in the ibm\itm\cnps\tmsdla directory and the validate_template.bat file in the ibm\itm\cnps directory. The template must reference the XSD for validation to take place.  That means that the start of the template might look like this:

<?xml version="1.0" encoding="UTF-8"?>

<tmsdla:template
xmlns:tmsdla="http://localhost.com/tmsdla"
xmlns:cdm="http://localhost.com/cdm"
product="xx"
version="06.20.01" >

The format of the command for a template in the tmsdla directory is:

validate_template tmsdla\kxx_tmsdla.xml

If everything is correct, the following message appears:

highest severity:  0

Errors will be displayed with the line number and column number where they were found in the template.  The TMS DLA utility will catch the same syntax errors, but the utility does not have the ability to display the line and column number.