eHealth Discover Process Best Practices

Managing and Troubleshooting the eHealth Discovery Process

The eHealth discover mechanism is the main avenue to integrate
existing network devices into the eHealth Fault and Performance
management environment. Through that process, network devices are
added to the eHealth configuration and provide the user with the data
necessary to successfully manage their network infrastructure and to
maximize the benefit of the eHealth suite.

Prepared by:
Jason Normandin
Concord Technical Support

Copyright 2004 Concord Communications, Inc. eHealth, the Concord Logo, Live Health, Live Status, SystemEDGE, AdvantEDGE and/or other Concord marks or products referenced
herein are either registered trademarks or trademarks of Concord Communications, Inc. Other trademarks are the property of their respective owners.

I. INTRODUCTION
II. PREREQUISITES
III. OVERVIEW OF THE EHEALTH DISCOVER PROCESS
    1. HOW DOES THE EHEALTH DISCOVER PROCESS WORK?
    2. WHAT ARE THE DIFFERENCES BETWEEN AD-HOC AND SCHEDULED DISCOVERIES?
    3. HOW DO DISCOVERIES IMPACT MY LICENSE CONSUMPTION?
    4. EXPLANATION OF THE EHEALTH MERGE ALGORITHM
    5. HOW DOES SELECTING THE MIB2 OPTION IMPACT MY DISCOVERY RESULTS?
IV. TROUBLESHOOTING COMMON DISCOVERY ISSUES
    1. TROUBLESHOOTING NO RESPONSE TO SNMP OR NO RESPONSE TO PING ERRORS
    2. TROUBLESHOOTING NO MIB SUPPORT FOR THIS AGENT ERRORS
    3. TROUBLESHOOTING SYSTEMEDGE DISCOVERY ISSUES
    4. RECONCILING AND AVOIDING DUPLICATE ELEMENT CREATION
V. GENERAL DISCOVERY BEST PRACTICES
    1. AVOIDING DUPLICATE ELEMENTS BY IMPLEMENTING A STRONG CHANGE CONTROL PROCESS
    2. MINIMIZING DATA LOSS THROUGH STATISTICS POLLER ERROR ANALYSIS AND NODBDATAFOR TOOLS
    3. USING SEED FILES TO AUTOMATE INCREMENTAL CONFIGURATION UPDATES
    4. SELF MONITORING THE EHEALTH SYSTEM USING PROCESS SET CREATION
    5. EFFECTIVELY INTERFACING WITH CONCORD TECHNICAL SUPPORT TO RESOLVE DISCOVERY ISSUES
VI. CHANGES TO THE DISCOVERY PROCESS IN EHEALTH 5.6.X
VII. OTHER RESOURCES

Concord Communications Discover Best Practices 2


I. Introduction

The eHealth discover process is an integral piece of a successful eHealth implementation. The eHealth
discover mechanism is the main avenue to integrate existing network devices into the eHealth Fault and
Performance management environment. Through that process, network devices are added to the eHealth
configuration and provide the user with the data necessary to successfully manage their network infrastructure and
to maximize the benefit of the eHealth suite.

Although relatively simple, the eHealth discover process requires a hands-on approach to ensure success.
This document provides the reader with the knowledge and tools necessary to manage that process
successfully.

II. Prerequisites
This document is not intended as a replacement for the standard eHealth suite documentation such as the
Administration Guide or Users Guide. This document should be used in conjunction with the existing eHealth
manuals and Concord Knowledgebase. Additionally, the reader should possess a basic understanding of the
eHealth application, an understanding of their network infrastructure, and a basic understanding of SNMP and
device MIBs.

III. Overview of the eHealth Discover Process

1. How does the eHealth Discover Process Work?


There are three key steps to the discovery process.

1. The finder process searches the network for everything it can find within preset limits. The preset
limits, for either a scheduled or an interactive discovery, are determined by the user and are defined by
three major categories. These include IP addresses to search, technology type and community string.

The finder is a program written in the Tcl ("Tool Command Language") scripting language. Tcl is an
interpreted language and, like any interpreted language, requires a runtime interpreter. The Tcl
interpreter and related libraries are included in the $NH_HOME/bin/sys directory.

Finder is composed of several logical pieces that operate sequentially to perform one primary mission: the
creation of poll records in the $NH_HOME/poller/poller.cfg file. Finder is never run directly; rather it is
called from other scripts or programs (depending on the operating system), which check environment, set
variables, etc.

First, finder queries the sysObjectID of the device-in-question (DiQ). The sysObjectID is an entry in the
MIB2 system table that identifies the vendor who wrote and/or implemented the MIB being queried.
Depending on the object class being discovered (i.e. LAN/WAN, Router, Probe, or Server) the finder will
go to the class's main table and iterate through the list of possible OIDs until it finds a match for the
sysObjectID retrieved from the DiQ's MIB.

If a match is found, the table then tells finder where to go next. In the case of LAN/WAN, the main
table points finder toward a vendor-specific (or perhaps IETF standard) algorithm and a vendor-specific
(or IETF standard) interface table to be used as input to that algorithm. In this way finder can
cover any situation where a device is supported by standard MIBs, such as an RMON probe; by
vendor-specific MIBs, such as a Cabletron EMME concentrator; or by a combination of both, such as a
Cisco Catalyst 5000, which populates the MIB2 interface table (IETF standard table) but places the
module and port numbers in the enterprise MIB (vendor-specific algorithm).
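
The table lookup described above can be pictured with a short Python sketch. It is only an illustration: the table contents, the labels it returns and the function name select_support are assumptions made for this example and are not taken from finder.tcl.

```python
# Hypothetical class table: sysObjectID prefix -> (algorithm, interface table).
# The entries are illustrative only; they are not eHealth's real tables.
LAN_WAN_TABLE = [
    ("1.3.6.1.4.1.9.", ("vendor-specific algorithm", "MIB2 ifTable plus enterprise MIB")),
    ("1.3.6.1.4.1.52.", ("vendor-specific algorithm", "enterprise interface table")),
]


def select_support(sys_object_id, table=LAN_WAN_TABLE):
    """Return the first table entry whose prefix matches, or None if nothing matches."""
    # A trailing dot keeps enterprise 9 from also matching enterprise 99999.
    normalized = sys_object_id.lstrip(".") + "."
    for prefix, support in table:
        if normalized.startswith(prefix):
            return support
    return None


print(select_support("1.3.6.1.4.1.9.1.283"))
print(select_support("1.3.6.1.4.1.99999.1"))
```

In this sketch a result of None corresponds to the case where no entry in the class table matches the device's sysObjectID.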

Next, finder uses a collection of tables for interface types. These tables are used to choose the interface
types (ifType in MIB2) that will be added to the poller configuration. We generally do not want to use all
entries in the ifTable, as some entries are not relevant to eHealth. For example if we are discovering an
RMON probe for ethernet statistics we most likely do not want to discover the out-of-band (OOB) 9600
bps SLIP port, which is also in the MIB2 ifTable. So these tables are used as kind of an inclusive decision
filter to pass only the types of interfaces that we want downstream to the algorithm that will be used to
generate poll records.
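
A minimal sketch of such an inclusive filter, using the RMON probe example, might look as follows. The row layout and the function name filter_interfaces are assumptions for illustration; the two ifType numbers are the standard IANA values for Ethernet and SLIP interfaces.

```python
ETHERNET_CSMACD = 6  # IANA ifType ethernetCsmacd
SLIP = 28            # IANA ifType slip


def filter_interfaces(if_table, wanted_types):
    """Keep only the ifTable rows whose ifType is in the inclusive list."""
    return [row for row in if_table if row["ifType"] in wanted_types]


# Hypothetical ifTable of a stand-alone RMON probe.
probe_if_table = [
    {"ifIndex": 1, "ifDescr": "ethernet port 1", "ifType": ETHERNET_CSMACD},
    {"ifIndex": 2, "ifDescr": "OOB SLIP port", "ifType": SLIP},
]

kept = filter_interfaces(probe_if_table, {ETHERNET_CSMACD})
print([row["ifDescr"] for row in kept])
```

Only the Ethernet port is passed downstream; the out-of-band SLIP port is dropped because its ifType is not in the inclusive list.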

Finder also looks at the ifAdminStatus or the ifOperStatus to determine whether the device/interface is
down or up. In most cases the ifAdminStatus is used; in a few cases the ifOperStatus is used instead.
This is determined in finder.tcl.

ifOperStatus is the actual electrical state of the connection (plugged in or not plugged in).
ifAdminStatus is the desired state that the administrator chooses.
Both up = active and discoverable.
Both down = not active and not discoverable.
ifOperStatus up, ifAdminStatus down = active but not discoverable.
ifOperStatus down, ifAdminStatus up = not active, but the administrator does want it to be discovered.
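
The four combinations can be written as a small Python sketch. It assumes the common case in which ifAdminStatus decides whether an interface is discovered; the function name classify and the use of the strings "up" and "down" instead of the numeric SNMP values are choices made for this example.

```python
def classify(if_oper_status, if_admin_status):
    """Return (active, discoverable) for one interface.

    active       follows ifOperStatus (the actual state of the link)
    discoverable follows ifAdminStatus (the state the administrator wants)
    """
    active = if_oper_status == "up"
    discoverable = if_admin_status == "up"
    return active, discoverable


for oper in ("up", "down"):
    for admin in ("up", "down"):
        print(oper, admin, classify(oper, admin))
```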

Once an interface has passed through this table it is passed off to an algorithm to generate the actual poll
record (poller entry). In some cases the algorithm will perform some additional exclusive interface-type
filtering. Sometimes only a single interface entry is sent, and we iterate through an array of entries, and
other times we go interface by interface and generate a poll record on each one. It depends on whether we
are doing standard support or enterprise-specific support and how many interfaces exist on the device.
Sometimes the enterprise support is extremely easy, using a simple table and the standard algorithm and
sometimes it is quite complex. It all depends on the complexity of the MIB implementation and the
statistics that the customer is requesting. If there is a requirement to cross reference variables from one
table to others, the algorithm can be quite intricate.

For example an RMON probe is quite simple. A stand-alone probe will typically populate its MIB2
ifTable with one or more ethernet ports and an OOB SLIP port. We filter out everything but the ethernet
ports and send them down to the standard ifTable algorithm, which generates poll records for RMON
"etherstats" elements.

An example of more complex support is the Bay 5000 chassis. Like most chassis designs the Bay 5000
can accept many types of blades - ethernet, token-ring, FDDI, ATM, management, etc. - each with
different capabilities and/or numbers of ports. The management software (i.e. Optivity) allows the user to
define logical groups of ports (virtual LANs if you will), which are either partitioned from the rest of the
network, or connected to other ports on another card or chassis. In order to provide utilization statistics
for all of these complex blade types and virtual LANs that the user can build, the vendor had to come up
with a very complex group of MIBs. Consequently, the finder logic that sorts through this jungle of
options is rather involved.

Based on the above information, finder assigns an agent type to each element found on the device. The
agent type is associated with a MIB translation file in the $NH_HOME/poller directory. A list of these
associations may be found in $NH_HOME/poller/agent.types.

2. The newly created DCI file is passed through the eHealth Merge Algorithm.

This collected data is then put into a single, temporary, internal DCI file. DCI files are comma-separated
flat files. The merge process then takes the temporary DCI file created by the Finder process and
compares that information against the known elements in the poller configuration. It does this to
determine one of three things about the new information. It determines:

a) Is the newly discovered item identical to a pre-existing item in the poller?
b) Does the new item already exist in the poller but need to be updated?
c) Is the discovered item an actual new discovery?

For more information on the eHealth merge process, please see section 3.4, Explanation of the eHealth
Merge Algorithm.

3. The new elements or updates to existing elements are saved to the eHealth configuration.

For more information on the save process, please see section 3.2, What are the Differences Between Ad-
hoc and Scheduled Discoveries?

2. What are the Differences between Ad-hoc and Scheduled Discoveries?

Scheduled and Ad-hoc discoveries perform the same duties with the exception of how/when the results
are saved to the eHealth configuration. The scheduled discovery can be configured to either save the
discovery results or simply log the results for review by the eHealth Administrator for a later discovery.
If the scheduled discover is configured to simply log the results, the eHealth administrator should review
the changes logged and re-run the discovery to actually save the results at a later time.

During the scheduled discovery where the job is configured to save the results, the merge process and the
save process take place at the same time in the config server. This is due to the fact that the scheduled
discovery does not allow the user to review the discovery information before saving it to the database.

In contrast, during an interactive discovery, eHealth gives the user the option to edit the findings before
saving. By selecting "Edit before Save", all new elements found are brought up in the poller configuration
editor. Here the user may modify the information found by the finder. A DCI file is generated from the
save process which contains the original discovery information along with the modifications made by the
user through the "Edit before Save". This DCI file is then sent to the config server to update the poller
configuration/database.

The interactive discovery has been engineered to be the more aggressive tool. As the user is allowed to
edit the findings before committing them to the database, the user has more control over what will be
polled on the network and how it is polled.

The eHealth Discover logfiles are an invaluable tool to better manage the Discovery process. The logfile
created by eHealth will vary depending on the type of discovery run.

For adhoc/interactive discoveries, eHealth will create discoverInteractive logs in the
$NH_HOME/log directory, which contain detailed information about each element discovered, possible
duplicates, and unresolved elements. The log files have the following naming convention:

discoverInteractive.mm.dd.yyyy.nnnnnn.log

If the adhoc/interactive results are not saved, .unsaved is appended to the log file name.

The $NH_HOME/log/discoverResults.log file will be created for an adhoc/interactive discovery as
well. This log contains the listing of findings seen in the discover results window.

A poller audit log will also be created in the $NH_HOME/log directory. This log contains a listing of all
of the changes made to the poller configuration when the results are saved. These log files have the
following format:

pollerAudit.date.time.log

For scheduled discoveries with the 'Save Results' option selected, a discover..log will be created in the
$NH_HOME/log directory which contains the information which would have been displayed in the
Discover UI if this discover was run interactively.

Like the adhoc discovery above, a discoverResults.log and pollerAudit log will be created containing the
same information as documented above.

The scheduled discovery process will also create a discoverScheduled log. This log file has the same
naming convention and contents as the discoverInteractive log described above.

For scheduled discoveries with the 'Report only' option selected, the same log files will be created
containing the same information as the scheduled discovery with the 'Save Results' option selected with
the exception of the pollerAudit log. Because no changes are made to the configuration, this log is not
created.

3. How do Discoveries Impact My License Consumption?


Discovering a particular element within a device does not necessarily consume a license. For example, if
a router discovery is performed without the LAN/WAN option selected, all of the router's certified
interfaces will also be discovered. These interfaces will not consume a license, however, as they simply
contribute to the aggregate values reported by the parent router element.

If the LAN/WAN option was selected, or the 'Include in LAN/WAN reports' option selected within the
poller configuration UI then the interfaces would be actively polled to report individual statistics and
therefore a poller license would be consumed for each respective interface.

The same scenario exists during Server discoveries. Several disks, partitions, CPUs, etc. may be
discovered and actively polled, but once again these elements simply provide aggregate variables to the
parent Server element and therefore do not consume a poller license. The LAN/WAN elements of a server
are subject to the same scenario as described in the Router discovery example above.

Turning off polling for an aggregated element will not impact the total available licenses, while disabling
polling for non-aggregated elements will impact the total available licenses. It must be noted however that
disabling polling for aggregate elements will impact the total statistics reported by the parent device.

Other element types such as RAS and Process Sets share similar parent-child relationships, and license
usage for those technologies is similar to that described above.

In addition, weighted licensing of certain Technology Types such as Wireless Access Points and Mobile
Wireless devices will affect license consumption. Weighted licensing simply indicates that certain
element types will consume more than 1 license per element. For example, PDSN elements will utilize
1000 statistical licenses per element. This is due to the amount of information that 1 PDSN element
provides.

Understanding eHealth element license consumption will allow the eHealth Administrator to better
manage and track license usage. The eHealth console provides Administrators with information regarding
license usage. Additionally, starting in eHealth 5.0.2 P05 and eHealth 5.5 P03, the command
nhListElementLicenses can be used to identify which elements are using a license. This command will
output a list of all elements with a number next to each. If the element has a -1 next to it, then it does not
use a license.

However, if the element has a positive number next to it, then it does use a license. If the same number is
next to several elements, then all of those elements only use one license. The numbers will increment with
each license used, so the bottom number is the total licenses being used.

For example, the following is the output from the nhListElementLicenses command:

2 sysName-SH
2 sysName-SH-/
2 sysName-SH-/export
2 sysName-SH-/opt
2 sysName-SH-/tmp
2 sysName-SH-/var
2 sysName-SH-/var/run
2 sysName-SH-Cpu-1
2 sysName-SH-disk-dad0c0t0d0s0
2 sysName-SH-disk-dad1c0t2d0s0
2 sysName-SH-disk-sd0c0t1d0s0
3 sysName-SH-enet-port-2

In this case, one license is being used by all the elements except sysName-SH-enet-port-2. This element
utilizes its own license, as can be seen by the count incrementing to three. Each additional element
that uses its own license increments the count by one.
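
To summarize such a listing, the output can be grouped by license number. The following Python sketch assumes each line has the form "number elementName", as in the sample above; the function name group_by_license and the final sample line with -1 are hypothetical and were added only to show the unlicensed case.

```python
from collections import defaultdict

# Shortened copy of the sample output; the last line is a made-up unlicensed element.
SAMPLE = """\
2 sysName-SH
2 sysName-SH-/
2 sysName-SH-Cpu-1
3 sysName-SH-enet-port-2
-1 sysName-SH-example-unlicensed
"""


def group_by_license(output):
    """Group element names by license number; -1 means no license is used."""
    groups = defaultdict(list)
    unlicensed = []
    for line in output.splitlines():
        if not line.strip():
            continue
        number, name = line.split(None, 1)
        if int(number) < 0:
            unlicensed.append(name)
        else:
            groups[int(number)].append(name)
    return dict(groups), unlicensed


groups, unlicensed = group_by_license(SAMPLE)
for number in sorted(groups):
    print(number, len(groups[number]), "element(s) share this license")
# The numbers increment with each license used, so the highest one is the total in use.
print("licenses in use:", max(groups))
print("unlicensed:", unlicensed)
```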

4. Explanation of the eHealth Merge Algorithm


Upon element discovery or rediscovery, eHealth executes the merge process to determine whether the
elements discovered by the finder match existing elements in the poller configuration.

The merge process is invoked after the discovery process has finished creating an incoming DCI file. The
following is the DCI attribute search order used to determine whether a discovered element is a "resolved
update", an "unresolved update" or a "new element":

For eHealth 5.5 and earlier:

1. nmsSource
   o The default nmsSource is NH:DISCOVER, and it is hard coded in the discovery process
   o Integration modules and Application Response elements have a different nmsSource
   o The matching search is limited to those elements having the same nmsSource and a non-empty
     nmsId. If a match is found, move to item 2; otherwise a "new element" is created
   o If nmsId is empty, move to item 3
2. uniqueDeviceId
   o Unique attribute for each device in the network, used to distinguish one device from another
   o Assigned by the finder upon discovery
   o By default, it is set to the lowest MAC address found in the device
   o For Cisco routers the chassis-Id is used, except when the first 4 alpha characters in a row
   o In environments where neither the MAC address nor the Cisco SNMP ChassisId is unique, the
     variable NH_DISCOVER_DEVID_IP can help to set the uniqueDeviceId to be one of the following:
       - sysName
       - ipAddress
       - sysName-MAC (sysName-ChassisId or sysName-ipAddress for Cisco Routers)
   o The matching search is limited to those elements having the same uniqueDeviceId
   o If a match is found, move to item 4; otherwise a "new element" is created
3. nmsId (Discover key)
   o Assigned by the finder upon discovery, usually: sysName Descr
   o Unique for each element, except for parent elements
   o Parent elements do not have a discover key, therefore nmsId is not used for matching them
   o If a match is found, a "resolved update" is issued; otherwise move to item 4
   o If more than 1 element within the incoming DCI file has the same (non-empty) nmsId, the
     nmsId is marked as "poisoned" and is not used for element matching
   o When nmsId is empty or "poisoned", the following applies: if there is a match of the first
     item and a combination of the others listed below, an "unresolved update" is issued, even if
     the element has a different nmsSource and/or uniqueDeviceId
       - 2 out of 3 match of uniqueDeviceId, sysName and ipAddr
       - mibTranslationFile
       - All indices (index1, index2, index3 and index4)
       - community string
4. ipAddress, mibTranslationFile, community string, All indices
   o Last resort used to match an element when matching by nmsId failed
   o If these attributes match, an "unresolved update" is issued
   o If no match is found, a new element is created
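
Two of the checks in item 3 lend themselves to a short illustration: detecting "poisoned" nmsId values in the incoming DCI data, and the 2-out-of-3 test on uniqueDeviceId, sysName and ipAddr. The Python sketch below represents each element as a dictionary keyed by attribute name; that representation and both function names are assumptions for this example, not the actual merge code.

```python
from collections import Counter


def poisoned_nms_ids(incoming_elements):
    """Return the non-empty nmsId values that occur more than once in the incoming data."""
    counts = Counter(e["nmsId"] for e in incoming_elements if e.get("nmsId"))
    return {nms_id for nms_id, count in counts.items() if count > 1}


def two_of_three(candidate, existing):
    """True when at least 2 of uniqueDeviceId, sysName and ipAddr are set and equal."""
    fields = ("uniqueDeviceId", "sysName", "ipAddr")
    matches = sum(
        1 for f in fields if candidate.get(f) and candidate.get(f) == existing.get(f)
    )
    return matches >= 2


incoming = [
    {"nmsId": "rtr1 Serial0"},
    {"nmsId": "rtr1 Serial0"},
    {"nmsId": "rtr1 Ethernet0"},
    {"nmsId": ""},
]
print(poisoned_nms_ids(incoming))
```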

For eHealth version 5.6 and above:

1. nmsSource
   o The default nmsSource is NH:DISCOVER, and it is hard coded in the discovery process
   o Integration modules and Application Response elements have a different nmsSource
   o The matching search is limited to those elements having the same nmsSource
   o If a match is found, move to item 2; otherwise a "new element" is created
2. deviceHashKey
   o New DCI field added in eHealth 5.6
   o NOT visible from the GUI, only through DCI
   o Assigned during the merge to uniquely identify each device within the configuration
   o The matching search is limited to those elements having the same deviceHashKey
   o If a match is found, move to item 3; otherwise a "new element" is created
   o The following attributes are used to determine the uniqueness of the device:
       - uniqueDeviceId
       - ipAddress
       - sysName
       - ifPhysicalAddress cloud (list of all the physical addresses in the device)
       - ifIpAddress cloud (list of all the IP addresses in the device)
3. UDP Port, SNMP enterprise ID, parent mtf (if any)
   o Used to identify multiple SNMP agents running in the same host
   o UDP Port and enterprise ID must match
   o The matching search is limited to those elements sharing these attributes
   o If a match is found, move to item 4; otherwise a "new element" is created
4. nmsId
   o Used to uniquely identify an element within the device
   o If a match is found, a "resolved update" is issued; otherwise move to item 5
5. ifPhysAddr, dbId (in case of remote polled elements)
   o Used to match a particular interface using its MAC address
   o If a match is found, a "resolved update" is issued; otherwise move to item 6
6. mtf_name, All indices (index1, index2, index3, index4)
   o Last resort to match an element
   o If a match is found for all attributes, a "resolved update" is issued; otherwise a "new
     element" is created

The merge algorithm was rewritten in eHealth release 5.6 to perform a more reliable comparison with
existing and new elements. This new algorithm greatly reduces the likelihood of duplicate element or
unresolved new elements being created during the merge. To further reduce the likelihood of duplicate
element creation, please refer to section 4.4, Reconciling and Avoiding Duplicate Element Creation.
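
The search order for eHealth 5.6 and above can be sketched in Python as a narrowing phase (items 1 to 3) followed by a matching phase (items 4 to 6). This is a simplified illustration under several assumptions: elements are dictionaries, the key names (udpPort, enterpriseId, mtfName and so on) are invented for the example, deviceHashKey is treated as already computed, and the parent mtf and dbId comparisons are left out.

```python
# Items 1-3: each step narrows the candidates; no candidates left means a new element.
NARROWING_STEPS = [
    ("nmsSource",),
    ("deviceHashKey",),
    ("udpPort", "enterpriseId"),
]

# Items 4-6: (attributes that must all match, attribute that must be non-empty).
MATCHING_STEPS = [
    (("nmsId",), "nmsId"),
    (("ifPhysAddr",), "ifPhysAddr"),
    (("mtfName", "index1", "index2", "index3", "index4"), "mtfName"),
]


def merge_56(incoming, existing):
    """Return ("resolved update", element) or ("new element", None)."""
    candidates = list(existing)
    for keys in NARROWING_STEPS:
        candidates = [
            e for e in candidates if all(e.get(k) == incoming.get(k) for k in keys)
        ]
        if not candidates:
            return "new element", None
    for keys, required in MATCHING_STEPS:
        if not incoming.get(required):
            continue  # nothing to compare at this step, fall through to the next one
        for element in candidates:
            if all(element.get(k) == incoming.get(k) for k in keys):
                return "resolved update", element
    return "new element", None
```

A rediscovered interface whose nmsId changed but whose MAC address did not would, in this sketch, fall through item 4 and be resolved at item 5.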

5. How does selecting the MIB2 Option Impact My Discovery Results?

The Find MIB2 LAN option allows the finder to locate LAN interfaces which only contain basic MIB2
statistics such as In/Out/Total packets. This method of discovery is useful when a device has an
uncertified SNMP agent installed. When discovering this device, eHealth will generate a basic element
which will allow for reporting of availability and basic packet count information. This option is not
recommended for devices running a certified firmware version as the vendor specific interface will be
discovered allowing for a more robust reporting solution.

IV. Troubleshooting Common Discovery Issues

1. Troubleshooting No Response to SNMP or No Response to Ping errors


These errors indicate that the eHealth discover process timed-out waiting for the Ping or SNMP response
from the target device. The most probable causes of this scenario are:

1. The device is unable to respond to ping or responds to ping outside of the timeout threshold due to
network load.

Ping the device from the command line using the configured eHealth Ping packet size (default =
100 bytes). There are 3 steps that can be taken to resolve this issue:

1. Ensure the device is able to respond to ping and attempt to reduce the load by discovering
during off-peak hours.
2. Disable the discovery ping as described in section 4.1.2
3. Increase the timeout as described in section 4.1.3

2. The device is unable to respond to ping due to protocol restrictions placed on the device or the network
segment on which the device resides.

If a device is unable to respond to ping due to configuration restrictions, the discover ping can be
disabled via the NH_DISCOVER_DISABLE_PING variable. When this variable is set to yes,
eHealth will not attempt to ping the device prior to sending the SNMP requests. This will allow
the discovery of devices which are unable to respond to ping.

3. Either network latency or load on the target device caused the SNMP request to either be dropped by
the device or received/transmitted outside the threshold of the discover timeout.

The NH_DISCOVER_TIMEOUT environment variable specifies the time in seconds that the
discover process waits for a ping response and an SNMP response from a device. Increasing the
value of this variable will allow eHealth to wait longer for device responses. The default value for
the NH_DISCOVER_TIMEOUT variable is equal to 1 second.

To determine the most appropriate 'timeout' value, perform a discovery from the command line:

*NOTE: Command line discovery results are output to the display (or a file) and not saved to the
poller configuration and database.

As the $NH_USER:

1) cd to the $NH_HOME/bin directory

2) Run the command:

nhDiscover -c community string -mode mode1 [mode2] -t timeout IP Address

where:
mode = "lanWan", "router/Switch", "dialog", "server", "application", "modemPool",
"ras", "respelements"

timeout = timeout in seconds (2,3,4...)

Example: nhDiscover -c public -mode "router", "lanwan" -t 10 192.168.25.25

Once the minimum timeout has been determined, modify the setting of the
NH_DISCOVER_TIMEOUT variable to that value.

4. The SNMP agent on the target device is not running or is unresponsive.

Verify that the SNMP agent is properly configured by obtaining a MIB dump of the device using
the nhSnmpTool utility.

5. The port on which the SNMP agent on the target device is running is not configured in the following
eHealth variables:

NH_DISCOVER_PORTS
NH_DISCOVER_SERVER_PORTS
NH_DISCOVER_APPLICATION_PORTS
NH_DISCOVER_RESPONSE_PORTS

Determine the port on which the SNMP agent is running and add that port number to the
appropriate NH_DISCOVER_* variable(s).
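
When searching for the minimum timeout as described under cause 3 above, it can help to generate the series of command lines in advance. The Python sketch below only builds and prints the commands; it does not run them. It uses only the -c, -mode and -t options shown in the syntax above, and it assumes that multiple modes are passed as separate arguments. The function name discover_commands and the range of timeouts are choices made for this example.

```python
import shlex


def discover_commands(community, modes, address, timeouts=(2, 3, 4, 5)):
    """Yield one nhDiscover command line per timeout value, in increasing order."""
    for timeout in timeouts:
        argv = ["nhDiscover", "-c", community, "-mode", *modes, "-t", str(timeout), address]
        yield shlex.join(argv)


for command in discover_commands("public", ["router", "lanwan"], "192.168.25.25"):
    print(command)
```

Each printed line can then be run by hand as the $NH_USER from the $NH_HOME/bin directory, stopping at the first timeout for which the device responds.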

2. Troubleshooting No MIB Support for this Agent errors

This error message indicates that the finder process was unable to match the agent in
question with the coded list of supported agents. Verify that the device in question is in fact certified via
the Concord Communications Device Certification matrix:
http://www.concord.com/devices/html/default.html

If the device in question is not listed as certified, you can either:

1. Submit a certification request to have the device agent reviewed for certification via:
http://license.concord.com/custserv/certification.htm

Additional information on the Concord Communications certification policy can be found at:
http://www.concord.com/devices/cert_policy.asp

2. Rediscover the device using the Find MIB2 LAN option to attempt discovery of any MIB2
LAN ports on the device. See section 3.5 for additional information regarding this option.

3. Troubleshooting SystemEDGE Discovery Issues


Troubleshooting SystemEDGE discovery issues can be accomplished via the following method:

1. Verify valid licensing on SystemEDGE server

a. Windows: use the Sysedge_Home/setup c v command to validate license found and
installation issues

Example of valid return:

setup: Found valid license key.
No problems detected in SystemEDGE installation or license.

1. If valid key was not found:

Verify license string from sysedge.lic in the winnt/system32 directory.

If necessary, obtain a valid license from Concord Communications Licensing.

b. UNIX: examine the system log for errors relating to SystemEDGE licensing.

1. If valid key was not found:

Verify license string from sysedge.lic in the /etc directory.

If necessary, obtain a valid license from Concord Communications Licensing.

2. Ensure the agent is running on the system in question.

a. Windows: ensure the SNMP service is running.

b. UNIX: verify the agent is running using the ps -ef | grep sysedge command.

3. Verify the agent's running port

a. UNIX: Use the 'ps -ef | grep sysedge' command to locate the process and port entry

b. Windows: SystemEDGE runs as a sub-agent of the Windows master SNMP agent. This will
usually be port 161, but can be verified in the winnt/system32/drivers/etc/services file.

Example:

NeWS 144/tcp news
sgmp 153/udp sgmp
tcprepo 158/tcp repository
snmp 161/udp snmp
snmp-trap 162/udp snmp

4. Verify agent is running in full mode from the SystemEDGE system:

Syntax: sysvariable ipaddress:port community string V O a

Example of correct output:

System: CORVALLIS Build 1381, Service Pack 5 4.0 4.0 Patchlevel 1
SystemEDGE Mode: fullMode(1)
AgentVersion: 4.0 Patchlevel

Example of error indicating problem:

snmprecv timeout

5. Verify agent can communicate with eHealth system

a. Ping SystemEDGE system from eHealth system to verify network connectivity

b. Use the sysvariable command from the eHealth system using the SystemEDGE system's
IP address.

c. Use the walktree command to verify valid SNMP communication from agent to
eHealth.

Syntax: walktree community IP addr:port mibpath outfile number of retries

Example: walktree public 192.168.18.208:161 1.3.6.1.2.1.1.3 walk.out 3

d. Use the nhSnmpTool command to verify whether eHealth can complete a successful
mibdump of the agent.

Syntax: nhSnmpTool c community Server t 8 ret 8 IP address of sysedge system

Example: nhSnmpTool c public Server t 8 ret 8 192.168.18.208

6. Verify environment variable settings

a. Verify that the NH_DISCOVER_SERVER_PORTS variable is set to the port on which the
agent is running.

b. Verify that discover is using this port by examining the discover.log created

UNIX: examine the appropriate resource file for correct discover port entry

NT: use the system tab to examine variable settings

7. Run a command line discovery and modify the configurable values

a. Determine if any lan/wan ports can be discovered

Syntax: nhDiscover -mode lanwan -c community string -ret 8 -t 8 -o $NH_HOME/tmp/out.log -res $NH_HOME/tmp/res.log ip address

b. Determine if the agent can be discovered from the command line with forced timeout and retry values.

Syntax: nhDiscover -mode server -c community string -ret 8 -t 8 -o $NH_HOME/tmp/out.force.log -res $NH_HOME/tmp/res.force.log ip address

8. Examine the files created by the command line discovery for potential problems:

a. out.log: output of the lan/wan discovery

b. res.log: DCI formatted output of the lan/wan discovery

c. out.force.log: output of the server discovery run with increased timeout and retry values

d. res.force.log: DCI formatted output of the server discovery run with increased timeout and retry values
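Steps 3b and 5c above can be sketched in a short script. This is a minimal illustration only: it reads the SNMP port from services-file text in the format of the example in step 3b, then assembles a walktree command line in the argument order given in step 5c. The function names are hypothetical and are not part of eHealth or SystemEDGE.

```python
def find_service_ports(services_text, service="snmp"):
    """Return (port, protocol) pairs listed for a service name in services-file text."""
    matches = []
    for raw_line in services_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # ignore trailing comments
        fields = line.split()
        if len(fields) < 2 or fields[0].lower() != service.lower():
            continue
        port_text, _, protocol = fields[1].partition("/")
        if port_text.isdigit():
            matches.append((int(port_text), protocol.lower()))
    return matches


def build_walktree_command(community, address, port, mib_path, outfile, retries):
    """Assemble walktree arguments in the order shown in step 5c (hypothetical helper)."""
    return ["walktree", community, f"{address}:{port}", mib_path, outfile, str(retries)]


# Sample data copied from the services file example in step 3b.
SAMPLE_SERVICES = """\
NeWS 144/tcp news
sgmp 153/udp sgmp
tcprepo 158/tcp repository
snmp 161/udp snmp
snmp-trap 162/udp snmp
"""

port, _protocol = find_service_ports(SAMPLE_SERVICES)[0]
command = build_walktree_command(
    "public", "192.168.18.208", port, "1.3.6.1.2.1.1.3", "walk.out", 3
)
print(" ".join(command))
```

The resulting argument list could be passed to subprocess.run on a system where walktree is installed; the sketch only prints it so that it can be reviewed first.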

4. Reconciling and Avoiding Duplicate Element Creation

In most cases, the rediscovery of existing elements will result in resolved updates.
However, network environments are always changing, and duplicate elements can be created
when the merge algorithm fails to resolve an update because the attributes of the original and
newly discovered elements differ. A duplicate element is one whose name, as generated by the
eHealth element naming convention, matches the name of an existing element. eHealth attempts to
ensure uniqueness by appending -A (or -B, -C, etc.) to the newly found element's name.

To minimize this possibility, we recommend rediscovering the elements within the eHealth
configuration on a regular basis. This limits the number of updates that occur at once by minimizing
the time between updates.

If a duplicate is created, examine the original and duplicate elements and determine whether eHealth
should have merged them into one. Once that assessment has been made, take note of the
following attributes of the duplicate element from the eHealth Discover UI:

Hardware ID (uniqueDeviceId)
Discover Key (nmsId)
System Name (sysName)
Agent Type (mibTranslationFile)

First delete the new element and update the original with these attributes, then rediscover to update all
other attributes.

If there are more than a few elements to be reconciled, it is advisable to write a script to perform
these operations using DCI. In some cases, the duplicate element might have been collecting data for
quite a while, so it is up to the customer/end user whether to delete the duplicate and keep the original
element, or to delete the original element and keep the duplicate.
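As a starting point for such a script, the sketch below scans a list of element names for suspected duplicates. It assumes that a duplicate carries a single-letter suffix such as -A or -B and that the unsuffixed original is still present in the configuration. The element names in the sample are invented for illustration, and the function is not an eHealth utility.

```python
import re

# Assumption: a duplicate is named "<original>-<single uppercase letter>".
DUPLICATE_SUFFIX = re.compile(r"^(?P<base>.+)-(?P<letter>[A-Z])$")


def find_suspected_duplicates(element_names):
    """Map each original element name to the suffixed names that appear to duplicate it."""
    known = set(element_names)
    suspects = {}
    for name in element_names:
        match = DUPLICATE_SUFFIX.match(name)
        # Only flag a name when the unsuffixed original also exists.
        if match and match.group("base") in known:
            suspects.setdefault(match.group("base"), []).append(name)
    return suspects


# Hypothetical element names, for illustration only.
names = ["router1-Se0", "router1-Se0-A", "router1-Se0-B", "server-A", "switch2"]
print(find_suspected_duplicates(names))
```

Names such as server-A are not flagged because no element named server exists, which keeps legitimately named elements out of the list. Each flagged pair still needs the manual assessment described above before anything is deleted.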

V. General Discovery Best Practices

1. Avoiding Duplicate Elements by Implementing a Strong Change Control Process

A strong change management process is essential to ensuring an accurate and stable eHealth element
configuration. The eHealth Administrator should work in concert with the Network Administrators to
ensure that the eHealth Administrator is aware of when changes are going to be implemented.

It is strongly suggested that, prior to a device change occurring, the eHealth elements associated with that
device be rediscovered. This ensures that the eHealth device configuration is current before the change
occurs. After the device change has been made, an additional rediscovery should be performed to
ensure that the new configuration is reflected in the eHealth configuration.

This method will minimize the number of device changes detected by eHealth at one time thereby
minimizing the chance for duplicate element creation. For additional information on this topic, please
view sections 3.4 and 4.4.

2. Minimizing data loss through Statistics poller error analysis and noDbDataFor tools

The eHealth installation includes valuable tools such as the nhListElements command to assist in
configuration management. That utility includes a noDbDataFor flag which creates a list of all elements
that have not reported data in the configured amount of time.

This usually means that an element either has polling disabled or is experiencing polling errors which are
causing eHealth to not insert data into the database for that element. That list can be used as a to-do list
of elements which should be rediscovered or investigated further. The rediscovery should resolve any
conflicts which may be causing the polling errors in question.

nhListElements

The nhListElements command displays a simple list of eHealth element names using selected
criteria. You can use arguments to filter the list and create specific lists of elements. You can also
redirect the output of this command as input to other commands to modify your poller
configuration file, such as nhModifyElements, nhDeleteElements and nhPopulateGroup.
Syntax
The nhListElements command uses the following syntax:
nhListElements [-h] [-rev] [-showTypes] [-showDciFields]
nhListElements [-elements] [-outfile filename]
nhListElements -rebooted [-outfile filename]
nhListElements -where "whereClause" [-outfile filename]
nhListElements -elemType type [-outfile filename]
nhListElements -noDbDataFor hours [-outfile filename]
nhListElements -groupType groupType -inGroup group [-outfile filename]

-noDbDataFor hours

Lists only those elements for which eHealth has not collected data and added it to the database for
the number of hours specified, and for which there is not any alarm data. This command allows
you to produce a list of elements that eHealth is not currently polling, or elements that it is
currently polling but that have poll errors. You cannot use this argument in combination with any
other nhListElements argument except for -outfile. You cannot use this argument on the central
site to return data in a remote polling environment. You must run it on the remote systems.
NOTE: In eHealth 5.0.2 and earlier, the nhListElements command lists only elements that have
polling turned on.
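The to-do list described above could be produced with a small wrapper along the following lines. It is a sketch under stated assumptions: only the -noDbDataFor and -outfile arguments from the syntax above are used, the output file is assumed to contain one element name per line, and the helper names are hypothetical.

```python
import subprocess


def build_no_data_command(hours, outfile):
    """Build the nhListElements argument list for elements with no recent data."""
    if hours <= 0:
        raise ValueError("hours must be a positive number")
    return ["nhListElements", "-noDbDataFor", str(hours), "-outfile", outfile]


def parse_element_list(text):
    """Return non-empty element names, assuming one name per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def list_elements_without_data(hours, outfile):
    """Run nhListElements on the eHealth system and return the names it wrote."""
    subprocess.run(build_no_data_command(hours, outfile), check=True)
    with open(outfile, encoding="utf-8") as handle:
        return parse_element_list(handle.read())
```

On a system with eHealth installed, list_elements_without_data(24, "nodata.txt") would return the elements that have not reported data in the last 24 hours. In a remote polling environment the script would have to run on the remote systems, as noted above.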

3. Using seed files to automate incremental configuration updates

The eHealth discover mechanism allows for the use of seed files during the discovery process. A seed
file is simply a text file containing a list of IP address and community string pairs, one pair per line.
This provides the eHealth Administrator with an easy way to discover groups of elements and to automate
the discover process.

Example:

# Server 1
10.100.10.32 private
# Server 2
10.100.10.33 public

Each seed file should contain devices of a single technology type so that the discover is run against
the correct technology and there are no mismatches. For example, a router discovery of a server may
actually produce a router element, because a server can act as a routing device.

The eHealth poller configuration file ($NH_HOME/poller/poller.cfg) can also be used as a rediscovery
seed file but this is only recommended for small configurations. Larger configurations should not utilize
this method as a rediscovery of the entire configuration causes a severe performance impact to the
eHealth server. It is recommended that the rediscovery target a portion of the configuration when using
seed files in large configurations.
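One way to target a portion of a large configuration is to split a seed file into smaller batches and rediscover them one at a time. The sketch below assumes the two-column format shown in the example above (IP address, then community string, with # comment lines); the function names and the batch size are illustrative choices, not eHealth settings.

```python
import ipaddress


def parse_seed_file(text):
    """Return (address, community) pairs from seed-file text, skipping comments."""
    entries = []
    for number, raw_line in enumerate(text.splitlines(), start=1):
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"line {number}: expected 'address community', got {raw_line!r}")
        address, community = fields
        ipaddress.ip_address(address)  # raises ValueError if the address is malformed
        entries.append((address, community))
    return entries


def split_into_batches(entries, batch_size):
    """Split the entries into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [entries[i:i + batch_size] for i in range(0, len(entries), batch_size)]


# Sample data copied from the seed file example above.
SEED_TEXT = """\
# Server 1
10.100.10.32 private
# Server 2
10.100.10.33 public
"""

for batch in split_into_batches(parse_seed_file(SEED_TEXT), 1):
    print(batch)
```

Each batch can be written to its own seed file and discovered separately, which spreads the load on the eHealth server over several smaller runs.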

4. Self Monitoring the eHealth System using Process Set Creation


Using the SystemEDGE agent, the eHealth system can be set up to monitor itself through the creation of
eHealth Process Sets. An eHealth and an Oracle (or Ingres) process set can be created to provide detailed
statistics on the eHealth and Database processes. These statistics can be used in capacity planning as well
as within LiveHealth monitoring to ensure application stability.

Process Sets are created in the eHealth discover UI using the Find Processes > Define option. Two new
process sets should be created, one for eHealth and one for the Database, using the following processes
and parameters:

Process Set eHealth

o 5.5 and Above:

nhiArControl
nhiCfgServer
nhiDbServer
nhiLiveExSvr
nhiMsgServer
nhiNotifierSvr
nhiPoller
Argument: remote No
nhiPoller
Argument: -live
nhiPoller
Argument: -dlg
nhiPoller
Argument: -import
nhiReplServer
nhiRespServer
nhiRftIn
nhiRftOut
nhiRmtIn
nhiRmtOut
nhiServer
nhiTrapServerCmu

o 5.0.2 and Earlier:

nhiArControl
nhiCfgServer
nhiConsole
nhiDbServer
nhiLiveExSvr
nhiMsgServer
nhiNotifierSvr
nhiPoller
Arguments: none
nhiPoller
Arguments: -live
nhiPoller
Arguments: -dlg
nhiPoller
Arguments: -import
nhiRespServer
nhiServer
nhiTrapServerCmu

Process Set Oracle (for eHealth 5.5 and above)

o ora_arc0_EHEALTH
o ora_arc1_EHEALTH
o ora_ckpt_EHEALTH
o ora_dbw0_EHEALTH
o ora_lgwr_EHEALTH
o ora_pmon_EHEALTH
o ora_reco_EHEALTH
o ora_smon_EHEALTH

Process Set Ingres (for eHealth 5.0.2 and earlier)

o dmfacp
o iigcn
o iidbms
Argument: recovery
o iidbms
Argument: dbms

Each process should have the create if found flag set, match full name set, and the appropriate Operating
System set.

Once the process set has been defined, the eHealth server should be discovered using the read-write
community string to create the appropriate MIB rows. Once discovered, the Record Detailed Data
option can be enabled to allow for individual process data to be reported along with aggregate process set
data.
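Before defining a process set, it can help to confirm which of the listed processes are actually running on the eHealth server. The sketch below compares the Oracle process names from the list above against captured 'ps -ef' output. The sample output is invented for illustration, and the function is a hypothetical helper rather than part of eHealth or SystemEDGE.

```python
# Oracle process names copied from the process set listed above.
ORACLE_PROCESS_SET = [
    "ora_arc0_EHEALTH",
    "ora_arc1_EHEALTH",
    "ora_ckpt_EHEALTH",
    "ora_dbw0_EHEALTH",
    "ora_lgwr_EHEALTH",
    "ora_pmon_EHEALTH",
    "ora_reco_EHEALTH",
    "ora_smon_EHEALTH",
]


def missing_processes(ps_output, expected):
    """Return the expected process names that do not appear in the ps output."""
    seen = set()
    for line in ps_output.splitlines():
        for token in line.split():
            # Keep only the base name so commands shown with a full path also match.
            seen.add(token.rsplit("/", 1)[-1])
    return [name for name in expected if name not in seen]


# Invented 'ps -ef' style output, for illustration only.
SAMPLE_PS = """\
oracle 1201 1 0 08:00 ? 00:00:01 ora_pmon_EHEALTH
oracle 1203 1 0 08:00 ? 00:00:02 ora_smon_EHEALTH
"""

print(missing_processes(SAMPLE_PS, ORACLE_PROCESS_SET))
```

Any name reported as missing is worth checking before the discovery is run, since the process set is intended to match processes that exist on the system.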

5. Effectively Interfacing with Concord Technical Support to Resolve Discovery Issues

When the situation arises that it is necessary to contact Concord Technical Support, it is important to
provide Support with the information necessary to troubleshoot the issue. When dealing with Discovery
issues, the following information is often vital to the troubleshooting process:

A clear description of the problem
What may have changed in the environment since the issue has occurred
The current eHealth patch and certification level (including any verification kits that may have
been installed)
The OS of the eHealth system
Any discovery logs generated detailing the issue
The poller configuration file ($NH_HOME/poller/poller.cfg)
A tar archive of the $NH_HOME/log directory
A tar archive of the $NH_HOME/tmp/nhiCfgServer directory
A full MIB dump (stages 1 & 2) of the device in question using the nhSnmpTool command

Although the above information and files may appear to be unrelated, in the majority of instances this
information is required during the troubleshooting process. Providing it to the Technical Support
Engineer when you first contact Concord Technical Support will dramatically reduce the time
needed to gather everything required to resolve the issue.
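Gathering the files from the list above can be scripted. The sketch below uses the Python standard library to collect the log directory, the tmp/nhiCfgServer directory, and the poller configuration file into a single compressed archive. Bundling everything into one archive, the archive name, and the function name are assumptions made for illustration; the list above asks for separate tar archives, which would work equally well.

```python
import os
import tarfile


def collect_support_bundle(nh_home, bundle_path):
    """Archive the support files found under nh_home and return the paths that were added."""
    # Paths relative to $NH_HOME, taken from the list of requested information.
    members = [
        "log",
        os.path.join("tmp", "nhiCfgServer"),
        os.path.join("poller", "poller.cfg"),
    ]
    added = []
    with tarfile.open(bundle_path, "w:gz") as bundle:
        for relative in members:
            full_path = os.path.join(nh_home, relative)
            if os.path.exists(full_path):
                # Store paths relative to $NH_HOME so the archive layout is predictable.
                bundle.add(full_path, arcname=relative)
                added.append(relative)
    return added
```

A call such as collect_support_bundle(os.environ["NH_HOME"], "/tmp/ehealth_support.tar.gz") would create the archive. Items that do not exist are skipped, so the returned list shows what still has to be gathered by hand.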

VI. Changes to the Discovery Process in eHealth 5.6.x

The main change to the Discover process in eHealth 5.6 is the revised merge algorithm.
eHealth no longer uses an element's discoverKey to determine uniqueness but instead relies on a
deviceHashKey. This new key allows for a greater level of accuracy when determining whether changes to a
device constitute a new element or an update to an existing element. Additional information on the
discover algorithm changes can be found in section 3.4, Explanation of the eHealth Merge Algorithm.

VII. Other Resources

In addition to this document, there are many other resources available to the eHealth Administrator to
assist in the management of the Discover process and the eHealth element configuration. These include,
but are not limited to:

Standard eHealth Documentation (TotalDoc)

o eHealth Administration Guide
o http://www.concord.com/support/secure/products/search_prod.shtml

Concord Communications White Papers

o http://www.concord.com/support/secure/tech_wht_papers.shtml
o The eHealth Discovery and Certification Process White Paper
http://www.concord.com/support/secure/n_nhdisc.shtml
o The MIB2 LAN Element Type White Paper
http://www.concord.com/support/secure/n_mib2lan.shtml
o Customizing Element Names White Paper
http://www.concord.com/support/secure/n_customize.shtml

Concord Communications Knowledgebase

o http://search.support.concord.com
o Applicable Solution IDs:

PrimusTrain202
TS2906
TS15008
TS13051
TS11641
TS13242
PrimusTrain195
PrimusTrain90
TS11673
TS15008
TS4602
TS13577
TS14359

Scripts Contributed by Concord Employees and Customers:


o Concord Knowledgebase Solution # TS13791
