The Problem with Duplicate CIs

Duplicate configuration items (CIs) are a serious problem for any Configuration Management Database (CMDB). Their consequences include:

  • Inaccurate inventory and asset reports
  • Unnecessary spending on new licenses and maintenance
  • Confusion when users submit ticket requests
  • Difficulty reporting on incident, change, and problem trends
  • More difficult configuration management
  • Reduced trust in configuration management

Getting rid of duplicate CIs is always a high priority for any configuration management team.

How ServiceNow Discovery Prevents Duplicates

Each configuration item record is uniquely identified by one or more field values specified in a CI Identifier Rule. A record is considered a duplicate if the field values that uniquely identify it match those of another record of the same class. By default, configuration items of class Hardware and all its subclasses (Computer, Server, Unix Server, Windows Server) use the OS Serial Number as a unique identifier.

When ServiceNow Discovery or Service Mapping discovers a CI, the sensors and patterns use the internal function SNC.IdentificationEngineScriptableApi.createOrUpdateCI to send the CI to the CMDB. If a CI with the same identifier attributes already exists for that class, it is overwritten; otherwise, a new CI is inserted. Here is sample JavaScript code invoking this function:

_________________________________________________________________________

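// Payload describing one CI: its class and the attribute values to store
var payload = {
    items: [{
        className: 'cmdb_ci_aix_server',
        values: {
            name: 'Aix Server 900',
            asset_tag: 'Asset 900',
            ip_address: '10.20.30.11',
            mac_address: 'ABCD1234',
            ram: '4096',
            cpu_name: 'SNow',
            serial_number: '123456783',
            cpu_type: 'SNow'
        }
    }]
};

// Serialize the payload to a JSON string
var jsonUtil = new JSON();
var input = jsonUtil.encode(payload);

// Send the CI to the CMDB; the first argument names the discovery source
var output = SNC.IdentificationEngineScriptableApi.createOrUpdateCI('ServiceNow', input);

// Print the result returned by the identification engine
gs.print(output);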

______________________________________________________________________________________
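
The insert-or-overwrite decision described above can be pictured with a small stand-alone model. This is only a sketch in plain JavaScript that runs outside the platform, not the identification engine's implementation: the in-memory cmdb array, the identifyAndUpsert helper, and the single-attribute rule on serial_number are all assumptions made for illustration.

```javascript
// Illustrative model only. An array stands in for the CMDB and a list of
// field names stands in for a CI Identifier Rule. None of these names are
// ServiceNow APIs.
function identifyAndUpsert(cmdb, item, identifierFields) {
    // Find existing CIs of the same class whose identifier fields all match.
    var matches = cmdb.filter(function (ci) {
        return ci.className === item.className &&
            identifierFields.every(function (field) {
                return ci.values[field] === item.values[field];
            });
    });

    if (matches.length > 0) {
        // A matching CI exists: overwrite its fields with the payload values.
        var existing = matches[0];
        Object.keys(item.values).forEach(function (field) {
            existing.values[field] = item.values[field];
        });
        return 'UPDATE';
    }

    // No match: insert the payload as a new CI.
    cmdb.push(item);
    return 'INSERT';
}

var cmdb = [];
var firstResult = identifyAndUpsert(cmdb, {
    className: 'cmdb_ci_aix_server',
    values: { name: 'Aix Server 900', serial_number: '123456783' }
}, ['serial_number']);

var secondResult = identifyAndUpsert(cmdb, {
    className: 'cmdb_ci_aix_server',
    values: { name: 'Aix Server 900 (renamed)', serial_number: '123456783' }
}, ['serial_number']);
```

In this model the second payload carries the same serial number as the first, so it overwrites the existing record instead of adding a second one.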

 

Since the Geneva release, CI Identifier Rules can be edited through a simple user interface in the CI Class Manager.  The interface lets users select which attributes identify CIs of a particular class, and it allows multiple identification rules to be applied to the same class.

ci_id.png

The CI Identification Rules for the Hardware class

 

How Duplicate CIs Get Into the CMDB

For better or worse, there are many ways to insert CI records into the CMDB without checking whether a matching CI is already there.  The REST API Explorer and the GlideRecord object are just two mechanisms that insert records into tables without checking them against the identification rules.  As a result, CI records with identical identifying attributes will find their way into the CMDB.
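
To make the result of such unchecked inserts concrete, the sketch below groups a list of CI records by class and serial number and reports every group with more than one member. It is plain JavaScript over a hypothetical in-memory array (for example, rows exported from cmdb_ci). The findDuplicateGroups helper, the sample sys_id values, and the choice of serial_number as the only identifier are assumptions, not the platform's identification logic.

```javascript
// Illustrative only: groups records by class + identifier value and returns
// the groups that contain more than one record.
function findDuplicateGroups(records, identifierField) {
    var groups = {};
    records.forEach(function (rec) {
        var value = rec[identifierField];
        if (!value) {
            return; // records without an identifier value cannot be matched
        }
        var key = rec.sys_class_name + '|' + value;
        groups[key] = groups[key] || [];
        groups[key].push(rec.sys_id);
    });

    return Object.keys(groups)
        .filter(function (key) { return groups[key].length > 1; })
        .map(function (key) { return { key: key, sys_ids: groups[key] }; });
}

// Hypothetical sample data: two records share a class and serial number.
var records = [
    { sys_id: 'a1', sys_class_name: 'cmdb_ci_aix_server', serial_number: '123456783' },
    { sys_id: 'b2', sys_class_name: 'cmdb_ci_aix_server', serial_number: '123456783' },
    { sys_id: 'c3', sys_class_name: 'cmdb_ci_aix_server', serial_number: '987654321' }
];

var duplicateGroups = findDuplicateGroups(records, 'serial_number');
```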

Detecting Duplicate CIs on Discovery

When a CI payload is inserted with the SNC.IdentificationEngineScriptableApi.createOrUpdateCI function, the function does more than check whether a CI exists to overwrite.  It also checks whether several matching CIs are already in the CMDB.  If more than one CI matches the identification attributes for that class, a de-duplication task is created that points to all the duplicate CIs.  In addition, the discovery_source field of each duplicate record is set to the text “duplicate”, which marks the record.  These configuration items are now identified as duplicate CIs and appear as duplicates on the CMDB Health Dashboard.
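
The marking step can be sketched as follows. This is a simplified model in plain JavaScript, not the engine's code: the flagDuplicates helper and the shape of the returned task object are hypothetical, and the field is written as discovery_source.

```javascript
// Illustrative only: given the CIs that matched one payload, mark them and
// build a task object that points to all of them.
function flagDuplicates(matches) {
    if (matches.length <= 1) {
        return null; // zero or one match: nothing to de-duplicate
    }
    matches.forEach(function (ci) {
        ci.discovery_source = 'duplicate';
    });
    return {
        type: 'de-duplication task', // hypothetical shape
        duplicate_cis: matches.map(function (ci) { return ci.sys_id; })
    };
}

// Hypothetical sample: two CIs matched the same identification attributes.
var matchedCis = [
    { sys_id: 'a1', discovery_source: '' },
    { sys_id: 'b2', discovery_source: '' }
];
var task = flagDuplicates(matchedCis);
var noTask = flagDuplicates([{ sys_id: 'c3', discovery_source: '' }]);
```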

When duplicates are detected, the CI payload may or may not be written to a record in the CMDB.  This depends on an internal system property in the sys_properties table named glide.identification_engine.skip_duplicates.  By default, the property is set to true.  This means that if the IdentificationEngineScriptableApi function detects duplicate CIs, the CI payload overwrites the oldest duplicate CI, that is, the CI with the oldest date/time in the updated field.  If skip_duplicates is set to false, the CI is rejected and none of its fields are written to the CMDB.
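
The effect of the property can be summarized in a small decision function. This is a sketch under stated assumptions rather than the engine's implementation: resolveDuplicateMatch is a hypothetical helper, the "updated" field is assumed to be sys_updated_on, and the timestamps are written as ISO 8601 strings so that they parse unambiguously.

```javascript
// Illustrative only: decide what happens to a payload given the CIs it
// matched and the value of glide.identification_engine.skip_duplicates.
function resolveDuplicateMatch(matches, skipDuplicates) {
    if (matches.length === 0) {
        return { action: 'insert', target: null };
    }
    if (matches.length === 1) {
        return { action: 'update', target: matches[0] };
    }
    if (!skipDuplicates) {
        // Property is false: the payload is rejected and nothing is written.
        return { action: 'reject', target: null };
    }
    // Property is true: overwrite the duplicate with the oldest update time.
    var oldest = matches.reduce(function (a, b) {
        return new Date(b.sys_updated_on) < new Date(a.sys_updated_on) ? b : a;
    });
    return { action: 'update', target: oldest };
}

// Hypothetical duplicates with different update times.
var duplicates = [
    { sys_id: 'a1', sys_updated_on: '2016-03-01T09:00:00Z' },
    { sys_id: 'b2', sys_updated_on: '2016-01-15T09:00:00Z' }
];
var withSkip = resolveDuplicateMatch(duplicates, true);
var withoutSkip = resolveDuplicateMatch(duplicates, false);
```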

The CMDB Health Dashboard Duplicate Detection

The Configuration Management Database Health Dashboard was introduced in the Helsinki release.  It measures many important CMDB metrics and displays the results on an interactive dashboard.  Duplicate CIs are one of the correctness metrics that it measures.

dup2.png

dup3.png

The Health Dashboard runs a scheduled job called Correctness Score Calculation, which is listed among the scheduled jobs available from CMDB Health Preference.  The Correctness job scans the records in the CMDB (cmdb_ci) whose discovery_source is empty.  It checks each CI against the CI Identification Rules for its class to see whether it is a duplicate.  If it is, the CI is assigned to a de-duplication task and marked as “duplicate”.  If it is not, its discovery_source is set to “Unknown”.  If anything later changes in a CI whose discovery_source is “Unknown”, the field is reset to empty, which makes the CI eligible for the next scan.
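
The way discovery_source moves between values can be written as two small transition functions. This is only a model of the behavior described above, in plain JavaScript. The function names are hypothetical, and the duplicate check itself is passed in as a boolean rather than computed.

```javascript
// Illustrative only: value of discovery_source after the Correctness job
// has looked at a CI. The job only scans CIs whose discovery_source is empty.
function afterCorrectnessScan(ci, isDuplicate) {
    if (ci.discovery_source !== '') {
        return ci.discovery_source; // not scanned, value unchanged
    }
    return isDuplicate ? 'duplicate' : 'Unknown';
}

// Illustrative only: value of discovery_source after something changes in
// the CI. A CI marked "Unknown" is reset to empty so it is scanned again.
function afterCiChange(ci) {
    return ci.discovery_source === 'Unknown' ? '' : ci.discovery_source;
}

var scannedDuplicate = afterCorrectnessScan({ discovery_source: '' }, true);
var scannedUnique = afterCorrectnessScan({ discovery_source: '' }, false);
var skippedCi = afterCorrectnessScan({ discovery_source: 'ServiceNow' }, true);
var changedUnknown = afterCiChange({ discovery_source: 'Unknown' });
```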

dup4.png

After the Correctness Score Calculation job runs, the correctness section of the CMDB Health Dashboard is updated.  If you run this job manually, check the status of the job execution in the cmdb_health_metric_status table and clear the browser cache.

Visual Workflow of the Duplicate Detection Jobs

 

dup5.png

dup6.png