
Developer Community

3 Posts authored by: Johnny Walker

Having a background in System Engineering helps when it comes to administering MID Servers. I wrote a comment the other day about how to configure the wrapper program to auto-restart the JVM based on specific error messages. https://community.servicenow.com/community/operations-management/discovery/blog/2016/06/02/mid-server-crashing-due-to-so…

 

Sometimes it is necessary to start or restart the MID Server. Looking at the MID Server form, you will find the provided UI Action: "Restart MID" (green arrow below). This UI Action sends a command to the ecc_queue for that MID Server (ecc_agent) to pick up and execute just like any other probe or task.

 

MID Server Form.png

But what if the MID Server isn't picking up work from the ecc_queue? If something has gone awry (networking issues or storage problems, including brief disruptions), the MID Server application can end up in a bad state where work isn't picked up from the ecc_queue. Symptoms include log statements in the agent\logs\agent.log.0 file saying that XML is being enqueued, or even SEVERE errors contacting your ServiceNow instance. In these cases and others, there is good reason to restart the Windows Service entirely.

 

Remoting into the Windows host is time consuming and tedious, and it should require some level of access control, including group memberships that must be maintained for employees and contractors. You could instead use the sc.exe command in Windows to restart the service remotely from your workstation. I've used that for a good while and it's not a bad workaround. Writing the wrapper script for doing it inside Cygwin was just difficult enough to make things interesting. However, this solution has portability problems for people not used to Cygwin, and it still requires group memberships and access controls for your employees and contractors to access the MID Server host and restart services.
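As a rough illustration of the sc.exe approach (this is not the Cygwin wrapper mentioned above), the sketch below only builds the two command lines. The function name and the example host are hypothetical, and it assumes the service follows the snc_mid.<MID Server name> naming convention used later in this post.

```javascript
// Build the sc.exe command lines for restarting a MID Server service on a remote host.
// Assumption: the Windows service is named snc_mid.<MID Server name>.
function buildScRestartCommands(host, midName) {
  var service = 'snc_mid.' + midName;
  var target = '\\\\' + host; // sc.exe expects the remote machine as \\ServerName
  // sc.exe has no "restart" verb, so a stop is followed by a start.
  return [
    'sc.exe ' + target + ' stop "' + service + '"',
    'sc.exe ' + target + ' start "' + service + '"'
  ];
}
```

Because sc.exe returns before the service has finished stopping, a real wrapper would probably need to wait or poll between the two commands.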

 

Enter the "Brute Restart MID" UI Action.

 

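Thanks to the PowerShell Probe Script Utility and a standard Windows Service naming convention (snc_mid. followed by the MID Server name, as used for the -name argument in the script below), it is possible to script this as a UI Action. Be sure to set a condition on the UI Action that limits access to your ITOM admins.

 

// Escape single quotes in the MID Server name before embedding it in the command
var target_agent_name = current.name.replace(/'/g, "\\'");
// Windows host that runs the MID Server service
var target_host = current.host_name;
// The service name follows the snc_mid.<MID Server name> convention
var ps_script = 'restart-service -inputobject (get-service -computername '+ target_host
+' -name snc_mid.'+ target_agent_name + ')';
// MID Server that will run the probe, read from the mid.server.rba_default property
var orch_host = gs.getProperty('mid.server.rba_default', 'NONE');
if (orch_host == 'NONE'){
  gs.addErrorMessage('Please set mid.server.rba_default sys_property.');
} else {
  // Queue the PowerShell command for execution by that MID Server
  var powerShellProbe = new PowershellProbeES(orch_host);
  powerShellProbe.setScript(ps_script);
  powerShellProbe.create();
  gs.addInfoMessage('Remote Service Restart Issued.');
}
// Return the user to the MID Server record
action.setRedirectURL(current);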
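One possible hardening of the command string, offered as a sketch rather than as the tested UI Action: PowerShell treats a doubled single quote inside a single-quoted string as one literal quote, so quoting the host and service name this way also tolerates names containing spaces. The helper names psQuote and buildRestartScript are hypothetical.

```javascript
// Quote a value as a PowerShell single-quoted string literal.
// Inside single quotes, PowerShell reads '' as one literal quote character.
function psQuote(value) {
  return "'" + String(value).replace(/'/g, "''") + "'";
}

// Build the same Restart-Service command as the script above, with quoted arguments.
// Assumption: the service is named snc_mid.<MID Server name>.
function buildRestartScript(hostName, midName) {
  return 'Restart-Service -InputObject (Get-Service -ComputerName ' +
    psQuote(hostName) + ' -Name ' + psQuote('snc_mid.' + midName) + ')';
}
```

In the UI Action, the result of buildRestartScript(String(current.host_name), String(current.name)) could be passed to setScript in place of ps_script.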

 

I also added a Business Rule to ecc_queue to move attachments from the "Grab MID logs" action to the MID Server record.

 

MID Server Attachments.png

 

Together these small enhancements simplify the routine tasks associated with MID Server administration. They allow us to keep access to the MID Server host machines more tightly controlled, which is good practice anyway. They also let those of us with access stay off our VPN clients, which is a definite win.

Johnny Walker
Acorio LLC

Discovery is a flexible and powerful tool for building a robust and trustworthy CMDB. If you've spent any time troubleshooting or enhancing Discovery, you may share my impression that Discovery Logs take a while to learn to read. You may have run network Discovery to add your IP Ranges, or you may have imported IP ranges into Discovery schedules with import sets. In either case, you probably have IP Addresses in your Discovery Schedules for devices that you cannot authenticate to.

 

There are many reasons you would see these in your Discovery Logs:

    

Created             | Level   | Short Message                                                        | Source           | Device
2016-06-17 23:59:29 | Warning | SSH authentication or connection failure                             | UNIX Classify    | 172.21.120.12
2016-06-17 23:58:35 | Warning | Authentication failure with the local MID server service credential. | Windows Classify | 172.21.120.132

 

In my case, these devices - and thousands more - are things like workstations or network devices that I already have in my CMDB through an integration source like SCCM or CiscoWorks. Sometimes they are Security appliances that Discovery isn't allowed to have Credentials for. On rare occasions they are devices where authentication is failing even though Discovery should have access. With 30k+ of these authentication failures happening daily, a few issues can be observed:

 

  • Security and Systems Administrators take issue with something failing to authenticate to these devices every day, especially if more than one credential is attempted (and rightfully so!).
  • Configuration Management and ServiceNow Administrators have messy logs to comb through when working with Discovery.
  • Discovery Schedules take longer than needed, and they can already take all night to run if you have a large CMDB.

 

Out of the box, excludes can be added manually, but creating and later maintaining them is tedious. The question was asked: "Can discovery ranges exclude IP addresses dynamically?" I decided to work on it, and the attached Update Set is what I came up with.

 

 

Discovery Status

Here is an example Discovery Status where, upon completion, a Business Rule has processed the Discovery Logs looking for Authentication Failures.

If authentication failures exist, QuickExcludes will query the CMDB looking for Hardware Configuration Items matching those IP Addresses which have Discovery Sources other than Service-Now.

When a match is found, an Exclude Parent for the Discovery Source is created within the Discovery IP Range. Any matching IP Addresses with the same Discovery Source will be added to that Exclude Parent.
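The matching step described above can be sketched in plain JavaScript. This is not the QuickExcludes code from the Update Set; the function name and the shape of the input objects are assumptions made for illustration, with plain arrays standing in for the Discovery Log and CMDB queries.

```javascript
// failedIps:   IP strings taken from authentication-failure log entries
// hardwareCis: objects like {ip_address, discovery_source} standing in for CMDB rows
// ownSource:   the Discovery Source value that Discovery itself writes
function groupExcludesBySource(failedIps, hardwareCis, ownSource) {
  var failed = {};
  failedIps.forEach(function (ip) { failed[ip] = true; });

  var bySource = {};
  hardwareCis.forEach(function (ci) {
    // Only consider CIs whose IP failed authentication
    if (failed[ci.ip_address] !== true) return;
    // Skip CIs with no source, or ones that Discovery itself created
    if (!ci.discovery_source || ci.discovery_source === ownSource) return;
    var ips = bySource[ci.discovery_source] || (bySource[ci.discovery_source] = []);
    if (ips.indexOf(ci.ip_address) === -1) ips.push(ci.ip_address);
  });
  return bySource; // one entry per Discovery Source, i.e. one Exclude Parent each
}
```

Each key of the returned object corresponds to one Exclude Parent, and its array holds the IP Addresses to add under it.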

discovery status with excludes.png

Discovery Schedule

Here is the resulting Discovery Schedule with Excludes added as a Related List to the form view.

Note that the Discovery Source "MS SMS" (SCCM) appears multiple times, as does CiscoWorks. This is because the Discovery Schedule is made up of multiple IP Ranges. Each of these Ranges must have its own Exclude Parent to hold any IP Address Excludes within that IP Range. To logically separate the Excludes, an Exclude Parent for each integration source holds all the Exclude Range Item IPs for that Discovery Source.
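To make the "one Exclude Parent per Range per Discovery Source" layout concrete, here is a small sketch. It is not taken from the Update Set; the range objects ({name, start, end}) and function names are hypothetical, and only IPv4 start/end ranges are handled.

```javascript
// Convert a dotted IPv4 address to a number so ranges can be compared
function ipToInt(ip) {
  return ip.split('.').reduce(function (acc, octet) {
    return acc * 256 + parseInt(octet, 10);
  }, 0);
}

// ranges:   [{name, start, end}] describing the schedule's IP Ranges
// excludes: [{ip, source}] describing IPs to exclude and their Discovery Source
function buildExcludeParents(ranges, excludes) {
  var parents = [];
  ranges.forEach(function (range) {
    var lo = ipToInt(range.start);
    var hi = ipToInt(range.end);
    var bySource = {};
    excludes.forEach(function (ex) {
      var n = ipToInt(ex.ip);
      if (n < lo || n > hi) return; // IP belongs to a different range
      (bySource[ex.source] = bySource[ex.source] || []).push(ex.ip);
    });
    // One parent per (range, source) pair that has at least one IP
    Object.keys(bySource).forEach(function (source) {
      parents.push({ range: range.name, source: source, ips: bySource[source] });
    });
  });
  return parents;
}
```

With two ranges that both contain SCCM-sourced devices, the result contains two "MS SMS" parents, which is why the same source shows up more than once in the Related List.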

discovery schedule with excludes.png

Manual Excludes

 

 

From time to time, IP Addresses may need to be excluded for reasons other than their presence in the CMDB through another Data Source. In this case, a person with the cmdb_admin role can select the IPs in the Discovery Logs list and Manually Exclude them. These Excludes are stored like the others, with the Exclude Parent being the UID of the user who excluded them.

manual quick exclude UI action.png

Maintenance

Once we exclude an IP Address from Discovery, it becomes necessary to know when to expire that exclusion so that if the IP Address is re-assigned, we will again Discover any device using that IP Address. A scheduled job can be run nightly to verify that each Exclude still has an Operational Hardware Configuration Item that matches.

 

var qe = new QuickExcludes();
qe.verifyExcludes();

 

Note: Manual QuickExcludes are not checked for hardware records before creation and are not removed by the scheduled job.
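The expiry rule can be summarized with a short sketch. This is an illustration of the rule as described, not the verifyExcludes implementation; the function name and the {ip, manual} object shape are assumptions.

```javascript
// excludes:       [{ip, manual}] where manual marks a Manual QuickExclude
// operationalIps: IPs that still have an Operational Hardware CI in the CMDB
function partitionExcludes(excludes, operationalIps) {
  var operational = {};
  operationalIps.forEach(function (ip) { operational[ip] = true; });

  var result = { keep: [], expire: [] };
  excludes.forEach(function (ex) {
    // Manual excludes are never expired; others need a matching operational CI
    if (ex.manual || operational[ex.ip] === true) {
      result.keep.push(ex.ip);
    } else {
      result.expire.push(ex.ip);
    }
  });
  return result;
}
```

Anything in the expire list would be removed from its Exclude Parent so that the IP Address is Discovered again on the next run.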

manual quick excludes created.png

Johnny Walker
Acorio LLC

I ran into an issue recently where I discovered that if I set a dot-walked field to mandatory, the form would still submit regardless of the red mandatory indicator.

 

I had worked (what I thought was) a simple Incident to modify a client script to use a newly dot-walked field name instead of the old field name. We had decided that instead of copying the comments journal entries from sc_task up to the request, we would just uniformly use the request comments field. The onChange client script, which required customer communication notes when placing a Catalog Task in a Pending state, was simply overlooked.

 

Easy ticket - "done before my second cup of coffee" - so I thought!

 

I modified the client script and visually confirmed the mandatory indicator on the dot-walked journal field when I placed the sc_task.state field in Pending. The idea being that we should notify the customer why their request is now pending.

 

I returned from a mid-morning meeting to find the incident reporter asking me which environment I wanted her to test in because it still wasn't working. I pulled up a test sc_task record and proudly displayed the red splat of mandatoryness on the form.

 

"Go ahead..." she said, "Click Submit."

 

I submit the form. No errors.

 

"Oh man... That's not good. I'll get on it," I tell her. I start guessing there's somehow a JavaScript error in one of our onSubmit scripts causing it to bypass the mandatory field checks.

 

"Easy fix" I think to myself. (I always think that)

 

After I work on this a little while, I decide to ask for help and open a ticket in HI. I'm told this is working as designed. They point me to Creating New Fields - ServiceNow Wiki, where it says this:

 

A form can be saved with an empty mandatory field, if that field is a reference field (derived from another table) and if the parent field is also blank. However, if the mandatory reference field shows a value from the parent field, then the form cannot be saved if this value is deleted. It is important to note that if the value in the referenced field is changed, the value for that field is changed everywhere it appears.

 

I was a bit confused by this, and while trying to wrap my mind around it, I notice that if I click into the field, type something and remove the value, then the tab section suddenly indicates a mandatory field is present. Sure enough when I try to submit the form now, I can't do it without entering something in this mandatory field. I can even toggle the mandatory off and on again using the state field's onChange script. The 'fix' seems to hold once I've entered text into that field.

 

Section not Mandatory: Section not Mandatory.PNG
Section has Mandatory field: Section Mandatory.PNG

 

I took a look at the DOM and noted the onkeyup function for this field:

Inspect Element.PNG

I was then able to add line 9 below to my client script so that my mandatory field enforcement was in place:

 

function onChange(control, oldValue, newValue, isLoading, isTemplate) {
  if (isLoading) {
    return;
  }
  //If we are hitting the pending state or closed incomplete, require a comment
  if (newValue == -5 || newValue == 4) {
    g_form.setMandatory('request_item.comments', true);
    //workaround for PRB569134 where SN Dev states this is working as designed
    multiModified(g_form.getControl('request_item.comments'));
  } else {
    g_form.setMandatory('request_item.comments', false);
  }
}
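If the same workaround is needed for more than one dot-walked field, it could be wrapped in a small helper. This is a sketch, not something from the platform: setMandatoryEnforced is a hypothetical name, and the form object and the "touch" function are passed in so that the helper does not depend on where multiModified happens to be defined.

```javascript
// form:  a g_form-like object exposing setMandatory and getControl
// touch: function that marks a control as modified (multiModified in the script above)
function setMandatoryEnforced(form, fieldName, mandatory, touch) {
  form.setMandatory(fieldName, mandatory);
  if (mandatory) {
    var control = form.getControl(fieldName);
    // Only touch the control when it exists on the current form
    if (control && typeof touch === 'function') {
      touch(control);
    }
  }
}
```

Inside the onChange script, the body of the if/else could then become a single call such as setMandatoryEnforced(g_form, 'request_item.comments', newValue == -5 || newValue == 4, multiModified).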

 

I plan on writing a blog post soon about debugging the client side code, which is where we actually find the multiModified function declaration.

Johnny Walker
Acorio LLC
