13 reasons why the stats.do page is your best friend for troubleshooting

kobby_adu-nti · ‎07-05-2017

Troubleshooting your instance with the stats.do page

The number of applications offered in the ServiceNow platform has become quite a thing to behold. With so many potential business use cases, it is quite understandable if at some point you have found it necessary to engage the friendly Technical Support Engineers (TSE) at ServiceNow for some assistance with configuration, implementation, customization, or just an explanation of the inner workings of the applications offered.

For those of you who are familiar with creating incidents in the HI Portal, you may recall a time when the TSE attending to your issue asked you to navigate to the URI of the servlet statistics monitoring utility, stats.do. This troubleshooting utility is mainly used by the performance and platform engineers to assist them in getting to the root cause of your issue. By the end of this blog, you will have a better understanding of all the information contained in the stats.do page, so you can use it to manage your instance and perform your own troubleshooting when necessary.

Get to know the stats.do page

Before we begin, it is important to understand that all ServiceNow instances exist as a cluster of apache-tomcat containers, with a minimum of two separate apache-tomcat servers, or nodes, per instance. Throughout this article, the terms "node," "apache-tomcat server," and "container" will be used interchangeably. The term "instance" will be used interchangeably with "cluster." For a generic overview of the topology of a ServiceNow node, please see the diagram below:

servicenow node diagram.png

The stats.do page is divided into 13 sections. In this article, I will break down the stats.do page with respect to the information it provides for your ServiceNow nodes and how you can utilise this information to your benefit in managing your instance. It is also worth noting that vital system information regarding the status of your ServiceNow instance can be obtained from the Performance Analytics dashboard, the System Diagnostics UI page, and the xmlstats.do URI. You should leverage the data provided by all of these resources when troubleshooting ServiceNow.

Section 1. Instance configuration and status

While the stats.do URI is mainly used by our performance and platform engineering specialists for troubleshooting, this first segment of the stats.do page is utilised by all ServiceNow TSEs regardless of their subject matter expertise. Here we find generic details that describe your instance configuration and status.

section 1 again.png

The information found in this section includes the following:

The name of the instance
The build information of the instance and MID Server, including release version
The status of the F5 load balancer distributing connections between the clustered nodes
The name of the application server to which the user is connected
The name of the ServiceNow instance node

The names of the node and application server are particularly significant because system logs are stored on each node individually. With this detail, the TSE working on your incident can identify which node to connect to in order to analyse any transaction that requires evaluation to resolve your issue. Internally, ServiceNow has tools to connect to any active primary node in your instance cluster. However, it has been a long-standing complaint from administrators and developers who work for our customers, rather than for ServiceNow, that they do not have this ability. As stated earlier, every ServiceNow instance exists in a clustered environment. This means that when an end user connects to an instance, the node they land on is determined by F5, the load balancer.

To work around this issue, I would encourage ServiceNow admins to install the Node Switcher extension for Google Chrome. This handy utility grants developers, admins, and even end users the same ability as ServiceNow TSEs to select which of the active nodes in a cluster they wish to connect to. The Node Switcher is available for free from the Chrome Web Store at the following URL: Node Switcher for ServiceNow

Screen Shot 2017-06-25 at 00.27.58.png

Section 2. Apache-tomcat memory

In this segment of the stats.do page we find information about the memory of the apache-tomcat container.

Servlet Memory.png

This section holds vital information if you are concerned with the performance of your ServiceNow instance. The information that can be found here includes the following:

The maximum memory allocation permitted on the container
The current memory allocation in the container
The current memory in use in the container
The amount of free memory available in the container

If you are facing performance issues with your ServiceNow instance, such as long load times or general sluggish behaviour, it is well worth checking this section of the stats.do URI. If you discover that you are running low on memory, contact a ServiceNow TSE via the HI Portal and they will assist in identifying the cause of the memory scarcity.
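
As a rough illustration of how the figures in this section relate to one another, the following Python sketch computes the share of the maximum allocation that is currently in use. The function names, the 80 percent warning threshold, and the sample values are assumptions made purely for illustration; they are not taken from stats.do or from any ServiceNow guidance.

```python
def memory_usage_percent(in_use_mb: float, max_mb: float) -> float:
    """Return memory in use as a percentage of the maximum allocation."""
    if max_mb <= 0:
        raise ValueError("max_mb must be positive")
    return 100.0 * in_use_mb / max_mb


def is_memory_low(in_use_mb: float, max_mb: float,
                  threshold_percent: float = 80.0) -> bool:
    """Flag a node whose usage meets or exceeds the (assumed) threshold."""
    return memory_usage_percent(in_use_mb, max_mb) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical readings copied by hand from the memory section of stats.do.
    print(memory_usage_percent(1500.0, 2000.0))  # 75.0
    print(is_memory_low(1500.0, 2000.0))         # False
```

A check like this is only a prompt to look closer; the TSE will still need the actual stats.do output to identify the cause.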

Section 3. Apache-tomcat server

Here we find specific configuration and status information about the apache-tomcat server the browser is connected to. The information found here includes the following:

A count of how many times a memory cache flush has been engaged in the instance
A count of transactions performed per node
A count of errors handled per node
A count of processor transactions performed per node
A count of cancelled transactions on the node
A count of the logged in sessions per node
The maximum number of concurrent sessions permitted per node
A session timeout minute counter
A count of the cometD sessions available to push data to the browser
The status of the Java Security Manager Policy
A record of the uptime of the apache-tomcat server

Servlet statistics.png

So, how can you make use of these particular statistics? Let us take a hypothetical scenario where you have configured your instance to integrate with an external data source. For argument's sake, let us say that you are importing data from an FTP server. In this scenario, you find that the connection to your FTP server fails for an unexplained reason, even though the server is up and the connection configuration has been validated as correct. If you find yourself in this situation, you may want to pay close attention to the uptime of your apache-tomcat server. There is no industry best practice for how long an apache-tomcat server should remain up. However, you may find that simply restarting the node will resolve your connection issues.

If you find yourself in the above hypothetical scenario, consider logging an incident in the HI Portal and requesting that a TSE perform a restart.

Section 4. Semaphores

In this segment of the stats.do page we find information regarding the semaphores available to the instance node. For those of you who are unfamiliar, a semaphore is a counter that limits how many threads are permitted to access a resource on the node at the same time. For example, if your apache-tomcat server is configured to permit N semaphores in total, up to N threads are authorised to access a resource at any one point in time. The semaphore set section is broken down into six segments:

Default
Debug
Amb_Received
Amb_Send
API_Int
Presence
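
To make the idea concrete, here is a minimal Python sketch of a semaphore that limits how many threads may work at once. It uses the standard library's threading.BoundedSemaphore and is a generic illustration only: the pool size of 3, the ten simulated requests, and all function names are assumptions, and the code does not reflect how the platform implements its semaphore sets.

```python
import threading
import time


def run_with_semaphore(permits: int, requests: int) -> int:
    """Run `requests` worker threads and return the peak number that
    held a permit at the same time (never more than `permits`)."""
    semaphore = threading.BoundedSemaphore(permits)
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def handle_request() -> None:
        with semaphore:  # blocks until one of the permits is free
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.01)  # simulate work done while holding the permit
            with lock:
                state["active"] -= 1

    threads = [threading.Thread(target=handle_request) for _ in range(requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


if __name__ == "__main__":
    peak = run_with_semaphore(permits=3, requests=10)
    print(f"peak concurrent holders: {peak}")
```

However many requests arrive, the reported peak never exceeds the number of permits; the remaining threads simply wait their turn.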

Depending on whether or not you have customised the number of semaphores permitted on your apache-tomcat server, you may encounter a scenario where the stats.do page reports an anomaly such as 'there are more semaphores held than maximum available'. This issue may occur if you have configured your node in such a manner that the size of the semaphore pool has been reduced.

The semaphore permits for your node can be configured on your instance from the frameset UI. You should first engage a ServiceNow TSE for advice before changing any semaphore settings, as poor configuration may have devastating implications for the performance of your instance. If, after taking advice, you decide to go ahead, navigate to the System Maintenance > Semaphore module from the navigation menu. Please note, the overall health of your instance's available semaphores can be reviewed from the Performance Analytics dashboard.

Section 5. Operating system

This section displays general information regarding the operating system on which the application container is installed.

OS configuration.png

Sections 6, 7, 8. Node response time statistics

Here we see statistical information gathered for the response time of your node. There are several articles available to explain the significance of these statistics and how the information they provide can be best utilised to troubleshoot performance issues on the platform:

Response time.png

Section 9. Node connection to instance database

This section of the stats.do page displays information about a node's connection to its instance database pool and is replicated on the System Diagnostics UI page. This information is rather generic in terms of detail, so if you are truly concerned with the performance of your instance database, it would be prudent to also review the xmlstats.do URI. The following data concerning the health of your instance's database can be found on stats.do:

The name of your instance's database
The database online status

DB connection pool.png

Also, please review the following link for further information regarding the System Diagnostics UI Page: http://wiki.servicenow.com/index.php?title=Running_System_Diagnostics#gsc.tab=0

Section 10. Background scheduled jobs

In section 10 we find information about the background scheduled jobs. The ServiceNow documentation site and wiki contain several articles with instructions on how to best utilise the scheduled job feature of the platform:

background scheduler.png

As a best practice, you should perform rigorous testing before you commit your code to a scheduled job. However, if you end up in a situation where the code in your scheduled job is executing unexpectedly, you can kill the active transaction that is running the job in the All Active Transactions module. This can be achieved by taking the following steps:

Navigate to User Administration > All Active Transactions.
Find the transaction in the list view. Right-click and select the Kill option from the context menu.

Section 11. Background progress workers

In this section, the stats.do page displays information regarding the background progress workers responsible for tracking background processes running on your instance. This includes such activities as importing or exporting data or committing an Update Set. You may find it preferable to monitor the progress workers active on your instance from the System Diagnostics UI page, as the information on stats.do is replicated there.

background progress worker smaller.png

Section 12. Database Lazy Writer

In this section, we see statistics for the Database Lazy Writer. The DB Lazy Writer performs background insert, update, and delete database activities. Essentially, it is responsible for operations that are not time critical, freeing up resources for foreground thread performance. As with the database monitoring features covered earlier, you can obtain additional information regarding the DB Lazy Writer from the System Diagnostics UI page or the Performance Analytics dashboard.
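
The general pattern of deferring non-urgent writes to a background thread can be sketched in Python as follows. This is a conceptual illustration only, built on the standard library's queue and threading modules. The LazyWriter class, its methods, and the in-memory list standing in for a database are all hypothetical and say nothing about how the platform's DB Lazy Writer is actually built.

```python
import queue
import threading


class LazyWriter:
    """Queue write operations and apply them on a background thread."""

    def __init__(self) -> None:
        self._pending: queue.Queue = queue.Queue()
        self.applied: list = []  # stand-in for the real database
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def submit(self, operation: str, record: dict) -> None:
        """Called by foreground threads; returns without waiting for the write."""
        self._pending.put((operation, record))

    def _drain(self) -> None:
        while True:
            item = self._pending.get()
            if item is None:  # sentinel: stop the worker
                self._pending.task_done()
                break
            self.applied.append(item)  # the deferred "write"
            self._pending.task_done()

    def close(self) -> None:
        """Flush all queued operations, then stop the worker."""
        self._pending.put(None)
        self._pending.join()
        self._worker.join()


if __name__ == "__main__":
    writer = LazyWriter()
    writer.submit("insert", {"id": 1})
    writer.submit("update", {"id": 1, "state": "closed"})
    writer.close()
    print(writer.applied)
```

The foreground caller only pays the cost of placing an item on the queue, which is the sense in which a lazy writer frees up resources for foreground threads.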

db Lazy writer small.png

Section 13. Logged in sessions

Lucky thirteen displays a count of the logged-in user sessions. You can access a more detailed view of this same information from the frameset UI by navigating to the User Administration > Logged in users module.

logged user.png

The stats.do page contains a wealth of information that can help you troubleshoot a number of issues in your instance. Its contents are more than just a set of statistics - they are clues that our own TSEs use frequently to discover the root cause of a problem so we can diagnose and resolve it. So remember, if you are facing an issue with your ServiceNow instance, get to know the facts by checking out the stats.
