on 04-12-2018 03:01 AM
Get started with Incident Management and check out these FAQs about process, states and more on Incident. We'll go over a few aspects like:
The goal of Incident Management is to restore normal service operation as quickly as possible, while minimizing impact to business operations and ensuring quality is maintained.
ServiceNow Incident Management supports the incident management process in the following ways:
Incident Management also ensures communication with the user community throughout the life of the incident.
Any user can record an incident and track it through the entire incident life cycle until service is restored and the issue is resolved. Reports are used to monitor, track, and analyze service levels and improvement.
An ESS user can call a service desk agent and the agent can log an incident based on the information provided by the user.
An ESS user can send an SMS to the <<ServiceNow Customer Service>> number and an incident is automatically created for the user.
Note: The user must install the Notify plugin and set up a Twilio account to use the messaging service.
Note: An ITIL user can copy or create any incident whereas an ESS user can copy only the incident that the user has created.
Create an incident when there is any unplanned interruption or degradation in the quality of an existing IT service. Create a request when you want to formally ask the IT service desk to provide something. A request can be for new hardware or a new application, information, training, and so on.
Example: If the existing RAM in your system is malfunctioning, then create an incident but if you want a new RAM for your system, raise a request.
Incident Requests are requests that denote the failure or degradation of an IT service. For example, unable to print, unable to fetch mails and so on.
Service Requests, on the other hand, are requests raised by the user for support, delivery, information, advice or documentation. Some examples are installing software on workstations, resetting a lost password, requesting a hardware device and so on.
There are two ways to assign an incident to a group or a user:
On an Incident form, by default, the ‘Priority’ field is read-only and must be set by selecting the ‘Impact’ and ‘Urgency’ values. For example, if you set the value of ‘Impact’ and ‘Urgency’ to be high, then the value of the ‘Priority’ field will be Critical whereas if you set the value of ‘Impact’ to be medium and ‘Urgency’ to be low, then the value of the ‘Priority’ field will be low. In the Priority [dl_u_priority] table, the values of Impact, Urgency and Priority can be modified in the data lookup rules.
An administrator can either alter the priority look-up rules (in the Priority [dl_u_priority] table) or disable the “Priority is managed by Data Lookup - set as read-only” UI policy and create their own business logic.
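The Impact and Urgency lookup described above can be pictured as a small table. The following is a minimal sketch in Python, not the platform's implementation: the names `PRIORITY_LOOKUP`, `PRIORITY_LABELS` and `lookup_priority` are hypothetical, and only two cells (High/High gives Critical, Medium/Low gives Low) come from this FAQ. The remaining cells are an assumed default matrix, so check the data lookup rules on your own instance.

```python
# Hypothetical model of the Impact/Urgency -> Priority data lookup.
# Impact and urgency use 1 = High, 2 = Medium, 3 = Low.
PRIORITY_LABELS = {1: "Critical", 2: "High", 3: "Moderate", 4: "Low", 5: "Planning"}

# Only (1, 1) and (2, 3) are stated in the FAQ; the other cells are assumed.
PRIORITY_LOOKUP = {
    (1, 1): 1, (1, 2): 2, (1, 3): 3,
    (2, 1): 2, (2, 2): 3, (2, 3): 4,
    (3, 1): 3, (3, 2): 4, (3, 3): 5,
}


def lookup_priority(impact: int, urgency: int) -> str:
    """Return the priority label for an impact/urgency pair."""
    return PRIORITY_LABELS[PRIORITY_LOOKUP[(impact, urgency)]]


print(lookup_priority(1, 1))  # High impact, high urgency
print(lookup_priority(2, 3))  # Medium impact, low urgency
```

Keeping the mapping as data rather than as if/else logic mirrors the idea of editable lookup rules: changing a priority means changing one table entry.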
In the ‘Configuration item’ (CI) field, you need to select the component of the business service which is affected and for which the incident is being logged. For example – email, blackberry, e-commerce.
Configuration Management Database (CMDB) contains a collection of configuration items (CI) as well as descriptive relationships between such CIs. When populated, the database becomes a means of understanding how critical assets such as information systems are composed, what are their upstream sources or dependencies, and what are their downstream targets.
CI Relationships form a crucial part of CMDB because with relationships, the users accessing CMDB can understand the inter-dependencies between the CIs, and in the case of a failure, the impact caused on another CI can be identified.
If you open an Incident form and enter a value in the Configuration item field, you will notice that a Dependency view icon appears next to the lookup icon. If you click the Dependency view icon, it shows you the upstream and downstream dependencies of that CI. If, say, the CI is email and it is not working, you can check the dependency map to find the servers on which this CI depends and then validate the servers one by one to get to the root cause of the issue. This is possible only when the relationships between CIs are well defined in the CMDB table.
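To illustrate why well-defined relationships matter, here is a minimal sketch of walking a dependency map to list the components to validate. It is not how the Dependency view is implemented. The CI names and the `DEPENDS_ON` structure are invented for illustration, and `components_to_check` is a hypothetical helper.

```python
from collections import deque

# Hypothetical relationships: each CI maps to the CIs it directly depends on.
DEPENDS_ON = {
    "email": ["mail-server-01", "mail-server-02"],
    "mail-server-01": ["storage-array-01"],
    "mail-server-02": ["storage-array-01"],
    "storage-array-01": [],
}


def components_to_check(ci, depends_on):
    """Return every CI that `ci` depends on, nearest dependencies first."""
    seen = {ci}
    order = []
    queue = deque([ci])
    while queue:
        current = queue.popleft()
        for dependency in depends_on.get(current, []):
            if dependency not in seen:  # skip CIs reached by another path
                seen.add(dependency)
                order.append(dependency)
                queue.append(dependency)
    return order


print(components_to_check("email", DEPENDS_ON))
```

If a relationship is missing from the map, the affected server never appears in the result, which is the practical cost of an incomplete CMDB.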
When you populate the ‘Business Service’ and the ‘Configuration Item’ fields on the incident form and save the record, the selected values appear on the “Impacted Services/CIs” and “Affected CIs” related lists respectively.
If you want to add multiple affected CIs or impacted services, use the “Add” button provided on the related lists.
Note: If you have modified the cmdb_ci (configuration item) on the incident, the related list of “Impacted Services/CIs” will not reflect the change unless you do it manually using the “Refresh Impacted Services” UI action from the context menu.
On the Incident form, in the Short Description field, type the subject on which you want to find relevant knowledge articles. You can also type the subject in the Related Search field, in the Related Search Results section. All the articles relevant to the subject appear in the Related Search Results section.
A System Admin can configure the search results based on specific user field by performing the following actions:
a) Navigate to Contextual Search > Table Configuration.
b) Click on the ‘Incident [incident]’ configuration record.
c) In the ‘Search as’ tab, select the ‘Enable search as’ checkbox.
d) From the ‘Search as field’ choice list, select the field based on which you want to see the filtered search results.
For example: Select caller field (consider ITIL User as the caller).
After updating the incident configuration record, if you open any incident and view the ‘Related Search Results’, you will now find two tabs:
Note that the search results not only display the relevant knowledge article but also related service catalog items. For each search record, an 'Order' or an 'Attach' button appears for you to order the service catalog item or attach the knowledge article with the current incident record.
Triaging an incident involves two major activities. Firstly, classifying the incident into the right assignment group. Secondly, involving the right set of people in order to resolve the incident as quickly as possible. Identifying the correct and most appropriate assignment group or person for the incident is the most basic purpose of triage in incident management.
A process for sorting inefficient operations into ITIL processes based on the client's need for or likely business benefit from immediate improvement. ITIL Triage is used in the data center, at disaster recovery sites, and in boardrooms when limited financial resources must be allocated.
If the incident is in the On Hold state and the On hold reason is Awaiting Caller, the incident state changes to In Progress when the caller updates the incident. For all other On hold reasons, the incident remains in the On Hold state.
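The rule above can be written as a single condition. This is a sketch only: `state_after_caller_update` is a hypothetical function, and "Awaiting Vendor" is used purely as an example of another On hold reason.

```python
from typing import Optional


def state_after_caller_update(state: str, on_hold_reason: Optional[str]) -> str:
    """Return the incident state after the caller updates the incident."""
    if state == "On Hold" and on_hold_reason == "Awaiting Caller":
        return "In Progress"
    # Any other On hold reason, or any other state, is left unchanged.
    return state


print(state_after_caller_update("On Hold", "Awaiting Caller"))
print(state_after_caller_update("On Hold", "Awaiting Vendor"))
```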
Before Kingston, the ‘Close codes’ and ‘Close notes’ fields were controlled using UI policies. UI policies are valid only if the fields on which the UI policies are applicable are present on the form. So, you could avoid entry to the ‘Close codes’ and ‘Close notes’ fields by not adding the fields to the form.
From Kingston release onwards, we have moved the UI Policy to a Data Policy that works on the server side. Hence, you will need to fill the ‘Resolution codes’ and ‘Resolution notes’ fields to be able to submit the form.
Note: Data policies on the ‘Resolution codes’ and ‘Resolution notes’ fields are not available OOB for existing or upgrade customers. Existing or upgrade customers can create a custom Data Policy on Incident table that makes the ‘Resolution Notes’ and ‘Resolution Codes’ fields mandatory.
‘Copy Incident’ (UI action in the contextual menu) copies the details of an existing incident record to a new incident record. There is no association between the original/source incident and the new incident.
‘Create Child Incident’ (UI action in the Context menu) copies the details of the parent incident and associates the new incident to the parent incident. The originating incident number is copied to the ‘Parent Incident’ field of the newly created child incident.
Note: The list of attributes and related lists that will be copied from the originating incident will be the ones that are mentioned in the following incident properties:
Starting with the Kingston release, the parent-child incident state synchronization is as follows:
Note: If an incident is reopened, all the child incidents of that incident are reopened and the state of the child incidents is changed to In Progress.
You can attach an incident to a problem or a change. To create a problem from an incident record, click the Additional actions menu icon and click ‘Create Problem’. If you want to relate incident to a change request, you can click the Additional actions menu icon and click ‘Create Normal Change’, ‘Create Emergency Change’ or ‘Create Standard Change’. The parent record (incident record) will be automatically associated with the new Problem or Change record.
Note: The ‘Create Standard Change’ UI action will be introduced in the London release.
If you need to associate multiple Problems or Change Requests to a parent incident record, you can accomplish this by using the “New” and “Edit” buttons on the “Problems” and “Change Requests” related list present on the incident form.
If the incident property “Number of days (integer) after which Resolved incidents are automatically closed. Zero (0) disables this feature” is set to a number of days greater than zero, resolved incidents are automatically closed that many days after the incident was resolved or last updated.
If the Incident property “Enable auto closure of incidents based on Resolution date. Setting this to 'No' will make auto closure to run based on the Updated date.” is selected, the incident is auto-closed based on the resolution date; otherwise, it is auto-closed based on the last updated date.
Note: Calculation of days based on Business Days is not supported out-of-the-box.
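The interaction of the two properties can be sketched as follows. This is an illustration under assumptions, not the platform's scheduled job: `should_auto_close` and its parameter names are hypothetical, days are plain calendar days, and closing an incident exactly on the day the threshold is reached is an assumption.

```python
from datetime import datetime, timedelta


def should_auto_close(resolved_at: datetime, updated_at: datetime, now: datetime,
                      autoclose_days: int, use_resolution_date: bool) -> bool:
    """Decide whether a resolved incident is due for automatic closure."""
    if autoclose_days <= 0:
        return False  # zero disables the feature
    # Choose the reference date according to the second property.
    reference = resolved_at if use_resolution_date else updated_at
    return now - reference >= timedelta(days=autoclose_days)


resolved = datetime(2018, 1, 1, 9, 0)
updated = datetime(2018, 1, 5, 9, 0)   # e.g. a comment added after resolution
today = datetime(2018, 1, 9, 9, 0)

print(should_auto_close(resolved, updated, today, 7, use_resolution_date=True))
print(should_auto_close(resolved, updated, today, 7, use_resolution_date=False))
```

With a 7-day setting, the same incident is due for closure when measured from the resolution date (8 days ago) but not when measured from the last update (4 days ago), which is why the choice of reference date matters.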
When incidents occur, the role of incident management is to restore service as rapidly as possible, without necessarily identifying or resolving the underlying cause of the incidents. If incidents occur rarely or have little impact, assigning resources to perform root cause analysis cannot be justified. However, if an individual incident or a series of repeated incidents causes significant impact, problem management is tasked with diagnosing the underlying cause of the incidents and, ultimately, identifying a means to remove that cause.
Major incidents are those incidents for which the degree of impact on the business/organization is extreme. Incidents for which the timescale of disruption – to even a relatively small percentage of users – becomes excessive should also be regarded as major incidents. It is possible to define some of these major incidents, but most will be prioritized as they happen based on impact and urgency. The major incident module was introduced in the Kingston release.
Some organizations equate a major incident with a Priority 1 Incident (or a Severity 1 Incident) but the mapping is not that crisp. Incident priority is for sorting and prioritizing (and measuring and reporting). A major incident is about abandoning the normal process and switching to different procedures.
A separate procedure, with shorter timescales and greater urgency, must be used for major incidents. A definition of what constitutes a major incident must be agreed and ideally mapped on to the overall incident prioritization system.
Where necessary, the major incident procedure should include the dynamic establishment of a separate Major Incident Management team, under the direct leadership of the Major Incident Manager. The Major Incident Management team is formulated to concentrate on that particular major incident and to ensure that adequate resources and focus are provided for finding a fast resolution.
If the cause of the incident needs to be investigated at the same time, then the Problem Manager will be involved as well, but the Incident Manager must ensure that service restoration and underlying cause are kept separate. Throughout, the communication manager will ensure that all activities are recorded and users are kept fully informed of progress. Communication is a hugely important activity in handling major incidents.
The Problem Manager should in these circumstances be notified (if not already aware) and should arrange a formal meeting with interested parties (or regular meetings if necessary). These should be attended by all key in house support staff, vendor support staff and IT services management, with the purpose of reviewing progress and determining the best course of action. The communication manager should attend these meetings and ensure that a record of actions/decisions is maintained, ideally as part of the overall incident record as major incidents are still logged in the same way as all other incidents (it is only the priority and management of the incident which is different).
If no Problem Manager or Problem Process Owner is currently in place, an Incident Management Executive and Major Incident Management team could take on the activities described above.
Note: The Incident Management - Major Incident Management plugin (com.snc.incident.mim) must be activated to work with major incidents.
An incident that we want to promote to a major incident is first proposed as a major incident candidate. The major incident manager then analyzes the candidate and decides whether it can be considered a major incident. In other words, a major incident candidate is the state of an incident that has been proposed as a possible major incident but has not yet been approved as one by the major incident manager.
Note: The major incident candidate functionality was introduced in the Kingston release.
I have activated the major incident management plugin, raised a new P1 incident but still do not see any changes to the incident form or any option to create a major incident candidate or to promote a candidate to a major incident. I can see a trigger rule set for Priority = 1 Critical so I was expecting this to become a candidate for major incident but nothing happened. What am I doing wrong?
The issue may be because the trigger rule you are looking at is inactive. ServiceNow ships 3 trigger rules OOB, all marked as active=false. You can review and activate the ones you need for your business.
The major incident workbench is a single pane view specifically designed for major incident managers, communication managers and resolver groups to manage major incidents.
To navigate to major incident workbench, click 'View Workbench' that appears on the header of the Incident form.
You will see the button when the incident is either proposed or is accepted as a major incident.
The major incident dashboard provides an at-a-glance view of all major incident information.
There are a number of time-tracking fields available for users. The fields are as follows:
Note: All the above time-tracking fields are based on the calendar record available at System Policy > SLA > Calendars (sys_calendar table). These records define the working days, hours and holidays. You can also create customized calendar schedules by clicking the “New” button on the Calendar list view.
Response Time is defined as the amount of time between when the client first creates an incident report (which includes leaving a phone message, sending an email, or using an online ticketing system) and when service desk agent actually responds (automated responses don’t count) and lets the client know they are currently working on it.
Resolution Time is defined as the amount of time between when the client first creates an incident report and when that issue is actually solved.
Note: The above is just for theoretical explanation and is not tracked anywhere on the instance.
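As a worked example of the two definitions, both times are simple differences between timestamps. The timestamps and function names below are made up for illustration.

```python
from datetime import datetime, timedelta


def response_time(created: datetime, first_human_response: datetime) -> timedelta:
    """Time from incident creation to the first non-automated response."""
    return first_human_response - created


def resolution_time(created: datetime, resolved: datetime) -> timedelta:
    """Time from incident creation to the moment the issue is solved."""
    return resolved - created


created = datetime(2018, 4, 12, 9, 0)
first_human_response = datetime(2018, 4, 12, 9, 25)
resolved = datetime(2018, 4, 12, 13, 30)

print(response_time(created, first_human_response))
print(resolution_time(created, resolved))
```

Here the response time is 25 minutes and the resolution time is 4 hours 30 minutes, both measured from the same creation timestamp.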
A known error is a fault in a configuration item (CI) identified by the successful diagnosis of a problem and for which a temporary workaround or a permanent solution has been identified. Therefore, a known error is an already identified solution to an existing or a new issue. A known error is identified when the cause of the problem is known.
Measurements are important across all stages of the ITIL lifecycle. Each process has metrics that should be monitored and reported to effectively evaluate the overall performance.
Examples of Incident Management KPIs that are shipped with the base system are:
Users can enable or disable a KPI and customize KPI conditions.
Integration with Performance Analytics provides daily data collection and drill-down capabilities on KPI data. KPIs should be related to Critical Success Factors (CSF) and CSFs should be related to objectives. This relationship helps with decision support for maintaining current state and improving to desired state. Although each organization is different, relevant reports for users, staff and management will help support important decisions that can be used to improve both the processes and the business as a whole.
A CSF is a critical factor or activity required for ensuring the success of a company or an organization. Alternative terms are key result area (KRA) and key success factor (KSF). These are often used to denote the mission statements, vision of an organization, or simply for a business strategy.
Key performance indicators or KPIs, on the other hand, are measures used to quantify management objectives, are accompanied with a target or threshold and enable measurement of performance. Another key term is measure of KPIs (threshold), which simply indicates the plotting of achievement against a definition, which may be either time based or denoted against numbers.
Note: You can define these metrics by navigating to Metrics > Definitions.
After the major incident is resolved, a post-incident review is conducted to analyze the incident and understand what can be done to prevent a similar incident in the future. This also provides an opportunity to review the incident response process and identify areas for improvements.
To streamline the process, a post-incident report is created when an incident is resolved which can then be reviewed and updated during the review process before sharing the report with stakeholders.
The post-incident report provides a summary of the incident, findings, resolution information including any change requests and problem records created. A timeline of activities is available in the report which can be edited by the major incident manager to include important activities, e.g., actions taken to resolve an incident.
In ServiceNow, duplicate numbering is rare, but it can occur because numbering does not enforce uniqueness by default.
However, if duplicate numbers are created, we need to set the Number field as unique at dictionary level in our instances as follows:
Note: The number should be greater than or equal to the highest number in the incident list.
For example, if your highest incident number is INC176601, then update the number field in the number maintenance record for incident to 176601.
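If you have a list of existing incident numbers, the value to enter can be derived as in the sketch below. It assumes numbers consist of the INC prefix followed only by digits. `counter_from_number` and `minimum_counter` are hypothetical helpers, not platform APIs.

```python
import re


def counter_from_number(number: str, prefix: str = "INC") -> int:
    """Extract the numeric part of an incident number such as INC176601."""
    match = re.fullmatch(rf"{re.escape(prefix)}(\d+)", number)
    if match is None:
        raise ValueError(f"Unexpected incident number format: {number!r}")
    return int(match.group(1))


def minimum_counter(numbers) -> int:
    """Return the lowest value the number maintenance record should hold."""
    return max(counter_from_number(n) for n in numbers)


print(minimum_counter(["INC176601", "INC0010002", "INC176598"]))
```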
To learn more about duplicate incidents, refer to KB0538764.
Users with itil_admin and admin role can see the "Closed" state of an incident.
Samiksha, this is a master piece. No words to explain how dreadfully people need to know the basic understanding of the basic things in snow. I vividly remember how helpless I used to feel in my initial days without proper basic understanding even after 2 years of work in snow. It's not because I didn't work hard (I swear I did and I remember how hard those days were for my life out of office) but because of the fact that all information was not accumulated in one place. I still feel and I am sure many many experienced campaigners would feel inside their heart that they don't know many basic things in incident management and they fear to go back to the basics because they have moved on a long way in that path. So, I feel this post was a necessity. You made my day.
Thanks a ton,
Arnab
Hi Samiksha,
A very nice article which will help us in understand deep basic of Incident Management. Please provide the articles for other ITIL topics as well.
Regards,
Saranya
Thank you Saranya! Will be working on other topics soon...stay tuned 🙂
Glad that you found the article useful! Many of us face similar struggle times at the beginning which is one of the reason to pen down the FAQ. Thanks again for your encouraging words.
Brilliant, thank you for posting.
Really very nice and helpful article to understand basics of IM module.
Thanks,
Anagha
Thanks for your words! I am glad you find it useful!
Hello,
Thank you @Samiksha Chaudhuri for this very interesting and helpful article.
I'm preparing for the ITSM Implementation specialist Certif and your article helped me understand more incident management process.
Thank you 🙂
Glad that it helped. You can find documentation related to Incident Management at https://docs.servicenow.com/bundle/istanbul-it-service-management/page/product/incident-management/concept/c_IncidentManagement.html.
Wow, really well put together. This covers just about everything, if not everything you would need to know about incident management. Great for people to review and for first timers to read.
It really wouldn't be a bad idea to stick / pin articles like this somewhere in the Community site, as long as it wouldn't take too much effort to keep them updated when new named versions come up (if any processes change).
Thanks! You can also find documentation related to Incident Management at https://docs.servicenow.com/bundle/istanbul-it-service-management/page/product/incident-management/concept/c_IncidentManagement.html.
Excellent article that intelligently maps best practice to the ServiceNow ITSM implementation.
The people in charge of docs.service-now should take note.
Hi Steven....thank you for your feedback....I am glad you find the FAQ to be helpful....FYI- I am from the InfoDev (Content) team that takes care of documents in ServiceNow 🙂
Thank you for your feedback....I am glad you find the FAQ to be helpful....I will try to provide supporting articles (FAQ or Blog) for other ITIL products as well - Stay tuned 🙂
Regards,
Samiksha
Aha ... so my point still stands - I wish docs looked more like this (background, reasoning, examples) ... well done regardless 🙂
Excellent article!
Great Work 🙂
Very good article.
One question here. Can we email a PIR (post incident report) as a PDF attachment as soon as the Major Incident is closed?