
Laurent5
ServiceNow Employee
TensorFlow from Google is one of the better-known machine learning libraries out there, and certainly one that gets used for many projects and for learning ML.
Since I am keen to expand my knowledge in that space, and in particular to apply it to the ServiceNow platform, I decided to explore it and build some simple examples, which I will document here.
Now, as a caveat, this is in no way aimed at replacing existing platform capabilities; it is simply a way to educate oneself by building small ML projects. It is also by no means a definitive how-to guide but rather a diary of my learnings, so feel free to comment and amend and I'll update accordingly.
 
First things first…
 
First of all, and it may sound extremely obvious but is worth emphasising: you need to decide what you want to achieve!
Machine Learning applications can broadly be broken down into two categories: categorising (classification) or predicting (regression).
This is very important, as it will to a great extent drive the data set you need to prepare before writing a single line of code. Contrary to popular belief, this is actually the most important and time-consuming part. Stats gathered in various ML forums indicate that data preparation can account for 60% of a project, with the actual Machine Learning sometimes accounting for only 5% of the effort!
 
With that in mind, any project will broadly contain the following phases:
Data identification -> Data extraction -> Data pre-processing -> Model coding -> Model training -> Model testing -> Evaluate/change -> Deploy.
 
I may revisit this later, but for now let's start with data identification.
 
It's all about data
 
Let's imagine we want to perform one of the simplest calculations there is, i.e. a regression. Regressions are often used for predicting a dependent variable (for instance sales volumes) based on independent variables (factors influencing the dependent variable).
In an ITSM context, let’s say we want to predict the likely customer satisfaction based on a number of variables, such as the ticket category, the duration and any other criterion we think will have an influence.
So in the end we should get to a model looking like y = mx + b, where y is the customer satisfaction score and x is, for instance, the time it took to solve their issue.
Again, at this point I am just trying to get the mechanics in place, and I may choose better use cases or variables as I explore.
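As a quick toy illustration of that y = mx + b idea (entirely made-up numbers here, and using NumPy's polyfit rather than TensorFlow, just to show the shape of the problem):

import numpy as np

# made-up data: resolve time in hours vs. satisfaction score (1 to 5)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([4.8, 4.1, 3.2, 2.4, 1.9])

m, b = np.polyfit(x, y, 1)   # least-squares fit of y = m*x + b
print(m, b)                  # roughly m = -0.75 and b = 5.5

With these numbers the slope is negative: the longer the resolution takes, the lower the satisfaction, which is the kind of relationship we will try to learn from real ServiceNow data.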
 
So in order to build this example, I will need a list of tickets, together with some satisfaction/call quality metric and a variable such as time to resolve.
Tickets and time to resolve are easy, since they are in the Incident table.
To measure quality, we usually rely on surveys, which are captured in other tables (although reassignment count could also be used as an indicator of good categorisation).
So I created a simple survey for rating the quality of service with values from 1 to 5, whose results I will try to correlate with the time to resolve.
Using a trigger condition, I assigned that survey to incidents that have been resolved.
 
Since survey data is stored in a different table (asmt_metric_result), I created a view to combine incident data with survey results.
The two key values I am interested in are 'resolve time' (task_calendar_duration) and 'string value' (metricres_string_value), but let's include reassignment count and call category too, as we might need them later.
Obviously, since I am using demo data of limited quantity, it will not be a true representation of real customer interactions, but it should give us some data to work with.
It is also worth remembering the old mantra that "correlation is not causation" so our resolve time is not the only factor contributing to the survey results.
 
Once the data is exported as CSV, we can start exploring it. For this example I will use the following tools: Python as the main language, Pandas for data manipulation, and TensorFlow as my ML library.
These can be installed locally, run as a Docker container, or used on Google Colab. There are many tutorials available about installation, so I will skip that part.
For this example, I used a Docker version of TensorFlow.
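For reference, such a container can typically be started along these lines (the image tag and the /mywork mount point are my assumptions here; check the TensorFlow Docker documentation for the current tags):

docker run -it -p 8888:8888 -v "$PWD":/mywork tensorflow/tensorflow:latest-jupyter

Then open the Jupyter URL printed in the console. Mounting the current directory to /mywork is what lets the notebook read the CSV export later on.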
 
I created a new Jupyter notebook in Python 3.
Next I imported the libraries I will need, i.e. Pandas, NumPy for math operations on arrays and matrices, and Matplotlib, which, as the name implies, is used for plotting graphs.
 
[Screenshot: the import statements]
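In case the screenshot does not render, these are just the conventional imports and aliases:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt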
 
I will create a new (Pandas) data frame by reading my data with the read_csv function (pointing to a local file).
 
df = pd.read_csv('/mywork/survey_results2.csv')
 
I can then use df to display it.
 
[Screenshot: the data frame rendered as a table in the notebook]
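For instance, using standard Pandas calls (the output will of course depend on your own export):

print(df.shape)   # (number of rows, number of columns)
df.head()         # the first five rows render as a table in the notebook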
 
Pandas offers many data manipulation capabilities that I won't detail here. These will come in handy when we need to prepare our data set.
First we need to ensure that our fields of interest are in a numeric rather than text format.
Pandas offers a to_numeric function for that (note that the converted column needs to be assigned back):
 
df['task_calendar_duration'] = pd.to_numeric(df['task_calendar_duration'], errors='coerce')
df['metricres_string_value'] = pd.to_numeric(df['metricres_string_value'], errors='coerce')
 
Once this is done, we will assign the columns of interest (i.e. the incident duration and the survey result) to variables that we will label x_data and y_data respectively.
We also need to "enable" Matplotlib in our Jupyter notebook, which is done by using the %matplotlib inline magic command.
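With the column names from our view, that gives us something like:

%matplotlib inline

x_data = df['task_calendar_duration']   # incident resolve time
y_data = df['metricres_string_value']   # survey score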
 
With our survey, we typically rate from 1 to 5 or 1 to 10, with the lowest number representing the worst score (although in ServiceNow you can configure this).
When it comes to calculating regressions, we may want it to be the opposite, i.e low values representing the better score.
We can use Pandas to invert the field value with an expression such as:
df['metricres_string_value'] = 6 - df['metricres_string_value']
since 6 - 5 = 1 and 6 - 1 = 5.
 
 
Finally, we can plot our data using the plot function of Matplotlib.
 
So our code should look like this:
 
[Screenshot: the notebook code up to this point]
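For readers who cannot see the screenshot, the cells so far amount to roughly the following (the file path and column names are the ones from my export; yours will differ):

import pandas as pd
import numpy as np   # imported for later use
import matplotlib.pyplot as plt
%matplotlib inline

# load the exported incident/survey view
df = pd.read_csv('/mywork/survey_results2.csv')

# make sure the fields of interest are numeric (and assign the result back)
df['task_calendar_duration'] = pd.to_numeric(df['task_calendar_duration'], errors='coerce')
df['metricres_string_value'] = pd.to_numeric(df['metricres_string_value'], errors='coerce')

# invert the survey score so that low values now mean a better rating
df['metricres_string_value'] = 6 - df['metricres_string_value']

x_data = df['task_calendar_duration']
y_data = df['metricres_string_value']

# plot resolve time against the (inverted) survey score
plt.plot(x_data, y_data, 'o')
plt.xlabel('task_calendar_duration')
plt.ylabel('metricres_string_value (inverted)')
plt.show()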
 
We can now visualise it on a plot using Matplotlib:
 
[Screenshot: scatter plot of resolve time against the (inverted) survey score]
 
Not the best distribution, but it will have to do for our example. We could also have created some sample data with NumPy, but the purpose of this article is to see how we can feed our model with ServiceNow data.
 
Next we will look at rescaling, preprocessing our data and creating our model.
 
 
End of part 1