Asif
ServiceNow Employee

Step 5: Model Tuning

Before we jump into model tuning, you must have a solid understanding of how you want the Virtual Agent to respond to seemingly obscure utterances. What's a seemingly obscure utterance? Here's an example: "Please add more capacity to my voicemail, I've run out of space, and my coworkers can no longer leave me voicemails." Fallback topics best handle these types of utterances.


In this article, we'll walk through how to identify good vs. bad utterances, understand model accuracy, and apply tuning techniques. Let's start with Step 1: Analyze.

Step 1: Analyze

First, take the time to read the intents and utterances within your NLU Model carefully. Doing so will help you understand what types of utterances your NLU model will be able to predict accurately. 

Next, you’ll want to collect utterances from employees in your organization for each intent. These utterances will give you a sense of the types of utterances that might be missing from your NLU Model. They will also serve as the set of utterances for testing your NLU model. Below are example utterances collected for the Get Password Reset intent: 

  1. Can someone reset my password? 
  2. I need help logging in 
  3. I’m having issues 

In the example utterances collected, "I'm having issues" is too generic and shouldn't be used for testing or tuning. Read through each collected utterance and remove any bad utterances like this. Once you've collected and reviewed your utterances, it's time to move on to Step 2: Test.

Note: Use out-of-box NLU models such as ITSM where possible; they contain tried-and-true utterances. Password Reset is an example of an existing out-of-box intent you can use or modify.
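
If you collect a large set of utterances, a small script can help with the review pass. Below is a minimal sketch in Python, assuming a simple heuristic: utterances that are very short or match a known-generic phrase get flagged for removal. The phrase list and word-count threshold are illustrative, not part of any ServiceNow tooling.

```python
# Review-pass sketch: flag collected utterances that are likely too generic
# to be useful for testing or tuning. Heuristics here are illustrative only.
GENERIC_PHRASES = {"i'm having issues", "it doesn't work", "help"}
MIN_WORDS = 4  # very short utterances rarely carry enough intent signal

def flag_generic(utterances):
    flagged = []
    for utterance in utterances:
        text = utterance.strip().lower()
        if text in GENERIC_PHRASES or len(text.split()) < MIN_WORDS:
            flagged.append(utterance)
    return flagged

collected = [
    "Can someone reset my password?",
    "I need help logging in",
    "I'm having issues",
]
print(flag_generic(collected))  # ["I'm having issues"]
```

Anything the script flags still deserves a human look; the goal is only to speed up the manual review, not replace it.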

 

Step 2: Test

Use the set of utterances from Step 1 to test the accuracy of your NLU Model. For every utterance you test, document the returned intents (those above the NLU Model's threshold) and whether the prediction was correct. A prediction is correct when one of the returned intents matches the expected intent and would be shown to the end user via Virtual Agent. A prediction is incorrect when no intents are returned or only incorrect ones are.

Built-in NLU tools such as Batch Testing can help you automate this process.
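
If you export your test results and score them yourself, the accuracy calculation is straightforward. Here's a minimal Python sketch; the record layout is illustrative and not a ServiceNow export format. A prediction counts as correct when the expected intent appears among the returned intents.

```python
# Scoring sketch: a prediction is correct when the expected intent appears
# among the returned intents (i.e., it would be offered to the user).
test_results = [
    {"utterance": "Can someone reset my password?",
     "expected": "Get Password Reset Link",
     "returned": ["Get Password Reset Link", "RSA Token"]},
    {"utterance": "I'm having issues getting into my application",
     "expected": "Get Password Reset Link",
     "returned": ["Email Issues", "Meeting Room Issues"]},
]

def accuracy(results):
    correct = sum(1 for r in results if r["expected"] in r["returned"])
    return correct / len(results)

print(f"Prediction accuracy: {accuracy(test_results):.0%}")  # 50%
```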


Here’s an example of what your test document might look like:

| Test Utterance | Expected Intent | Returned Intent(s) | Correct? |
|---|---|---|---|
| Can someone reset my password? | Get Password Reset Link | Get Password Reset Link, RSA Token | Yes |
| My password isn't working | Get Password Reset Link | Get Password Reset Link, Email Issues, Meeting Room Issues | Yes |
| I'm having issues getting into my application | Get Password Reset Link | Email Issues, Meeting Room Issues | No |
| Reset password | Get Password Reset Link | Get Password Reset Link, Guest WiFi Access | Yes |
 

Step 3: Tune

Most missed predictions are caused by a lack of relevant samples in the NLU Model, generic utterances that can span multiple intents, or missing vocabulary for the applications, words, or phrases that are specific to your company. Analyze your test results and see if you can find hot spots among the missed predictions. Once you understand your hot spots, you're ready to start making adjustments using the following tuning techniques:

Adding vocabulary:  Add your custom applications, lingo, and phrases to your NLU Model. The synonyms you select will help the NLU Model understand utterances better. 

Removing ambiguous or misaligned utterances: Remove utterances that don't align with the intent they're assigned to. If an utterance could belong to more than one intent, consider removing it.

Adding utterances: Add the test utterances, or rephrased versions of them, to the intent. If you think an utterance could belong to more than one intent, decide carefully which intent to add it to.

Remember to perform a sample test after every adjustment to see if you've tuned the NLU Model to your specifications. Once you're done with your adjustments, go back to Step 2 and perform another round of testing to see how much NLU accuracy improved. Aim for a prediction accuracy of 80% or higher; a model at that level is considered good and can be promoted to production. On average, it takes around 2-3 iterations of testing and tuning to get there.
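
If you script your own test runs, the overall test-and-tune cycle amounts to a simple loop. The sketch below assumes a hypothetical run_batch_test() that returns the accuracy of a full test round and an apply_adjustments() step standing in for your manual tuning; neither is a ServiceNow API.

```python
# Hypothetical test-and-tune loop. run_batch_test and apply_adjustments are
# placeholders for your own testing process and manual model adjustments.
TARGET_ACCURACY = 0.80  # a model at or above this level can go to production

def tune_until_target(run_batch_test, apply_adjustments, max_iterations=5):
    accuracy = 0.0
    for iteration in range(1, max_iterations + 1):
        accuracy = run_batch_test()
        print(f"Iteration {iteration}: accuracy {accuracy:.0%}")
        if accuracy >= TARGET_ACCURACY:
            break
        apply_adjustments()  # add vocabulary/utterances, remove ambiguous ones
    return accuracy
```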

Happy tuning!

Comments
Anil Madamala
ServiceNow Employee

"Tune" section talks about Removing ambiguous or misaligned utterances.

This can be achieved using NLU Conflict review tool. 

Make sure that you have the NLU Workbench - Advanced Features (com.snc.nlu.workbench.advanced) ServiceNow® Store plugin installed and activated on your instance. For more information on the required plugins, see Activate the NLU Workbench.
