
Benchmarking


What is benchmarking?

Simply put, a benchmark study is a deep assessment of a product with the goal of measuring Key Performance Indicators (KPIs) as a basis for future improvement. We use KPIs taken directly from user-focused metrics, and four scores make up the final benchmark:

  • Product Engagement Score – a combination of feature adoption, user stickiness and user growth
  • Product Satisfaction Score – an in-app NPS survey
  • Product Accessibility Score – generated from the product's accessibility statement
  • Support Score – Salesforce support calls measured against unique visitors, calculated as an overall percentage

These are explained in more detail throughout this guide, along with a breakdown of how each one is calculated and combined into the total benchmark score.


Why do we benchmark our products?

Product benchmarking informs the immediate team and the wider company of the performance and position your products hold in their markets. A benchmarking study gathers unbiased customer research focused on the user experience to identify strengths and expose weaknesses. These weaknesses can then be investigated and the solutions built into the roadmap.
Using a benchmark study to drive your roadmap means you can directly measure the impact your roadmap projects are having on the UX.

For example:

  • 01 Jan: Benchmark score – 27
  • 12 Feb: New feature release – Allows the user to track stock at a higher batch level
  • 30 Jun: Benchmark score – 21

One KPI was significantly affected by the new release, which brought the benchmark score down. After some investigation, the team learned that support calls had increased due to confusion around batch tracking vs the existing serial number tracking.

The team went back and reviewed the initial point at which they asked the user to select tracking. This time around, when users selected tracking they had to pick batch or serial and were given an explanation of both. Once they had made a selection, every instance where tracking was later referenced was labelled either batch or serial, as per their selection.


When do we complete a benchmarking study?

The UX benchmarking study is conducted every 180 days (twice a year), on January 1st and June 30th. However, smaller and more granular studies are also a valuable source of information before any redesign effort begins. They can be used as the baseline score you’ll compare against after the new release. This type of study may use KPIs that directly link to the feature or function your project is looking to improve (for example, you could use an in-app survey or a time-to-complete statistic to measure feature feedback).


Product engagement score

The product engagement score (PES) is an objective measurement of user behaviour. It's gathered via Pendo and is made up of three different metrics:

  • Adoption – the average number of Core Events recorded across all active Visitors or Accounts
  • Stickiness – the average percentage of users who are high-frequency return users, based on Daily/Weekly/Monthly active user metrics
  • Growth – the sum of new and recovered Accounts or Visitors divided by dropped Accounts or Visitors

Once the three metrics have been gathered, they are put into a formula to generate the PES: (Adoption + Stickiness + Growth) / 3 = PES

For example: (10 + 32.1 + 37) / 3 = 26

You can find more information about the Pendo PES score on the Pendo website.
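
To make the arithmetic concrete, here is a minimal Python sketch of the PES calculation. The function name and the metric values are illustrative only; in practice Adoption, Stickiness and Growth are reported directly by Pendo.

    def product_engagement_score(adoption, stickiness, growth):
        """Average the three Pendo metrics to produce the PES."""
        return (adoption + stickiness + growth) / 3

    # Illustrative values matching the worked example above
    pes = product_engagement_score(adoption=10, stickiness=32.1, growth=37)
    print(round(pes))  # 26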

Product satisfaction score

The Product satisfaction score (PSS) is collected via a single-question survey, limited to a 0-10 scale response. The survey uses Pendo's built-in in-app NPS survey. The question posed is ‘How satisfied are you with [Product Name]?’ The lowest end of the scale is ‘Not at all satisfied’ and the highest is ‘Extremely satisfied’.

  • Scores from 0-6 are detractors
  • Scores of 7-8 are passive
  • Scores of 9 or 10 are promoters

The responses are then put into a formula to generate the PSS: (Promoters/Respondents = %) - (Detractors/Respondents = %) = PSS

For example: (206/722 = 29%) - (332/722 = 46%) = -17%
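
As a sketch only (the counts below are the example figures above, not real survey data, and the function name is illustrative), the PSS calculation looks like this in Python:

    def product_satisfaction_score(promoters, detractors, respondents):
        """NPS-style score: percentage of promoters minus percentage of detractors."""
        promoter_pct = round(promoters / respondents * 100)
        detractor_pct = round(detractors / respondents * 100)
        return promoter_pct - detractor_pct

    pss = product_satisfaction_score(promoters=206, detractors=332, respondents=722)
    print(pss)  # -17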

Product accessibility score

The Product accessibility score (PAS) is driven by the WCAG Accessibility Guidelines. We aim for WCAG 2.1 AA compliance in all our products, and the accessibility statement will outline any accessibility failings. Every major or minor failing in an accessibility statement deducts from an initial score of 100: major failings deduct 5 points, and minor failings deduct 2 points.

The level of failing applied depends on the impact. For example, if your users' understanding of your product relies on lots of diagram images, but none of these images have alt text, this would be a major failing because the product would likely be unusable for a user who has a visual impairment and uses a screen reader. If the same failing was on a product that uses graphical images a user doesn't need to see to complete a task, it would be marked as a minor failing. If your product doesn’t have an up-to-date statement, it will receive a score of 0.

The calculation below is used to generate a PAS: (Current accessibility statement = 100) – (Sum of failures) = PAS

For example: (Current accessibility statement = 100) – (2 + 2 + 2 + 5 + 5 = 16) = 84
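
A minimal sketch of the PAS deduction, assuming each failing on the statement is simply recorded as 'major' or 'minor' (the function name and data shape are illustrative):

    # Points deducted per failing, as described above
    DEDUCTIONS = {"major": 5, "minor": 2}

    def product_accessibility_score(failings, has_current_statement=True):
        """Start at 100 and subtract a deduction for each failing on the statement."""
        if not has_current_statement:
            return 0  # No up-to-date accessibility statement scores 0
        return 100 - sum(DEDUCTIONS[f] for f in failings)

    pas = product_accessibility_score(["minor", "minor", "minor", "major", "major"])
    print(pas)  # 84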

Support score

The need for customer support often reflects a UX failure, and as such, it plays a role in benchmarking. Using data from Salesforce, we extract the number of calls tagged ‘Help and Advice’ across the 6-month period between benchmark tests. We then combine this with the number of unique visitors over the same time frame to produce a percentage, which forms the Support score (SS). The data is put into the following formula:

(Number of calls / Unique visitors) x 100 = SS

For example: (949 / 47,600) x 100 = 2

Note: This benchmark score is unique among the scores, as a lower value is better rather than a higher one.
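
The Support score itself is a simple percentage. Below is a sketch using the example figures above (in practice the call count comes from Salesforce and the visitor count from Pendo; the function name is illustrative):

    def support_score(help_and_advice_calls, unique_visitors):
        """Calls tagged 'Help and Advice' as a percentage of unique visitors."""
        return round(help_and_advice_calls / unique_visitors * 100)

    ss = support_score(help_and_advice_calls=949, unique_visitors=47600)
    print(ss)  # 2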


Understanding your score

It’s important to understand that we don’t have a target score; we are instead looking for continuous progression. Incremental improvement matters more than hitting an arbitrary goal. Note that PES, PSS and SS are percentage-based, which keeps products comparable regardless of how many users they have.

The four benchmarks are added into the following formula to generate your final score:

(PES + PSS + PAS) – SS = UX Benchmark

For example: (26 + (-17) + 84) – 2 = 91
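
Pulling the four KPIs together, here is a minimal sketch of the final calculation (the values are the worked examples from each section above; the function name is illustrative):

    def ux_benchmark(pes, pss, pas, ss):
        """Combine the four KPIs; the Support score is subtracted because lower is better."""
        return (pes + pss + pas) - ss

    print(ux_benchmark(pes=26, pss=-17, pas=84, ss=2))  # 91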

Product and market comparison

Products will get the most value from the study when they produce all four KPIs, as comparison across the market is only valid when every product tracks the same metrics.

Tracking a single product will give a more rounded view of your UX. You can periodically review the benchmark score and each KPI to see what has improved (or not), which allows you to evaluate the areas that need attention and focus. You can view the currently collected benchmark scores below:

[Graph mapping each product against each other](Link to excel with graph mapping all products) [Spreadsheet with individual KPI results](Link to excel with product KPI results)

Improving your score

Don’t worry if your score isn’t what you were expecting! Having room for improvement is a great way to kickstart your projects and start to make positive, experience-focused changes to your product.

There’s no single way to fix your score - first, you need to drill down into the metrics to find out what is driving the score down:

  • Have you got too many accessibility failings?

Book time with your development team to see if any of the fixes can be included in the next set of releases.

  • Too many support calls vs unique visitors?

Dive into your product's Salesforce data: is there a common theme that users are calling in about? It might be a pain point you are already aware of, or something completely new that may require a review of your roadmap. You may find that improving other scores has a knock-on effect on your SS.

  • Are detractors dragging down your PSS? Your survey should offer the user a space to comment, and you can review the responses to see if there is a theme across the comments. It could be centred around a certain function. If not, look at the accounts that are rating you low and see if they are giving feedback on another platform, or check in directly with their account manager.

  • Is the adoption score affecting the total PES?

It might be worth checking that your Core Events are set up correctly, as poorly configured Core Events can lead to bad data.
