System Usability Scale (SUS) explained

Published in Bootcamp · 5 min read · Sep 11, 2021

Here is what you need to know about the System Usability Scale, commonly known as SUS, in 5 easy steps.

How and why does it work?

SUS measures perceived ease of use. It consists of 10 “standard” questions, each answered on a Likert scale that goes from Strongly Disagree to Strongly Agree.

Here is the standard set of 10 questions.

I think that I would like to use this system frequently.
I found the system unnecessarily complex.
I thought the system was easy to use.
I think that I would need the support of a technical person to be able to use this system.
I found the various functions in this system were well integrated.
I thought there was too much inconsistency in this system.
I would imagine that most people would learn to use this system very quickly.
I found the system very cumbersome to use.
I felt very confident using the system.
I needed to learn a lot of things before I could get going with this system.

Step 1

You will need to change some of this language to suit your needs.

But STOP.

Before you change the language, understand what this set is trying to achieve. Also note that how much you change the questions will affect how you interpret the score. (https://uxpajournal.org/dropping-item-sus/)

Each group of 2 is measuring a similar “thing”. Let’s take the first 2 questions, for example:

I think that I would like to use this system frequently.
I found the system unnecessarily complex.

One might say that this pair measures the usefulness of the system/software/website/app.

If you split the above 10 questions into groups of 2, you may notice the following 5 categories emerge (more or less).
(The language in the original questions sometimes doesn’t work well, especially question number 1.)

A. Usefulness

Q1. I think that I would like to use this system frequently.
(It has been found that dropping question 1 doesn’t have a significant impact on the score. Therefore, one might hypothesize that you can change question 1 slightly to suit your context better.)

Q2. I found the system unnecessarily complex.

To me, “unnecessarily” is the keyword here. If something is unnecessarily complex, I wouldn’t use it because of that complexity. Hence, it is not useful to me.

B. Ease of use

Q3. I thought the system was easy to use.
Q4. I think that I would need the support of a technical person to be able to use this system.

C. Efficiency

Q5. I found the various functions in this system were well integrated.
Q6. I thought there was too much inconsistency in this system.

The assumption here is that if the various functions in the system are well integrated, it takes me less time to get things done, which can be categorized as “perceived efficiency”. Whereas if there is too much inconsistency within the system, or I have to use other methods to get things done, it adds to the time taken to get the job done.

D. Learnability

Q7. I would imagine that most people would learn to use this system very quickly.
Q8. I needed to learn a lot of things before I could get going with this system. (Originally Q10)

E. Satisfaction

Q9. I felt very confident using the system.
Q10. I found the system very cumbersome to use. (Originally Q8)

How much flexibility you have with your questions is clearly stated here:
https://uxpajournal.org/dropping-item-sus/

Then, under each category, craft 2 questions: one in a positive tone and the other in a negative tone.

Even-numbered questions are always in a negative tone.
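The pairing rule can be sketched in a few lines of Python. This is only an illustration of the idea: the names `CATEGORIES` and `build_questionnaire` are my own, and the example uses the original SUS wording grouped into the five categories suggested above. Swap in your own wording as needed.

```python
# Each category holds one (positive, negative) pair of questions.
# The wording is the original SUS wording; the grouping is the one suggested above.
CATEGORIES = {
    "Usefulness": (
        "I think that I would like to use this system frequently.",
        "I found the system unnecessarily complex.",
    ),
    "Ease of use": (
        "I thought the system was easy to use.",
        "I think that I would need the support of a technical person to be able to use this system.",
    ),
    "Efficiency": (
        "I found the various functions in this system were well integrated.",
        "I thought there was too much inconsistency in this system.",
    ),
    "Learnability": (
        "I would imagine that most people would learn to use this system very quickly.",
        "I needed to learn a lot of things before I could get going with this system.",
    ),
    "Satisfaction": (
        "I felt very confident using the system.",
        "I found the system very cumbersome to use.",
    ),
}


def build_questionnaire(categories):
    """Flatten the pairs so positive items get odd numbers and negative items even numbers."""
    questions = []
    for name, (positive, negative) in categories.items():
        questions.append((name, "positive", positive))
        questions.append((name, "negative", negative))
    return questions


for number, (category, tone, text) in enumerate(build_questionnaire(CATEGORIES), start=1):
    print(f"Q{number} [{category}, {tone}] {text}")
```

Keeping the positive question first in every pair is what lets the scoring treat odd and even questions differently later on.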

Once you have this down, the rest is easy and has been covered on several different sites. If you already know how it works from here, you don’t need to read further. If you don’t, please continue.

Step 2

Set up your Likert scale so that each of the 10 questions offers the same five options, from Strongly Disagree to Strongly Agree.

Step 3

Once you get your data in a spreadsheet, replace every response with a number in the following way:

Strongly Disagree: 1 point
Disagree: 2 points
Neutral: 3 points
Agree: 4 points
Strongly Agree: 5 points
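If your responses live outside a spreadsheet, the same replacement can be sketched in Python. The point values come straight from the list above. The names `LIKERT_POINTS` and `to_points` are made up for this example, and the case and whitespace clean-up is my own assumption about how exported labels might vary.

```python
# Point values taken from the list above.
LIKERT_POINTS = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}


def to_points(answers):
    """Convert one respondent's answer labels into point values."""
    # Assumption: labels may differ in case or have stray spaces, so normalize first.
    lookup = {label.lower(): points for label, points in LIKERT_POINTS.items()}
    return [lookup[answer.strip().lower()] for answer in answers]


row = ["Agree", "Disagree", "Strongly Agree", "Strongly Disagree", "Neutral",
       "Disagree", "Agree", "Disagree", "Agree", "Neutral"]
print(to_points(row))  # prints [4, 2, 5, 1, 3, 2, 4, 2, 4, 3]
```

An unknown label raises a `KeyError`, which is usually what you want: it is better to catch a typo in the data than to score it silently.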

Your spreadsheet should look something like this:

[Figure: SUS spreadsheet, words replaced with numbers]

Step 4

The column after the last question should contain the following formula with the appropriate cell references (here the ten answers sit in columns F to O), like so:

=((F2-1)+(5-G2)+(H2-1)+(5-I2)+(J2-1)+(5-K2)+(L2-1)+(5-M2)+(N2-1)+(5-O2))*2.5
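For anyone who prefers code to spreadsheets, here is a minimal Python sketch of the same formula. It assumes the answers are already converted to points (1 to 5) and listed in question order, with odd-numbered questions positive and even-numbered questions negative. The function name `sus_score` is my own.

```python
def sus_score(points):
    """Score one respondent from 10 point values (1-5) given in question order."""
    if len(points) != 10:
        raise ValueError("SUS needs exactly 10 answers")
    total = 0
    for index, value in enumerate(points):
        if index % 2 == 0:
            # Q1, Q3, Q5, Q7, Q9 are positive: contribution is value - 1.
            total += value - 1
        else:
            # Q2, Q4, Q6, Q8, Q10 are negative: contribution is 5 - value.
            total += 5 - value
    # The ten contributions sum to 0-40; multiplying by 2.5 rescales to 0-100.
    return total * 2.5


print(sus_score([4, 2, 5, 1, 3, 2, 4, 2, 4, 3]))  # prints 75.0
```

A respondent who answers Neutral to everything lands on 50, which is a handy sanity check for your own data.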

Step 5

Take the average of the column containing the above formula (in this case column P). That average is your overall SUS score.
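The averaging step can also be sketched in Python. The example below repeats the per-respondent scoring so that it runs on its own. The three respondents are invented data for illustration only, and `average_sus` is a hypothetical helper name.

```python
from statistics import mean


def sus_score(points):
    """Score one respondent: odd questions add (value - 1), even questions add (5 - value)."""
    odd = sum(value - 1 for value in points[0::2])
    even = sum(5 - value for value in points[1::2])
    return (odd + even) * 2.5


def average_sus(rows):
    """Average the individual scores, like averaging the score column in the spreadsheet."""
    return mean(sus_score(row) for row in rows)


# Invented example data: three respondents, ten answers each, already in points.
responses = [
    [4, 2, 5, 1, 3, 2, 4, 2, 4, 3],  # scores 75.0
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],  # scores 100.0
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],  # scores 50.0
]
print(average_sus(responses))  # prints 75.0
```

Note that you average the individual scores, not the raw answers.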

What does it all mean?

The hypothesis behind modifying the questions based on the categories above is that the adjective rating below will still hold true, because the original questions organically fall into these categories.

It has been brought to my attention by John Brooke, the creator of SUS, that the categories I am suggesting above are an unintentional coincidence.

However, to me the organic coincidence is a result of the very nature of usability. In other words, it had to be so.

It will be interesting to study the SUS scores based on the above suggestion of categorizing the questions and see if the adjective rating or the benchmarking suggested below still holds true.

> 80.3 = A Best imaginable/Excellent

68–80.3 = B Good

68 = C Okay

51–68 = D Poor

< 51 = F Worst imaginable/Awful
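The bands above can be turned into a small lookup, sketched here in Python. The scale does not say which band the exact boundary values belong to, so this sketch assumes that 80.3 counts as B and 51 counts as D. That choice and the function name `sus_grade` are mine, so adjust them to the benchmark you follow.

```python
def sus_grade(score):
    """Return (grade, adjective) for a SUS score, using the bands listed above."""
    if score > 80.3:
        return "A", "Best imaginable/Excellent"
    if score > 68:
        # Assumption: exactly 80.3 falls into B.
        return "B", "Good"
    if score == 68:
        return "C", "Okay"
    if score >= 51:
        # Assumption: exactly 51 falls into D.
        return "D", "Poor"
    return "F", "Worst imaginable/Awful"


for example in (85, 75, 68, 60, 40):
    print(example, sus_grade(example))
```

Because C is a single point (68), an averaged score will almost always fall into one of the other four bands.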

Hope this helps some of you who are a little bit unsure of how SUS works, like I was not long ago! :)

If you have changed the questions in your SUS even slightly, please contact me on LinkedIn. I am building a database to see what effect changing the questions based on categories has on the adjective rating.

References
https://measuringu.com/sus

https://uxpajournal.org/sus-a-retrospective/
