Stop using AI for research synthesis
AI is best at derivative, time-consuming work. So why do we keep trying to use it in user research?
One of the most common mistakes I see inexperienced teams make happens during research synthesis. Everyone on the team is given access to the digital sticky board and told to just ‘have at it,’ so everyone begins silently stacking stickies, with little to no cross-communication, to build out an affinity map. After a high-level categorization has been completed, we all pat ourselves on the back.
So it’s natural that when AI (specifically ChatGPT) entered the scene, one of the first things designers did was try to apply it to synthesis. Speeding up what feels like a boring exercise of pattern recognition? Bah! Sign me up, says the team.
But silent mapping rarely produces valuable insights, and sticking AI into it is going to make it worse, not better or faster.

What makes initial sorting exercises important?
Let’s understand why this silent-sort tendency is problematic.
Affinity mapping isn’t just about developing an artifact. It’s about a team reaching consensus on the insights, problem spaces, assertions, and hypotheses that are revealed in the affinity map.
The key here is consensus — if everyone is too busy silently grouping, then information isn’t being shared. Patterns aren’t being identified. Nuanced assertions aren’t found. Without consensus-making, affinity maps are performative at best, wasteful at worst.
Some will claim that after the first sort, *that’s* when the heavy sorting toward consensus begins, but this path misses a critical opportunity. All too often, these teams’ synthesis never gets past rudimentary categorization.
A method for consensus-making during synthesis
First, let’s contrast the method above with how I typically do my initial sorting exercises.
Whenever I kick off the initial sort, I always make sure that everyone who participated in the research is present. Engineers, stakeholders, subject matter experts — if they contributed to or participated in the research, I want them there.
I then go around the room and call on someone to talk about a sticky they found insightful (typically drawn from the notes of sessions they attended, though it doesn’t have to be), and they place it on the board. Others who have similar sentiments in their notes then map those to the original sticky right in the moment. We briefly discuss the new group, then move on to the next person until we’re done.
Once that’s complete, we have our initial mapping of the most obvious insights and assertions. The entire team was present and collectively came to understand the research findings landscape, so even if everyone can’t be present going forward, we’ve gained some level of consensus. The team then continues churning on the map, finding deeper patterns and insights. Our understanding is a shared one.
Compare that to a team mapping silently, or to everyone mapping concurrently.
In a silent or concurrent mapping, nobody knows where patterns might be emerging. The team has no shared understanding, just individual ones. The weight of any given finding is felt less, because the individuals didn’t collectively experience it (‘oh my god, 20 people said it takes 3 hours to fill out one form?!’).

AI makes the silent mapping method worse, not better.
First, let’s remember that AI doesn’t ‘understand’ anything that it’s doing. LLMs are word calculators, statistically predicting the next most likely word or phrase. At best, one will pick up on common word patterns and categorize against those. It doesn’t ‘understand’ context, nuance, or even what the hell is being discussed in any given note. It’s not a subject matter expert, and its training data most likely contains no subject matter expertise; it only knows which words tend to appear together. It’s been labeled a stochastic parrot for this reason.
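To make the word-calculator point concrete, here’s a deliberately crude Python sketch. It’s a toy bag-of-words comparison with hypothetical sticky notes, not how any real model or product works (real LLMs use far richer statistics), but the failure mode is the same in kind: grouping driven by surface word overlap confidently merges notes that merely share vocabulary, and misses notes that actually share meaning.

```python
from collections import Counter
import math

def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity: pure surface word overlap, zero meaning."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Two hypothetical stickies that mean the SAME thing, worded differently:
same_meaning = ("Filling out the intake paperwork eats my whole morning",
                "One form takes three hours to complete")

# Two stickies that SHARE most of their words but mean opposite things:
shared_words = ("The form takes three hours to complete",
                "The form takes three seconds to complete")

print(cosine_similarity(*same_meaning))  # 0.0   -> 'no pattern here'
print(cosine_similarity(*shared_words))  # ~0.86 -> 'same group!'
```

A researcher sorting out loud catches both cases instantly; a pattern matcher gets both wrong.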
…does that sound like what’s valuable about affinity mapping?
For a team using the ‘silent’ method, this now robs them of the one opportunity they had to learn about their research landscape and begin developing consensus. The team members aren’t even part of the initial sort anymore, placing their trust, by default, in whatever groups the AI word-calculates, biases included.
This pollutes our insights.
Too often I see teams that use the ‘silent’ method come up with arbitrary, basic categories (groups labeled “challenges,” “day to day,” “other software”) but almost never get down to the level where they can make assertions or develop testable hypotheses (“They need to be able to enter this data in 20 seconds or less.” “If we can automate X, then they can spend more time reducing Y.”).
Adding a word calculator to an already broken process just makes it more broken. The only thing AI gives a team is the savings of a few hours a few times a month, at best. But what it takes from us, the ability to understand a problem landscape as a team, is the whole point of synthesis.
Please be careful with using AI — if you don’t understand what it’s doing, then you can’t trust what it’s saying.