
Goliath Proving Tired: GPT is not ready yet.
A Tale of Three Chat-GPT Models in the Context of UX Research
As we tread upon the uncharted territory of a post-AI UX, assessing the accuracy and practicality of the available versions of Chat-GPT — Legacy GPT 3.5 (free), Turbo GPT 3.5, and GPT 4 (subscription-based) — becomes imperative.
To that end, I asked each model to analyze interview data from a school project, identify patterns and potential bias in the questions, provide two different personas, and explain its rationale, as you can read in more detail here.
The sample is too small to be statistically significant and its methodology is biased, but the data is used for educational purposes. We do this exercise with students of the UX Fundamentals program at Kingsborough Community College (CUNY) in the Affinity Mapping & Persona class to show how implicit bias can derail a project.
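For readers who want to repeat the exercise, here is a minimal sketch of how the four tasks could be bundled into a single request. It is an illustration only: the function name `build_prompt`, the wording of each task, and the placeholder transcripts are my assumptions, not the prompt used in this study.

```python
# Hypothetical sketch: the task wording below paraphrases the four requests
# described in the article; it is NOT the prompt actually used in the study.
TASKS = [
    "Identify patterns across the interviews.",
    "Identify potential bias in the interview questions.",
    "Provide two different personas based on the data.",
    "Explain the rationale behind your analysis and personas.",
]


def build_prompt(interviews: list[str]) -> str:
    """Combine interview transcripts and the four tasks into one prompt string."""
    # Label each transcript so the model can refer to interviews by number.
    transcripts = "\n\n".join(
        f"Interview {i}:\n{text.strip()}"
        for i, text in enumerate(interviews, start=1)
    )
    # Number the tasks so each answer can be checked against its request.
    numbered_tasks = "\n".join(
        f"{i}. {task}" for i, task in enumerate(TASKS, start=1)
    )
    return (
        "Analyze the interview data below.\n\n"
        f"{transcripts}\n\n"
        f"Tasks:\n{numbered_tasks}"
    )


if __name__ == "__main__":
    # Placeholder transcripts, for illustration only.
    sample = ["(transcript of interview A)", "(transcript of interview B)"]
    print(build_prompt(sample))
```

Sending the same text to every model keeps the comparison fair, since any difference in the output then comes from the model rather than from the wording of the request.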
Findings
The free Chat-GPT version (Legacy GPT 3.5, soon to be deprecated) struggled with nuances, generating overgeneralized labels and broad personas that were unsuitable for practical use. It also failed to justify the exclusion of an interview or to identify biases in the questions.
Turbo GPT 3.5, on the other hand, showcased better adherence to prompts and more accurate pattern analysis. Although the personas it generated were more precise, they still required refinement for practical use. While it could explain the dismissal of a user interview, it did not address biases in the questions.
GPT 4, available to Plus users, stood out with the most detailed pattern analysis and the most usable personas. It also identified biases in the questions and offered guidance for improvement. However, it did not explain the exclusion of an interview.
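The findings above can be condensed into a small lookup table. This is only a sketch: the field names and the short persona labels are my own shorthand (assumptions), while the values restate what each model did in this exercise.

```python
# Summary of the findings reported above. Field names and persona labels
# are illustrative shorthand, not part of the original study materials.
FINDINGS = {
    "Legacy GPT 3.5": {
        "personas": "too broad",
        "explains_exclusion": False,
        "flags_question_bias": False,
    },
    "Turbo GPT 3.5": {
        "personas": "need refinement",
        "explains_exclusion": True,
        "flags_question_bias": False,
    },
    "GPT 4": {
        "personas": "usable",
        "explains_exclusion": False,
        "flags_question_bias": True,
    },
}


def models_that(capability: str) -> list[str]:
    """Return the models for which the given yes/no capability was observed."""
    return [
        model
        for model, result in FINDINGS.items()
        if result[capability] is True
    ]


if __name__ == "__main__":
    print(models_that("explains_exclusion"))
    print(models_that("flags_question_bias"))
```

Laid out this way, the gap is easy to see: no single model both explained the excluded interview and flagged the bias in the questions.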
Implications in the UX Research Landscape
The study’s findings hold critical implications for using Chat-GPT in the context of UX research:
- While the free version, Legacy GPT 3.5, could be a starting point for exploring user data, it is ill-equipped for in-depth analysis.
- Turbo GPT 3.5 (subscription-based) presents a step up, offering better outputs but requiring human refinement.
- GPT 4 (subscription-based, currently limited to 25 messages every 3 hours) emerges as the most promising option, providing detailed analyses and practical personas. Nevertheless, its inability to address certain aspects, such as explaining the exclusion of an interview, highlights the need for human-led analysis.
GPT 4, the latest iteration of Chat-GPT, has clearly advanced what AI language models can do in user research analysis. However, its limitations are a stark reminder that UX Designers and Researchers must not rely solely on this tool.
Balancing AI-generated insights and traditional human-led analysis techniques is essential as we stand at the precipice of AI integration in UX research. Only then can we harness the true potential of AI while ensuring accurate, nuanced, and actionable insights for the betterment of user experience.
Embrace the future of UX, but do so with a discerning eye and a human touch.