Why We Should Be Careful With What Users Say: Understanding the Limits of Self-Reported Judgments in Research
- Mohsen Rafiei

One of the most common instincts in research is to ask people exactly what they think. We ask users why they chose a specific product, what they liked about an interface, what confused them during a task, what mattered most to their workflow, and what ultimately influenced their final decision. This approach feels intuitive, respectful, and efficient; after all, who knows the user better than the user themselves? Yet decades of psychological and behavioral research point to an uncomfortable truth: what people say about their decisions is often only loosely connected to the latent processes that actually produced those decisions. This discrepancy does not imply that users are inherently dishonest or deceptive. Rather, it suggests that human cognition is not built for transparent introspection. Much of our perception, evaluation, and decision-making occurs within the "cognitive black box," outside the reach of conscious awareness. When we ask users to explain their choices, we are frequently asking them to perform a generative act of story-construction rather than a simple retrieval of factual data. The primary danger in research is not that these stories are useless, but that we mistakenly treat these narratives as the actual mechanisms of behavior.
The phenomenon of the "illusion of introspective access" is a cornerstone of cognitive psychology, popularized by the seminal work of Richard Nisbett and Timothy Wilson. Their research famously demonstrated that people generally feel a high degree of confidence when explaining their choices, yet this confidence is often a poor predictor of accuracy. In classic "position effect" studies, participants were asked to choose the best-quality item from a row of identical products. Most participants chose the item on the far right, yet when asked why, they confidently attributed their choice to the texture, color, or superior craftsmanship of the item. None mentioned the position of the product, and most explicitly denied that the layout could have influenced them. This shows that we have limited access to the mental processes that generate our judgments. When pressed for an answer, we do not "look inside" to find a recording of the event; instead, we rely on a priori causal theories, which are plausible-sounding explanations based on cultural norms and logical expectations, to justify our actions after the fact. From the participant’s perspective, this explanation feels like a direct observation of their own mind, but from a scientific perspective, it is a post hoc rationalization.
To conduct rigorous research, we must maintain a sharp distinction between judgments and processes. A judgment is a discrete outcome, essentially a "readout" of a final state, whereas a process is the complex, often parallel chain of cognitive events that produced that outcome. Self-report is an excellent tool for capturing outcomes and the subjective narratives surrounding them, but it is an unreliable instrument for capturing the process itself. When a user claims they chose an option because it "felt more intuitive" or because they "trusted it more," they are applying a linguistic label to a holistic experience rather than describing the mechanics of how that experience emerged. These labels collapse a multitude of underlying factors, such as perceptual fluency, prior exposure, affective priming, and task framing, into a single, digestible concept. Consequently, two different users might offer identical verbal explanations for choices driven by entirely different cognitive mechanisms. One user’s "intuition" might be driven by familiarity with a design pattern, while another’s might be driven by the visual salience of a specific call-to-action, yet both will report the same vague sensation of "ease."
Ironically, the methodological pressure we apply to users often makes their explanations less accurate. The harder we push for a "why," the more we trigger the mind's "interpreter module," a concept from split-brain research suggesting that the brain is hardwired to create coherent stories to explain behavior, even when it lacks the relevant data. When people feel required to justify a decision in a research setting, they search for reasons that sound reasonable, socially acceptable, and internally consistent. This often leads to confabulation, where the participant unknowingly invents a logical path to a decision that was actually reached through heuristic or emotional shortcuts. In interviews and usability tests, repeated probing can inadvertently push users further away from the underlying process and closer to a polished, logical narrative construction. The result is data that looks clean and actionable on a slide deck but is fundamentally disconnected from the messy, non-linear drivers of actual human behavior.
Language itself acts as a significant bottleneck in the reporting of cognition. Many cognitive processes are continuous, parallel, and probabilistic, whereas language is discrete, sequential, and categorical. When users attempt to translate a nuanced internal state into words, they are performing a "lossy compression" of their experience. This phenomenon, known as "verbal overshadowing," suggests that the very act of describing an experience can fundamentally alter or even impair our memory of it. By forcing a complex, multi-dimensional feeling into the rigid structure of a sentence, nuance disappears, uncertainty is replaced by false confidence, and gradients of feeling are forced into binary categories. What remains is a simplified, vocalized version of the experience that is heavily shaped by the user's vocabulary, cultural background, and their expectations of what a "good" or "professional" answer should sound like. We are often measuring the user's ability to articulate a story rather than the experience they are attempting to describe.
Furthermore, user judgments are never provided in a vacuum; they are inherently social acts shaped by the context of the research and the perceived expectations of the audience. This introduces "demand characteristics," where participants subconsciously tailor their responses to appear consistent, competent, or helpful to the researcher. In product development, this often manifests as an overemphasis on "rational" features, like price or technical specifications, that users believe they should care about, while they underreport "trivial" factors like aesthetic appeal or status-seeking, even if those were the primary drivers of the interaction. In UX research, users may attribute friction to surface-level UI issues because those are easy to name, masking deeper mismatches between their mental models and the system's underlying logic. The result is a persistent gap between what users say matters during an interview and how they actually behave when they are alone with the product.
Acknowledging these limitations does not mean we should dismiss the user's voice entirely; rather, it requires us to reframe what that voice represents. User judgments are invaluable for understanding the "what it is like" of an experience, specifically the subjective reality of trust, satisfaction, and perceived value. These narratives are essential because they reveal the beliefs and expectations that will shape a user's future behavior and long-term loyalty. The error lies in treating these self-reported explanations as objective ground truth about the cognitive machinery of decision-making. We must remember that self-report and behavioral observation answer different questions. A user who says, "I didn't notice that button," is not necessarily providing a factual report on their foveal vision; they are reporting that the button did not enter their conscious narrative of the task. Both the behavioral fact, such as whether their eyes actually landed on the button, and the subjective report are important, but they must not be conflated.
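To keep those two questions from blending together in analysis, it can help to hold them as separate variables and compare them explicitly rather than collapsing them into a single "noticed" column. The sketch below does this for the button example; it is a minimal Python illustration, and the column names, participant IDs, and values are invented for the sake of the example rather than drawn from any real study.

```python
import pandas as pd

# Each row pairs one participant's post-task answer with their eye-tracking record.
# All names and values here are hypothetical, used only to illustrate the comparison.
df = pd.DataFrame({
    "participant":    ["p01", "p02", "p03", "p04", "p05", "p06"],
    "said_noticed":   [False, False, True,  True,  False, True],   # subjective report
    "fixated_button": [True,  False, True,  False, True,  True],   # behavioral record
})

# Cross-tabulate the two signals instead of treating them as interchangeable.
print(pd.crosstab(df["said_noticed"], df["fixated_button"],
                  rownames=["reported noticing"], colnames=["fixation recorded"]))

# Participants who looked at the button but never mentioned it are not mistaken;
# they are telling us the element never entered their conscious account of the task.
looked_but_unreported = df[df["fixated_button"] & ~df["said_noticed"]]
print(looked_but_unreported["participant"].tolist())
```

The cell where the two signals disagree, the people who fixated the button but never reported it, is usually the most interesting place to dig, because it marks exactly where the behavioral record and the conscious narrative part ways.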
Ultimately, the responsibility falls on the researcher to interpret user feedback with a high degree of humility and professional skepticism. This means resisting the seductive clarity of a confident user quote and instead designing studies that triangulate data from multiple sources, including behavioral traces, task performance, timing, and physiological signals. We must be honest with stakeholders: a clear, confident quote from a user is often a less reliable guide to what actually happened than a vague, messy behavioral signal. Good research does not ask users to act as their own cognitive scientists; instead, it creates environments where users can reveal their patterns through natural interaction with well-designed tasks. We must treat verbal reports not as a window into the mind, but as a window into how the user makes sense of their world. When we design for the stories people tell rather than the systems they actually inhabit, we risk building products that satisfy a user's narrative while failing to solve their actual problems.
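As one rough illustration of what that triangulation can look like in practice, the sketch below pairs a post-task ease rating with two behavioral signals, time on task and error count, and flags participants whose confident "it was easy" diverges from a slow, error-prone trace. The column names and thresholds are assumptions made up for this example; in a real study they would be calibrated against the task and a baseline condition.

```python
import pandas as pd

# Hypothetical per-participant records: a self-reported ease rating plus two
# behavioral difficulty signals captured from the same session.
df = pd.DataFrame({
    "participant":    ["p01", "p02", "p03", "p04"],
    "reported_ease":  [5, 5, 2, 4],            # post-task rating, 1 = very hard, 5 = very easy
    "time_on_task_s": [38.0, 142.0, 55.0, 210.0],
    "error_count":    [0, 4, 1, 6],
})

# Crude behavioral markers of difficulty; these thresholds are illustrative only
# and would normally come from pilot data or a baseline condition.
slow = df["time_on_task_s"] > 120
error_prone = df["error_count"] >= 3

# The say-do gap: the participant reports ease while the trace suggests struggle.
df["say_do_gap"] = (df["reported_ease"] >= 4) & (slow | error_prone)

# Neither signal overrides the other; the divergence itself is the finding to probe
# in a follow-up interview or a redesigned task.
print(df.loc[df["say_do_gap"], ["participant", "reported_ease", "time_on_task_s", "error_count"]])
```

Nothing in this flags who is "right"; it simply locates the participants for whom the narrative and the behavior answered different questions, which is where the next round of questions should go.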
The human mind is not lying to us; it is simply doing what it evolved to do. It acts first, then it explains. Those explanations are vital for social navigation and for maintaining a coherent sense of self, but they were never intended to serve as high-precision scientific instruments. As researchers, our job is not to silence the user, but to listen with a deep understanding of the cognitive architecture that produces their words. We must remember that what sounds like an answer is often a narrative, and the real work of research lies in understanding the hidden forces that produced that narrative in the first place. By shifting our focus from what they said to why they might have said it, we move closer to a more honest and effective practice of human-centered design.


