
When More Questions Do Not Mean Better Insight

  • Writer: Mohsen Rafiei
  • Dec 22, 2025
  • 5 min read


I remember one specific survey that finally forced me to confront this problem head-on. We were evaluating an e-commerce redesign and the survey looked perfectly reasonable on paper. Among many items, we asked users to rate Visual Appeal and Attractiveness as two separate questions. Different words, different intentions, different stakeholders pushing for each. When the data came back, the correlation between those two items was almost perfect. Not theoretically similar. Practically identical. We had asked users the same question twice and congratulated ourselves for being thorough while quietly wasting their time. That moment stuck with me because nothing about the survey design process had warned us this would happen, and yet the data could not have been clearer.

Most UX researchers have a version of this story. Surveys grow slowly and politely. One extra question here to satisfy design. Another one there to reassure product. Someone senior asks whether we can also capture overall feeling just to be safe. We tell ourselves this is rigor, but often it is insecurity dressed up as coverage. When the results arrive, the unease begins. We are no longer worried about response rate. We are worried about meaning. Which questions actually told us something new and which ones just echoed what was already there.


Why Do UX Surveys Quietly Accumulate Redundant Questions?


This problem does not exist because researchers are careless. It exists because research happens inside organizations. Every question has a sponsor, and removing questions later is never just a technical decision. I have found that the hardest part of cleaning a survey is rarely the analysis. It is the conversation where you explain to a product manager or a VP that their favorite question is not pulling its weight. That question about Brand Excitement or Emotional Resonance sounded strategic in the meeting, but in the data it behaves like pure noise or, worse, is indistinguishable from three other items already in the survey.

There is also a cognitive mismatch at play. We design surveys using frameworks and tidy conceptual boundaries, but users experience products as blended wholes. What we think of as clarity, ease, and intuitiveness often collapse into a single gut level judgment for respondents. At the same time, some questions that feel obvious to us are interpreted very differently across users. The survey does not break when this happens. It quietly absorbs the inconsistency, and the averages still look fine. The real damage is structural, not visible in a dashboard.


What Question Should We Be Asking Instead?


Eventually, I stopped asking whether a question sounded good and started asking whether it behaved well. When people answer this survey, which questions actually move together in a consistent way across respondents? Not which ones should move together according to a framework we agreed on months ago, but which ones actually do. This shift sounds small, but it changes everything. It reframes survey quality as an empirical property rather than a design aspiration.

Once you ask this, certain patterns become hard to ignore. Some questions rise and fall together almost perfectly, which tells you they are tapping into the same underlying perception whether you like it or not. Others barely connect to anything, floating in the data as lonely items with no clear home. These are the questions that feel vague in hindsight, or that try to do too much at once, or that depend heavily on personal interpretation. You can feel the survey creaking under its own weight at this stage.
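If you want to see this for yourself before reaching for anything heavier, a plain correlation matrix is enough to surface the near-duplicates. Here is a minimal sketch in Python, assuming responses already sit in a pandas DataFrame; the file name, column names, and the 0.85 cutoff are all hypothetical placeholders, not recommendations.

```python
import pandas as pd

responses = pd.read_csv("survey_responses.csv")  # hypothetical file name
items = ["visual_appeal", "attractiveness", "ease_of_use", "trust"]  # hypothetical columns

# Pairwise correlations between items; values near 1.0 mean two
# questions are behaving as the same question.
corr = responses[items].corr()

# Flag item pairs above a (hypothetical) redundancy threshold.
threshold = 0.85
for i, a in enumerate(items):
    for b in items[i + 1:]:
        r = corr.loc[a, b]
        if abs(r) >= threshold:
            print(f"{a} and {b} look redundant (r = {r:.2f})")
```

A check this simple will not tell you what the underlying dimensions are, but it will show you the pairs that are quietly echoing each other.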


Letting the Math Be the Bad Guy


This is where factor analysis earns its keep, not as a magical solution but as a way to make the math say the uncomfortable things you already suspect. Factor analysis looks at shared variance among items and groups together questions that behave as if they are measuring the same thing. In practice, it often tells you something blunt. These three questions are the same question. This one does not belong anywhere. This factor does not make conceptual sense, so maybe your wording does not either. Of course, this only works if you respect the constraints. You cannot run factor analysis on ten responses and pretend the output means anything. Garbage in still means garbage out. In most UX contexts, you need at least a hundred responses and preferably closer to two hundred before the structure stabilizes enough to trust. Otherwise, you are just fitting patterns to noise and giving yourself false confidence.
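To make that concrete, here is a minimal exploratory factor analysis sketch using scikit-learn's FactorAnalysis with a varimax rotation. The data file, the assumption that every column is a numeric rating item, and the choice of three factors are all illustrative; dedicated packages such as factor_analyzer in Python offer richer diagnostics, but the core move is the same.

```python
import pandas as pd
from sklearn.decomposition import FactorAnalysis

responses = pd.read_csv("survey_responses.csv")  # hypothetical file name
X = responses.dropna()  # assumes all columns are numeric rating items

# Three factors and a varimax rotation are illustrative choices, not rules.
fa = FactorAnalysis(n_components=3, rotation="varimax", random_state=0)
fa.fit(X)

# Loadings: one row per question, one column per factor. Questions that
# load heavily on the same factor are behaving as one question.
loadings = pd.DataFrame(
    fa.components_.T,
    index=X.columns,
    columns=[f"factor_{k + 1}" for k in range(fa.n_components)],
)
print(loadings.round(2))
```

Reading the loadings table is where the blunt conversations start: items that pile onto one factor are candidates for merging, and items that load on nothing are candidates for cutting.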


One thing I wish more people said out loud is that factor analysis results are often messy. You do not always get clean factors with tidy labels. Sometimes a question about speed loads onto the same factor as trust, and you sit there trying to understand whether users associate performance with reliability or whether the wording accidentally dragged emotion into a functional item. Cross-loadings happen. Factors appear that technically meet eigenvalue thresholds but make no interpretive sense. This is normal, not a failure of the method. This is also where experience matters. You do not blindly follow the output. You interpret it. You combine the statistical signal with domain knowledge, wording review, and a gut check. Sometimes the right call is to drop a question even if it technically loads well because it adds no actionable insight. Other times you keep a messy item because it captures something strategically important that you plan to refine later.
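The eigenvalue side of that judgment call is, at least mechanically, simple. The sketch below computes the eigenvalues of the item correlation matrix and counts how many clear the conventional greater-than-one rule; the data file is again a hypothetical placeholder, and clearing the threshold says nothing about whether a factor makes interpretive sense.

```python
import numpy as np
import pandas as pd

responses = pd.read_csv("survey_responses.csv")  # hypothetical file name
corr = responses.dropna().corr()

# Eigenvalues of the correlation matrix, largest first (these are the
# numbers behind a scree plot and the Kaiser greater-than-one rule).
eigenvalues = np.linalg.eigvalsh(corr.values)[::-1]
print("Eigenvalues:", np.round(eigenvalues, 2))

# A factor can clear this bar and still make no interpretive sense,
# so treat the count as a starting point, not a verdict.
n_retained = int((eigenvalues > 1.0).sum())
print(f"Factors with eigenvalue > 1: {n_retained}")
```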


When you do this work honestly, the benefits show up quickly. Surveys get shorter, and not just shorter but sharper. Respondents stay engaged longer. Composite scores become defensible because they are built from items that actually belong together, often verified with something as simple as Cronbach's alpha. Downstream analyses behave better because you are no longer averaging together questions that never should have been grouped in the first place.
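Cronbach's alpha is simple enough to compute directly from the item variances, which is often all you need to sanity check a composite. A small sketch, with hypothetical column names grouped as if the factor analysis had already told us they belong together:

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """alpha = k / (k - 1) * (1 - sum of item variances / variance of the summed score)."""
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - sum_item_var / total_var)

responses = pd.read_csv("survey_responses.csv")  # hypothetical file name
# Hypothetical item grouping, e.g. the "appeal" factor from the analysis above.
appeal_items = responses[["visual_appeal", "attractiveness", "aesthetics"]].dropna()
print(f"Cronbach's alpha: {cronbach_alpha(appeal_items):.2f}")
```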

There is also a political benefit that surprised me early on. Once you can point to loadings, eigenvalues, or even a basic correlation matrix from R, Python, SPSS, or yes, sometimes even Excel, the conversation shifts. It is no longer you versus a stakeholder. It is the data doing the talking. The math becomes the bad guy, and that makes the decision to cut questions much easier to swallow.


How This Changes the Way We Design Questions


After you go through this process a few times, it changes how you design surveys from the start. You worry less about perfection and more about testability. You allow some redundancy early, knowing you can clean it up later. You pay more attention to wording because you have seen how quickly vague language turns into structural noise. Most importantly, you stop equating more questions with better research.


In the end, factor analysis is not about dimensionality reduction in the abstract. It is about respect. Respect for users' time, respect for the data, and respect for the fact that how people experience products is often simpler and messier than our frameworks suggest. When used this way, it stops being a textbook technique and becomes a practical tool for making better decisions under real world constraints.

 
 
 
