
HF/UX People, Have You Checked Your Real Outliers in Your Data? If Not, Revisit Your Results ASAP

Mohsen Rafiei

During my postdoc, I worked on evaluating how people interacted with Adidas wearable equipment. Most participants followed expected patterns, but occasionally I encountered results that completely broke the mold. One participant wore the device incorrectly, leading to unusable data. Another had an unusual walking pattern that confused every sensor we tested. Then there was the triathlete whose fitness level was so far above the average that their physiological data seemed like an anomaly. This was not an error but a natural reflection of their unique characteristics. These experiences forced me to ask an important question. Should these data points be included in the analysis or excluded because they might distort the broader findings?



Outliers are data points that deviate significantly from the rest of your dataset. In human factors (HF) and UX research, they can occur for many reasons: human error, equipment malfunction, or natural variation in user behavior. The challenge is deciding what to do with them. They can provide meaningful insights about edge cases and unique user groups. However, they can also pull averages away from the typical participant, inflate variability, and reduce the reliability of statistical results.
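
To make that distortion concrete, here is a small sketch in Python. The task completion times are invented for illustration and do not come from any study; the last value stands in for a single extreme participant.

```python
from statistics import mean, median, stdev

# Hypothetical task completion times in seconds (made-up numbers).
# The final value represents one extreme participant.
times = [41, 38, 45, 40, 43, 39, 42, 44, 40, 180]
typical = times[:-1]  # the same sample without the extreme value

for label, sample in (("without outlier", typical), ("with outlier", times)):
    print(
        f"{label:>16}: mean={mean(sample):.1f}  "
        f"median={median(sample):.1f}  sd={stdev(sample):.1f}"
    )
```

In this toy sample, one value moves the mean from about 41 seconds to about 55 and multiplies the standard deviation many times over, while the median barely changes. That is one reason robust summaries are worth reporting alongside the mean when outliers are present.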



The first step in handling outliers is understanding their origin. Was the outlier caused by a participant misunderstanding the instructions or by a technical glitch, or is it a valid reflection of a unique user experience? If it is meaningful, it might highlight important user needs or design limitations. If it is noise, excluding it could clarify the trends in your data. Either way, your decision must be intentional and transparent.



Detecting outliers often requires more than a quick glance at the data. Mahalanobis Distance is useful for identifying multivariate outliers: it measures how far a point lies from the center of the data while accounting for the spread of, and correlations between, the variables. Local Outlier Factor detects anomalies by comparing the local density around a data point to the density around its neighbors. Machine learning tools such as Isolation Forests flag observations that can be separated from the rest with only a few random splits, which makes them practical for large datasets. In regression models, Cook’s Distance can pinpoint influential points that might skew the fitted results. These methods give researchers a consistent, documented basis for flagging outliers, even in complex datasets, although the choice of cutoff still requires judgment.
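
As a rough illustration of the first technique, the sketch below computes Mahalanobis distances for two variables using only the Python standard library. Everything specific in it is an assumption made for the example: the function name `mahalanobis_2d`, the invented resting heart rate and cadence values, and the cutoff, which uses the common convention of the 97.5th percentile of a chi-square distribution with two degrees of freedom. That cutoff is only approximate for small samples, and in practice a library such as SciPy or scikit-learn would usually do this work.

```python
import math
from statistics import mean


def mahalanobis_2d(points):
    """Return each (x, y) point's Mahalanobis distance from the sample mean."""
    n = len(points)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    # Entries of the sample covariance matrix (n - 1 denominator).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, written out for 2x2.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(math.sqrt(d2))
    return distances


# Hypothetical (resting heart rate in bpm, cadence in steps/min) pairs.
# The last row imitates a highly trained athlete; all values are invented.
points = [
    (72, 108), (68, 104), (75, 111), (70, 109), (78, 106), (66, 107),
    (74, 105), (71, 110), (69, 106), (76, 108), (73, 107), (45, 125),
]

distances = mahalanobis_2d(points)

# For 2 degrees of freedom the chi-square quantile has the closed form
# -2 * ln(1 - p); p = 0.975 is an assumed convention, not a fixed rule.
cutoff = math.sqrt(-2 * math.log(1 - 0.975))
flagged = [i for i, d in enumerate(distances) if d > cutoff]
print("cutoff:", round(cutoff, 3))
print("flagged indices:", flagged)
```

A flag from a check like this only says that a point is unusual given the other points. Whether it reflects a device worn incorrectly or a genuinely exceptional participant still has to be decided by looking at how the data were collected.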



Outliers are not always a problem. In HF and UX research, they might represent rare but critical user groups whose experiences challenge assumptions about your design. They can reveal new insights and inspire improvements. So, before finalizing your analysis, carefully evaluate outliers for their origin, relevance, and impact. Whether you include, transform, or exclude them, document your decision. Outliers can strengthen your research and offer valuable insights if handled wisely. Ignoring them risks unreliable results.



©2020 by Mohsen Rafiei.
