INTRODUCTION: Differential item functioning (DIF) analyses are commonly used to evaluate health-related quality of life (HRQoL) instruments. There is, however, a lack of consensus as to how to assess the practical impact of statistically significant DIF results. METHODS: Using our previously published ordinal logistic regression DIF results for the Fatigue scale of an HRQoL instrument as an example, we investigated the practical impact of DIF on a particular Norwegian clinical trial. The DIF results were used to estimate the difference in mean Fatigue scores that would be expected had the same trial been conducted in the UK. These differences were then compared with published information on what would be considered a clinically important change in scores. RESULTS: The item with the largest DIF effect resulted in differences between the mean English and Norwegian Fatigue scores that, although small, could be considered clinically important. Sensitivity analyses showed larger differences for shorter scales and when the proportions in each response category were equal. DISCUSSION: Our scenarios suggest that translation differences in an item can result in small, but clinically important, differences at the scale score level. This is more likely to be problematic for observational studies than for clinical trials, where randomised groups are stratified by centre.
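
The finding that shorter scales show larger differences can be illustrated with a minimal sketch. It is not the analysis reported here. It assumes a scale scored as the mean of its items and linearly transformed to 0-100, with items on a hypothetical 1-4 response range. The function names and the item-level shift of 0.3 are illustrative assumptions, not values from the study.

```python
def scale_score(item_means, min_score=1, max_score=4):
    """Transform the average of the item means to a 0-100 scale score.

    Assumes (hypothetically) that every item uses the same response
    range, min_score..max_score.
    """
    raw = sum(item_means) / len(item_means)
    return (raw - min_score) / (max_score - min_score) * 100


def score_shift(n_items, item_shift, min_score=1, max_score=4):
    """Change in the 0-100 scale score when the mean of ONE item shifts
    by item_shift raw points and the other items are unaffected."""
    return item_shift / n_items / (max_score - min_score) * 100


if __name__ == "__main__":
    hypothetical_item_shift = 0.3  # illustrative value, not from the study
    for n_items in (2, 3, 5, 10):
        shift = score_shift(n_items, hypothetical_item_shift)
        print(f"{n_items} items: scale score shift = {shift:.2f}")
```

Under these assumptions the same item-level shift is diluted in proportion to the number of items, so a single affected item moves the score of a short scale further than that of a long one. The sketch does not model the second sensitivity finding, on the distribution of responses across categories.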