The effect of measurement accuracy and rounding on hypothesis testing
I am measuring the temperature at home with an accuracy of one tenth of a degree (Celsius). The easily available public information gives the temperature in whole degrees, with no decimals.
I am doing very basic hypothesis testing: my null hypothesis is that the temperatures I measure have no systematic bias upwards or downwards from the public data, while my alternative hypothesis is that the temperatures I measure are systematically higher or systematically lower. I am doing a simple t-test and checking whether the average of the differences is far from zero.
Does it make a difference whether I round my own measurements to integers or, essentially equivalently, round the differences to integers? In particular, is there a bias in one direction or the other, towards rejecting the null hypothesis or towards retaining it, if my data has more accuracy than what I am comparing it to? (A small simulation sketch of the setup follows the notes below.)
Notes
This project breaks some assumptions of hypothesis testing: at least the independence of the measurements, and possibly the normality of the differences.
The point of the question is not these issues, but rather the possible effect on the hypothesis test of rounding, or of differing precision between the ground-truth data and the measurements.
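For concreteness, here is a minimal simulation sketch of the setup; the 0.2 °C bias, the noise level, and the sample size are made-up illustration values, not part of my actual data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Made-up illustration: true temperatures, my readings with a small upward
# bias reported to 0.1 °C, and the public data reported in whole degrees.
n = 60
true_temp = rng.uniform(15, 25, size=n)
mine = np.round(true_temp + 0.2 + rng.normal(0, 0.3, size=n), 1)  # 0.1 °C precision
public = np.round(true_temp)                                      # whole degrees

diff_full = mine - public              # keep my 0.1 °C precision
diff_rounded = np.round(diff_full)     # round the differences to integers

# H0: the mean difference is zero (no systematic bias)
print(stats.ttest_1samp(diff_full, 0.0))
print(stats.ttest_1samp(diff_rounded, 0.0))
```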
2 answers
If the standard deviation of your measurements is in the trillions, then rounding to integers will make little difference (probably no practical difference), since the rounding error is tiny compared with the spread of the values. If the standard deviation is $1/2$, then you would be throwing away a huge amount of information and your results would be unreliable.
Generally you shouldn't round more than you have to until the last step.
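A rough sketch of that contrast (the 0.3-degree shift and the sample size are arbitrary illustration values; the two spreads echo the trillions and the $1/2$ above):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# The same shift of 0.3 is tested once with a spread comparable to the
# rounding step and once with an enormous spread.
for sd in (0.5, 1e12):
    diffs = rng.normal(0.3, sd, size=200)
    p_full = stats.ttest_1samp(diffs, 0.0).pvalue
    p_rounded = stats.ttest_1samp(np.round(diffs), 0.0).pvalue
    print(f"sd={sd}: p full = {p_full:.3g}, p rounded = {p_rounded:.3g}")
```

With the huge spread the two p-values are essentially identical; with the small spread, rounding visibly changes the outcome of the test.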
And notice that $(20+\tfrac13)\times 3 = 60 + \left(\tfrac13\times3\right) = \text{exactly } 61,$
but $20.33\times3 = 60.99.$
So $61$ is an exact answer and $60.99$ is a rounded answer.
If you show $61$ and $60.99$ to a person whose grasp of arithmetic is at a naive level, and ask which one is rounded, they'll get it wrong.
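The same contrast in code, using exact fractions so that floating-point noise doesn't blur the point:

```python
from fractions import Fraction

third = Fraction(1, 3)
print((20 + third) * 3)                   # 61: exact, nothing thrown away
print(float((20 + round(third, 2)) * 3))  # 60.99: rounding to 20.33 first lost information
```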
1 comment thread
Your measurements are more precise, which is actually a good thing. Rounding them to match the less precise public data might feel like making things consistent, but what you're really doing is tossing out useful information. And when you're running a hypothesis test, that extra decimal can matter, especially if the differences you're trying to detect are subtle.
Here's the deal: if the differences have low variability (comparable to the rounding step), rounding can distort the result noticeably. If there's a lot of natural fluctuation (temperatures bouncing all over the place), rounding won't change much. Still, why risk it?
Best move? Keep your 0.1 °C data as-is for the actual analysis. You can always round later when you're showing results to someone who doesn't need the fine-grained details.
If you're really worried about fairness, you can even treat the public data as a range: if it says 22 °C, it really means somewhere between 21.5 and 22.5 °C. That way you can still work with your full-precision data and just adjust how you interpret the comparison.
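A rough sketch of that sensitivity check (the readings below are made up; shifting each public value by ±0.5 covers the worst case of its rounding interval):

```python
import numpy as np
from scipy import stats

# Made-up readings: mine at 0.1 °C precision, public data in whole degrees.
mine = np.array([21.7, 22.4, 20.9, 23.1, 22.6, 21.3, 22.8, 21.9])
public = np.array([22.0, 22.0, 21.0, 23.0, 23.0, 21.0, 22.0, 22.0])

# Treat each public value v as standing for anything in [v - 0.5, v + 0.5)
# and rerun the test at both ends of that interval; if the conclusion is the
# same at both ends, the rounding of the public data is not what drives it.
for shift in (-0.5, 0.5):
    result = stats.ttest_1samp(mine - (public + shift), 0.0)
    print(f"shift={shift:+.1f}: t={result.statistic:.2f}, p={result.pvalue:.3f}")
```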