
Why mustn't the proportion of smokers among married people be the same as the proportion of smokers in the whole population?


Please see the bolded phrase below ("One problem presents itself immediately"). On my first reading I didn't see this problem at all; it certainly didn't present itself immediately to me. After rereading the passage four times, I still don't understand what this immediate problem is!

      If you multiply both sides of each inequality by the common denominator (all people) × (all smokers) you can see that the two statements are different ways of saying the same thing:

      (married smokers) × (all people) < (all smokers) × (all married people)

      In the same way, if smoking and marriage were positively correlated, it would mean that married people were more likely than average to smoke and smokers more likely than average to be married.
      One problem presents itself immediately. Surely the chance is very small that the proportion of smokers among married people is exactly the same as the proportion of smokers in the whole population. So, absent a crazy coincidence, marriage and smoking will be correlated, either positively or negatively. And so will sexual orientation and smoking, U.S. citizenship and smoking, first-initial-in-the-last-half-of-the-alphabet and smoking, and so on. Everything will be correlated with smoking, in one direction or the other. It’s the same issue we encountered in chapter 7; the null hypothesis, strictly speaking, is just about always false.

Ellenberg, How Not to Be Wrong (2014), page 348.


1 answer


The key word is "exactly". If I flip a fair coin, I expect about half of my flips to be heads. If I flip it twice, getting exactly one head is quite likely (probability 50%). If I flip it twenty times, exactly ten heads is not that unusual (about 18%), but still a little lucky maybe. If I flip that coin one million times and get exactly 500,000 heads, well, that's quite unlikely indeed (the probability is about 0.08%, as it happens). The chance of hitting exactly the expected count of a random binary event shrinks as the number of trials grows, roughly in proportion to 1/√n for n fair coin flips.
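To check the coin-flip figures above, here is a minimal sketch that computes the probability of exactly n/2 heads in n fair flips. The function name `prob_exactly_half_heads` is my own, and working with log-gamma is just one convenient way to avoid enormous binomial coefficients when n is a million.

```python
import math


def prob_exactly_half_heads(n: int) -> float:
    """Probability of exactly n/2 heads in n fair coin flips (n must be even)."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even integer")
    k = n // 2
    # log C(n, k) = log n! - 2 * log k!  (since n - k == k), via log-gamma
    log_comb = math.lgamma(n + 1) - 2 * math.lgamma(k + 1)
    # Each particular sequence of n flips has probability 2**(-n)
    return math.exp(log_comb - n * math.log(2))


for n in (2, 20, 1_000_000):
    print(f"n = {n:>9,}: P(exactly half heads) = {prob_exactly_half_heads(n):.6f}")
```

This should give 0.5 for two flips, roughly 0.176 for twenty, and roughly 0.0008 for one million, in line with the 1/√n shrinkage.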

Now think of being a smoker or not as a random binary event, like the coin flip. The expected fraction of smokers in the married population might well be equal to the (actual) fraction of smokers in the general population. But the chance of actually having the exact number of married smokers that would make those two fractions equal is very small when discussing populations in the millions.
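The same point can be sketched for smokers and marriage under a loud assumption: suppose smoking and marriage are truly unrelated, so the married people are effectively a random sample of the population. The number of married smokers then follows a hypergeometric distribution. The population figures below are made up purely for illustration (they are not from Ellenberg or from real data), and the helper names are my own.

```python
import math


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k), via log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def prob_exact_married_smokers(people: int, smokers: int, married: int, x: int) -> float:
    """P(exactly x married smokers) if the married group is a random sample.

    Hypergeometric pmf: C(smokers, x) * C(people - smokers, married - x) / C(people, married).
    """
    log_p = (log_comb(smokers, x)
             + log_comb(people - smokers, married - x)
             - log_comb(people, married))
    return math.exp(log_p)


# Hypothetical numbers, chosen only so the "exactly equal" count is a whole number.
people = 10_000_000
smokers = 2_000_000      # 20% of everyone smokes
married = 5_000_000      # 50% of everyone is married
exact_match = married * smokers // people  # 1,000,000 married smokers gives exactly 20%

p = prob_exact_married_smokers(people, smokers, married, exact_match)
print(f"P(proportions match exactly) = {p:.6f}")
```

With these invented figures the probability of an exact match comes out well under 0.1%, even though independence holds by construction. Any other count of married smokers makes the two proportions differ, which is the "correlation" Ellenberg is pointing at.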
