Talk:Correlation does not imply causation/Archive 1

This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Text from February 2002

Re "ultimately we rely on correlations to establish the existence of a link of causality". We rely on correlations to suggest causal links, but not to prove them. Maybe the problem occurs when an assumption is made that causality is proven when the correlation is not complete. Maybe the wording should be changed to reflect this idea? Graham Chapman

I do feel that the only way to prove a link of causality is by means of a correlation (that can never be complete). In my opinion, this is closely related to the Problem of Induction, which Karl Popper claimed to have solved, although some might disagree. Anyway, I'm not an expert on this subject. I think philosophers should step in. Calypso

I moved this example here from the main article because I thought it too contentious: many people probably believe that the US legal system does discriminate against black people. If it is contentious then it only serves to distract from the article, which is about a dry area of logic, not race relations. Here is the example I moved:

It is sometimes assumed that the American legal system discriminates against black people because the number of black prison inmates is proportionally higher than the number of white inmates. But we could as well conclude that the legal system discriminates against men, because there are proportionally fewer convicted females. The correlation between "being black" and "being more likely to end up in prison" does not prove that judges are racist. Of course it does not disprove it either.

Graham Chapman

Art imitates penguins (or was it the other way around)?

Another example illustrating this fallacy was a study which found that British arts funding levels had an extremely close correlation with Antarctic penguin populations. Neat factoid, but an encyclopedia is not a factoid collection. Please provide a source and/or an explanation for this, as with the other examples. For all our readers know we completely made this up. Removing it for now. 82.92.119.11 20:34, 24 Dec 2004 (UTC)

Vague reasoning

But if there was a common cause, and you had that data as well, then often you can establish what the correct structure is. Likewise (and perhaps more usefully) if you have a common effect of two independent causes.

What does "that data" refer to? Some sort of data about the "common cause", I presume, but what exactly? I can't figure out what these sentences are supposed to mean. 82.92.119.11 20:38, 24 Dec 2004 (UTC)

Reichenbach's principle is closely tied to the Causal Markov Condition used in Bayesian networks. The theory underlying Bayesian networks sets out conditions under which you can infer causal structure, when you have not only correlations, but also partial correlations. In that case, certain nice things happen. For example, once you consider the temperature, the correlation between ice-cream sales and crime rates vanishes, which is consistent with a common cause (but not diagnostic of that alone).

This is presumably obvious if you know what Bayesian networks are. But let's assume I am just a reader interested in logic—do I care that "certain nice things happen"? What exactly does it mean to "consider" the temperature? Enter the data into the sample? And how does that make the relation "vanish"? If we want to mention Bayesian networks, we should put it in better context for non-technically inclined readers. 82.92.119.11 20:43, 24 Dec 2004 (UTC)
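One way to make "consider the temperature" concrete is a partial correlation: the correlation between two variables after the linear effect of a third has been removed from both. The sketch below uses simulated, made-up data in which temperature drives both ice-cream sales and crime and neither affects the other; the coefficients, noise levels and variable names are assumptions chosen for illustration, not figures from any study.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(xs, ys, zs):
    """First-order partial correlation of xs and ys, controlling for zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(0)  # fixed seed so the run is repeatable
temperature = [rng.uniform(0, 35) for _ in range(2000)]
# Hypothetical model: temperature drives both quantities; neither drives the other.
ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
crime = [3.0 * t + rng.gauss(0, 5) for t in temperature]

raw = pearson(ice_cream, crime)
controlled = partial_corr(ice_cream, crime, temperature)
print(f"raw correlation:           {raw:.3f}")
print(f"controlling for temperature: {controlled:.3f}")
```

Under these assumptions the raw correlation is strong while the partial correlation sits near zero, which is what "vanishes" means above. As the original comment says, this pattern is consistent with a common cause but does not by itself prove one.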

Expert attention

Someone familiar with statistics needs to give some concrete examples and a clearer, more detailed explanation so that non-scientific readers can understand this concept. -- Beland 02:40, 28 Feb 2005 (UTC)

195.42.89.34 08:56, 26 December 2005 (UTC) Perhaps it is not easy with statistics. I saw a book, whose English title would be something like "Paradoxes in Mathematical Statistics", with several examples that are hard to believe. One example is testing new drugs. Testing a new drug in two hospitals independently showed that the drug is effective. Then, summing the numbers of patients in each class, we find that for the two hospitals in total the drug is counter-effective. Hard to believe, but... OK, I still want to contribute: a common Russian joke about statistics states that it was proved that 100% of men who died of cancer had eaten cucumbers. We all know now how harmful cucumbers are.

PS: BTW, what could perhaps be another example is games. Do wild, cynical games provoke wild, cruel behaviour, or do cruel people prefer such games? See the AD&D history or the later PC game scandals.
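The two-hospital drug example above is the pattern usually called Simpson's paradox. A small sketch with hypothetical patient counts (invented for illustration, not taken from the book mentioned) shows how a drug can look better in each hospital and worse in the pooled totals:

```python
# Hypothetical counts: (recovered, total) for each hospital and each arm.
hospitals = {
    "Hospital A": {"drug": (81, 87), "no drug": (234, 270)},
    "Hospital B": {"drug": (192, 263), "no drug": (55, 80)},
}


def rate(recovered, total):
    """Recovery rate as a fraction between 0 and 1."""
    return recovered / total


def pooled(arm):
    """Add up recovered and total patients for one arm across all hospitals."""
    recovered = sum(arms[arm][0] for arms in hospitals.values())
    total = sum(arms[arm][1] for arms in hospitals.values())
    return recovered, total


for name, arms in hospitals.items():
    print(name,
          f"drug {rate(*arms['drug']):.1%}",
          f"no drug {rate(*arms['no drug']):.1%}")

print("Pooled    ",
      f"drug {rate(*pooled('drug')):.1%}",
      f"no drug {rate(*pooled('no drug')):.1%}")
```

With these numbers the drug arm is ahead within each hospital but behind once the hospitals are added together, because the two hospitals have very different baseline recovery rates and very different shares of patients on the drug.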

Political Issue

Since this is an encyclopedia, perhaps the bit about gun control could be changed to a more neutral example. As it stands, the article appears to oppose gun control.

Regards, Rajeev.

I agree. A more neutral example would be better, and I'll change it (sooner or later) if there are no objections. I think it is worse than just being political; the argument seems to assume that either gun ownership or crime is the cause, and the other is correlated whilst ignoring the possibility of a common cause. In fact, none of the examples are very inspiring. Rls 23:49, 24 Aug 2004 (UTC)

Changed it. I don't really like my example, but feel free to edit. Sir Elderberry 00:51, 2 May 2006 (UTC)

Monty Python example

The Monty Python example is flawed. As everyone knows, after Sir Bedevere's explanation, the accused woman herself says "It's a fair cop", confessing to being a witch. JIP | Talk 17:52, 31 May 2006 (UTC)

Unorganised and confusing -> more systematic

The article as it stood was (is) very unorganised and confusing, trying to explain through loose examples. I've tried to make things more systematic in the first paragraph. The later paragraphs should perhaps point to cases (1)-(4) to make things easier to understand.

Also, does "Teenage girls eat lots of chocolate, teenage girls are most likely to have acne, therefore, chocolate causes acne" give an example of a correlation? I'm not sure it does, and if not, the example should be removed.

I'm also not convinced that (4) actually is a possible outcome of a correlation... help me out :) Narssarssuaq 16:31, 3 July 2006 (UTC)

I've removed the following because it doesn't contain a correlation:

Hume

There should be more on David Hume, who essentially said that 'correlation is causation' - it is impossible to see causation, all we can know is correlation. - 05:31, 10 November 2005 (UTC)

Actually, there is a way of seeing causation. Conduct an experiment (however horribly impractical and expensive it may be), and you can determine causation, as opposed to comparing sets of data to one another. The statement made by Hume would apply to the Chemical "x" causes Cancer situation. You can see that large concentrations of Chemical "x" might have a correlation with higher cancer rates by comparing data on cancer rates in certain areas with data on concentrations of Chemical "x" within those areas, but that essentially tells you nothing about true causation, though it provides possibilities.

The way to know causation for certain is to actually conduct an experiment with this chemical, by exposing people to it (and controlling other possible variables) and checking for the emergence of cancer from the direct effect of this chemical. That is how you can know causation, by direct experimentation and observation, as opposed to checking various statistics and lining them up for examination. I will admit, however, that Hume is right in a way. Most experimentation of the sort that would provide meaningful proof of causation for statisticians is either unethical or extremely impractical. In all practical terms, David Hume's statement can be assumed to have a grain of truth. It's clear, however, that in simple experimentation and observation, Hume's logic is incorrect; say you were to find a very high correlation between pedaling a bike harder and the bike going faster. This would be, of course, based on two specific sets of data you studied: the speed at which the bike was pedaled, and also the rates of speed that the bike travelled at. You can assume correlation doesn't imply causation here, but by direct experimentation, and observation of your own, it's safe to assume that pedaling harder will make the bike travel faster. Of course, these are no-brainers (and probably worded falsely somewhere... I'm nothing but an amateur). The point is, Hume's logic does indeed apply in the grand scale, but his generalization in smaller cases is more a matter of formality as opposed to the actual case of assuming causation. —The preceding unsigned comment was added by 70.95.202.104 (talk • contribs).

Even with numerous experiments, all you actually see is correlation, maybe even 100% correlation, but you can't actually see causation. I'm speaking in a philosophical sense here, not in the day-to-day practical science sense. I'm guessing the article on causation, maybe also Popper's falsifiability, deals with some of this Philosophy of Science. - Matthew238 22:25, 22 August 2006 (UTC)

Additional information

Moved from the main article (articles are not for discussion!!)

But I thought it important to note that even though a logical fallacy, there was a stronger but deeper link between causation and correlation. If you believe Reichenbach's principle, then you believe that robust correlation implies SOME causation, just not necessarily direct links.

An earlier version of this page offered two examples "that show that it is sometimes quite difficult to judge correlations":

Statistics prove that most car accidents happen between vehicles driving at rather low speeds. Few accidents take place at, let's say, 100 mph. Does this mean that it is safer to drive fast? Of course not, most accidents take place within 25 miles of their primary city or suburban residence, usually driving at a moderate speed, ergo most accidents happen at moderate speeds.

Yes indeed. Although mostly this is a common failure to take base rates into account when doing correlations (or any other kind of inference). The prediction is that once you separate out the fact that most driving happens at low speeds, the correlation between speed and accidents will change.

Note that you could make another faulty inference given the explanation. Most accidents do take place within 25 miles of home. Does this mean it is more dangerous to drive near your house than far away? It might mean this. It might mean that people become complacent on familiar roads, and are more alert and safer when travelling. But those seem unlikely, and it is far more likely that it is another base rate effect: by far most driving happens near home than away. After all, every successful trip both starts and ends at home.
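The base-rate point can be checked with simple arithmetic: compare the share of accidents at each speed with the accident rate per mile driven at that speed. The figures below are hypothetical, chosen only to illustrate the effect.

```python
# Hypothetical figures, chosen only to illustrate the base-rate effect.
driving = {
    "low speed": {"accidents": 900, "miles": 9_000_000},
    "high speed": {"accidents": 100, "miles": 500_000},
}

total_accidents = sum(d["accidents"] for d in driving.values())

# Share of all accidents that happen in each speed band.
share = {band: d["accidents"] / total_accidents for band, d in driving.items()}

# Accidents per million miles driven in each speed band (the base-rate-adjusted figure).
per_million = {band: d["accidents"] / d["miles"] * 1_000_000
               for band, d in driving.items()}

for band in driving:
    print(f"{band}: {share[band]:.0%} of accidents, "
          f"{per_million[band]:.0f} per million miles")
```

In this made-up data most accidents happen at low speed simply because most miles are driven at low speed; per mile driven, the high-speed band is the riskier one. The same adjustment applies to the "within 25 miles of home" figure.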

The correlation between reserve parachute deployment failures and death is quite high; if the main parachute fails and then the reserve parachute fails, the parachutist almost always dies. However it is the sudden deceleration when the parachutist hits the ground that causes the death, not the parachute failure. The parachute failure leads almost inevitably to death, but does not cause it. (1)

Actually I strongly disagree here. The reserve failure most certainly does cause the death! A good way to kill your (parachutist) enemies is to disable their chutes. Granted, the ground is a more proximate cause of death, but there will always be a more proximate cause of death. For example, even the impact with the ground is not the most proximate cause. The impact of your organs with your skeleton handles that. Etc.

The correlation between reserve chute failures and death is in fact a good indication of a causal link. Not that we needed statistics to find that one. :-)

Even though there are ways to go more securely from (many) correlations to (some) causal structure, unmeasured common causes can always interfere with those attempts. And most social scientific methods do not even try to do proper causal inference. They merely apply regression measures (correlations) and then make causal inferences from there. All statistics textbooks warn (overly much) against inferring causation from correlation, but in practice correlation methods are almost always used to establish causation. And this practice commonly leads to faulty conclusions drawn from scientific research.

This is in contrast to experimental science with proper controls, where you hold everything else constant and wiggle A. When you do this (and only then), you see B wiggle. That is a pretty foolproof way to establish causation. Lacking perfect control, a randomized clinical study is the best alternative, because it is the best guarantee that other causes are averaged out between the treatment and the non-treatment population.
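The contrast between observing A and "wiggling" A can be sketched with a small simulation. Everything here is an assumption made for illustration: a hidden common cause C, a hypothetical `simulate` helper, and arbitrary noise levels. In the observational setting A simply follows C; in the randomized setting A is set by a coin flip that ignores C.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(effect, randomized, n=2000, seed=1):
    """Return the A-B correlation in a toy world with a hidden common cause C.

    effect     -- true causal effect of A on B (0.0 means none)
    randomized -- if True, A is assigned by coin flip instead of following C
    """
    rng = random.Random(seed)
    a_values, b_values = [], []
    for _ in range(n):
        c = rng.gauss(0, 1)                     # unmeasured common cause
        if randomized:
            a = rng.choice([0.0, 1.0])          # experimenter sets A, ignoring C
        else:
            a = c + rng.gauss(0, 0.5)           # A follows C
        b = effect * a + c + rng.gauss(0, 0.5)  # B depends on C, and on A only if effect != 0
        a_values.append(a)
        b_values.append(b)
    return pearson(a_values, b_values)


print("observed, A has no effect:  ", round(simulate(0.0, randomized=False), 3))
print("randomized, A has no effect:", round(simulate(0.0, randomized=True), 3))
print("randomized, A has an effect:", round(simulate(2.0, randomized=True), 3))
```

In this toy model the observational correlation is large even though A does nothing, the randomized one is near zero, and a correlation reappears under randomization only when A really does affect B. It is a sketch of the argument, not evidence about any real study.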

The previous version included links to Problem of induction, and physics along with a 1-paragraph discussion that I feel blurred the important distinction between experimental correlations and observational correlations. For example, I do not think that moving from Newton's laws to the Theory of Relativity is a good example of problems with correlation. There is more on that shift no doubt under Philosophy of science, especially topics like theory choice, confirmation, induction, and paradigm-shift.

This page should probably be broken up.

The discussion in this page is a bit tortured. I've revised the section on "determining causation" to emphasize counterfactual reasoning. This should help readers think about conditions under which correlation provides persuasive evidence of causation. 09:14, 28 August 2006 (UTC)

The Latin name for this fallacy is post hoc ergo propter hoc: literally, "After the fact, therefore because of it."

(1)The United States Parachute Association (http://www.uspa.org/) terms this type of fatality "Impact with ground"

Cannabis and possible other 'examples'

I think bringing in subjects like this is introducing POV bias and implications into the article. We should be able to demonstrate what a logical fallacy is without bringing up morally contentious issues as examples. Things like 'going to bed with shoes on' are great demonstrative examples, and I don't see why we can't continue along those lines with the rest. Even the myopia one isn't really horrible because I doubt you're seeing a lot of people out there with a strong interest in demonstrating causative links between lights being on and myopia.

On the other hand, bringing in a subject like cannabis use here is just opening a door to controversy as well as being a little divisive with regards to the implications of doing so in an article like this. The section is "examples of logical fallacies", and the topic we're given is a statistical link between cannabis use and mental illness. The section does not cite a specific case study (nor probably should it, this article isn't about a large debate like that), but rather makes a strongly implied generalisation that studies like this are or may be committing logical fallacies. For all we know the studies go to great lengths to show that there is a causative effect behind the observed statistical correlations.

Ask yourself this: If most studies about links between cannabis use and mental illness didn't involve logical fallacies (as they supposedly do here), would this be a good example or a confusing/misleading one? I think it would clearly be the latter, and therefore the presence of this example is based on the underlying assumption that a statement about causal link between cannabis use and mental illness is something likely to be a logical fallacy. Do we know this? Have we done surveys of studies, looked at whether or not they are mostly guilty of logical fallacies, or do they mostly take this into account? Frankly, doing so is way beyond the scope of this article anyway.

The fact is, by putting claims like this in the section that demonstrates the fallacy, we are implying without evidence that the statement itself usually is false by way of the fallacy, which is not NPOV, and going to lengths to demonstrate that it is in fact NPOV (that such statements statistically usually are fallacious) is out of the scope of the article. I simply don't see what we gain by having this here; we can do just as well explaining the topic while using entirely neutral examples. Honestly even the myopia one has problems really, but at least there the topic itself isn't a morally controversial one. --Rankler 00:51, 28 September 2006 (UTC)

I too agree that any possible controversial examples would not be appropriate, or maybe I should say best-suited, for this article. Instead of using a topic that may hinder groups of people from interpreting this article correctly, simple and neutral topics would probably be best to use. - Dozenist talk 01:33, 28 September 2006 (UTC)

NPOV and cannabis

Hi. I read over the section listed as a POV violation and couldn't really see it. I'm not a smoker of cannabis, nor do I really support legalisation, so I don't think it's an issue of my own engrained bias speaking for me.

The article is quite clear that the entire section is "maybe" and "possibly", without stating that the statement is false due to fallacy (any more than any other fallacy in this section). Remember, just because a statement falls into Cum Hoc Ergo Propter Hoc doesn't ever mean it's necessarily a 'false' statement. That, in fact, would be a case of Cum Hoc Ergo Propter Hoc itself: just because a lot of statements that imply causation from correlation are false (for instance, sleeping with shoes on and headaches) doesn't mean that they all are (for instance, cigarettes and lung cancer).

Anyway, put short, the unsubstantiated fear that some Joe Average might read this article and get confused and immediately assume that pot is/is not harmful/helpful/related/unrelated to mental/physical/spiritual illness/well-being is not, by itself, reason enough to change the example given - which cannot easily be replaced with another, toned-down example, since the entire progression of the article goes from simple examples to more complex, real-worldish examples. In fact, using this argument to justify a rewrite is itself a fallacy - quite ironic, given the subject matter at hand. --151.200.252.164 09:08, 8 October 2006 (UTC)

I re-examined it and found it to be a bit defensive at the very beginning - as stoner stuff tends to be - and removed a bit of the "ALLEGED NATURE OF THE ALLEGED ALLEGED (POSSIBLY UNTRUE) UNVERIFIED [citation needed][citation needed][citation needed]" stuff in the beginning. Consensus? --151.200.252.164 09:13, 8 October 2006 (UTC)

Are the tags still needed?

Are the tags at the start of the article ({{Cleanup-date}}, {{expert}} and {{sources}}) still needed? Seems to me that they aren't and can be removed. Any objections? Rami R 08:07, 18 October 2006 (UTC)

Objection. I do not believe that an expert has weighed in on this subject. There are many points without reference as well, so both the cleanup and sources templates apply. Chris53516 13:10, 18 October 2006 (UTC)

I sought out an expert (Prof. Yaacov Ritov). This was his input. He has, however, suggested that a philosopher of science look at the article (mostly because of the terminological discussion in the beginning, and maybe there is a need for extended historical references beyond Hume). So the expert tag can stay. But I'm not sure what needs to be referenced. Could you show me an example? Rami R 07:32, 22 October 2006 (UTC)

Godwin's Law?

Is there any relevance to Godwin's Law in the "See also" section, or should it be removed? Rodo2 08:04, 25 October 2006 (UTC)

Definitely no relevance. I am removing it. --Dylan Lake 23:10, 25 October 2006 (UTC)

Shoes example

The example about waking up with a headache after sleeping with shoes on is really an example of post hoc; shouldn't the first example be more specific to this article? --Tardis 06:52, 31 October 2006 (UTC)

Remove Monty Python Example?

As much as I love Monty Python, the Monty Python example in this article is confusing, and I don't think it helps to illustrate the "Correlation does not imply causation" point. I'm thinking about replacing it with a simpler example from popular culture/news/etc. Would anyone care to comment? Claus Aranha 08:55, 31 October 2006 (UTC)

I agree with Claus that the Monty Python example should be removed; it is quite long-winded and not particularly helpful. Unless I see a specific example, though, I am not in favour of putting something else there. I think the Simpsons example above is fine. Grumpyyoungman01 10:48, 31 October 2006 (UTC)

I also agree - in fact having read the article, I came to the talk page specifically to see if anyone else had objected to it yet. It seems much less clear than the Simpsons example, and is longer. I'm not sure if the "Popular Culture" section merits such a long example, lest it start to dominate the article. Bobstay 19:26, 7 November 2006 (UTC)

Remove "Popular culture" section including Simpsons example

I think the examples in popular culture are crap and are not encyclopedic. I agree with the change made by the anonymous user that removed the section. – Chris53516 (Talk) 15:06, 9 November 2006 (UTC)

That anonymous user was a vandal, have a look at contributions from that ip and you will see that they were all reverted. The reason that ip address gave for deleting the Simpsons example was that it was 'irrelevant to the article', which is not true. What reasons do you have for wanting to delete the Simpsons example? So far you have said because it is 'crap' and because it is 'unencyclopedic', that is circular reasoning. Grumpyyoungman01 22:50, 9 November 2006 (UTC)

It's not encyclopedic because it is unnecessary in conveying the meaning of the article. When was the last time you read Encyclopedia Britannica and read something about a cartoon in a logic article? – Chris53516 (Talk) 03:45, 10 November 2006 (UTC)

And if you had bothered to look further at the contributions of that contributor, you might have noticed multiple edits which decidedly aren't vandalism, since anon IPs often span multiple users. The text is irrelevant since, even though it gives an example of the subject, there were already equally valid examples earlier in the article which A) aren't as long (since they don't require a contextual description), and B) aren't related to a pop culture reference, which is inherently more encyclopedic. Having a "Pop culture" section is irrelevant because, as a logical argument, it invariably will appear in many popular pop culture media. Not signing since this IP changes several times a day, 18:32, 10 November 2006 (UTC)

Thanks for the support, but I think we'd all appreciate if you would sign in when your IP address changes a lot. – Chris53516 (Talk) 19:16, 10 November 2006 (UTC)

Where can I get one of those rocks that repels tigers? How much? Just kidding. If anyone deletes this, they are morons. This is a discussion. As for the reference, it is a poignant example. You may not think it is "relevant," but it points out an example of this fallacy that has been used in front of millions of people.

Not to stray too far off topic, but I don't plan on actually setting up an account here (I remember reading an essay in the WP namespace with some reasons, but darned if I can find it right now). 22:59, 10 November 2006 (UTC)

I think that Chris is right when he/she says that a popular culture section is unencyclopedic. People coming to the article are not interested in references in popular culture, because that by itself is irrelevant. However, I think that the example itself is a good one: it is easy to understand and I think that it "hits the spot". I am in favour of removing the popular culture section, but placing the Simpsons example elsewhere so that it can stand on its own merits, not on the irrelevancy that it appeared in a popular television programme. Grumpyyoungman01 22:24, 10 November 2006 (UTC)

Granted. My original edit summary referred to the section, not the example (though I obviously disagree with including the example), though perhaps my argument wasn't that apparent. I do apologize for not consulting the Talk page first (the type of article seemed like one of those where discussion on the Talk page ends up with inaction in the article, so I skipped it), and for the somewhat aggressive tone of my last post - I just hate being unilaterally labeled "vandal" after supplying (IMHO) justification for a bold edit. 22:59, 10 November 2006 (UTC)

Sorry for calling you a vandal. Unilateral action (without consulting the talk page) under the be bold principle is good. You just need to be prepared to be reverted if someone doesn't like it. No problem there, my "bold" edits are reverted all the time and consensus is usually reached fairly promptly. So far we only have three contributors to this discussion; we need some more for a consensus. Grumpyyoungman01 05:05, 11 November 2006 (UTC)

The reason I don't like the Simpsons example is more mathematical. It illustrates a very limited form of correlation where the two random variables can just take binary values: bear patrol/no bear patrol, bears/no bears or rock/no rock, tigers/no tigers. Correlation is normally used in situations where you have continuous random variables, like the pirates example. As such it is a poor, possibly misleading example, and may actually be an example of a different logical fallacy. --Salix alba (talk) 09:00, 11 November 2006 (UTC)

This is sort of related, so I'll ask here: is the Flying Spaghetti Monster example (and attendant image) really necessary? It's funny, but not a very good example (not to mention a much more limited audience has heard of FSM compared to the Simpsons - which I also think should be removed). Virog (It's not my fault!) 05:15, 12 November 2006 (UTC)

Requested move to Correlation does not imply causation

I've started a requested move to Correlation does not imply causation at /Page title. --Salix alba (talk) 09:40, 20 November 2006 (UTC)

Removing the "Pirates vs global warming" chart

I have removed the graph which shows a correlation between the number of pirates and global warming (Image:Pchart.jpg). Even though it is trying to make a good (and fun!) point, this graph is of such low quality that we should not put it on one of Wikipedia's pages. The X-scale, in particular, is terrible:

Most importantly, the points are not in order (35000, then 45000, then 20000)! Any variable can be shown to be related to another variable with this kind of trick (it may not be a deliberate choice by the person who made the graph; Excel is particularly keen on plotting graphs such as this, and in most cases will not order your data).
It is not linear (the difference between 45000 and 20000 is drawn the same as the difference between 400 and 17!), thus increasing the feeling that there is an almost linear relationship between the variables.
It is reversed compared to common usage, thus giving the impression that there is a positive correlation.

Making a good graph would not be too hard; however, it would probably not be as impressive in terms of showing correlation. Schutz 13:48, 27 November 2006 (UTC)

I agree with you, but I would like to point out that no data set will ever truly be completely linear. Data will go up and down. Regressions estimate a possible explanatory line that lies between differing points. But perhaps I'm preaching to the choir. – Chris53516 (Talk) 14:29, 27 November 2006 (UTC)

You are completely right, of course — which is the reason why the same point could probably be made using this data without doctoring it (and rearranging the x-axis qualifies as such), even if the graph would not look as impressive. Schutz 14:50, 27 November 2006 (UTC)

Yes, a correlation doesn't necessarily imply a linear relationship. Only a perfect correlation demands a linear relationship, and perfect correlations are just special cases. As far as I can remember from statistics class. Narssarssuaq 16:56, 27 November 2006 (UTC)
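The point about linearity can be checked numerically with the Pearson coefficient. The sketch below uses small hand-picked data sets (assumptions for illustration): an exactly linear relation, a cubic one, and a square on a symmetric range.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


symmetric = list(range(-3, 4))   # -3 .. 3
positive = list(range(1, 11))    # 1 .. 10

r_linear = pearson(symmetric, [2 * x + 1 for x in symmetric])  # exact straight line
r_cube = pearson(positive, [x ** 3 for x in positive])         # monotonic but curved
r_square = pearson(symmetric, [x * x for x in symmetric])      # fully determined, not monotonic

print(f"y = 2x + 1: r = {r_linear:.4f}")
print(f"y = x^3:    r = {r_cube:.4f}")
print(f"y = x^2:    r = {r_square:.4f}")
```

Only the straight line reaches a coefficient of 1. The cubic gives a high but imperfect value, and the square gives a coefficient of zero even though y is completely determined by x, so a low Pearson correlation does not rule out a relationship either.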

Redirect from "Cum hoc ergo proptor hoc"

Is there a reason this redirect doesn't exist? Mnc4t 17:10, 8 December 2006 (UTC)

You mis-spelled it. It's propter,[1] not proptor. However, if you think that's a common mis-spelling, you can make the redirect. — Chris53516 (Talk) 17:24, 8 December 2006 (UTC)

Never mind, I made it. — Chris53516 (Talk) 17:29, 8 December 2006 (UTC)

Fact templates

The numerous {{fact}} templates are probably a bit redundant with the main "citations needed" template at the beginning of that particular section. I move to take them out (to allow for easier reading) and trust that the first, enormous template will alert editors to try to find sources for the statements about Mr. Hume (or take out the section altogether, if everybody else here understands it as well as I do). 64.90.198.6 22:40, 12 December 2006 (UTC)

The {{fact}} templates are there to point out which statements need citations. That's the point of the template, and it's good to point out to the reader which statements are without source. — Chris53516 (Talk) 14:11, 13 December 2006 (UTC)

Question

So I was debating a skeptic (I use the term skeptic loosely here, as some of these individuals are downright loopy) on the issue of global warming. They used the analogy that if, say, both the stork population and the number of babies are on the rise, should it be assumed that the two are related? Obviously, this analogy is derived from the correlation does not imply causation argument. However, it would seem to be wholly irrelevant to the debate at hand (since outside of a 3-year-old's imagination, storks and babies are completely unrelated, whereas rising carbon dioxide concentrations and increasing global temperatures are related). I would think, then, unless the two variables being considered are independent, this argument doesn't hold up. Maybe it should read "Correlation does not necessarily imply causation" or "Absent dependence among the variables, correlation does not imply causation" or something to that effect. —Preceding unsigned comment added by 128.2.251.57 (talk)

I think the answer to your general comment is contained in the "Usage" section, which indicates that there are two meanings of the word "imply": the "everyday" one, and the "logical" one. This article is about the logical fallacy, and as such, the meaning of the word "imply" is precisely defined. This explains why the current title does indeed mean "Correlation does not necessarily imply causation" in everyday language. For the particular example of storks and babies, I feel that one problem may be the use of the word "related": even though there may be no causal effect between the two things, there may still be a relation between the two, for example a third variable which has an effect on both (as described under number 2 in the section "General pattern" of the article). Some people have suggested, for example, that an increase in people living in a given region is likely to cause an increase in both the number of newborns (relatively obvious), and in the number of storks (more people means more houses, which means more chimneys, which means more places where storks can nest), so that there may be a correlation between the two things even though there is no causal effect... In fact, the example of storks and babies was imagined by the statistician Jerzy Neyman to illustrate a well-known statistical artefact: the correlation between two sets of ratios is likely to be high if the denominators are the same in both sets (for example, number of storks per 10,000 women and number of babies per 10,000 women). Now I should probably add this to the article at some point... Hope it helps! Schutz 21:26, 16 December 2006 (UTC)
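The common-denominator artefact can be reproduced with a small simulation. The counts below are randomly generated and mutually independent (made-up ranges, not Neyman's data), so any correlation between the two ratios comes purely from dividing both by the same number of women.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(2)  # fixed seed so the run is repeatable
n_regions = 2000

# Three independent quantities per hypothetical region.
storks = [rng.uniform(50, 150) for _ in range(n_regions)]
babies = [rng.uniform(50, 150) for _ in range(n_regions)]
women = [rng.uniform(1, 10) for _ in range(n_regions)]  # in units of 10,000 women

storks_per_women = [s / w for s, w in zip(storks, women)]
babies_per_women = [b / w for b, w in zip(babies, women)]

raw = pearson(storks, babies)
ratio = pearson(storks_per_women, babies_per_women)
print(f"storks vs babies (raw counts):       {raw:.3f}")
print(f"storks vs babies (per 10,000 women): {ratio:.3f}")
```

Under these assumptions the raw counts are essentially uncorrelated while the two per-capita ratios are strongly correlated, because regions with few women inflate both ratios at once.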

Commenting on the unsigned contributor's 128.2.251.57 remarks, please review the policies and standards for wikipedia before you decide to give us your opinions. Here are a few points to consider ...

Be civil. There is no need to call anyone "loopy". You are obviously trying to equate those with a climate warming opinion different from you as suffering some form of mental deficiency. However, all that you have done is demonstrate that, at least in your case, people with your climate opinion are not able to be civil when they discuss the matter.

correlation does not imply causation is not an "argument" as you state. It is a concept so fundamental to research, investigation, & statistical analysis that anyone who doesn't "get it" should refrain from any comment on any public policy issues that purport to be based upon any type of causation. Unless of course, they seek power, money, or fame and such pursuits are independent of any requirement that public policy influence be based upon genuine and rational argument.

"I would think, then, unless the two variables being considered are independent, this argument doesn't hold up." Your thinking is flawed, for example by your failure to consider possibilities that don't conform to your preconceived notion. You already believe that A causes B, and fail to consider anything else. Assuming there is a demonstrated correlation between A and B, you are reckless to assume that A causes B. For example, perhaps C causes both A & B. How about, B causes A? How about, it's just a coincidence? Your non logic would similarly conclude that the cottage cheese salad plate causes weight gain, given that you mostly see heavy people ordering it. And since you are on the subject of global warming, who is to say that any purported correlation between global T and CO2 shows either as the cause? Could not global temperature rise shift the equilibrium between the massive quantities of CO2 dissolved in our planet's oceans and the portion in the atmosphere? See Partial pressure.

Maybe it should read "Correlation does not necessarily imply causation" No. Adding the word "necessarily" adds nothing to the meaning, is superfluous, and IMHO seems a useless effort to downplay the article's purpose, which is to enable those seeking true knowledge to learn of the separate concepts of correlation and causation.

It's much better for you to think out loud on the talk page, but please refrain from adding or editing content on subjects which you do not understand. Wiki is intended to be an encyclopedia of fact, not a soapbox for opinion.

Dispute Over First Paragraph

Your first paragraph some consider to be "stable" is written in too convoluted a format for the typical Wikipedia user and makes the topic much more difficult to understand than need be. There is no reason the page can't be edited because it is not "locked." Moreover, it shouldn't be locked because the first paragraph as it was written was unnecessarily confusing. I proposed this paragraph:

When a correlation implies causation, whether in the sciences or through statistics, it may be assumed there is a cause-and-effect relationship between two variables. However, according to logical reasoning a correlation should not immediately be considered a cause-and-effect relationship. When two events that occur together are prematurely considered to be a cause-and-effect relationship, and evidence only exists that the relationship is correlational, it is a logical fallacy to imply (especially in published research) one factor is immediately causing the other. This type of logical fallacy is known as cum hoc ergo propter hoc (Latin for "with this, therefore because of this") and false cause (see General pattern below).

--I'm not changing what was there prior because I'm trying to be a bully. What was there is just too difficult to understand. What I wrote pertains to the topic and is easier to read!

4.240.183.104 21:55, 26 January 2007 (UTC)

Further comments on the first paragraph written as:

"Correlation does not imply causation is a phrase used in statistics to indicate that correlation between two variables does not imply there is a cause-and-effect relationship between the two. Its negation correlation implies causation is a logical fallacy by which two events that occur together are prematurely claimed to have a cause-and-effect relationship. It is also known as cum hoc ergo propter hoc (Latin for "with this, therefore because of this") and false cause."

1. There's no "phrase" used in statistics that's called "correlation does not imply causation." It's a matter of scientific research that correlations should not be reported as a cause-and-effect relationship, and to imply this is a logical fallacy.

2. The wording "Its negation..." is VERY unclear as it follows the first sentence.

3. "It" in the third sentence is undefined and vague. What is "it" that you are referring to?

4. More can be done to fully explain what the topic title pertains to, but certainly not in the confusing format written above. That's the primary reason I edited the first paragraph.

Again, there's no violation of Wikipedia rules here on my part.

4.240.183.104 21:45, 26 January 2007 (UTC)

I am not going to engage in a petty argument with an unknown user. Read Wikipedia:What is a good article? and Wikipedia:Lead section. — Chris53516 ^(Talk) 22:02, 26 January 2007 (UTC)

From Wikipedia:

"A good article has the following attributes.

1. It is well written. In this respect:

(a) the prose is comprehensible, the grammar is correct, and the structure is clear at first reading. (b) the structure is logical, introducing the topic and then grouping together its coverage of related aspects; where appropriate, it contains a succinct lead section summarising the topic"

This paragraph may be succinct, but it fails because it is NOT comprehensible as written and the structure is absolutely NOT clear:

"Correlation does not imply causation is a phrase used in statistics to indicate that correlation between two variables does not imply there is a cause-and-effect relationship between the two. Its negation correlation implies causation is a logical fallacy by which two events that occur together are prematurely claimed to have a cause-and-effect relationship. It is also known as cum hoc ergo propter hoc (Latin for "with this, therefore because of this") and false cause."

My above post explains what is wrong with this introduction, and I sincerely wish Chris and the rest of you would stop simply trying to control this topic without discussion. That's not Wikipedia etiquette.

---And Chris, anyone can edit a topic whether you have an account or not, and you're on the verge of my contacting Wikipedia for your unnecessary bullying. I have your user name.

4.240.183.104 22:09, 26 January 2007 (UTC)

Don't make petty threats. You're in violation, and another user has reverted your edits too. I don't care if you edit Wikipedia, you're just not doing a good job! — Chris53516 ^(Talk) 22:21, 26 January 2007 (UTC)

Chris53516:

I have made no "threats" to you. Surely someone else is going to be alerted to this problem and further changes will be made to the first paragraph of this topic. I have now posted it in at least 5 places at Wikipedia including attempts to request arbitration over the matter which you refuse to discuss (very poor usership and violation of Wikipedia good conduct). Other users (whether registered or not) who see this discussion will also see that you are willing to resort to a reversion war rather than discuss the problems I noted above regarding the first paragraph of this topic. Administrators at Wikipedia do not like to see when one person tries to ruthlessly control a topic, and registered users who are doing this are making Wikipedia a less reputable source of information.

4.240.165.227 00:30, 27 January 2007 (UTC)

Can we try to keep this discussion cool? WP:NPA and WP:CIVIL are both guidelines we should all follow. We need to establish consensus here rather than jumping straight to arbitration. Also, could you get a user name? It makes it hard to follow contributions when your IP address keeps changing. The current introduction has stayed the same since the article changed its name in November, and no one apart from yourself has objected.

Specific points,

The word phrase could be changed, statement perhaps?
I've changed negation to converse.
No ambiguity about "it" in the third sentence. It's clear it is referring to the phrase. It could possibly be changed to "the phrase".

When a correlation implies causation, whether in the sciences or through statistics, it may be assumed there is a cause-and-effect relationship between two variables.

This does not meet the standard encyclopedic tone used in wikipedia. See Wikipedia:Manual of Style (mathematics).
Logically tricky: if there is causation then there is a cause-and-effect relationship, which leaves the correlation part superfluous. --Salix alba (talk) 09:28, 27 January 2007 (UTC)

Salix alba:

The topic "Correlation does not imply causation" is NOT a topic that is exclusive to mathematics and also pertains to the sciences as well as philosophy. I am well aware of the poverty of language style that is part of the APA style of writing, and what is considered to be succinct as opposed to too short to be intelligibly clear is open to a wide variety of opinions. What transcends a "mathematical" or scientific form of writing is taking into account whether the public is going to understand what is written, and the language used in the original paragraph presented a poor introduction to explain this topic. Moreover, if the guidelines for writing articles on Wikipedia are being edited by people who favor mathematics, it is highly likely in the future that this topic is going to be explained eventually in a 2 + 2 does not always equal 4 way in which nobody is going to understand. I vote to use clear simple to understand sentence structure and language, which is NOT always a poverty of words.

--and it is quite irrelevant that this article's first paragraph has not been changed since November. Wikipedia is an e-encyclopedia designed to be constantly edited. The problems arise when there is a difference of opinion regarding what an "improvement" really means, and I can see by your attempts at a change in language that I've made my point. Thank you, Salix alba.

4.240.183.161 14:33, 27 January 2007 (UTC)

Literal definition of imply and negation

Edit: I read this wrong formerly and changed below appropriately.

Using "not" in its literal sense makes the statement "Correlation does not imply causation" wrong, for then we have ~(P => Q) == P & ~Q, which is clearly false. We are not saying we will always have correlation without causation, which is why many people prefer "Correlation does not necessarily imply causation." The statement should be modal, not propositional. Upon more carefully reading the article (*wince*) I see this was covered. It just maybe could be a bit more formalized. 64.221.248.17 16:59, 11 April 2007 (UTC)

Surely the very definition of logical implication is that P => Q == ~(P & ~Q)? Hence the well-known logical principle "a false proposition implies any proposition". 193.122.47.170 17:23, 20 May 2007 (UTC)
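The two identities quoted above are the same statement read in opposite directions, which can be checked by enumerating the four truth assignments. The following is a minimal sketch; the helper names `implies` and `negation_is_counterexample` are made up for this illustration and treat "implies" as material implication only.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def negation_is_counterexample() -> bool:
    """Check that ~(P => Q) agrees with (P & ~Q) on every assignment."""
    return all(
        (not implies(p, q)) == (p and not q)
        for p, q in product([False, True], repeat=2)
    )


# Print the full truth table, then the result of the equivalence check.
for p, q in product([False, True], repeat=2):
    print(p, q, implies(p, q))
print("~(P => Q) == (P & ~Q):", negation_is_counterexample())
```

Read propositionally, then, negating "correlation implies causation" would assert a case of correlation without causation, which is the point raised in the first comment about preferring a modal reading.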

Clarification?

Can anybody clearly explain the difference between "Correlation does not imply causation" and "Post Hoc Ergo Propter Hoc"? They seem to be the same, and both articles even say that both are called Faulty Cause. RageGarden 17:31, 9 June 2007 (UTC)

This article, "Correlation does not imply causation", should be just about the phrase and Post Hoc Ergo Propter Hoc should be about the fallacy, which it is. Apart from that there is no difference between the two. - Grumpyyoungman01 08:08, 10 June 2007 (UTC)

Merge-ish proposal with Post hoc ergo propter hoc

As per above I propose that bits of this article which deal with the fallacy should be moved to and merged with the fallacy page Post hoc ergo propter hoc and that this article should be left by itself to deal with the phrase and the use of and related issues. Grumpyyoungman01 08:08, 10 June 2007 (UTC)

No change, everything is fine

Different fallacies, different articles. Some explanation of the difference may be required. --Salix alba (talk) 11:38, 11 June 2007 (UTC)
Yeah, ok. Could you please draft a couple of sentences that we could add to the introduction of both articles by way of explanation? -- Grumpyyoungman01 22:12, 11 June 2007 (UTC)
I see the difference now, so I vote to keep them as they are. RageGarden 20:17, 17 June 2007 (UTC)

Two articles, one for the phrase and one for the fallacy

~~As per my nomination above. Grumpyyoungman01 08:08, 10 June 2007 (UTC)~~

Merge both articles onto the one page, decide which page later

Some other proposal

Requires Mathematical Precision

There are many criticisms that I could give of this article. The first and most obvious is that the article defines neither the notion of correlation nor the notion of causation, both of which require something of the order of a mathematical definition in order to make the terminology precise (even a couple of formulae for correlation would help). This is, in fact, my general criticism: the article has much scope for being mathematically descriptive, yet it does not utilise much in the way of mathematical definitions. This is a shame, and it leaves the article incomplete.

I don't think this is necessary, for the links to correlation and to causality provide enough mathematical details.

Just a quick point : “Correlation is not causation but it sure is a hint.”, seems as if it need not be a sensible phrase either – you would require something like a numerically/mathematically discernible notion of `hint'. That is, is a 0.000001 probability that A's correlation to B implies a causative link between the two the same as stating that you have been given sufficient information to `hint' that A causes B?

P.S. I'm sure that the use of Mathematical Precision requires a consideration of Measure Theory and Probability in order to understand how mathematical theoretical frameworks showing how correlation is very dissimilar to causation might sometimes differ with reality (doing the maths doesn't necessarily mean that you've done the physics), or vice versa (that mathematics would predict something that is physically impossible).

ConcernedScientist 19:05, 28 June 2007 (UTC)

No correlation no causation?

I wonder about a situation where there is no or an inverse correlation leading to the conclusion that there is no causation. something like - Today is sunnier than yesterday and today is colder than yesterday therefore the sun does not heat the earth.

Or

In county X there is a higher rate of drunk driving than in county Y and there are fewer accidents in county X. Therefore, driving drunk does not cause accidents. is this a named fallacy? Does it belong here?--YellowLeftHand 15:21, 19 July 2007 (UTC) Nevermind.

think I found it. http://en.wikipedia.org/wiki/Normally_distributed_and_uncorrelated_does_not_imply_independent --YellowLeftHand 19:34, 19 July 2007 (UTC)

Correlation does not imply causation includes a misnomer in the word imply

The usage section is misleading. If we look at Logical implication, which gives a strict mathematical meaning to the word imply, we see that it is equivalent to If P then Q. This must hold for all possible P, that is, all cases where there is a correlation between two variables. Further, the source cited is an email discussion which does not satisfy WP:RS. --Salix alba (talk) 11:39, 11 November 2006 (UTC)

You are talking about a mathematical meaning to the word imply. The word imply does not have that same meaning in ordinary usage. Yeah, it is a dodgy source, but I included it because it argues the case nicely. I think the section is unambiguous and I would have also thought uncontroversial, thus not in need of a citation. The correlation between diminishing numbers of pirates and levels of CO2 in the atmosphere does imply a causation, not in some hokey mathematical way, but in an Oxford dictionary way. Many readers would be coming from an Oxford dictionary school of thinking and not a scientific "let's take a common word and give it a subtle but significantly different meaning for our own purposes" school. Speaking about the Oxford dictionary, here is a definition of the word "imply" from the Macquarie dictionary. There are two conflicting definitions: one follows the form from logical implication and another follows what I was arguing for.

1. To involve as a necessary circumstance

3. To indicate or suggest, as something naturally to be inferred

Now of course usually the mathematical definition would be used in an article on an aspect of logic, but the phrase "correlation does not imply causation" is widely used by lay people, who, I think, use a lay definition of "imply", that is definition 3. But I have no evidence for that last point. My current thoughts are that ambiguities exist and the article needs a section to clear up these ambiguities. I agree that the section as it currently stands is misleading when using the logical definition of imply, but the opposite (such as the article before my edit to "Usage") is equally misleading. Grumpyyoungman01 12:25, 11 November 2006 (UTC)

Brilliant research, Grumpyyoungman01! We'll have to mention this ambiguity of "imply" in the article --however, I don't think the correct concept is misnomer, as one of the concepts under the term imply's umbrella is actually the correct one. This leads to the question: Does a wrong term need to be used for it to be a misnomer, or is it sufficient to use a wrong concept that shares the same term? I think the first is correct. ---I added some stuff in an attempt to clarify, but I'm in serious doubt if the use of the concept prove in the paragraph is totally correct. In that case, "Correlation does not prove causation" would be true, and thus follows(?) that a totally precise and foolproof formulation of the fallacy would be "Correlation proves causation". I'm looking forward to some informed discussion here. Again, thanks to those who brought this up. Narssarssuaq 23:59, 11 November 2006 (UTC)

Also, given that "suggests causation" is true, that doesn't necessarily... uh, imply(involve as necessary circumstance) "suggests the direction of causation". Narssarssuaq 01:39, 12 November 2006 (UTC)

That was good edit Narssarssuaq. I have removed the dispute tag as the consensus is that imply is not a misnomer. Grumpyyoungman01 22:15, 12 November 2006 (UTC)

I think it's important to give the technical mathematical definition of the term. When someone versed in mathematics uses the word imply, they mean the logical implication form of the meaning. From that page

The truth table associated with the material conditional if p then q (symbolized as p → q) and the logical implication p implies q (symbolized as p ⇒ q) is as follows: (snip)

Following Humpty Dumpty

"When I use a word," Humpty Dumpty said, in rather a scornful tone, "it means just what I choose it to mean -- neither more nor less."

when a statistician uses the word they will mean precisely this form. --Salix alba (talk) 23:28, 12 November 2006 (UTC)

Disagreement about "suggests"

I still disagree that "correlation suggests causation" because of the lack of a time-order sequence of events in correlation. Often correlations are of two events that occur at the same time, and in order for one to cause the other, it must precede the other. If one event does not precede the other, there should be no implication or suggestion of causality. – Chris53516 (Talk) 14:38, 13 November 2006 (UTC)

This may exclude events that cause an instantaneous response. For example, a CO2 emission at once causes increasing the greenhouse effect, but there is still an undisputed causal arrow. In this, and many other examples of causation, the cause doesn't really precede the effect, at least not in a strong sense of the word. I'd like to hear your viewpoints on this if you disagree. Narssarssuaq 15:34, 13 November 2006 (UTC)

Even if the interval of precedence may have to be measured in nanoseconds, the cause still precedes the effect. In such cases where the cause and effect appear to us to be co-existing, we should have doubt about causality until further evidence is available. For your example, I would say that the release of the CO2 is the cause, not the CO2 itself. Therefore, there was an event of the release of CO2 that preceded a rise in the greenhouse effect. As the event unfolded, the greenhouse effect changed in proportion. Besides, the greenhouse effect is very complicated, and the release of more pollutants would take some time to affect global warming. (I disagree that this example is undisputed. I have heard accounts that have disputed the effects of CO2 emissions. Of course, I don't agree, but the fact is that it is disputed.)

I digress. My point is that unless there is solid proof of one event preceding another, we should not assume or even think of causality. It's like what politicians do. Our economy may have causes in the far past, but politicians will claim their bill passed last month or last year improved the economy. This is an example of where the two events have coincided: The bill was passed as the economy was improving, not before.

I think that, as humans who search for patterns, we are prone to seek patterns (e.g., causal patterns) and see them where they simply are not. It's like seeing that face on Mars. (If you shift the viewing angle, it doesn't look like a face anymore.) Or like seeing patterns in the stars. Therefore, I think the section that states that causality is suggested by coinciding is simply misleading and lends itself to the mistakes we make as pattern-seeking humans. People will read it and say, "So one event could cause the other; therefore, the number of pirates probably does affect global warming."

I believe that most events are caused by multiple events, since no event is in a vacuum. Therefore, saying A causes B ignores the other potential factors. In science, I would like to think we try to pinpoint which causes are stronger, not which cause is the only cause.

– Chris53516 (Talk) 16:08, 13 November 2006 (UTC)

First of all, you're in part talking about the Regression fallacy. Secondly, the point is that the article excludes in every single case that correlation implies causation when "imply" is used in the logical, strong sense. Sometimes, however, if only very rarely, causation MAY be suggested, however weakly, by a correlation. The point in this article is just that a correlation doesn't NECESSARILY IMPLY a causal relationship or a certain causal arrow. So while "Correlation suggests causation" may usually be false, as you're saying, it's beyond the scope of this fallacy.

Thirdly, if an event is caused by multiple events, it may still just be made up by the sum of the parts. For example, the greenhouse effect is made up from CO2 emissions AND methane emissions AND NOx emissions AND CFC emissions etc. Your comments may fit into a philosophical debate of holism vs reductionism. However, encyclopedic activity and all science is based upon the phenomenon of analysis of a system, where everything outside the system is considered constant or irrelevant. Sometimes, this works perfectly well at least from a pragmatic point of view; at other times, one will want to make a more complex analysis by bringing more parameters into the picture. This has the advantage that it yields a better result, but the disadvantage that it takes longer time. Bringing in an infinite number of parameters would give total understanding, but would take infinitely long and thus would be the stupidest experiment ever. :)

After giving it a little thought, I agree that a CO2 molecule is emitted (if only nanoseconds) before it starts trapping infrared light or whatever they do to enhance the greenhouse effect. Thanks for pointing that out. However, if a CO2 emission lasts for, say, an hour, at a macroscopic level there will be a simultaneous increase in greenhouse effect this hour, excepting only the first nanoseconds or so, and adding the first nanoseconds after the emission stops.

In sum, I feel that you bring up a number of interesting issues, but right now I don't quite see what may be added or removed in the article from what you suggest. Narssarssuaq 17:32, 13 November 2006 (UTC)

I know. You're preaching to the choir regarding research scope. Someone once said something like "All things are probably related, but a study designed to examine the relationship among all things would fall apart." Any better ideas for a page title? – Chris53516 (Talk) 17:56, 13 November 2006 (UTC)

See also Fallacy of the single cause! Some of the fallacies listed on Wikipedia are apparently logical "shortcuts" that may actually work at some pragmatic level. Also, it's worth noting that committing fallacies is perfectly OK in humour. And it may also be in social chit-chat, where the main point often is to agree on something, and not necessarily to find truths. But that's a different story :) Narssarssuaq 18:20, 13 November 2006 (UTC)

I believe the "Usage" section does not need the long-winded information on the perils of misusing the term "imply." A note that "imply" does not mean in logic what it means in everyday discourse is sufficient. I'm paring it down. Feel free to edit it back if it sucks. Thudworthy 03:58, 13 September 2007 (UTC)

Literal logical meaning of material implication

I've removed the section titled "Literal logical meaning of material implication". This details my reasons:

The literal meaning may well correspond to that of material implication. There's a misunderstanding by the author of this section: the negated material implication is not true in all cases, so (p & ~q) need not always hold, as is implied. Rather, think of the cases where "correlation does not materially imply causation" is true. In all such cases, we will indeed have correlation being true and causation being false - (p & ~q) is quite acceptable. So, material implication is a perfectly sufficient interpretation of the phrase.
Even if material implication were not sufficient, this would not mean a modal account would be the best one, let alone the one the sentence "means". As an example, I might try an axiomatic (but still propositional) system to account for the "implies". I might say: ~(|- p -> |- q). Or I might give a completely different account using analytic descriptivism. Or I might say the sentence means "the sky is green on tuesdays". In other words, the section fails to give an account of why a modal account should be preferred.
Even if material implication were not sufficient, and the modal account were the best one, this whole section would still be original research, not backed up by any references, with weasel words about "many people" and problems of notation.

--Dom 11:34, 19 September 2007 (UTC)

I'm not sure how the "Usage" section is off topic. It clarifies the terminology. In standard everyday usage, there is a great difference between "implies" and "proves". Fuzzform (talk) 01:18, 25 November 2007 (UTC)

Random Comment

To sum up:

Causation causes correlation but correlation does not cause causation.

Yeah! Root⁴(one) 21:24, 30 January 2008 (UTC)

Suggested move

I suggest moving this to the name correlation and causation. Yes, correlation does not imply causation, but the article is simply a discussion of the relationship between the two. We could also call it correlation may or may not involve causation, though if the correlation is unlikely to be by chance we should expect there to be some causal relation, even if mediated by a confounding variable (it gives a more accurate impression, but is probably more suited to Uncyclopedia). Richard001 (talk) 06:08, 15 February 2008 (UTC)

The reason behind the title is that this is a common phrase used in the literature, which expresses a mathematical result. There could be scope for a separate correlation and causation article. --Salix alba (talk) 11:24, 15 February 2008 (UTC)

Health and income

I really don't like the income-health correlation example. I think it is biased towards rich countries. In rich countries it may be true that income does not have much meaning in terms of health because everyone has a good diet. But in poor countries the health-income correlation may well be indicative of good income causing good health. If you have a 1300 calorie diet, a higher income may well mean that you will have a 1600 calorie diet instead. Although neither diet is sufficient, I think the person with 1600 will be healthier.

Another example:

Teenage girls eat lots of chocolate.

Teenage girls are most likely to have acne.

Therefore, chocolate causes acne.

This argument, and any of this pattern, is an example of a false categorical syllogism. One observation about it is that the fallacy ignores (4), the possibility that the correlation is coincidence. We can pick an example where the correlation is as statistically "robust" as we please, but we still cannot assume one factor causes the other. If chocolate-eating and acne were strongly correlated across cultures, and remained strongly correlated for decades or centuries, it may not be a mere coincidence. However, in this particular example, the last statement is a logical fallacy because it ignores the possibility that a third factor may be the cause of eating chocolate and having acne (e.g. being young). See joint effect.

Someone correct me if I'm wrong about this. Narssarssuaq 16:50, 3 July 2006 (UTC)

Well clearly the example is flawed, but it could be either an example of common cause or coincidence. Being a teenage girl both causes you to eat chocolate (because of teenage girls' tastes) and have acne (because of teenage girls' hormones); neither causes the other. Of course, given a small sample size it could indeed just be a total coincidence unrepresentative of the larger population. But the whole point of the example is that it represents the fallacy this article explains. It is an example of a fallacy--of course it's fallacious! Eebster the Great (talk) 03:50, 7 May 2008 (UTC)
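The common-cause reading of the chocolate and acne example can be sketched as a simulation. Everything here is invented for illustration: the group indicator `teen`, the effect sizes, the noise and the seed are assumptions, not measurements. Chocolate and acne each depend only on group membership, never on each other.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate_common_cause(n=5000, seed=1):
    """Rows of (teen, chocolate, acne); both scores depend only on teen."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        teen = rng.random() < 0.5
        chocolate = (3.0 if teen else 1.0) + rng.gauss(0, 1)
        acne = (2.0 if teen else 0.0) + rng.gauss(0, 1)
        rows.append((teen, chocolate, acne))
    return rows


def correlations(rows):
    """Return the pooled correlation and the correlation inside each group."""
    overall = pearson([c for _, c, _ in rows], [a for _, _, a in rows])
    within = {}
    for group in (True, False):
        sub = [(c, a) for t, c, a in rows if t == group]
        within[group] = pearson([c for c, _ in sub], [a for _, a in sub])
    return overall, within


overall, within = correlations(simulate_common_cause())
print(f"pooled: {overall:.2f}, teens: {within[True]:.2f}, others: {within[False]:.2f}")
```

In this toy setup the pooled data show a clear positive correlation, yet within each group the correlation should be close to zero, which is what one would expect if the third factor alone drives both variables.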

Flying spaghetti monster example

This section is not well explained and will communicate nothing unless one already knows about the flying spaghetti monster pirate/global warming joke. Needs to be rewritten. --Xyzzyplugh 11:56, 10 December 2006 (UTC)

I do like this section as it does illustrate the point that it's possible to get correlation between two wholly unconnected sets of data. --Salix alba (talk) 22:00, 10 December 2006 (UTC)

Except that I would like to actually see the data these correlations are based on, or more to the point, I'd like to see the value of these correlations. There was this ugly incorrect graph, which we removed recently, but I have not seen anything else. It is very likely that there is a correlation here (the number of pirates went down and the average temperature went up over the last few centuries), but if we are not able to present the data and/or show what this correlation actually is, then this example is not worth much and should be removed. Schutz 22:33, 10 December 2006 (UTC)

Yes, the pirates/global warming example is an amusing and fitting one, it's just not being explained properly as of now. --Xyzzyplugh 13:41, 12 December 2006 (UTC)

Well, except it is deliberately inaccurate, or at least badly researched, as there are thousands of pirates around currently, possibly more than there ever have been in history, so the graph is probably headed in the opposite direction to how it is in reality. -- 86.129.7.144 01:26, 24 August 2007 (UTC)

Well, that graph was drawn up by Bobby Henderson, was totally made up, and given a hilarious scale just to emphasize its absurdity. Yes, it's clearly deliberately inaccurate, and actually isn't a particularly strong correlation, which is what makes it so funny. To clear things up, Bobby doesn't consider the pirates around today to be true pirates, since they steal and kill people and stuff. True pirates were divine beings who gave candy to children and such. There are very few true pirates left, although it is still common to dress up as one on Halloween or National Talk Like a Pirate Day (U.S.) :). Eebster the Great (talk) 04:10, 7 May 2008 (UTC)
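Since no real data were presented in this thread, the general point that two unrelated series with opposite trends will correlate can only be shown on synthetic numbers. The sketch below uses entirely made-up series (`falling` and `rising` are hypothetical names, and the slopes, noise levels and seed are assumptions). It compares the correlation of the levels with the correlation of the step-to-step changes.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def diff(xs):
    """First differences: the change from each step to the next."""
    return [b - a for a, b in zip(xs, xs[1:])]


def trend_correlations(n=1000, seed=2):
    """Return (correlation of levels, correlation of changes)."""
    rng = random.Random(seed)
    # Two synthetic series with independent noise and opposite linear trends.
    falling = [1000 - t + rng.gauss(0, 50) for t in range(n)]
    rising = [14 + 0.002 * t + rng.gauss(0, 0.3) for t in range(n)]
    levels = pearson(falling, rising)
    changes = pearson(diff(falling), diff(rising))
    return levels, changes


levels, changes = trend_correlations()
print(f"levels: {levels:.2f}, step-to-step changes: {changes:.2f}")
```

With these assumptions the levels should be strongly negatively correlated purely because of the shared dependence on time, while the changes should show almost no correlation. Differencing is used here only as one simple way to remove a trend, not as the method any of the commenters proposed.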

Examples need improvement

Someone needs to improve the examples given here. It is inappropriate to try to inform people on this subject using an example as unclear as HRT, on which even top scientific minds can reasonably disagree today. Also, the twin example given later is a poor example of a controlled experiment, as the twins obviously have separate histories at the start of the test and may even have very different GPAs. A more classic example with a reference should be substituted. Manyanswer (talk) 16:33, 12 June 2008 (UTC)

Quote

This is a quote that I ran across in Stephen Jay Gould's book "The Mismeasure of Man" (1981, W. W. Norton & Company, NY, NY). I thought that it might make a nice addition to the introduction to illustrate how frequent this mistake is.

"The invalid assumption that correlation implies cause is probably among the two or three most serious and common errors of human reasoning." p. 242

("cause" rather than "causation" is not a typo)

Timothyjwood (talk) 04:59, 30 July 2008 (UTC)

Example 1

Example 1 is labeled as B causes A, but in fact it is just another instance of what is found in Example 2, i.e. a third factor C causes both A and B. In the example given in 1, it is the fire that causes both the damage and the firemen. This can be turned back into B -> A from C -> A,B if the element of damage is removed, changing the formulation as follows:

The more firemen fighting a fire, the bigger the fire is going to be.

Therefore, firemen cause fire.

If someone else has a better example in mind, by all means change it. --Iamthedeus (talk) 21:45, 1 March 2009 (UTC)
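
To make the C -> A,B reading above concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the variable names, the coefficients, and the noise levels are invented, and no real fire data is involved. Fire size drives both the number of firemen and the damage, and neither of those two affects the other, yet they come out strongly correlated. Once fire size is controlled for, the correlation largely disappears.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after subtracting its least-squares fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


def simulate(n=5000, seed=0):
    """Hypothetical model: fire size (C) causes firemen (B) and damage (A)."""
    rng = random.Random(seed)
    fire = [rng.uniform(1, 10) for _ in range(n)]
    # Invented coefficients: bigger fires draw more firemen and do more damage.
    firemen = [2 * f + rng.gauss(0, 1) for f in fire]
    damage = [5 * f + rng.gauss(0, 2) for f in fire]
    return fire, firemen, damage


fire, firemen, damage = simulate()

# Firemen and damage look tightly linked although neither causes the other.
raw = pearson(firemen, damage)

# Partial correlation: correlate what remains after removing the fire's effect.
partial = pearson(residuals(firemen, fire), residuals(damage, fire))

print(f"firemen vs damage:           {raw:.2f}")
print(f"same, controlling for fire:  {partial:.2f}")
```

The residual step is one simple way of "controlling for" the third factor; it assumes the relationships are linear, which holds here only because the simulation was built that way.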

On "3. Coincidence"

The parody religion Pastafarianism does say that the number of pirates correlates with the amount of CO2 in the air, but the correlation is negative, not positive. The article presents it as a positive correlation, mainly where it says: "Since the 1950s, both the atmospheric CO2 level and crime levels have increased sharply." It should instead say that marine crimes have decreased sharply. Cencere (talk) 22:48, 27 May 2009 (UTC)

Those are two separate examples. Read again.--Loodog (talk) 03:12, 28 May 2009 (UTC)

Simpson's paradox

What about Simpson's paradox? There it looks as if A causes C more than B does, but actually D does more to cause both C and A. I don't really know how to put it in there, and there already are a bunch of things in the 'see also'. How many 'statistical paradoxes' pages are there? Maybe it needs a page with a list, so it can link there. Preceding unsigned comment added by 88.159.74.89 (talk) 14:13, 23 August 2009 (UTC)
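
For anyone unsure what the paradox looks like in numbers, here is a small sketch. The counts are hypothetical and chosen only so that the effect shows up. A and B are two options being compared, "success" plays the role of C, and the group (here called "small" or "large") plays the role of the hidden factor D. A does better than B inside each group, yet B does better when the groups are pooled, because A was mostly tried on the harder group.

```python
# Hypothetical (successes, trials) counts per group and option.
counts = {
    "small": {"A": (9, 10), "B": (80, 100)},
    "large": {"A": (60, 100), "B": (5, 10)},
}
options = ("A", "B")


def group_rate(group, option):
    """Success rate of one option within one group."""
    successes, trials = counts[group][option]
    return successes / trials


def pooled_rate(option):
    """Success rate of one option with all groups lumped together."""
    successes = sum(counts[g][option][0] for g in counts)
    trials = sum(counts[g][option][1] for g in counts)
    return successes / trials


# Which option has the higher success rate inside each group?
per_group_winner = {
    g: max(options, key=lambda o: group_rate(g, o)) for g in counts
}

# Which option has the higher success rate once the groups are pooled?
overall_winner = max(options, key=pooled_rate)

for g in counts:
    rates = ", ".join(f"{o}={group_rate(g, o):.2f}" for o in options)
    print(f"{g}: {rates} -> {per_group_winner[g]} wins")
pooled = ", ".join(f"{o}={pooled_rate(o):.2f}" for o in options)
print(f"pooled: {pooled} -> {overall_winner} wins")
```

The reversal comes entirely from the uneven group sizes: ignoring the grouping factor changes which option appears to work better.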

Racism

This fallacy is also the core of racism. It would probably be very helpful if there was some reference or link to it...since racism is such an important problem and so few people understand why it's wrong. 68.123.154.74 (talk) 14:39, 5 October 2009 (UTC)

Actually, you're thinking of Hasty generalization. Racism is related to Correlation does not imply causation, but is not caused by it.--Louiedog (talk) 15:10, 5 October 2009 (UTC)

There is overlap in these categories. I can see how racism would fit under hasty generalization, but I think it's more specific to explain exactly why the generalization is being made...in this case, by confusing correlation with causation. Jim Crow and all other racist conclusions are damaging because they try to make rules by assuming race is a cause of some observed difference or perceived problem instead of what actually causes the problem (in America, the big one being long-term effects of slavery). Interestingly enough, I don't see any reference to racism on the hasty generalization page either. 68.123.154.74 (talk) 21:53, 6 October 2009 (UTC)

I did add an example of racism to the hasty generalization page. Any kind of prejudice is just the human mind applying its (usually) useful habit of putting things into categories.--Louiedog (talk) 22:06, 6 October 2009 (UTC)

Tufte's volubility

I can make it shorter: "Causation logically implies correlation."

"Logically implies" removes the "casual" ambiguity of "implies" by invoking "logic", thus it allows that correlation can occur without causation.

Most people either aren't familiar or aren't facile with logic, which is why they think as if the converse were true.
198.207.0.5 (talk) 17:41, 19 November 2009 (UTC)