Wikipedia:Articles for deletion/Human genetic clustering

The following discussion is an archived debate of the proposed deletion of the article below. Please do not modify it. Subsequent comments should be made on the appropriate discussion page (such as the article's talk page or in a deletion review). No further edits should be made to this page.

The result was delete. Sandstein 08:54, 14 November 2020 (UTC)

Human genetic clustering


This page is a confused, unreadable mess of original synthesis that does not cover the actual topic. It does not accurately reflect the body of research on genetic clustering and needs to be blown up and rewritten.

Human genetic clustering is, roughly speaking, an approach that uses cluster analysis to study patterns in genetic data; it is a set of methods to characterize populations within studies. The problem is that this article is (and always has been) solely about the clusters themselves with close to no text dedicated to explaining how clusters are determined or what the actual process is. Virtually every modern study of human population genetics includes some type of cluster analysis, and they will always find clusters, leading us to the current coat-racked article revolving around group differences, race, ethnicity, and genetics. The article's current structure (to talk about clusters rather than the clustering algorithms, their applications, and interpretations/results) is designed to be a battlefield of POV-pushing, which we see with its contentious edit history and frequent visits from socks and trolls. Its original creator was banned long ago for being a sockpuppet account of a user who edit-warred on race-related articles like race and intelligence so I believe this article was not created in good faith or with good understanding of the topic.

There is a good article to be written on the topic as it relates to algorithms, inference, and major findings related to population histories, but there is no version of the existing article that would make a good base for that. In summary:

  • This article does not actually cover the topic it claims to. Virtually none of the text is dedicated to clustering, and almost all of it is dedicated to discussions of group similarities/differences or arguments about race and genetics
  • It is basically unreadable, consisting of duelling blocks of quotes that leave readers with less understanding of the topic.
  • It reads like an argumentative essay and is original synthesis. Citing (talk) 15:31, 29 October 2020 (UTC)
  • Comment: This is one of a cluster of articles that have been progressively lengthened by the thorny debates over race and genetics. That debate is of social and (to a lesser degree) scientific significance and ought to be covered somewhere on Wikipedia, and its intersection with a series of studies of human genetic clustering led to enormous amounts of coverage in both scientific and mainstream news RS. The overlap between genetic clustering and race generated enough discussion to spawn a Social Science Research Council forum and a prominent edited volume on the issue, so it deserves coverage on Wikipedia.
I agree with Citing that there is much unworkable material here, but disagree that blowing up the article is the best solution. My suggestions:
  1. Shrink and merge the "Analysis of human genetic variation" material into human genetic variation. Some of that material is more clearly shown here, so it could replace some of the chaos over there. The two-paragraph Edwards vs. interlocutors back and forth could be cut entirely or drastically trimmed.
  2. I'm not sure where or whether there is a home for "Blood polymorphism study," but it isn't here. A brief summary could be added to Luigi Cavalli-Sforza.
  3. The early part of "Genetic cluster studies" should be spun upwards into a description of genetic clustering algorithms and methods. The latter half should remain here as a list of major studies. Some cleaning throughout.
  4. Rosenberg's "Genetic Structure of Human Populations," which has over 2900 Google scholar citations, its own critical literature, and numerous news media articles about it, probably deserves its own article: Genetic Structure of Human Populations (scientific article) that should absorb some of the legitimate back and forth surrounding it. Summarize the article briefly here, since it's one of the oldest and smallest global cluster studies.
  5. Blow up and rewrite "Controversy of genetic clustering and associations with race" as "Genetic clustering and race": Much of this doesn't belong here, but in Race and genetics; and much of that is too lengthy, wordy, and back-and-forth-y for an encyclopedia article. Still, this is a substantial controversy that is significant for human genetic clustering.
Fundamentally, this is an article where various editors have tried to correct structural flaws by adding new material for too long. I'm not sure who can take on the rewrites involved, but I do think there is valuable material here, much of it in coherent blocks.--Carwil (talk) 17:08, 29 October 2020 (UTC)
Resource: Also, if someone wants to blow up the overall structure and needs a map to lay out the issues involved, they should consider this source. (I'd be happy to climb the paywall and share a PDF.): Novembre, John; Ramachandran, Sohini (2011). "Perspectives on human population structure at the cusp of the sequencing era". Annual Review of Genomics and Human Genetics. 12: 245–274.--Carwil (talk) 19:20, 29 October 2020 (UTC)


Thanks for your input, @Carwil. I agree largely with the points you've mentioned, but my problem is that there are simply too many topics and results being confused with one another in this article for it to be useful to work with. Any rewrite would have to start from scratch, and I think this is a consequence of how it was framed by its original author -- from the opening sentence onwards, the description of the topic is incorrect or misleading (i.e. that there are distinct a priori clusters of people and scientists are trying to find out who they are and what their genetics are). At no point is there an explanation of what genetic clustering is. Ideally this article should start by describing what the methods of clustering in human genetics are and go from there, and (imo) the article would have to be totally scrubbed for that.
As to your specific points:
  1. "Analysis of human genetic variation" and FST are related to the clustering process (in that after clusters are generated you may be interested in the FST/variation between/within them) but not central to it.
  2. Agreed that this needs to be removed. This works much better at the Cavalli-Sforza page, if anywhere.
  3. I think having methods described is critical, but a list of major studies would probably not help. Almost every modern genetics study has some cluster analysis so this would be unworkable (What counts as a major study? What unique population clusters are worth including? How do we write about them? You can see where this is going....). I have noticed a lot of new articles formed as Genetic history of X so maybe linking to a list of those in a "See also" section would be more helpful instead.
  4. Agreed that the Rosenberg study has been given too much weight.
  5. I think in the context of a rewritten article this could be reduced considerably to a section on "relation to race and ethnicity", which would be a much clearer and more scientific framing.
Even if the material in this article is useful enough to be included elsewhere and the article is kept, I think WP:TNT (stubbify and expand from there) is the only way forward for it to be scientifically accurate and encyclopedic. Citing (talk) 19:18, 29 October 2020 (UTC)
Not going to respond to everything but re 3: there has been a limited pool of large sample-size, global clustering studies which then get re-cited. I agree that there's a ton of cluster analysis in regional population histories.
I may want to back up a virtual moving truck to merge out some of this material before any top-to-bottom rewrite.--Carwil (talk) 19:48, 29 October 2020 (UTC)
Note: This discussion has been included in the list of Biology-related deletion discussions. Coolabahapple (talk) 10:28, 30 October 2020 (UTC)
Relisted to generate a more thorough discussion and clearer consensus.
Please add new comments below this notice. Thanks, Sandstein 18:46, 6 November 2020 (UTC)
  • Delete I think the discussion above has identified the issues well enough; the question is what to do about them, and deletion seems the simplest course of action. The basic framing of the current article is unhelpful, and the amount of refactoring necessary to save any of the current material is not, I think, worth the trouble. XOR'easter (talk) 20:30, 8 November 2020 (UTC)
  • Keep You'd have to be a fool to ignore that groups of people in the world look similar. That pre-DNA concept identified (often poorly) what is objectively and measurably true and this page explains these measurements. All the discussion of group similarities due to genetics show trends and the groupings back it up. It could certainly be cleaned up, but deletion is a last resort. The POV pushing types need to be kept in check by citing scientific papers, but deleting the whole article would ALSO be giving in to POV-pushing trolls. Literally censorship of facts that don't align with their political stances. Science and facts are above that. 97.122.84.35 (talk) 23:27, 9 November 2020 (UTC)
This !vote has almost nothing to do with the article, or with the arguments for deletion. Further, the euphemistic avoidance of the word race while defending the validity of it as a "pre-DNA concept" tips the hand that this comment is itself "pushing" a political stance. Biological racialism is pseudoscience, and if this article is, as you (unintentionally) suggest, a coat-rack for a fringe position, it should be deleted. It is not enough to merely "cite scientific papers" willy-nilly. We need to summarize what WP:MEDRS papers say with a strong preference for WP:IS about the entire topic. Individual studies are WP:PRIMARY, and citing arbitrarily selected examples would be WP:OR. Opposing supposed "censorship" is not a valid argument for the inclusion of badly-sourced or cherry-picked pseudoscience. Grayfell (talk) 00:48, 10 November 2020 (UTC)
In addition to the above, none of this addresses the main points I've made, which are that the article is (and always has been) filled with original research and that it does not accurately cover the topic and leaves readers with a poorer understanding. Citing (talk) 16:53, 10 November 2020 (UTC)
  • Delete - Per above Grayfell (talk) 00:48, 10 November 2020 (UTC)
  • Delete - nominator stated the case completely. // Timothy :: talk 15:19, 10 November 2020 (UTC)
  • Delete. This is a non-viable article on a potentially viable topic. Genetic clustering is a tricky subject, and any article about it would need to reflect sources actually discussing clustering, rather than genetic variation more generally. Indeed it's quite possible that the distinction between genetic clustering and genetic variation is too subtle for Wikipedia, and both topics are best treated at Human genetic variation. Vanamonde (Talk) 01:50, 14 November 2020 (UTC)