MediaWiki talk:Spam-blacklist/archives/March 2020

From Wikipedia, the free encyclopedia

belgaumtrend.site

Spam-only website. ~ ToBeFree (talk) 14:20, 1 March 2020 (UTC)[reply]

Added to MediaWiki:Spam-blacklist. ~ ToBeFree (talk) 14:22, 1 March 2020 (UTC)[reply]

Datanet India Pvt. Ltd

A long-term spamming problem (see also Wikipedia talk:WikiProject Spam/2007 Archive Jul 2), systematic recurring spamming (and occasional good-faith misuse) of a non-reliable data aggregator and "research" website. GermanJoe (talk) 17:55, 2 March 2020 (UTC)[reply]

@GermanJoe: Added to MediaWiki:Spam-blacklist. --GermanJoe (talk) 17:56, 2 March 2020 (UTC)[reply]

U.S. tax code edit

Hello there! While looking at some articles on the U.S. tax code, I stumbled upon one that needed to be edited, so I edited it. It turns out that someone other than our site administrators is placing our links on various websites, including adult and spammy websites, for negative SEO purposes, to make our site look bad to Google. I don't know how it got into Wikipedia, but I believe it was on articles related to either tax or finance. I checked the log, but our website "futufan.com" was nowhere to be found. We are in no way related to the addition of our website's pages to Wikipedia. While we do not plan on including our links on Wikipedia unless it is 100 percent necessary, the just thing to do would be to remove it from the blacklist. Aligraying (talk) 07:52, 2 March 2020 (UTC)[reply]

@Aligraying:  Not done, actually, it is not even blacklisted. Do note that it is not our business who is adding those links, our business is to protect Wikipedia against unsolicited additions. --Dirk Beetstra T C 10:54, 2 March 2020 (UTC)[reply]
@Dirk Beetstra: Are you sure though? I edited the Standard Deduction article for the 2020 updates and added our dedicated article to it, which explains the changes to how taxes will be paid in 2020, but it wouldn't let me because the site was blacklisted. I'm confused now that you're telling me it isn't blacklisted. I took a screenshot of the error when I saw it to show my staff. Can I provide the link here from Imgur? Aligraying (talk) 07:23, 3 March 2020 (UTC)[reply]
@Beetstra: Actually it is blacklisted here, and rather recently too, judging by its position a half-dozen lines from the bottom of the list.
@Aligraying: We generally don't consider delisting requests from individuals associated with a listed site. If a trusted high-volume editor deems the link worthy of including in an article, we will consider it. ~Anachronist (talk) 07:37, 3 March 2020 (UTC)[reply]
@Anachronist: ST47 added it, and seeing how it was added, that was the proper action to protect the encyclopedia (though maybe they should be trouted for not logging it :-) ). Aligraying, I find your argument highly unconvincing: 7 editors have been adding this over 19 days, at intervals of ~3 days on average. Then you come along 3 days later and try to add it. That is too much of a coincidence.
Actually, I do think that ST47 has been a bit hasty here, this should have been blacklisted globally: Rejected and  Defer to Global blacklist. --Dirk Beetstra T C 11:06, 3 March 2020 (UTC)[reply]

Can't you guys check the IP addresses of the people who added our links to the site? Whoever put our links on Wikipedia clearly isn't associated with us. This was obviously for negative SEO purposes. But yeah. Anyhow, this isn't right but it is understandable on you guys' end. Aligraying (talk) 08:26, 3 March 2020 (UTC)[reply]

Aligraying, no. And it would not matter anyway. It was spammed, and the usual source is the SEO companies employed by the website. Choose more wisely in future. Guy (help!) 10:29, 3 March 2020 (UTC)[reply]

stmarks-cardiff.co.uk

I don't understand why this is blacklisted unless it is because other sites with cardiff.co.uk in the name are blacklisted. ActiveRetired (talk) 13:36, 3 March 2020 (UTC)[reply]

May be a false positive caught by the blacklisting for cardiff.co.uk. Dirk Beetstra, per this comment, looks like we may need to modify the listing. OhNoitsJamie Talk 13:44, 3 March 2020 (UTC)[reply]
@ActiveRetired: It is easier to whitelist this one:  Defer to Whitelist. Do you mind requesting it there? --Dirk Beetstra T C 14:00, 3 March 2020 (UTC)[reply]
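
A minimal Python sketch of how such a false positive can arise. It assumes the listing is an entry of the form `\bcardiff\.co\.uk\b`; the actual entry is not quoted in this thread, and the second and third hostnames are made up for illustration. Because `-` is not a word character, `\b` matches between `-` and `c`, so the hyphenated hostname is caught too. The lookbehind variant is only one hypothetical way to narrow an entry, not what was done here (the request was deferred to the whitelist).

```python
import re

# Assumed form of the blacklist entry; the real entry is not quoted in the thread.
broad = re.compile(r"\bcardiff\.co\.uk\b")
# Hypothetical narrower variant: refuse a match that directly follows a hyphen.
narrow = re.compile(r"(?<!-)\bcardiff\.co\.uk\b")

urls = [
    "http://stmarks-cardiff.co.uk/",  # the reported false positive
    "http://www.cardiff.co.uk/",      # illustrative host
    "http://notcardiff.co.uk/",       # illustrative host, no word boundary before "cardiff"
]

# Map each URL to (matched by broad entry, matched by narrow entry).
results = {u: (bool(broad.search(u)), bool(narrow.search(u))) for u in urls}
for url, (hit_broad, hit_narrow) in results.items():
    print(f"{url:32} broad={hit_broad} narrow={hit_narrow}")
```

In this sketch the broad entry catches the first two URLs, while the narrow variant lets the hyphenated hostname through.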

tripbibo.com

Spam by confirmed and suspected sockpuppets. ~ ToBeFree (talk) 04:41, 4 March 2020 (UTC)[reply]

Added to MediaWiki:Spam-blacklist. ~ ToBeFree (talk) 04:42, 4 March 2020 (UTC)[reply]

Logs

Is there a log of hits for the blacklist? Some of the entries have been on here for years, and it might be worth reviewing and removing anything with no hits in two years, to keep the blacklist from blowing up. Guy (help!) 10:15, 5 February 2020 (UTC)[reply]

@JzG: there is enough material on there that simply should never be taken off, even if it hasn't been hit in two years. Blowing up however would be a good thing, so maybe 14 year old bugs are finally going to be solved. You know, if something is not broken, develop something else that will break it.</sarcasm> --Dirk Beetstra T C 11:20, 5 February 2020 (UTC)[reply]
User:Lustiger seth however has been cleaning up sometimes on meta removing things. Some domains can be removed because they have now a new owner, or have cleaned up their act. --Dirk Beetstra T C 11:22, 5 February 2020 (UTC)[reply]
Beetstra, sure, but there are a lot of sites added after brief spamming sprees - often by meatbots - where the risk is probably over. I wonder if we could at least check whether older sites are still online, using a bot? If they are 404 or domain parked, we could probably remove them. Guy (help!) 11:32, 5 February 2020 (UTC)[reply]
JzG, Ls should have that script/bot. Situation is somewhat complex but there will for sure be material that can be cleaned up. Dirk Beetstra T C 11:36, 5 February 2020 (UTC)[reply]
Beetstra, see removal requests above for a possible quick win - I ran a DNS lookup script. A sample of around 100 manual checks has yielded no false positives. Guy (help!) 14:58, 5 February 2020 (UTC)[reply]
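
A rough Python sketch of the kind of check described; this is not the script that was actually run. It assumes a plain list of hostnames already extracted from blacklist entries (turning regex entries into hostnames is not shown). The function name `unresolvable_hosts` and the injectable `resolver` parameter are made up for illustration. A failed lookup only suggests a dead domain; parked domains and sites returning 404 still resolve and would need a separate check.

```python
import socket
from typing import Callable, Iterable, List


def unresolvable_hosts(
    hosts: Iterable[str],
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> List[str]:
    """Return the hosts for which the DNS lookup raises an error."""
    dead = []
    for host in hosts:
        try:
            resolver(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            dead.append(host)
    return dead

# Example call (needs network access, so it is left commented out):
# unresolvable_hosts(["example.com", "no-such-host.invalid"])
```

Passing the resolver as a parameter keeps the function testable without network access, and any candidate it reports would still deserve the manual spot checks mentioned above.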
Hi!
I could create a list of non-hitting entries (for the last ~5? years). Afterwards we should remove url shorteners (and some other exceptions?) from the list. Then we could decide whether the remaining entries should be removed from the black list.
imho this is almost independent of the content of the webpages. -- seth (talk) 22:59, 6 February 2020 (UTC)[reply]
Lustiger seth, Fantastic! Yes, please. Guy (help!) 14:49, 9 February 2020 (UTC)[reply]
I started a bot run to collect all data. This will take a while. Maybe next weekend I can create a page with some results. -- seth (talk) 18:19, 9 February 2020 (UTC)[reply]
Lustiger seth, heroic work, thanks. Guy (help!) 12:32, 11 February 2020 (UTC)[reply]
As a start: User:Lustiger_seth/sbl_log_2013--2020. this is not yet finished. it takes about 6--7 minutes per blacklist entry (and there are ~8,2k of them) to search the whole sbl log table (which has about 100M rows). -- seth (talk) 10:36, 15 February 2020 (UTC), 21:19, 15 February 2020 (UTC)[reply]

Subsection for \b^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b

The very first entry is quite interesting: \b^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b: this is a contradiction. it will never match any link addition because a url never starts with a digit, it starts with a protocol (e.g. http or https). so ^\d will fail -- always. this entry is just superfluous. the question is: should it be deleted or should it be fixed (by replacing the \b^ with (?<=//))? the latter would require us to look for all ip urls and check them. -- seth (talk) 10:43, 15 February 2020 (UTC)[reply]
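
The point can be checked with a short Python sketch. It assumes, as described above, that the pattern is tested against the full URL including its protocol; the two sample URLs are made up and use a documentation-reserved address.

```python
import re

# The entry as listed, and the fix suggested above (\b^ replaced by a lookbehind).
original = re.compile(r"\b^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b")
proposed = re.compile(r"(?<=//)\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b")

ip_url = "http://192.0.2.10/page"            # IP address used as the host
name_url = "https://example.org/192.0.2.10"  # IP-like text only in the path

# "^" anchors at the start of the string, where the URL has "h", not a digit.
original_hits = [bool(original.search(u)) for u in (ip_url, name_url)]
# The lookbehind only requires "//" directly before the first octet.
proposed_hits = [bool(proposed.search(u)) for u in (ip_url, name_url)]

print(original_hits, proposed_hits)
```

Under these assumptions the original entry matches neither URL, while the lookbehind variant matches only the URL whose host is an IP address.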

@Lustiger seth: can we figure out who added that and what they had in mind at that time? --Dirk Beetstra T C 05:40, 16 February 2020 (UTC)[reply]
Well, it was here by User:Reaper Eternal. It appears they only wanted to blacklist IPs. I can see good cause for that (I have seen it being used as blacklist evasion). But I am afraid there are quite some IPs on Wikipedia and (quite) some of them might be genuine-ish. --Dirk Beetstra T C 05:50, 16 February 2020 (UTC)[reply]
hi!
right now we have only 4 blacklisted explicit ip addresses in the sbl. so i guess we could just remove the entry and continue blacklisting single ip addresses.
another solution would be to correct the general blacklist entry and whitelist explicit ip addresses. -- seth (talk) 11:26, 16 February 2020 (UTC)[reply]
Lustiger seth, it is very difficult to determine to what extent IP addresses are an issue. I do know one issue if we block all IP addresses .. User:COIBot will fail to save reports until I fix it .. Dirk Beetstra T C 11:39, 16 February 2020 (UTC)[reply]
Spamming of IP addresses obviously isn't much of a problem if it has taken 8 years to notice that this didn't take. - MrOllie (talk) 11:56, 16 February 2020 (UTC)[reply]

MrOllie, It's not that easy .. there are some IP addresses on the list, which are likely the cases which jumped out. Others are less visible because it likely is limited to just a couple of additions per IP. Some searching on the ones that LiWa3 found suspicious:

(COIBot is convinced that these do match the rule .. funny; I haven't analyzed whether these are 'a problem'). --Dirk Beetstra T C 12:12, 16 February 2020 (UTC)[reply]

Next steps

Lustiger seth, is User:Lustiger seth/sbl log 2013--2020 all entries with no hits, or does it require that the entry was in place before 2013? It looks like the former, the entries are added sequentially and the earliest one I can find is \bmuineresorts\.com\b - this was added in 2008 according to MediaWiki talk:Spam-blacklist/archives/June 2008 § Vietnam Travel Promotion Group. If so, I think we should go ahead and purge any with zero hits in your list - 1017 records, or about 12% of the list. Is there any way of checking whether that has a performance impact? Do we track server cost of blacklists? Guy (help!) 13:11, 19 February 2020 (UTC)[reply]

hi Guy!
as i said above, the list was not completed yet. however, the list should be complete now (since 2020-02-23 15:49).
the first column contains all sbl entries that were on the sbl at the beginning of 2020-02. the second column contains the number of hits in the sbl log. the sbl log was created 2013-09, iirc. that means: 1. there is no log data prior to that date. 2. if an sbl entry is just 1 week old, this might be a reason for a low number of hits.
performance: i don't know whether this can be measured (easily). -- seth (talk) 15:58, 23 February 2020 (UTC)[reply]

Meta

93 items have matching items on the global blacklist.

  • Regex requested to be blacklisted: \bxsl\.pt\b
  • Regex requested to be blacklisted: \badelaide-classifieds\.info\b
  • Regex requested to be blacklisted: \baliexpress\.com\b
  • Regex requested to be blacklisted: \bamericaswomenmagazine\.xyz\b
  • Regex requested to be blacklisted: \bcool-fuel\.co\.uk\b
  • Regex requested to be blacklisted: \bgeolocation\.ws\b
  • Regex requested to be blacklisted: \bgreattibettour\.com\b
  • Regex requested to be blacklisted: \bhappyjanamashtamiwishes\.blogspot\.com\b
  • Regex requested to be blacklisted: \blnk\.pics\b
  • Regex requested to be blacklisted: \bsportstation\.store\b
  • Regex requested to be blacklisted: \bstores\.ebay\.com\b
  • Regex requested to be blacklisted: \bhidemyass\.com\b
  • Regex requested to be blacklisted: \bsplit\.to\b
  • Regex requested to be blacklisted: \bempowernetwork\.com\b
  • Regex requested to be blacklisted: \bgrowtobacco\.net\b
  • Regex requested to be blacklisted: \bmeatspin\.com\b
  • Regex requested to be blacklisted: \bsukmulberryshops\.co\.uk\b
  • Regex requested to be blacklisted: \badfoc\.us\b
  • Regex requested to be blacklisted: \bgetinfo\.co\.in\b
  • Regex requested to be blacklisted: \boptimalstackproduct\.com\b
  • Regex requested to be blacklisted: \bmaletestosteronebooster\.org\b
  • Regex requested to be blacklisted: \bthehealthyadvise\.com\b
  • Regex requested to be blacklisted: \bteespring\.com\b
  • Regex requested to be blacklisted: \bfirstleaks\.com\b
  • Regex requested to be blacklisted: \bmuscleperfect\.com\b
  • Regex requested to be blacklisted: \bpharmshop-online\.com\b
  • Regex requested to be blacklisted: \blovifm\.com\b
  • Regex requested to be blacklisted: \bdankontorstole\.dk\b
  • Regex requested to be blacklisted: \bclonezone\.link\b
  • Regex requested to be blacklisted: \bgoods555\.com\b
  • Regex requested to be blacklisted: \bbikramsinghmajithia\.blog\.com\b
  • Regex requested to be blacklisted: \brebootmymodem\.net\b
  • Regex requested to be blacklisted: \b123malikoki\.info\b
  • Regex requested to be blacklisted: \bpisinaspa\.gr\b
  • Regex requested to be blacklisted: \bkickass\.ink\b
  • Regex requested to be blacklisted: \bpulseoxadvocacy\.com\b
  • Regex requested to be blacklisted: \bsport2018\.org\b
  • Regex requested to be blacklisted: \bmentaldaily\.com\b
  • Regex requested to be blacklisted: \bshort4free\.us\b
  • Regex requested to be blacklisted: \bpetstation\.store\b
  • Regex requested to be blacklisted: \bedubirdie\.com\b
  • Regex requested to be blacklisted: \batheistrepublic\.org\b
  • Regex requested to be blacklisted: \bwelookups\.com\b
  • Regex requested to be blacklisted: \bmywikibiz\.com\b
  • Regex requested to be blacklisted: \belbo\.in\b
  • Regex requested to be blacklisted: \beasy-bator\.com\b
  • Regex requested to be blacklisted: \ballxreport\.com\b
  • Regex requested to be blacklisted: \b1mg\.com\b
  • Regex requested to be blacklisted: \bsci-hub\.
  • Regex requested to be blacklisted: \bwhereisscihub\.now\.sh\b
  • Regex requested to be blacklisted: \bksol\.vn\b
  • Regex requested to be blacklisted: \byoucanplayandhavefun\.blogspot\.com\b
  • Regex requested to be blacklisted: \btournament-player-magazine\.blogspot\.com\b
  • Regex requested to be blacklisted: \blearn-how-to-play-this\.blogspot\.com\b
  • Regex requested to be blacklisted: \bletsmegetme\.blogspot\.com\b
  • Regex requested to be blacklisted: \bmedicines-for-allergies\.blogspot\.com\b
  • Regex requested to be blacklisted: \bnothingmoretodobefore\.blogspot\.com\b
  • Regex requested to be blacklisted: \bstarslots\.pw\b
  • Regex requested to be blacklisted: \btranscription-services-us\.com\b
  • Regex requested to be blacklisted: \bhoanganhmart\.com\b
  • Regex requested to be blacklisted: \bsuadieuhoagiare247\.com\b
  • Regex requested to be blacklisted: \bbladejournal\.com\b
  • Regex requested to be blacklisted: \bopknice\.com\b
  • Regex requested to be blacklisted: \bgame24h\.co\b
  • Regex requested to be blacklisted: \busagoldentour\.com\b
  • Regex requested to be blacklisted: \bozinice\.com\b
  • Regex requested to be blacklisted: \bsubweb\.co\.il\b
  • Regex requested to be blacklisted: \bdaynightcarebd\.com\b
  • Regex requested to be blacklisted: \bcamcavetxegiacao\.com\b
  • Regex requested to be blacklisted: \bgopaintsprayer\.com\b
  • Regex requested to be blacklisted: \bpro-pharmaceuticals\.com\b
  • Regex requested to be blacklisted: \bzom\.vn\b
  • Regex requested to be blacklisted: \buscagsa\.com\b
  • Regex requested to be blacklisted: \bsitusrajabola\.net\b
  • Regex requested to be blacklisted: \bagendominopro\.net\b
  • Regex requested to be blacklisted: \bforkeq\.com\b
  • Regex requested to be blacklisted: \bhempoilxll\.com\b
  • Regex requested to be blacklisted: \bgenericbuddy\.com\b
  • Regex requested to be blacklisted: \bmasterpkr\.com\b
  • Regex requested to be blacklisted: \btaruhanbandarq\.xyz\b
  • Regex requested to be blacklisted: \brevistas\.nics\.unicamp\.br\b
  • Regex requested to be blacklisted: \bmasters-of-fun\.de\b
  • Regex requested to be blacklisted: \bfreemansworld\.de\b
  • Regex requested to be blacklisted: \bonlinecasinounion\.us\.com\b
  • Regex requested to be blacklisted: \bschooltips\.com\.ng\b
  • Regex requested to be blacklisted: \bfreebitco\.in\b
  • Regex requested to be blacklisted: \blocuspharmaceuticals\.com\b
  • Regex requested to be blacklisted: \byoulike222\.com\b
  • Regex requested to be blacklisted: \bpharmacosmed\.com\b
  • Regex requested to be blacklisted: \bthrillophilia\.com\b
  • Regex requested to be blacklisted: \bsafe-steroids\.net\b
  • Regex requested to be blacklisted: \bbitmix\.biz\b

I guess these can also be cleaned up. Guy (help!) 15:54, 19 February 2020 (UTC)[reply]

JzG, Glancing through this list, I guess they can all just go. Dirk Beetstra T C 06:01, 1 March 2020 (UTC)[reply]
@JzG: Removed from MediaWiki:Spam-blacklist. --Dirk Beetstra T C 06:05, 1 March 2020 (UTC)[reply]

Proposal

I propose to do the following:

  1. Take the blacklist as of 1-Mar-2017 (3 years ago).
  2. Match any entry in seth's list with 0 hits since 2013.
  3. Remove all entries which have been on the SBL since 1 Mar 2017 or earlier with no hits since 2013.
  4. Remove all entries that are on the global blacklist.

What do people think? Guy (help!) 22:02, 29 February 2020 (UTC)[reply]

hi!
i did similar things in former times. and i still think, this is useful. so i support this (and i could help getting it done). -- seth (talk) 22:20, 29 February 2020 (UTC)[reply]
@Lustiger seth and JzG:, I have just removed the ones that do have an item on the global blacklist above (though, if the blacklisting reasons are significantly different, it may be worth keeping them on here as well; they can always be re-added if they become a new problem).
Although it needs some care, most items that don't hit can indeed be safely removed. Here it is less of an issue than on meta, where you certainly do not want to remove redirect sites, malware sites etc. Dirk Beetstra T C 11:06, 2 March 2020 (UTC)[reply]
ok! on friday or saturday i should have time to remove the ones, mentioned in 3. -- seth (talk) 22:28, 4 March 2020 (UTC)[reply]
done.[1] the first script i used was faulty. this one should be correct:
#!/usr/bin/perl
use strict;
use warnings;
use File::Slurp qw/slurp write_file/;

my @sbl_present = slurp("present.txt");
my @sbl_2017    = slurp("2017.txt");
my @zero_hits   = slurp("zero_hits.txt", {"chomp" => 1});
my (@sbl_new, @sbl_deleted);
# reduce to actual entries (without any comments)
@sbl_2017 = grep {s/^[^\s#]+\K.*//s; !/^#/} @sbl_2017;
# index both lists for fast exact-match lookups
my %zero_hits = map {$_ => 1} @zero_hits;
my %sbl_2017  = map {$_ => 1} @sbl_2017;
# filter present list
for my $p(@sbl_present){
	$p =~ s/[ \t]+$//;  # trim right
	# get entry (without comments); undef for blank and comment-only lines
	my ($entry) = $p =~ /^([^\s#]+)/;
	if(defined $entry
		&& $zero_hits{$entry}
		&& $sbl_2017{$entry}
	){
		# on the list since 2017 or earlier and never hit: remove
		push @sbl_deleted, $entry . "\n";
	}else{
		# keep the line as it is, including comments and blank lines
		push @sbl_new, $p;
	}
}
write_file('new.txt', @sbl_new);
write_file('deleted.txt', @sbl_deleted);
i used the output in new.txt to update the sbl. the output in deleted.txt i'll use now to log the removals.
-- seth (talk) 10:45, 6 March 2020 (UTC), 11:38, 6 March 2020 (UTC)[reply]
Lustiger seth, I understand about 70% of that! Thanks :-) Guy (help!) 22:42, 6 March 2020 (UTC)[reply]

10bet.com

I've been working on articles related to casino and games and recently created this article (10bet). Turns out the official site is blacklisted. I couldn't find the reason in the blocklog. Lunar Clock (talk) 22:07, 6 March 2020 (UTC)[reply]

hi!
this domain is blacklisted at meta (for all wikipedias and not just for enwiki): you may consider to ask for unblocking at m:Talk:Spam_blacklist.
the domain was blacklisted in 2007.[2] as far as i can see, a reason has not been given. -- seth (talk) 22:29, 6 March 2020 (UTC)[reply]
See  Defer to Whitelist (if it hasn't already been whitelisted for 10bet) - [3] OhNoitsJamie Talk 22:33, 6 March 2020 (UTC)[reply]
I've made a request at meta. Thank you. Lunar Clock (talk) 22:19, 7 March 2020 (UTC)[reply]

ymail.info

Garbage site recently spammed by multiple IPs, three logged so far. Added. OhNoitsJamie Talk 13:26, 3 March 2020 (UTC)[reply]

@Ohnoitsjamie:  Defer to Global blacklist, cross-wiki problem. --Dirk Beetstra T C 13:55, 3 March 2020 (UTC)[reply]
@Ohnoitsjamie: Handled on meta. --Dirk Beetstra T C 13:58, 3 March 2020 (UTC)[reply]
@Beetstra: Should it be removed from en? OhNoitsJamie Talk 14:02, 3 March 2020 (UTC)[reply]
@Ohnoitsjamie: up to you, it can be as it is now global'd. — billinghurst sDrewth 02:02, 8 March 2020 (UTC)[reply]

Vietnamese websites

New links are appearing as fast as I can remove them. NinjaRobotPirate (talk) 23:26, 7 March 2020 (UTC)[reply]

NinjaRobotPirate
Let's see if we have all links by these users. Dirk Beetstra T C 05:21, 8 March 2020 (UTC)[reply]

COIBot suspicious local reports

(crossposted to WT:WPSPAM)

m:User:LiWa3 is doing basic statistics on domains that it has seen being added, and when those statistics are suspicious it throws those in the general direction of COIBot. COIBot is then reporting those in a local category or a xwiki category, depending on the type of statistics.

COIBot has been saving reports for years, and most of those are still lingering (COIBot closes some automatically when they have been cleaned up independently, but with so many it does not check all). I evaluated a good handful of the old ones and closed them, and I have been trying to keep up with some of the new ones for a week. I do find that a significant portion of them do need a follow up (most need cleanup, quite some outright blacklisting).

May I ask you to turn on category changes in your watchlist, watchlist Category:Open Local COIBot Reports, and evaluate all that COIBot is opening in there. Please try to close them with an evaluation remark for further reference. Thanks. --Dirk Beetstra T C 06:08, 8 March 2020 (UTC)[reply]

spdload.com / webspero.com

Recurring spam for marketing sites / PR blogs, multiple warnings for each. GermanJoe (talk) 10:34, 10 March 2020 (UTC)[reply]

@GermanJoe: Added to MediaWiki:Spam-blacklist. --GermanJoe (talk) 10:37, 10 March 2020 (UTC)[reply]

econlib.org

I have added a reference to an article from a 1945 economic journal in the pricing signal wiki article. I used the journal template and then added a URL to the site that hosts the article. That website appears to be blacklisted. I just removed the URL, since the reference is valid and points to a historic article that exists outside the web.

My goal was to find the reason for deletion. My initial impression was that perhaps econlib had credibility issues, maybe there had been an edit war, or it was recurrently added with poor regards to wikipedia's standards.

However, what I found in the archives is a very weak reason for blacklisting:

"All appear to have been added by Vipul (talk · contribs · logs · edit filter log · block log), Riceissa (talk · contribs · logs · edit filter log · block log) or other paid surrogates of Vipul, most are selling legal services related to immigration, and the overall conclusion is SEO. All valid content can be drawn from more authoritative sources such as law books, pages in law faculty websites, official government sources etc. Guy (Help!) 23:24, 10 March 2017 (UTC)"

Another user comments: "I think econlib.org needs to be removed. This is a legitimate, relatively prominent Economics blog where relatively prominent economists(Sumner, Bryan Caplan) discuss current issues in econ. Dark567 (talk) 01:31, 11 March 2017 (UTC) "

The concern was heard but nothing was done about it.

This single user resulted in the blacklist of 8 or 10 sites and econlib appeared to be caught in the fire. I'm not going to bother searching what the particular edits were.

In addition to removing the individual site from the blacklist, there must be other false positives, so I propose some systemic improvements that can be made.

First I'll list the steps I took to try to solve this issue:

  1. Check in both the local and global blacklists.
  2. navigate through the archives to find the reason for the ban.
  3. Post here about it.

Between steps 2 and 3, finding the relevant edits would help determine whether the ban was warranted, but doing so is significantly difficult. So far, I would like to see a search tool that:

  • looks in both local and global banlists
  • finds the reason for blacklisting and links the user towards it.
  • is triggered upon an edit that includes the url, and shows the reason to the user.
  • Remove the restriction from talk pages, since it makes it hard to even discuss these blacklists. (it's /library/Essays/hykKnw.html )
  • log attempted edits that include this url. If 1 user made a low quality or vandal edit to wikipedia with that link, but 50 users attempted to make 50 good edits with that link, the url should automatically be suggested for removal.
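
A minimal Python sketch of the first two bullets, under loud assumptions: the sample entries, the notes and the function name `find_listings` are hypothetical and not copied from the real blacklists or their logs, and the test URL only combines the domain and the path mentioned above for illustration. It shows one way a lookup could report which list and which entry block a given URL.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical sample data: each list maps a regex entry to a free-text note.
# Neither the entries nor the notes are taken from the real blacklists.
BLACKLISTS: Dict[str, Dict[str, str]] = {
    "local": {r"\beconlib\.org\b": "see local archive discussion (placeholder)"},
    "global": {r"\bexample-spam\.test\b": "see meta discussion (placeholder)"},
}


def find_listings(
    url: str,
    blacklists: Dict[str, Dict[str, str]] = BLACKLISTS,
) -> List[Tuple[str, str, str]]:
    """Return (list name, entry, note) for every entry that matches the URL."""
    hits = []
    for list_name, entries in blacklists.items():
        for entry, note in entries.items():
            if re.search(entry, url, flags=re.IGNORECASE):
                hits.append((list_name, entry, note))
    return hits


print(find_listings("http://econlib.org/library/Essays/hykKnw.html"))
```

Finding the actual reason for a listing would still depend on the discussion archives and logs, which this sketch does not model.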

The benefits would be:

  • Decreasing references without free access to the cited content.
  • Decreasing biases in wikipedia.
  • Decreasing the amount of erroneously rejected edits.
  • Better communicating to users why their edit was rejected.

The costs of this edit would be:

  • Development time, I'm free and have the skills to implement this.
  • Admin time, this might increase the maintenance of the blacklist.

So if I get backing from those currently responsible for going through this list, I can post this suggestion to the appropriate section and start working on it. — Preceding unsigned comment added by TZubiri (talk • contribs) 23:06, 8 March 2020 (UTC)[reply]

@TZubiri: again, a) there is a direct connection between the paid editor you are talking about and Bryan Caplan, and b) by far most of the material is replaceable (outside of the (already whitelisted) encyclopedia there will be very, very few exceptions), and most of the cases we have seen until now are replaceable with links to e.g. WikiSource. Repeated requests on the whitelist have shown the latter.
The material that was the problem has been deleted, and is visible only to admins. It has however been explained over and over. See my first paragraph. Removing this needs a consensus (which you are free to gather) showing that the site is absolutely needed. The benefits you are quoting are hardly true:
  • most references have free access to the cited content, if it is freely hosted on econlib, it is likely freely hosted elsewhere as it is out of copyright protection (up to, often, WikiSource).
  • that argument is totally useless. The use of references is not affecting the bias. And as you have referenced it now it is totally unbiased and properly referenced. Even better, it is plainly referenced to the official source of the information. Everyone can find the reference if they want to.
  • The edits are not erroneously rejected, someone with a vested interest was editing this, it is rightfully blacklisted.
  • You just have to ask. It is less than 6 hours and you have an answer.
There is not a lot to develop, we have the searchbox above, and this track that shows you where this was discussed and gives above explanation several times. And hence, it does not increase admin time.
Declined, Defer to Whitelist for specific links on this domain (but this one is freely accessible elsewhere, even on 'neutral' servers, so I would not bother). --Dirk Beetstra T C 06:07, 9 March 2020 (UTC)[reply]

Thanks for looking into this. I wasn't aware of the direct link between the infractions and the site owners. Since in this case econlib is functioning as a content host for a widely available primary source, I linked to an alternate host. I still think the user interface can be improved: perhaps a summarized reason for the blacklisting could be provided in the rejection message, along with the search results of the searchbox you reference in case users want to dig deeper. You'll have to excuse users who miss this, but there's an overload of information for regular users. I'm interested in your perspective on this idea:

What would you think would be a good message for users to see when they link to econlib?
If you had the capacity to do so, would you send different messages to users depending whether econlib is being cited as a primary or secondary source? 

Out of curiosity, is it technically possible to blacklist a website as a secondary source but still allow it to work as a host for primary sources? --TZubiri (talk) 19:19, 9 March 2020 (UTC)[reply]

TZubiri, not using this blacklist, no. Guy (help!) 00:09, 11 March 2020 (UTC)[reply]
TZubiri, except for the encyclopedia there is hardly anything there that needs to be linked there. Yes, there is a lot of material in the library that is suitable as primary or secondary source, but, again, it is practically ALL available on 'neutral' websites (up to WikiSource). Whitelisting can take care of the rest, but I do not recall having seen any requests that pass that bar.
No, the spam-blacklist is black-and-white. That is something that should have changed (and that request is basically 14 years old), but WMF. Anyway, I don't think that the software should make the distinction 'oh, it is used as a secondary source, so it is fine to link to this spammed site'. Also, yes, it would be nice that we have the possibility for more 'custom' messages on domains. That would be possible, but again, that needs WMF. Dirk Beetstra T C 07:22, 11 March 2020 (UTC)[reply]

WikiLeaks

I came across some citations to WikiLeaks. That seems like a really bad idea: pretty much by definition the material they host is in violation of copyright. Guy (help!) 13:05, 20 February 2020 (UTC)[reply]

As I understand it, the material they host was produced by governments and is not copyrighted.
There is a possible problem in linking to information that governments consider classified. When I was a defense department employee, we couldn't even look at Wikileaks (even on personal time) due to the danger of being exposed to classified information we weren't cleared to know, which is a serious thing if you're in government. It was a weird situation where the public could do what they wanted but those of us in government service had restrictions. That was years ago; I don't know how they handle it these days.
To the extent that government documents are reliable sources, citing such documents on Wikileaks should not be a problem if that's the only venue where they can be seen. ~Anachronist (talk) 05:56, 24 February 2020 (UTC)[reply]
Anachronist, I'm not sure I understand .. is material from a government not copyrighted? I would expect that the organisation (not the individual that wrote it) holds the copyright.
Though I agree that some of the material can be a reliable source, there is also not a necessity to have a working link to the information (if too much of the info is problematic linking to). Dirk Beetstra T C 06:01, 24 February 2020 (UTC)[reply]
@Beetstra: I'll answer your question with the lead sentence of our article Copyright status of works by the federal government of the United States. If the communique, document, or other work was written by a government employee, it isn't subject to domestic copyright, but if the work was written by a contractor the situation is muddier. I'd wager that most of the documents on Wikileaks are generated by governments (largely the US government) and therefore not subject to copyright.
I oppose blacklisting Wikileaks, but if we don't, then citations to it would have to be examined on a case-by-case basis. ~Anachronist (talk) 17:02, 24 February 2020 (UTC)[reply]
Anachronist, not unless you consider material stolen from the DNC's email servers by the Russians to be "produced by governments". Also British government materials are Crown copyright. So there's absolutely no guarantee. And work product is exempt, I believe. Guy (help!) 19:40, 24 February 2020 (UTC)[reply]
The DNC stuff isn't produced by governments, of course. I'm thinking more of US military messages, diplomatic communiques, stuff that Chelsea Manning released, and so on. I'm skeptical that government work products are exempt. There's legitimate material in there, and as I said, the citations would need to be examined on a case-by-case basis.
I note that [link search] reveals an extremely low percentage of Wikileaks links in main article space. Most of them appear to be on talk pages and Wikipedia namespace. I wish the linksearch feature had a filter to show only mainspace pages. Glancing through it, there don't seem to be many articles actually citing Wikileaks. ~Anachronist (talk) 04:33, 25 February 2020 (UTC)[reply]
Anachronist, did you look through wikileaks.org HTTPS links HTTP links? Guy (help!) 15:54, 25 February 2020 (UTC)[reply]
Cool. I didn't know about that search parameter. I stand corrected. :) ~Anachronist (talk) 17:13, 25 February 2020 (UTC)[reply]
  • I hope y'all aren't seriously considering blacklisting WikiLeaks here. It's a valuable primary source, hardly something that could be called "spam". wbm1058 (talk) 23:08, 28 February 2020 (UTC)[reply]
    Wbm1058, no it's not a "valuable" anything. It selectively publishes material of sometimes dubious provenance, in furtherance of an increasingly obvious political agenda. Guy (help!) 10:30, 3 March 2020 (UTC)[reply]
    I disagree it isn't valuable, but I agree they give the appearance of having an underlying political agenda. But that's irrelevant. What should matter to us is whether wikileaks links have been spammed or otherwise added abusively to Wikipedia. ~Anachronist (talk) 17:46, 3 March 2020 (UTC)[reply]
    Anachronist, abused is more than just spammed. A cult following can be functionally indistinguishable from a troll farm. Guy (help!) 08:47, 11 March 2020 (UTC)[reply]

"duleweboffice"[edit]

Shamelessly nicked from User:Praxidicae/fakenews


This set all belongs to a Gmail account, "duleweboffice@gmail.com". Several of these sites, including foreignpolicyi.org, were originally legitimate sites; however, the domain was sniped and it has since become an unreliable and frankly garbage spam site (as is the case for the rest, too). Legitimate uses of this link look like http://www.foreignpolicyi.org/node/17539, and we should see if there is an archived version somewhere. The illegitimate uses are rather easy to spot (basically anywhere this is used on entertainment, media personalities and media in general is the spam version). The spam variant looks like this: https://foreignpolicyi.org/tanya-nolan-is-becoming-a-hit-with-new-single-love-ya/


I did some checking. These sites have been abused on Wikipedia, in some cases severely so. It's hard to conclude anything other than SEO involvement. I salute Praxidicae for this hard work. If we blacklist then at least no new links will be added, and old ones will be nuked as the articles are edited. It's a huge job removing them entirely. Lustiger seth, is there any way to write a bot to copy the contents of the blacklist and compile a table with the number of active links on enWP, ideally just in mainspace? We might be able to use that as the basis for a reward system for Wikignomes. Guy (help!) 20:35, 19 February 2020 (UTC)[reply]

  • Just a quick note that I archived a bunch of foreignpolicyi.org links (to the original site) and deleted all traces to the original, so those are fine but should be blacklisted going forward. Same for vermontrepublic. The rest are just plain old spam and can be blacklisted unless we would rather filter as a honeypot. There's another set by the same person/email (duleweboffice) under the name "santosmilewa" (see demotix . com) and "kravitzcj" (see icharts . net/author/arni/). I'll make a list of these shortly. They're all operated by the same 3 blackhat SEO firms along with another handful that are using a dead woman's identity (I filed actual reports with the proper agencies about this FWIW), a fake phone number and a fake real life address (it's public, so i'm not disclosing anything out of the ordinary.) Anyhow, my lists are kind of a mess right now so I'll throw some together over the next few hours/days that'll make it all easier. Praxidicae (talk) 21:13, 19 February 2020 (UTC)[reply]
    Praxidicae, heroic work, thanks. Guy (help!) 21:19, 19 February 2020 (UTC)[reply]
    • Here's the first related set: User:Praxidicae/fakenews/sbl. Some may not be used on wiki (most are!) but we should probably deal with them before they are used. I'll work on the list for the larger set (like the one I linked you to with Uma Thurman) Praxidicae (talk) 21:25, 19 February 2020 (UTC)[reply]
@Praxidicae/fakenews: plus Added to MediaWiki:Spam-blacklist. per User:Praxidicae/fakenews/sbl. --Guy (help!) 21:28, 19 February 2020 (UTC)[reply]
hi!
regarding the question about the table: it would be possible, but it would take a long time (weeks or months), i guess. (and i would need some time to adapt my scripts. that's probably the bottleneck.) -- seth (talk) 16:56, 23 February 2020 (UTC)[reply]
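For illustration only, the table Guy asks about could be sketched as follows. This is not seth's script: the entries, the `(namespace, url)` pairs, and the function name `count_mainspace_hits` are all hypothetical, and a real run would need the actual external-links data rather than a hand-written list. The only fixed assumption taken from MediaWiki is that namespace 0 is mainspace.

```python
import re
from collections import Counter

# Hypothetical blacklist entries (regex fragments), for illustration only.
ENTRIES = [r"\bdailyhunt\.in\b", r"\bimgx\.in\b"]

# Hypothetical (namespace, url) pairs; namespace 0 is mainspace.
LINKS = [
    (0, "https://m.dailyhunt.in/news/a"),
    (0, "https://imgx.in/x"),
    (1, "https://imgx.in/y"),        # talk page, ignored below
    (0, "https://example.org/"),     # matches no entry
    (0, "https://dailyhunt.in/b"),
]


def count_mainspace_hits(entries, links):
    """Count, per blacklist entry, how many mainspace links match it."""
    compiled = [(entry, re.compile(entry, re.IGNORECASE)) for entry in entries]
    counts = Counter({entry: 0 for entry in entries})
    for namespace, url in links:
        if namespace != 0:
            continue  # only mainspace is counted
        for entry, pattern in compiled:
            if pattern.search(url):
                counts[entry] += 1
    return counts


table = count_mainspace_hits(ENTRIES, LINKS)
for entry, hits in table.items():
    print(f"{entry}\t{hits}")
```

The per-entry loop is deliberately naive; with the full blacklist and the full link table this is the part that would take weeks, as noted above.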
The domain scholarlyoa.com was Beall's list. Articles on dodgy academic publishing practices are likely to point to archived copies of it. I discovered the problem when trying to revert section-blanking at World Academy of Science, Engineering and Technology; the last good version had links that are now spam-blacklisted. Not being too familiar with how spam-blacklisting works around here, I'm not sure of the best course of action. (Ping JzG.) XOR'easter (talk) 13:29, 25 February 2020 (UTC)[reply]
XOR'easter, request whitelisting of specific URLs. Are you familiar with that process? I can help if not. Guy (help!) 15:52, 25 February 2020 (UTC)[reply]
JzG, I'm not sure the links should just be whitelisted, since the site itself is down, probably permanently, and the actual content we should be pointing readers to is in the archived copies. XOR'easter (talk) 20:11, 25 February 2020 (UTC)[reply]
@JzG: scholarlyoa being blacklisted is really annoying, and prevents many discussions about predatory journals. It should also be limited to non-archived links, since those are the problematic ones, rather than archived links, which refer to the legitimate site back when Jeffrey Beall ran it. Headbomb {t · c · p · b} 07:19, 27 February 2020 (UTC)[reply]
Headbomb, no, it does not prevent that; it makes it more difficult, as you now cannot link to it directly but have to disable the links when discussing them (which is highly annoying). Unfortunately the AbuseFilter is not a reasonable alternative either, it is too heavy-handed for this. That is indeed a shortcoming of the spam-blacklist and of the AbuseFilter. Do these discussions happen so often? Dirk Beetstra T C 10:04, 27 February 2020 (UTC)[reply]
I think that such problems (hijacked domains that once had legitimate use) will become more common as spammers become more sophisticated and the use of the spam blacklist expands. I dunno, is it possible to selectively whitelist archived versions of a blacklisted URL? I know that in its current form the spam blacklist catches archived versions of a blacklisted link as well. Jo-Jo Eumerus (talk) 11:52, 27 February 2020 (UTC)[reply]
"It does not prevent that." It does. And they happen often enough on journals-related pages, given Beall's importance in that area. This is best implemented as an edit filter, which would not interfere with non-article space, such as talk spaces. Headbomb {t · c · p · b} 14:45, 27 February 2020 (UTC)[reply]
Headbomb, it does not prevent discussing, it prevents linking to the material directly (which, I totally agree, is completely annoying). The problem here is that there is no other solution: allowing only the archive links also allows the archive links to current material, and does so everywhere (which is exactly what you don't want to be linking to). Removing it from the blacklist altogether also allows the current material, and also everywhere (as above). It is a shortcoming of the spam blacklist. It needs to be changed (well, it needed to be changed years ago). Currently your only way forward is either using it in a 'broken' form (if it is talkpage only, likely the best way forward), or getting it whitelisted (not very likely to be granted for use on a talkpage). Everything else is something that needs a change to the software. Note also that the edit filter needs to be rather heavy to be less restrictive than the spam-blacklist. Dirk Beetstra T C 12:32, 1 March 2020 (UTC)[reply]
Headbomb, Note: the link you were all trying to add (which is the only one ever hitting the filter in the whole so many years of history on that page) is one that is used in the article. Hence, it should be whitelisted just to make sure that we do not run into problems in the future. Dirk Beetstra T C 12:39, 1 March 2020 (UTC)[reply]
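A minimal sketch of why archived copies are caught as well, and how whitelisting a specific URL gets around it. Everything here is an assumption for illustration: the entry `\bscholarlyoa\.com\b`, the archive URLs and their timestamps are made up, and the whitelist is modelled as a set of exact URLs, which is simpler than the regex-based whitelist used on-wiki.

```python
import re

# Hypothetical blacklist entry for the domain discussed in this thread.
BLACKLIST = re.compile(r"\bscholarlyoa\.com\b", re.IGNORECASE)

# Hypothetical whitelist, modelled as exact URLs approved one by one.
WHITELIST = {
    "https://web.archive.org/web/20170101000000/https://scholarlyoa.com/publishers/",
}


def is_blocked(url: str) -> bool:
    """Return True if the URL hits the blacklist and is not whitelisted."""
    if url in WHITELIST:
        return False
    return BLACKLIST.search(url) is not None


direct = "https://scholarlyoa.com/publishers/"
archived_whitelisted = (
    "https://web.archive.org/web/20170101000000/https://scholarlyoa.com/publishers/"
)
archived_other = (
    "https://web.archive.org/web/20200301000000/https://scholarlyoa.com/some-page/"
)

for url in (direct, archived_whitelisted, archived_other):
    print(is_blocked(url), url)
```

The archive URL embeds the original URL, so the domain pattern matches it just like the direct link. A blanket exemption for the archive host would let through snapshots of the hijacked site too, which is why the sketch whitelists individual URLs instead.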
Beetstra, And there's always the potential for a removal request if we think it's a false positive or joe-job. Which in this case is entirely possible: predatory publishers hate the site. Guy (help!) 00:12, 11 March 2020 (UTC)[reply]
JzG, the spam-blacklist is to protect Wikipedia, it does not matter if things are Joe Jobbed, genuine spam, or through community consensus decided to be bad. Dirk Beetstra T C 07:03, 11 March 2020 (UTC)[reply]
Beetstra, sure, but there's precedent for removal when the spamming was done with the deliberate intent of getting a site blacklisted. And in this case that is entirely plausible. Guy (help!) 08:46, 11 March 2020 (UTC)[reply]
I have found my way here because one of the news sites (fin24.com) I have used for years now I have found to be suddenly black listed. It is regarded as a reliable source in South Africa (where it is based) so I was surprised to find it here. Just wondering if I should try and get that source white listed?--Discott (talk) 11:01, 11 March 2020 (UTC)[reply]

thehinduopinion.com

Minor blog spam, but deceptive impersonation of the newspaper and manipulation of existing source links. GermanJoe (talk) 17:17, 13 March 2020 (UTC)[reply]

@GermanJoe: plus Added to MediaWiki:Spam-blacklist. --GermanJoe (talk) 17:18, 13 March 2020 (UTC)[reply]

dailyhunt.in

There is virtually no reason this site should be used as it's almost exclusively an aggregate publisher and it very often picks up items from unreliable, blackhat SEO "news" sources. [https://m.dailyhunt.in/news/india/english/the+free+press+journal-epaper-fpressjr/ace+fashion+and+lifestyle+influencer+gaurav+gaikwad+becomes+the+top+blogger+of+the+country-newsid-137552738 example], see the disclaimer at the bottom. In the event that they do publish something as an aggregate that isn't from a non-rs, the rs should just be used. Praxidicae (talk) 19:52, 19 February 2020 (UTC)[reply]
@Praxidicae: plus Added to MediaWiki:Spam-blacklist. Agreed, a net negative to Wikipedia, sufficiently so that addition invites questions about the good faith of the user linking it. --Guy (help!) 20:13, 19 February 2020 (UTC)[reply]
I'm late to this discussion, but dailyhunt.in has been a big problem in Indian entertainment articles as they shamelessly aggregate (aka steal) content from all sorts of sites without even bothering to attribute the source, which creates lots of problems when people assume that the unnamed origin source is reliable. I'm glad that it's been blacklisted. Cyphoidbomb (talk) 06:13, 14 March 2020 (UTC)[reply]

essaycorp.co.uk

Spammed on a couple of articles, appears to be an essay-selling website. No reason to be linking to this. creffett (talk) 13:45, 14 March 2020 (UTC)[reply]

@Creffett: plus Added to MediaWiki:Spam-blacklist. I also blocked Euricana indefinitely. Thanks for reporting this. — Newslinger talk 17:04, 14 March 2020 (UTC)[reply]

successpanachahtehai.com

Spammed on multiple articles by one account (reported as promo-only account to AIV), looks like it's here to sell us something and doesn't appear to have any redeeming value. creffett (talk) 03:07, 15 March 2020 (UTC)[reply]

@Creffett: plus Added to MediaWiki:Spam-blacklist. --Dirk Beetstra T C 06:03, 15 March 2020 (UTC)[reply]

techpassionworld.com

Recurring blog spam from dynamic IPs, including already partially-blocked LTA IP. GermanJoe (talk) 09:30, 15 March 2020 (UTC)[reply]

@GermanJoe: plus Added to MediaWiki:Spam-blacklist. --GermanJoe (talk) 09:30, 15 March 2020 (UTC)[reply]

websitestrategies.com.au

SEO spammer via dynamic IPs, two final warnings have been ignored. GermanJoe (talk) 17:09, 15 March 2020 (UTC)[reply]

@GermanJoe: plus Added to MediaWiki:Spam-blacklist. --GermanJoe (talk) 17:10, 15 March 2020 (UTC)[reply]

equifax.cf

Directly related to the already blacklisted Wikipedia:WikiProject Spam/LinkReports/creditkarma.cf and Wikipedia:WikiProject Spam/LinkReports/creditskarma.cf (where similar hijacking was observed). Reported on ELN by user:Gbear605. Pinging WhatamIdoing. Will immediately blacklist. --Dirk Beetstra T C 10:25, 16 March 2020 (UTC)[reply]

@Gbear605 and WhatamIdoing: plus Added to MediaWiki:Spam-blacklist. --Dirk Beetstra T C 10:26, 16 March 2020 (UTC)[reply]
Thanks. It looks like there are no links in articles, so we're set for now. WhatamIdoing (talk) 17:59, 16 March 2020 (UTC)[reply]

chiropractornearmereviews.com

Per Wikipedia:WikiProject_Spam/LinkReports/chiropractornearmereviews.com - spam links for even spammier content. Guy (help!) 12:36, 17 March 2020 (UTC)[reply]

plus Added to MediaWiki:Spam-blacklist. --Guy (help!) 12:38, 17 March 2020 (UTC)[reply]

discount-24hour.blogspot.com

Spammer. plus Added to MediaWiki:Spam-blacklist. --Guy (help!) 16:16, 17 March 2020 (UTC)[reply]

flicktokick.wordpress.com

Spam links. Cleaning up now. Some additions go back years. Guy (help!) 16:29, 17 March 2020 (UTC)[reply]

Cleaned and plus Added to MediaWiki:Spam-blacklist. --Guy (help!) 16:33, 17 March 2020 (UTC)[reply]

scholarlyoa.com

This has got to be the most annoying blacklisting possible. This is a hijacked site, but it (and its archived versions) are extensively used everywhere on Wikipedia when discussing predatory journals, open access, and Beall's list (see https://en.wikipedia.org/wiki/Special:LinkSearch/*.scholarlyoa.com). Whenever someone vandalizes a page, or whitewashes an article, or tries to archive a discussion with those links in (of which there are several), the blacklist is tripped. This should be removed from the BL, because it causes way more headaches than it solves. I can't even make whitelist/blacklist removal requests without triggering the damned thing, hence the spaced version.

Headbomb {t · c · p · b} 15:16, 16 March 2020 (UTC)[reply]

@Headbomb: no Declined what is basically 2 hits in 3 days/1000 hits on a link that is now whitelisted (and the count over 4000 hits/10 days is basically 3 hits in mainspace and 2 discussions). NO archiving faults (which can easily be resolved), NO use in discussions (which I agree can be annoying if you can’t link, but hey, you maim the link and everyone knows where you found the info and can go there with a little bit more effort). So not “several”. Use in mainspace is limited, it boils down to a couple of pages, not widespread use. Nothing that the whitelist can’t handle (and is generally then quickly resolved as you may have noticed), and the annoyance is clearly minimal.  Defer to Whitelist for the rest. --Dirk Beetstra T C 19:04, 16 March 2020 (UTC)[reply]
If there are only two hits in 3 days, that's clearly proof that it does not need to be blacklisted. And the archives are still broken since the last time, because you can't repair them because of the blacklist. Headbomb {t · c · p · b} 19:13, 16 March 2020 (UTC)[reply]
I agree with Headbomb here. Right now, the blacklist would prevent the reversion of vandalism on Abstract and Applied Analysis, Academic journal publishing reform, Aging (journal), Altmetrics, Beall's List, Chinese Chemical Letters, Clinical Practice, Corruption in Canada, Entropy (journal), Epistemologia, Experimental & Clinical Cardiology, Frontiers Media, Future Medicine, Google Scholar, Hindawi Publishing Corporation, Imaging in Medicine, Index Copernicus, International Archives of Medicine, International Journal of Advanced Computer Technology, International Journal of Clinical Rheumatology, Jeffrey Beall, Journal of Cosmology, Journal of Medical Internet Research, Journal of Natural Products, List of academic databases and search engines, List of confidence tricks, MDPI, Mega journal, Neuropsychiatry (journal), Nova Science Publishers, OMICS Publishing Group, Pattern Recognition in Physics, Plastic Surgery (journal), Polonnaruwa (meteorite), Predatory publishing, Pulsus Group, Redalyc, SciELO, Scientific Research Publishing, Sylwan, The Scientific World Journal, The Veliger, Vanity press, Who's Afraid of Peer Review? and Wulfenia (journal). I've had to deal with this recently, and it's quite exasperating. That's not counting its use in Talk and User-talk pages. XOR'easter (talk) 20:00, 16 March 2020 (UTC)[reply]
And also prevent the upgrading of old links to the archived versions. Headbomb {t · c · p · b} 20:01, 16 March 2020 (UTC)[reply]

Headbomb, yes, the links are intentionally broken, you can still copy-paste them. Anyway, that is a technical problem with the Spam-blacklist that has NOT been resolved by the WMF for 14 years.

Wow, not hitting in 3 whole days .. I deal with spammers who come and return for over 10 years and are continuously trying to link and/or get their blacklisted links removed.

XOR'easter no, it does NOT hamper reversion of vandalism (except in some very rare cases). Anyway, for genuine use, as for the archive-links, they should be whitelisted so that in those rare cases that vandalism cannot be reverted it is solved for the future. The use on a mere fraction of our pages (not counting talkpages, maybe 60 out of 6 million ..?) is not a reason to remove, there must be widespread use of the links.

Get those cases whitelisted. --Dirk Beetstra T C 12:12, 17 March 2020 (UTC)[reply]

It hampered my reversion of vandalism on World Academy of Science, Engineering and Technology; I had to do cutting and rewriting for what should have been a one-click fix. XOR'easter (talk) 16:09, 17 March 2020 (UTC)[reply]
Likewise it hampered my reversion of whitewashing at Allied Academies. There clearly is both widespread and legitimate use of this domain, which by far supersedes spamming efforts. As for "the links are intentionally broken, you can still copy-paste them": no you can't, because they are broken. They should also be working links, because they point to the intended pages, and aren't spam. That this issue is "technical" in nature is irrelevant; it's an issue, period, because you are blacklisting a site that should not be blacklisted. If you want to prevent this type of spam, we have edit filters for this (set to warn, at least for editconfirmed+, so humans can go through them). Headbomb {t · c · p · b} 16:32, 17 March 2020 (UTC)[reply]
Headbomb, edit filters are not suitable for blocking external links, the AbuseFilter is a) way too heavy on the servers for that, b) that extension is just a bit less out-dated than the spam-blacklist extension, and c) being edit-confirmed is not a reason to stop spamming or using links wrongly (I've once trouted an admin for deliberately linking to a copyright violation). Dirk Beetstra T C 17:35, 17 March 2020 (UTC)[reply]

Both could have been rollbacked and that would not have triggered the spam-blacklist. And still, get them whitelisted, which is a more permanent solution. 10 ppm is 'widespread use' .. I guess we will see a massive influx of whitelist requests then. --Dirk Beetstra T C 17:32, 17 March 2020 (UTC)[reply]

"Both could have been rollbacked and that would not have triggered the spam-blacklist." Patently false, because that's exactly what I did to trigger the blacklist. "edit-confirmed is not a reason to stop spamming or using links wrongly." True. But this forgets that these links are neither used wrongly, nor are spam to begin with. Nor are they being spammed on a scale that requires use of the blacklist. Headbomb {t · c · p · b} 18:17, 17 March 2020 (UTC)[reply]
Headbomb, I agree and have commented the entry out for now. Guy (help!) 18:23, 17 March 2020 (UTC)[reply]
@JzG: thanks. Good to see that someone finally sees reason on this. Headbomb {t · c · p · b} 18:24, 17 March 2020 (UTC)[reply]
Headbomb, We all did from the outset. It's a question of balancing competing interests. In my view, this tips on the side of removal because I strongly suspect a joe-job. Guy (help!) 21:09, 17 March 2020 (UTC)[reply]
Headbomb, if you could not roll back then there is something wrong, as that was possible and should be possible. That needs to be tested and reported if that is not possible anymore. Dirk Beetstra T C 23:39, 17 March 2020 (UTC)[reply]

imgx.in

Though I just happened upon this spam today, there has been a very clear slow-burn effort to spam this site at Wikipedia. Behold the majority of what I reverted recently: imgx.in spam

  • 13 December 2019 - 49.207.140.124 - [4]
  • 14 January 2020 - 49.207.132.52 - [5]
  • 16 January 2020 - 49.206.127.89 - [6]
  • 17 January 2020 - 49.205.77.115 - [7]
  • 18 January 2020 - 49.205.78.118 - [8]
  • 22 January 2020 - 49.207.141.114 - [9]
  • 25 January 2020 - 49.207.131.67 - [10]
  • 25 January 2020 - 49.205.79.220 - [11]
  • 26 January 2020 - 183.83.154.167 - [12]
  • 27 January 2020 - 49.207.143.58 - [13]
  • 29 January 2020 - 49.207.142.210 - [14]
  • 30 January 2020 - 183.83.152.194 - [15]
  • 31 January 2020 - 49.207.138.65 - [16]
  • 2 February 2020 - 49.207.134.250 - [17]
  • 2 February 2020 - 49.207.128.232 - [18]
  • 7 February 2020 - 49.207.136.32 - [19]
  • 9 February 2020 - 183.83.154.105 - [20]
  • 11 February 2020 - 49.207.134.97 - [21]
  • 13 February 2020 - 49.207.131.192 - [22]
  • 14 February 2020 - 49.206.125.120 - [23]
  • 15 February 2020 - 183.83.154.112 - [24]
  • 16 February 2020 - 183.83.154.112 - [25]
  • 17 February 2020 - 49.207.139.89 - [26]
  • 23 February 2020 - 49.207.138.170 - [27]
  • 24 February 2020 - 183.83.154.165 - [28]
  • 24 February 2020 - 183.83.154.165 - [29]
  • 27 February 2020 - 49.207.131.195 - [30]
  • 4 March 2020 - 49.206.125.143 - [31]
  • 7 March 2020 - 49.207.130.177 - [32]
  • 7 March 2020 - 49.207.130.177 - [33]
  • 8 March 2020 - 49.207.134.144 - [34]
  • 9 March 2020 - 49.206.125.95 - [35]
  • 11 March 2020 - 49.207.141.116 - [36]

Obviously the domain should be blacklisted, but if someone wants to figure out any rangeblocks, that's not exactly my specialty, so I'm grateful in advance. Thanks, Cyphoidbomb (talk) 06:23, 14 March 2020 (UTC)[reply]

@Cyphoidbomb: plus Added to MediaWiki:Spam-blacklist. These IPs only spammed the imgx.in domain, and the blacklist will prevent this in the future. I don't think a range block is necessary unless the IPs start spamming other domains. I've also upgraded Imgxbot from a soft block to a hard block. Thanks for reporting this. — Newslinger talk 08:11, 14 March 2020 (UTC)[reply]
Cyphoidbomb, I hope you did not manually compile above list ... please ask for a COIBot report next time you see something like this. Dirk Beetstra T C 08:46, 15 March 2020 (UTC)[reply]
@Beetstra: Sigh... I did. Cyphoidbomb (talk) 23:01, 16 March 2020 (UTC)[reply]
Cyphoidbomb, :-) next time report the domain here in the LinkSummary template, wait until COIBot has saved/refreshed the report (in the template: 'Reports: .... COIBot') .. Dirk Beetstra T C 05:01, 17 March 2020 (UTC)[reply]
Beetstra Y'know, I *knew* that, I just don't have an explanation other than being a total shit-for-brains. Cyphoidbomb (talk) 00:07, 18 March 2020 (UTC)[reply]
Cyphoidbomb, WP:CIR-related block needed? :-p Dirk Beetstra T C 05:28, 18 March 2020 (UTC)[reply]

Honestbussinessman24 and market-mirror

I came across these two services while looking over a cryptocurrency article's sources. Both seemingly lack any sort of editorial control, and both advertise paid article writing services. Honestbussinessman24 even describes itself as Market-mirror in its "About" section. SamHolt6 (talk) 22:56, 17 March 2020 (UTC)[reply]

@SamHolt6: plus Added to MediaWiki:Spam-blacklist. --Guy (help!) 18:16, 18 March 2020 (UTC)[reply]

lyricspandits.blogspot.com

Copyright-violating blogs

Spam blog. Guy (help!) 18:16, 18 March 2020 (UTC)[reply]

@JzG: plus Added to MediaWiki:Spam-blacklist. I guess we should have a relatively low threshold for these. --Dirk Beetstra T C 05:42, 19 March 2020 (UTC)[reply]

unearthedarcana.com

This is basically an SEO spam blog with stolen/copyrighted Dungeons & Dragons content and affiliate links so they earn a commission on sales. It'll probably never be usable as a source so there's no encyclopedic value here. Saireddy9666 added links to Dungeons & Dragons three times and was reverted each time, then apparently returned as Jakkidirajashakerreddy to spam again. (Note that each username includes "reddy".) Woodroar (talk) 15:21, 21 March 2020 (UTC)[reply]

@Woodroar: plus Added to MediaWiki:Spam-blacklist. --Dirk Beetstra T C 18:58, 21 March 2020 (UTC)[reply]

tripraja.com

Spammed by a variety of IPs and single-purpose users over a long period of time. Appears to be a travel sales website, so don't think it has any redeeming encyclopedic value. creffett (talk) 13:23, 21 March 2020 (UTC)[reply]

@Creffett: Handled on meta. --Dirk Beetstra T C 19:01, 21 March 2020 (UTC)[reply]

sastedeal.com

Per Wikipedia:WikiProject_Spam/Local/sastedeal.com, persistent addition by IPs, and the links to this site are blatant spam. creffett (talk) 03:04, 21 March 2020 (UTC)[reply]

@Creffett: plus Added to MediaWiki:Spam-blacklist. --Dirk Beetstra T C 19:02, 21 March 2020 (UTC)[reply]

mymotivationalsupport.com

Site has been spammed by probably one person IP hopping, not even close to a good reference. Ravensfire (talk) 15:44, 22 March 2020 (UTC)[reply]

plus Added to MediaWiki:Spam-blacklist. Recurring issue since 2019. --GermanJoe (talk) 19:21, 22 March 2020 (UTC)[reply]

uplaw.us

Added as a "ref" by a couple of IPv6 ranges to various law-related articles, the links are basically to services uplaw.us offers related to the article topic. creffett (talk) 23:16, 22 March 2020 (UTC)[reply]

@Creffett: plus Added to MediaWiki:Spam-blacklist. Systematic IP spam since 2019. --GermanJoe (talk) 23:30, 22 March 2020 (UTC)[reply]

morninglazziness

Users

Please blacklist.-KH-1 (talk) 03:18, 23 March 2020 (UTC)[reply]

@KH-1: Last two accounts blocked. plus Added to MediaWiki:Spam-blacklist. — JJMC89(T·C) 03:34, 23 March 2020 (UTC)[reply]

More spam blogs

— Preceding unsigned comment added by JzG (talk • contribs)

plus Added to MediaWiki:Spam-blacklist. --Dirk Beetstra T C 04:50, 23 March 2020 (UTC)[reply]

.shop TLD

  • Regex requested to be blacklisted: \b[_\-0-9a-z]+\.shop\b

As we did with the .guru top-level domain, I am wondering if we should consider blacklisting the .shop TLD. I just went through the link search list and removed .shop domains from every main space article containing them (leaving a few in which the .shop site is the actual website of the article subject).

There weren't many instances of this, but that may be because .shop is a fairly new TLD, I reckon less than 4 years old. I don't really see strong evidence of abuse except in one case where innodot.shop kept getting added to skeleton watch (that IP user is now blocked); the others I removed were either one-offs, or the domain no longer pointed at anything relevant.

Any thoughts? Blacklisting a TLD is a big deal. I believe we did the right thing with .guru, and even though .shop isn't a popular domain for spamming yet, the nature of it gives it a higher potential for abuse than .guru, possibly. ~Anachronist (talk) 22:25, 21 March 2020 (UTC)[reply]

@Anachronist: I am first going to feed this to XLinkBot (references in next addition). Let's see where this gets reverted and review this in a week. If we see a number popping up and a couple of reversions we should likely consider blacklisting. Can you indicate how many you removed? plus Added to User:XLinkBot/RevertList. --Dirk Beetstra T C 06:56, 22 March 2020 (UTC)[reply]
@Anachronist: plus Added to User:XLinkBot/RevertReferencesList. --Dirk Beetstra T C 06:57, 22 March 2020 (UTC)[reply]
@Beetstra: Good. It is unlikely that we'll see any activity in a week, so XLinkBot is the best place for now. ~Anachronist (talk) 17:05, 22 March 2020 (UTC)[reply]
@Beetstra: \.shop\b may result in false positives. It should be the regex we're using for .guru. I have corrected it above. ~Anachronist (talk) 17:09, 22 March 2020 (UTC)[reply]
@Anachronist: plus Added to User:XLinkBot/RevertList. --Dirk Beetstra T C 05:19, 23 March 2020 (UTC)[reply]
@Anachronist: plus Added to User:XLinkBot/RevertReferencesList. --Dirk Beetstra T C 05:20, 23 March 2020 (UTC)[reply]
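As a rough check of the two regexes mentioned in this thread, here is a small sketch using Python's `re` module with a plain search. The sample URLs are hypothetical (only innodot.shop comes from the discussion), and the blacklist and XLinkBot wrap their entries in their own surrounding pattern, so on-wiki behaviour may differ from what a bare search shows.

```python
import re

# The regex requested above, and the shorter variant that was flagged.
FULL = re.compile(r"\b[_\-0-9a-z]+\.shop\b", re.IGNORECASE)
SHORT = re.compile(r"\.shop\b", re.IGNORECASE)

# Hypothetical sample URLs, made up for illustration only.
SAMPLES = [
    "https://innodot.shop/watches",   # .shop TLD
    "https://example.com/.shop",      # ".shop" with no label in front of it
    "https://example.org/workshop",   # "shop" inside a longer word
    "https://example.shopping/",      # longer TLD that starts with "shop"
    "https://www.shop.example.com/",  # "shop" as a subdomain label
]


def matches(pattern, url):
    """Return True if the pattern is found anywhere in the URL."""
    return pattern.search(url) is not None


# Map each URL to (FULL matched?, SHORT matched?).
results = {url: (matches(FULL, url), matches(SHORT, url)) for url in SAMPLES}

for url, (full_hit, short_hit) in results.items():
    print(f"full={full_hit!s:5} short={short_hit!s:5} {url}")
```

In this plain search the two patterns disagree only on the second sample, where nothing precedes ".shop". Both also match the last sample, where "shop" is a subdomain label rather than the TLD, so neither pattern by itself restricts the match to the end of the hostname.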

thefilmslife.com

Spammers

Please blacklist. -KH-1 (talk) 05:21, 24 March 2020 (UTC)[reply]

@KH-1: plus Added to MediaWiki:Spam-blacklist. — JJMC89(T·C) 05:56, 24 March 2020 (UTC)[reply]

Filter 1045

All added to articles despite warnings from Filter 1045 (log). plus Added to MediaWiki:Spam-blacklist. --Guy (help!) 17:22, 25 March 2020 (UTC)[reply]

mercurie.blogspot.com

From today's review of Filter 1045 (log). More to come... Guy (help!) 21:41, 25 March 2020 (UTC)[reply]

trendontop.com

Spammed by an IP and registered user to the External Links section of various Indian BLPs and topics, particularly Bollywood-related articles. Does not look at all like a reliable source, more likely a gossip blog (with "wiki" thrown into some titles to make it look more legitimate, I guess?). creffett (talk) 13:49, 25 March 2020 (UTC)[reply]

plus Added. I remember cleaning up a few of these months ago and giving out final warnings. OhNoitsJamie Talk 22:08, 25 March 2020 (UTC)[reply]

thekhwabeeda.com

Users

Domain promoting a "blogger, photographer, calligrapher". Main goal was adding it to Ubinas, which is today's featured article and the first link on the main page: they went through eleven rounds of adding links onto the article, getting reverted and trying again. The user's been blocked now. No scholarly content, indeed all the pages on the website seem to be meaningless fragments of text like "Humble Nature.." Blythwood (talk) 01:35, 26 March 2020 (UTC)[reply]

@Blythwood: plus Added to MediaWiki:Spam-blacklist. --GermanJoe (talk) 07:15, 26 March 2020 (UTC)[reply]

montdigital.com

Spammed by IPs and single-purpose users. It's a marketing company's website, most of the links they added were to really low quality blog-like/how-to entries on its website (no idea why, half of the ones I saw are pretty irrelevant to marketing). I, for one, wouldn't trust them to market anything if this is their idea of promotion. creffett (talk) 16:30, 28 March 2020 (UTC)[reply]

@Creffett: plus Added to MediaWiki:Spam-blacklist. --Dirk Beetstra T C 11:40, 29 March 2020 (UTC)[reply]

socialnews.xyz

I don't know how this has flown under the radar for so long but this is nothing more than a spammy blog that has been widely inserted in movie articles. Praxidicae (talk) 18:59, 28 March 2020 (UTC)[reply]
@Praxidicae: plus Added to MediaWiki:Spam-blacklist. --Dirk Beetstra T C 11:41, 29 March 2020 (UTC)[reply]

adelaidenaturalrainwater.com.au

Spammed as an external link on several pages. No redeeming value. creffett (talk) 15:40, 30 March 2020 (UTC)[reply]

@Creffett: plus Added to MediaWiki:Spam-blacklist. The listed IPs have spammed several other dubious domains, but most of these seem to be minor sporadic cases. --GermanJoe (talk) 17:03, 30 March 2020 (UTC)[reply]

penwap.xyz

Spamming across multiple articles with multiple accounts. Ravensfire (talk) 16:22, 30 March 2020 (UTC)[reply]

plus Added to MediaWiki:Spam-blacklist and blocked remaining listed accounts. --GermanJoe (talk) 17:05, 30 March 2020 (UTC)[reply]

bmdays.com

Spam by 37.111.232.14 (7), 37.111.248.179 (4), 37.111.232.18 (1). See WP:WikiProject Spam/LinkReports/bmdays.com. ~ ToBeFree (talk) 18:05, 30 March 2020 (UTC)[reply]

plus Added to MediaWiki:Spam-blacklist. --~ ToBeFree (talk) 18:08, 30 March 2020 (UTC)[reply]

xyz

Seen a couple of these coming up .. --Dirk Beetstra T C 06:48, 31 March 2020 (UTC)[reply]

acquire.io

Previously tagged spammers [37]

Please blacklist. -KH-1 (talk) 09:59, 31 March 2020 (UTC)[reply]

@KH-1: plus Added to MediaWiki:Spam-blacklist. --GermanJoe (talk) 10:05, 31 March 2020 (UTC)[reply]