The current implementation of MediaWiki:SpamBlacklist could use some improvements:
- It's a long list of regexes, roughly 6,000 on English Wikipedia
- Running every regex match on every edit is not very fast
- It's hard to search given the escaping
- Most of them don't need to be regex either
- Since most admins are not familiar with regex, it limits the ability of admins to fight spam
- It's fragile: a mistake in it can easily prevent everyone on the wiki from saving any edit that adds a link
- It's not structured
- It can't tell you exactly why your edit can't be saved (which domain is blocked)
- It has no place for notes, so the reason for each entry has to be logged separately
- The naming is problematic (T254646)
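The fragility and searchability points can be illustrated with a small sketch. It assumes, for illustration only, that all entries are joined into one alternation and matched against each added URL; the extension's real matching code may differ, and the entries and helper names here are made up.

```python
import re

# Hypothetical entries, written the way regex entries usually look.
entries = [r"\bspam-example\.com\b", r"\bcheap-pills\.example\b"]

def build_matcher(lines):
    # Assumption: entries are combined into a single alternation.
    return re.compile(r"https?://[a-z0-9.\-]*(?:" + "|".join(lines) + ")", re.I)

def is_blocked(matcher, url):
    return matcher.search(url) is not None

good = build_matcher(entries)

# One stray empty entry adds an empty alternative, which matches every link.
broken = build_matcher(entries + [""])

# Searching the list for a plain domain fails because of the escaping.
found_by_plain_search = "spam-example.com" in "\n".join(entries)
```

With `good`, only the listed domains are caught; with `broken`, any edit adding any http(s) link is rejected, and `found_by_plain_search` is false even though the domain is listed.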
Proposal:
- Create a new special page called Special:BlockedExternalDomains, editable by anyone with the "delete" right
- Make it save into a JSON page called MediaWiki:BlockedExternalDomains.json
- Store everything in Extension:AbuseFilter
- Put it behind a feature flag and deploy it to a couple of pilot wikis
- Deploy it widely and send a message to the admin noticeboards of wikis asking them to migrate from the current system to the new one
- (out of scope of this ticket) After migration is done, move more features to Extension:AbuseFilter, such as keeping the regex-based denylist and allowlist as a MediaWiki page, an email denylist, a global denylist (both regex and non-regex), and so on.
- (out of scope of this ticket) Undeploy the SpamBlacklist extension (after migration of some other functionalities)
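A rough sketch of how a structured list could address the points above (plain domains, notes, and telling the user which domain is blocked). The field names `domain` and `notes`, the sample entries, the helper names, and the choice to also match subdomains are all assumptions for illustration, not the final format or behaviour.

```python
import json
from urllib.parse import urlsplit

# Hypothetical content of MediaWiki:BlockedExternalDomains.json (field names assumed).
PAGE_TEXT = """
[
  {"domain": "spam-example.com", "notes": "Mass link spam, see noticeboard"},
  {"domain": "cheap-pills.example", "notes": "Cross-wiki spam"}
]
"""

def load_blocked(text):
    # Map each blocked domain to its note for constant-time lookups.
    return {e["domain"].lower(): e.get("notes", "") for e in json.loads(text)}

def find_blocked_domain(url, blocked):
    # Check the host and each parent domain (assumption: subdomains are blocked too).
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in blocked:
            return candidate, blocked[candidate]
    return None

blocked = load_blocked(PAGE_TEXT)
hit = find_blocked_domain("http://www.spam-example.com/page", blocked)
```

Here `hit` carries both the matching domain and its note, so the save error can name the domain, and each check is a handful of dictionary lookups instead of thousands of regex matches.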
Mocks:
(non-admin view)
Test setup:
https://en.wikipedia.beta.wmflabs.org/wiki/Special:BlockedExternalDomains
Python script to migrate off MediaWiki:Spamblacklist for simple cases: P49299
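For readers without access to the paste, the following is a minimal sketch of what "simple cases" could mean. It is not P49299 itself. It assumes one regex per line with `#` comments, and treats an entry as simple only if it is an escaped domain optionally wrapped in `\b`; everything else is left for manual review. The names `SIMPLE` and `split_entries` are hypothetical.

```python
import re

# Matches e.g. \bexample\.com\b or example\.com, and nothing with other regex syntax.
SIMPLE = re.compile(r"^(?:\\b)?((?:[a-z0-9-]+\\\.)+[a-z0-9-]+)(?:\\b)?$", re.I)

def split_entries(text):
    """Return (plain domains, entries that still need a regex)."""
    domains, leftover = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        m = SIMPLE.match(line)
        if m:
            # Unescape the dots to get the plain domain.
            domains.append(m.group(1).replace("\\.", ".").lower())
        else:
            leftover.append(line)
    return domains, leftover
```

The first list could be entered through Special:BlockedExternalDomains, while the second stays in the regex-based list until it is reviewed by hand.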