TZ: UTC +1/+2
User Details
- User Since
- Sep 1 2016, 6:48 AM (435 w, 3 d)
- Availability
- Available
- IRC Nick
- marostegui
- LDAP User
- Marostegui
- MediaWiki User
- MArostegui (WMF)
Fri, Jan 3
I fixed this
But how are we going to approach a host going down here? Plus, if we change hostnames/hosts, we'd need to edit MW?
We should make them run with weight 0 in the general traffic setup. @Ladsgroup, that means MW won't check for lag, am I right? (I am fine with that.) I just want to make sure we don't send ANY general production traffic there.
Jan 03 09:24:25 phab2002 systemd[1]: Starting phabricator public task dump...
Jan 03 12:13:51 phab2002 systemd[1]: phabricator_task_dump.service: Succeeded.
Jan 03 12:13:51 phab2002 systemd[1]: Finished phabricator public task dump.
Jan 03 12:13:51 phab2002 systemd[1]: phabricator_task_dump.service: Consumed 2h 7min 57.096s CPU time.
The process is still running. I think we are all set connection- and credentials-wise.
I have restarted phabricator_task_dump.service and, based on strace, I believe it is working.
I have fixed this by changing the IP.
I've deleted the entries with the old IP
Ready for DC-Ops
This is ready for DC-Ops
Thu, Jan 2
This was a test to see if the change at https://gitlab.wikimedia.org/toolforge-repos/switchmaster/-/merge_requests/7 was applied.
Are there any significant schema changes on this upgrade?
This is done
Weight was 300
This is all done. All hosts now have decommission tasks, which are just pending on-site steps.
This is ready for DC-Ops
Ready for DC-Ops
This is ready for DC-Ops
Downtime removed, notifications enabled, and the host is being slowly repooled automatically.
Thanks to everyone who responded to this incident! Much appreciated.
This is ready for DC-Ops