bd808 (Bryan Davis)
Principal Software Engineer · Administrator

User Details

User Since
Oct 3 2014, 2:36 PM (535 w, 2 d)
Roles
Administrator
Availability
Available
IRC Nick
bd808
LDAP User
BryanDavis
MediaWiki User
BDavis (WMF) [ Global Accounts ]

I'm BDavis (WMF) on wiki, bd808 on irc & GitLab, and BryanDavis on Gerrit & Wikitech.

I've got a thing for 🦄s. Don't judge.

I work for or provide services to the Wikimedia Foundation, but this is my only Phabricator account. Edits, statements, or other contributions made from this account are my own, and may not reflect the views of the Foundation.

Recent Activity

Fri, Jan 3

bd808 closed T382863: cfdw-28928147-9qtjx stuck in Terminating state as Resolved.
$ ssh cloudcumin1001.eqiad.wmnet
$ sudo cookbook wmcs.toolforge.k8s.worker.drain --cluster-name tools --hostname-to-drain tools-k8s-worker-nfs-69
...
wmcs_libs.k8s.kubernetes.KubernetesTimeoutForDrain: Waited 300 for node tools-k8s-worker-nfs-69 to drain, but it never did. Still has 2 pods running. Running pods:
...
$ sudo cookbook wmcs.toolforge.k8s.reboot --cluster-name tools --hostname-list tools-k8s-worker-nfs-69
...
INFO: tools-k8s-worker-nfs-69: reboot phase: wait_drain
Something happened while rebooting host tools-k8s-worker-nfs-69, trying a hard rebooting the instance
...
END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-69
$ kubectl sudo get nodes | grep tools-k8s-worker-nfs-69
tools-k8s-worker-nfs-69   Ready    <none>          108d   v1.28.14
Fri, Jan 3, 9:54 PM · User-bd808, cloud-services-team, Toolforge
bd808 updated the task description for T382974: maintain-dbusers failing to create user for 'u4692'@'%' on instance-tools-db-4.tools.wmcloud.org.
Fri, Jan 3, 9:20 PM · Toolforge, cloud-services-team
bd808 updated subscribers of T382974: maintain-dbusers failing to create user for 'u4692'@'%' on instance-tools-db-4.tools.wmcloud.org.
Fri, Jan 3, 9:14 PM · Toolforge, cloud-services-team
bd808 created T382974: maintain-dbusers failing to create user for 'u4692'@'%' on instance-tools-db-4.tools.wmcloud.org.
Fri, Jan 3, 9:14 PM · Toolforge, cloud-services-team
bd808 closed T382962: Missing replica.my.cnf for freshly created Toolforge account vehicle-keeper-markings as Resolved.

The credentials provisioning service had hung somehow. Restarting it seems to have fixed things up:

$ sudo ls -lh /data/project/vehicle-keeper-markings/replica.my.cnf
-r--r----- 1 tools.vehicle-keeper-markings tools.vehicle-keeper-markings 53 Jan  3 20:58 /data/project/vehicle-keeper-markings/replica.my.cnf
Fri, Jan 3, 9:04 PM · User-bd808, cloud-services-team, Toolforge
bd808 claimed T312694: Support WebExtensions Manifest v3.

I think we may be able to use a modifyHeaders rule with a set operation to inject the needed X-Wikimedia-Debug header. The header value we can supply is static, but I think we may be able to add the rule itself dynamically so that we can define the payload based on the desired state. More research is needed.

Fri, Jan 3, 6:44 PM · User-bd808, Release-Engineering-Team, Patch-For-Review, WikimediaDebug
bd808 updated subscribers of T312694: Support WebExtensions Manifest v3.

@Catrope is reporting seeing the extension as blocked:

image (1).png (310×1 px, 35 KB)

Fri, Jan 3, 5:13 PM · User-bd808, Release-Engineering-Team, Patch-For-Review, WikimediaDebug

Thu, Jan 2

bd808 added a comment to P49617 503, MediaWiki Docker on MacOS.

For what it is worth, I have had no problems in the last year on an M3 Max MacBook Pro when using "Apple Virtualization framework" as my Docker Desktop VMM with Rosetta emulation enabled and VirtioFS file sharing.

Thu, Jan 2, 10:54 PM
bd808 added a comment to T382709: Since the update to bookworm, unable to start mediawiki-web on my M2 Mac.

Adding Mutex posixsem to apache2 in the container appears to fix the issue. That being said, I'm not sure how exactly (and whether that might break things on other systems).

Thu, Jan 2, 6:20 PM · Patch-For-Review, ARM support, Growth-Team, Release-Engineering-Team, dev-images, MediaWiki-Docker

Fri, Dec 20

bd808 closed T365048: Deleting an envvar breaks ReplicaSet driven automatic restarts of a Pod (CreateContainerConfigError) as Invalid.

I will try to verify my reproduction case later today and report back on the status of that experiment.

Fri, Dec 20, 5:39 PM · cloud-services-team, Toolforge

Thu, Dec 19

bd808 added a comment to T334626: Selenium test "add image.mobile: user can close the image suggestion UI" is flaky.

https://integration.wikimedia.org/ci/job/wmf-quibble-selenium-php74/46146/console

00:11:20.036 [Chrome 90.0.4430.212 linux #0-0] 1) add image desktop: user can view image info and image details
00:11:20.036 [Chrome 90.0.4430.212 linux #0-0] Evaluation failed: notloggedin
00:11:20.036 [Chrome 90.0.4430.212 linux #0-0] Evaluation failed: notloggedin
00:11:20.036 [Chrome 90.0.4430.212 linux #0-0] Error: Evaluation failed: notloggedin
00:11:20.036 [Chrome 90.0.4430.212 linux #0-0]     at ExecutionContext._evaluateInternal (/workspace/src/extensions/GrowthExperiments/node_modules/puppeteer-core/lib/cjs/puppeteer/common/ExecutionContext.js:221:19)
00:11:20.036 [Chrome 90.0.4430.212 linux #0-0]     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
00:11:20.037 [Chrome 90.0.4430.212 linux #0-0]     at async ExecutionContext.evaluate (/workspace/src/extensions/GrowthExperiments/node_modules/puppeteer-core/lib/cjs/puppeteer/common/ExecutionContext.js:110:16)
00:11:20.037 [Chrome 90.0.4430.212 linux #0-0]     at async ElementHandle.evaluate (/workspace/src/extensions/GrowthExperiments/node_modules/puppeteer-core/lib/cjs/puppeteer/common/JSHandle.js:107:16)
00:11:20.037 [Chrome 90.0.4430.212 linux #0-0]     at async ElementHandle.$eval (/workspace/src/extensions/GrowthExperiments/node_modules/puppeteer-core/lib/cjs/puppeteer/common/JSHandle.js:810:24)
00:11:20.037 [Chrome 90.0.4430.212 linux #0-0]     at async DevToolsDriver.executeScript (/workspace/src/extensions/GrowthExperiments/node_modules/devtools/build/commands/executeScript.js:39:20)
00:11:20.037 [Chrome 90.0.4430.212 linux #0-0]     at async Browser.wrappedCommand (/workspace/src/extensions/GrowthExperiments/node_modules/devtools/build/devtoolsdriver.js:102:26)
00:11:20.037 [Chrome 90.0.4430.212 linux #0-0]     at async AddImageArticlePage.setup (/workspace/src/extensions/GrowthExperiments/tests/selenium/pageobjects/addimage.article.page.js:84:3)
00:11:20.038 [Chrome 90.0.4430.212 linux #0-0]     at async Context.<anonymous> (/workspace/src/extensions/GrowthExperiments/tests/selenium/specs/addimage.js:19:3)
Thu, Dec 19, 4:15 PM · ci-test-error (WMF-deployed Build Failure), Browser-Tests, Growth-Team, GrowthExperiments

Wed, Dec 18

bd808 added a comment to T374129: openstack: consider removing labs-ip-aliaser.

Some things I did when Andrew asked me to double check that things seemed to work without split horizon DNS remapping:

root@abogott-T374129:~# hostname -f
abogott-T374129.testlabs.eqiad1.wikimedia.cloud
root@abogott-T374129:~# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether fa:16:3e:e0:38:a3 brd ff:ff:ff:ff:ff:ff
    altname enp0s3
    inet 172.16.0.47/21 metric 100 brd 172.16.7.255 scope global dynamic ens3
       valid_lft 83628sec preferred_lft 83628sec
    inet6 fe80::f816:3eff:fee0:38a3/64 scope link
       valid_lft forever preferred_lft forever
root@abogott-T374129:~# host 185.15.56.77
77.56.15.185.in-addr.arpa is an alias for 77.0-25.56.15.185.in-addr.arpa.
77.0-25.56.15.185.in-addr.arpa domain name pointer instance-abogott-T374129.testlabs.wmcloud.org.
root@abogott-T374129:~# ping 185.15.56.77
PING 185.15.56.77 (185.15.56.77) 56(84) bytes of data.
64 bytes from 185.15.56.77: icmp_seq=1 ttl=63 time=1.72 ms
64 bytes from 185.15.56.77: icmp_seq=2 ttl=63 time=0.758 ms
^C
--- 185.15.56.77 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.758/1.241/1.724/0.483 ms
root@abogott-T374129:~# traceroute -I 185.15.56.77
traceroute to 185.15.56.77 (185.15.56.77), 30 hops max, 60 byte packets
 1  instance-abogott-T374129.testlabs.wmcloud.org (185.15.56.77)  1.005 ms  0.976 ms  0.970 ms
 2  instance-abogott-T374129.testlabs.wmcloud.org (185.15.56.77)  0.826 ms  0.821 ms *
root@abogott-T374129:~# traceroute -T 185.15.56.77
traceroute to 185.15.56.77 (185.15.56.77), 30 hops max, 60 byte packets
 1  instance-abogott-T374129.testlabs.wmcloud.org (185.15.56.77)  0.563 ms  0.489 ms  0.432 ms
 2  instance-abogott-T374129.testlabs.wmcloud.org (185.15.56.77)  0.797 ms  0.754 ms  0.633 ms
root@abogott-T374129:~#

The first traceroute tests I did were not as clean, but then Andrew opened up UDP and ICMP in the service group applied to the host and things got better.

Wed, Dec 18, 6:28 PM · Patch-For-Review, Cloud-VPS, User-aborrero, cloud-services-team

Tue, Dec 17

bd808 added a comment to T374830: Various CI jobs running in the integration Cloud VPS project failing due to transient DNS lookup failures, often for our own hosts such as gerrit.wikimedia.org.

@hashar do the containers in question map the system /etc/resolv.conf such that if I alter it the change will take effect immediately in the containers? Or do I need to alter the container config somehow?

Tue, Dec 17, 11:55 PM · Patch-For-Review, User-aborrero, Cloud-VPS, cloud-services-team, Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), User-brennen, ci-test-error (WMF-deployed Build Failure)
bd808 added a comment to T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances.

Upgrade announced to wikitech-l: https://lists.wikimedia.org/hyperkitty/list/wikitech-l@lists.wikimedia.org/thread/WXG426YV5N27JBF3LIF4V6WAMMFVNFU4/

Tue, Dec 17, 9:21 PM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808
bd808 changed the subtype of T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances from "Spike" to "Feature Request".
Tue, Dec 17, 9:14 PM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808
bd808 updated the task description for T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances.
Tue, Dec 17, 5:48 PM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808
bd808 updated the task description for T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances.
Tue, Dec 17, 5:36 PM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808
bd808 updated the task description for T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances.
Tue, Dec 17, 5:26 PM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808
bd808 updated the task description for T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances.
Tue, Dec 17, 5:20 PM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808
bd808 updated the task description for T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances.
Tue, Dec 17, 5:06 PM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808
bd808 added a comment to T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances.

Notes from deployment-mwmaint03.deployment-prep.eqiad1.wikimedia.cloud:

  • Stable state after 2 puppet runs now. The first run leaves these files to be cleaned by the second run:
Notice: /Stage[main]/Php/File[/etc/php/8.1/cli/conf.d/20-tideways.ini]/ensure: removed
Notice: /Stage[main]/Php/File[/etc/php/8.1/cli/conf.d/20-wmerrors.ini]/ensure: removed
Notice: /Stage[main]/Php/File[/etc/php/8.1/fpm/conf.d/20-tideways.ini]/ensure: removed

This seems to be an artifact of how the ::php::extension class interacts with the ini files provided by the deb packages. I am not sure that it is really worth anyone's time to eliminate this minor config churn.

Tue, Dec 17, 5:02 PM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808
bd808 edited projects for T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances, added: OKR-Work; removed Spike.
Tue, Dec 17, 12:32 AM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808
bd808 renamed T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances from Figure out how to install PHP 8.1 on bullseye MediaWiki instances to Install PHP 8.1 on deplopyment-prep bullseye MediaWiki instances.
Tue, Dec 17, 12:30 AM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808
bd808 updated the task description for T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances.
Tue, Dec 17, 12:28 AM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808

Mon, Dec 16

bd808 added a comment to T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances.

Notes from deployment-parsoid14.deployment-prep.eqiad1.wikimedia.cloud:

  • The fix in PS14 was not sufficient. It still took 3 puppet runs to reach stable state. The next package the first puppet run wanted was php8.1-fpm.
Mon, Dec 16, 10:34 PM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808
bd808 added a comment to T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances.

Notes from deployment-snapshot05.deployment-prep.eqiad1.wikimedia.cloud:

  • Added hiera settings to instance config via Horizon.
$ sudo -i run-puppet-agent
  # no changes
$ php --version
PHP 7.4.33 (cli) (built: Apr 18 2024 14:41:42) ( NTS )
Copyright (c) The PHP Group
Zend Engine v3.4.0, Copyright (c) Zend Technologies
    with Zend OPcache v7.4.33, Copyright (c), by Zend Technologies
$ sudo -i run-puppet-agent
  # saw some warnings about missing packages.
Error: Execution of '/usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold install php8.1-cli' returned 100: Reading package lists...
Building dependency tree...                                                     
Reading state information...
E: Unable to locate package php8.1-cli                                          
E: Couldn't find any package by glob 'php8.1-cli'
E: Couldn't find any package by regex 'php8.1-cli'
...
Notice: /Stage[main]/Php::Fpm/File[/etc/php/8.1/fpm/php-fpm.conf]: Dependency Package[php8.1-fpm] has failures: true
Warning: /Stage[main]/Php::Fpm/File[/etc/php/8.1/fpm/php-fpm.conf]: Skipping because of failed dependencies
  # seems to be a dependency order problem because eventually we see the component added as would have been expected before the first apt run
...
Notice: /Stage[main]/Profile::Mediawiki::Php/Apt::Package_from_component[wikimedia-php81]/Apt::Repository[repository_wikimedia-php81]/File[/etc/apt/sources.list.d/repository_wikimedia-php81.list]/ensure: defined content as '{sha256}f6a9310fa9ca1920c0e7c0630149a6c709c29638d31898f2f860568b3a2f1fb4'
...
Notice: Applied catalog in 55.95 seconds
$ sudo -i run-puppet-agent
  # second run made things look much better
...
Notice: Applied catalog in 77.25 seconds
$ php --version
Warning: Module "tideways_xhprof" is already loaded in Unknown on line 0
PHP 8.1.31 (cli) (built: Nov 21 2024 21:07:42) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.1.31, Copyright (c) Zend Technologies
    with Zend OPcache v8.1.31, Copyright (c), by Zend Technologies
$ sudo -i run-puppet-agent
Info: Using environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for deployment-snapshot05.deployment-prep.eqiad1.wikimedia.cloud
Info: Applying configuration version '(51a18fb309) gitpuppet - MediaWiki: Only proxy existing .php files, otherwise return nice 404'
Notice: /Stage[main]/Php/File[/etc/php/8.1/cli/conf.d/20-tideways.ini]/ensure: removed
Notice: /Stage[main]/Php/File[/etc/php/8.1/cli/conf.d/20-wmerrors.ini]/ensure: removed
Notice: /Stage[main]/Php/File[/etc/php/8.1/fpm/conf.d/20-tideways.ini]/ensure: removed
Notice: Applied catalog in 9.64 seconds
$ sudo -i run-puppet-agent
  # no changes
Notice: Applied catalog in 10.02 seconds
$ php --version
PHP 8.1.31 (cli) (built: Nov 21 2024 21:07:42) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.1.31, Copyright (c) Zend Technologies
    with Zend OPcache v8.1.31, Copyright (c), by Zend Technologies
Mon, Dec 16, 9:27 PM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808
thcipriani awarded T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances a Yellow Medal token.
Mon, Dec 16, 5:17 PM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808
bd808 added a subtask for T319432: Migrate WMF production from PHP 7.4 to PHP 8.1: T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances.
Mon, Dec 16, 5:09 PM · Dumps-Generation, MediaWiki-Platform-Team, serviceops
bd808 added a parent task for T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances: T319432: Migrate WMF production from PHP 7.4 to PHP 8.1.
Mon, Dec 16, 5:09 PM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808
bd808 added a comment to T382242: [engineering] Create gerrit dashboard for Trust and Safety Product team.

See https://www.mediawiki.org/wiki/Module:Gerrit_dashboard for one way you could prototype a dashboard from the wikis. Ancient usage example at https://www.mediawiki.org/wiki/User:BDavis_(WMF)/Projects/Core_code_review_dashboard

Mon, Dec 16, 5:00 PM · Trust and Safety Product Sprint (Sprint Chimes (Dec. 9 - Jan. 17)), Trust and Safety Product Team, Developer Productivity
bd808 added a comment to T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances.

The PCC reports for the patch have diffs like this for various prod hosts:

--- Php::Extension[yaml].orig
+++ Php::Extension[yaml]
Mon, Dec 16, 4:55 PM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808
bd808 reopened T381508: Subdomain for catalyst-dev project as "Open".

Reopening. I imagine the first thing to triple check is the configuration I set in T381508#10381726.

Mon, Dec 16, 4:17 PM · User-bd808, Cloud-VPS (Quota-requests), cloud-services-team (FY2024/2025-Q1-Q2)

Fri, Dec 13

bd808 removed a member for acl*phabricator: lbowmaker.
Fri, Dec 13, 9:35 AM
bd808 removed a member for acl*phabricator: WDoranWMF.
Fri, Dec 13, 9:35 AM
bd808 removed a member for acl*phabricator: Jrbranaa.
Fri, Dec 13, 9:35 AM

Thu, Dec 12

bd808 added a comment to T381948: Create automation service to reduce toil and increase visibility of local Puppetserver cherry-picks.

My high-level idea here is to create a web interface (more accessible than a CLI tool) that provides:

  • view of the current local changes on the local operations/puppet.git checkout (git log --oneline @{upstream}..HEAD)
  • add a new cherry-pick from gerrit
  • remove an existing cherry-pick
  • "refresh" an existing cherry-pick (remove + add latest revision)
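
A minimal sketch of how the first bullet (listing the local changes) could be backed, assuming a Python service. The names `LocalCommit`, `parse_oneline`, and `local_cherry_picks` are hypothetical; only the `git log --oneline @{upstream}..HEAD` command comes from the list above.

```python
import subprocess
from typing import NamedTuple


class LocalCommit(NamedTuple):
    """One local commit: abbreviated hash plus subject line."""
    sha: str
    subject: str


def parse_oneline(output: str) -> list[LocalCommit]:
    """Parse `git log --oneline` output into (sha, subject) pairs."""
    commits = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        sha, _, subject = line.partition(" ")
        commits.append(LocalCommit(sha, subject))
    return commits


def local_cherry_picks(repo_path: str) -> list[LocalCommit]:
    """List commits on HEAD that are not on the upstream tracking branch."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--oneline", "@{upstream}..HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_oneline(result.stdout)
```

The parsing is kept separate from the `git` call so it can be exercised without a checkout; a web view would render the returned list.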
Thu, Dec 12, 9:11 AM · User-bd808, Beta-Cluster-Infrastructure

Wed, Dec 11

bd808 added a comment to T381508: Subdomain for catalyst-dev project.

Horizon presented error:

Error: Unable to create proxy: <!doctype html> <html lang=en> <title>500 Internal Server Error</title> <h1>Internal Server Error</h1> <p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p> Details
Internal Server Error (HTTP 500)

So the backend goes boom, but that is about all that tells us. Some box in https://openstack-browser.toolforge.org/project/project-proxy hopefully holds more details.

Wed, Dec 11, 11:30 AM · User-bd808, Cloud-VPS (Quota-requests), cloud-services-team (FY2024/2025-Q1-Q2)
bd808 created T381948: Create automation service to reduce toil and increase visibility of local Puppetserver cherry-picks.
Wed, Dec 11, 9:04 AM · User-bd808, Beta-Cluster-Infrastructure
bd808 awarded T381773: [Session] what happens when you type tr.wikipedia.org? a Unicorn! token.
Wed, Dec 11, 8:43 AM · Wikimedia-Hackathon-2025
bd808 updated subscribers of T225730: Reduce runtime of MW shared gate Jenkins jobs to 5 min.

@thcipriani and @dduvall: I just noticed y'all missing from the subscriber list here, so you're welcome. :)

Wed, Dec 11, 8:33 AM · MW-1.43-notes (1.43.0-wmf.22; 2024-09-10), Wikimedia-Performance-recommendation, MW-1.41-notes (1.41.0-wmf.20; 2023-08-01), MW-1.40-notes (1.40.0-wmf.12; 2022-11-28), Release-Engineering-Team (Priority Backlog 📥), MW-1.39-notes (1.39.0-wmf.8; 2022-04-18), MW-1.38-notes (1.38.0-wmf.16; 2022-01-03), MW-1.36-notes (1.36.0-wmf.36; 2021-03-23), MW-1.35-notes (1.35.0-wmf.27; 2020-04-07), Patch-For-Review, Developer Productivity, Code-Health, Epic, MediaWiki-Core-Tests, Continuous-Integration-Config

Dec 4 2024

bd808 closed T381508: Subdomain for catalyst-dev project as Resolved.

The profile::wmcs::novaproxy::supported_zones data from T381508#10381726 was not quite right. The project name is "catalyst-dev", but the project ID is "7209100e0e744a4fbdf447534d4eb825". The project value in that dict needs to be the project ID. This switch from the old name override to UUIDs is just going to continue to be confusing for quite a while.

Dec 4 2024, 11:07 PM · User-bd808, Cloud-VPS (Quota-requests), cloud-services-team (FY2024/2025-Q1-Q2)
bd808 added a comment to T381508: Subdomain for catalyst-dev project.

I figured out that the id refers to the Designate zone record. For catalyst-dev.wmcloud.org that is 35699886-add9-4ad3-88d7-5b2829f8c72c.

Dec 4 2024, 10:53 PM · User-bd808, Cloud-VPS (Quota-requests), cloud-services-team (FY2024/2025-Q1-Q2)
bd808 added a comment to T381508: Subdomain for catalyst-dev project.
  • Delegate the specific subdomain in Designate to the project using wmcs-makedomain
root@cloudcontrol1005:~# wmcs-makedomain --project 7209100e0e744a4fbdf447534d4eb825 --domain catalyst-dev.wmcloud.org. --orig-project cloudinfra
root@cloudcontrol1005:~#
  • Provision the TLS certificates in the project-proxy-acme-chief prefix hiera
`project-proxy-acme-chief` prefix hiera
profile::acme_chief::certificates:
  customcatalyst:
    CN: catalyst.wmcloud.org
    SNI:
    - catalyst.wmcloud.org
    - '*.catalyst.wmcloud.org'
    authorized_regexes:
    - ^.+\.project-proxy\.eqiad1\.wikimedia\.cloud$
    challenge: dns-01
  • Run Puppet on the active project-proxy acme-chief host.
$ ssh root@project-proxy-acme-chief-02.project-proxy.eqiad1.wikimedia.cloud
# puppet agent -tv
Info: Using environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for project-proxy-acme-chief-02.project-proxy.eqiad1.wikimedia.cloud
Info: Applying configuration version '(a41235c804) Cwhite - webperf: disable statsd-exporter relaying flag'
Notice: /Stage[main]/Acme_chief::Server/File[/etc/acme-chief/config.yaml]/content:
--- /etc/acme-chief/config.yaml 2024-10-30 15:49:55.302873393 +0000
+++ /tmp/puppet-file20241204-2627403-sezto8     2024-12-04 22:22:01.463620877 +0000
@@ -11,6 +11,14 @@
     authorized_regexes:
     - "^.+\\.project-proxy\\.eqiad1\\.wikimedia\\.cloud$"
     challenge: dns-01
+  customcatalyst-dev:
+    CN: catalyst-dev.wmcloud.org
+    SNI:
+    - catalyst-dev.wmcloud.org
+    - "*.catalyst-dev.wmcloud.org"
+    authorized_regexes:
+    - "^.+\\.project-proxy\\.eqiad1\\.wikimedia\\.cloud$"
+    challenge: dns-01
   customcatalyst-qte:
     CN: catalyst-qte.wmcloud.org
     SNI:
Dec 4 2024, 10:41 PM · User-bd808, Cloud-VPS (Quota-requests), cloud-services-team (FY2024/2025-Q1-Q2)
bd808 closed T378571: Move @wikimedia_sal off botsin.space as Resolved.

Wikitech updates:

Dec 4 2024, 6:05 PM · User-bd808, Stashbot
bd808 changed the status of T378571: Move @wikimedia_sal off botsin.space from Open to In Progress.

https://wikimedia.social/@sal will be the bot's next home.

Dec 4 2024, 5:14 PM · User-bd808, Stashbot
bd808 set Due Date to Fri, Dec 13, 12:00 AM on T378571: Move @wikimedia_sal off botsin.space.
Dec 4 2024, 5:12 PM · User-bd808, Stashbot

Dec 3 2024

bd808 added a comment to T381419: Future testing-infra growth on cloud-vps.

A couple of months ago I grabbed a larger quota for the Integration project with T376847: Quota increase for Integration project (Jenkins CI runners). I have not yet built out the +12 instances that it was designed to accommodate, but I hope to get to that soon™.

Dec 3 2024, 5:22 PM · collaboration-services, Continuous-Integration-Infrastructure, QTE-TestingOverview, GitLab (CI & Job Runners), Cloud-VPS, cloud-services-team
bd808 added a comment to T381410: Prominent docs on GitLab CI are missing information on runner options and best practices.

Some docs with useful information that ideally would be easier to find:

Dec 3 2024, 4:33 PM · Documentation, GitLab (CI & Job Runners)
bd808 created T381410: Prominent docs on GitLab CI are missing information on runner options and best practices.
Dec 3 2024, 4:27 PM · Documentation, GitLab (CI & Job Runners)
bd808 added a comment to T376267: ☂ Wikitech account linking and SUL error reporting.

Please note that I seem to have another account "Aleksandar Mastilovic" that is still active on Wikitech, but I haven't used that one to access the site before. I did the "special password reset" thing using that account, since AMastilovic-WMF wasn't available in the drop-down list.

Dec 3 2024, 12:11 AM · wikitech.wikimedia.org

Dec 2 2024

bd808 updated subscribers of T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances.

With some help from @thcipriani the PHP 8.1 test was a success. The deployment-mediawiki81.deployment-prep.eqiad1.wikimedia.cloud instance ran MediaWiki without any obvious hard crashes. The log spam seen in the ELK stack did not seem to be out of the ordinary. The selenium-daily-beta-MediaWiki suite ran to success.

Dec 2 2024, 11:05 PM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808
bd808 added a comment to T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances.

I did some crawling around in various running WMCS instances and configuration repositories to try and figure out how the Internet → CDN → MediaWiki traffic flow finds a MediaWiki server. For my current purposes the interesting bit seems to be the data in the profile::trafficserver::backend::mapping_rules hiera key for prefix "deployment-cache-text". This collection of hiera settings configures the Apache Traffic Server (ATS) cache server which is the lowest layer of the CDN edge stack. When a URL is missed in the in-memory and on-disk cache pools, ATS looks at this config to figure out which upstream server to contact to handle the request. At the time I am writing this comment, the interesting bits are:

deployment-cache-text.yaml excerpt
- params:
  - '@plugin=/usr/lib/trafficserver/modules/tslua.so'
  - '@pparam=/etc/trafficserver/lua/rb-mw-mangling.lua'
  replacement: http://deployment-mediawiki13.deployment-prep.eqiad1.wikimedia.cloud/w/api.php
  target: http://(.*)/w/api.php
  type: regex_map
- params:
  - '@plugin=/usr/lib/trafficserver/modules/tslua.so'
  - '@pparam=/etc/trafficserver/lua/normalize-path.lua'
  - '@pparam="3A 2F 40 21 24 28 29 2A 2C 3B"'
  - '@pparam="5B 5D 26 27 2B 3D"'
  - '@plugin=/usr/lib/trafficserver/modules/tslua.so'
  - '@pparam=/etc/trafficserver/lua/rb-mw-mangling.lua'
  replacement: http://deployment-mediawiki14.deployment-prep.eqiad1.wikimedia.cloud
  target: /
  type: map

This tells me that Action API requests are routed to deployment-mediawiki13 and all otherwise unconfigured URLs are routed to deployment-mediawiki14.
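
A simplified model of that lookup, as a sketch only. It assumes rules are tried in order, that a regex_map rule applies its expression to the host and treats the rest of the target as a path prefix, and that the plain map rule with target / is a catch-all. The `host_regex` and `path_prefix` fields, the `pick_backend` helper, and the `wiki.example.org` host are hypothetical; this is not how ATS itself is implemented.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical, trimmed-down copies of the two rules in the excerpt above.
RULES = [
    {
        "type": "regex_map",
        "host_regex": r".*",           # from target http://(.*)/w/api.php
        "path_prefix": "/w/api.php",
        "replacement": "http://deployment-mediawiki13.deployment-prep.eqiad1.wikimedia.cloud/w/api.php",
    },
    {
        "type": "map",
        "host_regex": None,            # plain map: no host expression
        "path_prefix": "/",            # catch-all target
        "replacement": "http://deployment-mediawiki14.deployment-prep.eqiad1.wikimedia.cloud",
    },
]


def pick_backend(url: str, rules=RULES) -> Optional[str]:
    """Return the replacement of the first rule that matches the URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    path = parts.path or "/"
    for rule in rules:
        if not path.startswith(rule["path_prefix"]):
            continue
        if rule["type"] == "regex_map" and not re.fullmatch(rule["host_regex"], host):
            continue
        return rule["replacement"]
    return None
```

Under these assumptions, `pick_backend("http://wiki.example.org/w/api.php?action=query")` selects the deployment-mediawiki13 replacement and any other path falls through to deployment-mediawiki14.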

Dec 2 2024, 10:00 PM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808
bd808 added a comment to T317341: Findings in Security Readiness Reviews of Trusted GitLab Runners.

@Jelto Can we make this task public now so it is easier to document the firewall restrictions on the WMCS hosted runners?

Dec 2 2024, 9:06 PM · SecTeam-Processed, Vuln-Misconfiguration, Security-Team, Security, collaboration-services, GitLab (CI & Job Runners)
bd808 moved T381237: Allow scheduling for current backport window from Backlog to Need discussion on the Tool-schedule-deployment board.

How many minutes of slack in adding content to a deployment window are reasonable? The current logic for displaying windows available for scheduling uses the window start time as the cutoff. Leaving all windows found on the Deployments page in the list would satisfy this specific request, but would also remove the decluttering of only showing future events.
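
The trade-off can be sketched as a single cutoff with a configurable slack. This is an illustration only: `Window`, `schedulable`, and the `slack` parameter are hypothetical names, not the tool's actual code.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple


class Window(NamedTuple):
    """A deployment window, reduced to a title and a start time."""
    title: str
    start: datetime


def schedulable(windows, now, slack=timedelta(0)):
    """Return windows still open for scheduling.

    With slack=0 the start time is the cutoff (the current behaviour as
    described above); a positive slack keeps a window listed for that
    long after it has started.
    """
    return [w for w in windows if w.start + slack > now]


# Example: at 16:10 UTC, a window that started at 16:00 is hidden with no
# slack, but still listed with 15 minutes of slack.
now = datetime(2024, 12, 2, 16, 10, tzinfo=timezone.utc)
windows = [
    Window("current backport window", datetime(2024, 12, 2, 16, 0, tzinfo=timezone.utc)),
    Window("later backport window", datetime(2024, 12, 2, 21, 0, tzinfo=timezone.utc)),
]
no_slack = schedulable(windows, now)
with_slack = schedulable(windows, now, slack=timedelta(minutes=15))
```

A bounded slack keeps the list limited to upcoming or just-started windows, rather than showing every window on the Deployments page.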

Dec 2 2024, 4:41 PM · Tool-schedule-deployment

Nov 30 2024

bd808 closed T381110: #wikimedia-traffic does not use wikimedia global bans list / wmopbot etc. as Resolved.

This task can be made public now. It was only protected because of the WP:BEANS aspect of the config changes desired.

Nov 30 2024, 6:10 PM · Vuln-Misconfiguration, SecTeam-Processed, User-bd808, ircservserv, Traffic, wikimedia-irc-libera, Security-Team, Security
bd808 added a comment to T381110: #wikimedia-traffic does not use wikimedia global bans list / wmopbot etc..
[18:02]  <    bd808> !issync
[18:02]  <ircservserv-wm> Syncing #wikimedia-traffic (requested by bd808)
[18:02] ChanServ sets mode +o ircservserv-wm
[18:03] ChanServ sets mode +v jinxer-wm
[18:03]  <ircservserv-wm> Set /cs flags #wikimedia-traffic jinxer-wm +Vv
[18:03]  <ircservserv-wm> Set /cs flags #wikimedia-traffic brett -Res
[18:03]  <ircservserv-wm> Set /cs flags #wikimedia-traffic vgutierrez -Res
[18:03]  <ircservserv-wm> Set /cs flags #wikimedia-traffic fabfur -Res
[18:03]  <ircservserv-wm> Set /cs flags #wikimedia-traffic sukhe -Res
[18:03]  <ircservserv-wm> Set /cs flags #wikimedia-traffic ChrisDobbins901_ -Res
[18:03] ChanServ sets mode +v stashbot
[18:03]  <ircservserv-wm> Set /cs flags #wikimedia-traffic stashbot +Vv
[18:03]  <ircservserv-wm> Set /cs flags #wikimedia-traffic Az1568 -ARefiorstv
[18:03]  <ircservserv-wm> Set /cs flags #wikimedia-traffic *!*@libera/staff/* +o
[18:03]  <ircservserv-wm> Set /cs flags #wikimedia-traffic kwakuofori -Res
[18:03] ChanServ sets mode +v wmopbot
[18:03]  <ircservserv-wm> Set /cs flags #wikimedia-traffic wmopbot +Votv
[18:03] ChanServ sets mode +v wm-bot
[18:03]  <ircservserv-wm> Set /cs flags #wikimedia-traffic wm-bot +Vv
[18:03]  <ircservserv-wm> Set /cs flags #wikimedia-traffic ircservserv-wm +V
[18:03]  <ircservserv-wm> Set /cs flags #wikimedia-traffic litharge +o
[18:03] ircservserv-wm sets mode +b $j:#wikimedia-bans
[18:03]  <ircservserv-wm> Set /mode #wikimedia-traffic +b $j:#wikimedia-bans
[18:03] ircservserv-wm sets mode -o ircservserv-wm
Nov 30 2024, 6:06 PM · Vuln-Misconfiguration, SecTeam-Processed, User-bd808, ircservserv, Traffic, wikimedia-irc-libera, Security-Team, Security
bd808 closed T374651: Toolhub crawler hasn't run since 2024-07-23T21:03 as Resolved.
Nov 30 2024, 1:23 AM · User-bd808, Toolhub
bd808 added a project to T381110: #wikimedia-traffic does not use wikimedia global bans list / wmopbot etc.: Patch-For-Review.

I pinged the current channel founders (@BBlack, @Legoktm, and @ema) with this:

[00:15]  <    bd808> We have an ircservserv config ready for the #wikimedia-traffic channel, but I need help from one of the folks who currently have +F there to set it up. That would be bblack, legoktm, or ema. I need them to `/cs flags #wikimedia-traffic ircservserv-wm +AFRefiorstv`. There are 4 +F users so they likely also will need to `/cs flags #wikimedia-traffic Az1568 -F` first.
[00:15]  <    bd808> https://gitlab.wikimedia.org/toolforge-repos/ircservserv-config/-/merge_requests/17

I somehow just realized that @Az1568 could do the needful for this too. I likely wasn't thinking about that because of the particular channel I sent the ping in.

Nov 30 2024, 12:24 AM · Vuln-Misconfiguration, SecTeam-Processed, User-bd808, ircservserv, Traffic, wikimedia-irc-libera, Security-Team, Security

Nov 29 2024

bd808 moved T380537: Adoption request for bullseye from Incoming to Abandoned tool policy (adoption & usurpation) on the Toolforge-standards-committee board.
Nov 29 2024, 11:00 PM · Tool-bullseye, Toolforge-standards-committee
bd808 moved T381138: Adoption request for ftools from Incoming to Abandoned tool policy (adoption & usurpation) on the Toolforge-standards-committee board.
Nov 29 2024, 11:00 PM · Toolforge-standards-committee

Nov 28 2024

bd808 added a comment to T380108: ircservserv not detecting message sender when running behind ZNC v1.8.2.

Cursed idea: I think I could abuse the Python build system to perform arbitrary source builds. Modern Python has pluggable build backends, which makes it possible to write a completely custom backend. That is an extremely cursed idea.
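
For context, the pluggable backends mentioned here are the PEP 517 build hooks. A rough sketch of what a custom in-tree backend might look like is below. The module name `cursed_backend`, the project name and version, and the generated file are all hypothetical, and `_arbitrary_build_step` is only a stand-in for whatever source build would actually be run. Only the `build_wheel` hook is shown; a complete backend would also need `build_sdist`.

```python
"""Hypothetical in-tree PEP 517 backend, e.g. saved as cursed_backend.py."""
import base64
import hashlib
import os
import zipfile

NAME = "cursed_demo"  # hypothetical project name
VERSION = "0.0.1"     # hypothetical version


def _arbitrary_build_step():
    """Stand-in for any source build (make, cargo, ...); here it just emits a module."""
    return b"ANSWER = 42\n"


def _record_line(path, data):
    """Format one RECORD entry: path, urlsafe base64 sha256 without padding, size."""
    digest = base64.urlsafe_b64encode(hashlib.sha256(data).digest()).rstrip(b"=")
    return f"{path},sha256={digest.decode()},{len(data)}"


def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    """PEP 517 hook: write a wheel into wheel_directory and return its basename."""
    dist_info = f"{NAME}-{VERSION}.dist-info"
    metadata = f"Metadata-Version: 2.1\nName: {NAME}\nVersion: {VERSION}\n"
    wheel_meta = (
        "Wheel-Version: 1.0\n"
        "Generator: cursed_backend\n"
        "Root-Is-Purelib: true\n"
        "Tag: py3-none-any\n"
    )
    files = {
        f"{NAME}.py": _arbitrary_build_step(),
        f"{dist_info}/METADATA": metadata.encode(),
        f"{dist_info}/WHEEL": wheel_meta.encode(),
    }
    # RECORD lists every file with its hash; its own entry has no hash or size.
    record = [_record_line(path, data) for path, data in files.items()]
    record.append(f"{dist_info}/RECORD,,")
    files[f"{dist_info}/RECORD"] = ("\n".join(record) + "\n").encode()

    basename = f"{NAME}-{VERSION}-py3-none-any.whl"
    with zipfile.ZipFile(os.path.join(wheel_directory, basename), "w") as whl:
        for path, data in files.items():
            whl.writestr(path, data)
    return basename
```

A project would opt in through the `[build-system]` table of its `pyproject.toml`, with `build-backend = "cursed_backend"` and `backend-path = ["."]` so the frontend can import the module from the source tree.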

Nov 28 2024, 6:04 PM · ircservserv
bd808 renamed T380127: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble from Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble to [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble.
Nov 28 2024, 3:54 PM · Toolforge, cloud-services-team
bd808 added a comment to T381110: #wikimedia-traffic does not use wikimedia global bans list / wmopbot etc..

@Urbanecm if you make an ircservserv patch I don't think it would be too hard to get @BBlack or @Legoktm to apply it to the channel.

Nov 28 2024, 3:37 PM · Vuln-Misconfiguration, SecTeam-Processed, User-bd808, ircservserv, Traffic, wikimedia-irc-libera, Security-Team, Security
bd808 added a comment to T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances.

I would really like to consolidate the hiera settings for deployment-prep into the Horizon managed system because it is so much easier for mortals to use than the ops/puppet.git:hieradata/cloud/eqiad1/deployment-prep/common.yaml settings. This is a distraction from this task however and also potentially controversial, so I am making this note and moving on with my experiment.

Nov 28 2024, 12:38 AM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808
bd808 added a comment to T378752: Install PHP 8.1 on deployment-prep bullseye MediaWiki instances.
[00:02]  <    bd808> Southparkfan: Do you remember from your work on deployment-prep bullseye stuff what changes are needed to pool a new MediaWiki server? Context is T378752 and my next step desire to route traffic to that PHP 8.1 node.
[00:02]  < stashbot> T378752: Figure out how to install PHP 8.1 on bullseye MediaWiki instances - https://phabricator.wikimedia.org/T378752
[00:05]  <    bd808> ah ha, I bet T361387 will tell me a lot
[00:05]  < stashbot> T361387: Replace or delete deployment-mediawiki[11-12].deployement-prep.eqiad1.wikimedia.cloud - https://phabricator.wikimedia.org/T361387
[00:06]  <Southparkfan> bd808: I'm about to log off to get some sleep, but what I can tell for sure is that I had to provision the new appservers with puppet, then add them to their appropriate dsh groups so that they get the scap syncs
[00:07]  <Southparkfan> However, I don't think all appservers are pooled in ATS for cache_text traffic, iirc only one of the two appservers is pooled right now. Technically, the other one can also serve traffic just fine, though
[00:08]  <    bd808> Southparkfan: good to know. I found some of what you did. I'll ask you more if I end up really stuck. :)
[00:09]  <Southparkfan> I also recall having issues installing new VMs with certain puppet roles right away, at some point I just decided to bootstrap them with the 'base puppet role' (i.e. the default role for Cloud VPS instances, nothing app-specific applied), then changing the puppet role to the appropriate role
[00:10]  <    bd808> I had "fun" with puppetmaster certs on the new instance I built. It felt like there has been some regression in switching to a self-hosted puppetmaster
[00:10]  <Southparkfan> (otherwise the host could get stuck in some unknown state, where it boots fine, but wouldn't allow me to log in)
[00:11]  <    bd808> I luckily have all the s3cr3t cloud root juice if that happens
[00:12]  <Southparkfan> Oh heh, I didn't have issues with the puppetmaster certificates. I did have issues with incorrect permissions on the git clone of the puppet repo on the puppetmaster in deployment-prep, but I think it has been fixed since
[00:12]  <Southparkfan> Ahaha :D magic sauce for those who shall not depend on working LDAP clients
[00:13]  <    bd808> yeah, I can even enter through the vm's root console if things are very messed up.
[00:14]  <Southparkfan> Feel free to brain dump the installation process somewhere, I can surely take a look later today (or tomorrow I think, with regards to your TZ - it's already past midnight here)
[00:15]  <    bd808> Southparkfan: thanks for the offer. Get some good sleep. :)

T361387: Replace or delete deployment-mediawiki[11-12].deployement-prep.eqiad1.wikimedia.cloud

Nov 28 2024, 12:14 AM · OKR-Work, Patch-For-Review, Beta-Cluster-Infrastructure, User-bd808

Nov 27 2024

bd808 added a comment to T381027: 3 public Python tool configuration files.

I revoked the database credentials for wikifile-transfer. They should be recreated shortly by the provisioning service.

$ ssh cloudcontrol1005.eqiad.wmnet
$ sudo /usr/local/sbin/maintain-dbusers delete tools.wikifile-transfer --account-type=tool
INFO [root.delete_account:1162] Deleted tool account in 185.15.56.15:3306 for tools.wikifile-transfer
INFO [root.delete_account:1162] Deleted tool account in clouddb1013.eqiad.wmnet:3311 for tools.wikifile-transfer
INFO [root.delete_account:1162] Deleted tool account in clouddb1017.eqiad.wmnet:3311 for tools.wikifile-transfer
INFO [root.delete_account:1162] Deleted tool account in clouddb1014.eqiad.wmnet:3312 for tools.wikifile-transfer
INFO [root.delete_account:1162] Deleted tool account in clouddb1018.eqiad.wmnet:3312 for tools.wikifile-transfer
INFO [root.delete_account:1162] Deleted tool account in clouddb1013.eqiad.wmnet:3313 for tools.wikifile-transfer
INFO [root.delete_account:1162] Deleted tool account in clouddb1017.eqiad.wmnet:3313 for tools.wikifile-transfer
INFO [root.delete_account:1162] Deleted tool account in clouddb1015.eqiad.wmnet:3314 for tools.wikifile-transfer
INFO [root.delete_account:1162] Deleted tool account in clouddb1019.eqiad.wmnet:3314 for tools.wikifile-transfer
INFO [root.delete_account:1162] Deleted tool account in clouddb1016.eqiad.wmnet:3315 for tools.wikifile-transfer
INFO [root.delete_account:1162] Deleted tool account in clouddb1020.eqiad.wmnet:3315 for tools.wikifile-transfer
INFO [root.delete_account:1162] Deleted tool account in clouddb1015.eqiad.wmnet:3316 for tools.wikifile-transfer
INFO [root.delete_account:1162] Deleted tool account in clouddb1019.eqiad.wmnet:3316 for tools.wikifile-transfer
INFO [root.delete_account:1162] Deleted tool account in clouddb1014.eqiad.wmnet:3317 for tools.wikifile-transfer
INFO [root.delete_account:1162] Deleted tool account in clouddb1018.eqiad.wmnet:3317 for tools.wikifile-transfer
INFO [root.delete_account:1162] Deleted tool account in clouddb1016.eqiad.wmnet:3318 for tools.wikifile-transfer
INFO [root.delete_account:1162] Deleted tool account in clouddb1020.eqiad.wmnet:3318 for tools.wikifile-transfer
INFO [root.delete_account:1178] Deleted replica config for tool account tools.wikifile-transfer
Nov 27 2024, 8:55 PM · SecTeam-Processed, Vuln-Infoleak, Tools, Security
bd808 added a comment to T380991: Various CI jobs failing with: Could not resolve host: gerrit.wikimedia.org (2024-11-27).

Andrew reopened this task as Open.

@Andrew, did you generally disagree with folding this into T374830: Various CI jobs running in the integration Cloud VPS project failing due to transient DNS lookup failures, often for our own hosts such as gerrit.wikimedia.org?

Nov 27 2024, 7:21 PM · ci-test-error (WMF-deployed Build Failure), cloud-services-team, Cloud-VPS, Continuous-Integration-Infrastructure, Release-Engineering-Team
bd808 added a comment to T374830: Various CI jobs running in the integration Cloud VPS project failing due to transient DNS lookup failures, often for our own hosts such as gerrit.wikimedia.org.

Also, having read the whole task, I'd bet the problem is first of all network instability, as @bd808 has suggested.

One thing you could try is to force DNS resolution on a few machines in that group to use TCP (declaring use-vc in resolv.conf, IIRC).

Using TCP we should get both more failure tolerance and better debugging info, at the cost of slightly more expensive DNS queries. If, after the change has been running for a few days, the machines using TCP have a lower failure rate, it's highly probable that the problem is indeed with the network.
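
As an illustration of the suggestion above, glibc's resolver reads `use-vc` from an `options` line in `resolv.conf` and then sends queries over TCP. The helper below is a minimal sketch that edits the file content as a string; the function name `force_tcp` and the sample address are made up for this example, and on hosts where Puppet manages `/etc/resolv.conf` the change would have to go through Puppet rather than a direct edit.

```python
def force_tcp(resolv_conf_text):
    """Return resolv.conf content with the glibc `use-vc` option enabled.

    Appends `use-vc` to the first existing `options` line, or adds a new
    `options use-vc` line when the file has none.
    """
    lines = resolv_conf_text.splitlines()
    for i, line in enumerate(lines):
        fields = line.split()
        if fields and fields[0] == "options":
            if "use-vc" not in fields[1:]:
                lines[i] = line.rstrip() + " use-vc"
            return "\n".join(lines) + "\n"
    lines.append("options use-vc")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 192.0.2.53 is a documentation-range address, not a real resolver.
    print(force_tcp("nameserver 192.0.2.53\noptions timeout:1 attempts:3\n"), end="")
```

Applying this to only part of the fleet keeps a UDP control group, which is what makes the failure-rate comparison described above meaningful.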

Nov 27 2024, 4:47 PM · Patch-For-Review, User-aborrero, Cloud-VPS, cloud-services-team, Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), User-brennen, ci-test-error (WMF-deployed Build Failure)
bd808 renamed T374830: Various CI jobs running in the integration Cloud VPS project failing due to transient DNS lookup failures, often for our own hosts such as gerrit.wikimedia.org from Various CI jobs failing with: Could not resolve host: gerrit.wikimedia.org to Various CI jobs running in the integration Cloud VPS project failing due to transient DNS lookup failures, often for our own hosts such as gerrit.wikimedia.org.
Nov 27 2024, 4:35 PM · Patch-For-Review, User-aborrero, Cloud-VPS, cloud-services-team, Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), User-brennen, ci-test-error (WMF-deployed Build Failure)
bd808 reopened T374830: Various CI jobs running in the integration Cloud VPS project failing due to transient DNS lookup failures, often for our own hosts such as gerrit.wikimedia.org as "Open".
Nov 27 2024, 4:34 PM · Patch-For-Review, User-aborrero, Cloud-VPS, cloud-services-team, Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), User-brennen, ci-test-error (WMF-deployed Build Failure)
bd808 merged T380991: Various CI jobs failing with: Could not resolve host: gerrit.wikimedia.org (2024-11-27) into T374830: Various CI jobs running in the integration Cloud VPS project failing due to transient DNS lookup failures, often for our own hosts such as gerrit.wikimedia.org.
Nov 27 2024, 4:33 PM · Patch-For-Review, User-aborrero, Cloud-VPS, cloud-services-team, Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), User-brennen, ci-test-error (WMF-deployed Build Failure)
bd808 added a comment to T374830: Various CI jobs running in the integration Cloud VPS project failing due to transient DNS lookup failures, often for our own hosts such as gerrit.wikimedia.org.
Nov 27 2024, 4:33 PM · Patch-For-Review, User-aborrero, Cloud-VPS, cloud-services-team, Continuous-Integration-Infrastructure, Release-Engineering-Team (Seen), User-brennen, ci-test-error (WMF-deployed Build Failure)
bd808 merged task T380991: Various CI jobs failing with: Could not resolve host: gerrit.wikimedia.org (2024-11-27) into T374830: Various CI jobs running in the integration Cloud VPS project failing due to transient DNS lookup failures, often for our own hosts such as gerrit.wikimedia.org.
Nov 27 2024, 4:32 PM · ci-test-error (WMF-deployed Build Failure), cloud-services-team, Cloud-VPS, Continuous-Integration-Infrastructure, Release-Engineering-Team
bd808 updated subscribers of T380991: Various CI jobs failing with: Could not resolve host: gerrit.wikimedia.org (2024-11-27).

I thought T374830 was expected to be fully fixed, but I’m also fine with reopening it and closing this one.

Nov 27 2024, 4:31 PM · ci-test-error (WMF-deployed Build Failure), cloud-services-team, Cloud-VPS, Continuous-Integration-Infrastructure, Release-Engineering-Team
bd808 added a comment to T380991: Various CI jobs failing with: Could not resolve host: gerrit.wikimedia.org (2024-11-27).

My instinct is that we should merge this into T374830: Various CI jobs running in the integration Cloud VPS project failing due to transient DNS lookup failures, often for our own hosts such as gerrit.wikimedia.org and keep that task open due to the amount of investigation that has already been done there. DNS lookup instability will have spikes during larger network events, but there is also a relatively steady background issue here that really should be explainable if not preventable.

Nov 27 2024, 4:16 PM · ci-test-error (WMF-deployed Build Failure), cloud-services-team, Cloud-VPS, Continuous-Integration-Infrastructure, Release-Engineering-Team
bd808 moved T380950: wikistream.toolforge.org not working properly with new irc.wikimedia.org implementation from To Do to Volunteer time on the User-bd808 board.
Nov 27 2024, 1:28 AM · User-bd808, Tools
bd808 claimed T380950: wikistream.toolforge.org not working properly with new irc.wikimedia.org implementation.
Nov 27 2024, 1:28 AM · User-bd808, Tools
bd808 created T380950: wikistream.toolforge.org not working properly with new irc.wikimedia.org implementation.
Nov 27 2024, 1:26 AM · User-bd808, Tools
bd808 closed T232547: Create docs on how to restart both varnish and the node process for wikistream as Declined.

The wikistream Cloud VPS project was deleted after migrating wikistream to Toolforge. See also T251555: wikistream.toolforge.org needs new maintainers.

Nov 27 2024, 1:23 AM · VPS-Projects, Documentation
bd808 added a comment to T379712: Run PHPUnit tests for mediawiki/core in parallel.

This work sped up the build process from T369115: [WE6.2.1] Publish pre-train single version containers significantly. Thank you!

[23:56]  <    bd808> kostajh: I just checked on the wmf/next build from last night. The parallel work on wmf-quibble-core-vendor-mysql-php74 helped a bunch. It ran 12 minutes faster than the historical average. We are now seeing wmf-quibble-selenium-php74 as the slowest suite (18m) with mediawiki-quibble-vendor-postgres-php74 close behind at 17m.
[23:56]  <    bd808> More at https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1097575?tab=checks is you're interested.
[00:01]  <    bd808> *if
Nov 27 2024, 12:30 AM · wmde-wikidata-tech, Wikidata, Developer Productivity, MediaWiki-Core-Tests
bd808 awarded T379712: Run PHPUnit tests for mediawiki/core in parallel a Unicorn! token.
Nov 27 2024, 12:24 AM · wmde-wikidata-tech, Wikidata, Developer Productivity, MediaWiki-Core-Tests

Nov 26 2024

bd808 added a comment to T380886: openstack: increase virtual network observability.

running on every toolforge kubernetes worker node, ping other workers on the pod network, and coredns

Nov 26 2024, 10:37 PM · Sustainability (Incident Followup), Cloud-VPS, User-aborrero, cloud-services-team
bd808 updated subscribers of T379683: [WE6.2.6] Create design document for Group -1 deployment.

Today I drafted lists of open questions for testing folks (Quality Services and Test Platform teams) and for SRE folks (Service Operations team) along with a straw-dog proposal utilizing my personal answers to the questions. Docs have been shared with @SDunlap and @akosiaris for further distribution as appropriate. Everything is in gdocs at this point just because that is an easier tool for early stage comments and discussions than wiki pages. My hope is that we get some initial discussion going that we can use in followup conversations during the upcoming MediaWiki/Developer Experience offsite in Barcelona.

Nov 26 2024, 12:37 AM · User-zeljkofilipin, Release-Engineering-Team (Doing 😎), OKR-Work

Nov 25 2024

bd808 added a comment to T379927: Puppet removed "nameserver" line from /etc/resolv.conf.
Nov 25 2024, 4:45 PM · Puppet, Infrastructure-Foundations, Cloud-VPS, cloud-services-team
bd808 updated the task description for T380704: Block/unblock account feature request.
Nov 25 2024, 4:37 PM · Infrastructure-Foundations, Bitu

Nov 23 2024

bd808 updated the task description for T388: Graphical configuration interface.
Nov 23 2024, 9:05 PM · Platform Engineering (Icebox), MediaWiki-Configuration

Nov 22 2024

bd808 added a comment to T377663: Support autocompletion in CodeEditor.

You can disable it by pressing Ctrl + , and unticking Live Autocompletion

How the heck is an end user supposed to figure that out without coming here and somehow finding this ticket? Why is the preferences menu hidden behind an unintuitive and undocumented keyboard shortcut instead of something standard like a gear icon? It's not like there's a lack of space on the toolbar for the code editor, which currently only has six buttons.

Nov 22 2024, 6:39 PM · User-notice-archive, MW-1.44-notes (1.44.0-wmf.4; 2024-11-19), CodeEditor
bd808 reassigned T380535: Gerrit Projects API listing does not include design/codex-php (Codesearch does not index codex-php) from bd808 to thcipriani.

My guess is the project list cache was off/outdated and setting the description refreshed the cache entry. Well done @bd808

Nov 22 2024, 4:27 PM · User-bd808, Release-Engineering-Team, VPS-project-Codesearch, Gerrit
bd808 added a comment to T379030: openstack: wmfkeystonehooks: project ids rather than names are being used in LDAP group creation.

I am not coming up with reasons to keep mirroring the Keystone project membership data into LDAP at all if it is going to be obfuscated.

Up until this week, I would have said that keystone projects are in ldap for pam/sssd and sudo rule management. @bd808 can you tell me more about what kinds of things you use ldap for directly? I think I understand the stashbot case (which can potentially be altered to talk to keystone directly).

Nov 22 2024, 12:49 AM · Patch-For-Review, User-aborrero, Cloud-VPS, cloud-services-team
bd808 added a comment to T380535: Gerrit Projects API listing does not include design/codex-php (Codesearch does not index codex-php).

Adding a description made it show up for me, can you all confirm?

{"design/blog":{"id":"design%2Fblog","name":"design/blog","parent":"design","state":"READ_ONLY","web_links":[{"name":"gitiles","url":"https://gerrit.wikimedia.org/g/design/blog"}]},"design/codex":{"id":"design%2Fcodex","name":"design/codex","parent":"design","state":"ACTIVE","web_links":[{"name":"gitiles","url":"https://gerrit.wikimedia.org/g/design/codex"}]},"design/codex-php":{"id":"design%2Fcodex-php","name":"design/codex-php","parent":"design","state":"ACTIVE","web_links":[{"name":"gitiles","url":"https://gerrit.wikimedia.org/g/design/codex-php"}]},"design/landing-page":{"id":"design%2Flanding-page","name":"design/landing-page","parent":"All-Projects","state":"READ_ONLY","web_links":[{"name":"gitiles","url":"https://gerrit.wikimedia.org/g/design/landing-page"}]},"design/strategy":{"id":"design%2Fstrategy","name":"design/strategy","parent":"design","state":"READ_ONLY","web_links":[{"name":"gitiles","url":"https://gerrit.wikimedia.org/g/design/strategy"}]},"design/style-guide":{"id":"design%2Fstyle-guide","name":"design/style-guide","parent":"All-Projects","state":"READ_ONLY","web_links":[{"name":"gitiles","url":"https://gerrit.wikimedia.org/g/design/style-guide"}]}}
Nov 22 2024, 12:27 AM · User-bd808, Release-Engineering-Team, VPS-project-Codesearch, Gerrit

Nov 21 2024

bd808 added a comment to T380535: Gerrit Projects API listing does not include design/codex-php (Codesearch does not index codex-php).

Trying some variations:

{"design/codex":{"id":"design%2Fcodex","name":"design/codex","parent":"design","state":"ACTIVE","web_links":[{"name":"gitiles","url":"https://gerrit.wikimedia.org/g/design/codex"}]}}
{"design/codex":{"id":"design%2Fcodex","name":"design/codex","parent":"design","state":"ACTIVE","web_links":[{"name":"gitiles","url":"https://gerrit.wikimedia.org/g/design/codex"}]}}
{}
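
The requests that produced these listings are not preserved here. As a hedged sketch of how such a lookup can be scripted: Gerrit's REST API exposes project listings under `/projects/`, accepts a `p` query parameter for prefix matching, and prepends the `)]}'` marker to JSON responses to guard against XSSI. The base URL default and the helper names below are assumptions made for this example.

```python
import json
import urllib.parse
import urllib.request

# Gerrit prefixes JSON bodies with this marker; it must be removed before parsing.
XSSI_PREFIX = ")]}'"


def parse_gerrit_json(body):
    """Strip Gerrit's XSSI guard (if present) and decode the remaining JSON."""
    if body.startswith(XSSI_PREFIX):
        body = body[len(XSSI_PREFIX):]
    return json.loads(body)


def list_projects(prefix, base_url="https://gerrit.wikimedia.org/r"):
    """Return a dict of projects whose names start with `prefix`.

    `base_url` is an assumed Gerrit root; adjust it for other installations.
    """
    query = urllib.parse.urlencode({"p": prefix})
    with urllib.request.urlopen(f"{base_url}/projects/?{query}") as resp:
        return parse_gerrit_json(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the sorted project names under the design/ prefix.
    for name in sorted(list_projects("design/")):
        print(name)
```

Comparing the keys returned for a few different prefixes is one way to spot a project that is missing from the listing.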
Nov 21 2024, 10:51 PM · User-bd808, Release-Engineering-Team, VPS-project-Codesearch, Gerrit

Nov 20 2024

bd808 added a comment to T376267: ☂ Wikitech account linking and SUL error reporting.
Wikitech account/LDAP: Rodrigo
SUL account: Rodrigo
Account linked on IDM: Y
I have visited MediaWiki:Loginprompt: Y
I have tried to reset my password using Special:PasswordReset: Y

The abandoned account Rodrigo is not in use and has zero edits. It gives me an error message. Please remove it or attach it to me.

Nov 20 2024, 10:54 PM · wikitech.wikimedia.org
bd808 edited Description on MediaWiki-Quickstart.
Nov 20 2024, 10:31 PM
bd808 added a comment to T364605: Move Striker to Bitu username validation API.

Tokens can be requested via https://idm-test.wikimedia.org once the attached patch has been merged.

Nov 20 2024, 10:07 PM · Patch-For-Review, Striker, Infrastructure-Foundations, Bitu, cloud-services-team
bd808 added a parent task for T364605: Move Striker to Bitu username validation API: T380384: [toolsadmin] Striker cannot create Developer accounts with names matching existing SUL accounts.
Nov 20 2024, 10:06 PM · Patch-For-Review, Striker, Infrastructure-Foundations, Bitu, cloud-services-team
bd808 added a subtask for T380384: [toolsadmin] Striker cannot create Developer accounts with names matching existing SUL accounts: T364605: Move Striker to Bitu username validation API.
Nov 20 2024, 10:06 PM · Striker
bd808 updated the task description for T380384: [toolsadmin] Striker cannot create Developer accounts with names matching existing SUL accounts.
Nov 20 2024, 9:37 PM · Striker
bd808 renamed T380384: [toolsadmin] Striker cannot create Developer accounts with names matching existing SUL accounts from [toolsadmin] Username is already in use or invalid. to [toolsadmin] Striker cannot create Developer accounts with names matching existing SUL accounts.
Nov 20 2024, 9:36 PM · Striker
bd808 added a comment to T380384: [toolsadmin] Striker cannot create Developer accounts with names matching existing SUL accounts.

This is an unfortunate side effect of T161859: Make Wikitech an SUL wiki having started and T364605: Move Striker to Bitu username validation API not yet having been implemented.

Nov 20 2024, 9:36 PM · Striker
bd808 triaged T380384: [toolsadmin] Striker cannot create Developer accounts with names matching existing SUL accounts as High priority.
Nov 20 2024, 9:35 PM · Striker