refactor: convert workspacesdk.AgentConn to an interface #19392

DanielleMaywood · 2025-08-18T13:37:29Z

We convert workspacesdk.AgentConn to an interface and generate a mock for it. This lets coderd tests that rely on the agent's HTTP API run without setting up an entire tailnet networking stack.

DanielleMaywood · 2025-08-18T13:42:01Z

coderd/coderd.go

+	if options.PrometheusRegistry == nil {
+		options.PrometheusRegistry = prometheus.NewRegistry()
+	}
 	if options.Authorizer == nil {
 		options.Authorizer = rbac.NewCachingAuthorizer(options.PrometheusRegistry)
 		if buildinfo.IsDev() {


This reorder fixes a nil pointer dereference when neither PrometheusRegistry nor Authorizer is passed through options.
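To illustrate why the order matters, here is a minimal sketch of defaulting a dependency before the value built from it. The `registry`, `authorizer`, and `options` types are hypothetical stand-ins, not the real `prometheus.Registry` or `rbac` types; the only assumption is that the constructor dereferences the registry it is given.

```go
package main

import "fmt"

// registry is a hypothetical stand-in for a metrics registry.
type registry struct{ names []string }

// register dereferences r, so calling it on a nil *registry panics.
func (r *registry) register(name string) { r.names = append(r.names, name) }

// authorizer is a hypothetical stand-in for a caching authorizer.
type authorizer struct{ reg *registry }

// newAuthorizer registers a metric, so it needs a non-nil registry.
func newAuthorizer(reg *registry) *authorizer {
	reg.register("authorizer_metrics")
	return &authorizer{reg: reg}
}

type options struct {
	Registry   *registry
	Authorizer *authorizer
}

// applyDefaults fills in the dependency (Registry) before the value
// that is constructed from it (Authorizer). Swapping the two blocks
// would pass a nil registry to newAuthorizer.
func applyDefaults(o *options) {
	if o.Registry == nil {
		o.Registry = &registry{}
	}
	if o.Authorizer == nil {
		o.Authorizer = newAuthorizer(o.Registry)
	}
}

func main() {
	var o options
	applyDefaults(&o)
	fmt.Println("registered metrics:", len(o.Registry.names))
}
```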

DanielleMaywood · 2025-08-18T13:42:41Z

coderd/workspaceagents_test.go

-	t.Run("PayloadTooLarge", func(t *testing.T) {
-		t.Parallel()
-
-		var (
-			ctx               = testutil.Context(t, testutil.WaitSuperLong)
-			logger            = slogtest.Make(t, &slogtest.Options{IgnoreErrors: true}).Leveled(slog.LevelDebug)
-			mClock            = quartz.NewMock(t)
-			updaterTickerTrap = mClock.Trap().TickerFunc("updaterLoop")
-			mCtrl             = gomock.NewController(t)
-			mCCLI             = acmock.NewMockContainerCLI(mCtrl)
-
-			client, db = coderdtest.NewWithDatabase(t, &coderdtest.Options{Logger: &logger})
-			user       = coderdtest.CreateFirstUser(t, client)
-			r          = dbfake.WorkspaceBuild(t, db, database.WorkspaceTable{
-				OrganizationID: user.OrganizationID,
-				OwnerID:        user.UserID,
-			}).WithAgent(func(agents []*proto.Agent) []*proto.Agent {
-				return agents
-			}).Do()
-		)
-
-		// WebSocket limit is 4MiB, so we want to ensure we create _more_ than 4MiB worth of payload.
-		// Creating 20,000 fake containers creates a payload of roughly 7MiB.
-		var fakeContainers []codersdk.WorkspaceAgentContainer
-		for range 20_000 {
-			fakeContainers = append(fakeContainers, codersdk.WorkspaceAgentContainer{
-				CreatedAt:    time.Now(),
-				ID:           uuid.NewString(),
-				FriendlyName: uuid.NewString(),
-				Image:        "busybox:latest",
-				Labels: map[string]string{
-					agentcontainers.DevcontainerLocalFolderLabel: "/home/coder/project",
-					agentcontainers.DevcontainerConfigFileLabel:  "/home/coder/project/.devcontainer/devcontainer.json",
-				},
-				Running: false,
-				Ports:   []codersdk.WorkspaceAgentContainerPort{},
-				Status:  string(codersdk.WorkspaceAgentDevcontainerStatusRunning),
-				Volumes: map[string]string{},
-			})
-		}
-
-		mCCLI.EXPECT().List(gomock.Any()).Return(codersdk.WorkspaceAgentListContainersResponse{Containers: fakeContainers}, nil)
-		mCCLI.EXPECT().DetectArchitecture(gomock.Any(), gomock.Any()).Return("<none>", nil).AnyTimes()
-
-		_ = agenttest.New(t, client.URL, r.AgentToken, func(o *agent.Options) {
-			o.Logger = logger.Named("agent")
-			o.Devcontainers = true
-			o.DevcontainerAPIOptions = []agentcontainers.Option{
-				agentcontainers.WithClock(mClock),
-				agentcontainers.WithContainerCLI(mCCLI),
-				agentcontainers.WithWatcher(watcher.NewNoop()),
-			}
-		})
-
-		resources := coderdtest.NewWorkspaceAgentWaiter(t, client, r.Workspace.ID).Wait()
-		require.Len(t, resources, 1, "expected one resource")
-		require.Len(t, resources[0].Agents, 1, "expected one agent")
-		agentID := resources[0].Agents[0].ID
-
-		updaterTickerTrap.MustWait(ctx).MustRelease(ctx)
-		defer updaterTickerTrap.Close()
-
-		containers, closer, err := client.WatchWorkspaceAgentContainers(ctx, agentID)
-		require.NoError(t, err)
-		defer func() {
-			closer.Close()
-		}()
-
-		select {
-		case <-ctx.Done():
-			t.Fail()
-		case _, ok := <-containers:
-			require.False(t, ok)
-		}
-	})


This test case is now covered by the test in coderd/workspaceagents_internal_test.go.
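For context on the removed test's premise, the following rough sketch checks that 20,000 container entries serialize to more than the 4 MiB WebSocket limit mentioned in the test comment. The `container` struct, its JSON field names, and the label keys are assumptions made for illustration; they are not the real `codersdk.WorkspaceAgentContainer` encoding, so the exact size will differ from the roughly 7 MiB quoted above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// wsLimit is the 4 MiB WebSocket message limit cited in the removed test.
const wsLimit = 4 << 20

// container is a hypothetical, trimmed-down container record.
type container struct {
	ID           string            `json:"id"`
	FriendlyName string            `json:"name"`
	Image        string            `json:"image"`
	Labels       map[string]string `json:"labels"`
}

// payloadSize returns the JSON-encoded size in bytes of n fake containers.
func payloadSize(n int) (int, error) {
	containers := make([]container, 0, n)
	for i := 0; i < n; i++ {
		// 36-character IDs mimic the length of a UUID string.
		id := fmt.Sprintf("%036d", i)
		containers = append(containers, container{
			ID:           id,
			FriendlyName: id,
			Image:        "busybox:latest",
			Labels: map[string]string{
				// Label keys are assumed for this sketch.
				"devcontainer.local_folder": "/home/coder/project",
				"devcontainer.config_file":  "/home/coder/project/.devcontainer/devcontainer.json",
			},
		})
	}
	b, err := json.Marshal(containers)
	if err != nil {
		return 0, err
	}
	return len(b), nil
}

func main() {
	size, err := payloadSize(20_000)
	if err != nil {
		panic(err)
	}
	fmt.Printf("payload: %d bytes, exceeds limit: %t\n", size, size > wsLimit)
}
```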

johnstcn

Refactor looks fine to me 👍 Thanks for de-flaking this!

mafredri

The humongous new interface does not feel ideal, but I also don't have a better suggestion at the moment. One suggestion to improve the test but otherwise LGTM.

mafredri · 2025-08-18T15:01:04Z

coderd/workspaceagents_internal_test.go

+		decodeCh := decoder.Chan()
+
+		// When: We close the `containersCh`
+		close(containersCh)


Should we have at least one test message sent first to confirm the base case works and then the exit?

DanielleMaywood · 2025-08-18

Sounds good to me

sreya · 2025-08-18T19:43:30Z

codersdk/workspacesdk/agentconn.go

 		Conn: conn,
 		opts: opts,
 	}
 }

 // AgentConn represents a connection to a workspace agent.
 // @typescript-ignore AgentConn
-type AgentConn struct {
+type AgentConn interface {


Given this is an SDK, are we really ready to commit to this just to fix a flake? It feels a little heavy-handed.

spikecurtis · 2025-08-19

Without this or something like it, it's basically impossible to test coderd routes that need to interact with agents without the whole baggage of a real tailnet and a real agent.

Our testing is straining under the weight of spinning up all this stuff to test individual API handlers.

SDKs that we develop need to be designed with mocking in mind, otherwise we can't effectively unit test code that depends on them.

sreya · 2025-08-18T19:45:33Z

@DanielleMaywood could you elaborate on the underlying cause of the flake?

DanielleMaywood · 2025-08-18T20:20:34Z

> @DanielleMaywood could you elaborate on the underlying cause of the flake?

Sure thing.

The underlying cause of the flake is that the test is just incredibly slow. On my workspace it could take up to around 10 seconds, and that is on dedicated hardware without contention. Running it in CI often results in it exceeding the testutil.WaitLong duration of 25 seconds, and relying on testutil.WaitSuperLong feels like a code smell.

You can see in this flake that it took almost 6 seconds just for the tailnet to begin accepting TCP connections (the test started at 05:29:36.490 and only accepted the connection at 05:29:42.288):
https://github.com/coder/coder/actions/runs/17031823892/job/48276142706#step:15:932

As you can see in the logs, most of the time is spent setting up the agent and the networking stack. We don't need any of this running to test for a specific behavior on coderd.

The idea behind the change to make workspacesdk.AgentConn an interface, rather than a struct, is that we can entirely eliminate the agent and the agent's tailnet. This removes a very slow part of the test, and allows more control over triggering certain behavior. This would also allow future tests to be written without this heavy dependency (so not just aiming for this specific flake).

github-actions bot assigned DanielleMaywood Aug 18, 2025

DanielleMaywood commented Aug 18, 2025

DanielleMaywood added 2 commits August 18, 2025 13:53

refactor: convert workspacesdk.AgentConn to an interface

1cc4903

We convert `workspacesdk.AgentConn` to an interface and generate a mock for it. This allows writing `coderd` tests that rely on the agent's HTTP api to not have to set up an entire tailnet networking stack.

chore: appease linter

ea4de03

DanielleMaywood force-pushed the danielle/flake/payload-too-large branch from d579e1d to ea4de03 Compare August 18, 2025 13:53

chore: comments

fa0b97e

DanielleMaywood marked this pull request as ready for review August 18, 2025 14:31

DanielleMaywood requested review from johnstcn, spikecurtis and mafredri August 18, 2025 14:44

johnstcn approved these changes Aug 18, 2025

mafredri approved these changes Aug 18, 2025

chore: add validation that channel works first

123760a

sreya reviewed Aug 18, 2025

DanielleMaywood requested a review from sreya August 20, 2025 00:12

Merge branch 'main' into danielle/flake/payload-too-large

4c9a9e6

DanielleMaywood merged commit 5e84d25 into main Aug 20, 2025
31 checks passed

DanielleMaywood deleted the danielle/flake/payload-too-large branch August 20, 2025 09:00

github-actions bot locked and limited conversation to collaborators Aug 20, 2025
