This pull-request addresses a few design things within the `<AgentRow
/>` element. This is a follow-on from the previous work done with
implementing tabs.
- Workspace border can no longer be red, will always be orange (this was
done in a previous PR but not stated).
- Warnings have been moved to inside the Agent Logs collapsible.
- Warning badge has been added to the Agent Logs collapsible trigger.
- Collapsible is now open by default when there is an error inside of
the agent.
- Agent disconnected is no longer prominent by default.
Add workspace secrets as a field in the agent manifest protobuf schema.
This allows the control plane to pass user secrets to agents for runtime
injection into workspace sessions.
Message fields:
- env_name: environment variable name (empty for file-only secrets)
- file_path: file path (empty for env-only secrets)
- value: the decrypted secret value as bytes
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps rust from `a08d20a` to `cf09adf`.
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Fixes https://github.com/coder/internal/issues/1461
Two synchronization issues caused
`TestPortableDesktop_IdleTimeout_StopsRecordings` (and the
`MultipleRecordings` variant) to flake on macOS CI:
1. **`clk.Advance(idleTimeout)` was not awaited.** In
`MultipleRecordings`, both idle timers fire simultaneously but their
`fire()` goroutines race to remove themselves from the mock clock's
event list. Without `MustWait`, the second timer may still be in `m.all`
when the next `Advance` is called, causing `"cannot advance ... beyond
next timer/ticker event in 0s"`.
2. **The test depended on SIGINT being handled promptly.** After the
`stop_timeout` timer was released, the test relied entirely on the shell
process handling SIGINT (via `rec.done`). On macOS, `/bin/sh` may not
interrupt `wait` reliably, leaving `lockedStopRecordingProcess` blocked
in its `select` while holding `p.mu` — deadlocking the
`require.Eventually` callback.
### Fix
Wait for each `Advance` to complete and advance past the 15s stop
timeout so the process is forcibly killed via the timer path,
independent of signal handling.
Verified with 1000 iterations (500 per test) with zero failures.
> Generated with [Coder Agents](https://coder.com/agents)
The GetChats SQL query ordered by (updated_at, id) DESC with no
pin_order awareness. A pinned chat with an old updated_at could
land on page 2+ and be invisible in the sidebar's Pinned section.
Add a 4-column ORDER BY: pinned-first flag DESC, negated pin_order
DESC, updated_at DESC, id DESC. The negation trick keeps all sort
columns DESC so the cursor tuple < comparison still works. Update
the after_id cursor clause to match the expanded sort key.
Fix the false handler comment claiming PinChatByID bumps updated_at.
Add dbcrypt support for user secret values. When database encryption is
enabled, secret values are transparently encrypted on write and
decrypted on read through the existing dbcrypt store wrapper.
- Wrap `CreateUserSecret`, `GetUserSecretByUserIDAndName`,
`ListUserSecretsWithValues`, and `UpdateUserSecretByUserIDAndName` in
enterprise/dbcrypt/dbcrypt.go.
- Add rotate and decrypt support for user secrets in
enterprise/dbcrypt/cliutil.go (`server dbcrypt rotate` and `server
dbcrypt decrypt`).
- Add internal tests covering encrypt-on-create, decrypt-on-read,
re-encrypt-on-update, and plaintext passthrough when no cipher is
configured.
After the cherry-pick workflow creates a backport PR, it now comments on
the original PR to notify the author with a link to the new PR.
If the cherry-pick had conflicts, the comment includes a warning.
## Changes
- Capture the URL output of `gh pr create` into `NEW_PR_URL`
- Add `gh pr comment` on the original PR with the link
- Append a conflict warning to the comment when applicable
> Generated by Coder Agents
Closes#16332
Previously `coder provisioner jobs list` showed no indication of what a workspace
build job was doing (i.e., start, stop, or delete). This adds
`workspace_build_transition` to the provisioner job metadata, exposed in
both the REST API and CLI. Template and workspace name columns were also
added, both available via `-c`.
```
$ coder provisioner jobs list -c id,type,status,"workspace build transition"
ID TYPE STATUS WORKSPACE BUILD TRANSITION
95f35545-a59f-4900-813d-80b8c8fd7a33 template_version_import succeeded
0a903bbe-cef5-4e72-9e62-f7e7b4dfbb7a workspace_build succeeded start
```
The "By model" and "Pull requests" tables on the PR Insights page
(`/agents/settings/insights`) were side-by-side at `lg` breakpoints, and
the Pull requests table was hard-capped at 20 rows by the backend.
- Replaced `lg:grid-cols-2` with a single-column stacked layout so both
tables span the full content width.
- Removed the `LIMIT 20` from the `GetPRInsightsRecentPRs` SQL query so
all PRs in the selected time range are returned.
- Can add this back if we need it. If we do, we should add a little
subheader above this table to indicate that we're not showing all PRs
within the selected timeframe.
- Added client-side pagination to the Pull requests table using
`PaginationWidgetBase` (page size 10), matching the existing pattern in
`ChatCostSummaryView`.
- Renamed the section heading from "Recent" to "Pull requests" since it
now shows the full set for the time range.
<img width="1481" height="1817" alt="image"
src="https://github.com/user-attachments/assets/0066c42f-4d7b-4cee-b64b-6680848edc68"
/>
> 🤖 PR generated with Coder Agents
When a devcontainer subagent is terraform-managed, the provisioner sets
its directory to the host-side `workspace_folder` path at build time. At
runtime, the agent injection code determines the correct
container-internal
path from `devcontainer read-configuration` and sends it via
`CreateSubAgent`.
However, the `CreateSubAgent` handler only updated `display_apps` for
pre-existing agents, ignoring the `Directory` field. This caused
SSH/terminal
sessions to land in `~` instead of the workspace folder (e.g.
`/workspaces/foo`).
Add `UpdateWorkspaceAgentDirectoryByID` query and call it in the
terraform-managed subagent update path to also persist the directory.
Fixes PLAT-118
<details><summary>Root cause analysis</summary>
Two code paths set the subagent `Directory` field:
1. **Provisioner (build time):** `insertDevcontainerSubagent` in
`provisionerdserver.go`
stores `dc.GetWorkspaceFolder()` — the **host-side** path from the
`coder_devcontainer` Terraform resource (e.g. `/home/coder/project`).
2. **Agent injection (runtime):**
`maybeInjectSubAgentIntoContainerLocked` in
`api.go` reads the devcontainer config and gets the correct
**container-internal**
path (e.g. `/workspaces/project`), then calls `client.Create(ctx,
subAgentConfig)`.
For terraform-managed subagents (those with `req.Id != nil`),
`CreateSubAgent`
in `coderd/agentapi/subagent.go` recognized the pre-existing agent and
entered
the update path — but only called `UpdateWorkspaceAgentDisplayAppsByID`,
discarding the `Directory` field from the request. The agent kept the
stale
host-side path, which doesn't exist inside the container, causing
`expandPathToAbs` to fall back to `~`.
</details>
> [!NOTE]
> Generated by Coder Agents
Previously, editing a past user message in Agents chat waited for the
PATCH round-trip and cache reconciliation before the conversation
visibly settled. The edited bubble and truncated tail could briefly fall
back to older fetched state, and a failed edit did not restore the full
local editing context cleanly.
Keep history editing optimistic end-to-end: update the edited user
bubble and truncate the tail immediately, preserve that visible
conversation until the authoritative replacement message and cache catch
up, and restore the draft/editor/attachment state on failure. The route
already scopes each `agentId` to a keyed `AgentChatPage` instance with
its own store/cache-writing closures, so navigating between chats does
not need an extra post-await active-chat guard to keep one chat's edit
response out of another chat.
Fixes https://github.com/coder/internal/issues/1440
- Convert `OrganizationAutocomplete` to a purely presentational, fully
controlled component
- Accept `value`, `onChange`, `options` from parent; remove internal
state, data fetching, and permission filtering
- Update `CreateTemplateForm` and `CreateUserForm` to own org fetching,
permission checks, auto-select, and invalid-value clearing inline
- Memoize `orgOptions` in callers for stable `useEffect` deps
- Rewrite Storybook stories for the new controlled API
> 🤖 Written by a Coder Agent. Reviewed by a human.
Closes CODAGT-123
Assistant messages containing only source parts (no markdown or
reasoning)
were missing the bottom spacer that normally fills the gap left by the
hidden
action bar, causing them to sit flush against the next user bubble.
The existing fallback spacer guarded on `Boolean(parsed.reasoning)`, so
it
only fired for thinking-only replies. Replace that guard with the
broader
`hasRenderableContent` flag (which covers blocks, tools, and sources)
and
extract a named `needsAssistantBottomSpacer` boolean so future content
types
inherit consistent spacing without re-reading compound conditions.
Adds a `SourcesOnlyAssistantSpacing` Storybook story mirroring the
existing
`ThinkingOnlyAssistantSpacing` pattern for regression coverage.
Closes CODAGT-124
When a streaming assistant response finishes and moves from the live
stream
tail into the conversation timeline, the message jumps 4px upward. This
happens because the outer layout wrapper and live-stream section both
used
`gap-3` (12px), while the committed-message list used `gap-2` (8px).
Unify all three containers to `gap-2` so the gap between messages stays
at 8px regardless of whether they're streaming or committed, eliminating
the layout shift.
A Storybook story with play-function assertions locks the invariant: it
renders both committed messages and an active stream, then verifies both
the outer and inner containers report `rowGap === "8px"`.
Documents the private/reserved IP range restrictions added to AI Gateway
Proxy:
- **Restricting proxy access**: Updated to reflect that private/reserved
IP ranges are now blocked by default, with atomic IP validation to
prevent DNS rebinding. Documents the Coder access URL exemption and the
`CODER_AIBRIDGE_PROXY_ALLOWED_PRIVATE_CIDRS` option.
- **Upstream proxy**: Added a note on the DNS rebinding limitation when
an upstream proxy is configured, and that upstream proxies should
enforce their own restrictions.
> [!NOTE]
> Initially generated by Coder Agents, modified and reviewed by
@ssncferreira
Follow-up: #23109
Fixes a regression introduced in #24060 that could crash the frontend.
`thunk` is created by `useEffectEvent()`, and React 19.2 enforces that
effect-event functions are not invoked during render. The previous code
called `thunk()` inside a `setState` updater function, and React
executes updater
functions during render, so this became an illegal render-phase call.
The fix computes `next` in the interval callback (`const next =
thunk()`) and then stores it via `setComputedValue(() => next)`. This
keeps the `useEffectEvent` call outside render and also preserves
correct behavior when `func` returns a function value, because React
stores `next` instead of treating it as a functional updater.
Go's html/template has a built-in security filter (urlFilter) that only
allows http, https, and mailto URL schemes. Any other scheme gets
replaced with #ZgotmplZ.
The OAuth2 app's callback URL uses custom URI scheme which the filter
considers unsafe. For example the Coder JetBrains plugin exposes a
callback URI with the scheme jetbrains:// - which was effectively
changed by the template engine into #ZgotmplZ. Of course this is not an
actual callback. When users clicked the cancel button nothing happened.
The fix was simple - we now wrap the apps registered callback URI into
htmltemplate.URL. Usually this needs some validation otherwise the
linter will complain about it. The callback URI used by the Cancel logic
is actually validated by our backend when the client app
programmatically registered via the dynamic OAuth2 registration
endpoints, so we refactored the validation around that code and re-used
some of it in the Cancel handling to make sure we don't allow URIs like
`javascript` and `data`, even though in theory these URIs were already
validated.
In addition, while testing this PR with
https://github.com/coder/coder-jetbrains-toolbox/pull/209 I discovered
that we are also not compliant with
https://www.rfc-editor.org/rfc/rfc6749#section-4.1.2.1 which requires
the server to attach the local state if it was provided by the client in
the original request. Also it is optional but generally a good practice
to include `error_description` in the error responses. In fact we follow
this pattern for the other types of error responses. So this is not a
one off.
- resolves#20323
<img width="1485" height="771" alt="Cancel_page_with_invalid_uri"
src="https://github.com/user-attachments/assets/5539d234-9ce3-4dda-b421-d023fc9aa99e"
/>
<img width="486" height="746" alt="Coder Toolbox handling the Cancel
button"
src="https://github.com/user-attachments/assets/acab71a6-d29c-4fa9-80ba-3c0095bbdc8f"
/>
<!--
If you have used AI to produce some or all of this PR, please ensure you
have read our [AI Contribution
guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING)
before submitting.
-->
\`InlineMarkdown\` and \`MemoizedInlineMarkdown\` lived in
\`Markdown.tsx\`
alongside a static \`import { Prism as SyntaxHighlighter } from
"react-syntax-highlighter"\` — the full PrismJS build with ~300 language
grammars. Because \`DashboardLayout\` eagerly imports
\`AnnouncementBannerView → InlineMarkdown\`, every authenticated page
loaded and evaluated the entire Prism/refractor bundle on startup even
though syntax highlighting is only used in secondary views.
This PR moves \`InlineMarkdown\` and \`MemoizedInlineMarkdown\` into
their
own \`InlineMarkdown.tsx\` file that depends only on \`react-markdown\`
and
updates all six consumers to import from the new module.
\`Markdown.tsx\`
keeps the PrismJS import for the full \`Markdown\` component, which is
only reached through lazy-loaded routes.
> 🤖 Generated by Coder Agents
Reorder error checks in isRetryableError so IsConnectionError is evaluated before context.DeadlineExceeded. Dial timeouts (*net.OpError wrapping DeadlineExceeded) were incorrectly treated as non-retryable, causing Coder Connect to fail immediately on broken tunnels with valid DNS despite existing retry logic.
Fixes#24201
Add the five REST endpoints for managing user secrets, SDK client
methods, and handler tests.
Endpoints:
- `POST /api/v2/users/{user}/secrets`
- `GET /api/v2/users/{user}/secrets`
- `GET /api/v2/users/{user}/secrets/{name}`
- `PATCH /api/v2/users/{user}/secrets/{name}`
- `DELETE /api/v2/users/{user}/secrets/{name}`
Routes are registered under the existing `/{user}` group with
`ExtractUserParam`. The delete query was changed from `:exec` to
`:execrows` so the handler can distinguish "not found" from success
(DELETE with `:exec` silently returns nil for zero affected rows).
## What
Bumps `coder/tailscale` to
[`e956a95`](e956a95074)
([PR #113](https://github.com/coder/tailscale/pull/113)) to pick up the
`RTM_MISS` fix for the Darwin network monitor.
Already released on `release/2.31` as v2.31.8. (#24185) to unblock a
customer. This PR is to update `main`.
## Why
On Darwin, `RTM_MISS` route-socket messages (fired on every failed route
lookup) were not filtered by `netmon`, causing each one to be treated as
a `LinkChange`. When netcheck sends STUN probes to an IPv6 address with
no route, this creates a self-sustaining feedback loop: `RTM_MISS` →
`LinkChange` → `ReSTUN` → netcheck → v6 STUN probe → `RTM_MISS` → …
The loop drives DERP home-region flapping at ~70× baseline, which at
fleet scale saturates PostgreSQL's `NOTIFY` lock and causes coordinator
health-check timeouts.
The upstream fix adds a single `if msg.Type == unix.RTM_MISS { return
true }` check to `skipRouteMessage`. This is safe because `RTM_MISS` is
a lookup-path signal, not a table-mutation signal — route withdrawals
always emit `RTM_DELETE` before any subsequent lookup can miss.
Of note is that this issue has only been reported recently, since users
updated to macOS 26.4.
Relates to ENG-2394
## Summary
Exposes `credential_kind` and `credential_hint` on AI Bridge session
threads, making credential metadata visible in the session detail API.
Each thread in the `/api/v2/aibridge/sessions/{session_id}` response now
includes:
- `credential_kind`: `centralized` or `byok`
- `credential_hint`: masked credential (e.g. `sk-a...pgAA`)
Values are taken from the thread's root interception.
## Changes
- `codersdk/aibridge.go`: Added `CredentialKind` and `CredentialHint`
fields to `AIBridgeThread`
- `coderd/database/db2sdk/db2sdk.go`: Populated from root interception
in `buildAIBridgeThread`
- `SessionTimeline.stories.tsx`: Added fields to mock thread data
The startup-timeout integration tests in `chatloop` used a 5ms real-time
budget and relied on wall-clock scheduling to fire the startup guard
timer before the first stream part arrived. On loaded CI runners the
timer sometimes lost the race, producing `attempts == 2` instead of
`attempts == 1` and flaking `TestRun_FirstPartDisarmsStartupTimeout`.
Replace the real `time.Timer` in `startupGuard` with a `quartz.Timer` so
tests can control time deterministically. Production behavior is
unchanged: `RunOptions.Clock` defaults to `quartz.NewReal()` when nil,
and the startup timeout still covers both opening the provider stream
and waiting for the first stream part.
- Add `RunOptions.Clock quartz.Clock` with nil-safe default.
- Tag the startup guard timer as `"startupGuard"` for quartz trap
targeting.
- Rewrite the four startup-timeout integration tests to use
`quartz.NewMock(t)` with trap/advance/release sequences instead of
wall-clock sleeps.
- Add `awaitRunResult` helper so tests fail with a clear message instead
of hanging when `Run` does not complete.
Closes https://github.com/coder/internal/issues/1460
Adds a GitHub Actions workflow that runs on PRs targeting `release/*`
branches to flag non-bug-fix cherry-picks.
## What it does
- Triggers on `pull_request_target` (opened, reopened, edited) for
`release/*` branches
- Checks if the PR title starts with `fix:` or `fix(scope):`
(conventional commit format)
- If not a bug fix, comments on the PR informing the author and emits a
warning (via `core.warning`), but does **not** fail the check
- Deduplicates comments on title edits by updating an existing comment
(identified by a hidden HTML marker) instead of creating a new one
> [!NOTE]
> Generated by Coder Agents
Adds `coder exp chat context add` and `coder exp chat context clear`
commands that run inside a workspace to manage chat context files via
the agent token.
`add` reads instruction and skill files from a directory (defaulting to
cwd) and inserts them as context-file messages into an active chat.
Multiple calls are additive — `instructionFromContextFiles` already
accumulates all context-file parts across messages.
`clear` soft-deletes all context-file messages, causing
`contextFileAgentID()` to return `!found` on the next turn, which
triggers `needsInstructionPersist=true` and re-fetches defaults from the
agent.
Both commands auto-detect the target chat via `CODER_CHAT_ID` (already
set by `agentproc` on chat-spawned processes), or fall back to
single-active-chat resolution for the agent. The `--chat` flag overrides
both.
Also adds sub-agent context inheritance: `createChildSubagentChat` now
copies parent context-file messages to child chats at spawn time, so
delegated sub-agents share the same instruction context without
independently re-fetching from the workspace agent.
<details><summary>Implementation details</summary>
**New files:**
- `cli/exp_chat.go` — CLI command tree under `coder exp chat context`
**Modified files:**
- `agent/agentcontextconfig/api.go` — `ConfigFromDir()` reads context
from an arbitrary directory without env vars
- `codersdk/agentsdk/agentsdk.go` — `AddChatContext`/`ClearChatContext`
SDK methods
- `coderd/workspaceagents.go` — POST/DELETE handlers on
`/workspaceagents/me/chat-context`
- `coderd/coderd.go` — Route registration
- `coderd/database/queries/chats.sql` — `GetActiveChatsByAgentID`,
`SoftDeleteContextFileMessages`
- `coderd/database/dbauthz/dbauthz.go` — RBAC implementations for new
queries
- `coderd/x/chatd/subagent.go` — `copyParentContextFiles` for sub-agent
inheritance
- `cli/root.go` — Register `chatCommand()` in `AGPLExperimental()`
**Auth pattern:** Uses `AgentAuth` (same as `coder external-auth`) —
agent token via `CODER_AGENT_TOKEN` + `CODER_AGENT_URL` env vars.
</details>
> 🤖 Generated by Coder Agents
---------
Co-authored-by: Michael Suchacz <203725896+ibetitsmike@users.noreply.github.com>
## Problem
`go run` caches the final linked executable in `~/.cache/go-build`.
Every
helper invocation via `go run ./scripts/<tool>` stores a copy, and
because
the cache key includes build metadata, the same tool accumulates
multiple
cached executables over time. With 12+ helper binaries invoked during
`make gen` and `make pre-commit`, this is a meaningful contributor to
GOCACHE growth.
## Fix
Replace `go run` with `go build -o _gen/bin/<tool>` for 12 repo-local
helper packages (16 Makefile callsites). Each helper is an explicit Make
file target with `$(wildcard *.go)` prerequisites, so `make -j`
serializes
builds correctly instead of racing on shared output paths.
Helpers converted: `apitypings`, `auditdocgen`, `check-scopes`,
`clidocgen`, `dbdump`, `examplegen`, `gensite`, `apikeyscopesgen`,
`metricsdocgen`, `metricsdocgen-scanner`, `modeloptionsgen`, `typegen`.
Left on `go run` (intentionally): `migrate-ci` and `migrate-test`
(CI/test-only, not on common developer paths).
`_gen/` is already in `.gitignore`. The `clean` target removes
`_gen/bin`.
## GOCACHE growth (isolated cache, single `make gen`)
| | Old (`go run`) | New (`go build -o`) |
|--|----------------|---------------------|
| Total cache size | 2.9 GB | 2.6 GB |
| Cached executables | 11 | 4 |
| Executable bytes | 401 MB | 25 MB |
The 4 remaining executables come from tools outside this change
(`dbgen` and `goimports` from `generate.sh`, plus two `main` binaries
from deferred helpers). Helper binaries now live in `_gen/bin/`
(581 MB, gitignored, cleaned by `make clean`).
## Build time benchmarks
**Source changed** (content hash invalidated, forces recompile):
| Helper | `go run` | `go build -o` + run | Overhead |
|--------|---------|---------------------|----------|
| typegen | 1.50s | 2.03s | +0.52s |
| examplegen | 1.37s | 1.67s | +0.30s |
| apikeyscopesgen | 1.21s | 1.71s | +0.50s |
| modeloptionsgen | 1.23s | 1.64s | +0.41s |
**Repeat invocation** (no source change, the common `make gen` / `make
pre-commit` path):
| Helper | `go run` (cache lookup) | Cached binary | Speedup |
|--------|------------------------|---------------|---------|
| typegen | 0.346s | 0.037s | 9.4x |
| examplegen | 0.368s | 0.037s | 9.9x |
| modeloptionsgen | 0.342s | 0.021s | 16.3x |
| apikeyscopesgen | 0.298s | 0.030s | 9.9x |
When source changes, `go build -o` is 0.3-0.5s slower per helper (it
writes a local binary instead of caching in GOCACHE). On repeat runs
(the common path), the pre-built binary is 10-16x faster because
`go run` still does a staleness check while the binary just executes.
> This PR was authored by Mux on behalf of Mike.
the current page has an "Agentic loop completed" block that doesn't
really contain any valuable info that isn't available elsewhere. replace
this with a status indicator
<img width="507" height="300" alt="Screenshot 2026-04-08 at 2 47 40 PM"
src="https://github.com/user-attachments/assets/09cf3772-a52d-485d-a15e-b2257b2d9003"
/>
couple of little design tweaks to make the UI of the Request Logs page
and Sessions pages more consistent:
- decrease size of Request Logs page chevron
- copy Request Logs page chevron animation for Sessions expandable
sections
- use TokenBadges component in RequestLogsRow
- wrap tool call counts in badges
<img width="1393" height="210" alt="Screenshot 2026-04-08 at 1 56 10 PM"
src="https://github.com/user-attachments/assets/97e7acb6-71c7-48d6-b0df-a102c7602cc0"
/>
Frontend for https://github.com/coder/coder/pull/24022.
From that PR's description:
> The agents chat interface displays thumbnails for videos recorded by
the computer use agent. Currently, to display a thumbnail, the frontend
downloads the entire video and shows the first frame.
#24022 adds a thumbnail file id to `wait_agent` tool results, and this
PR displays it instead of fetching the entire video.
The agents chat interface displays thumbnails for videos recorded by the
computer use agent. Currently, to display a thumbnail, the frontend
downloads the entire video and shows the first frame. This PR starts
storing a new thumbnail file in the database for every recorded video,
and exposes the file id in the `wait_agent` tool result alongside the
recording file id, so the frontend can fetch just the thumbnail.
The cherry-pick and backport workflows create PRs under
`github-actions[bot]`. Since GitHub doesn't support creating PRs on
behalf of another user, this adds attribution to the user who added the
label (`github.event.sender.login`):
- **Assignee**: the labeler is assigned to the backport PR
- **Reviewer**: the labeler is added as a reviewer
- **PR body**: includes "Requested by: @user"
Applied to both `cherry-pick.yaml` and `backport.yaml`.
---
> Generated by Coder Agents
## Motivation
During the April 2 dogfood incident, a pod OOM-kill triggered a
reconnection storm: hundreds of chat-stream and agent-RPC websockets all
attempted to reconnect at the same deterministic backoff intervals (1 s,
2 s, 4 s, …). Because every browser tab computed the same delay, the
surviving replicas received a synchronized wall of new connections at
each retry tick, amplifying the overload that caused the first OOM in
the first place.
The root cause of the memory blowup (chatd serialization cost) is a
separate issue. This change addresses the secondary blast-radius
problem: when N clients reconnect in lockstep, the retry storm itself
becomes a capacity threat.
## Change
The shared `createReconnectingWebSocket` utility now applies symmetric
jitter (default ±30%) to the capped exponential-backoff delay before
scheduling the reconnect timer. With 100 clients and a 1 s base delay,
reconnects spread over the 700 ms–1300 ms window instead of all landing
at exactly 1000 ms, and once retries hit `maxMs` the scheduler still
preserves downward spread instead of collapsing back to a single tick.
Two new options are accepted by callers:
- **`jitter`** (0–1 fraction, default `0.3`) — controls the jitter
window. Values are clamped to `[0, 1]`; `0` preserves exact legacy
timing.
- **`random`** (`() => number`, default `Math.random`) — injectable RNG,
primarily a deterministic test seam. Non-finite output falls back to the
midpoint (`0.5`).
The `retryingAt` timestamp surfaced to `ChatStatusCallout` is computed
from the jittered delay, so the countdown shown to users reflects the
actual retry time. The scheduler also keeps `maxMs` as a hard ceiling on
the final delay and saturates exponential overflow at that cap instead
of dropping to `0ms` retries.
No production callers need changes — the default jitter activates
automatically for all four call sites (`AgentsPage` chat-list watcher,
`AgentChatPage` workspace watcher, `useChatStore` per-chat stream,
`useGitWatcher`). The two downstream tests that asserted exact reconnect
timing now pin `Math.random()` to `0.5` so those expectations stay
deterministic.
This pull-request makes a few changes to our `<Badge />` component to
bring it inline with Figma.
* Added all variants to the stories of Figma (they can vary per
badge-type, so its better we track everything).
* Removed the `border` variant of the component, border variants should
be on all `sm` and `md`.
* Added a hover effect to the `default` variant (per-design).
* Resolved issue with sizings of `xs` and `sm` plus resolved
iconography.
* Resolved issue with icons not showing at all on `xs` variants.
closes CODAGT-122
Add a spacer div that renders only when an assistant message lacks the
action bar, matching the height the action bar would provide.
> 🤖 Generated by Coder Agents
## Problem
Workspaces showed as "Healthy" immediately after creation while the
agent was still downloading, starting, or connecting. If the agent never
connected, the workspace stayed "Healthy" for the entire connection
timeout (~120s), then abruptly flipped to "Unhealthy".
## Root cause
In `db2sdk.WorkspaceAgent`, the health switch had no case for
`WorkspaceAgentConnecting`. Agents in `connecting` status with a
non-`off` lifecycle (e.g. `created` after a fresh build) fell through to
the `default` case and were marked `Healthy = true`.
## Fix
Add an explicit case for `WorkspaceAgentConnecting` that sets `Healthy =
false` with reason `"agent has not yet connected"`. The case is placed
after the existing `!connected + off` case (which correctly catches
stopped agents as "not running") and before the `timeout`/`disconnected`
cases.
```
Status + Lifecycle → Health reason
──────────────────────────────────────────────────────
any !connected + off → "agent is not running"
connecting + created/starting → "agent has not yet connected" ← NEW
timeout + any → "agent is taking too long to connect"
disconnected + any → "agent has lost connection"
connected + start_error → "agent startup script exited with an error"
connected + shutting_down → "agent is shutting down"
connected + ready/starting → healthy
```
The frontend already handles this case — `getAgentHealthIssue()` returns
"Workspace agent is still connecting" with `severity: "info"` for
unhealthy workspaces with connecting agents.
## Test changes
- **Healthy test**: now actually connects the agent via `agenttest.New`
before asserting health (previously passed due to the bug).
- **New Connecting test**: verifies a never-connected agent is correctly
marked unhealthy.
- **Mixed health test**: connects a1 and waits for the mixed state
(`a1.Healthy && !workspace.Healthy`) to avoid a race where both agents
are initially connecting.
- **Sub-agent excluded test**: connects the parent agent and waits for
it to be healthy before creating the sub-agent.
- **TestWorkspaceAgent/Connect**: flipped assertion to `Health.Healthy
== false` for a `dbfake` agent that never connects.
<details>
<summary>Review notes</summary>
### Known follow-up
The `healthy:false` workspace search filter maps to `[disconnected,
timeout]` and does not include `connecting`. This is a pre-existing gap
that is now more consequential — a workspace unhealthy solely due to a
connecting agent won't appear in `healthy:false` results. Worth a
follow-up issue.
### Deep review findings addressed
| Finding | Severity | Status |
|---------|----------|--------|
| Mixed health test race (all 3 reviewers) | P2 | Fixed — tightened
`Eventually` condition |
| `TestWorkspaceAgent/Connect` assertion break | P1 | Fixed — flipped
assertion |
| CLI renders red for connecting agents | Obs | Acknowledged — design
trade-off, accurate but visually strong for transient state |
| Switch case ordering overlap | Obs | Documented with inline comment |
</details>
> 🤖 This PR was created with the help of Coder Agents, and needs a human
review. 🧑💻
Replace Tooltip with `HelpPopover` in the "New workspace" page header.
`HelpPopover` supports interactive content like links and provides
better layout control, making it a better fit for this use case.
Renames the "Security implications" section to "Security posture" and
reframes the intro paragraph. "Implications" reads as a caveat or
warning; the section actually describes built-in structural guarantees
of the control plane architecture.
> PR generated with Coder Agents
Workspace agent logs could still fail after the earlier invalid UTF-8
fix because NUL bytes are valid Go/protobuf strings but are rejected by
Postgres text columns. The legacy HTTP log upload path also bypassed the
old sanitization entirely, and both server insert paths computed
logs_length from the unsanitized input.
Add a shared log-output sanitizer in agentsdk, use it in the protobuf
conversion path and both server-side insert paths, and compute
OutputLength from the sanitized string so overflow accounting matches
what is actually stored. This keeps the old invalid UTF-8 behavior while
also handling embedded NUL bytes consistently across DRPC and HTTP log
ingestion.
Refs [#23292 ](https://github.com/coder/coder/issues/23292)
Refs [#13433 ](https://github.com/coder/coder/issues/13433)
Adds backend validation for user secret environment variable names and file paths.
Env name validation enforces POSIX naming rules and blocks a deliberately aggressive denylist of reserved names and prefixes. The denylist errs on the side of blocking too much since it's easier to remove entries later than to add them after users have created conflicting secrets.
File path validation requires paths to start with ~/ or /.
Fixes several documentation gaps and inaccuracies in the Coder Agents
docs identified during a deep review against the current product state.
## BYOK (User API Keys)
`models.md` stated *"Developers cannot add their own providers, models,
or API keys"* — this has been incorrect since the provider key policy
system shipped (Apr 2, #23751/#23781).
- Added **Key policy** section documenting the three admin toggles
(`central_api_key_enabled`, `allow_user_api_key`,
`allow_central_api_key_fallback`) with a truth table showing all
resolution outcomes
- Added **User API keys (BYOK)** section covering the developer-facing
key management page, status indicators, selection priority, and key
removal
- Updated `platform-controls/index.md` to reference BYOK instead of
claiming keys are admin-only
## Reasoning effort enum fixes
- **OpenAI**: removed `none` — code accepts `minimal, low, medium, high,
xhigh`
- **OpenRouter**: narrowed to `low, medium, high` per
`ReasoningEffortFromChat` in `chatprovider.go`
## Tool table completeness
- Added `spawn_computer_use_agent`, `read_skill`, `read_skill_file` to
`index.md` tool table
- Added "Workspace extension tools" section to `architecture.md` for
`read_skill`/`read_skill_file`
- Fixed orchestration restriction note to list all 5 gated tools instead
of just `spawn_agent`
- Added conditional availability notes for desktop and skills tools
## Platform controls
Three admin-only settings existed in the Behavior tab with no
documentation:
- **Virtual desktop** — admin toggle, Anthropic + portabledesktop
requirements
- **Workspace autostop fallback** — default TTL for agent workspaces
without template-defined autostop
- **Data retention** — moved `chat-retention.md` into
`platform-controls/` since it's admin-only, fixed nav path
---
> PR generated with Coder Agents
Dev and RC builds now show diagonal warning stripes in the navbar plus a
centered version badge, making it impossible to miss which build you're
running.
**Devel build:** amber "warning" from theme
**RC build:** sky "pending" from theme
> 🤖 Written by a Coder Agent. Will be reviewed by a human.
## Summary
Adds `credential_kind` and `credential_hint` columns to
`aibridge_interceptions` to record how each LLM request was
authenticated and provide a masked credential identifier for audit
purposes.
This enables admins to distinguish between centralized API keys,
personal API keys, and subscription-based credentials in the
interceptions audit log.
## Changes
- New migration adding `credential_kind`and `credential_hint` to
`aibridge_interceptions`
- Updated `InsertAIBridgeInterception` query and proto definition to
carry the new fields
- Wired proto fields through `translator.go` and `aibridgedserver.go` to
the database
Depends on https://github.com/coder/aibridge/pull/239
Update the release calendar table now that v2.31.7 has been promoted to
stable (`latest` on GitHub Releases).
## Changes
| Release | Old Status | New Status | Latest Patch |
|---------|-----------|------------|-------------|
| 2.31 | Mainline | Stable | v2.31.7 |
| 2.30 | Stable | Security Support | v2.30.6 |
| 2.29 | Security Support + ESR | Extended Support Release | v2.29.9 |
---
> **Note:** The auto-generation script
(`scripts/update-release-calendar.sh`) determines status positionally
from the latest non-RC tag, so it will always mark the latest minor
version as "Mainline". This manual update is needed to reflect the
promotion of 2.31 to stable.
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Adds an optional `CreatedAt` timestamp to `tool-call` and `tool-result`
`ChatMessagePart` variants so the frontend can compute tool execution
duration (`result.created_at - call.created_at`).
Timestamps are recorded at the correct moments in the chatloop:
- **Tool-call**: when the model stream emits the tool call
- **Tool-result**: when tool execution completes (or is interrupted)
These are passed through `PersistedStep.PartCreatedAt` so the
persistence layer can apply accurate timestamps to stored parts.
SSE-published parts also carry `CreatedAt` for real-time display.
Old persisted messages without `created_at` deserialize to `nil` — fully
backward compatible.
<details><summary>Implementation notes (Coder Agents
generated)</summary>
### Why not stamp in `PartFromContent`?
`PartFromContent` is called both for SSE publishing (correct timing) and
during persistence (wrong timing — both tool-call and tool-result would
get the same "persistence time" timestamp, yielding ~0 duration).
Instead, timestamps are captured in the chatloop at the right moments
and carried through `PersistedStep.PartCreatedAt` as a
`map[string]time.Time` keyed by `"call:<id>"` / `"result:<id>"`.
### Interrupted tool calls
`persistInterruptedStep` also stamps `CreatedAt` on synthetic error
results for cancelled/interrupted tool calls, so partial duration is
available.
### Files changed
| File | Change |
|------|--------|
| `codersdk/chats.go` | Add `CreatedAt *time.Time` field |
| `codersdk/chats_test.go` | JSON round-trip test |
| `coderd/database/dbtime/dbtime.go` | Add `TimePtr` helper |
| `coderd/x/chatd/chatloop/chatloop.go` | Track timestamps, pass through
`PersistedStep` |
| `coderd/x/chatd/chatd.go` | Apply timestamps during persistence |
| `coderd/x/chatd/chatprompt/chatprompt_test.go` | Verify
`PartFromContent` does NOT stamp |
| `site/src/api/typesGenerated.ts` | Auto-generated |
</details>
---------
Co-authored-by: Ethan <39577870+ethanndickson@users.noreply.github.com>
## What
Two small docs improvements for AI Bridge:
1. **`setup.md` – Structured Logging section**: Added a `record_type`
table documenting the six event types emitted by AI Bridge structured
logs (`interception_start`, `interception_end`, `token_usage`,
`prompt_usage`, `tool_usage`, `model_thought`) along with their key
fields. Previously only the `"interception log"` message prefix was
mentioned.
2. **`monitoring.md`**: Added a "Structured Logging" section that
cross-links to `setup.md#structured-logging`, so users landing on the
monitoring page can discover the feature without navigating to the setup
guide first.
<details><summary>Source reference</summary>
Record types and fields were extracted from
`enterprise/aibridgedserver/aibridgedserver.go` where they are emitted
as `slog.F("record_type", "...")` string literals under the
`InterceptionLogMarker` (`"interception log"`) message.
</details>
Adds client-executed dynamic tools to the chat API. Dynamic tools are
declared by the client at chat creation time, presented to the LLM
alongside built-in tools, but executed by the client rather than chatd.
This enables external systems (Slack bots, IDE extensions, Discord bots,
CI/CD integrations) to plug custom tools into the LLM chat loop without
modifying chatd's built-in tool set.
Modeled after OpenAI's Assistants API: the chat pauses with
`requires_action` status when the LLM calls a dynamic tool, the client
POSTs results back via `POST /chats/{id}/tool-results`, and the chat
resumes.
See [this example](https://github.com/coder/coder-slackbot-poc) as a
reference for how this is used. It's highly-configurable, which would
enable creating chats from webhooks, periodically polling, or running as
a Slackbot.
<details>
<summary>Design context</summary>
### Architecture
The chatloop **exits** when it encounters dynamic tools and
**re-enters** when results arrive. No blocking channels, no pubsub for
tool results, no in-memory registry. The DB is the only coordination
mechanism.
```
Phase 1 (chatloop):
LLM response → execute built-in tools only →
Persist(assistant + built-in results) →
status = requires_action → chatloop exits
Phase 2 (POST /tool-results):
Persist(dynamic tool results) →
status = pending → wakeCh → chatloop re-enters
```
### Validation (POST /tool-results)
1. Chat status must be `requires_action` (409 if not)
2. Read chat's `dynamic_tools` → set of dynamic tool names
3. Read last assistant message → extract tool-call parts matching
dynamic tool names
4. Submitted tool_call_ids must match exactly (400 for missing/extra)
5. Persist tool-result message parts, set status to `pending`, signal
wake
### Idempotency
Tool call IDs scoped per LLM step. State machine (`requires_action` →
`pending`) is the guard. First POST wins, subsequent get 409.
### Mixed tool calls
When the LLM calls both built-in and dynamic tools in one step, built-in
tools execute immediately. Their results are persisted in phase 1.
Dynamic tool results arrive via POST in phase 2. The LLM sees all
results when the chatloop resumes.
</details>
> 🤖 Generated by Coder Agents
## Problem
In linked worktrees, Git hooks inherit multiple repo-local environment
variables: `GIT_DIR`, `GIT_COMMON_DIR`, `GIT_INDEX_FILE`, and others.
The
pre-commit and pre-push hooks only unset `GIT_DIR`, leaving the rest in
place.
When `make pre-commit` runs `go build`, Go tries to stamp VCS info by
shelling
out to `git`. With the leftover partial Git environment, `git` exits 128
and
the build fails:
```
error obtaining VCS status: exit status 128
Use -buildvcs=false to disable VCS stamping.
```
This only happens inside hooks in a linked worktree — running `make
pre-commit`
directly from the terminal works fine because the repo-local vars are
not set.
## Fix
Replace the bare `unset GIT_DIR` in both hooks with a loop that clears
every
variable reported by `git rev-parse --local-env-vars`:
```sh
while IFS= read -r var; do
unset "$var"
done < <(git rev-parse --local-env-vars)
```
This covers all 15 repo-local variables Git may inject (`GIT_DIR`,
`GIT_COMMON_DIR`, `GIT_INDEX_FILE`, `GIT_OBJECT_DIRECTORY`, etc.) and is
forward-compatible — if Git adds new local vars in the future, the loop
picks
them up automatically.
The agent SSH server unconditionally allows all four SSH forwarding
paths (TCP local, TCP reverse, Unix local, Unix reverse). This is a
sandbox escape vector when workspaces are used for AI agent containment
— a reverse tunnel lets anything inside the workspace reach the user's
local machine, bypassing network isolation.
This adds two new agent CLI flags / environment variables:
- `--block-reverse-port-forwarding` /
`CODER_AGENT_BLOCK_REVERSE_PORT_FORWARDING` — blocks both TCP (`ssh -R`)
and Unix socket reverse forwarding
- `--block-local-port-forwarding` /
`CODER_AGENT_BLOCK_LOCAL_PORT_FORWARDING` — blocks both TCP (`ssh -L`)
and Unix socket local forwarding
Template admins can set these via the `env` block on the container/VM
resource that runs the agent (e.g. `docker_container`,
`kubernetes_pod`), or via `coder_env` resources tied to the agent.
Fixes https://github.com/coder/coder/issues/22275
<details>
<summary>Implementation notes</summary>
Follows the existing `BlockFileTransfer` pattern:
1. `agent/agentssh/agentssh.go` — New `BlockReversePortForwarding` and
`BlockLocalPortForwarding` fields on `Config`. TCP callbacks check these
before allowing forwarding. The `direct-streamlocal@openssh.com` channel
handler is wrapped to reject Unix local forwards.
2. `agent/agentssh/forward.go` — `forwardedUnixHandler` gains a
`blockReversePortForwarding` field to reject
`streamlocal-forward@openssh.com` requests.
3. `agent/agent.go` — New fields on `Options` and `agent` struct,
plumbed to SSH config.
4. `cli/agent.go` — New serpent flags with env vars.
5. Tests cover all four blocked paths: TCP local, TCP reverse, Unix
local, Unix reverse.
</details>
> 🤖 Generated by Coder Agents
Adds a GitHub Actions workflow that automatically cherry-picks merged
PRs to the last 3 release branches when the `backport` label is applied.
## How it works
1. Add the `backport` label to any PR targeting `main` (before or after
merge).
2. On merge (or on label if already merged), the workflow discovers the
latest 3 `release/*` branches by semver.
3. For each branch, it cherry-picks the merge commit (`-x -m1`) and
opens a PR.
Created backport PRs follow existing repo conventions:
- **Branch:** `backport/<pr>-to-<version>`
- **Title:** `<original PR title> (#<pr>)` — e.g. `fix(site): correct
button alignment (#12345)`
- **Body:** links back to the original PR and merge commit
If cherry-pick has conflicts, the PR is still opened with instructions
for manual resolution — no conflict markers are committed.
Also:
- Removes `scripts/backport-pr.sh` (replaced by this workflow)
- Removes `.github/cherry-pick-bot.yml` (old bot config)
- Adds a section to the contributing docs explaining how to use the
backport label
> [!NOTE]
> Generated with [Coder Agents](https://coder.com/agents)
Adds a GitHub Actions workflow that cherry-picks merged PRs to the
latest release branch when the `cherry-pick` label is applied.
## How it works
1. Add the `cherry-pick` label to any PR targeting `main` (before or
after merge).
2. On merge (or on label if already merged), the workflow detects the
latest `release/*` branch.
3. It cherry-picks the merge commit (`-x -m1`) and opens a PR.
This complements the `backport` label (see #24025) which targets the
latest **3** release branches. `cherry-pick` targets only the **latest**
one — useful for getting fixes into the current release.
Created PRs follow existing repo conventions:
- **Branch:** `backport/<pr>-to-<version>`
- **Title:** `<original PR title> (#<pr>)` — e.g. `fix(site): correct
button alignment (#12345)`
- **Body:** links back to the original PR and merge commit
If the cherry-pick encounters conflicts, the workflow aborts the
cherry-pick, creates an empty commit with resolution instructions, and
opens the PR with a `[CONFLICT]` prefix so the author can resolve
manually.
Also:
- Removes `scripts/backport-pr.sh` (replaced by this workflow)
- Removes `.github/cherry-pick-bot.yml` (old bot config)
- Adds a section to the contributing docs explaining the `cherry-pick`
label
> [!NOTE]
> Generated with [Coder Agents](https://coder.com/agents)
Adds telemetry collection for the agents chat system (`/agents`) to the
existing telemetry snapshot pipeline.
Three new snapshot fields:
- **`Chats`** — per-chat metadata (id, owner, status, mode,
workspace_id, root_chat_id, has_parent, archived, model config)
collected time-windowed via `createdAfter`
- **`ChatMessageSummaries`** — per-chat aggregated message metrics
(counts by role, token sums by type, cost, runtime, model count,
compression count) collected time-windowed
- **`ChatModelConfigs`** — model configuration metadata (provider,
model, context limit, enabled, default) collected as full dump
No PII is included — titles, message content, and URLs are excluded at
the SQL level. Only structural metadata flows through telemetry.
<details><summary>Implementation plan</summary>
### SQL Queries (`coderd/database/queries/chats.sql`)
- `GetChatsCreatedAfter` — time-windowed chat metadata
- `GetChatMessageSummariesPerChat` — per-chat message aggregates via
`GROUP BY`
- `GetChatModelConfigsForTelemetry` — full dump of model configs
### Telemetry (`coderd/telemetry/telemetry.go`)
- `Chat`, `ChatMessageSummary`, `ChatModelConfig` structs
- `ConvertChat`, `ConvertChatMessageSummary`, `ConvertChatModelConfig`
conversion functions
- Three `eg.Go()` blocks in `createSnapshot()` following the existing
collection pattern
### Authorization (`coderd/database/dbauthz/dbauthz.go`)
- System-only access for all three queries via `rbac.ResourceSystem`
### Tests
- `TestChatsTelemetry` in `coderd/telemetry/telemetry_test.go` — creates
chats (root + child), messages with token/cost data, model configs;
verifies all snapshot fields
- dbauthz test entries for all three queries in
`coderd/database/dbauthz/dbauthz_test.go`
</details>
> 🤖 Generated by Coder Agents
## Problem
The Sysbox docker-in-workspaces docs examples use `sudo dockerd &` in
`startup_script` to start Docker. This causes workspaces to report as
unhealthy because `dockerd` keeps references to stdout/stderr after the
script exits.
## Fix
Replace `sudo dockerd &` with `sudo service docker start`, which
properly daemonizes Docker through the service manager and returns
cleanly. This matches the pattern used in our [dogfood
template](https://github.com/coder/coder/blob/main/dogfood/coder/main.tf#L614).
## Validation
Created a test template and workspace on dogfood — agent reported `✔
healthy` and `docker info` confirmed the daemon running inside the
workspace.
Fixes#21166
> 🤖 This PR was created with the help of Coder Agents, and has been
reviewed by my human. 🧑💻
- Remove Kyleosophy alternative completion chimes (keeps original chime
intact)
- Extract 5 sub-components from the 717-line god component:
- `PersonalInstructionsSettings` — user prompt textarea form
- `SystemInstructionsSettings` — admin system prompt + TextPreviewDialog
- `VirtualDesktopSettings` — admin desktop toggle
- `WorkspaceAutostopSettings` — admin autostop toggle + duration form
- `RetentionPeriodSettings` — admin retention toggle + number input
- Parent is now a ~160-line layout shell
- `isAnyPromptSaving` coupling preserved via prop
- Add `docs/plans/` to `.gitignore`
> 🤖 Written by a Coder Agent. Reviewed by a human.
Fixes https://github.com/coder/coder/issues/23910
Adds periodic cleanup of chats and chat files to the dbpurge background
goroutine, with a configurable retention period exposed in the Agent
settings UI.
> 🤖 Written by a Coder Agent. Reviewed by a human.
Bumps
[github.com/aws/aws-sdk-go-v2/service/s3](https://github.com/aws/aws-sdk-go-v2)
from 1.96.0 to 1.97.3.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="90650dd227"><code>90650dd</code></a>
Release 2026-03-26</li>
<li><a
href="dd88818bee"><code>dd88818</code></a>
Regenerated Clients</li>
<li><a
href="b662c50138"><code>b662c50</code></a>
Update endpoints model</li>
<li><a
href="500a9cb352"><code>500a9cb</code></a>
Update API model</li>
<li><a
href="6221102f76"><code>6221102</code></a>
fix stale skew and delayed skew healing (<a
href="https://redirect.github.com/aws/aws-sdk-go-v2/issues/3359">#3359</a>)</li>
<li><a
href="0a39373433"><code>0a39373</code></a>
fix order of generated event header handlers (<a
href="https://redirect.github.com/aws/aws-sdk-go-v2/issues/3361">#3361</a>)</li>
<li><a
href="098f389827"><code>098f389</code></a>
Only generate resolveAccountID when it's required (<a
href="https://redirect.github.com/aws/aws-sdk-go-v2/issues/3360">#3360</a>)</li>
<li><a
href="6ebab66428"><code>6ebab66</code></a>
Release 2026-03-25</li>
<li><a
href="b2ec3beebb"><code>b2ec3be</code></a>
Regenerated Clients</li>
<li><a
href="abc126f6b3"><code>abc126f</code></a>
Update API model</li>
<li>Additional commits viewable in <a
href="https://github.com/aws/aws-sdk-go-v2/compare/service/s3/v1.96.0...service/s3/v1.97.3">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts page](https://github.com/coder/coder/network/alerts).
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps
[github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream](https://github.com/aws/aws-sdk-go-v2)
from 1.7.6 to 1.7.8.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="e3b97d2a02"><code>e3b97d2</code></a>
Release 2023-10-12</li>
<li><a
href="863010ddb2"><code>863010d</code></a>
Regenerated Clients</li>
<li><a
href="6946ef8b91"><code>6946ef8</code></a>
Update endpoints model</li>
<li><a
href="6d93ded453"><code>6d93ded</code></a>
Update API model</li>
<li><a
href="bebc232e7f"><code>bebc232</code></a>
fix: fail to load config if configured profile doesn't exist (<a
href="https://redirect.github.com/aws/aws-sdk-go-v2/issues/2309">#2309</a>)</li>
<li><a
href="5de46742b7"><code>5de4674</code></a>
fix DNS timeout error not retried (<a
href="https://redirect.github.com/aws/aws-sdk-go-v2/issues/2300">#2300</a>)</li>
<li><a
href="e155bb72a2"><code>e155bb7</code></a>
Release 2023-10-06</li>
<li><a
href="9d342ba339"><code>9d342ba</code></a>
Regenerated Clients</li>
<li><a
href="1df99141a1"><code>1df9914</code></a>
Update SDK's smithy-go dependency to v1.15.0</li>
<li><a
href="32ada3a191"><code>32ada3a</code></a>
Update API model</li>
<li>See full diff in <a
href="https://github.com/aws/aws-sdk-go-v2/compare/service/m2/v1.7.6...service/m2/v1.7.8">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts page](https://github.com/coder/coder/network/alerts).
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
I said I wouldn't but the illustrious @jakehwll added a ResizeObserver
recently so imma do that too.
This makes `<ExpandableText>` determine if it should be expandable or
not on resize
The default `net.Dialer` in the Coder Connect path had no timeout,
falling back to the OS TCP timeout when the tunnel was broken but DNS
still resolved. Add a 5s dial timeout and 30s TCP keepalive.
Fixes#24006
When `coder ssh` connects to a workspace after laptop wake, DNS or the
control plane may be briefly unavailable. Previously this caused an
immediate failure, which VS Code Remote SSH classified as permanent
("Reload Window").
Wrap each network step (workspace resolution, template version fetch,
agent connection info, Coder Connect dial, tailnet dial) with
`retryWithInterval` so transient errors (DNS, connection refused, 5xx)
are retried individually. Non-retryable errors (auth, 404) and context
cancellation stop immediately. Data transfer is never retried.
RC tags are now created directly on `main`. The `release/X.Y` branch is
only cut when the actual release is ready. This eliminates the need to
cherry-pick hundreds of commits from main onto the release branch
between the first RC and the release.
## Workflow
```
main: ──●──●──●──●──●──●──●──●──●──
↑ ↑ ↑
rc.0 rc.1 cut release/2.34, tag v2.34.0
\
release/2.34: ──●── v2.34.1 (patch)
```
1. **RC:** On `main`, run `./scripts/release.sh`. The tool detects main
(or a detached HEAD reachable from main), prompts for the commit SHA to
tag, suggests the next RC version, and tags it.
2. **Release:** When the RC is blessed, create `release/X.Y` from `main`
(or the specific RC commit). Switch to that branch and run
`./scripts/release.sh`, which suggests `vX.Y.0`.
3. **Patch:** Cherry-pick fixes onto `release/X.Y` and run
`./scripts/release.sh` from that branch.
## Changes
### `scripts/releaser/release.go`
- Two modes based on branch:
- **`main` (or detached HEAD from main)** — RC tagging. Prompts for the
commit SHA to tag (defaults to HEAD). Always checks out the target
commit so the flow operates in detached HEAD. Suggests the next RC based
on existing RC tags.
- **`release/X.Y`** — Release/patch mode. Suggests `vX.Y.0` if the
latest tag is an RC, or the next patch otherwise.
- Detached HEAD support: if `git branch --show-current` is empty, checks
whether HEAD is an ancestor of `origin/main` and enters RC mode
automatically.
- Commit selection prompt in RC mode: shows current commit, lets the
user confirm or provide a different SHA.
- Warns if you try to tag a non-RC on main, or an RC on a release
branch.
- Skips open-PR check and branch sync check in RC mode (not useful on
main).
### `scripts/releaser/main.go`
- Updated help text.
### `.github/workflows/release.yaml`
- RC tags (`*-rc.*`): skip the release-branch validation (they live on
main).
- Non-RC tags: still require the corresponding `release/X.Y` branch.
### `docs/about/contributing/CONTRIBUTING.md`
- Rewrote the Releases section with the new workflow, release types
table, and ASCII diagram.
- Replaced the old "Creating a release" / "Creating a release (via
workflow dispatch)" subsections.
<details><summary>Decision log</summary>
### Why this approach?
Previously, cutting a release branch early for an RC meant
cherry-picking all of main's progress onto that branch before the actual
release — often hundreds of commits. This approach avoids that entirely:
RCs are just tagged snapshots of main, and the release branch only
exists once you need it for stabilization and backports.
### Files NOT changed
- **`scripts/release/publish.sh`** — `--rc` flag controls GitHub
prerelease marking (tag-level, not branch-level). `target_commitish`
already defaults to `main` when the tag isn't on a release branch.
- **`scripts/release/tag_version.sh`** — No RC-specific branch logic.
- **`scripts/releaser/version.go`** — Version parsing/comparison
unchanged.
- **`docs/install/releases/index.md`** — Public-facing docs describe RC
as a release channel with no branch-level detail.
</details>
> Generated by Coder Agents
This pull-request resolves an regression where the spread was overriding
the required styles from the `react-window` virtualised rows. This was
causing the scroll to act a little crazy.
Fixescoder/internal#1455
Three changes to eliminate the timing-sensitive flake in
`TestSubscribeRelayEstablishedMidStream`:
1. **Reduce `PendingChatAcquireInterval` from `time.Hour` to
`time.Second`.**
The primary trigger is still `signalWake()` from `SendMessage`, but a
short fallback poll ensures the worker picks up the pending chat
even under heavy CI goroutine scheduling contention.
2. **Increase context timeout from `WaitLong` (25s) to `WaitSuperLong`
(60s).**
The worker pipeline (model resolution, message loading, LLM call)
involves multiple DB round-trips that can be slow when PostgreSQL
is shared with many parallel test packages.
3. **Add a status-polling loop while waiting for the streaming
request.**
If the worker errors out during chat processing, the test now
fails immediately with the error status and message instead of
silently timing out.
> Generated by Coder Agents
Two fixes for the release script:
**1. Branch regex cleanup** — Simplified to only match `release/X.Y`.
Removed
support for `release/X.Y.Z` and `release/X.Y-rc.N` branch formats. RCs
are
now tagged from main (not from release branches), and the three-segment
`release/X.Y.Z` format will not be used going forward.
**2. Changelog range for first release on a new minor** — When no tags
match
the branch's major.minor, the commit range fell back to `HEAD` (entire
git
history, ~13k lines of changelog). Now computes `git merge-base` with
the
previous minor's release branch (e.g. `origin/release/2.32`) as the
changelog
starting point. This works even when that branch has no tags pushed yet.
Falls
back to the latest reachable tag from a previous minor if the branch
doesn't
exist.
Migrated LogLine and Logs components from Emotion CSS-in-JS to Tailwind
CSS classes.
- Replaced Emotion `css` prop and theme-based styling with Tailwind
utility classes in `LogLine` and `LogLinePrefix` components
- Converted CSS-in-JS styles object to conditional Tailwind classes
using the `cn` utility function
- Updated log level styling (error, debug, warn) to use Tailwind classes
with design token references
- Migrated the Logs container component styling from Emotion to Tailwind
classes
- Removed Emotion imports and theme dependencies
Refactored the tab overflow hook by renaming `useTabOverflowKebabMenu`
to `useKebabMenu` and removing the configurable `alwaysVisibleTabsCount`
parameter.
- Renamed `useTabOverflowKebabMenu` to `useKebabMenu` and moved it to a
new file
- Removed the `alwaysVisibleTabsCount` parameter and hardcoded it to 1
tab as `ALWAYS_VISIBLE_TABS_COUNT`
- Removed the `utils/index.ts` export file for the Tabs component
- Updated the import in `AgentRow.tsx` to use the new hook name and
removed the `alwaysVisibleTabsCount` prop
- Refactored the internal logic to use a more functional approach with
`reduce` instead of imperative loops
- Added better performance optimizations to prevent unnecessary
re-renders
This PR improves the agent log download functionality by replacing the
single download button with a comprehensive dropdown menu system.
- Replaced single download button with a dropdown menu offering multiple
download options
- Added ability to download all logs or individual log sources
separately
- Updated download button to show chevron icon indicating dropdown
functionality
- Enhanced download options with appropriate icons for each log source
<img width="370" height="305" alt="image"
src="https://github.com/user-attachments/assets/ddf025f5-f936-499a-9165-6e81b62d6860"
/>
Moves the `charm.land/fantasy` replace directive from
`github.com/kylecarbs/fantasy` to `github.com/coder/fantasy`, pointing
at the same `cj/go1.25` branch and commit (`112927d9b6d8`).
> Generated by Coder Agents
Update queries as prep work for user secrets API development:
- Switch all lookups and mutations from ID-based to user_id + name
- Split list query into metadata-only (for API responses) and
with-values (for provisioner/agent)
- Add partial update support using CASE WHEN pattern for write-only
value fields
- Include value_key_id in create for dbcrypt encryption support
- Update dbauthz wrappers and remove stale methods from dbmetrics
Previously, after creating a provider config in the agents provider
editor, the Save changes button stayed enabled for the lifetime of the
mounted form. The form kept the pre-create local baseline, so the
freshly-saved values still looked dirty.
Key `ProviderForm` by provider config identity so React remounts the
form when a config is created and re-establishes the pristine state from
the saved provider values.
## Summary
Replaces N per-chat heartbeat goroutines with a single centralized
heartbeat loop that issues one `UPDATE` per 30s interval for all running
chats on a worker.
## Problem
Each running chat spawned a dedicated goroutine that issued an
individual `UPDATE chats SET heartbeat_at = NOW() WHERE id = $1 AND
worker_id = $2 AND status = 'running'` query every 30 seconds. At 10,000
concurrent chats this produces **~333 DB queries/second** just for
heartbeats, plus ~333 `ActivityBumpWorkspace` CTE queries/second from
`trackWorkspaceUsage`.
## Solution
New `UpdateChatHeartbeats` (plural) SQL query replaces the old singular
`UpdateChatHeartbeat`:
```sql
UPDATE chats
SET heartbeat_at = @now::timestamptz
WHERE worker_id = @worker_id::uuid
AND status = 'running'::chat_status
RETURNING id;
```
A single `heartbeatLoop` goroutine on the `Server`:
1. Ticks every `chatHeartbeatInterval` (30s)
2. Issues one batch UPDATE for all registered chats
3. Detects stolen/completed chats via set-difference (equivalent of old
`rows == 0`)
4. Calls `trackWorkspaceUsage` for surviving chats
`processChat` registers an entry in the heartbeat registry instead of
spawning a goroutine.
## Impact
| Metric | Before (10K chats) | After (10K chats) |
|---|---|---|
| Heartbeat queries/sec | ~333 | ~0.03 (1 per 30s per replica) |
| Heartbeat goroutines | 10,000 | 1 |
| Self-interrupt detection | Per-chat `rows==0` | Batch set-difference |
---
> 🤖 Generated by Coder Agents
<details><summary>Implementation notes</summary>
- Uses `@now` parameter instead of `NOW()` so tests with `quartz.Mock`
can control timestamps.
- `heartbeatEntry` stores `context.CancelCauseFunc` + workspace state
for the centralized loop.
- `recoverStaleChats` is unaffected — it reads `heartbeat_at` which is
still updated.
- The old singular `UpdateChatHeartbeat` is removed entirely.
- `dbauthz` wrapper uses system-level `rbac.ResourceChat` authorization
(same pattern as `AcquireChats`).
</details>
Audit and connection log pages were timing out due to expensive COUNT(*)
queries over large tables. This commit adds opt-in count capping: requests can
return a `count_cap` field signaling that the count was truncated at a threshold,
avoiding full table scans that caused page timeouts.
Text-cast UUID comparisons in regosql-generated authorization queries
also contributed to the slowdown by preventing index usage for connection
and audit log queries. These now emit native UUID operators.
Frontend changes handle the capped state in usePaginatedQuery and
PaginationWidget, optionally displaying a capped count in the pagination
UI (e.g. "Showing 2,076 to 2,100 of 2,000+ logs")
Related to:
https://linear.app/codercom/issue/PLAT-31/connectionaudit-log-performance-issue
This pull-request ensures we have a stable test where the content
doesn't change every time we have a new storybook artifact by setting it
to a consistent date.
Closes https://github.com/coder/internal/issues/1454
> This PR was authored by Mux on behalf of Mike.
External MCP tools returned by `ConnectAll` were ordered by goroutine
completion, making the tool list nondeterministic across chat turns.
This broke prompt-cache stability since tools are serialized in order.
Sort tools by their model-visible name after all connections complete,
matching the existing pattern in workspace MCP tools
(`agent/x/agentmcp/manager.go`). Also guards against a nil-client panic
in cleanup when a connected server contributes zero tools after
filtering.
- Use async `findByLabelText` instead of sync `getByLabelText` in
`ProviderAccordionCards` story
- Same bug fixed in #23999 for three other stories but missed for this
one
> 🤖 Written by a Coder Agent. Will be reviewed by a human.
Bumps [github.com/gohugoio/hugo](https://github.com/gohugoio/hugo) from
0.159.2 to 0.160.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/gohugoio/hugo/releases">github.com/gohugoio/hugo's
releases</a>.</em></p>
<blockquote>
<h2>v0.160.0</h2>
<p>Now you can inject <a
href="https://gohugo.io/functions/css/build/#vars">CSS vars</a>, e.g.
from the configuration, into your stylesheets when building with <a
href="https://gohugo.io/functions/css/build/">css.Build</a>. Also, now
all the render hooks has a <a
href="https://gohugo.io/render-hooks/links/#position">.Position</a>
method, now also more accurate and effective.</p>
<h2>Bug fixes</h2>
<ul>
<li>Fix some recently introduced Position issues 4e91e14c <a
href="https://github.com/bep"><code>@bep</code></a> <a
href="https://redirect.github.com/gohugoio/hugo/issues/14710">#14710</a></li>
<li>markup/goldmark: Fix double-escaping of ampersands in link URLs
dc9b51d2 <a href="https://github.com/bep"><code>@bep</code></a> <a
href="https://redirect.github.com/gohugoio/hugo/issues/14715">#14715</a></li>
<li>tpl: Fix stray quotes from partial decorator in script context
43aad711 <a href="https://github.com/bep"><code>@bep</code></a> <a
href="https://redirect.github.com/gohugoio/hugo/issues/14711">#14711</a></li>
</ul>
<h2>Improvements</h2>
<ul>
<li>all: Replace NewIntegrationTestBuilder with Test/TestE/TestRunning
481baa08 <a href="https://github.com/bep"><code>@bep</code></a></li>
<li>tpl/css: Support <a
href="https://github.com/import"><code>@import</code></a>
"hugo:vars" for CSS custom properties in css.Build 5d09b5e3 <a
href="https://github.com/bep"><code>@bep</code></a> <a
href="https://redirect.github.com/gohugoio/hugo/issues/14699">#14699</a></li>
<li>Improve and extend .Position handling in Goldmark render hooks
303e443e <a href="https://github.com/bep"><code>@bep</code></a> <a
href="https://redirect.github.com/gohugoio/hugo/issues/14663">#14663</a></li>
<li>markup/goldmark: Clean up test 638262ce <a
href="https://github.com/bep"><code>@bep</code></a></li>
</ul>
<h2>Dependency Updates</h2>
<ul>
<li>build(deps): bump github.com/magefile/mage from 1.16.1 to 1.17.1
bf6e35a7 <a
href="https://github.com/dependabot"><code>@dependabot</code></a>[bot]</li>
<li>build(deps): bump github.com/go-jose/go-jose/v4 from 4.1.3 to 4.1.4
0eda24e6 <a
href="https://github.com/dependabot"><code>@dependabot</code></a>[bot]</li>
<li>build(deps): bump golang.org/x/image from 0.37.0 to 0.38.0 beb57a68
<a
href="https://github.com/dependabot"><code>@dependabot</code></a>[bot]</li>
</ul>
<h2>Documentation</h2>
<ul>
<li>readme: Revise edition descriptions and installation instructions
9f1f1be0 <a
href="https://github.com/jmooring"><code>@jmooring</code></a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="652fc5acdd"><code>652fc5a</code></a>
releaser: Bump versions for release of 0.160.0</li>
<li><a
href="bf6e35a755"><code>bf6e35a</code></a>
build(deps): bump github.com/magefile/mage from 1.16.1 to 1.17.1</li>
<li><a
href="4e91e14cb0"><code>4e91e14</code></a>
Fix some recently introduced Position issues</li>
<li><a
href="dc9b51d2e2"><code>dc9b51d</code></a>
markup/goldmark: Fix double-escaping of ampersands in link URLs</li>
<li><a
href="481baa0896"><code>481baa0</code></a>
all: Replace NewIntegrationTestBuilder with Test/TestE/TestRunning</li>
<li><a
href="43aad7118d"><code>43aad71</code></a>
tpl: Fix stray quotes from partial decorator in script context</li>
<li><a
href="9f1f1be0be"><code>9f1f1be</code></a>
readme: Revise edition descriptions and installation instructions</li>
<li><a
href="0eda24e65f"><code>0eda24e</code></a>
build(deps): bump github.com/go-jose/go-jose/v4 from 4.1.3 to 4.1.4</li>
<li><a
href="5d09b5e32a"><code>5d09b5e</code></a>
tpl/css: Support <a
href="https://github.com/import"><code>@import</code></a>
"hugo:vars" for CSS custom properties in css.Build</li>
<li><a
href="303e443ea7"><code>303e443</code></a>
Improve and extend .Position handling in Goldmark render hooks</li>
<li>Additional commits viewable in <a
href="https://github.com/gohugoio/hugo/compare/v0.159.2...v0.160.0">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps rust from `1d0000a` to `a08d20a`.
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps ubuntu from `5e5b128` to `eb29ed2`.
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Needed by #23833
Adds a `chat_file_links` association table to track which files are
associated with each chat.
- `AppendChatFileIDs` query links a file to a chat with deduplication
- `GetChatFileMetadataByIDs` query returns lightweight file metadata by
IDs
- Tool-created files (e.g. `propose_plan`) are linked to the chat after
insert
- User-uploaded files are linked to the chat when the referencing
message is sent
- Single-chat GET endpoint hydrates `files: ChatFileMetadata[]` on the
response
> 🤖 Created by Coder Agents and massaged into shape by a human.
Fixes https://github.com/coder/internal/issues/1418
The `TestRun_ActiveToolsPrepareBehavior` test asserts
`persistedStep.Runtime > 0`, but on Windows the timer resolution (~15ms)
means the in-memory mock model can complete within the same clock tick,
producing a measured duration of `0s`.
Change the assertion from `require.Greater` to `require.GreaterOrEqual`
so that a legitimately measured zero duration on low-resolution clocks
does not cause a flake.
> Generated by Coder Agents
## Fix flaky TestAwaitSubagentCompletion/CompletesViaPubsub
Fixescoder/internal#1435
### Root Cause
During `createParentChildChats`, the processor publishes notifications
on `ChatStreamNotifyChannel(child.ID)` via PostgreSQL `LISTEN/NOTIFY`.
After `drainInflight()` returns, these stale notifications can still be
buffered in the pgListener's `NotifyChan()`. When
`awaitSubagentCompletion` subscribes and a stale notification is
dispatched between `setChatStatus(Waiting)` and
`insertAssistantMessage`, `checkSubagentCompletion` sees `done=true`
(status is `Waiting`) but returns an empty report because the message
hasn't been committed yet.
### Fix
Swap the order: insert the assistant message **before** transitioning
the status to `Waiting`. This guarantees the report is committed before
the status makes the chat appear complete to `checkSubagentCompletion`.
### Verification
- 50 consecutive runs of the specific test: all pass
- 10 runs of the full `TestAwaitSubagentCompletion` suite: all pass
- 20 runs with `-race`: all pass
> Generated by Coder Agents
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Add optional demographic and newsletter preference fields to the first
user setup page, and redesign the setup form using non-MUI components
## New fields
**Newsletter preferences** (opt-in checkboxes):
- **Marketing updates** — product announcements, tips, best practices
- **Release & security updates** — new releases, patches, security
advisories
## Frontend redesign
Migrated the setup page from MUI to the shadcn/ui design system used
across the rest of the app:
- Replaced MUI `TextField`, `MenuItem`, `Checkbox`, `Autocomplete` with
`Input`, `Label`, `Select`, and `Checkbox` from `#/components`
- Switched from Emotion `css` props to Tailwind utility classes
- Left-aligned header, widened form container to 500px
- Updated copy: "30-day trial", "Learn more", "Help us make Coder
better"
- Side-by-side layouts for first/last name, phone/country
- Moved privacy policy text to always-visible onboarding section
- Removed "Number of developers" field from trial section
### Implementation notes
- The `onboarding_info` payload is fire-and-forget via
`Telemetry.Report()` — not stored in the database
- Country picker switched from MUI Autocomplete to Radix Select for
design consistency
- GitHub OAuth button preserved — conditionally rendered when
`authMethods.github.enabled`
- NewPasswordField is meant to be a drop in replacement for the MUI
PasswordField
### References
- #23989
- #24021
- #24014
- #24018
---------
Co-authored-by: Tracy Johnson <tracy@coder.com>
Co-authored-by: Jeremy Ruppel <jeremy.ruppel@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Jeremy Ruppel <jeremyruppel@users.noreply.github.com>
Co-authored-by: Kayla はな <mckayla@hey.com>
The backend (`chatd.go`) already fully implements both `"queue"` and
`"interrupt"` busy behaviors for `SendMessage`, and the `message_agent`
subagent tool already leverages both internally. However the HTTP API
hardcoded `"queue"` and the SDK had no way for callers to request
interrupt-on-send.
This adds a `ChatBusyBehavior` enum type to the SDK and an optional
`busy_behavior` field on `CreateChatMessageRequest`. The HTTP handler
validates the field and passes it through to `chatd.SendMessage`.
Default remains `"queue"` for full backward compatibility.
<details><summary>Implementation notes</summary>
- `codersdk/chats.go`: New `ChatBusyBehavior` type with
`ChatBusyBehaviorQueue` and `ChatBusyBehaviorInterrupt` constants. Added
`BusyBehavior` field to `CreateChatMessageRequest` with `enums` tag for
codegen.
- `coderd/exp_chats.go`: `postChatMessages` now reads
`req.BusyBehavior`, maps SDK constants to
`chatd.SendMessageBusyBehavior*`, returns 400 on invalid values.
- `site/src/api/typesGenerated.ts`: Auto-generated via `make gen`.
- No frontend behavior changes — the field is available but unused by
the UI.
</details>
> [!NOTE]
> Generated by Coder Agents
## What
Documents that Coder license keys are validated locally using
cryptographic signatures and do not require an outbound connection to
Coder's servers. This is a common question from customers evaluating
Coder for air-gapped environments.
## Changes
- **`docs/admin/licensing/index.md`**: Added an "Offline license
validation" section explaining that license keys are signed JWTs
validated locally with no phone-home requirement.
- **`docs/install/airgap.md`**: Added a "License validation" row to the
air-gapped comparison table, confirming no changes are needed for
offline license validation and linking to the licensing docs.
## Why
While the air-gapped docs state that "all Coder features are supported"
offline, there was no explicit mention that the license itself doesn't
require connectivity. This is a frequent question from
security-conscious and air-gapped customers.
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: Matyas Danter <mdanter@gmail.com>
This change replaces the line number display in agent logs with formatted timestamps. The `AgentLogLine` component now shows timestamps in `HH:mm:ss.SSS` format using dayjs instead of sequential line numbers. The component no longer requires `number` and `maxLineNumber` props, and the associated styling for line number formatting has been removed.
This is a global change.. but I don't think its one that will do much damage.
Added `data-slot` attributes to all Tabs components for better CSS
targeting and component identification. Replaced generic button
selectors with data-slot attribute selectors in tab styling variants.
Implemented `useTabOverflowKebabMenu` hook to handle tab overflow
scenarios by measuring tab widths and determining which tabs should be
hidden in a dropdown menu when container space is limited.
Enhanced the AgentRow logs section with:
- Tab overflow handling using a kebab menu (three dots) for tabs that
don't fit
- Copy logs button with visual feedback using CheckIcon animation
- Download logs functionality for selected tab content with proper
filename generation
- Improved layout with flex containers and proper spacing
Few props and components updates
* Added `overflowKebabMenu` prop to TabsList component to enable
`flex-nowrap` behavior when overflow handling is active.
* Created `<DownloadSelectedAgentLogsButton />` component to replace the
previous download functionality, now working with filtered log content
based on selected tab.
https://github.com/user-attachments/assets/af48ca39-c906-4a11-a891-0d4399eee827
Adds a `system_prompt` field to `CreateChatRequest` that allows API
consumers to provide custom instructions when creating a chat. The
per-chat prompt is stored as a separate system message (`role=system`,
`visibility=model`) in the `chat_messages` table, inserted between the
deployment system prompt and the workspace awareness message.
Also moves deployment system prompt resolution from the HTTP handler
(`resolvedChatSystemPrompt`) into `chatd.CreateChat` where it belongs.
The handler no longer assembles system prompts —
`CreateOptions.SystemPrompt` is now purely the per-chat user prompt, and
the deployment prompt is resolved internally by chatd.
No database schema changes required.
**Message insertion order:**
1. Deployment system prompt (resolved by chatd, existing)
2. Per-chat user system prompt (new, from `CreateOptions.SystemPrompt`)
3. Workspace awareness (existing)
4. Initial user message (existing)
🤖 Generated with [Coder Agents](https://coder.com/agents)
Surface the aggregated `runtime_ms` from `chat_messages` through all
four cost analytics queries (summary, per-model, per-chat, per-user).
This is the key billing metric for agent compute time.
The per-chat breakdown already groups by `root_chat_id`, so subagent
runtime is automatically rolled up under the parent chat — no additional
query changes needed.
<details>
<summary>Implementation details</summary>
**SQL** (`coderd/database/queries/chats.sql`): Added
`COALESCE(SUM(cm.runtime_ms), 0)::bigint AS total_runtime_ms` to
`GetChatCostSummary`, `GetChatCostPerModel`, `GetChatCostPerChat`, and
`GetChatCostPerUser`.
**Go SDK** (`codersdk/chats.go`): Added `TotalRuntimeMs int64` to
`ChatCostSummary`, `ChatCostModelBreakdown`, `ChatCostChatBreakdown`,
and `ChatCostUserRollup`.
**Handler** (`coderd/exp_chats.go`): Wired the new field through all
converter functions and the response assembly.
**Tests** (`coderd/exp_chats_test.go`): Updated fixture to seed non-zero
`runtime_ms` values and added assertions for the new field at summary,
per-model, and per-chat levels.
</details>
> 🤖 Generated by Coder Agents
This pull-request removes all instances of `<IconButton />` being
imported from `@mui/material/IconButton`. This means that we've removed
one whole dependency from MUI and replaced all instances with the local
variant.
Sets `AWS_SDK_UA_APP_ID` in the Terraform provisioner environment so
that all AWS API calls made during workspace builds include Coder's AWS
Partner Revenue Measurement (PRM) attribution in the user-agent header.
This enables AWS to attribute resource usage driven by Coder back to us
as an AWS partner across all deployments.
## How it works
- `provisionEnv()` now unconditionally sets
`AWS_SDK_UA_APP_ID=APN_1.1/pc_cdfmjwn8i6u8l9fwz8h82e4w3$` in the
environment passed to `terraform plan` and `terraform apply`
- The Terraform AWS provider picks this up and appends it to the
user-agent header on every AWS API call
- If a customer has already set `AWS_SDK_UA_APP_ID` in their environment
(e.g. via `coder.env`), we don't override it
- Templates that don't use the AWS provider are unaffected — the env var
is simply ignored
## Notes
- The product code is hardcoded in the source. It may be worth
obfuscating this value (e.g. via `-ldflags -X` at build time) to keep it
out of the public repo, though it is technically a public identifier.
- This covers user-agent attribution only. Resource-level `aws-apn-id`
tags for cost allocation are a separate effort that requires template
changes.
## References
- [AWS SDK Application ID
docs](https://docs.aws.amazon.com/sdkref/latest/guide/feature-appid.html)
- [AWS PRM Automated User
Agent](https://prm.partner.aws.dev/automated-user-agent.html) (partner
login required)
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: DevCats <christofer@coder.com>
Polishes the AI model configuration form (add/edit model) with tighter
layout and better input affordances.
**Frontend changes:**
- Replace "Unset" with "Default" in select dropdowns to communicate
system fallback
- Show pricing fields inline instead of behind a collapsible toggle
- Use flat section dividers (`border-t`) instead of bordered fieldsets
- Move field descriptions into info-icon tooltips to fix input
misalignment
- Add InputGroup adornments: `$` prefix + `/1M` suffix on pricing,
`tokens` suffix on token fields, `%` suffix on compression threshold,
range placeholders on temperature/penalty fields
- Shorter pricing labels (Input, Output, Cache Read, Cache Write)
- Compact JSON textareas (1-row height, resizable)
- Smart grid layouts by field type (3-col provider, 4-col pricing, 3-col
advanced)
- Boolean fields render as a segmented control (Default · On · Off)
instead of a dropdown
**Backend changes:**
- Add `enum` tags to OpenAI `service_tier`
(`auto,default,flex,scale,priority`) and `reasoning_summary`
(`auto,concise,detailed`) so they render as select dropdowns instead of
free-text inputs
> 🤖 Generated by Coder Agents
Auditors are still able to access and read this page but they won't be
aren't able to update any of the content, we should show that to them.
Should also be noted that this page isn't shown to the user in the
sidebar when they are an Auditor.
This PR adds log source tabs to the workspace agent logs panel so users
can quickly focus on specific log streams instead of scanning one
combined feed. It also updates the shared tabs trigger behavior to
explicitly use `type="button"` when rendered as a native button,
preventing unintended form submission behavior.
- Adds per-source tabs in `AgentRow` with an **All Logs** default view.
- Shows only log sources that currently have output, with source icons
and sorted labels.
- Filters rendered log lines based on the active tab while preserving
existing log streaming/scroll behavior.
- Refines the logs container layout/styling for the new tabbed UI.
- Updates `TabsTrigger` to safely default button type when not using
`asChild`.
https://github.com/user-attachments/assets/9b3e7a9d-72e3-4c12-aba2-2b70cdbc04c1
The CopyButton tooltip on `/agents` defaulted to top (Radix default),
while the Edit button already used `side="bottom"`. This adds an
optional `tooltipSide` prop to `CopyButton` and passes `"bottom"` in the
agents `ConversationTimeline` so both tooltips appear below the buttons
consistently.
## Changes
- `CopyButton`: added optional `tooltipSide` prop, forwarded to
`<TooltipContent side={tooltipSide}>`
- `ConversationTimeline`: passed `tooltipSide="bottom"` to the
copy-message `CopyButton`
> Generated by Coder Agents
The `useEffect` that syncs `chatRecord.status` from React Query
unconditionally overwrites the store's `chatStatus`. The `chat(chatId)`
query has no `staleTime` (defaults to 0), so it refetches on window
focus, remount, etc. If the REST response catches a transient
`"pending"` status (e.g. between multi-step tool-call cycles), it
regresses `chatStatus` from `"running"` to `"pending"`.
Since `shouldApplyMessagePart()` drops ALL parts when status is
`"pending"` or `"waiting"`, every incoming `message_part` event is
silently discarded — not even buffered. Parts are visible on the
WebSocket but nothing renders, and the UI shows "Response is taking
longer than expected". A page reload fixes it because a fresh REST fetch
returns the current status.
**Fix:** Add `wsStatusReceivedRef` — once the WebSocket delivers a
status event, it becomes the authoritative source and REST refetches can
no longer overwrite it. This mirrors the existing
`wsQueueUpdateReceivedRef` pattern already used for queued messages. The
ref resets on chat change.
> Generated with [Coder Agents](https://coder.com/agents)
Aligns the copy/edit action bar so both user and assistant messages use
the same hover-to-reveal pattern.
## Changes
- Replace bifurcated copy UX (inline `afterResponseSlot` for assistant,
floating toolbar for user) with a single unified action bar using
`CopyButton` + optional edit `Button`
- Remove `BlockList` `afterResponseSlot` prop and related machinery
- Remove per-message `copyHovered`/`useClipboard` state and left-border
highlight effect
- Remove `lastAssistantPerTurnIds`/`isTurnActive` computation — all
messages with content get actions on hover
- Hide actions on mid-chain assistant messages (only last in consecutive
chain shows buttons)
- Reduce inter-message gap from `gap-3` to `gap-2`
- Shrink action buttons to `size-6` for tighter vertical spacing
- Add 8px sticky top offset for user messages
> 🤖 Generated by Coder Agents
## Problem
MCP servers configured in `.mcp.json` with stdio transport are
discovered successfully (tools appear) but die immediately after
connection, making all tool calls fail.
## Root Cause
In `connectServer`, the subprocess is spawned with `connectCtx` — a
30-second timeout context whose `cancel()` is deferred:
```go
connectCtx, cancel := context.WithTimeout(ctx, connectTimeout)
defer cancel()
if err := c.Start(connectCtx); err != nil { ... }
```
The mcp-go stdio transport calls `exec.CommandContext(connectCtx, ...)`.
When `connectServer` returns, `cancel()` fires, and
`exec.CommandContext` sends SIGKILL to the subprocess. The process
immediately becomes a zombie.
Confirmed by checking `/proc/<pid>/status` after context cancellation:
```
State: Z (zombie)
```
## Fix
Pass the parent `ctx` (which is `a.gracefulCtx` — the agent's long-lived
context) to `c.Start()`. `connectCtx` continues to bound only the
`Initialize()` handshake. The subprocess is cleaned up when the Manager
is closed or the parent context is canceled.
## Regression Test
Added `TestConnectServer_StdioProcessSurvivesConnect` which:
- Spawns a real subprocess (re-execs the test binary as a fake MCP
server)
- Calls `connectServer` and lets it return (internal `connectCtx` gets
canceled)
- Verifies the subprocess is still alive by calling `ListTools`
The test **fails** on the old code with `transport error: context
deadline exceeded` and **passes** with the fix.
> Generated with [Coder Agents](https://coder.com/agents)
## Summary
Move `ConvertMessagesWithFiles` into the `g2` errgroup so prompt
conversion runs concurrently with instruction persistence, user prompt
resolution, MCP server connections, and workspace MCP tool discovery.
## Problem
In `runChat`, the setup before the first LLM `Stream()` call is
sequential across two errgroups:
```
g.Wait() // model + messages + MCP configs
ConvertMessagesWithFiles() // sequential — blocked on g2 starting
g2.Wait() // instructions + user prompt + MCP connect + workspace MCP
```
`ConvertMessagesWithFiles` can take non-trivial time on conversations
with file attachments (batch DB resolution), and it was blocking g2 from
starting.
## Fix
`ConvertMessagesWithFiles` only reads the `messages` slice (available
after `g.Wait()`) and resolves file references via the database. No g2
task reads or writes the `prompt` variable. This makes it safe to
overlap with g2:
```
g.Wait()
g2.Wait() // now includes ConvertMessagesWithFiles in parallel
```
The `InsertSystem` call for parent chats and the `promptErr` check are
deferred to after `g2.Wait()`, preserving correctness.
<details><summary>Decision log</summary>
- `ConvertMessagesWithFiles` is read-only on `messages` — no mutation,
safe for concurrent access
- `prompt` and `promptErr` are written only by the conversion goroutine,
read only after `g2.Wait()` — no data race
- Error from prompt conversion is checked immediately after `g2.Wait()`,
before any code that uses `prompt`
- `chatloop.Run` now uses `:=` instead of `=` since the prior `err`
declaration from `prompt, err :=` was removed
</details>
> Generated by Coder Agents
Piggybacks on #23878. Moves instruction file reading and skill discovery
from `chatd` (server-side, via multiple `LS`/`ReadFile` round-trips
through the agent connection) to the agent itself (local filesystem
access).
This intentionally drops backward compatibility with older agents that
don't support the context-config endpoint. Agents and server are
deployed together; there is no rolling-update contract to maintain here.
## What changed
The agent's `GET /api/v0/context-config` response now returns
`[]ChatMessagePart` directly — the same types chatd persists. This
eliminates intermediate type conversions and makes the protocol
extensible.
| Field | Type | Description |
|---|---|---|
| `parts` | `[]ChatMessagePart` | Context-file and skill parts, ready to
persist |
| `working_dir` | `string` | Agent's resolved working directory |
Removed from the response: `instructions_dirs`, `instructions_file`,
`skills_dirs`, `skill_meta_file`, `mcp_config_files` — the agent reads
files locally and returns their content as parts.
Removed from chatd: all legacy `LS`/`ReadFile` fallback code
(`readHomeInstructionFile`, `readInstructionDirFile`, `DiscoverSkills`
via LS, etc).
## Why
The previous architecture had the agent resolve paths, serve them over
HTTP, then `chatd` make N+1 round-trips back through the agent
connection to read files. The agent has direct filesystem access and
should just read the files.
## Key design decisions
- **Agent returns `ChatMessagePart` directly** — same types chatd
persists. No intermediate `InstructionFileEntry`/`SkillEntry` types
needed.
- **`SkillMeta.MetaFile`** — persisted via `ContextFileSkillMetaFile` on
the skill part, so custom meta file names
(`CODER_AGENT_EXP_SKILL_META_FILE`) survive across chat turns.
- **No pre-read body** — `read_skill` always dials the workspace to
fetch the skill body on demand. Simpler than caching the body in the
response.
- **MCP config paths kept agent-internal** — `MCPConfigFiles()` getter,
not sent over the wire.
- **No backward compat fallback** — old agents that don't support
context-config get no instruction files. This is acceptable since agent
and server deploy together.
Following on from #23989#24018
- We also no longer want to collect `IsBusiness` demographic data
- Newsletter fields no longer allow `nil` as a value, instead default to
false
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
New `IndustryType` and `OrgSize` enums were added in #23989, but they
are no longer desired in the onboarding/marketing telemetry data. This
removes them.
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Two new columns added to aibridge_token_usages:
- cache_read_input_tokens (BIGINT, default 0)
- cache_write_input_tokens (BIGINT, default 0)
Migration backfills existing rows by extracting values from the metadata
JSONB column (cache_read_input, input_cached, prompt_cached for reads
(max value selected since only 1 should be set), cache_creation_input
for writes).
All references to data from metadata were updated to reference new
columns. No other changes then changing where data is extracted from.
Requires aibridge library version bump to include:
https://github.com/coder/aibridge/pull/229
Fixes: https://github.com/coder/aibridge/issues/150
Add optional demographic and newsletter preference fields to the setup
page: business use (yes/no), industry type, organization size, and two
newsletter toggles (marketing, release/security updates).
The new data flows through telemetry via a FirstUserOnboarding struct in
the snapshot payload, sent once when the first user is created. The
telemetry-server and BigQuery schema changes are required separately to
persist this data.
---------
Co-authored-by: default <davidiii@fraley.us>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Improves the copy button on the last assistant message in agent chat
conversations.
**Changes:**
- Copy icon aligned to the left edge of the content area
- No margin-top gap between message content and the copy button
- Hovering the copy button reveals a 2px vertical indicator line
spanning the copyable content
- Tooltip repositioned to bottom-left with compact sizing
- Extra padding below the copy button to visually separate from the next
user prompt
- Hover state triggers only on the button itself, not the entire row
> This PR was authored by Mux on behalf of @ibetitsmike.
Chromatic tests for `ChatModelAdminPanel` were failing because
synchronous
`getByRole`/`getByLabelText` calls were used after `userEvent.click()`
actions
that trigger navigation to a conditionally-rendered detail view. The
switches
and inputs in `ProviderForm` aren't in the DOM immediately after click,
so the
synchronous queries fail with "Unable to find an accessible element."
Changed all post-click `getByRole`/`getByLabelText` calls to their async
`findByRole`/`findByLabelText` equivalents across three stories:
`ProviderWithUserKeysEnabled`, `EnvPresetProviders`, and
`CreateAndUpdateProvider`.
## Problem
When a prebuilt workspace is claimed, the agent reinitializes via a
single fire-and-forget pubsub event over SSE. If the agent's SSE
connection is interrupted at claim time, the event is permanently lost —
the workspace is stuck with no self-healing path.
Additionally, regular (non-prebuild) workspaces had no way to opt out of
the `/reinit` polling loop — agents would reconnect indefinitely to an
endpoint that would never send them anything useful.
## Root Cause
`workspaceAgentReinit` fetches the workspace (with its current
`owner_id`) via `GetWorkspaceByAgentID`, but never checked whether a
claim already happened. It only subscribed to pubsub for future events.
The database already has durable claim state (`owner_id` changes from
`PrebuildsSystemUserID` to the real user), but no layer ever consulted
it on reconnection.
## Solution
### Server-side durable check with first-build-initiator gating
**TOCTOU-safe ordering**: Subscribe to pubsub claim events *before* any
durable checks, so a claim that fires during the check is buffered in
the channel rather than lost.
**First-build-initiator gating**: When `!workspace.IsPrebuild()` (owner
is no longer the system user), look up the first build's `InitiatorID`.
The prebuild reconciler always uses `PrebuildsSystemUserID` as the
initiator. This distinguishes claimed prebuilds from regular workspaces
without any SQL schema changes.
- **Regular workspace** (first build initiator ≠ system user) → **409
Conflict**, agent stops reconnecting
- **Claimed prebuild, build completed** → pre-seed channel with reinit
event and close it, transmitter delivers one-shot then exits
- **Claimed prebuild, build in-progress** → fall through to pubsub
subscription, agent waits for completion event
- **Unclaimed prebuild** → pubsub subscription (existing happy path)
### Declarative reinit events (defense-in-depth)
- Added `UserID` field to `ReinitializationEvent` with JSON tags
- Switched pubsub serialization from raw string to JSON (with
backward-compat fallback for rolling upgrades)
- Populated `UserID` at both the publish site and the durable check
### Agent SDK: 409 handling
`WaitForReinitLoop` detects 409 Conflict from the server and closes the
`reinitEvents` channel, cleanly exiting the retry goroutine.
### Agent CLI: fixed two bugs + added reinitCtx
- **Closed channel (`!ok`)**: now blocks on `<-ctx.Done()` instead of
`continue`, keeping the current agent running. Previously this would
leak agents by skipping `agnt.Close()` and re-entering the loop.
- **Duplicate owner reinit**: cancels `reinitCtx` (stops the reinit
goroutine), then blocks on `<-ctx.Done()`. Previously `continue` would
skip cleanup and create a new agent on the next loop iteration.
- **`reinitCtx`**: a cancellable child of `ctx` passed to
`WaitForReinitLoop`, allowing the agent to stop the reinit HTTP polling
after reinit completes.
### Agent-side idempotency
Tracks `lastOwnerID` in the agent reinit loop — duplicate events for the
same owner are skipped.
## Testing
- **"unclaimed prebuild receives reinit via pubsub"**: prebuild owned by
system user, pubsub event triggers reinit
- **"claimed prebuild receives one-shot reinit on reconnect"**: first
build by system user, owner changed, build completed → immediate reinit
(no pubsub needed)
- **"claimed prebuild waits during in-progress claim build"**: claimed
but build still running → no reinit until build completes
- **"regular workspace gets 409"**: first build by real user → 409
Conflict, agent stops polling
- Updated claim publisher/listener tests: verify `UserID` survives JSON
round-trip + backward compat with raw string payloads
- Updated SSE round-trip test: verify `UserID` survives transmit →
receive cycle
Fixes#22359
## Rolling upgrade note
During a rolling deploy where old coderd instances coexist with new
ones, the pubsub `ReinitializationEvent` has a new `workspace_id` field
(JSON key `workspace_id`). Old publishers send a raw reason string
instead of JSON; the new listener gracefully falls back by treating the
entire payload as the reason and filling in `WorkspaceID` from context.
The only visible effect during the upgrade window is that `WorkspaceID`
may be the zero UUID in agent-side logs — this is cosmetic and resolves
once all instances are updated.
Add a nullable `value_key_id` column to the `user_secrets` table with a
foreign key to `dbcrypt_keys`. This is the column dbcrypt uses to track
which encryption key encrypted a given secret's value. This is required
for encryption of user secret values.
The column was missing from the original migration (000357).
Frontend for provider key policies (backend in #23751).
## Changes
**Admin provider form**: Three policy toggles (central API key, user API
keys, central fallback) with cross-field validation and conditional
visibility. Form resets properly after save.
**User settings page**: New `/settings/providers` route for personal API
key management. Conditional sidebar item (visible only when providers
allow user keys). Status badges, masked key input, save/remove actions
with confirmation. Read-only model list per provider. Gated behind
`agents` experiment flag.
**Model selector**: Distinguishes user-fixable (`user_api_key_required`)
from admin-fixable (`missing_api_key`) empty states. Links to
`/settings/providers` when user action is needed. Applied to both chat
detail and agent create flows.
**API client**: Query/mutation hooks for user provider configs. Cache
invalidation across provider configs and model catalog.
When a user tries to archive-and-delete a chat from /agents but the
workspace is already gone, the UI showed a "Failed to look up workspace
for deletion" toast and blocked the archive. This change detects the
workspace-gone response and archives the chat without attempting
deletion.
## Changes
The backend returns 410 Gone for soft-deleted workspaces and 404 for
workspaces that do not exist or the user cannot access.
`isWorkspaceNotFound()` detects both status codes.
`resolveArchiveAndDeleteAction()` now returns `"archive"` when the
workspace preflight fetch gets a 404 or 410, and the page branches on
that action to call the existing archive mutation directly. The
`archiveAndDeleteMutation` also tolerates these status codes from
`deleteWorkspace()` to handle the race where the workspace disappears
between the preflight lookup and the actual delete call.
The mutation body was extracted into a testable
`archiveAndDeleteWorkspace()` utility so the tolerance logic has direct
test coverage. A `navigateAfterArchive()` helper consolidates the
post-archive redirect logic that was previously duplicated across the
proceed, confirm, and archive paths.
## Pre-existing patterns preserved
- The `"proceed"` and `"confirm"` archive-and-delete paths use
`onSettled` for navigation, matching the existing behavior before this
change. Only the new `"archive"` path uses `onSuccess` since it has no
workspace deletion step that should still navigate on partial failure.
- `isWorkspaceNotFound()` uses the same `isAxiosError(error) &&
error.response?.status` pattern already used in several places in
`site/src/api/api.ts`. The backend 404 ambiguity (deleted vs
unauthorized) is documented in the JSDoc.
- The pre-existing double-submit race during the async preflight window
is unchanged.
This PR modifies the `wait_agent` tool call card to display screen
recordings of computer use subagents. The backend logic was added in
https://github.com/coder/coder/pull/23894.
There's one big inefficiency in the current implementation: to display
video thumbnails, the frontend downloads the entire video files from the
backend. Our backend does not support HTTP range requests to only fetch
the first frame. I'll be fixing that in a later PR.
https://github.com/user-attachments/assets/684cea8b-66a9-45f8-96b2-57433da41c1c
This PR introduces screen recording of the computer use agent using the
virtual desktop.
- Screen recording is triggered by a `wait_agent` tool call. Recording
is stopped by a successful `wait_agent` tool call or when there hasn't
been any desktop activity for 10 minutes.
- Recordings are handled by the `portabledesktop` cli via the `record`
command. The videos are sped up in periods of inactivity.
- Recordings are saved to the database to the `chat_files` table.
There's a hard limit of 100MB per recording. Larger recordings are
dropped.
- A successful `wait_agent` on a computer use subagent tool call returns
a `recording_file_id`, later allowing the frontend to display the
corresponding video.
`isContextLimitKey` had a fallback heuristic that matched any key starting with `"max"` containing `"context"`, causing false positives on keys like `"max_context_version"`. A provider returning such metadata would have the value parsed as a context limit.
Replace substring matching on the separator-stripped key with word-level matching. A new `metadataKeyWords` function tokenizes keys by splitting on separators and camelCase boundaries, then the fallback requires
`"context"` paired with a limit-related word (`"limit"`, `"window"` + qualifier, `"length"` + qualifier, or `"tokens"` + qualifier). Known exact forms like `"context_window"` remain in the fast-path switch.
Closes https://github.com/coder/coder/issues/23332
Add `-x` to backport script `git cherry-pick` command to include a
commit message reference to the original commit. This makes it easier to
trace where a cherry picked commit actually came from.
- Extend `TestChatTemplateAllowlistEnforcement` to also exercise
`read_template` and `create_workspace` through the allowlist
- Mock LLM now chains 4 tool calls: list_templates, read_template
(blocked), read_template (allowed), create_workspace (blocked)
- Wire dummy `CreateWorkspace` config into test server so the tool
reaches the allowlist check
- Generalize tool result collection to support multiple calls per tool
name
> 🤖 Created by Coder Agents and reviewed by Kyle the human.
When the `agents` experiment is enabled, new users are automatically
granted the `agents-access` role at creation time so they can use Coder
Agents without manual admin intervention.
- Auto-assigns in `CreateUser()` — covers admin API, OAuth, and OIDC
creation paths
- Skips auto-assign for OIDC users when enterprise site role sync is
enabled (sync overwrites roles on every login; those admins should use
`--oidc-user-role-default` instead)
- CLI `create-admin-user` bypasses `CreateUser()` but creates `owner`
users who already have all permissions
> 🤖 Written by a Coder Agent. Will be reviewed by a human.
_Disclaimer: created using Claude Opus 4.6._
```
# Examples:
# ./scripts/backport-pr.sh 2.30 23969
# ./scripts/backport-pr.sh --dry-run 2.30 23969
```
Here's one I created: https://github.com/coder/coder/pull/23972
Signed-off-by: Danny Kopping <danny@coder.com>
- Change `errChatHasNoWorkspaceAgent` message from cryptic `"chat has no
workspace agent"` to actionable `"workspace has no running agent: the
workspace may be stopped. Use the start_workspace tool to start it, or
create_workspace to create a new one"`
- Update test assertions to match the new message substring
> 🤖 Written by a Coder Agent. Reviewed by a human.
Add language reference docs that the Modernization Reviewer reads
before reviewing TS/React code, matching the existing Go reference
(.claude/docs/GO.md).
- references/typescript.md: Modern TypeScript 5.0-6.0 RC patterns,
replacements, and new capabilities
- references/react.md: Modern React 18-19.2 + Compiler 1.0 patterns,
replacements, and new capabilities
SKILL.md updated to reference these docs in the Tier 2 file filters
and spawn prompt instructions.
Refs #23500
*Disclaimer: implemented by a Coder Agent using Claude Opus 4.6*
## Summary
Two changes on the AI Bridge sessions page
(`/aibridge/sessions/<session>`):
1. **Updated header subtitle and link** — replaced the generic
"Centralized auditing for LLM usage across your organization. More about
AI Governance" with auditing-specific copy and a link to the [AI Bridge
audit docs](https://coder.com/docs/ai-coder/ai-bridge/audit).
2. **Added prompt attribution tooltip** — each user prompt now shows an
info icon with a tooltip explaining that prompt origin cannot be
reliably determined (human vs. agent), linking to the [attribution
docs](https://coder.com/docs/ai-coder/ai-bridge/audit#human-vs-agent-attribution).
## Changes
| File | What changed |
|------|-------------|
| `AIBridgeSessionsLayout.tsx` | Updated subtitle text and link target |
| `SessionTimeline.tsx` | Added `InfoIcon` + `Tooltip` next to the
"Prompt" label in `ThreadItem` |
<img width="954" height="318" alt="image"
src="https://github.com/user-attachments/assets/db3ca443-cb0f-426a-8457-4625c82fd6ba"
/>
---------
Signed-off-by: Danny Kopping <danny@coder.com>
Reuse the CopyButton component to let users copy plan content from
the propose_plan tool output. Follows the same pattern used by the
assistant message copy button.
- stabilize `TestAwaitSubagentCompletion/CompletesViaPubsub` by waiting
for durable completion state before sending the synthetic pubsub wake
- add coverage for successful subagent completion with an empty report
> 🤖 Written by a Coder Agent. Reviewed by a human.
Some clients (e.g. Claude) send a HEAD request without credentials as a
connectivity check before making actual API calls. This was logging at
`Warn` level, creating noise. Downgrade to Info for unauthenticated HEAD
requests and add the HTTP method to the logger for better observability.
Related to internal slack thread:
https://codercom.slack.com/archives/C0AEHQGLW22/p1775045200997309
## Description
Adds `provider_name` to aibridge interceptions to store the provider
instance name alongside the provider type. This allows distinguishing
between multiple instances of the same provider type (e.g. `copilot` vs
`copilot-business`).
## Changes
* Add `provider_name` column to `aibridge_interceptions` table with
backfill from `provider`.
* Add `provider_name` field to the proto `RecordInterceptionRequest`
message.
* Add `ProviderName` to the `codersdk.AIBridgeInterception` API
response.
_Disclaimer: initially produced by Claude Opus 4.6, modified and
reviewed by @ssncferreira ._
Closes https://github.com/coder/internal/issues/1432
Closes https://github.com/coder/internal/issues/1399
The test setup in `createWorkspaceWithApps` opens a short-lived RPC
connection to fetch the agent manifest before starting the real agent.
This connection used `ConnectRPC()` which sends no `role` parameter, so
the server treated it as a real agent connection and enabled connection
monitoring. When the helper closed, its monitor asynchronously wrote
`disconnectedAt` to the DB — racing with the real agent's monitor and
transiently marking the agent as disconnected.
The fix uses `ConnectRPCWithRole(ctx, "apptest-manifest")` so the helper
doesn't trigger connection monitoring. The server already has this
role-based distinction for non-agent clients like
`coder-logstream-kube`; the test helper just wasn't using it.
Both issues share this codepath: `setupProxyTest` →
`createWorkspaceWithApps` → the `ConnectRPC` call at `setup.go:518`.
Both test configurations have a non-empty `PrimaryAppHost`, so both
enter the affected block.
This is not masking a product issue — the "disconnected" state was
caused by two competing monitors writing to the same agent DB row, a
scenario that only exists in this test setup. No assertions were
weakened; the proxy still checks real agent connectivity on every
request.
The copy button on the last assistant message was showing even while the
turn was still in progress (agent streaming or running tool calls). The
content is not final at that point, so the button should be suppressed
until the turn completes.
The `lastAssistantPerTurnIds` computation unconditionally included the
trailing assistant message. Now it checks a new `isTurnActive` prop
derived from `isActiveChatStatus(chatStatus) || hasStreamState` and
skips the trailing ID when the turn is active. Completed turns (those
followed by a user message) are unaffected.
Bumps [github.com/gohugoio/hugo](https://github.com/gohugoio/hugo) from
0.158.0 to 0.159.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/gohugoio/hugo/releases">github.com/gohugoio/hugo's
releases</a>.</em></p>
<blockquote>
<h2>v0.159.2</h2>
<p>Note that the security fix below is not a potential threat if you
either:</p>
<ul>
<li>Trust your Markdown content files.</li>
<li>Have custom <a href="https://gohugo.io/render-hooks/">render hook
template</a> for links and images.</li>
</ul>
<p>EDIT IN: This release also adds release archives for
non-extended-withdeploy builds.</p>
<h2>What's Changed</h2>
<ul>
<li>Fix potential content XSS by escaping dangerous URLs in Markdown
links and images 479fe6c6 <a
href="https://github.com/bep"><code>@bep</code></a></li>
<li>resources/page: Fix shared reader in
Source.ValueAsOpenReadSeekCloser df520e31 <a
href="https://github.com/jmooring"><code>@jmooring</code></a> <a
href="https://redirect.github.com/gohugoio/hugo/issues/14684">#14684</a></li>
</ul>
<h2>v0.159.1</h2>
<p>The regression fixed in this release isn't new, but it's so subtle
that we thought we'd release this sooner rather than later. For some
time now, the minifier we use have stripped namespaced attributes in
SVGs, which broke dynamic constructs using e.g. <a
href="https://alpinejs.dev/directives/bind">AlpineJS' x-bind:</a>
namespace (library used by Hugo's <a
href="https://gohugo.io/">documentation site</a>).</p>
<p>To fix this, the upstream library has hadded a
<code>keepNamespaces</code> slice option. It was not possible to find a
default that would make all happy, so we opted for an option that at
least would make AlpineJS sites work out of the box:</p>
<pre lang="toml"><code> [minify.tdewolff.svg]
keepNamespaces = ['', 'x-bind']
</code></pre>
<h2>What's Changed</h2>
<ul>
<li>minifiers: Keep x-bind and blank namespace in SVG minification
42289d76 <a href="https://github.com/bep"><code>@bep</code></a> <a
href="https://redirect.github.com/gohugoio/hugo/issues/14669">#14669</a></li>
</ul>
<h2>v0.159.0</h2>
<p>This release greatly improves and simplifies management of
Node.js/npm dependencies in a multi-module setup. See <a
href="https://gohugo.io/hugo-modules/nodejs-dependencies/">this page</a>
for more information.</p>
<h2>Note</h2>
<ul>
<li>Replace deprecated site.Data with hugo.Data in tests a8fca598 <a
href="https://github.com/bep"><code>@bep</code></a></li>
<li>Replace deprecated excludeFiles and includeFiles with files in tests
182b1045 <a href="https://github.com/bep"><code>@bep</code></a></li>
<li>Replace deprecated :filename with :contentbasename in the permalinks
test eb11c3d0 <a
href="https://github.com/bep"><code>@bep</code></a></li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>tpl/tplimpl: Fix Vimeo shortcode test eaf4c751 <a
href="https://github.com/jmooring"><code>@jmooring</code></a> <a
href="https://redirect.github.com/gohugoio/hugo/issues/14649">#14649</a></li>
</ul>
<h2>Improvements</h2>
<ul>
<li>create: Return error instead of panic when page not found 807cae1d
<a href="https://github.com/mango766"><code>@mango766</code></a> <a
href="https://redirect.github.com/gohugoio/hugo/issues/14112">#14112</a></li>
<li>commands: Preserve non-content files in convert output c4fb61d9 <a
href="https://github.com/xndvaz"><code>@xndvaz</code></a> <a
href="https://redirect.github.com/gohugoio/hugo/issues/4621">#4621</a></li>
<li>npm: Use workspaces to simplify <code>hugo mod npm pack</code>
d88a29e0 <a href="https://github.com/bep"><code>@bep</code></a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="5f4646acaa"><code>5f4646a</code></a>
releaser: Bump versions for release of 0.159.2</li>
<li><a
href="479fe6c654"><code>479fe6c</code></a>
Fix potential content XSS by escaping dangerous URLs in links and
images</li>
<li><a
href="81a5cdca07"><code>81a5cdc</code></a>
releaser: Add standard withdeploy release assets</li>
<li><a
href="df520e3150"><code>df520e3</code></a>
resources/page: Fix shared reader in
Source.ValueAsOpenReadSeekCloser</li>
<li><a
href="b55d452e46"><code>b55d452</code></a>
testing: Simplify line ending handling in tests</li>
<li><a
href="ea7eac6558"><code>ea7eac6</code></a>
readme: Update Go version to 1.25.0</li>
<li><a
href="458ebdd448"><code>458ebdd</code></a>
releaser: Prepare repository for 0.160.0-DEV</li>
<li><a
href="86c7d3afac"><code>86c7d3a</code></a>
releaser: Bump versions for release of 0.159.1</li>
<li><a
href="42289d76f9"><code>42289d7</code></a>
minifiers: Keep x-bind and blank namespace in SVG minification</li>
<li><a
href="0c013c2326"><code>0c013c2</code></a>
Adjust depreceated syntax in tests</li>
<li>Additional commits viewable in <a
href="https://github.com/gohugoio/hugo/compare/v0.158.0...v0.159.2">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Fixes: coder/internal#1441
- Move `contextConfigAPI` init from `handleManifest` to `init()`,
matching all other API fields
- Change `agentcontextconfig.NewAPI` to accept `func() string` closure
(lazy directory evaluation)
- `Config()` and HTTP handler now compute on demand via
`a.manifest.Load().Directory`
- Widen `TestAgent_Reconnect` to loop 5 reconnections with a non-empty
manifest directory
- Add `TestContextConfigAPI_InitOnce` internal test verifying lazy eval
across manifest changes
- Add `TestNewAPI_LazyDirectory` unit test for the lazy contract
> 🤖 Written by a Coder Agent. Reviewed by a human.
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps ubuntu from `ce4a593` to `5e5b128`.
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
This pull-request moves using to using the plain `radix-ui` package over
`@radix-ui/react-*` packages. Put simply, now we're not going to run
into issues with inconsistent radix dependencies. This will have no
effect to how the code is built, but will give us a single place to
import from.
Previously, `CreateChat` inserted the `chats` row with the DB default
status (`waiting`), then updated it to `pending` in the same transaction
via `setChatPendingWithStore`. This wasted two extra queries per chat
creation (`GetChatByID` + `UpdateChatStatus`) and rewrote the same row
immediately after inserting it.
Now `CreateChat` passes the status directly to `InsertChat`, so the row
is written once in its final create-time state. The
`setChatPendingWithStore` helper is removed entirely. `InsertChat` now
requires an explicit `status` parameter at all callsites instead of
relying on a DB column default.
## Motivation
On an experimental branch we're trialing firing all chatd notifications
from plpgsql triggers. The old two-step insert made that awkward: in an
`AFTER INSERT` trigger, `NEW` only contained the insert-time row
(`waiting`), not the final committed state (`pending`). To emit the
correct event payload the trigger had to be deferred and re-read the row
from `chats` at commit time.
With this change, `NEW` already contains the correct row to publish — no
deferred trigger, no extra `SELECT`, simpler and cheaper trigger logic.
That said, this seems like a worthwhile change regardless of the trigger
experiment: writing the final row state once removes unnecessary DB work
on every chat creation and makes the create path easier to reason about.
Chat title generation used free-form text completion, which let models
respond conversationally instead of producing a title. Review chats
started with GitHub URLs were especially affected — models would say "I
don't have the ability to browse external links" and that string became
the persisted title.
Replace the raw-text `generateShortText` path with structured output via
`object.Generate[generatedTitle]`. Both auto-title and manual retitle
now go through the same typed contract: the model must return a JSON
object with a `title` field, validated and normalized before
persistence. Invalid outputs (empty, too long) are rejected and retried
through the existing candidate-model fallback loop.
<!--
If you have used AI to produce some or all of this PR, please ensure you have read our [AI Contribution guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING) before submitting.
-->
relates to GRU-18
Adds support for tailnet updates to Tunneler FSM.
This adds full RC release support to the release scripts and GitHub
Actions workflow. Previously, the tooling only supported stable and
mainline releases with strict vMAJOR.MINOR.PATCH semver tags.
Changes:
- scripts/releaser/version.go: Add Pre field to version struct for
prerelease suffixes (e.g. "rc.0"), update regex, parsing, String(),
comparison methods, and add IsRC()/rcNumber() helpers.
- scripts/releaser/release.go: Detect RC branches (release/X.Y-rc.N),
suggest RC version numbers, auto-set "rc" channel (skipping
stable/mainline prompt), add RC advisory to release notes, skip docs
update for RC releases.
- .github/workflows/release.yaml: Add "rc" channel option, fix branch
derivation for RC tags (v2.32.0-rc.0 -> release/2.32-rc.0 instead of
broken release/2.32.0-rc), skip homebrew/winget/package publishing for
RC releases.
- scripts/release/publish.sh: Add --rc flag, pass --prerelease to gh
release create for RC releases.
- scripts/releaser/version_test.go: Add comprehensive unit tests for
version parsing, string formatting, IsRC, rcNumber, GreaterThan, and
Equal with RC versions.
<!--
If you have used AI to produce some or all of this PR, please ensure you
have read our [AI Contribution
guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING)
before submitting.
-->
<!--
If you have used AI to produce some or all of this PR, please ensure you have read our [AI Contribution guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING) before submitting.
-->
relates to GRU-18
Adds support for network application (e.g. SSH) updates to Tunneler.
The `echo_latest_mainline_version()` function fetches all GitHub
releases and sorts by version number to find the latest mainline
release. It did not filter out pre-release tags (e.g. `v2.32.0-rc.0`),
so publishing an RC release caused `coder.com/install.sh` to resolve the
RC as the latest mainline version instead of the actual mainline
release.
Adds a `grep` filter for strict semver (`MAJOR.MINOR.PATCH`) before
sorting, so tags with pre-release suffixes like `-rc.0` are excluded
from version resolution.
This is required to prevent the agent from becoming unhealthy.
Since we are stopping the workspace now, also add a confirmation dialog.
Also add stories to test the new behavior and make a tweak to the
permissions query in support of that.
We recently upgraded storybook and vite in #23485 which bumped our
`storybook` version from 10.2.10 to 10.3.3. In 10.2.16,
storybookjs/storybook#34045 was merged that changes the list of default
allowed hosts to an empty array. This means if you have custom DNS set
up (like through the Coder desktop app) your `.coder` domain will no
longer be able to reach storybook and you'll get an `Invalid host`
response. This is a breaking change, but storybook didn't treat it as
such.
This PR adds the `core.allowedHosts` config to our storybook dev server.
I'm not sure this has the same effect for build so I left the other
`viteFinal` `server.allowedHosts` config, but it may be defunct
The "Thinking..." indicator flickered or failed to appear when the user
sent a message.
## Problem
The server sends `status:pending` before `status:running` when
processing a new message. `selectIsAwaitingFirstStreamChunk` only
accepted `"running"`, so during the pending window the indicator was
hidden. When the optimistic `setChatStatus("running")` from `handleSend`
was overridden by the WS `status:pending` event, the indicator would
flash and disappear.
Secondarily, `StreamingOutput` hid the indicator as soon as
`streamState` became non-null, even when no text/reasoning blocks
existed yet (e.g. only tool-call parts or whitespace-only deltas had
arrived).
## Fix
1. **`chatStore.ts`** — `selectIsAwaitingFirstStreamChunk` now also
accepts `chatStatus === "pending"` when the latest durable message is a
user message (fresh send). Tool-call cycles (where latest =
assistant/tool) remain unaffected.
2. **`StreamingOutput.tsx`** — During streaming, the component keeps
showing "Thinking..." until a text or reasoning block appears, bridging
the visual gap between the startup placeholder and the first visible
content.
3. **`streamState.ts`** — Changed the early-return guard for
text/reasoning parts from `!part.text` to `!part.text?.trim()` so
whitespace-only deltas don't create a non-null `StreamState` with empty
blocks.
<details><summary>Decision log</summary>
- Including `"pending"` in `isAwaitingFirstStreamChunk` was previously
rejected because it caused the 15-second "startup taking longer" warning
during tool-call cycles. The `latestMessage?.role === "user"` guard now
prevents that — during tool cycles the latest durable message is
assistant/tool, not user.
- The `StreamingOutput` streaming-thinking check uses a synthetic
`"starting"` status for `ChatStatusCallout` rather than adding a new
phase to `LiveStatusModel`, keeping the status model clean.
- The whitespace trim fix in `streamState.ts` is defense-in-depth — the
`StreamingOutput` fix handles the rendering gap, but preventing
empty-block `StreamState` creation is the correct behavior at the
source.
</details>
This refactors `<Tabs />` into two clearer patterns: link tabs for route
navigation and Radix tabs for stateful tab panels. That gives us proper
accessibility semantics where we need them without overloading simple
navigation tabs.
As part of that split, this updates several consumers, adds coverage for
both variants, and cleans up some nearby styling.
- introduce Radix-backed tabs primitives for tabbed content
- move router-based tabs to `LinkTabs`
- update notifications, IdP sync, and workspace build pages to use
semantic tabs
- preserve route navigation tabs for groups and templates
- add stories/tests for both tab implementations
- simplify related layout and styling in touched components
Closes#22244
This pull-request makes our `<Alert />`'s more inline with the Figma
style-system, we're looking to ensure that these are vertically rendered
now and not horizontal WCAG nightmares.
---------
Co-authored-by: Danielle Maywood <danielle@themaywoods.com>
The sticky user message in the chat timeline had two visual issues:
1. **Dead space during scroll** — the clipping calculation subtracted
48px prematurely (`fullHeight - scrolledPast - 48`), causing the message
to shrink before its content had actually left the viewport. Removed the
offset so clipping begins exactly when content scrolls out of view.
2. **Blur/gradient popping in abruptly** — the `--fade-opacity` variable
was a binary 0/1 toggle. Now it ramps 0→1 over the last 40px before
`MIN_HEIGHT`, so the blur and bottom gradient only appear when the
message is fully compressed.
Also added a longer (~25 line) user message to the `WithMessageHistory`
story to make the sticky behavior easier to test visually.
Replace hardcoded paths for instruction files, skills, and MCP config
with
values read from `CODER_AGENT_EXP_*` environment variables. Template
authors
configure paths via the existing `coder_agent` `env` block. The agent
resolves `~`, relative, and absolute paths locally, then serves the
resolved config over `GET /api/v0/context-config`. `chatd` fetches this
once per workspace attach and falls back to today's defaults for older
agents.
All path env vars are comma-separated, allowing multiple directories:
| Env Var | Default | Controls |
|---|---|---|
| `CODER_AGENT_EXP_INSTRUCTIONS_DIRS` | `~/.coder` | Dirs containing the
instruction file |
| `CODER_AGENT_EXP_INSTRUCTIONS_FILE` | `AGENTS.md` | Instruction file
name |
| `CODER_AGENT_EXP_SKILLS_DIRS` | `.agents/skills` | Skills directories
|
| `CODER_AGENT_EXP_SKILL_META_FILE` | `SKILL.md` | Skill metadata file
name |
| `CODER_AGENT_EXP_MCP_CONFIG_FILES` | `.mcp.json` | MCP config files |
### Example
```hcl
resource "coder_agent" "main" {
os = "linux"
arch = "amd64"
env = {
CODER_AGENT_EXP_INSTRUCTIONS_DIRS = "/opt/company/agent-config,~/.coder"
CODER_AGENT_EXP_INSTRUCTIONS_FILE = "CLAUDE.md"
CODER_AGENT_EXP_SKILLS_DIRS = "/opt/company/ai-skills,.agents/skills"
CODER_AGENT_EXP_MCP_CONFIG_FILES = "/opt/company/mcp.json,.mcp.json"
}
}
```
<details>
<summary>Implementation Details</summary>
### Architecture
Follows the same pattern as MCP tool discovery:
agent resolves locally → exposes via HTTP → chatd consumes.
**Agent-side** (`agent/agentcontextconfig/`):
- `ResolvePath` / `ResolvePaths` handle `~`, relative, and absolute path
forms; returns `""` for relative paths when baseDir is empty
- `Config` reads env vars, falls back to defaults, resolves all paths
- `GET /api/v0/context-config` serves the resolved config as JSON
**chatd-side** (`coderd/x/chatd/`):
- Calls `conn.ContextConfig()` once on first workspace attach
- Falls back to hardcoded defaults on 404 (older agents)
- Iterates instruction dirs, skills dirs using resolved absolute paths
- `LSRelativityRoot` everywhere — no more home/root juggling
### Key design decisions
- **`EXP_` prefix**: env vars use `CODER_AGENT_EXP_*` to indicate
experimental status
- **Plural names**: comma-separated vars use plural names (`DIRS`,
`FILES`); single-value vars use singular (`FILE`)
- **Defaults in `workspacesdk`**: default constants live in
`codersdk/workspacesdk/` so both agent and server reference them without
cross-layer imports
- **`skillMetaFile` persistence**: stored on context-file parts via
`ContextFileSkillMetaFile` and restored on subsequent chat turns so
custom values survive across turns
- **Working dir dedup**: `slices.Contains` guard prevents reading the
same instruction file from both `InstructionsDirs` and the working
directory
- **MCP server dedup**: first-occurrence-wins dedup prevents leaking
duplicate connections from overlapping config files
- **ResolvePath safety**: returns `""` for relative paths when `baseDir`
is empty, so `ResolvePaths` filters them out
### Files changed
| File | Change |
|---|---|
| `agent/agentcontextconfig/` | New package — path resolution + HTTP
endpoint |
| `codersdk/workspacesdk/agentconn.go` | `ContextConfigResponse` type,
default constants, client method |
| `agent/agent.go` + `agent/api.go` | Wire up endpoint, pass config to
MCP |
| `agent/x/agentmcp/manager.go` | Accept `[]string` MCP config paths,
dedup by name |
| `coderd/x/chatd/chatd.go` | Fetch config, thread through, named
returns |
| `coderd/x/chatd/instruction.go` | Accept configurable dir + file name,
`skillMetaFileFromParts` |
| `coderd/x/chatd/chattool/skill.go` | Accept configurable dirs + meta
file |
| `codersdk/chats.go` | `ContextFileSkillMetaFile` field for persistence
|
### Test coverage
- `TestConfig` (4 cases): defaults, custom env vars, whitespace
trimming, comma-separated dirs
- `TestResolvePath` / `TestResolvePaths`: including empty baseDir edge
case
- `TestPersistInstructionFilesFallbackOnOlderAgent`: backward-compat
path when `ContextConfig` returns 404
- `TestChatMessagePartVariantTags`: updated exclusion list for new
internal field
### Backward compatibility
Older agents return 404 for the new endpoint. `chatd` catches this and
falls back to today's defaults via `readHomeInstructionFile` (using
`LSRelativityHome`). Existing workspaces work with no changes.
</details>
The terminal panel in the agents sidebar generated a fresh
`reconnectionToken` via `crypto.randomUUID()` on every mount. Navigating
between chats or reloading the page orphaned the PTY session.
- Use the chat ID (`agentId`) as the reconnection token for
`TerminalPanel`
- Add optional `chatId` prop to `TerminalPanel`, falling back to a
random UUID when not provided
- Thread `agentId` from `AgentChatPageView` to `TerminalPanel`
This mirrors how the dedicated Terminal page persists sessions via a
URL-stored token.
> 🤖 Written by a Coder Agent. Reviewed by a human.
Previously the command required exactly two arguments, forcing users to
run it multiple times to declare multiple dependencies for a single
unit.
This accepts variadic depends-on arguments so all dependencies can be
declared in one call:
```
coder exp sync want my-unit dep-1 dep-2 dep-3
```
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Marcin Tojek <mtojek@users.noreply.github.com>
Move !isSavingMessage to the outer toolbar guard so the gradient
container does not mount empty during save. Remove the now-redundant
inner guard.
Add flex to the assistant copy button wrapper div. The plain block
wrapper with an inline-flex button created a line box whose height
depended on the inherited non-integer line-height (14px * 1.625 =
22.75px strut). Sub-pixel rounding during hover repaints caused a
1px jitter. Making it a flex container eliminates the strut.
Add behavioral assertions to UserMessageCopyButton: click edit and
assert onEditUserMessage fires, click copy and assert writeText is
called with the raw markdown.
Add MultiAssistantTurnCopyButton regression story for the
isLastAssistantMessage fix.
Refs #23850
Fixes issues found in post-merge review of #23891 and #23892.
- **P2:** Export `_resetForTesting()` from `chime.ts` to break
cross-test cache dependency; call in `beforeEach`
- **P2:** Add `KylesophyToggle` and `TogglesKyleosophy` Storybook
stories
- **P3:** Fix JSDoc on `maybePlayChime` — terminal states are
`waiting|pending`, not `waiting|error`
- **P3:** Rename `setKylesophyLocal` back to `setLocalKyleosophy` to
match `setLocal*` convention
- Preserve original `location` property descriptor in
`isKylesophyForced` tests to avoid leaking mutated descriptors across
test suites (#23892 review)
> 🤖 Written by a Coder Agent. Reviewed by a human.
## Add terminal panel to chat sidebar
Extract the reusable terminal runtime from `TerminalPage` into
`modules/terminal/` and wire it into the agents chat right sidebar as a
new **Terminal** tab.
### Changes
- **`modules/terminal/WorkspaceTerminal.tsx`** — Shared xterm +
websocket terminal component (container-sized, no route dependency)
- **`modules/terminal/WorkspaceTerminalAlerts.tsx`** — Moved from
`TerminalPage/` to shared module
- **`pages/AgentsPage/TerminalPanel.tsx`** — Sidebar wrapper around
`WorkspaceTerminal`
- **`pages/AgentsPage/AgentDetailView.tsx`** — Terminal tab added (gated
on `hasWorkspace`)
- **`pages/TerminalPage/TerminalPage.tsx`** — Slimmed to page-shell
using shared component
### Demo
[dogfood-terminal-demo.webm](https://github.com/user-attachments/assets/359200dc-f8e4-4a9a-b00b-923f142dc228)
### Behavior
- Terminal tab appears only when the chat has a workspace with a
connected agent
- Connects via the existing workspace agent PTY websocket
- Resizes correctly on panel width changes, expand/collapse, and
viewport resize
- `fitAddon.fit()` guarded against pre-renderer crashes (fixes proxy
access)
- Tab switching unmounts/remounts cleanly (reconnects via session token)
- No changes to Git or Desktop panel behavior
---
_Generated with [`mux`](https://github.com/coder/mux) • Model:
`anthropic:claude-opus-4-6` • Thinking: `xhigh`_
Refs #23897
- Rename user-facing "chats" to "Coder Agents" (feature name) or
"conversations" (individual instances)
- Covers UI strings, docs prose, Storybook stories, and aria labels
- API paths, internal code identifiers, and the "Chats API" docs page
name are intentionally left unchanged
- TaskPage / AI Tasks are out of scope
> 🤖 Written by a Coder Agent. Will be reviewed by a human.
Sometimes clicking **Edit** on a chat message does not populate the
composer with the message text, and the edit flow had a few timing bugs
around Lexical hydration. The composer was relying on
`key={initialValue}` on `LexicalComposer`, so re-editing the same text
could produce no state change, no remount, and an empty editor.
This PR keeps the editor mounted and switches edit flows to an
imperative `setValue()` API on `ChatMessageInputRef`. It also hardens
that API so draft reads and writes stay correct across initial
hydration: canceling edit no longer refocuses on mobile, pre-edit draft
snapshots preserve persisted drafts, early `setValue()` calls buffer
until the editor is ready, and `getValue()` falls back before readiness
but reads live editor state after attach.
Add a hover-reveal copy button to both user and assistant messages
in the agents chat. Copies raw markdown to the clipboard, preserving
formatting for pasting into markdown-aware editors.
The button uses the existing useClipboard hook and matches the
visual pattern established by the edit button on user messages
(opacity-0 with group-hover reveal and focus-visible support).
For assistant messages, the button sits below the response content.
For user messages, it sits inline alongside the edit button.
Messages with no copyable text content (e.g. tool-only messages)
do not show the button.
Fixes#23897 (docs link only — naming rename is in #23905)
- Fix version stripping logic in both Go (`codersdk/deployment.go`) and
TypeScript (`site/src/utils/docs.ts`) to preserve `-rc.X` suffixes
instead of amputating them along with `-devel`
- Add `v0.0.0` fallback in the TS frontend to match Go backend behavior
for dev builds
- Add tests covering RC, devel, and plain release version strings
> 🤖 Written by a Coder Agent. Will be reviewed by a human.
## Problem
Subagent chats were receiving git context (branch, remote origin, PR
status) from their parent or sibling chats' git operations. When a git
operation triggers external auth, the workspace agent sends `chat_id`
identifying which chat initiated it — but this was broken at two levels:
1. **Agent side:** `CODER_CHAT_ID` was never injected into process
environments. `chatd` sets `Coder-Chat-Id` HTTP headers and the
agent extracts them for process isolation, but never propagated
`CODER_CHAT_ID` to `cmd.Env`. So `gitaskpass` always sent an empty
`chat_id`.
2. **Server side:** `workspaceAgentsExternalAuth` ignored the `chat_id`
query param. `MarkStale` broadcast git context to **all** chats on
the workspace via `filterChatsByWorkspaceID`.
## Fix
- Inject `CODER_CHAT_ID` into `cmd.Env` in `agentproc` when the chat
ID is known, so `gitaskpass` can read and forward it.
- Read `chat_id` from query params in `workspaceAgentsExternalAuth`
and thread it through `chatGitRef`.
- Refactor `MarkStale` to accept a `MarkStaleParams` struct. When
`ChatID` is provided, target only that specific chat. When empty
(legacy agents, non-chat git operations), fall back to the existing
workspace-wide broadcast.
- Extract `markStaleSingle` helper to deduplicate the upsert+publish
logic.
<details><summary>Investigation notes</summary>
### Data flow before fix
```
chatd → sets Coder-Chat-Id header on agent conn
agent → extracts chatID, stores on process struct
agent → does NOT set CODER_CHAT_ID in cmd.Env ← gap 1
gitaskpass → reads CODER_CHAT_ID (always empty), sends chat_id=""
server handler → ignores chat_id query param ← gap 2
MarkStale → broadcasts to ALL workspace chats
```
### Data flow after fix
```
chatd → sets Coder-Chat-Id header on agent conn
agent → extracts chatID, stores on process struct
agent → sets CODER_CHAT_ID in cmd.Env
gitaskpass → reads CODER_CHAT_ID, sends chat_id=<uuid>
server handler → reads chat_id, passes to MarkStale
MarkStale → targets only that specific chat
```
</details>
After sending a message, `handleSend` clears stream state and inserts
the user message but did not set `chatStatus` to `"running"`. Combined
with #23805 narrowing `selectIsAwaitingFirstStreamChunk` to only
match `chatStatus === "running"` (instead of `isActiveChatStatus` which
included `"pending"`), the "Thinking..." indicator could not appear
until
the WebSocket delivered `status:running` — a 50–500ms+ gap.
Optimistically set `chatStatus` to `"running"` in both the send and edit
paths after the POST returns (non-queued). The WebSocket
`status:running`
event no-ops via the `setChatStatus` guard; error/pending events
override
the optimistic value.
<details><summary>Investigation & decision log</summary>
### Root cause chain
1. **PR #23805** (`953c3bdc0`) changed
`selectIsAwaitingFirstStreamChunk`
from `isActiveChatStatus(state.chatStatus)` → `state.chatStatus ===
"running"`.
Valid fix: during `"pending"`, `shouldApplyMessagePart()` drops stream
parts,
so `streamState` stays null and the 15s "startup taking too long"
warning
fired spuriously during multi-turn tool-call cycles.
2. **PR #23884** (`4b5265695`) fixed event ordering within a WebSocket
batch
so both `[message_part, status:running]` and `[status:running,
message_part]`
orderings show "Thinking...". Correct fix, but only operates **after**
`chatStatus` reaches `"running"`.
3. `handleSend` never set `chatStatus` optimistically — it relied
entirely on
the WebSocket `status:running` event. After #23805 narrowed the
selector,
the gap between POST completion and WebSocket event became visible.
### Why this fix is safe
- Non-queued POST = server accepted the message → `"running"` is the
correct
next state.
- `setChatStatus("running")` guard: `if (state.chatStatus === status)
return`
makes the subsequent WebSocket confirmation a no-op.
- If the server transitions to error/pending instead, the WebSocket
event
overrides the optimistic value.
- `shouldApplyMessagePart()` returns `true` for `"running"`, so early
stream
parts arriving before the WebSocket `status:running` will not be
silently
dropped.
### What was NOT regressed by PR #23884
PR #23884's `setTimeout(0)` deferred flush is correct. Both event
orderings
now produce a render cycle where `chatStatus === "running"` and
`streamState === null`, allowing "Thinking..." to appear. The
`setTimeout(0)`
fires in a separate macrotask, giving the browser a paint opportunity.
</details>
## Problem
Every `GET /api/experimental/chats/{chatID}` call was blocking for
200-800ms because the `getChat` handler called `resolveChatDiffStatus`,
which unconditionally hit the git provider API (e.g. GitHub's `GET
/repos/{owner}/{repo}/pulls?head=...`) via `ResolveBranchPullRequest` —
even when the cached diff status was fresh.
This made every chat page load at `/agents/{id}` noticeably slow.
## Root cause
The call chain was:
1. `getChat` → `resolveChatDiffStatus`
2. `resolveChatDiffStatus` → `resolveChatDiffReference` →
`gp.ResolveBranchPullRequest(...)` **(external HTTP call)**
3. Only **after** the external call: `chatDiffStatusIsStale(status,
now)` check
The staleness check happened after the expensive work, so every request
paid the cost regardless of cache freshness.
## Fix
`getChat` now returns the cached `chat_diff_statuses` row directly from
the database. The background `gitsync` worker already keeps these rows
fresh (every `DiffStatusTTL = 120s`), so inline resolution was
redundant.
The `resolveChatDiffContents` endpoint (which fetches actual diff
content) still uses the full resolution path since it needs to make
provider API calls by design.
## Changes
- `getChat` reads cached diff status from DB instead of calling
`resolveChatDiffStatus`
- Remove `resolveChatDiffStatus` (dead code — no production callers)
- Remove `chatDiffStatusIsStale` and `chatDiffStatusTTL` (dead code)
- Remove `RefreshesStaleStatusWithExternalAuth` test (tested the removed
inline refresh path)
<details><summary>Decision log</summary>
- **Why not just add a staleness gate?** The background worker already
handles refreshes on the same schedule. Adding an early-return-if-fresh
would work but leaves dead code for the stale path that's never
exercised in production (the worker gets there first). Removing the
inline path entirely is simpler and eliminates the external API
dependency from the read path.
- **Why keep `resolveChatDiffContents` unchanged?** That endpoint's job
is to fetch the actual diff content from the provider, so external API
calls are inherent to its purpose.
</details>
The "Thinking..." indicator intermittently failed to render after
submitting a message. The behavior depended on the order of events
within a single WebSocket frame.
## Root Cause
`flushMessageParts()` was called before **all** non-`message_part`
events in the batch loop. When the server sent `[message_part,
status:"running"]` in the same SSE chunk:
1. `message_part` → pushed to `partsBuf`
2. `status:"running"` → `flushMessageParts()` applied parts **first** →
`streamState` became non-null → then `chatStatus` set to `"running"`
3. Subscriber saw `streamState != null && chatStatus == "running"` →
`selectIsAwaitingFirstStreamChunk` returned `false` → no "Thinking..."
When events arrived in the reverse order (`[status:"running",
message_part]`), the indicator worked because the status was set before
parts were applied.
## Fix
- Move `flushMessageParts()` to only fire before `message` and `error`
events (which need prior parts visible)
- Add `discardBufferedParts()` for events that clear stream state
(`status:"pending"/"waiting"`, `retry`) so the deferred `setTimeout(0)`
flush doesn't re-populate cleared state
- Status changes are now always applied before parts within a batch, and
the deferred flush gives React one render cycle to show "Thinking..."
| Event | Flush? | Rationale |
|---|---|---|
| `message` | YES | Durable commit must include all stream parts |
| `error` | YES | Partial output should be visible alongside error |
| `status` | NO | Status must be set before parts so "starting" phase
renders |
| `retry` | DISCARD | Retry clears stream state; flushing would
re-populate it |
| `queue_update` | NO | Doesn't interact with stream state |
## Tests (written first, failing before fix)
1. **"shows starting phase when message_part arrives before
status:running in same batch"** — the exact bug scenario
2. **"shows starting phase when status:running arrives before
message_part in same batch"** — verifies the "good" order still works
3. **"discards buffered parts when status transitions to pending"** —
verifies parts don't leak through pending transitions
All tests are deterministic (fake timers, no race conditions).
<details><summary>Implementation plan & decision log</summary>
### Why not reorder events within the batch?
Reordering would change the semantic ordering of events from the server,
which could have subtle side effects. The simpler approach is to be
selective about when parts are flushed.
### Why discard (not flush) before pending/waiting/retry?
These events clear `streamState`. If parts were flushed before the
clear, they'd be visible for one frame then disappear. If the deferred
flush ran after the clear, it would re-populate the state. Discarding is
the only correct behavior.
### Why keep flush before error?
Errors should surface partial output so the user can see what the agent
was doing when it failed.
</details>
- Force-enable Kyleosophy on `dev.coder.com` via hostname check
- Toggle shows as checked + disabled with "Kyleosophy is mandatory on
`dev.coder.com`"
- `isKylesophyForced()` exported for UI and testability
- Tests for forced/non-forced hostname behavior
> 🤖 Written by a Coder Agent. Reviewed by a human.
- Add "Enable Kyleosophy" toggle to Settings > Behavior
- When enabled, replaces standard completion chime with random Kyle
sound clips
- Ships 8 alternative `.mp3` files as static assets (~82KB total)
- localStorage preference (`agents.kyleosophy`), defaults to off
- Pauses orphaned Audio elements on sound URL change to prevent overlap
<details><summary>Review findings addressed</summary>
- **P2** Stale JSDoc on `playChimeAudio` — updated to reflect
parameterized behavior
- **P3** Overlapping audio on rapid completions — added
`chimeAudio?.pause()` before replacement
- **P3** Test ordering dependency — pinned `Math.random` for
determinism, documented cache behavior
- **Nit** Setter naming — `setLocalKyleosophy` → `setKylesophyLocal`
- Toggle moved to bottom of Behavior page per product request
- Description changed to "IYKYK" per product request
</details>
> 🤖 Written by a Coder Agent. Reviewed by a human.
Removes 6 fragile `require.Equal(t, codersdk.ChatStatusPending,
chat.Status)` assertions from chat relay and creation tests.
**Root cause**: In HA tests with two replicas sharing the same DB, the
worker can acquire a just-created chat (flipping `pending → running` via
`AcquireChats`) before the HTTP response reaches the test. All affected
tests already synchronize via `require.Eventually` waiting for `running`
status, making the initial assertion both redundant and racy.
- Remove 5 assertions in `enterprise/coderd/exp_chats_test.go` (all
`TestChatStreamRelay` subtests)
- Remove 1 assertion in `coderd/exp_chats_test.go` (`TestPostChats`)
- An existing comment in `TestPostChats/Success` already documents this
exact race
Fixes flake:
https://github.com/coder/coder/actions/runs/23807597632/job/69385425724
> 🤖 Written by a Coder Agent. Will be reviewed by a human.
Closes#22189
GFM alerts (e.g., `> [!IMPORTANT]`) in Markdown content failed to render
when the alert body contained inline formatting like `**bold**`,
`*italic*`, or `` `code` ``. The alert marker and subsequent text were
merged into a single string node by the parser, causing the type
detection to fail and fall back to a plain blockquote.
Additionally, multi-line alert content (`> line one\n> line two`) lost
its line breaks — all lines collapsed into one.
- Split the alert marker from trailing content in shared string nodes so
type detection works with inline formatting
- Preserve embedded newlines as `<br/>` elements to match GitHub's GFM
alert rendering
- Wrap plain-text children instead of splitting on `\n` to avoid
stripping newline information early
<img width="447" height="187" alt="image"
src="https://github.com/user-attachments/assets/d2fa3495-0b31-483c-97d8-12fed6819e24"
/>
Unarchiving a root chat now restores descendant chats in the database
and emits lifecycle events for every affected chat so passive sessions
converge without a full refetch.
This keeps archive and unarchive symmetric at both the data and
watch-stream layers by returning the affected chat family from the
database, using those post-update rows for chatd pubsub fanout, and
covering descendant lifecycle delivery with a watch-level regression
test.
Closes#23666
- Prompt table was collapsing and sizing improperly, fixed
- Make pretty much everything `text-sm` and `font-normal`
- Add model filter
- Back button on session threads page now navigates back instead of
going straight to `/aibridge/sessions`
---------
Co-authored-by: Jake Howell <jake@hwll.me>
`TestServer_X11_EvictionLRU` hangs forever when the developer's login
shell is `fish`. This is the only test in the repo that breaks on fish,
and it meant I couldn't run `make test` or similar without it blocking
indefinitely.
The test uses `sess.Shell()` to start interactive shell sessions, which
causes the SSH server to run the user's login shell directly (`fish
-l`). Fish buffers all piped stdin to EOF before executing any of it, so
the test's `echo ready-0\n` write never gets processed — fish sits
waiting for the pipe to close, and the test sits waiting for the echo
response.
The fix is a one-line change: `sess.Shell()` → `sess.Start("sh")`. The
test is exercising X11 LRU eviction, not shell behavior, so using `sh`
explicitly is both correct and shell-agnostic. The DISPLAY environment
variable is set identically either way since the x11-req handler runs
before `sessionStart`.
Fixes flaky `TestOpenAIReasoningWithWebSearchRoundTripStoreFalse` and
`TestOpenAIReasoningWithWebSearchRoundTrip`.
## Changes
- Gate the `processChat` control subscriber's cancel callback behind a
`chan struct{}` that is closed after publishing `"running"` status
- Add `TestGatedControlCancel` with 4 subtests exercising the gate logic
<details>
<summary>Root cause analysis</summary>
`SendMessage` publishes a `"pending"` notification on
`chat:stream:<chatID>` via PostgreSQL `NOTIFY`. `processChat` subscribes
to the same channel for control signals. Due to async NOTIFY delivery,
the `"pending"` notification can arrive at the control subscriber
**after** it registers its queue — even though it was published
**before**. `shouldCancelChatFromControlNotification("pending")` returns
`true`, immediately self-interrupting the processor before it does any
work.
The fix gates the cancel callback behind a closed channel. The channel
is closed after `processChat` publishes `"running"` status, so stale
notifications from before initialization are harmlessly ignored.
`close()` provides a happens-before guarantee in the Go memory model.
</details>
> 🤖 Written by a Coder Agent. Reviewed by a human.
Introduces a new `getAIBridgePermissions` method that all AI Bridge
pages can use to restrict access/paywall. Also adds the paywall and
alert to the session threads page bc I totes forgot.
---------
Co-authored-by: Jake Howell <jacob@coder.com>
Replaces the tooltip-inside-tooltip approach for the context usage
indicator with a hover-based Popover. Skill descriptions now appear as
nested tooltips to the right, matching the ModelSelector pattern.
**Before**: Tooltip with inline skill descriptions (truncated, janky
nested tooltips)
**After**: Popover opens on hover, skill names listed cleanly,
descriptions appear to the right on hover
- Popover opens on `mouseEnter`, closes after 150ms delay on
`mouseLeave`
- `onOpenAutoFocus` prevented to avoid stealing chat input focus
- Mobile keeps tap-to-toggle Popover behavior
- Skill rows get subtle `hover:bg-surface-tertiary` highlight
- `TooltipProvider` with `delayDuration={300}` wraps skill items (same
as ModelSelector)
Replaces the generic red `ErrorAlert` ("Forbidden.") with a proactive
permission check and friendly info alert when a user lacks the
`agents-access` role.
- Add `createChat` permission check to `permissions.json` using
`owner_id: "me"`
- Handle `"me"` owner substitution in `renderPermissions` (SSR path)
- Pass `canCreateChat` from `useAuthenticated().permissions` into
`AgentCreateForm`
- Show `ChatAccessDeniedAlert` and disable input immediately (no need to
trigger a 403 first)
- Also catch 403 errors as a fallback in case permissions aren't yet
loaded
- Add `ForbiddenNoAgentsRole` Storybook story with `play` assertions
- Add `TestRenderPermissionsResolvesMe` Go test to pin the `"me"`
sentinel substitution
<details><summary>Implementation plan & decision log</summary>
- Uses the existing `permissions.json` + `checkAuthorization` system
rather than a separate API call
- `owner_id: "me"` is resolved to the actor's ID by both the auth-check
API endpoint and the SSR `renderPermissions` function
- Go test uses a real `rbac.StrictCachingAuthorizer` (not a mock) so it
verifies both the sentinel substitution and the RBAC role evaluation
end-to-end
- Alert follows the exact same `Alert` pattern as the 409 usage-limit
block
- Uses `severity="info"` and links to the getting-started docs Step 3
- Textarea is disabled proactively so the user never sees the scary
generic error
</details>
> 🤖 Created by a Coder Agent and will be reviewed by a human.
Registers a new aibridge provider for ChatGPT by reusing the existing
OpenAI provider with a different `Name` and `BaseURL`
(https://chatgpt.com/backend-api/codex). The ChatGPT backend API is
OpenAI-compatible, so no new provider type is needed.
ChatGPT authenticates exclusively via per-user OAuth JWTs (BYOK mode) —
no centralized API key is configured. The OpenAI provider already
handles this: when no key is set, it falls through to the bearer token
from the request's Authorization header.
Depends on #23811
- Remove `border-surface-secondary` from all line art and use the
default border color
- Use `text-sm` for all timeline elements and tables *(except the
`Thinking...` mono font, that's `text-xs`)
---------
Co-authored-by: Jake Howell <jacob@coder.com>
## Description
Adds support for multiple Copilot provider instances to route requests to different Copilot upstreams (individual, business, enterprise). Each instance has its own name and base URL, enabling per-upstream metrics, logs, circuit breakers, API dump, and routing.
## Changes
* Add Copilot business and enterprise provider names and host constants
* Register three Copilot provider instances in aibridged (default, business, enterprise)
* Update `defaultAIBridgeProvider` in `aibridgeproxy` to route new Copilot hosts to their corresponding providers
## Related
* Depends on: https://github.com/coder/aibridge/pull/240
* Closes: https://github.com/coder/aibridge/issues/152
Note: documentation changes will be added in a follow-up PR.
_Disclaimer: initially produced by Claude Opus 4.6, heavily modified and reviewed by @ssncferreira ._
Renders the `last_injected_context` data (AGENTS.md files and skills)
from the Chat API in the `ContextUsageIndicator` hover tooltip. On
hover, users now see:
- **Context files**: basename with full path on title hover, truncation
indicator
- **Skills**: name and optional description
Separated from the existing token usage info by a border divider when
both sections are present. Added `max-w-72` to prevent the tooltip from
getting too wide.
<img width="970" height="598" alt="image"
src="https://github.com/user-attachments/assets/5bc25cb2-1d92-41d2-ab1a-63e5e49f667a"
/>
<details>
<summary>Data flow</summary>
```
chatQuery.data.last_injected_context
→ AgentChatPage (AgentChatPageView prop)
→ AgentChatPageView (ChatPageInput prop)
→ ChatPageInput (spread into latestContextUsage)
→ AgentChatInput (contextUsage prop)
→ ContextUsageIndicator (usage.lastInjectedContext)
```
</details>
<details>
<summary>Files changed</summary>
| File | Change |
|---|---|
| `ContextUsageIndicator.tsx` | Add `lastInjectedContext` to interface,
render context files and skills sections in tooltip |
| `ChatPageContent.tsx` | Thread `lastInjectedContext` prop, spread into
context usage object |
| `AgentChatPageView.tsx` | Thread `lastInjectedContext` prop to
`ChatPageInput` |
| `AgentChatPage.tsx` | Pass `chatQuery.data?.last_injected_context`
down |
</details>
Archiving a chat now transitions pending or running chats to waiting
before setting the archived flag. This publishes a status notification
on `ChatStreamNotifyChannel` so `subscribeChatControl` cancels the
active `processChat` context via `ErrInterrupted` — the same codepath
used by the stop button.
The `processChat` cleanup also skips queued-message auto-promotion when
the chat is archived, so archiving behaves like a hard stop rather than
interrupt-and-continue.
Relates to https://github.com/coder/coder/issues/23666
_Disclaimer: produced using Claude Opus 4.6, reviewed by me, and
validated against Dogfood dataset._
The `ListAIBridgeSessions` query materialized and aggregated all
matching interceptions before paginating, then ran expensive
token/prompt lookups across the full dataset. For a page of 25 sessions
against ~200k interceptions (our dogfood dataset), this meant:
- Three CTEs scanning all rows (filtered_interceptions, session_tokens,
session_root)
- ARRAY_AGG(fi.id) collecting every interception ID per session
- Lateral prompt lookup via ANY(array_of_all_ids) running for every
session, not just the page
- ~90MB of disk sorts and JIT compilation kicking in
The improvement is to restructure to paginate first and enrich after: a
single CTE groups interceptions into sessions with only cheap aggregates
(MIN, MAX, COUNT), applies cursor pagination and LIMIT, then lateral
joins fetch metadata, tokens, and prompts for just the ~25-row page.
Measured against 220k interceptions / 160k sessions:
| Metric | Before | After |
|--------------------|--------|-------|
| Execution time | 1800ms | 185ms |
| Shared buffer hits | 737k | 2.6k |
| Disk sort spill | 86MB | 16MB |
| Lateral loops | 160k | 25 |
https://grafana.dev.coder.com/goto/fbODPGtvR?orgId=1 the results are
identical, just _much_ faster.
---
Also includes some additional tests which I added prior to refactoring
the query to ensure no regressions on edge-cases.
---------
Signed-off-by: Danny Kopping <danny@coder.com>
Bump the repository Go toolchain from 1.25.7 to 1.25.8.
Updates `go.mod`, the shared `setup-go` action default, and the dogfood
image checksum so local, CI, and dogfood builds stay aligned.
The agent row tooltip showed "Error starting the agent" / "Something
went wrong during the agent startup" with a red border when a startup
script fails. This is misleading — the agent is started and functional,
only the startup script exited with a non-zero code.
Extracts shared message constants (`agentLifecycleMessages`,
`agentStatusMessages`) from `health.ts` so both the workspace-level
health classification and the per-agent-row tooltips reference the same
single source of truth. No more duplicated wording that can drift.
Changes:
- **`health.ts`**: Exports `agentLifecycleMessages` and
`agentStatusMessages` maps; `getAgentHealthIssue` now references them
instead of inline strings.
- **`AgentStatus.tsx`**: All lifecycle/status tooltip components
(`StartErrorLifecycle`, `StartTimeoutLifecycle`,
`ShutdownTimeoutLifecycle`, `ShutdownErrorLifecycle`, `TimeoutStatus`)
now import and render from the shared message constants.
`StartErrorLifecycle` icon changed from red (`errorWarning`) to orange
(`timeoutWarning`).
- **`AgentRow.tsx`**: `start_error` border changed from
`border-border-destructive` (red) to `border-border-warning` (orange).
Closes#23652
Refs #21389
> 🤖 This PR was created with the help of Coder Agents, and has been
reviewed by my human. 🧑💻
The flaky test assumed the second streamed OpenAI request had already
been captured when the chat status event arrived. In practice, the
capture server can record that second request slightly later, which
intermittently left `streamRequestCount` at `1`.
This change waits for the second captured request before asserting on
the follow-up payload and relaxes the count check to a sanity check. The
test still verifies the `store=false` round-trip behavior without
depending on that timing race.
Fixescoder/internal#1433
- Fix workspaces list invalidation after kebab-menu delete and add
Storybook coverage for the immediate `Deleting` state.
> 🤖 This PR was made by Coder Agents and read by me.
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- Add `chat-access` built-in role granting chat CRUD at User scope
- Exclude `ResourceChat` from member, org member, and org service
account `allPermsExcept` calls
- Allow system, owner, and user-admin to assign the new role
- Migration auto-assigns role to users who have ever created a chat
- Update RBAC test matrix: `memberMe` denied, `chatAccessUser` allowed
**Breaking change**: Members without `chat-access` lose chat creation
ability. Migration covers existing chat creators. Members who have never
created a chat do not get this role automatically applied.
> 🤖 This PR was created by a Coder Agent and reviewed by me.
While running scaletests I noticed the archived filter button on the
agents sidebar would disappear when the current filter had zero results.
This made it impossible to switch between active and archived views once
one side was empty.
The filter dropdown was only rendered inside the Pinned or first
time-group section header. When `visibleRootIDs` was empty, neither
header existed, so the filter had nowhere to attach.
This keeps the original dropdown placement on section headers when chats
exist. When the list is empty, the empty-state box itself now provides a
"View archived →" or "← Back to active" link so users can always switch
filters without needing the dropdown.
<img width="322" height="191" alt="image"
src="https://github.com/user-attachments/assets/7fd9ca09-5f72-4796-a925-7fab570fdff5"
/>
<img width="320" height="184" alt="image"
src="https://github.com/user-attachments/assets/2f856088-c2dc-4e34-9ece-84144a1adf79"
/>
Both archived & unarchived look the same when there's at least one
agent:
<img width="322" height="194" alt="image"
src="https://github.com/user-attachments/assets/42c4d54b-e500-45b1-b045-c126144c35bd"
/>
- Use Tooltip instead of Popover for AI Gov tooltip
- Fix Agentic Loop tool call summing
- Collapse all expandable sections by default
- Add solid background to "Show More" button
- Remove "Sort by" dropdown for v1
Fixes factual errors found during a review of all pages under
`/docs/ai-coder/agents/`.
## Tool tables (`index.md`, `architecture.md`)
Both pages had incomplete tool tables. Added:
- `process_output`, `process_list`, `process_signal` — core workspace
tools always registered alongside `execute`, missing from both pages
- `propose_plan` — platform tool (root chats only), missing from both
pages
- `spawn_computer_use_agent` — orchestration tool (conditional), missing
from architecture.md
Also fixed the architecture.md claim that the agent is "restricted to
the tool set defined in this section" — it now mentions skills and MCP
tools with links to the relevant pages.
## Model options (`models.md`)
- **OpenAI / OpenRouter Reasoning Effort**: docs listed `low`, `medium`,
`high` — code has `none`, `minimal`, `low`, `medium`, `high`, `xhigh`.
Fixed both.
- **Removed hidden fields** that never appear in the admin UI:
- Google: Safety Settings (`hidden:"true"`)
- OpenRouter: Provider Order, Allow Fallbacks (parent struct
`hidden:"true"`)
- Vercel: Provider Options (`hidden:"true"`)
---
*PR generated with Coder Agents*
WebKit internally adjusts scrollTop during layout when content
above the viewport changes height, even with overflow-anchor:none.
These phantom adjustments fire scroll events where isNearBottom
returns false. The scroll handler was setting autoScrollRef =
nearBottom on every such event, permanently killing follow mode.
The scroll handler now only enables follow mode, never disables
it. When follow mode is active, the user is not wheel/touch
scrolling, and isNearBottom is false, this indicates a
browser-initiated scroll adjustment. Re-pin immediately and
set the restore guard so the pin's own scroll event is suppressed.
Disabling follow mode is exclusive to user-interaction handlers
(wheel, touch, scrollbar pointerdown) via handleUserInterrupt.
Guard-clear callbacks also check isNearBottom before dropping
the restoration flag, re-pinning if content grew between the
pin and the clear.
## Summary
Fixes three flaky chatd tests that intermittently fail due to timing
races with the background run loop.
Closescoder/internal#1428
## Root Cause
`CreateChat` and `PromoteQueued` call `signalWake()` which writes to
`wakeCh`, triggering `processOnce` immediately. Even though
`newTestServer` sets `PendingChatAcquireInterval: testutil.WaitLong` to
prevent ticker-based polling, the wake channel bypasses this. This
causes `processOnce` to acquire and process the chat concurrently with
the test's manual DB updates and assertions.
### Failing tests
| Test | Failure | Cause |
|------|---------|-------|
| `TestPromoteQueuedAllowsAlreadyQueuedMessageWhenUsageLimitReached` |
`expected: "pending", actual: "running"` | Wake from `CreateChat` races
with manual `UpdateChatStatus`; wake from `PromoteQueued` acquires the
chat before the status assertion |
| `TestSendMessageInterruptBehaviorQueuesAndInterruptsWhenBusy` |
`should have 1 item(s), but has 2` | Wake from `CreateChat` triggers
`processChat` which auto-promotes a queued message, adding an extra row
to `chat_messages` |
| `TestSubscribeNoPubsubNoDuplicateMessageParts` | `Condition satisfied`
(duplicate events) | Pre-existing `WaitGroup.Add/Wait` race in the
`Eventually` + `WaitUntilIdleForTest` pattern |
## Fix
Introduces a `waitForChatProcessed` helper that:
1. Polls until the chat reaches a **terminal state** (not pending AND
not running)
2. Then calls `WaitUntilIdleForTest` to wait for the inflight
`WaitGroup`
Waiting for a terminal state (not just "not pending") avoids a
`sync.WaitGroup` `Add/Wait` race: `AcquireChats` updates the DB status
to `running` **before** `processOnce` calls `inflight.Add(1)`. Checking
only `status != pending` could return while `Add(1)` hasn't happened
yet, causing `Wait()` to return prematurely.
### Per-test changes
- **`TestSendMessageInterruptBehaviorQueuesAndInterruptsWhenBusy`**:
Call `waitForChatProcessed` after `CreateChat` before manually setting
running status
-
**`TestPromoteQueuedAllowsAlreadyQueuedMessageWhenUsageLimitReached`**:
Call `waitForChatProcessed` after `CreateChat`; remove the inherently
racy `status == pending` assertion after `PromoteQueued` (the wake
immediately acquires the chat). Key assertions on promoted message,
queue state, and message count remain.
- **`TestSubscribeNoPubsubNoDuplicateMessageParts`**: Replace inline
`Eventually` with the safer `waitForChatProcessed` helper
## Verification
All three tests pass 150 consecutive executions with `-race -count=10`
across 15 runs (0 failures).
Updates the descriptive text in the proxy selection dropdown menu to be
clearer and more concise.
**Before:**
> Workspace proxies improve terminal and web app connections to
workspaces. This does not apply to CLI connections. A region must be
manually selected, otherwise the default primary region will be used.
**After:**
> Workspace proxies improve terminal and web app connections. CLI
connections are unaffected. If no region is selected, the primary region
will be used.
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Adds a nullable JSONB column `last_injected_context` to the `chats`
table that stores the most recently persisted injected context parts
(AGENTS.md context-file and skill message parts). The column is updated
only when `persistInstructionFiles()` runs — on first workspace attach
or when the agent changes — so there are no redundant writes on
subsequent turns.
Internal fields (`ContextFileContent`, `ContextFileOS`,
`ContextFileDirectory`, `SkillDir`) are stripped at write time so the
column only holds small metadata. No stripping needed on the read path.
<details>
<summary>Implementation notes</summary>
- New migration `000456` adds nullable `last_injected_context JSONB`
column.
- New SQL query `UpdateChatLastInjectedContext` writes the column
without touching `updated_at`.
- `persistInstructionFiles()` strips internal fields from parts via
`StripInternal()` before persisting.
- Sentinel path (no AGENTS.md) persists skill-only parts when skills
exist.
- `codersdk.Chat` exposes `LastInjectedContext []ChatMessagePart`
(omitempty).
- `db2sdk.Chat()` passes through the already-clean data.
</details>
<!--
If you have used AI to produce some or all of this PR, please ensure you have read our [AI Contribution guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING) before submitting.
-->
relates to GRU-18
Adds support for agent updates to the Tunneler
Two ResizeObserver effects (content and container) each had their own
local restoreGuardRafId but both wrote to the shared
isRestoringScrollRef. Either observer's guard-clear RAF could fire
while the other's pin chain was in-flight, leaving isRestoringScrollRef
prematurely false.
scrollTranscriptToBottom also set isRestoringScrollRef without
cancelling any pending guard-clear, so a stale clear could drop
the flag mid-smooth-scroll animation.
Promote restoreGuardRafId to a single shared ref so all write
paths coordinate through one cancellation point.
## Problem
The `/agents` page frequently shows "Response startup is taking longer
than expected" even while the agent is actively working and messages are
appearing in the transcript.
## Root Cause
There's an inconsistency between `isActiveChatStatus` and
`shouldApplyMessagePart` during `"pending"` status (the state between
agent tool-call turns):
| Component | Treats `"pending"` as... |
|---|---|
| `isActiveChatStatus` | **active** — includes both `"running"` and
`"pending"` |
| `shouldApplyMessagePart` | **inactive** — drops all `message_part`
events during `"pending"` |
| Status handler | clears `streamState` to `null` on `"pending"` |
This creates a dead state during multi-turn tool-call cycles:
1. Agent finishes a turn → status = `"pending"` → `streamState` cleared
to `null`
2. `selectIsAwaitingFirstStreamChunk` returns `true` (status is
"active", stream is null, latest message isn't assistant)
3. Phase = `"starting"` → 15s timer starts
4. Stream parts from the server are **silently dropped**
(`shouldApplyMessagePart()` returns `false` for `"pending"`)
5. `streamState` stays `null` — phase is stuck at `"starting"`
6. Meanwhile, durable messages (tool calls, tool results) appear
normally in the transcript
7. After 15s → "Response startup is taking longer than expected" fires
## Fix
Narrow `selectIsAwaitingFirstStreamChunk` to only check `chatStatus ===
"running"` instead of `isActiveChatStatus(chatStatus)`. `"running"` is
the only status where the transport actually accepts stream parts, so
it's the only status where we should be showing the "starting"
indicator.
`isActiveChatStatus` is left unchanged since its other caller
(`shouldSurfaceReconnectState`) correctly needs to include `"pending"`.
Adds three new documentation pages for major shipped features that had
no docs, and updates the platform controls index to reflect current
state.
## New pages
### Extending Agents (`extending-agents.md`)
Covers two workspace-level extension mechanisms:
- **Skills** — `.agents/skills/<name>/SKILL.md` directory structure,
frontmatter format, auto-discovery, `read_skill`/`read_skill_file`
tools, size limits, lazy loading
- **Workspace MCP tools** — `.mcp.json` format, stdio and HTTP
transports, tool name prefixing, discovery lifecycle and caching
### MCP Servers (`platform-controls/mcp-servers.md`)
Admin MCP server configuration:
- CRUD via **Agents** > **Settings** > **MCP Servers**
- Four auth modes: none, OAuth2 (with auto-discovery), API key, custom
headers
- Availability policies: `force_on`, `default_on`, `default_off`
- Tool governance via allow/deny lists
- Permission model and secret redaction
### Usage & Insights (`platform-controls/usage-insights.md`)
Three admin dashboards:
- **Usage limits** — spend caps with per-user and per-group overrides,
priority hierarchy, enforcement behavior
- **Cost tracking** — per-user rollup with token breakdowns, date
filtering, per-model and per-chat drill-down
## Updated files
- **`platform-controls/index.md`** — Moved MCP servers, usage limits,
and analytics from "Where we are headed" into "What platform teams
control today" with links to the new pages. Removed the tool
customization roadmap section (now covered by MCP servers page).
- **`manifest.json`** — Added nav entries for all three new pages.
## Resulting nav hierarchy
```
Coder Agents
├── Getting Started
├── Early Access
├── Architecture
├── Models
├── Platform Controls
│ ├── Template Optimization
│ ├── MCP Servers ← NEW
│ └── Usage & Insights ← NEW
├── Extending Agents ← NEW
└── Chats API
```
---
*PR generated with Coder Agents*
- Enable corepack before the linkspector step so `pnpm` shim is in PATH
- `action-linkspector@v1.4.1` internally calls `actions/setup-node@v5`,
which now defaults `package-manager-cache: true` — it detects
`pnpm-lock.yaml` and tries to resolve the `pnpm` binary, but it's not
installed on the runner
- Add TODO to remove the workaround when upstream is fixed
Upstream: https://github.com/UmbrellaDocs/action-linkspector/issues/54
> 🤖 Cian asked a Coder Agent to make this PR and then reviewed the
change.
## Problem
`aibridgeproxyd` sends `X-AI-Bridge-Request-Id` on every MITM request to
`aibridged` for cross-service log correlation, but aibridged never reads
it. The header is silently forwarded to upstream LLM providers.
## Changes
* Renamed the header to `X-Coder-AI-Governance-Request-Id` to match the
existing `X-Coder-AI-Governance-*` convention.
* `aibridged` now extracts the header, logs it and strips it before
forwarding upstream.
* Added `TestServeHTTP_StripInternalHeaders` to verify no `X-Coder-*`
headers leak to upstream
The terraform testdata fixtures silently drift when the coder provider
releases a new version. The .terraform.lock.hcl files are gitignored,
.tf files use loose constraints (>= 2.0.0), and generate.sh always
runs terraform init -upgrade. The Makefile only re-runs generate.sh
when the terraform CLI version changes, not the provider version.
Track a canonical lockfile and provider-version.txt in git. Change
generate.sh to respect the lockfile by default (terraform init without
-upgrade). Add --upgrade flag for intentional provider bumps, --check
for cheap staleness detection in the Makefile, and a new
update-terraform-testdata make target.
Adds the experimental `docker-chat-sandbox` example template under
`examples/templates/x/`. It provisions a regular dev agent plus a
chat-designated agent that runs inside bubblewrap with a read-only root,
writable `/home/coder`, and outbound TCP restricted to the Coder
control-plane endpoint via `iptables`.
The chat agent still appears in dashboard and API responses, but the
template reserves it for chatd-managed sessions rather than normal user
interaction. `lint/examples` now walks nested template directories, so
experimental templates can live under `examples/templates/x/` without
treating `x/` itself as a template.
scheduleBottomPin() cancelled any in-flight pin and restarted the
double-RAF chain on every ResizeObserver notification. When content
height changes on consecutive frames (e.g. during streaming where
SmoothText reveals characters each frame and markdown re-rendering
occasionally changes block height), the inner RAF that actually sets
scrollTop is perpetually cancelled before it fires. The scroll falls
behind the growing content.
Two fixes:
1. Make scheduleBottomPin() idempotent: if a pin is already in-flight,
skip. The inner RAF reads scrollHeight at execution time so it
always targets the latest bottom. User-interrupt paths (wheel,
touch) still cancel via cancelPendingPins().
2. Add overscroll-behavior:contain to the scroll container. Prevents
elastic overscroll from generating extra scroll events that could
flip autoScrollRef to false.
Replace the raw JSON dump for process_signal with the standard
ToolCollapsible + ToolIcon + ToolLabel pipeline, matching process_list
and other generic tools. A thin ProcessSignalRenderer promotes soft
failures (success=false, isError=false) so the generic renderer shows
the error indicator. ToolLabel distinguishes running, success, and
failure states. TerminalIcon used for consistency with other process
tools.
When a process is killed via process_signal, the execute and
process_output blocks show a red OctagonX icon with signal details
on hover. The killedBySignal field is set on MergedTool during the
existing cross-message parsing pass, no new abstractions.
Stories for process_signal (10) and killed indicators (8). Unit
tests for the cross-tool annotation logic (3). Humanized labels
and TerminalIcon for process_list.
Bumps the github-actions group with 4 updates:
[actions/cache](https://github.com/actions/cache),
[fluxcd/flux2](https://github.com/fluxcd/flux2),
[Mattraks/delete-workflow-runs](https://github.com/mattraks/delete-workflow-runs)
and
[umbrelladocs/action-linkspector](https://github.com/umbrelladocs/action-linkspector).
Updates `actions/cache` from 5.0.3 to 5.0.4
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/cache/releases">actions/cache's
releases</a>.</em></p>
<blockquote>
<h2>v5.0.4</h2>
<h2>What's Changed</h2>
<ul>
<li>Add release instructions and update maintainer docs by <a
href="https://github.com/Link"><code>@Link</code></a>- in <a
href="https://redirect.github.com/actions/cache/pull/1696">actions/cache#1696</a></li>
<li>Potential fix for code scanning alert no. 52: Workflow does not
contain permissions by <a
href="https://github.com/Link"><code>@Link</code></a>- in <a
href="https://redirect.github.com/actions/cache/pull/1697">actions/cache#1697</a></li>
<li>Fix workflow permissions and cleanup workflow names / formatting by
<a href="https://github.com/Link"><code>@Link</code></a>- in <a
href="https://redirect.github.com/actions/cache/pull/1699">actions/cache#1699</a></li>
<li>docs: Update examples to use the latest version by <a
href="https://github.com/XZTDean"><code>@XZTDean</code></a> in <a
href="https://redirect.github.com/actions/cache/pull/1690">actions/cache#1690</a></li>
<li>Fix proxy integration tests by <a
href="https://github.com/Link"><code>@Link</code></a>- in <a
href="https://redirect.github.com/actions/cache/pull/1701">actions/cache#1701</a></li>
<li>Fix cache key in examples.md for bun.lock by <a
href="https://github.com/RyPeck"><code>@RyPeck</code></a> in <a
href="https://redirect.github.com/actions/cache/pull/1722">actions/cache#1722</a></li>
<li>Update dependencies & patch security vulnerabilities by <a
href="https://github.com/Link"><code>@Link</code></a>- in <a
href="https://redirect.github.com/actions/cache/pull/1738">actions/cache#1738</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/XZTDean"><code>@XZTDean</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/cache/pull/1690">actions/cache#1690</a></li>
<li><a href="https://github.com/RyPeck"><code>@RyPeck</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/cache/pull/1722">actions/cache#1722</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/cache/compare/v5...v5.0.4">https://github.com/actions/cache/compare/v5...v5.0.4</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/actions/cache/blob/main/RELEASES.md">actions/cache's
changelog</a>.</em></p>
<blockquote>
<h1>Releases</h1>
<h2>How to prepare a release</h2>
<blockquote>
<p>[!NOTE]<br />
Relevant for maintainers with write access only.</p>
</blockquote>
<ol>
<li>Switch to a new branch from <code>main</code>.</li>
<li>Run <code>npm test</code> to ensure all tests are passing.</li>
<li>Update the version in <a
href="https://github.com/actions/cache/blob/main/package.json"><code>https://github.com/actions/cache/blob/main/package.json</code></a>.</li>
<li>Run <code>npm run build</code> to update the compiled files.</li>
<li>Update this <a
href="https://github.com/actions/cache/blob/main/RELEASES.md"><code>https://github.com/actions/cache/blob/main/RELEASES.md</code></a>
with the new version and changes in the <code>## Changelog</code>
section.</li>
<li>Run <code>licensed cache</code> to update the license report.</li>
<li>Run <code>licensed status</code> and resolve any warnings by
updating the <a
href="https://github.com/actions/cache/blob/main/.licensed.yml"><code>https://github.com/actions/cache/blob/main/.licensed.yml</code></a>
file with the exceptions.</li>
<li>Commit your changes and push your branch upstream.</li>
<li>Open a pull request against <code>main</code> and get it reviewed
and merged.</li>
<li>Draft a new release <a
href="https://github.com/actions/cache/releases">https://github.com/actions/cache/releases</a>
use the same version number used in <code>package.json</code>
<ol>
<li>Create a new tag with the version number.</li>
<li>Auto generate release notes and update them to match the changes you
made in <code>RELEASES.md</code>.</li>
<li>Toggle the set as the latest release option.</li>
<li>Publish the release.</li>
</ol>
</li>
<li>Navigate to <a
href="https://github.com/actions/cache/actions/workflows/release-new-action-version.yml">https://github.com/actions/cache/actions/workflows/release-new-action-version.yml</a>
<ol>
<li>There should be a workflow run queued with the same version
number.</li>
<li>Approve the run to publish the new version and update the major tags
for this action.</li>
</ol>
</li>
</ol>
<h2>Changelog</h2>
<h3>5.0.4</h3>
<ul>
<li>Bump <code>minimatch</code> to v3.1.5 (fixes ReDoS via globstar
patterns)</li>
<li>Bump <code>undici</code> to v6.24.1 (WebSocket decompression bomb
protection, header validation fixes)</li>
<li>Bump <code>fast-xml-parser</code> to v5.5.6</li>
</ul>
<h3>5.0.3</h3>
<ul>
<li>Bump <code>@actions/cache</code> to v5.0.5 (Resolves: <a
href="https://github.com/actions/cache/security/dependabot/33">https://github.com/actions/cache/security/dependabot/33</a>)</li>
<li>Bump <code>@actions/core</code> to v2.0.3</li>
</ul>
<h3>5.0.2</h3>
<ul>
<li>Bump <code>@actions/cache</code> to v5.0.3 <a
href="https://redirect.github.com/actions/cache/pull/1692">#1692</a></li>
</ul>
<h3>5.0.1</h3>
<ul>
<li>Update <code>@azure/storage-blob</code> to <code>^12.29.1</code> via
<code>@actions/cache@5.0.1</code> <a
href="https://redirect.github.com/actions/cache/pull/1685">#1685</a></li>
</ul>
<h3>5.0.0</h3>
<blockquote>
<p>[!IMPORTANT]
<code>actions/cache@v5</code> runs on the Node.js 24 runtime and
requires a minimum Actions Runner version of <code>2.327.1</code>.</p>
</blockquote>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="668228422a"><code>6682284</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/cache/issues/1738">#1738</a>
from actions/prepare-v5.0.4</li>
<li><a
href="e34039626f"><code>e340396</code></a>
Update RELEASES</li>
<li><a
href="8a67110529"><code>8a67110</code></a>
Add licenses</li>
<li><a
href="1865903e1b"><code>1865903</code></a>
Update dependencies & patch security vulnerabilities</li>
<li><a
href="5656298164"><code>5656298</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/cache/issues/1722">#1722</a>
from RyPeck/patch-1</li>
<li><a
href="4e380d19e1"><code>4e380d1</code></a>
Fix cache key in examples.md for bun.lock</li>
<li><a
href="b7e8d49f17"><code>b7e8d49</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/cache/issues/1701">#1701</a>
from actions/Link-/fix-proxy-integration-tests</li>
<li><a
href="984a21b1cb"><code>984a21b</code></a>
Add traffic sanity check step</li>
<li><a
href="acf2f1f76a"><code>acf2f1f</code></a>
Fix resolution</li>
<li><a
href="95a07c5132"><code>95a07c5</code></a>
Add wait for proxy</li>
<li>Additional commits viewable in <a
href="cdf6c1fa76...668228422a">compare
view</a></li>
</ul>
</details>
<br />
Updates `fluxcd/flux2` from 2.7.5 to 2.8.3
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/fluxcd/flux2/releases">fluxcd/flux2's
releases</a>.</em></p>
<blockquote>
<h2>v2.8.3</h2>
<h2>Highlights</h2>
<p>Flux v2.8.3 is a patch release that fixes a regression in
helm-controller. Users are encouraged to upgrade for the best
experience.</p>
<p>ℹ️ Please follow the <a
href="https://github.com/fluxcd/flux2/discussions/5572">Upgrade
Procedure for Flux v2.7+</a> for a smooth upgrade from Flux v2.6 to the
latest version.</p>
<p>Fixes:</p>
<ul>
<li>Fix templating errors for charts that include <code>---</code> in
the content, e.g. YAML separators, embedded scripts, CAs inside
ConfigMaps (helm-controller)</li>
</ul>
<h2>Components changelog</h2>
<ul>
<li>helm-controller <a
href="https://github.com/fluxcd/helm-controller/blob/v1.5.3/CHANGELOG.md">v1.5.3</a></li>
</ul>
<h2>CLI changelog</h2>
<ul>
<li>[release/v2.8.x] Add target branch name to update branch by <a
href="https://github.com/fluxcdbot"><code>@fluxcdbot</code></a> in <a
href="https://redirect.github.com/fluxcd/flux2/pull/5774">fluxcd/flux2#5774</a></li>
<li>Update toolkit components by <a
href="https://github.com/fluxcdbot"><code>@fluxcdbot</code></a> in <a
href="https://redirect.github.com/fluxcd/flux2/pull/5779">fluxcd/flux2#5779</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/fluxcd/flux2/compare/v2.8.2...v2.8.3">https://github.com/fluxcd/flux2/compare/v2.8.2...v2.8.3</a></p>
<h2>v2.8.2</h2>
<h2>Highlights</h2>
<p>Flux v2.8.2 is a patch release that comes with various fixes. Users
are encouraged to upgrade for the best experience.</p>
<p>ℹ️ Please follow the <a
href="https://github.com/fluxcd/flux2/discussions/5572">Upgrade
Procedure for Flux v2.7+</a> for a smooth upgrade from Flux v2.6 to the
latest version.</p>
<p>Fixes:</p>
<ul>
<li>Fix enqueuing new reconciliation requests for events on source Flux
objects when they are already reconciling the revision present in the
watch event (kustomize-controller, helm-controller)</li>
<li>Fix the Go templates bug of YAML separator <code>---</code> getting
concatenated to <code>apiVersion:</code> by updating to Helm 4.1.3
(helm-controller)</li>
<li>Fix canceled HelmReleases getting stuck when they don't have a retry
strategy configured by introducing a new feature gate
<code>DefaultToRetryOnFailure</code> that improves the experience when
the <code>CancelHealthCheckOnNewRevision</code> is enabled
(helm-controller)</li>
<li>Fix the auth scope for Azure Container Registry to use the
ACR-specific scope (source-controller, image-reflector-controller)</li>
<li>Fix potential Denial of Service (DoS) during TLS handshakes
(CVE-2026-27138) by building all controllers with Go 1.26.1</li>
</ul>
<h2>Components changelog</h2>
<ul>
<li>source-controller <a
href="https://github.com/fluxcd/source-controller/blob/v1.8.1/CHANGELOG.md">v1.8.1</a></li>
<li>kustomize-controller <a
href="https://github.com/fluxcd/kustomize-controller/blob/v1.8.2/CHANGELOG.md">v1.8.2</a></li>
<li>notification-controller <a
href="https://github.com/fluxcd/notification-controller/blob/v1.8.2/CHANGELOG.md">v1.8.2</a></li>
<li>helm-controller <a
href="https://github.com/fluxcd/helm-controller/blob/v1.5.2/CHANGELOG.md">v1.5.2</a></li>
<li>image-reflector-controller <a
href="https://github.com/fluxcd/image-reflector-controller/blob/v1.1.1/CHANGELOG.md">v1.1.1</a></li>
<li>image-automation-controller <a
href="https://github.com/fluxcd/image-automation-controller/blob/v1.1.1/CHANGELOG.md">v1.1.1</a></li>
<li>source-watcher <a
href="https://github.com/fluxcd/source-watcher/blob/v2.1.1/CHANGELOG.md">v2.1.1</a></li>
</ul>
<h2>CLI changelog</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="871be9b40d"><code>871be9b</code></a>
Merge pull request <a
href="https://redirect.github.com/fluxcd/flux2/issues/5779">#5779</a>
from fluxcd/update-components-release/v2.8.x</li>
<li><a
href="f7a168935d"><code>f7a1689</code></a>
Update toolkit components</li>
<li><a
href="bf67d7799d"><code>bf67d77</code></a>
Merge pull request <a
href="https://redirect.github.com/fluxcd/flux2/issues/5774">#5774</a>
from fluxcd/backport-5773-to-release/v2.8.x</li>
<li><a
href="5cb2208cb7"><code>5cb2208</code></a>
Add target branch name to update branch</li>
<li><a
href="bfa461ed21"><code>bfa461e</code></a>
Merge pull request <a
href="https://redirect.github.com/fluxcd/flux2/issues/5771">#5771</a>
from fluxcd/update-pkg-deps/release/v2.8.x</li>
<li><a
href="f11a921e0c"><code>f11a921</code></a>
Update fluxcd/pkg dependencies</li>
<li><a
href="b248efab1d"><code>b248efa</code></a>
Merge pull request <a
href="https://redirect.github.com/fluxcd/flux2/issues/5770">#5770</a>
from fluxcd/backport-5769-to-release/v2.8.x</li>
<li><a
href="4d5e044eb9"><code>4d5e044</code></a>
Update toolkit components</li>
<li><a
href="3c8917ca28"><code>3c8917c</code></a>
Merge pull request <a
href="https://redirect.github.com/fluxcd/flux2/issues/5767">#5767</a>
from fluxcd/update-pkg-deps/release/v2.8.x</li>
<li><a
href="c1f11bcf3d"><code>c1f11bc</code></a>
Update fluxcd/pkg dependencies</li>
<li>Additional commits viewable in <a
href="8454b02a32...871be9b40d">compare
view</a></li>
</ul>
</details>
<br />
Updates `Mattraks/delete-workflow-runs` from
5bf9a1dac5c4d041c029f0a8370ddf0c5cb5aeb7 to
b3018382ca039b53d238908238bd35d1fb14f8ee
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="5bf9a1dac5...5bf9a1dac5">compare
view</a></li>
</ul>
</details>
<br />
Updates `umbrelladocs/action-linkspector` from 1.4.0 to 1.4.1
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/umbrelladocs/action-linkspector/releases">umbrelladocs/action-linkspector's
releases</a>.</em></p>
<blockquote>
<h2>Release v1.4.1</h2>
<p>v1.4.1: PR <a
href="https://redirect.github.com/umbrelladocs/action-linkspector/issues/52">#52</a>
- chore: update actions/checkout to v5 across all workflows</p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="37c85bcde5"><code>37c85bc</code></a>
Merge pull request <a
href="https://redirect.github.com/umbrelladocs/action-linkspector/issues/52">#52</a>
from UmbrellaDocs/action-v5</li>
<li><a
href="badbe56d6b"><code>badbe56</code></a>
chore: update actions/checkout to v5 across all workflows</li>
<li><a
href="e0578c9289"><code>e0578c9</code></a>
Merge pull request <a
href="https://redirect.github.com/umbrelladocs/action-linkspector/issues/51">#51</a>
from UmbrellaDocs/caching-fix-50</li>
<li><a
href="5ede5ac56a"><code>5ede5ac</code></a>
feat: enhance reviewdog setup with caching and version management</li>
<li><a
href="a73cfa2d0f"><code>a73cfa2</code></a>
Merge pull request <a
href="https://redirect.github.com/umbrelladocs/action-linkspector/issues/49">#49</a>
from Goooler/node24</li>
<li><a
href="aee511ae2b"><code>aee511a</code></a>
Update action runtime to node 24</li>
<li>See full diff in <a
href="652f85bc57...37c85bcde5">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps rust from `f7bf1c2` to `1d0000a`.
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Adds suffix-based agent selection for chatd. Template authors can direct
chat traffic to a specific root workspace agent by naming it with the
`-coderd-chat` suffix (for example, `coder_agent "dev-coderd-chat"`).
When no suffix match exists, chatd falls back to the first root agent by
`DisplayOrder`, then `Name`. Multiple suffix matches return an error.
The selection logic lives in `coderd/x/chatd/internal/agentselect` and
is shared by chatd core plus the workspace chat tools so all chat entry
points pick the same agent deterministically.
No database migrations, API contract changes, or provider changes. The
experimental sandbox template was split out to #23777.
Previously, when a CONNECT tunnel was blocked because the destination
resolved to
a private/reserved IP range, the proxy returned 502 Bad Gateway —
implying an
upstream failure rather than a deliberate policy block.
Introduce `blockedIPError` as a sentinel type returned by both
`checkBlockedIP`
and `checkBlockedIPAndDial`. `ConnectionErrHandler` now inspects the
error with
`errors.As` and returns 403 Forbidden for policy blocks, keeping 502 for
genuine
dial failures.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Closes#15129
Adds a generated **Beta features** table on the feature stages doc,
using the same mainline/stable sparse-checkout approach as the
early-access experiments list.
- Walk `docs/manifest.json` for routes with `state: ["beta"]` and render
a table (title, description, mainline vs stable).
- Inject output between `<!-- BEGIN: available-beta-features -->` /
`END` in `docs/install/releases/feature-stages.md`.
- Rename `scripts/release/docs_update_experiments.sh` →
`docs_update_feature_stages.sh`, refresh the script header, and use
`build/docs/feature-stages` for clone output.
<img width="1624" height="1061" alt="image"
src="https://github.com/user-attachments/assets/5fa811dd-9b80-446b-ae65-ec6e6cfedd6a"
/>
Follow-up to #23763.
The custom predicate uses the **SLSA v0.2 schema** (`invocation`,
`configSource`, `metadata`) but declares `predicate-type` as v1.
GitHub's attestation API rejects the mismatch:
```
Error: Failed to persist attestation: Invalid Argument -
predicate is not of type slsa1.ProvenancePredicate
```
This was masked before #23763 because the steps failed earlier on
missing `subject-digest`. Now that digests are provided, this is the
next error.
## Fix
Remove the custom `predicate-type` and `predicate` inputs. Without them,
`actions/attest@v4` auto-generates a correct SLSA v1 predicate from the
GitHub Actions OIDC token — which is what `gh attestation verify`
expects.
- `ci.yaml`: 3 attestation steps (main, latest, version-specific)
- `release.yaml`: 3 attestation steps (base, main, latest)
<details>
<summary>Verification (source code trace of actions/attest@v4)</summary>
1. **`detect.ts`**: No `predicate-type`/`predicate` → returns
`'provenance'` (not `'custom'`)
2. **`main.ts`**: `getPredicateForType('provenance')` →
`generateProvenancePredicate()`
3. **`@actions/toolkit/.../provenance.ts`**:
`buildSLSAProvenancePredicate()` fetches OIDC claims, builds correct v1
predicate with `buildDefinition`/`runDetails`
</details>
> 🤖 This PR was created with the help of Coder Agents, and needs a human
review. 🧑💻
The Linear Release workflow had `cancel-in-progress: true`
unconditionally, so a new push to `main` would cancel an already-running
sync. This meant successive PR merges would show you a bunch of red Xs
on CI, even though nothing was wrong.
<img width="958" height="305" alt="image"
src="https://github.com/user-attachments/assets/1bd06948-ef2d-469f-9d48-a82277a6110c"
/>
Other workflows like CI guard against this with `cancel-in-progress: ${{
github.ref != 'refs/heads/main' }}`.
This PR does the same thing to the linear release workflow. The job will
be queued instead.
<img width="678" height="105" alt="image"
src="https://github.com/user-attachments/assets/931e38c8-3de4-40d6-b156-d5de5726d094"
/>
Letting the job finish is not particularly wasteful or anything since
the sync takes 30~ seconds in CI time.
## Problem
GitHub SLSA provenance attestations have been silently failing on
**every release** since they were introduced. Confirmed across all 10+
release runs checked (v2.29.2 through v2.31.6).
The `actions/attest` action requires `subject-digest` (a `sha256:...`
hash) to identify the artifact being attested, but the workflow only
provided `subject-name` (the image tag like
`ghcr.io/coder/coder:v2.31.6`). This caused every attestation step to
error with:
```
Error: One of subject-path, subject-digest, or subject-checksums must be provided
```
The failures were masked by `continue-on-error: true` and only surfaced
as `##[warning]` annotations that nobody noticed. Enterprise customers
doing `gh attestation verify` would find no provenance records for any
of our Docker images.
> [!NOTE]
> The cosign SBOM attestation (separate step) has been working correctly
the entire time — it uses a different mechanism (`cosign attest --type
spdxjson`) that does not require the same inputs. This fix is
specifically for the GitHub-native SLSA provenance attestations.
## Fix
**Add `subject-digest` to all `actions/attest` steps** (release.yaml +
ci.yaml):
- Base image: capture digest from `depot/build-push-action` output
- Main image: resolve digest via `docker buildx imagetools inspect
--raw` after push
- Latest image: same approach
- Use `subject-name` without tag per the [actions/attest
docs](https://github.com/actions/attest#container-image)
**Update `anchore/sbom-action`** from v0.18.0 to v0.24.0 (node24
support, ahead of the [June 2
deadline](https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/)).
All changes remain non-blocking for the release process
(`continue-on-error: true` preserved).
> 🤖 This PR was created with the help of Coder Agents, and is reviewed
by a human.
These chatd relay tests were seeding chats through
`subscriber.CreateChat(...)`, which wakes the subscriber and can race
local acquisition against the intended remote-worker setup.
Seed waiting and remote-running chats directly in the database instead,
and point the default OpenAI provider at a local safety-net server so
accidental processing fails locally instead of reaching the live API.
Closes https://github.com/coder/internal/issues/1430
This fixes a flaky `TestConfigCache_UserPrompt_ExpiredEntryRefetches` by
making the seeded user prompt entry unambiguously expired before the
cache lookup runs.
The test previously inserted a `tlru` entry with a zero TTL, which
depends on `Set` and `Get` landing in different clock ticks. Switching
that seed entry to a negative TTL keeps the bounded `tlru` cache
behavior while removing the same-tick race.
Close https://github.com/coder/internal/issues/1432
## Problem
When the workspaces table transitions between states (loading →
populated, or populated → empty search results), the table column
headers visibly jump. This happens because the Actions column's width is
content-driven: when workspace rows are present, the action buttons give
it intrinsic width, shrinking the Name/Template/Status columns. When the
body is empty or loading, the Actions column collapses to zero, and the
other columns expand to fill the space.
## Solution
Hide the header content during loading and empty states using
`visibility: hidden` (Tailwind's `invisible` class), which preserves the
row's layout height but hides the text. This prevents the visual jump
since headers aren't visible during the states where column widths
differ.
- **Loading**: first column shows a skeleton bar matching the body
skeleton aesthetic; other columns are invisible
- **Empty search results**: all header content is invisible
- **Populated**: headers display normally
---------
Co-authored-by: Jaayden Halko <jaayden@coder.com>
## Summary
Skills are now discovered once on the first turn (or when the workspace
agent changes) and persisted as `skill` message parts alongside
`context-file` parts. On subsequent turns, the skill index is
reconstructed from persisted parts instead of re-dialing the workspace
agent.
This makes skills consistent with the AGENTS.md pattern and is
groundwork for a future `/context` endpoint that surfaces loaded
workspace context to the frontend.
## Changes
- Add `skill` `ChatMessagePartType` with `SkillName` and
`SkillDescription` fields
- Extend `persistInstructionFiles` to also discover and persist skills
as parts
- Add `skillsFromParts()` to reconstruct skill index from persisted
parts on subsequent turns
- Update `runChat()` to use `skillsFromParts` instead of re-dialing
workspace for skills
- Frontend: handle new `skill` part type (skip rendering, hide
metadata-only messages)
## Before / After
| | AGENTS.md | Skills |
|---|---|---|
| **Before** | Persist as `context-file` parts, reconstruct from parts |
In-memory `skillsCache` only, re-dial workspace on cache miss |
| **After** | Persist as `context-file` parts, reconstruct from parts |
Persist as `skill` parts, reconstruct from parts |
The in-memory `skillsCache` remains for `read_skill`/`read_skill_file`
tool calls that need full skill bodies on demand.
<details><summary>Design context</summary>
This is the first step toward a unified workspace context
representation. Currently:
- Context files are persisted as message parts (works)
- Skills were only in-memory (inconsistent)
- Workspace MCP servers are cached in-memory (future work)
Persisting skills as parts means a future `/context` endpoint can query
both context files and skills from the same message parts in the DB,
without depending on ephemeral server-side caches.
</details>
The llmmock Anthropic stream wrote each chunk as `data:` only, so
Anthropic clients never saw the named SSE events they dispatch on and
Claude responses arrived empty even though the stream completed with
HTTP 200.
Update `sendAnthropicStream()` to emit `event: <type>` and `data:
<json>` for each Anthropic chunk while leaving the OpenAI-style streams
unchanged.
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Problem
Commit 386b449 (PR #23745) changed the `OneWayWebSocketEventSender`
event channel from unbuffered to buffered(64) to reduce chat streaming
latency. This introduced a nondeterministic race in `sendEvent`:
```go
sendEvent := func(event codersdk.ServerSentEvent) error {
select {
case eventC <- event: // buffered channel — almost always ready
case <-ctx.Done(): // also ready after cancellation
}
return nil
}
```
After context cancellation, Go's `select` randomly picks between two
ready cases, so `send()` sometimes returns `nil` instead of `ctx.Err()`.
With the old unbuffered channel the send case was rarely ready (no
reader), masking the bug.
## Fix
Add a priority `select` that checks `ctx.Done()` before attempting the
channel send:
```go
select {
case <-ctx.Done():
return ctx.Err()
default:
}
select {
case eventC <- event:
case <-ctx.Done():
return ctx.Err()
}
```
This is the standard Go pattern for prioritizing one channel over
another. When the context is already cancelled, the first select returns
immediately. The second select still handles the case where cancellation
happens concurrently with the send.
## Verification
- Ran the flaky test 20× in a loop (`-count=20`): all passed
- Ran the full `TestOneWayWebSocketEventSender` suite 5× (`-count=5`):
all passed
- Ran the complete `coderd/httpapi` test package: all passed
Fixescoder/internal#1429
Previously, generating a new agent title used a page-global pending
state, so one in-flight regeneration disabled the action for every chat
in the Agents UI.
This change tracks regenerations by chat ID, updates the Agents page
contracts to use `regeneratingTitleChatIds`, and adds sidebar story
coverage that proves only the active chat is disabled.
Previously, when a user sent a message, there was a 0–1000ms (avg
~500ms) polling delay before processing began.
`SendMessage`/`CreateChat`/`EditMessage` set `status='pending'` in the
DB and returned, but nothing woke the processing loop — it was a blind
1-second ticker.
## Changes
**Event-driven acquisition (main change):** Adds a `wakeCh` channel to
the chatd `Server`. `CreateChat`, `SendMessage`, `EditMessage`, and
`PromoteQueued` call `signalWake()` after committing their transactions,
which wakes the run loop to call `processOnce` immediately. The 1-second
ticker remains as a fallback safety net for edge cases (stale recovery,
missed signals).
**Buffer WebSocket write channel:** Changes the
`OneWayWebSocketEventSender` event channel from unbuffered to buffered
(64), decoupling the event producer from WebSocket write speed. The
existing 10s write timeout guards against stuck connections.
<details><summary>Implementation plan & analysis</summary>
The full latency analysis identified these sources of delay in the
streaming pipeline:
1. **Chat acquisition polling** — 0–1000ms (avg 500ms) dead time per
message. Fixed by wake channel.
2. **Unbuffered WebSocket write channel** — each token blocked on the
previous WS write completing. Fixed by buffering.
3. **PersistStep DB transaction per step** — `FOR UPDATE` lock + batch
insert. Not addressed in this PR (medium risk, would overlap DB write
with next provider TTFB).
4. **Multi-hop channel pipeline** — 4 channel hops per token. Not
addressed (medium complexity).
</details>
<details><summary>Test stabilization notes</summary>
`signalWake()` causes the chatd daemon to process chats immediately
after creation/send/edit, which exposed timing assumptions in several
tests that expected chats to remain in `pending` status long enough to
assert on. These tests were updated with `require.Eventually` +
`WaitUntilIdleForTest` patterns to wait for processing to settle before
asserting.
The race detector (`test-go-race-pg`) shows failures in
`TestCreateWorkspaceTool_EndToEnd` and `TestAwaitSubagentCompletion` —
these appear to be pre-existing races in the end-to-end chat flow that
are now exercised more aggressively because processing starts
immediately instead of after a 1s delay. Main branch CI (race detector)
passes without these changes.
</details>
## Problem
The context-usage indicator ring in the agents chat uses a Radix UI
`Tooltip`, which only opens on hover. On mobile/touch devices there is
no hover event, so tapping the indicator does nothing.
## Fix
On mobile viewports (`< 640 px`, matching the existing
`isMobileViewport()` helper), render a `Popover` instead of a `Tooltip`
so that tapping the ring toggles the context-usage info. Desktop
behavior (hover tooltip) is unchanged.
- Extract the trigger button and content into shared variables to avoid
duplication
- Conditionally render `Popover` (mobile) or `Tooltip` (desktop) based
on viewport width
- Both `Popover` and `PopoverContent` were already imported in the file
Short prompts were producing title-generation meta responses such as "I
am a title generator" and prompt-echo titles. This rewrites the
automatic and manual title prompts to be shorter, less self-referential,
and more focused on returning only the title text.
The change also removes the broader post-generation guard layer, updates
manual regeneration to send real conversation text instead of a meta
instruction, and keeps regression coverage focused on the slimmer prompt
contract.
This patch shows the loading spinner on the AI Bridge Sessions table
during pagination. These API calls can take a noticeable amount of time
on dogfood, and without this it shows stale data while the fetch is
fetching.
Claude with the assist 🤖
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Kayla はな <mckayla@hey.com>
Closes#22136
This pull-request implements a `<ClientFilter />` to our `Request Logs`
page for AI Bridge. This will allow the user to select a client which
they wish to filter against. Technically the backend is able to actually
filter against multiple clients at once however the frontend doesn't
currently have a nice way of supporting this (future improvement).
<img width="1447" height="831" alt="image"
src="https://github.com/user-attachments/assets/0be234e2-25f2-4a89-b971-d74817395da1"
/>
---------
Co-authored-by: Jeremy Ruppel <jeremy.ruppel@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Split out from #23000.
## Problem
`openAppInNewWindow()` opens workspace apps in a slim popup without
`noopener`, leaving `window.opener` intact. Since workspace apps can
proxy arbitrary user-hosted content under the dashboard origin, this
exposes the Coder dashboard to tabnabbing and same-origin DOM access.
We cannot simply pass `"noopener"` to `window.open()` because the WHATWG
spec mandates that `window.open()` returns `null` when `"noopener"` is
present — indistinguishable from a blocked popup — which would break the
popup-blocked error toast.
## Solution
Use a two-step approach in `openAppInNewWindow()`:
1. Open `about:blank` first (without `noopener`) so we can detect popup
blockers via the `null` return
2. If the popup succeeded, sever the opener reference with `popup.opener
= null`
3. Navigate the popup to the target URL via `popup.location.href`
This preserves popup-block detection while eliminating the security
exposure.
## Tests
A vitest case in `AppLink.test.tsx` asserts that `open_in="slim-window"`
anchors do not carry `target` or `rel` attributes (since slim-window
opening is handled programmatically via `onClick` / `window.open()`, not
anchor attributes).
---------
Co-authored-by: Kayla はな <kayla@tree.camp>
Adds skill discovery and tools to chatd so the agent can discover and
load `.agents/skills/` from workspaces, following the same pattern as
AGENTS.md instruction loading and MCP tool discovery.
## What changed
### `chattool/skill.go` — discovery, loading, and tools
- **DiscoverSkills** — walks `.agents/skills/` via `conn.LS()` +
`conn.ReadFile()`, parses SKILL.md frontmatter (name + description),
validates kebab-case names match directory names, silently skips
broken/missing entries.
- **FormatSkillIndex** — renders a compact `<available-skills>` XML
block for system prompt injection (~60 tokens for 3 skills). Progressive
disclosure: only names + descriptions in context, full body loaded on
demand.
- **LoadSkillBody** / **LoadSkillFile** — on-demand loading with path
traversal protection and size caps (64KB for SKILL.md, 512KB for
supporting files).
- **read_skill** / **read_skill_file** tools — `fantasy.AgentTool`
implementations following the same pattern as ReadFile and
WorkspaceMCPTool. Receive pre-discovered `[]SkillMeta` via closure to
avoid re-scanning on every call.
### `chatd.go` — integration into runChat
- Skills discovered in the `g2` errgroup parallel with instructions and
MCP tools.
- `skillsCache` (sync.Map) per chat+agent, same invalidation pattern as
MCP tools cache.
- Skill index injected via `InsertSystem` after workspace instructions.
- Re-injected in `ReloadMessages` callback so it survives compaction.
- `read_skill` + `read_skill_file` tools registered when skills are
present (for both root and subagent chats).
- Cache cleaned up in `cleanupStreamIfIdle` alongside MCP tools cache.
## Format compatibility
Uses the same `.agents/skills/<name>/SKILL.md` format as
[coder/mux](https://github.com/coder/mux) and
[openai/codex](https://github.com/openai/codex).
Removes the `font-semibold` class applied to unread chat titles in the
`/agents` sidebar. The unread indicator dot already provides sufficient
visual distinction for unread chats.
## Summary
Adds read/unread tracking for chats so users can see which agent
conversations have new assistant messages they haven't viewed.
## Backend Changes
- Adds `last_read_message_id` column to the `chats` table (migration
000439).
- Computes `has_unread` as a virtual column in `GetChatsByOwnerID` using
an `EXISTS` subquery checking for assistant messages beyond the read
cursor.
- Exposes `has_unread` on the `codersdk.Chat` struct and auto-generated
TypeScript types.
- Updates `last_read_message_id` on stream connect/disconnect in
`streamChat`, avoiding per-message API calls during active streaming.
- Uses `context.WithoutCancel` for the deferred disconnect write so the
DB update succeeds even after the client disconnects.
## Frontend Changes
- Bold title (`font-semibold`) for unread chats in the sidebar.
- Small blue dot indicator next to the relative timestamp.
- Suppresses unread indicator for the currently active chat via
`isActive` from NavLink.
## Design Decisions
- Only `assistant` messages count as unread — the user's own messages
don't trigger the indicator.
- No foreign key on `last_read_message_id` since messages can be deleted
(via rollback/truncation) and the column is just a high-water mark.
- Zero API calls during streaming: exactly 2 DB writes per stream
session (connect + disconnect).
- Unread state refreshes on chat list load and window focus. The
`watchChats` WebSocket optimistically marks non-active chats as unread
on `status_change` events, but does not carry a server-computed
`has_unread` field. Navigating to a chat optimistically clears its
unread indicator in the cache.
- Trim leading/trailing whitespace from metadata `value` and `error`
fields before storage
- Trimming happens before length validation so whitespace-padded values
are handled correctly
- Add `TrimWhitespace` test covering spaces, tabs, newlines, and
preserved inner whitespace
- No backfill needed (unlogged table, stores only latest value)
> 🤖 Created by a Coder Agent, reviewed by me.
Replace overly-broad `AsSystemRestricted` with purpose-built actors:
- **OAuth2 provider paths** → `AsSystemOAuth2` (13 call sites across
`tokens.go`, `registration.go`, `apikey.go`)
- **Provisioner daemon health read** → `AsSystemReadProvisionerDaemons`
(1 site in `healthcheck/provisioner.go`)
- **Provisionerd file cache paths** → `AsProvisionerd` (2 sites in
`provisionerdserver.go`, matching existing usage nearby)
<details>
<summary>Implementation notes</summary>
Each replacement actor is a strict subset of `AsSystemRestricted`. Every
DB method
at each call site is already covered by the narrower actor's
permissions:
- `subjectSystemOAuth2`: OAuth2App/Secret/CodeToken (all), ApiKey (Read,
Delete), User (Read), Organization (Read)
- `subjectSystemReadProvisionerDaemons`: ProvisionerDaemon (Read)
- `subjectProvisionerd`: File (Create, Read) plus provisionerd-scoped
resources
No new permissions added. `nolint:gocritic` comments updated to reflect
the new actors.
</details>
> 🤖 Created by a Coder Agent, reviewed by me.
`onChatReady` now waits for the store to have all fetched messages (not
just query success), so the DOM has content when the parent frame acts
on the signal. Removed the `scroll-to-bottom` round-trip from
`onChatReady`, the embed page no longer needs the parent to echo a
scroll command back.
Also moved `autoScrollRef` check before the `widthChanged` guard in
`ScrollAnchoredContainer`'s ResizeObserver. The width guard is only
relevant for scroll compensation (user scrolled up); it should never
prevent pinning to bottom when auto-scroll is active. Also tightened the
store sync guard to `storeMessageCount < fetchedMessageCount`.
Addresses chat page rendering performance. Profiling with React Profiler
showed `AgentChat` actual render times of 20–31ms (exceeding the
16ms/60fps
budget), with `StickyUserMessage` as the #1 component bottleneck at
35.7%
of self time.
## Changes
**Hoist `createComponents` to module scope** (`response.tsx`):
Previously every `<Response>` instance called `createComponents()` per
render, creating a fresh components map that forced Streamdown to
discard
its cached render tree. Now both light/dark variants are precomputed
once
at module scope.
**Wrap `StickyUserMessage` in `memo()`** (`ConversationTimeline.tsx`):
Profile-confirmed #1 bottleneck. Each instance carries
IntersectionObserver
+ ResizeObserver + scroll handlers; skipping re-render avoids all that
setup.
**Wrap `ConversationTimeline` in `memo()`**
(`ConversationTimeline.tsx`):
Prevents cascade re-renders from the parent when props haven't changed.
**Remove duplicate `buildSubagentTitles`** (`ConversationTimeline.tsx` →
`AgentDetailContent.tsx`): Was computed in both `AgentDetailTimeline`
and
`ConversationTimeline`. Now computed once and passed as a prop.
<details>
<summary>Profiling data & analysis</summary>
### Profiler Metrics
| Metric | Value |
|--------|-------|
| INP (Interaction to Next Paint) | 82ms |
| Processing duration (event handlers) | 52ms |
| AgentChat actual render | 20–31ms (budget: 16ms) |
| AgentChat base render (no memo) | ~100ms |
### Top Bottleneck Components (self-time %)
| Component | Self Time | % |
|-----------|-----------|---|
| StickyUserMessage | 11.0ms | 35.7% |
| ForwardRef (radix-ui) | 7.4ms | 24.0% |
| Presence (radix-ui) | 2.0ms | 6.5% |
| AgentChatInput | 1.4ms | 4.5% |
### Decision log
- Chose module-scope precomputation over `useMemo` for
`createComponents`
because there are only two possible theme variants and they're static.
- Did not add virtualization — sticky user messages + scroll anchoring
make it complex. The memoization fixes should be measured first.
- Did not wrap `BlockList` in `memo()` — the React Compiler (enabled for
`pages/AgentsPage/`) already auto-memoizes JSX elements inside it.
- Phase 2 (verify React Compiler effectiveness on
`parseMessagesWithMergedTools`)
and Phase 3 (radix-ui Tooltip lazy-mounting) deferred to follow-up PRs.
</details>
Add a per-MCP-server `model_intent` toggle that wraps tool schemas with
a
`model_intent` field, requiring the LLM to provide a human-readable
description of each tool call's purpose. The intent string is shown as a
status label in the UI instead of opaque tool names, and is
transparently
stripped before the call reaches the remote MCP server.
Built-in tools have rich specialized renderers (terminal blocks, file
diffs,
etc.) and don't need this. MCP tools hit `GenericToolRenderer` which
only
shows raw tool names and JSON — that's where model_intent adds value.
The model learns what to provide via the JSON Schema `description` on
the
`model_intent` property itself — no system prompt changes needed.
<details>
<summary>Implementation details</summary>
### Architecture
Inspired by the `withModelIntent()` pattern from `coder/blink`, adapted
for
Go + React. The wrapping is entirely in the `mcpclient` layer — tool
implementations never see `model_intent`.
**Schema wrapping** (`mcpToolWrapper.Info()`): When enabled, wraps the
original tool parameters under a `properties` key and adds a
`model_intent`
string field with a rich description that teaches the model inline.
**Input unwrapping** (`mcpToolWrapper.Run()`): Strips `model_intent` and
unwraps `properties` before forwarding to the remote MCP server. Handles
three input shapes models may produce:
1. `{ model_intent, properties: {...} }` — correct format
2. `{ model_intent, key: val, ... }` — flat, no wrapper
3. Malformed — falls through gracefully
**Frontend extraction**: `streamState.ts` extracts `model_intent` from
incrementally parsed streaming JSON. `messageParsing.ts` extracts it
from
persisted tool call args.
**UI rendering**: `GenericToolRenderer` shows the capitalized intent
string
as the primary label when available, falling back to the raw tool name.
### Changes
- Database: `model_intent` boolean column on `mcp_server_configs`
- SDK: `ModelIntent` field on config/create/update types
- API: pass-through in create/update handlers + converter
- mcpclient: schema wrapping in `Info()`, input unwrapping in `Run()`
- Frontend: extraction from streaming + persisted args
- UI: intent label in `GenericToolRenderer`, toggle in admin panel
- Tests: 6 new tests (schema wrapping, unwrapping, passthrough,
fallback)
### Decision log
- **Option lives on MCPServerConfig, not model config**: Built-in tools
already have rich renderers; only MCP tools benefit from model_intent.
- **No system prompt changes**: The JSON Schema `description` on the
`model_intent` property teaches the model inline.
- **Pointer bool on update request**: Follows existing pattern (`*bool`)
so PATCH requests don't reset the value when omitted.
</details>
Fixes expired MCP OAuth2 tokens causing 401 errors and stale
`auth_connected` status in the UI.
When users authenticate MCP servers (e.g. GitHub) via OAuth2, the access
token and refresh token are stored in the database. However, when the
access token expired, nothing refreshed it anywhere:
- **chatd**: sent the expired token as-is, getting a 401 and skipping
the MCP server
- **list/get endpoints**: reported `auth_connected: true` just because a
token record existed, regardless of expiry
## Changes
### Shared utility: `mcpclient.RefreshOAuth2Token`
Pure function that uses `golang.org/x/oauth2` `TokenSource` to check if
a token is expired (or within 10s of expiry) and refresh it. No DB
dependency — callers handle persistence.
### chatd (`coderd/x/chatd/chatd.go`)
Before calling `mcpclient.ConnectAll`, refreshes expired tokens.
Persists new credentials to the database. Falls back to the old token if
refresh fails.
### List/get MCP server endpoints (`coderd/mcp.go`)
Both `listMCPServerConfigs` and `getMCPServerConfig` now attempt refresh
when checking `auth_connected`. If the token is expired:
- **Has refresh token**: attempt refresh, persist result, report
`auth_connected` based on success
- **No refresh token**: report `auth_connected: false` if expired
This means the UI accurately reflects whether the user's token is
actually usable, rather than just whether a record exists.
<details>
<summary>Design notes</summary>
- `RefreshOAuth2Token` lives in `mcpclient` to avoid circular imports
(`coderd` → `chatd` → `mcpclient` is fine; `chatd` → `coderd` would be
circular).
- DB persistence is handled by each caller with their own authz context
(`AsSystemRestricted` in both cases).
- The `buildAuthHeaders` warning in mcpclient about expired tokens is
kept as defense-in-depth logging.
</details>
<!--
If you have used AI to produce some or all of this PR, please ensure you
have read our [AI Contribution
guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING)
before submitting.
-->
Adds the Tunneler state machine and logic for handling build updates.
This is a partial implementation and tests. Further PRs will fill out
the other event types.
Relates to GRU-18
## Summary
Fixes two issues with pinned chat drag-to-reorder in the agents sidebar:
1. **Missing optimistic reorder**: After dragging a pinned chat to a new
position, the sidebar flashed back to the old order while waiting for
the server response, then snapped to the new order. Now the react-query
cache is updated optimistically in `onMutate` so the reorder is visually
instant.
2. **Post-drag navigation on upward drag**: Dragging a pinned chat
**upward** caused unintended navigation to that chat on drop. The
browser synthesizes a click from the final `pointerup`, and the previous
suppression listener was attached too low in the DOM tree to reliably
intercept it after items reorder.
## Changes
### Optimistic cache reorder (`site/src/api/queries/chats.ts`)
`reorderPinnedChat.onMutate` now extracts all pinned chats from the
cache, reorders them to match the new position, and writes the updated
`pin_order` values back via `updateInfiniteChatsCache`. The server
response still reconciles on `onSettled`.
### Document-level click suppression (`AgentsSidebar.tsx`)
- Moved click suppression from the pinned container div to
`document.addEventListener('click', handler, true)`. This fires before
any element-level handlers regardless of DOM reordering during drag.
Scoped via `pinnedContainerRef.contains(target)` so it only affects
pinned rows.
- Replaced `recentDragRef` (boolean cleared by `setTimeout(100ms)`) with
`lastDragEndedAtRef` (`performance.now()` timestamp checked against a
300ms window). This eliminates the race where the timeout fires before
the synthetic click arrives.
---
PR generated with Coder Agents
Add a keyboard shortcut (ArrowUp on empty input) to start editing the
most recent user message, mirroring Mux's behavior. The shortcut reuses
the existing history-edit flow triggered by the pencil button.
Extract a shared `getEditableUserMessagePayload` helper so the pencil
button and the new shortcut both derive the edit payload identically.
Derive the last editable user message during render in
`AgentDetailInput` from the existing store selectors, keeping the
implementation Effect-free and React Compiler friendly.
## Problem
Chats with a persisted `agent_id` binding hang indefinitely when the
workspace is stopped. The stale agent row still exists in the DB, so
`ensureWorkspaceAgent` succeeds, but the dial blocks forever in
`AwaitReachable`. The MCP discovery goroutine used an unbounded context,
so `g2.Wait()` never returned and the LLM never started.
## Fix
Three targeted changes restore the pre-binding behavior where stopped
workspaces degrade gracefully instead of blocking:
1. **`dialWithLazyValidation`**: "no agents in latest build" is now a
terminal fast-fail — the hanging dial is canceled and
`errChatHasNoWorkspaceAgent` returned immediately, instead of falling
through to `waitForOriginalDial`.
2. **Pre-LLM workspace setup**: MCP discovery and instruction
persistence gate on `workspaceAgentIDForConn` before attempting any
dial. MCP discovery is bounded by a 5s timeout and checks the in-memory
tool cache first (using the cheap cached agent from
`ensureWorkspaceAgent`), so the common subsequent-turn path has zero DB
queries.
3. **`persistInstructionFiles`**: tracks whether the workspace
connection succeeded and skips sentinel persistence on failure, so the
next turn retries if the workspace is restarted.
## Scenarios
**Running workspace, subsequent turn (hot path):** MCP cache hit via
in-memory cached agent. Zero DB queries, zero dials. Unchanged from
#23274.
**Stopped workspace, persisted binding (the bug):** MCP cache hit (stale
descriptors, fine — they fail at invocation). Pre-LLM setup completes
instantly. Tool invocation enters `dialWithLazyValidation`, dial fails
or hangs, validation discovers no agents, returns
`errChatHasNoWorkspaceAgent`. Model sees the error and can call
`start_workspace`.
**New chat, running workspace:** `ensureWorkspaceAgent` resolves via
latest-build, persists binding. MCP discovery dials and caches tools.
**New chat, stopped workspace:** `ensureWorkspaceAgent` finds no agents,
returns `errChatHasNoWorkspaceAgent`. Pre-LLM setup skips. LLM starts
with built-in tools only.
**Rebuilt workspace (agent switched):** MCP cache hit with stale agent
(harmless for one turn). Tool invocation dials stale agent, fails fast,
`dialWithLazyValidation` switches to new agent, persists updated
binding.
**Workspace restarted after stop:** No sentinel was persisted during the
stopped turn, so instruction persistence retries. Agent binding switches
to the new agent via `workspaceAgentIDForConn`.
**Transient DB error during validation:** Not
`errChatHasNoWorkspaceAgent`, so `dialWithLazyValidation` falls through
to `waitForOriginalDial` (cannot prove stale). No false positive.
**Tool invocation on stopped workspace:** `getWorkspaceConn` calls
`ensureWorkspaceAgent` (returns stale row), then
`dialWithLazyValidation` validation discovers no agents, returns
`errChatHasNoWorkspaceAgent`, cached state cleared, error returned to
model.
## Problem
The agent chat delayed-startup banner ("Response startup is taking
longer than expected") could appear even though the model was already
streaming.
The root cause is in `Subscribe()`: `message_part` events were delivered
via the fast local in-process stream, while `status` events were
delivered via PostgreSQL pubsub. Both feed into the same `select`
statement, and Go's `select` picks whichever channel is ready first —
there is no ordering guarantee between channels. So a `message_part`
could outrun the `status=running` that logically precedes it.
The frontend saw content arrive while it still thought the chat was
pending, triggering the banner.
## Fix
Also forward `status` events from the local channel, alongside
`message_part`.
Both event types already travel through the same FIFO subscriber
channel: `publishStatus()` is called before the first `message_part`, so
channel ordering guarantees the frontend sees `status=running` before
any content.
Pubsub still delivers a duplicate `status` event later; the frontend
deduplicates it (`setChatStatus` is idempotent — it early-returns when
the status hasn't changed).
## Summary
Add site-wide banners for AI Governance seat usage thresholds:
1. **90% capacity warning (admin-only):** When actual AI Governance
seats are ≥90% and <100% of the license limit, admins see:
> "You have used 90% of your AI governance add-on seats."
2. **Over-limit banner (admin-only):** When actual seats exceed the
license limit, admins see a prominent warning:
> "Your organization is using {actual} / {limit} AI Governance user
seats ({X}% over the limit). Contact sales@coder.com"
- Uses floor whole percentage (Go int division / `Math.floor`)
- Includes a clickable `mailto:sales@coder.com` link
## Summary
Adds a "Generate new title" action that lets users manually regenerate a
chat's title using richer conversation context than the automatic
first-message title path.
## Changes
### Backend
- **New endpoint:** `POST
/api/experimental/chats/{chatID}/title/regenerate` returns the updated
Chat with a regenerated title
- **Manual title algorithm:** Extracts useful user/assistant text turns
→ selects first user turn + last 3 turns → builds context with gap
markers → renders prompt with anti-recency guidance → calls lightweight
model → normalizes output
- **Helpers:** `extractManualTitleTurns`,
`selectManualTitleTurnIndexes`, `buildManualTitleContext`,
`renderManualTitlePrompt`, `generateManualTitle` — all private, with the
public `Server.RegenerateChatTitle` method
- **SDK:** `ExperimentalClient.RegenerateChatTitle(ctx, chatID) (Chat,
error)`
- Persists title via existing `UpdateChatByID` and broadcasts
`ChatEventKindTitleChange`
### Frontend
- API client method + React Query mutation with cache invalidation
- "Generate new title" menu item (with wand icon) in both TopBar and
Sidebar dropdown menus
- Loading/disabled state while regeneration is in-flight
- Error toast on failure
- Stories updated for both menus
### Tests
- `quickgen_test.go`: Table-driven tests for all 4 helper functions
(turn extraction, index selection, context building, prompt rendering)
- `exp_chats_test.go`: Handler tests (ChatNotFound,
NotFoundForDifferentUser, NoDaemon)
## Design notes
- The existing auto-title path (`maybeGenerateChatTitle`, `titleInput`)
is completely unchanged
- Manual regeneration uses richer context (first user turn + last 3
turns + gap markers) vs the auto path's single first message
- Endpoint is experimental and marked with `@x-apidocgen {"skip": true}`
Converts the Agents page transcript from reverse-scroll
(\`flex-col-reverse\`) to normal top-to-bottom document flow with
explicit auto-scroll pinning.
The previous \`flex-col-reverse\` approach (from #23451) required
browser-specific workarounds for Chrome vs Firefox \`scrollTop\` sign
conventions, and compensation logic that could fight user scroll intent
during streaming. This replaces it with standard scroll math
(\`scrollHeight - scrollTop - clientHeight\`) while keeping the proven
\`ScrollAnchoredContainer\` observer machinery.
Changes:
- \`AgentDetailView.tsx\`: layout flip to \`flex-col\`, standard bottom
detection, explicit prepend restoration via \`pendingPrependRef\`
snapshot, sentinel moved inside content wrapper, initial mount bottom
pin, \`scrollToBottomRef\` imperative API for send/edit, user-interrupt
guard with \`isNearBottom\` check
- \`AgentDetailView.stories.tsx\`: all scroll stories updated for
normal-flow semantics, new \`ScrollAnchorPreservedOnOlderHistoryLoad\`
story with deterministic \`IntersectionObserver\` mock
- \`AgentsSkeletons.tsx\` + \`AgentDetailLoadingView\`: removed
\`flex-col-reverse\` so loading state matches the real transcript layout
- \`AgentDetail.tsx\`: replaced \`scrollTop = 0\` on send/edit with
imperative \`scrollToBottomRef\` call
Supersedes #23576 (which was reverted in #23638). Same behavioral goals
but keeps scroll logic local to \`ScrollAnchoredContainer\` rather than
extracting a separate hook.
Adds preview cards for the `spawn_computer_use_agent` tool. Spawning
that agent now renders a rich saying "Spawning computer use sub-agent".
The "waiting for" tool now displays an inline desktop preview which can
be clicked to reveal the desktop in the sidebar.
https://github.com/user-attachments/assets/e486ca0e-a569-4142-bb12-db3b707967b8
https://github.com/user-attachments/assets/bd5d12a1-61b3-4b7d-83b6-317bdfb60b3c
## Summary
Adds pinned chats to the agents page sidebar with server-side
persistence and drag-to-reorder. Users can pin/unpin chats via the
context menu, and pinned chats appear in a dedicated "Pinned" section
above the time-grouped list.
## Database
Migration `000453_chat_pin_order`: adds `pin_order integer DEFAULT 0 NOT
NULL` column on `chats` (0 = unpinned, 1+ = pinned in display order).
Three SQL queries handle pin operations server-side using CTEs with
`ROW_NUMBER()`:
- `PinChatByID`: normalizes existing orders and appends to end
- `UnpinChatByID`: sets target to 0 and compacts remaining pins
- `UpdateChatPinOrder`: shifts neighbors, clamps to `[1, pinned_count]`
All queries exclude archived chats. `ArchiveChatByID` clears `pin_order`
on archive. The handler rejects pinning archived chats with 400.
## Backend
Pin/unpin/reorder go through the existing `PATCH
/api/experimental/chats/{chat}` via the `pin_order` field on
`UpdateChatRequest`. The handler routes based on current pin state:
`pin_order == 0` unpins, `> 0` on an already-pinned chat reorders, `> 0`
on an unpinned chat appends to end.
## Frontend
- `pinChat` / `unpinChat` / `reorderPinnedChat` optimistic mutations
using shared `isChatListQuery` predicate
- Sidebar renders Pinned section above time groups, excludes pinned
chats from time groups
- Pin/Unpin context menu items (hidden for child/delegated chats)
- `@dnd-kit/core` + `@dnd-kit/sortable` for drag-to-reorder with
`MouseSensor`, `TouchSensor`, and `KeyboardSensor`
- Local pin-order override prevents flash on drop; click blocker
prevents NavLink navigation after drag
---
*PR generated with Coder Agents*
The hook lived at src/hooks/, outside the React Compiler scope.
It returned a new object literal every render, causing three
handler guards and two downstream JSX guards in AgentChatInput
(233 cache slots) to always miss.
Move the hook to src/pages/AgentsPage/hooks/ where the compiler
processes it. The compiler auto-memoizes the return object, so
manual useMemo is unnecessary.
Also replace ctorRef.current render-time access with a useState
lazy initializer. The ref access caused a CompileError that
would have prevented compilation. Browser API availability is
constant, so useState captures it once.
Coder's chat (chatd) can now discover and use MCP servers configured in
a workspace's `.mcp.json` file. This brings project-specific tooling
(GitHub, databases, docs servers, etc.) into the chat without any manual
configuration.
## How it works
The workspace agent reads `.mcp.json` from the workspace directory (same
format Claude Code uses), connects to the declared MCP servers —
spawning child processes for stdio servers and connecting over the
network for HTTP/SSE — and caches their tool lists. Two new agent HTTP
endpoints expose this:
- `GET /api/v0/mcp/tools` returns the cached tool list (supports
`?refresh=true`)
- `POST /api/v0/mcp/call-tool` proxies calls to the correct server
On each chat turn, chatd calls `ListMCPTools` through the existing
`AgentConn` tailnet connection, wraps each tool as a
`fantasy.AgentTool`, and adds them to the LLM's tool set alongside
built-in and admin-configured MCP tools. Tool names are prefixed with
the server name (`github__create_issue`) to avoid collisions.
Failed server connections are logged and skipped — they never block the
agent or break the chat. Child stdio processes are terminated on agent
shutdown.
The React Compiler failed to memoize the messages derivation
chain because a useDashboard() hook call sat between the
messages computation and its consumer (getLatestContextUsage).
An IIFE around the context usage logic also fragmented the
dependency chain.
Replacing the IIFE with a ternary and reordering the non-hook
computation before the hook call lets the compiler group
messages + getLatestContextUsage into a single cache guard
keyed on messagesByID and orderedMessageIDs.
Two test fixtures (devcontainer-resources, multiple-agents-multiple-envs)
were generated before terraform-provider-coder v2.15.0 added the
merge_strategy attribute to coder_env. Running generate.sh with the
current provider adds merge_strategy: "replace" (the default) to all
coder_env resources, causing unstable diffs on every regeneration.
The React Compiler guarded buildStreamTools on the whole
streamState ref, which changes on every text chunk. Refactoring
the function to accept toolCalls and toolResults directly lets
the compiler guard on those sub-fields, which are stable during
text-only streaming.
Before: $[0] !== streamState (misses every text chunk)
After: $[0] !== toolCalls || $[1] !== toolResults (passes when
only blocks change)
Verified: 181 functions compile, 0 diagnostics. Reference
stability tests confirm toolCalls/toolResults retain identity
across text-part updates and change when tool data updates.
*Problem:* `publishChatPubsubEvent` was constructing a partial
`codersdk.Chat` that omitted `LastModelConfigID` and other fields. Go's
zero-value UUID caused the sidebar to show "Default model" for chats
received via SSE.
*Solution:*
- Extracted `convertChat`/`convertChats` from `exp_chats.go` into
`db2sdk.Chat`/`db2sdk.Chats`, alongside existing `ChatMessage`,
`ChatQueuedMessage`, and `ChatDiffStatus` converters.
`publishChatPubsubEvent` now calls `db2sdk.Chat(chat, nil)` instead of
maintaining its own copy of the conversion logic
- Added backend integration test
`TestWatchChats/CreatedEventIncludesAllChatFields`
- Added frontend regression tests for nil-UUID and valid model config ID
cases
> 🤖 Created by Coder Agents, reviewed by this human.
During streaming, StreamingOutput's compiler cache guard misses
every chunk because streamState.blocks and streamTools are new
references. This causes renderBlockList to recreate all child JSX
elements, and React calls every child function even for blocks
that finished streaming.
Wrapping SmoothedResponse and ReasoningDisclosure in React.memo
lets React skip the function call entirely when props are stable.
For N completed response blocks and M completed thinking blocks,
this reduces per-chunk function calls from N+M+1 to 1. The
compiler still compiles both inner functions cleanly (6 and 12
cache slots respectively, zero diagnostics).
If the desktop viewer component was hidden, for example after collapsing
the sidebar, the next time it was shown the viewer would be blank. This
PR fixes that.
Adds an `enabled` toggle to the chat model admin create/edit form so
admins
can disable a model without soft-deleting it. Disabled models stay
visible
in admin settings but stop appearing in user-facing model selectors.
The backend already supported this (`chat_model_configs.enabled` column,
filtered queries, and SDK fields). This change wires it into the admin
UI
and adds coverage on both sides.
**Backend:** three new subtests in `coderd/exp_chats_test.go` verifying
the visibility contract (admin sees disabled models, non-admin doesn't,
update-to-disabled preserves the record).
**Frontend:** `enabled` field added to form logic and seeded from the
existing model (defaults to `true` for new models). A Switch+Tooltip
control renders in the form header, matching the MCP Server panel
pattern.
Two interaction stories cover the create-disabled and toggle-existing
flows.
Add theme synchronization, navigation blocking, scroll-to-bottom
handling, and chat-ready signaling to the agent embed page. The parent
frame can now set light/dark theme via postMessage or query param, and
ThemeProvider skips its own class manipulation when the embed marker is
present. Navigation attempts that leave the embed route are intercepted
and forwarded to the parent frame. The scroll container ref is lifted
to the layout so the parent can request scroll-to-bottom.
## Problem
When the user sends a message while the agent is actively streaming a
response, `handleSend` called `store.clearStreamState()`
**unconditionally before** the POST request. If the server queues the
message (`response.queued = true` because the agent is busy), the
in-progress stream output is immediately wiped from the UI. The full
text only reappears once the agent finishes and the durable message
arrives via WebSocket — causing a visible cutoff mid-stream.
## Fix
Move `clearStreamState()` from before the POST to **after** the
response, gated behind `!response.queued`:
- **Queued sends** (`response.queued === true`): `clearStreamState()` is
never called. The stream continues uninterrupted. The WebSocket `status`
handler already clears stream state when the chat transitions to
`"pending"` / `"waiting"` after the queued message is dequeued.
- **Non-queued sends** (`response.queued === false`):
`clearStreamState()` + `upsertDurableMessage()` fire immediately after
the POST, same net behavior as before.
- **Edit and promote paths**: Unchanged — those are intentional
interruptions where eager clearing is correct.
### Additional behavior changes (both improvements)
1. **Failed sends no longer wipe stream state.** Previously
`clearStreamState()` ran before the `try` block, so a network error
still wiped the agent's in-progress output. Now the `catch` re-throws
before reaching `clearStreamState()`, preserving the stream on failure.
2. **`clearStreamState()` fires for all non-queued responses**, not just
those with a `message` body. The original guard was `!response.queued &&
response.message`; now `clearStreamState()` is under `!response.queued`
while `upsertDurableMessage` retains the `response.message` check. The
server always sets `message` for non-queued responses, so this is a
no-op in practice but is semantically correct.
## Testing
**AgentDetail.stories.tsx**: New `StreamingSurvivesQueuedSend` story
exercises the full flow — mocks `createChatMessage` to return `{ queued:
true }`, delivers streaming text via WebSocket, sends a message through
the UI, and asserts the streaming text remains visible.
Array.from(graphemeSegmenter.segment(text)) materializes the
entire text into an array before iterating, even though the loop
breaks early at the visible prefix length. During streaming at
60fps, this makes each frame O(full text) instead of O(prefix).
Benchmark on 5000-char text with 200-char prefix: 22.6x faster
(1.44ms to 0.06ms per call, saving 8.3% of the frame budget).
The fallback codepoint path had the same issue with Array.from.
_Generated by mux but reviewed by a human_
Several stories computed dates relative to `dayjs()` / `new Date()` at
render time, causing snapshot text to shift daily. I ran into this on my
PRs.
This adds an optional `now` prop to `DateRangePicker`,
`TemplateInsightsControls`, and `CreateTokenForm` so stories can inject
a deterministic clock without global mocking. License stories replace
the misleadingly-named `FIXED_NOW = dayjs().startOf("day")` with
absolute timestamps. All fixed timestamps use noon UTC to avoid timezone
boundary issues.
Affected stories:
- `AgentSettingsPageView`: Usage Date Filter, Usage Date Filter Refetch
Overlay
- `LicenseCard`: Expired/future AI Governance variants, Not Yet Valid
- `LicensesSettingsPage`: Shows Addon Ui For Future License Before Nbf
- `TemplateInsightsControls`: Day
- `CreateTokenPage`: Default
> **PR Stack**
> 1. **#23351** ← `#23282` *(you are here)*
> 2. #23282 ← `#23275`
> 3. #23275 ← `#23349`
> 4. #23349 ← `main`
---
## Summary
`chatretry.Retry()` used pure exponential backoff (1 s, 2 s, 4 s, …) and
never consulted provider `Retry-After` headers. Fantasy's
`ProviderError` carries `ResponseHeaders` including `Retry-After`, but
`chaterror.Classify()` only parsed error text and silently dropped the
structured transport metadata.
This makes `Retry-After` a first-class signal in the classification →
retry pipeline.
<img width="853" height="346" alt="image"
src="https://github.com/user-attachments/assets/65f012b6-8173-43d2-957e-ab9faddea525"
/>
## Changes
### `coderd/chatd/chaterror/classify.go`
- Added `RetryAfter time.Duration` field to `ClassifiedError` — a
normalized minimum retry delay derived from provider response metadata.
- `Classify()` now calls `extractProviderErrorDetails()` before falling
back to text heuristics. Structured `ProviderError.StatusCode` takes
priority over regex extraction.
- `normalizeClassification()` preserves and clamps `RetryAfter`.
### `coderd/chatd/chaterror/provider_error.go` (new)
Provider-specific extraction, isolated from the text-based
classification logic:
- `extractProviderErrorDetails()` unwraps `*fantasy.ProviderError` from
the error chain via `errors.As`.
- `retryAfterFromHeaders()` parses headers in priority order:
1. `retry-after-ms` (OpenAI-specific, millisecond precision)
2. `retry-after` (standard HTTP — integer seconds or HTTP-date)
- Case-insensitive header key lookup.
### `coderd/chatd/chatretry/chatretry.go`
- `effectiveDelay(attempt, classified)` computes `max(Delay(attempt),
classified.RetryAfter)` — the provider hint acts as a floor without
weakening the local exponential backoff.
- `Retry()` now uses `effectiveDelay` and passes the effective delay to
both `onRetry(...)` and the sleep timer, so downstream payloads, logs,
and the frontend countdown stay aligned automatically.
### Tests
- `classify_test.go`: Structured provider status + `Retry-After`
extraction, `retry-after-ms` priority, HTTP-date parsing, invalid header
fallback, `WithProvider` preservation.
- `chatretry_test.go`: Retry-after-as-floor semantics — longer hint
wins, shorter hint keeps base delay.
## Design notes
- **No SDK/API/frontend changes needed.** `codersdk.ChatStreamRetry`
already carries `DelayMs` and `RetryingAt`, and the frontend already
consumes them. The fix is purely in the server-side delay computation.
- **Existing retryability rules unchanged.** This fixes *when* we sleep,
not *whether* an error is retryable.
- **Provider hint is a floor:** `max(baseDelay, RetryAfter)` ensures we
never retry earlier than the provider asks, and never weaken our own
backoff curve.
## Changes
- **Commit 1**: Remove 17 unnecessary `//nolint` directives:
- `//nolint:varnamelen` — linter not active
- `//nolint:unused` on exported `SlimUnsupported`
- `//nolint:govet` in `coderd/httpmw/csrf` — no longer fires
- `//nolint:revive` on functions refactored since the nolint was added
- `//nolint:paralleltest` citing Go 1.22 loop variable capture
(obsolete)
- Bare `//nolint` narrowed to specific `//nolint:gocritic` with
justification
- **Commit 2**: Fix root causes behind 5 dangerous nolint suppressions:
- Add `MinVersion: tls.VersionTLS12` to TLS client config (removes
`gosec` G402)
- Delete trivial unexported wrappers `apiKey()`/`normalizeProvider()` in
chatprovider (removes `revive` confusing-naming)
- Add doc comments to `StartWithAssert` and `Router` (removes `revive`
exported)
- Rename unused parameters to `_` in integration test helpers
> 🤖 This PR was created using Coder Agents and reviewed by me.
- Suppress informational `log.Printf` messages from the metrics scanner
when stdout is not a TTY (i.e. piped via `atomic_write` in `make gen` or
CI)
- Genuine warnings (`warnf`) still print unconditionally so real
problems remain visible
- `log.Fatalf` for fatal errors is unchanged
> 🤖 Created by Coder Agents and reviewed by a human
Admins can now control whether the built-in Coder Agents default system
prompt is prepended to their custom instructions, rather than having the
custom prompt silently replace the default.
**Changes:**
- New `include_default_system_prompt` boolean toggle (defaults to `true`
for existing deployments) stored as a site config key — no migration
needed.
- GET `/api/experimental/chats/config/system-prompt` returns the toggle
state, the custom prompt, and a preview of the built-in default.
- PUT persists both the toggle and custom prompt atomically in a single
transaction.
- `resolvedChatSystemPrompt()` composes `[default?, custom?]` joined by
`\n\n`, falling back to the built-in default on DB errors.
- Settings UI adds a Switch toggle with conditional helper text and a
"Preview" button that shows the built-in default prompt via the existing
`TextPreviewDialog`.
- Comprehensive test coverage: 15 subtests covering toggle behavior,
prompt composition matrix, auth boundaries, and integration with chat
creation.
- Adds `GET /api/experimental/chats/by-workspace` endpoint that returns
workspace_id → latest chat_id mapping
- Modifies FE to fetch this alongside the workspace list, gated on
`agents` experiment and render an "Agent" badge similar to the existing
"Task" badge in `WorkspacesTable`
- Badge links to the "latest chat" linked to the given workspace.
Notes:
- Intentionally uses `fetchWithPostFilter` for RBAC to decouple from
workspaces API — will migrate to `workspaces_expanded` view later.
- If users have multiple chats linked to the same workspace, the badge
will link to the most recently updated one.
> 🤖 This PR was created with the help of Coder Agents, and has been
reviewed by my human. 🧑💻
## Summary
Adds an entitlement-gated **AI add-on** column to both the **Users**
table and the **Organization Members** table. When
`ai_governance_user_limit` is entitled, each row shows whether the user
is consuming an AI seat.
## Background
The AI governance add-on tracks which users are consuming AI seats.
Admins need visibility into per-user seat consumption directly from the
user management tables. This change surfaces that information through
both the site-wide Users table and the per-organization Members table,
gated behind the `ai_governance_user_limit` entitlement so the column
only appears when the feature is licensed.
## Implementation
### Backend
- **New SQL query** `GetUserAISeatStates`
(`coderd/database/queries/aiseatstate.sql`) — returns user IDs consuming
an AI seat, derived from:
- Users with entries in `aibridge_interceptions` (AI Bridge usage)
- Users who own workspaces with `has_ai_task = true` builds (AI Tasks
usage)
- **SDK types** — added `has_ai_seat: boolean` to `codersdk.User` and
`codersdk.OrganizationMemberWithUserData`
- **Handler wiring** — both the Users list endpoint (`coderd/users.go`)
and all Members endpoints (`coderd/members.go`) query AI seat state per
page of user IDs and populate the response field
- **dbauthz** — per-user `ActionRead` checks on `ResourceUserObject`
### Frontend
- **Shared `AISeatCell` component**
(`site/src/modules/users/AISeatCell.tsx`) — green `CircleCheck` for
consuming, gray `X` for non-consuming
- **`TableColumnHelpTooltip`** — extended with `ai_addon` variant with
tooltip: *"Users with access to AI features like AI Bridge, Boundary, or
Tasks who are actively consuming a seat."*
- **Column visibility** gated behind
`useFeatureVisibility().ai_governance_user_limit`
## Validation
- Backend: dbauthz full method suite (`TestMethodTestSuite`) passes
including new `GetUserAISeatStates` test
- Backend: `TestGetUsers`, `TestUsersFilter`, CLI golden file tests pass
- Frontend: 7/7 tests pass across `UsersPage.test.tsx` and
`OrganizationMembersPage.test.tsx` (column visibility gating both
directions)
- `go build ./coderd/...` compiles clean
- `pnpm --dir site run lint:types` passes
- `make gen` clean
## Risks
- **Pagination performance**: The AI seat query is scoped to the current
page's user IDs (not a full table scan), keeping it efficient for
paginated views.
- **Semantic scope**: The workspace-side AI seat derivation uses "any
build with `has_ai_task = true`" rather than "latest build only". If the
product intent is latest-build-only, this can be tightened in a
follow-up.
---
_Generated with `mux` • Model: `anthropic:claude-opus-4-6` • Thinking:
`xhigh` • Cost: `$27.25`_
<!-- mux-attribution: model=anthropic:claude-opus-4-6 thinking=xhigh
costs=27.25 -->
*Disclaimer: implemented by a Coder Agent using Claude Opus 4.6,
reviewed by me.*
Replace the transitional soft warning message:
> AI Bridge is now Generally Available in v2.30. In a future Coder
version, your deployment will require the AI Governance Add-On to
continue using this feature. Please reach out to your account team or
sales@coder.com to learn more.
with the definitive requirement message:
> The AI Governance Add-On is required to use AI Bridge. Please reach
out to your account team or sales@coder.com to learn more.
Updated in:
- `enterprise/coderd/license/license.go`
- `enterprise/coderd/license/license_test.go` (2 occurrences)
## Summary
Adds a process-wide cache for three hot database queries in `chatd` that
were hitting Postgres on **every chat turn** despite returning
rarely-changing configuration data:
| Query | Before (50k turns) | After | Reduction |
|---|---|---|---|
| `GetEnabledChatProviders` | ~98.6k calls | ~500-1000 | ~99% |
| `GetChatModelConfigByID` | ~49.2k calls | ~500-1000 | ~98% |
| `GetUserChatCustomPrompt` | ~46.7k calls | ~1000-2000 | ~97% |
These were identified via `coder exp scaletest chat` (5000 concurrent
chats × 10 turns) as the dominant source of Postgres load during chat
processing.
## Design
Follows the established **webpush subscription cache pattern**
(`coderd/webpush/webpush.go`):
- `sync.RWMutex` + `tailscale.com/util/singleflight` (generic) +
generation-based stale prevention + TTL
- 10s TTL for provider/model config, 5s TTL for user prompts
- Negative caching for `sql.ErrNoRows` on user prompts (the common case
— most users don't set custom prompts)
- Deep-clones `ChatModelConfig.Options` (`json.RawMessage` = `[]byte`)
on both store and read paths
### Invalidation
Single pubsub channel (`chat:config_change`) with kind discriminator for
cross-replica cache invalidation. Seven publish points in
`coderd/chats.go` cover all admin mutation endpoints
(create/update/delete for providers and model configs, put for user
prompts).
_This PR was generated with mux and was reviewed by a human_
*Disclaimer: implemented by a Coder Agent using Claude Opus 4.6*
Adds an info banner on the `/aibridge/request-logs` page encouraging
users to visit `/aibridge/sessions` for an improved audit experience.
This allows us to validate whether customers still find the raw request
logs view useful before removing it in a future release.
Fixes#23563
When a provider request fails and retries, the "Retrying request" banner
lingered in the UI after the retry succeeded. This happened because
`retryState` was only cleared on explicit `status` events (`running`,
`pending`, `waiting`), not when the stream resumed with `message_part`
or `message` events. Since the backend does not publish a
dedicated"retry resolved" event, the banner stayed visible for the
entire duration of the successful response.
Add `store.clearRetryState()` calls to the `message_part`, `message`,
and `status` event handlers so the banner disappears as soon as content
flows again.
Closes https://github.com/coder/coder/issues/23624
Follow-up to #23282. The retry and terminal error callouts had a few UX
oddities:
- Auto-retrying states reused backend error text that said "Please try
again" even while the UI was already retrying on behalf of the user.
- Terminal error states also said "Please try again" with no action the
user could take.
- `startup_timeout` had no specific title or retry copy — it fell
through to the generic "Retrying request" heading.
- The kind pill showed raw enum values like `startup_timeout` and
`rate_limit`.
- Terminal error metadata showed a "Retryable" / "Not retryable" label
that does not help users.
- A separate "Provider anthropic" metadata row duplicated information
already present in the message body.
- The `usage-limit` error kind used a hyphen while every backend kind
uses underscores.
Changes:
**Backend (`chaterror/message.go`)**
- Split message generation into `terminalMessage()` and
`retryMessage()`, replacing the old `userFacingMessage()`.
- Terminal messages include HTTP status codes and actionable guidance
(e.g. "Check the API key, permissions, and billing settings.").
- Retry messages are clean factual statements without status codes or
remediation, suitable for the retry countdown UI (e.g. "Anthropic is
temporarily overloaded.").
- Removed "Please try again" / "Please try again later" from all paths.
- `StreamRetryPayload` calls `retryMessage()` instead of forwarding
`classified.Message`.
**Frontend**
- Removed the parallel frontend message-generation system:
`getRetryMessage()`, `getProviderDisplayName()`,
`getRetryProviderSubject()`, and the `PROVIDER_DISPLAY_NAMES` map are
all deleted from `chatStatusHelpers.ts`.
- `liveStatusModel.ts` passes `retryState.error` through directly — the
backend owns the copy.
- Added specific title and retry copy for `startup_timeout`, and
extended the title mapping to cover `auth` and `config`.
- Kind pills now show humanized labels ("Startup timeout", "Rate limit",
etc.) instead of raw enum strings.
- Removed the redundant "Provider anthropic" metadata row.
- Removed the terminal "Retryable" / "Not retryable" badge.
- Normalized `"usage-limit"` → `"usage_limit"` and added it to
`ChatProviderFailureKind` so all error kinds follow the same underscore
convention and live in one enum.
Refs #23282.
## Problem
The dogfood startup script uses `gh auth status` to decide whether to
re-authenticate the GitHub CLI. That command exits non-zero when **any**
stored credential is invalid—even if Coder external auth already injects
a working `GITHUB_TOKEN` into the environment and `gh` commands work
fine.
On workspaces with a persistent home volume, `~/.config/gh/hosts.yml`
retains OAuth tokens written by previous `gh auth login --with-token`
calls. These tokens are issued by Coder's external auth integration and
can be rotated or revoked between workspace starts, but the copy in
`hosts.yml` persists on the volume. When the stored token goes stale,
`gh auth status` reports two accounts:
```
✓ Logged in to github.com account user (GITHUB_TOKEN) ← works fine
✗ Failed to log in to github.com account user (hosts.yml) ← stale token
```
It exits 1 because of the stale entry, even though `gh` API calls
succeed via `GITHUB_TOKEN`. This makes the auth state **indeterminate**
from `gh auth status` alone—you can't tell whether `gh` actually works
or not.
When the script enters the login branch:
1. `gh auth login --with-token` **refuses** to accept piped input when
`GITHUB_TOKEN` is already set in the environment, and exits 1.
2. `set -e` kills the script before it reaches `sudo service docker
start`.
The result: Docker never starts, devcontainer health checks fail, and
the workspace reports a startup error—all because of a stale GitHub CLI
credential that has no bearing on workspace functionality.
## Fix
- Switch the auth guard from `gh auth status` to `gh api user --jq
.login`, which tests whether GitHub API access actually works regardless
of which credential provides it.
- Wrap the fallback `gh auth login` so a failure logs the indeterminate
state but does not abort the script.
## Summary
This change removes the steady-state "resolve the latest workspace
agent" query from chat execution.
Instead of asking the database for the latest build's agent on every
turn, a chat now persists the workspace/build/agent binding it actually
uses and reuses that binding across subsequent turns. The common path
becomes "load the bound agent by ID and dial it", with fallback paths to
repair the binding when it is missing, stale, or intentionally changed.
## What changes
- add `workspace_id`, `build_id`, and `agent_id` binding fields to
`chats`
- expose those fields through the chat API / SDK so the execution
context is explicit
- load the persisted binding first in chatd, instead of always resolving
the latest build's agent
- persist a refreshed binding when chatd has to re-resolve the workspace
agent
- keep child / subagent chats on the same bound workspace context by
inheriting the parent binding
- leave `build_id` / `agent_id` unset for flows like `create_workspace`,
then bind them lazily on the next agent-backed turn
## Runtime behavior
The binding is treated as an optimistic cache of the agent a chat should
use:
- if the bound agent still exists and dials successfully, we use it
without a latest-build lookup
- if the bound agent is missing or no longer reachable, chatd
re-resolves against the latest build and persists the new binding
- if a workspace mutation changes the chat's target workspace, the
binding is updated as part of that mutation
To avoid reintroducing a hot-path query, dialing uses lazy validation:
- start dialing the cached agent immediately
- only validate against the latest build if the dial is still pending
after a short delay
- if validation finds a different agent, cancel the stale dial, switch
to the current agent, and persist the repaired binding
## Result
The hot path stops issuing
`GetWorkspaceAgentsInLatestBuildByWorkspaceID` for every user message,
which is the source of the DB pressure this PR is addressing. At the
same time, chats still converge to the correct workspace agent when the
binding becomes stale due to rebuilds or explicit workspace changes.
Previously, long bash commands in the execute tool were truncated with
an ellipsis and could not be viewed in full. The only way to see the
full command was to copy it via the copy button.
Adds overflow detection and an inline expand/collapse chevron next to
the copy button. Clicking the command text or the chevron toggles
between truncated and wrapped views. Short commands that fit on one line
are visually unchanged.
https://github.com/user-attachments/assets/88ec6cd4-5212-4608-9a90-9ce217d5dce7
EDIT: couldn't be bothered re-recording the video but the chevron is
hidden until hovered now, like the copy button.
_PR generated by Mux but reviewed by a human_
## Problem
The e2e test `template update with new name redirects on successful
submit` is flaky.
After saving template settings, the app navigates to
`/templates/<name>`, which immediately redirects to
`/templates/<name>/docs` via the router's index route (`<Navigate
to="docs" replace />`). The assertion used `expect.poll()` with
`toHavePathNameEndingWith(`/${name}`)`, which matches only the
**transient intermediate URL** — it only exists while `TemplateLayout`'s
async data fetch is pending. Once the fetch resolves and the `<Outlet
/>` renders, the index route fires the `/docs` redirect and the URL no
longer matches.
## Why it's flaky (not deterministic)
The flakiness depends on whether the template query cache is warm:
- **Cache miss → PASSES**: The mutation's `onSuccess` handler
invalidates the query cache. If `TemplateLayout` needs to re-fetch, it
shows a `<Loader />`, which delays rendering the `<Outlet />` that
contains the `<Navigate to="docs">`. This gives `expect.poll()` time to
see the transient `/new-name` URL → **pass**.
- **Cache hit → FAILS**: If the template data is still in the query
client, `TemplateLayout` renders immediately and the `<Navigate
to="docs" replace />` fires nearly instantly. By the time the first poll
runs, the URL is already `/new-name/docs` → **fail**.
## Fix
Assert the **final stable URL** (`/${name}/docs`) instead of the
transient one.
This is safe because `expect.poll()` is retry-based: it keeps sampling
until a match is found (or timeout). Seeing the transient `/new-name`
URL just causes harmless retries — once the redirect completes and the
URL settles on `/new-name/docs`, the poll matches and the test passes.
| Poll | URL | Ends with `/new-name/docs`? | Action |
|---|---|---|---|
| 1st | `/templates/new-name` | No | Retry |
| 2nd | `/templates/new-name` | No | Retry |
| 3rd | `/templates/new-name/docs` | Yes | **Pass** ✅ |
Closes https://github.com/coder/internal/issues/1403
Problem: previously, the deployment-wide chat template allowlist was never actually wired in from `chatd.go`
- Extracts `parseChatTemplateAllowlist` into shared `coderd/util/xjson.ParseUUIDList`
- Adds `Server.chatTemplateAllowlist()` method that reads the allowlist from DB
- Passes `AllowedTemplateIDs` callback to `ListTemplates`, `ReadTemplate`, and `CreateWorkspace` tool constructors
> 🤖 Created by Coder Agents and reviewed by a human.
The `/agents` model picker collapsed distinct configured model variants
into fewer entries because options were built from the deduplicated
catalog (`ChatModelsResponse`). Two configs with the same provider/model
but different display names or settings appeared as a single option.
Switch option building from `getModelOptionsFromCatalog()` to a new
`getModelOptionsFromConfigs()` that emits one `ModelSelectorOption` per
enabled `ChatModelConfig` row. The option ID is the config UUID
directly, eliminating the catalog-ID ↔ config-ID mapping layer
(`buildModelConfigIDByModelID`, `buildModelIDByConfigID`).
Provider availability is still gated by the catalog response, and status
messaging ("no models configured" vs "models unavailable") is unchanged.
The sidebar now resolves model labels by config ID first, and the
/agents Storybook fixtures were updated so the stories seed matching
config IDs and model-config query data after the picker contract change.
The `/agents` transcript used `flex-col-reverse` for bottom-anchored
chat layout, where `scrollTop = 0` means bottom and the sign of
`scrollTop` when scrolled up varies by browser engine. A
`ResizeObserver` detected content height changes and applied manual
`compensateScroll(delta)` to preserve position, which fought manual
upward scrolling during streaming — repeatedly adjusting the user's
scroll position when they were trying to read earlier content.
This replaces that model with normal DOM order (`flex-col`, standard
`overflow-y: auto`) and a dedicated `useAgentTranscriptAutoScroll` hook
that only auto-scrolls when follow-mode is enabled. When the user
scrolls up, follow-mode disables and incoming content does not move the
viewport.
Changes:
- **New**: `useAgentTranscriptAutoScroll.ts` — local hook with
follow-mode state, RAF-throttled button visibility, dual
`ResizeObserver` (content + container), and `jumpToBottom()`
- **Modified**: `AgentDetailView.tsx` — removed
`ScrollAnchoredContainer` (~350 lines of reverse-layout compensation),
replaced with normal-order container wired to the new hook, added
pagination prepend scroll restoration
- **Modified**: `AgentDetailView.stories.tsx` — updated scroll stories
for normal-order bottom-distance assertions
## Problem
MCP OAuth2 auto-discovery stripped the path component from the MCP
server URL
before looking up Protected Resource Metadata. Per RFC 9728 §3.1, the
well-known
URL should be path-aware:
```
{origin}/.well-known/oauth-protected-resource{path}
```
For `https://api.githubcopilot.com/mcp/`, the correct metadata URL is
`https://api.githubcopilot.com/.well-known/oauth-protected-resource/mcp/`,
not
`https://api.githubcopilot.com/.well-known/oauth-protected-resource`
(which
returns 404).
The same issue applied to RFC 8414 Authorization Server Metadata for
issuers
with path components (e.g. `https://github.com/login/oauth` →
`/.well-known/oauth-authorization-server/login/oauth`).
## Fix
Replace the `mcp-go` `OAuthHandler`-based discovery with a
self-contained
implementation that correctly follows path-aware well-known URI
construction for
both RFC 9728 and RFC 8414, falling back to root-level URLs when the
path-aware
form returns an error. Also implements RFC 7591 registration directly,
removing
the `mcp-go/client/transport` dependency from the discovery path.
Note: this fix resolves the discovery half of the problem for servers
like
GitHub Copilot. Full OAuth2 support for GitHub's MCP server also
requires
dynamic client registration (RFC 7591), which GitHub's authorization
server
does not currently support — that will be addressed separately.
## Summary
Tool call failures in `/agents` previously displayed alarming red
styling (red icons, red text, red alert icons) that made it look like
the user did something wrong. This PR replaces the scary error
presentation with a calm, unified style and adds a dedicated timeout
display for subagent tools.
## Changes
### Unified failure style (all tools)
- Replace red `CircleAlertIcon` + `text-content-destructive` with a
muted `TriangleAlertIcon` in `text-content-secondary` across **all 11
tool renderers**.
- Remove red icon/label recoloring on error from `ToolIcon` and all
specialized tool components.
- Error details remain accessible via tooltip on hover.
### Subagent timeout display
- `ClockIcon` with "Timed out waiting for [Title]" instead of a generic
error display.
- `CircleXIcon` for non-timeout subagent errors with proper error verbs
("Failed to spawn", "Failed waiting for", etc.) instead of the
misleading running verb ("Waiting for").
- Timeout detection from result string/error field containing "timed
out".
### Title resolution for historical messages
- `ConversationTimeline` now computes `subagentTitles` via
`useMemo(buildSubagentTitles(...))` and passes it to historical
`ChatMessageItem` rendering, so `wait_agent` can resolve the actual
agent title from a prior `spawn_agent` result even outside streaming
mode.
### Stories
8 new stories: `GenericToolFailed`, `GenericToolFailedNoResult`,
`SubagentWaitTimedOut`, `SubagentWaitTimedOutWithTitle`,
`SubagentWaitTimedOutTitleFromMap`, `SubagentSpawnError`,
`SubagentWaitError`, `MCPToolFailedUnifiedStyle`.
## Files changed (15)
- `tool/Tool.tsx` — GenericToolRenderer + SubagentRenderer
- `tool/SubagentTool.tsx` — timeout/error verbs, icon changes
- `tool/ToolIcon.tsx` — remove destructive recoloring
- `tool/*.tsx` (10 specialized tools) — unified warning icon
- `ConversationTimeline.tsx` — pass subagentTitles to historical
rendering
- `tool.stories.tsx` — 8 new stories, updated existing assertions
- Move `agent/agentdesktop/` to `agent/x/agentdesktop/` to signal
experimental/unstable status
- Update import paths in `agent/agent.go` and `api_test.go`
> 🤖 This mechanical refactor was performed by an agent. I made sure it
didn't change anything it wasn't supposed to.
- Clarify the system prompt to prefer tools before asking the user for
clarification.
- Limit clarification to cases where ambiguity or user preferences
materially affect the outcome.
- Remove the contradictory instruction to always start by asking
clarifying questions.
> 🤖 This PR has been reviewed by the author.
### Changes
**coder/coder:**
- `coderd/aibridge/aibridge.go` — Added `HeaderCoderBYOKToken` constant,
`IsBYOK()` helper, and updated `ExtractAuthToken` to check the BYOK
header first.
- `enterprise/aibridged/http.go` — BYOK-aware header stripping: in BYOK
mode only the BYOK header is stripped (user's LLM credentials
preserved); in centralized mode all auth headers are stripped.
<hr/>
**NOTE**: `X-Coder-Token` was removed! As of now `ExtractAuthToken`
retrieves token either from `X-Coder-AI-Governance-BYOK-Token` or from
`Authorization`/`X-Api-Key`.
---------
Co-authored-by: Susana Ferreira <susana@coder.com>
Co-authored-by: Danny Kopping <danny@coder.com>
## Summary
Adds a general-purpose `map[string]string` label system to chats, stored
as jsonb with a GIN index for efficient containment queries.
This is a standalone foundational feature that will be used by the
upcoming Automations feature for session identity (matching webhook
events to existing chats), replacing the need for bespoke session-key
tables.
## Changes
### Database
- **Migration 000451**: Adds `labels jsonb NOT NULL DEFAULT '{}'` column
to `chats` table with a GIN index (`idx_chats_labels`)
- **`InsertChat`**: Accepts labels on creation via `COALESCE(@labels,
'{}')`
- **`UpdateChatByID`**: Supports partial update —
`COALESCE(sqlc.narg('labels'), labels)` preserves existing labels when
NULL is passed
- **`GetChats`**: New `has_labels` filter using PostgreSQL `@>`
containment operator
- **`GetAuthorizedChats`**: Synced with generated `GetChats` (new column
scan + query param)
### API
- **Create chat** (`POST /chats`): Accepts optional `labels` field,
validated before creation
- **Update chat** (`PATCH /chats/{chat}`): Supports `labels` field for
atomic label replacement
- **List chats** (`GET /chats`): Supports `?label=key:value` query
parameters (multiple are AND-ed)
### SDK
- `Chat`, `CreateChatRequest`, `UpdateChatRequest`, `ListChatsOptions`
all gain `Labels` fields
- `UpdateChatRequest.Labels` is a pointer (`*map[string]string`) so
`nil` means "don't change" vs empty map means "clear all"
### Validation (`coderd/httpapi/labels.go`)
- Max 50 labels per chat
- Key: 1–64 chars, must match `[a-zA-Z0-9][a-zA-Z0-9._/-]*` (supports
namespaced keys like `github.repo`, `automation/pr-number`)
- Value: 1–256 chars
- 13 test cases covering all edge cases
### Chat runtime
- `chatd.CreateOptions` gains `Labels` field, threaded through to
`InsertChat`
- Existing `UpdateChatByID` callers (e.g., quickgen title updates) are
unaffected — NULL labels preserve existing values via COALESCE
We had a bug where computer use base64-encoded screenshots would not be
interpreted as screenshots anymore once saved to the db, loaded back
into memory, and sent to Anthropic. Instead, they would be interpreted
as regular text. Once a computer use agent made enough screenshots and
stopped, and you tried sending it another message, you'd get an out of
context error:
<img width="808" height="367" alt="Screenshot 2026-03-23 at 12 02 54"
src="https://github.com/user-attachments/assets/f0bf6be2-4863-47ca-a7a9-9e6d9dfceeed"
/>
This PR fixes that.
## Summary
Introduces a new `context-file` ChatMessagePart type for persisting
workspace instruction files (AGENTS.md) as durable, frontend-visible
message parts. This is the foundation for showing loaded context files
in the chat input's context indicator tooltip.
### Problem
Previously, instruction files were resolved transiently on every turn
via `resolveInstructions()` → `InsertSystem()` and injected into the
in-memory prompt without persistence. The frontend had no knowledge that
instruction files were loaded into context, and there was no way to
surface this information to users.
### Solution
Instruction files are now read **once** when a workspace is first
attached to a chat (matching how [openai/codex handles
it](https://developers.openai.com/codex/guides/agents-md)) and persisted
as `user`-role, `both`-visibility message parts with a new
`context-file` type. This ensures:
- **Durability**: survives page refresh (data is in the DB, returned by
`getChatMessages`)
- **Cache-friendly**: `user`-role avoids the system-message hoisting
that providers do, keeping the instruction content in a stable position
for prompt caching
- **Frontend-visible**: the frontend receives paths and truncation
status for future context indicator rendering
- **Extensible**: the same pattern works for Skills (future)
### Key changes
| Layer | Change |
|---|---|
| **SDK** (`codersdk/chats.go`) | Add `ChatMessagePartTypeContextFile`
with `context_file_path`, `context_file_content` (internal, stripped
from API), `context_file_truncated` fields |
| **Prompt expansion** (`chatprompt`) | Expand `context-file` parts to
`<workspace-context>` text blocks in `partsToMessageParts()` |
| **Chat engine** (`chatd.go`) | Add `persistInstructionFiles()`, called
on first turn with a workspace. Remove per-turn `resolveInstructions()`
+ `InsertSystem()` from `processChat()` and `ReloadMessages` |
| **Frontend** | Ignore `context-file` parts in `messageParsing.ts` and
`streamState.ts` (no rendering yet — follow-up will add tooltip display)
|
### How it works
1. On each turn, `processChat` checks if any loaded message contains
`context-file` parts
2. If not (first turn with a workspace), reads AGENTS.md files via the
workspace agent connection and persists them
3. For this first turn, also injects the instruction text into the
prompt (since messages were loaded before persistence)
4. On all subsequent turns, `ConvertMessagesWithFiles()` encounters the
persisted `context-file` parts and expands them into text automatically
— no extra resolution needed
The description read "Control access of templates for users and groups
to templates" with "templates" appearing twice and garbled grammar.
Simplified to "Control user and group access to templates."
---------
Co-authored-by: Jake Howell <jacob@coder.com>
HealthLayout sets refetchInterval: 30_000 on its health query.
In storybook tests, the seeded cache data prevents the initial
fetch, but interval polling still fires after 30s, hitting the
Vite proxy with no backend. This caused test-storybook to hang
indefinitely in environments without a running coderd.
Set refetchInterval: false in the storybook QueryClient defaults
alongside the existing staleTime: Infinity and retry: false.
The `ComputerUseProviderTool` function needed a little bit of an
adjustment because I changed `NewComputerUseTool`'s signature in
upstream fantasy a little bit.
- Stores a deployment-wide agents template allowlist in `site_configs`
(`agents_template_allowlist`)
- Adds `GET/PUT /api/experimental/chats/config/template-allowlist`
endpoints
- Filters `list_templates`, `read_template`, and `create_workspace` chat
tools by allowlist, if defined (empty=all allowed)
- Add "Templates" admin settings tab in Agents UI ([what it looks
like](https://624de63c6aacee003aa84340-sitjilsyrr.chromatic.com/?path=/story/pages-agentspage-agentsettingspageview--template-allowlist))
> 🤖 This PR was created with the help of Coder Agents, and has been
reviewed by my human. 🧑💻
> **PR Stack**
>
> 1. #23351 ← `#23282`
> 2. **#23282** ← `#23275` *(you are here)*
> 3. #23275 ← `#23349`
> 4. #23349 ← `main`
---
## Summary
Replaces raw error strings and infinite "Thinking..." spinners in the
agents chat UI with a structured live-status model that drives startup,
retry, and failure UI from one source of truth.
This branch also folds in the frontend follow-up fixes that fell out of
that refactor: malformed `retrying_at` timestamps no longer render
`Retrying in NaNs`, stale persisted generic errors no longer outlive a
recovered chat status, and partial streamed output stays visible when a
response fails after blocks have already rendered.
Consumes the structured error metadata added in #23275.
Retry-After header handling remains in #23351.
<img width="853" height="493" alt="image"
src="https://github.com/user-attachments/assets/5a4a1690-5e22-4ece-965c-a000fd669244"
/>
<img width="812" height="517" alt="image"
src="https://github.com/user-attachments/assets/e78d28ce-1566-48ca-a991-62c6e1838079"
/>
<img width="847" height="523" alt="image"
src="https://github.com/user-attachments/assets/e5fd7b60-4a3c-4573-ba4c-4e5f6dbfbdc3"
/>
## Problem
The previous AgentDetail chat UI derived startup, retry, and failure
behavior from several loosely connected bits of state spread across
`ChatContext`, `AgentDetailContent`, `ConversationTimeline`, and ad hoc
props. That made the UI inconsistent: some failures were just raw
strings, retry state could only partially describe what was happening,
startup could sit on an infinite spinner, and rendering decisions
depended on local booleans instead of one authoritative model.
Those splits also made edge cases brittle. Invalid retry timestamps
could produce broken countdown text, persisted generic errors could
linger after recovery, and streamed partial output could disappear if
the turn later failed.
## Fix
Introduce a structured live-status pipeline for AgentDetail.
`ChatContext` now normalizes stream errors and retry metadata into
richer state, `liveStatusModel` centralizes precedence and phase
derivation, and `ChatStatusCallout` renders startup, retry, and terminal
failure states with shared copy, provider attribution, status links,
attempt metadata, and guarded countdown handling.
`AgentDetailContent` and `ConversationTimeline` now consume that single
model instead of juggling separate error and stream booleans, while
usage-limit messaging stays on its explicit path. The result is a
timeline that shows consistent state transitions, preserves accumulated
assistant output across failures, suppresses stale generic errors once
live state recovers, and has focused model, store, and story coverage
around those behaviors.
## Problem
Text attachments (`InlineTextAttachmentButton`) and image thumbnails
(`ImageThumbnail`) rendered at different heights when displayed side by
side in user messages. Text cards had no explicit height
(content-driven), while images used `h-16` (64px).
## Changes
**`ConversationTimeline.tsx`**
- Added `h-16` to `InlineTextAttachmentButton` to match `ImageThumbnail`
- Added `isPlaceholder` prop: when the content hasn't been fetched yet
(file_id path), renders "Pasted text" in sans-serif `text-sm` with
`items-center` alignment instead of monospace `text-xs`
- Once real content loads, it still renders in `font-mono text-xs` with
`formatTextAttachmentPreview()`
**`ConversationTimeline.stories.tsx`**
- Added `UserMessageWithMixedAttachments` story showing a text
attachment and image side by side as a visual regression guard
When a chat is created, createChat.onSuccess invalidates the sidebar
list query, triggering a background refetch. The refetch can hit the
server before async title generation finishes, returning the fallback
(truncated) title. If the title_change WebSocket event arrives and
writes the generated title into the cache, the in-flight refetch
response then overwrites it with the stale fallback title.
Cancel any in-flight sidebar-list and per-chat refetches before every
WebSocket-driven cache write. This mirrors the existing pattern in
archiveChat/unarchiveChat, which cancel queries before optimistic
updates for the same reason.
## Problem
When chatd pushes a branch and then creates a PR (e.g. `git push`
followed by `gh pr create`), the gitsync background worker often picks
up the stale `chat_diff_statuses` row between the two operations. At
that point no PR exists yet, so the worker skips the row. However, the
acquisition SQL locks the row for **5 minutes** (crash-recovery
interval), creating a dead zone where the PR diff is invisible in the UI
until the user manually navigates to the chat.
### Root cause
1. `git push` triggers `GIT_ASKPASS` → coderd external-auth handler →
`MarkStale()` sets `stale_at = now - 1s`
2. Background worker acquires the row within ~10s, atomically bumps
`stale_at = NOW() + 5 min` (crash-recovery lock)
3. Worker calls `ResolveBranchPullRequest` → no PR exists yet → returns
`nil` → worker skips with `continue`
4. `gh pr create` completes moments later, but uses its own auth (not
`GIT_ASKPASS`), so no second `MarkStale` fires
5. Row is locked for 5 minutes before the worker can retry
Loading the chat works immediately because `GET /chats/{chat}` calls
`resolveChatDiffStatus` synchronously, which discovers the PR inline.
## Fix
When `ResolveBranchPullRequest` returns nil (no PR yet) **and** the row
was recently marked stale (within 2 minutes), apply a short 15-second
backoff via `BackoffChatDiffStatus` instead of letting the 5-minute
acquisition lock stand. Outside the retry window, the worker skips the
row as before — no indefinite fast-polling for branches that never
receive a PR.
To make the "recently marked stale" check work, `updated_at` is no
longer overwritten by the acquisition and backoff SQL queries. This
preserves it as a reliable "last externally changed" timestamp (set by
`MarkStale` or a successful refresh).
### Behavior summary
| Scenario | `updated_at` age | Backoff | Effective retry |
|---|---|---|---|
| Fresh push, no PR yet | < 2 min | 15s (`NoPRBackoff`) | ~15s |
| Old row, no PR | ≥ 2 min | None (skip) | ~5 min (acquisition lock) |
| Error (any age) | Any | 120s (`DiffStatusTTL`) | ~120s |
| Success (any age) | Any | 120s (`DiffStatusTTL`) | ~120s |
## Changes
- **`coderd/database/queries/chats.sql`** — Remove `updated_at = NOW()`
from `AcquireStaleChatDiffStatuses` and `BackoffChatDiffStatus`
- **`coderd/database/queries.sql.go`** — Regenerated
- **`coderd/x/gitsync/worker.go`** — Add `NoPRBackoff` (15s) and
`NoPRRetryWindow` (2 min) constants; apply short backoff only within the
retry window
- **`coderd/x/gitsync/worker_test.go`** — Add
`TestWorker_NoPR_RecentMarkStale_BacksOffShort` and
`TestWorker_NoPR_OldRow_Skips`
- Add `SanitizePromptText` stripping ~24 invisible Unicode codepoints
and collapsing excessive newlines
- Apply at write and read paths for defense-in-depth
- Frontend: warn in both prompt textareas when invisible characters
detected
- Explicit codepoint list (not blanket `unicode.Cf`) to avoid breaking
flag emoji
- 34 Go tests + idempotency meta-test, 11 TS unit tests, 4 Storybook
stories
> This PR was created with the help of Coder Agents, and was reviewed by my human.
## Changes
Replaces the "Loading model catalog..." / "Loading models..." text flash
on `/agents` with a clean skeleton loading state, and removes the
admin-nag status messages entirely.
### Removed
- `getModelCatalogStatusMessage()` function and
`modelCatalogStatusMessage` prop chain — "Loading model catalog..." /
"No chat models are configured. Ask an admin to configure one." text
below the input
- `inputStatusText` prop chain — "No models configured. Ask an admin." /
"Models are configured but unavailable. Ask an admin." inline text
- `modelCatalogError` prop from `AgentCreateForm`
### Changed
- `AgentChatInput`: when `isModelCatalogLoading` is true, renders a
`Skeleton` in place of the `ModelSelector`
- `getModelSelectorPlaceholder()`: "No Models Configured" / "No Models
Available" (title case)
### Added
- `LoadingModelCatalog` story — skeleton where model selector sits
- `NoModelsConfigured` story — selector shows "No Models Configured"
Net -69 lines.
- Add `X-Coder-Owner-Id`, `X-Coder-Chat-Id`, `X-Coder-Subchat-Id`,
`X-Coder-Workspace-Id` headers to all outgoing LLM API requests from
chatd
- Extend `ModelFromConfig` with `extraHeaders` param, forwarded via
Fantasy `WithHeaders` on all 8 providers
- Add `CoderHeaders(database.Chat)` helper to build the header map from
chat state
- Update all 4 `ModelFromConfig` call sites (resolveChatModel,
computer-use override, title gen, push summary)
- Thread `database.Chat` into `generatePushSummary` (was `chatTitle
string`)
- Tests: `TestCoderHeaders` (4 subtests),
`TestModelFromConfig_ExtraHeaders` (OpenAI + Anthropic),
`TestModelFromConfig_NilExtraHeaders`
- Refactor existing `TestModelFromConfig_UserAgent` to use channel-based
signaling
> 🤖 This PR was generated by Coder Agents and self-reviewed by a human.
create_workspace could create a replacement workspace after a single 5s
agent dial failed, even when the existing workspace agent had recently
checked in. That made temporary reachability blips look like dead
workspaces and let chatd replace a running workspace too aggressively.
Use the workspace agent's DB-backed status with the deployment's
AgentInactiveDisconnectTimeout before allowing replacement. Recently
connected and still-connecting agents now reuse the existing workspace,
while disconnected or timed-out agents still allow a new workspace. This
also threads the inactivity timeout through chatd and adds focused
coverage for the reuse and replacement branches.
## Problem
The sticky user message visual state (`--clip-h`, fade gradient, push-up
positioning) is driven by an `update()` function that only ran on
`scroll` events. The chat scroll container uses `flex-col-reverse`,
where `scrollTop = 0` means "at bottom." When streaming content grows
the transcript while the user is auto-scrolled to the bottom,
`scrollTop` stays at `0` — no `scroll` event fires — so `update()` never
runs and the sticky messages become visually stale until the user
manually scrolls.
## Fix
Add a `ResizeObserver` on the scroller's content wrapper inside the
existing `useLayoutEffect` that sets up the scroll/resize listeners.
When the content wrapper resizes (streaming growth), it fires the
observer which calls `update()` through the same RAF-throttle pattern
used by the scroll handler.
Single observer per sticky message instance. Zero cost when nothing is
resizing. Cleanup handled in the same effect teardown.
Wraps the `GenericToolRenderer` (used for MCP and unrecognized tools) in
`ToolCollapsible` so the result content is hidden behind a
click-to-expand chevron, matching the pattern used by `read_file`,
`write_file`, and other built-in tool renderers.
### Changes
- Move `ToolIcon` + `ToolLabel` into the `ToolCollapsible` `header` prop
- Compute `hasContent` from `writeFileDiff` / `fileContent` /
`resultOutput` — when there's no content, the header renders as a plain
div with no chevron
- Remove `ml-6` from `ScrollArea` classNames (the `ToolCollapsible`
button handles its own layout)
- `defaultExpanded` is `false` by default in `ToolCollapsible`, so
results start collapsed
### Before
MCP tool results were always fully visible inline.
### After
MCP tool results are collapsed by default with a chevron toggle,
consistent with `read_file`, `edit_files`, `list_templates`, etc.
## Problem
When an MCP tool returns an `EmbeddedResource` content item (e.g. GitHub
MCP server returning file contents via `get_file_contents`), the
`convertCallResult` function falls through to the `default` case,
producing:
```
[unsupported content type: mcp.EmbeddedResource]
```
This loses the actual resource content and shows an unhelpful message in
the chat UI.
## Root Cause
The type switch in `convertCallResult` handles `TextContent`,
`ImageContent`, and `AudioContent`, but not the other two `mcp.Content`
implementations from `mcp-go`:
- `mcp.EmbeddedResource` — wraps a `ResourceContents` (either
`TextResourceContents` or `BlobResourceContents`)
- `mcp.ResourceLink` — contains a URI, name, and description
## Fix
Add two new cases to the type switch:
1. **`mcp.EmbeddedResource`**: nested type switch on `.Resource`:
- `TextResourceContents` → append `.Text` to `textParts`
- `BlobResourceContents` → base64-decode `.Blob` as binary (type
`"image"` or `"media"` based on MIME)
- Unknown → fallback `[unsupported embedded resource type: ...]`
2. **`mcp.ResourceLink`**: render as `[resource: Name (URI)]` text
## Testing
Added 3 new test cases (all passing, full suite 23/23 PASS):
- `TestConnectAll_EmbeddedResourceText` — text resource extraction
- `TestConnectAll_EmbeddedResourceBlob` — binary blob decoding
- `TestConnectAll_ResourceLink` — resource link rendering
## Summary
Previously the user's MCP server toggles were ephemeral — every page
reload or navigation to a new chat reset them to the admin-configured
defaults (`force_on` + `default_on`). This was frustrating for users who
routinely disabled a default-on server or enabled a default-off one.
This PR persists the MCP server picker selection to `localStorage` under
the key `agents.selected-mcp-server-ids`.
## Changes
### `MCPServerPicker.tsx`
- **`mcpSelectionStorageKey`** — exported constant for the localStorage
key.
- **`getSavedMCPSelection(servers)`** — reads from localStorage, filters
out stale/disabled IDs, always includes `force_on` servers.
- **`saveMCPSelection(ids)`** — writes the current selection to
localStorage.
### `AgentCreateForm.tsx`
- Initialises `userMCPServerIds` from `getSavedMCPSelection` instead of
`null`.
- Calls `saveMCPSelection` on every toggle.
### `AgentDetail.tsx`
- Adds localStorage as a fallback tier in `effectiveMCPServerIds`: user
override → chat record → **saved selection** → defaults.
- Calls `saveMCPSelection` on every toggle.
### `MCPServerPicker.test.ts` (new)
- 13 unit tests covering save, restore, stale-ID filtering, force_on
merging, invalid JSON handling, and disabled server filtering.
## Fallback priority
| Priority | Source | When |
|----------|--------|------|
| 1 | In-memory state | User toggled during this session |
| 2 | Chat record | Existing conversation with `mcp_server_ids` |
| 3 | localStorage | User has a saved selection from a prior session |
| 4 | Server defaults | `force_on` + `default_on` servers |
Child chats created via `spawn_agent` and `spawn_computer_use_agent`
were not inheriting the parent's `MCPServerIDs`, meaning subagents lost
access to the parent's MCP server tools.
## Changes
- Pass `parent.MCPServerIDs` in the `CreateOptions` for both
`createChildSubagentChat()` and the `spawn_computer_use_agent` tool
handler in `coderd/x/chatd/subagent.go`.
## Tests
Added 3 tests in `subagent_internal_test.go`:
- `TestCreateChildSubagentChat_InheritsMCPServerIDs` — verifies child
chat gets parent's MCP server IDs (multiple servers)
- `TestSpawnComputerUseAgent_InheritsMCPServerIDs` — verifies computer
use subagent gets parent's MCP server IDs via the tool
- `TestCreateChildSubagentChat_NoMCPServersStaysEmpty` — verifies no
regression when parent has no MCP servers
*Disclaimer: implemented by a Coder Agent and reviewed by me.*
Renames the env vars used by chatd integration tests from the canonical
`SOMEPROVIDER_API_KEY` (e.g. `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`) to
`SOMEPROVIDER_TEST_API_KEY` (e.g. `ANTHROPIC_TEST_API_KEY`,
`OPENAI_TEST_API_KEY`) so that test-specific keys don't collide with
production/canonical provider credentials.
Relates to https://github.com/coder/internal/issues/1425
See also:
https://codercom.slack.com/archives/C0AGTPWLA3U/p1774433646799499
## Summary
Adds `xhigh` to the OpenAI reasoning effort normalizer so GPT-5.4 class
models can use `reasoning_effort: xhigh` without it being silently
dropped.
## Problem
The SDK schema (`codersdk/chats.go`) already advertises `xhigh` as a
valid `reasoning_effort` value, but the runtime normalizer in
`chatprovider.go` only accepts `minimal|low|medium|high` for the OpenAI
provider. When a user sets `xhigh`, `ReasoningEffortFromChat()` returns
`nil` and the value never reaches the OpenAI API.
## Changes
- **Fantasy dependency**: Updated `kylecarbs/fantasy` (cj/go1.25) which
now includes the `ReasoningEffortXHigh` constant
([kylecarbs/fantasy#9](https://github.com/kylecarbs/fantasy/pull/9)).
- **`chatprovider.go`**: Adds `fantasyopenai.ReasoningEffortXHigh` to
the OpenAI case in `ReasoningEffortFromChat()`.
- **`chatprovider_test.go`**: Adds `OpenAIXHighEffort` test case.
## Upstream
-
[charmbracelet/fantasy#186](https://github.com/charmbracelet/fantasy/pull/186)
Consolidates coderdtest invocations in 7 tests to reduce 23 instances to 7 across:
- `TestGetUser` (3 → 1) — read-only user lookups
- `TestUserTerminalFont` (3 → 1) — each creates own user via
CreateAnotherUser
- `TestUserTaskNotificationAlertDismissed` (3 → 1) — each creates own
user
- `TestUserLogin` (3 → 1) — each creates/deletes own user
- `TestExpMcpConfigureClaudeCode` (5 → 1) — writes to isolated temp dirs
- `TestOAuth2RegistrationTokenSecurity` (3 → 1) — independent
registrations
- `TestOAuth2SpecificErrorScenarios` (3 → 1) — independent error
scenarios
> 🤖 This PR was created with the help of Coder Agents, and has been
reviewed by my human. 🧑💻
## Description
When duplicating a template that has prebuilds configured, a warning
alert is now shown above the create template form. The warning displays
the total number of prebuilds that will be automatically created.

### Changes
**Single file modified:**
`site/src/pages/CreateTemplatePage/DuplicateTemplateView.tsx`
- Fetches presets for the template's active version using the existing
`templateVersionPresets` React Query helper
- Computes total prebuild count by summing `DesiredPrebuildInstances`
across all presets
- Renders a warning `<Alert>` above the form when prebuilds are
configured
### Design decisions
| Decision | Rationale |
|---|---|
| Warning in `DuplicateTemplateView`, not `CreateTemplateForm` | Only
the duplicate flow needs this. Keeps data fetching local. No new props.
|
| Feature-flag gated (`workspace_prebuilds`) | Matches existing pattern
in `TemplateLayout.tsx`. |
| Non-blocking query | Presets fetch failure shouldn't prevent
duplication. Warning is informational. |
| Count with pluralization | Users know exactly how many prebuilds will
spin up. |
<img width="1136" height="375" alt="image"
src="https://github.com/user-attachments/assets/1ca42608-a204-48f5-b27d-6d476ab32fa7"
/>
Closes#18987
GPG emits an "untrusted key" warning when signing with a key that hasn't
been assigned a trust level, which can cause verification steps to fail
or produce noisy output.
Example:
```sh
gpg: Signature made Tue Mar 24 20:56:59 2026 UTC
gpg: using RSA key 21C96B1CB950718874F64DBD6A5A671B5E40A3B9
gpg: Good signature from "Coder Release Signing Key <security@coder.com>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 21C9 6B1C B950 7188 74F6 4DBD 6A5A 671B 5E40 A3B9
```
After importing the release key, derive its fingerprint from the keyring
and mark it as ultimately trusted via `--import-ownertrust`.
The fingerprint is extracted dynamically rather than hard-coded, so this
works for any key supplied via `CODER_GPG_RELEASE_KEY_BASE64`.
## Problem
Template administrators cannot delete templates that have running
prebuilds.
The `deleteTemplate` handler fetches all non-deleted workspaces and
blocks
deletion if any exist, making no distinction between human-owned
workspaces
and prebuild workspaces (owned by the system `PrebuildsSystemUserID`).
This forces admins into a manual multi-step workflow: set
`desired_instances`
to 0 on every preset, wait for the reconciler to drain prebuilds, then
retry
deletion. Prebuilds are an internal system concern that admins should
not need
to manage manually.
## Fix
Replace the blanket `len(workspaces) > 0` guard in `deleteTemplate` with
a
loop that only blocks deletion when a non-prebuild (human-owned)
workspace
exists. Prebuild workspaces — owned by `database.PrebuildsSystemUserID`
— are
now ignored during the check.
Once the template is soft-deleted (`deleted=true`), the existing
prebuilds
reconciler detects `isActive()=false` and cleans up remaining prebuilds
asynchronously. No changes to the reconciler are needed.
The error message and HTTP status for human workspaces remain unchanged.
## Testing
Added two new subtests to `TestDeleteTemplate`:
- **`OnlyPrebuilds`**: deletion succeeds when only prebuild workspaces
exist.
- **`PrebuildsAndHumanWorkspaces`**: deletion is blocked when both
prebuild
and human workspaces exist.
Existing reconciler test ("soft-deleted templates MAY have prebuilds")
already
covers post-deletion prebuild cleanup.
> **PR Stack**
> 1. #23351 ← `#23282`
> 2. #23282 ← `#23275`
> 3. **#23275** ← `#23349` *(you are here)*
> 4. #23349 ← `main`
---
## Summary
Extracts a structured error classification subsystem for agent chat
(`chatd`) so that retry and error payloads carry machine-readable
metadata — error kind, provider name, HTTP status code, and retryability
— instead of raw error strings.
This is the **backend half** of the error-handling work. The frontend
counterpart is in #23282.
## Changes
### New package: `coderd/chatd/chaterror/`
Canonical error classification — extracts error kind, provider, status
code, and user-facing message from raw provider errors. One source of
truth that drives both retry policy and stream payloads.
- **`kind.go`**: Error kind enum (`rate_limit`, `timeout`, `auth`,
`config`, `overloaded`, `unknown`).
- **`signals.go`**: Signal extraction — parses provider name, HTTP
status code, and retryability from error strings and wrapped types.
- **`classify.go`**: Classification logic — maps extracted signals to an
error kind.
- **`message.go`**: User-facing message templates keyed by kind +
signals.
- **`payload.go`**: Projectors that build `ChatStreamError` and
`ChatStreamRetry` payloads from a classified error.
### Modified
- **`codersdk/chats.go`**: Added `Kind`, `Provider`, `Retryable`,
`StatusCode` fields to `ChatStreamError` and `ChatStreamRetry`.
- **`coderd/chatd/chatretry/`**: Thinned to retry-policy only;
classification logic moved to `chaterror`.
- **`coderd/chatd/chatloop/`**: Added per-attempt first-chunk timeout
(60 s) via `guardedStream` wrapper — produces retryable
`startup_timeout` errors instead of hanging forever.
- **`coderd/chatd/chatd.go`**: Publishes normalized retry/error payloads
via `chaterror` projectors.
<!--
If you have used AI to produce some or all of this PR, please ensure you
have read our [AI Contribution
guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING)
before submitting.
-->
Adds the AI Bridge sessions list page.
## Summary
The `build` CI job on `main` is failing with:
```
ERROR: Computed invalid windows version format: 2.32.0-rc.0.1
```
This started when the `v2.32.0-rc.0` tag was created, making `git
describe` produce versions like `2.32.0-rc.0-devel+4f571f8ff`.
## Root cause
`scripts/build_go.sh` converts the version to a Windows-compatible
`X.Y.Z.{0,1}` format by stripping pre-release segments. It uses
`${var%-*}` (shortest suffix match), which only removes the last
`-segment`. For RC versions this leaves `-rc.0` intact:
```
2.32.0-rc.0-devel → strip %-* → 2.32.0-rc.0 → + .1 → 2.32.0-rc.0.1 ✗
```
## Fix
Switch to `${var%%-*}` (longest suffix match) so all pre-release
segments are stripped from the first hyphen onward:
```
2.32.0-rc.0-devel → strip %%-* → 2.32.0 → + .1 → 2.32.0.1 ✓
```
Verified all version patterns produce valid output:
| Input | Output |
|---|---|
| `2.32.0` | `2.32.0.0` |
| `2.32.0-devel` | `2.32.0.1` |
| `2.32.0-rc.0-devel` | `2.32.0.1` |
| `2.32.0-rc.0` | `2.32.0.1` |
Fixes
https://github.com/coder/coder/actions/runs/23511163474/job/68434008241
Nine subtests covering the poll loop, pubsub notification path,
timeout, context cancellation, descendant auth check, and both
error-status branches in handleSubagentDone.
Wire p.clock through awaitSubagentCompletion's timer and ticker
so future tests can use quartz mock clock. Tests use channel-based
coordination and context.WithTimeout instead of time.Sleep.
Coverage: awaitSubagentCompletion 0%->70.3%, handleSubagentDone
0%->100%, checkSubagentCompletion 0%->77.8%,
latestSubagentAssistantMessage 0%->78.9%.
Super unclear why CI hates me and only [fails
lint](https://github.com/coder/coder/actions/runs/23504799702/job/68409588632?pr=23385)
for me (I feel personally attacked), but `dpdm` detected a circular
dependency between the WorkspaceSettingPage and its Sidebar in my other
branch. They both wanted the same context/hook combo, so easy solve to
move the hook/context into a third module to resolve the circular dep.
## Summary
Large pasted text that the UI collapses into an attachment chip was
completely invisible to the LLM. Providers only accept specific MIME
types (images, PDFs) in file content blocks — a `text/plain` `FilePart`
is silently dropped, so the model received nothing for pasted content.
## Fix
Detect paste-originated text files by their
`pasted-text-{timestamp}.txt` filename pattern and convert them to
`fantasy.TextPart` with a bounded 128 KiB inline body and truncation
notice. Binary uploads and real uploaded text files keep their existing
`FilePart` semantics.
The detection uses the existing frontend naming convention
(`pasted-text-YYYY-MM-DD-HH-MM-SS.txt`) combined with a text-like MIME
check for defense-in-depth. A TODO marks this for migration to explicit
origin metadata.
<details>
<summary>Review notes: intentionally skipped findings</summary>
A 10-reviewer deep review was run on this change. The following findings
were raised and intentionally dropped after cross-check. Documenting
them here so future reviewers do not re-flag the same concerns:
**"Unresolved file IDs cause silent data loss" (Edge Case Analyst P1)**
— When a file ID is not in the resolver map, `name` stays empty and
paste detection fails. This is pre-existing behavior for ALL file types
(not introduced by this change). The resolver calls `GetChatFilesByIDs`
which returns whatever rows exist; missing IDs simply fall through to an
empty `FilePart`. The Contract Auditor independently traced this path
and confirmed the fallback is safe. If the file was deleted between
message construction and conversion, the model already saw nothing
before this patch — this change does not make it worse.
**"String builder pre-allocation overhead" (Performance Analyst P1)** —
Misidentified scope. `formatSyntheticPasteText` is only called when
`isSyntheticPaste` returns true (actual synthetic pastes), not for every
file part. The `Grow()` call is correct and efficient.
**"Constant naming violates Uber style" (Style Reviewer P1)** —
Over-severity. `syntheticPasteInlineBudget` is standard Go camelCase for
unexported constants, consistent with the Uber guide and surrounding
code.
**"`IsSyntheticPasteForTest` naming is misleading" (Style Reviewer P2)**
— This is the standard Go `export_test.go` pattern. The `ForTest` suffix
is conventional.
</details>
Observations bypassed the severity test entirely. A reviewer filing
a convention violation as Obs meant it skipped both the upgrade
check and the unnecessary-novelty gate. The combination let issues
pass through as dropped observations when they warranted P3+.
Two changes:
- Severity test now applies to findings AND observations.
- Unnecessary novelty check now covers reviewer-flagged Obs.
When developers switch branches, the database may have migrations
from the other branch that don't exist in the current binary.
This causes coder server to fail at startup, leaving developers
stuck.
The develop script now detects this before starting the server:
1. Connects to postgres (starts temp embedded instance for
built-in postgres, or uses CODER_PG_CONNECTION_URL).
2. Compares DB version against the source's latest migration.
3. If DB is ahead, searches git history for the missing down
SQL files and applies them in a transaction.
4. If git recovery fails (ambiguous versions across branches,
missing files), falls back to resetting the public schema.
Also adds --reset-db and --skip-db-recovery flags.
When edit_files receives multiple files, each file was processed
independently: read, compute edits, write. If file B failed, file A
was already written to disk. The caller got an error but had no way
to know which files were modified.
Split editFile into prepareFileEdit (read + compute, no side
effects) and a write phase. The handler runs all preparations
first and writes only if every file's edits succeed.
A write-phase failure (e.g. disk full) can still leave earlier
files committed. True cross-file atomicity would require
filesystem transactions. The prepare phase catches the common
failure modes: bad paths, search misses, permission errors.
## Summary
Fixes several bugs in the markdown URL transform that replaces
`localhost` URLs with workspace port-forward URLs in the AI agent chat.
## Bugs Fixed
### 1. URLs without an explicit port produce `NaN` in the subdomain
When an LLM outputs a URL like `http://localhost/path` (no port),
`parsed.port` is the empty string `""`. `parseInt("", 10)` returns
`NaN`, producing a broken URL like:
```
http://NaN--agent--workspace--user.proxy.example.com/path
```
Now defaults to port 80 for HTTP and 443 for HTTPS via the new
`resolveLocalhostPort()` helper.
### 2. Protocol always hardcoded to `"http"`
The `urlTransform` in `AgentDetail.tsx` always passed `"http"` as the
protocol argument, silently discarding the original URL's scheme. This
meant `https://localhost:8443/...` would not get the `s` suffix in the
subdomain. Now extracts the protocol from the parsed URL, matching the
existing behavior of `openMaybePortForwardedURL`.
### 3. `urlTransform` not memoized
The closure was re-created on every render. Wrapped in `useCallback`
with the four primitive dependencies (`proxyHost`, `agentName`,
`wsName`, `wsOwner`).
### 4. Duplicated `localHosts` definition
The localhost detection set was defined separately in both
`AgentDetail.tsx` and `portForward.ts`. Consolidated into a single
shared export from `portForward.ts`.
## Changes
- **`site/src/utils/portForward.ts`**: Export shared `localHosts` set
and new `resolveLocalhostPort()` helper. Update
`openMaybePortForwardedURL` to use both.
- **`site/src/pages/AgentsPage/AgentDetail.tsx`**: Import shared
`localHosts` and `resolveLocalhostPort`. Fix protocol extraction.
Memoize `urlTransform`.
- **`site/src/utils/portForward.jest.ts`**: Add tests for
`resolveLocalhostPort` and `localHosts`. Renamed from `.test.ts` to
`.jest.ts` to match project convention.
Wraps the chat timeline in React's <Profiler> to emit
performance.measure() entries and throttled console.warn for
slow renders. Inert in standard builds, only produces output
with a profiling build.
Refs #23354
The React Compiler (babel-plugin-react-compiler@1.0.0) handles
memoization automatically for all components in the AgentsPage
compiled path. Three memo() wrappers were redundant:
- ChatMessageItem in ConversationTimeline.tsx
- LazyFileDiff in DiffViewer.tsx
- ChatTreeNode in AgentsSidebar.tsx
Also migrate three Context.Provider usages to the React 19
shorthand (<Context value={...}>) and simplify the EmbedContext
export to use the context directly instead of re-exporting
.Provider as an alias.
Linear's MCP server (`mcp.linear.app`) returns `token_type="bearer"`
(lowercase) in its OAuth2 token response but rejects requests that use
the lowercase form in the `Authorization` header. RFC 6750 says the
scheme is case-insensitive, but Linear enforces capital-B `Bearer`.
Confirmed by running the actual Linear MCP OAuth flow end-to-end:
- `Authorization: Bearer <token>` → **42 tools, works**
- `Authorization: bearer <token>` → **401 invalid_token**
This is a one-line fix: normalize any case variant of `bearer` to
`Bearer` before building the `Authorization` header, matching the
behavior of the mcp-go library's own OAuth handler.
Depot runners are running out of disk space and blocking builds.
Temporarily switch the build and release jobs from depot runners to
GitHub-hosted runners:
- `ci.yaml` build job: `depot-ubuntu-22.04-8` → `ubuntu-latest`
- `release.yaml` check-perms + release jobs: `depot-ubuntu-22.04-8` →
`ubuntu-latest`
**This is intended to be reverted once depot resolves their disk space
issues.**
> 🤖 This PR was created with the help of Coder Agents, and will be
reviewed by my human. 🧑💻
<img width="684" height="540" alt="image"
src="https://github.com/user-attachments/assets/ccd09873-4640-4a54-b3ca-f740dd50b38d"
/>
## Changes
- Move filter dropdown from top nav bar to inline with the first time
group header (e.g. "Today")
- Remove analytics icon from desktop sidebar nav bar
- Change "View details" to "View usage" in the usage indicator dropdown
- Fix green progress bar visibility in dark mode (`bg-surface-green` →
`bg-content-success`)
- Fix missing space before date in "Resets" text
---
PR generated with Coder Agents
- update some config settings to support "absolute"-style imports by
using a `#/` prefix
- migrate some of the imports in the `WorkspacesPage` to use the new
import style as a proof of concept
because of the change in import sorting behavior this results in, this
diff is already kind of hard to look at–even just from a small migration
for a single page. I think breaking this up into bite size pieces isn't
gonna be worth the work, and leaves more time for merge conflicts to
accrue, more times people would likely have to resolve them.
so I think as far as process for this, I'd like to...
- merge this PR as is, where the config changes are relatively easy to
spot in the haystack, with just enough imports updated to prove that the
config changes are correct
- merge another mega PR after this one which just bites the bullet and
migrates everything else in one fell swoop. it'll probably result in a
ton of merge conflicts for open PRs, but at least it'll only do so once
and then it can be over with.
The @ imports at the bottom of this file are auto-loaded by Claude Code
but silently ignored by other agent runtimes (Coder Agents, Zed, etc.).
Add an explicit fallback so those agents know what to read and when.
After a long requestAnimationFrame pause (e.g. backgrounded tab), the
time delta can be very large, causing the character budget to spike and
bypass smooth rendering entirely. Clamp to 100ms.
Extracted from #23236.
## Problem
MCP servers like Linear (`mcp.linear.app`) require PKCE (RFC 7636) for
their OAuth2 flow. Without it, the token exchange may succeed but the
resulting access token is immediately rejected with a 401
`invalid_token` error when the chat daemon tries to connect to the MCP
server.
This means users can authenticate successfully in the UI (the OAuth
popup completes, `auth_connected` shows `true`), but the model never
receives the MCP tools — they silently fail to load.
### Root cause
The `mcpServerOAuth2Connect` handler was calling
`oauth2Config.AuthCodeURL(state)` without any PKCE parameters
(`code_challenge`, `code_challenge_method`). The callback was calling
`oauth2Config.Exchange(ctx, code)` without a `code_verifier`. Linear's
MCP OAuth endpoint decoded state confirms it expected PKCE with
`codeChallengeMethod: "plain"`.
### Investigation
- The chat (`c2c04fc5-5622-4b71-a5a9-80508e86f78e`) had the Linear MCP
server ID in `mcp_server_ids`
- `auth_connected: true` (token row exists in DB)
- No "expired" or "empty token" warnings in logs
- Server log showed: `skipping MCP server due to connection failure ...
error="initialize: transport error: request failed with status 401:
{"error":"invalid_token","error_description":"Missing or invalid access
token"}"`
- Decoding Linear's OAuth state revealed PKCE was expected
## Changes
- Generate a PKCE `code_verifier` during the OAuth2 connect step using
`oauth2.GenerateVerifier()` and store it in a cookie scoped to the
callback path
- Include `code_challenge` (S256) in the authorization redirect URL via
`oauth2.S256ChallengeOption()`
- Pass the `code_verifier` during the token exchange in the callback via
`oauth2.VerifierOption()`
- Fix a nil-pointer guard on `api.HTTPClient` in the callback
- Add tests verifying PKCE parameters are sent correctly and backwards
compatibility when no verifier cookie is present
feat: add deep-review skill for multi-reviewer code review
Add a skill to .agents/skills/deep-review/ that orchestrates parallel
code reviews from domain-specific reviewers (test auditor, security
reviewer, concurrency reviewer, etc.), cross-checks their findings for
contradictions and convergence, then posts a single structured GitHub
review with inline comments.
Each reviewer reads only its own methodology file (roles/{name}.md) to
preserve independent perspectives. The orchestrator cross-checks across
all findings before posting, tracing combined consequences and
calibrating severity in both directions.
Key capabilities: re-review gate for tracking prior findings across
rounds, consequence-based severity (P0-P4), quoting discipline
separating reviewer evidence from orchestrator judgment, and author
independence (same rigor regardless of who wrote the PR).
- Detect `-testify.m` sub-test filtering in `SetupSuite` and skip the `Accounting` check.
> 🤖 This PR was created with the help of Coder Agents, and was reviewed by my human. 🧑💻
## Summary
Stabilizes the /agents chat viewport so users can read older messages
without being yanked to the bottom when new content arrives.
## Architecture
Replaces the implicit scroll-follow behavior with
**ResizeObserver-driven
scroll anchoring**:
- **`autoScrollRef`** is the single source of truth. User scrolling away
from bottom turns it off; scrolling back near bottom or clicking the
button turns it back on.
- A **content ResizeObserver** on an inner wrapper detects transcript
growth. When auto-scroll is on, it re-pins to bottom via double-RAF
(waiting for React commit + layout to settle). When off, it
compensates `scrollTop` by the height delta to preserve the reading
position. Sign-aware for both Chrome-style negative and Firefox-style
positive `flex-col-reverse` scrollTop conventions.
- Compensation is **skipped during pagination** (older messages prepend
into the overflow direction; the browser preserves scrollTop) and
**during reflow** from width changes.
- A **container ResizeObserver** re-pins to bottom after viewport
resizes
(composer growth, panel changes) when auto-scroll is on.
- **`isRestoringScrollRef`** guards against feedback loops from
programmatic scroll writes. The smooth-scroll guard stays active
until the scroll handler detects arrival at bottom.
## Files changed
- **AgentDetailView.tsx**: Rewrote `ScrollAnchoredContainer` with the
new approach.
- **AgentDetailView.stories.tsx**: Refactored `ScrollToBottomButton`
story
scroll helpers into shared utilities.
## Behavior
- **At bottom + new content**: stays pinned, button hidden.
- **Scrolled up + new content**: reading position preserved, no jump.
- **Viewport resize while pinned**: re-pins to bottom.
- Scroll-to-bottom button and smooth scroll still work.
Adds a `propose_plan` tool that presents a workspace markdown file as a
dedicated plan card in the agent UI.
The workflow is: the agent uses `write_file`/`edit_files` to build a
plan file (e.g. `/home/coder/PLAN.md`), then calls `propose_plan(path)`
to present it. The backend reads the file via `ReadFile` and the
frontend renders it as an expanded markdown preview card.
**Backend** (`coderd/x/chatd/chattool/proposeplan.go`): new tool
registered as root-chat-only. Validates `.md` suffix, requires an
absolute path, reads raw file content from the workspace agent. Includes
1 MiB size cap.
**Frontend** (`site/src/components/ai-elements/tool/`): dedicated
`ProposePlanTool` component with `ToolCollapsible` + `ScrollArea` +
`Response` markdown renderer, expanded by default. Custom icon
(`ClipboardListIcon`) and filename-based label.
**System prompt** (`coderd/x/chatd/prompt.go`): added `<planning>`
section guiding the agent to research → write plan file → iterate → call
`propose_plan`.
OpenAI Responses follow-up turns were replaying full assistant/tool
history even when `store=true`, which breaks after reasoning +
provider-executed `web_search` output.
This change persists the OpenAI response ID on assistant messages, then
in `coderd/x/chatd` switches `store=true` follow-ups to
`previous_response_id` chaining with a system + new-user-only prompt.
`store=false` and missing-ID cases still fall back to manual replay.
It also updates the fake OpenAI server and integration coverage for the
chaining contract, and carries the rebased path move to `coderd/x/chatd`
plus the migration renumber needed after rebasing onto `main`.
## Summary
- use React Query's `dataUpdatedAt` as the remote diff cache
invalidation token instead of a component-local counter
- keep the `@pierre/diffs` cache key stable across remounts without a
custom hashing implementation
- preserve targeted coverage for the cache-key helper used by the
/agents remote diff viewer
## Testing
- `cd site && pnpm exec biome check
src/pages/AgentsPage/components/DiffViewer/RemoteDiffPanel.tsx
src/pages/AgentsPage/components/DiffViewer/diffCacheKey.ts
src/pages/AgentsPage/components/DiffViewer/diffCacheKey.test.ts`
- `cd site && pnpm exec vitest run
src/pages/AgentsPage/components/DiffViewer/diffCacheKey.test.ts
--project=unit`
- `cd site && pnpm exec tsc -p .`
https://github.com/user-attachments/assets/a482ef45-402a-4d86-af59-b1526b2ce3e2
## Summary
Redesigns the **Default Autostop** section on the `/agents` settings
page to clarify that it is a fallback for chat-linked workspaces whose
templates do not define their own autostop policy. Template-level
settings always take priority — this is a backstop, not an override.
## Changes
### UX
- Renamed to **Workspace Autostop Fallback** with clearer description
- Replaced always-visible duration field (confusing `0` in an hours box)
with a **toggle-to-enable** pattern matching the Virtual Desktop section
- Toggle ON auto-saves with a 1-hour default; toggle OFF auto-saves with
0
- Save button is always visible when the toggle is on but disabled until
the user changes the duration value
- Per-section disabled flags — toggling autostop no longer freezes the
Virtual Desktop switch or prompt textareas during the save round-trip
### Reliability
- `onError` rollback on toggle auto-saves so the UI snaps back to server
truth on failure
- Stateful mocks in Storybook stories to prevent race conditions from
instant mock resolution
### Accessibility
- Added `aria-label="Autostop duration"` to the DurationField input
- Updated `DurationField` component to merge external `inputProps` with
internal ones (preserves `step: 1`)
### Stories
- Updated all existing autostop stories for the new toggle-based flow
- Added `DefaultAutostopToggleOff` — tests disabling from an enabled
state
- Added `DefaultAutostopSaveDisabled` — verifies Save button is visible
but disabled when no duration change
---
PR generated with Coder Agents
## Problem
`TestConnectAll_MultipleServers` flakes with:
```
net/http: HTTP/1.x transport connection broken: http: CloseIdleConnections called
```
Each MCP client connection implicitly uses `http.DefaultTransport`. When
`httptest.Server.Close()` runs during parallel test cleanup, it calls
`CloseIdleConnections` on `http.DefaultTransport`, breaking in-flight
connections from other goroutines or parallel tests sharing that
transport.
## Fix
Clone the default transport for each MCP connection via
`http.DefaultTransport.(*http.Transport).Clone()`, passed through
`WithHTTPBasicClient` (StreamableHTTP) and `WithHTTPClient` (SSE). This
scopes idle connection cleanup to a single MCP server so it cannot
disrupt unrelated connections.
Fixescoder/internal#1420
## Problem
During OAuth2 auto-discovery for MCP servers, the callback URL
registered with the remote authorization server via Dynamic Client
Registration (RFC 7591) contained the literal string `{id}` instead of
the actual config UUID:
```
https://coder.example.com/api/experimental/mcp/servers/{id}/oauth2/callback
```
This happened because the discovery and registration occurred **before**
the database insert that generates the ID. When the user later initiated
the OAuth2 connect flow, the redirect URL used the real UUID, causing
the authorization server to reject it with:
> The provided redirect URIs are not approved for use by this
authorization server
## Fix
Restructure the auto-discovery flow in `createMCPServerConfig` to:
1. **Insert** the MCP server config first (with empty OAuth2 fields) to
get the database-generated UUID
2. **Build** the callback URL with the actual UUID
3. **Perform** OAuth2 discovery and dynamic client registration with the
correct URL
4. **Update** the record with the discovered OAuth2 credentials
5. **Clean up** the record if discovery fails
## Testing
Added regression test
`TestMCPServerConfigsOAuth2AutoDiscovery/RedirectURIContainsRealConfigID`
that:
- Stands up mock auth + MCP servers
- Captures the `redirect_uris` sent during dynamic client registration
- Asserts the URI contains the real config UUID, not `{id}`
- Verifies the full callback path structure
All existing MCP server config tests continue to pass.
Fallback to the configured model name in PR Insights when a model config
has a blank display name.
This updates both the by-model breakdown and recent PR rows, and adds a
regression test for blank display names.
replace_all in fuzzy mode (passes 2 and 3 of fuzzyReplace) only
replaced the first match. seekLines returned the first match,
spliceLines replaced one range, and there was no loop.
Extract fuzzy pass logic into fuzzyReplaceLines which:
- Returns a 3-tuple (result, matched, error) for clean caller flow
- When replaceAll is true, collects all non-overlapping matches
then applies replacements from last to first to preserve indices
- When replaceAll is false with multiple matches, returns an error
Add test cases for replace_all with fuzzy trailing whitespace and
fuzzy indent matching.
Both write_file and edit_files use atomic writes (write to temp
file, then rename). Since rename operates on directory entries, it
replaces symlinks with regular files instead of writing through
the link to the target.
Add resolveSymlink() that uses afero.Lstater/LinkReader to resolve
symlink chains (up to 10 levels) before the atomic write. Both
writeFile and editFile resolve the path before any filesystem
operations, matching the behavior of 'echo content > symlink'.
Gracefully no-ops on filesystems that don't support symlinks (e.g.
MemMapFs used in existing tests).
## Summary
Adds a user-facing MCP server configuration panel to the chat input
toolbar. Users can toggle which MCP servers provide tools for their chat
sessions, and authenticate with OAuth2 servers via popup windows.
## Changes
### New Components
- **`MCPServerPicker`** (`MCPServerPicker.tsx`): Popover-based picker
that appears in the chat input toolbar next to the model selector. Shows
all enabled MCP servers with toggles.
- **`MCPServerPicker.stories.tsx`**: 13 Storybook stories covering all
states.
### Availability Policies
Respects the admin-configured availability for each server:
- **`force_on`**: Always active, toggle disabled, lock icon shown. User
cannot disable.
- **`default_on`**: Pre-selected by default, user can opt out via
toggle.
- **`default_off`**: Not selected by default, user must opt in via
toggle.
### OAuth2 Authentication
For servers with `auth_type: "oauth2"`:
- Shows auth status (connected/not connected)
- "Connect to authenticate" link opens a popup window to
`/api/experimental/mcp/servers/{id}/oauth2/connect`
- Listens for `postMessage` with `{type: "mcp-oauth2-complete"}` from
the callback page
- Same UX pattern as external auth on the Create Workspace screen
### Integration Points
- `AgentChatInput`: MCP picker appears in the toolbar after the model
selector
- `AgentDetail`: Manages MCP selection state, initializes from
`chat.mcp_server_ids` or defaults
- `AgentDetailView` / `AgentDetailContent`: Props plumbed through to
input
- `AgentCreatePage` / `AgentCreateForm`: MCP selection for new chats
- `mcp_server_ids` now sent with `CreateChatMessageRequest` and
`CreateChatRequest`
### Helper
- `getDefaultMCPSelection()`: Computes default selection from
availability policies (`force_on` + `default_on`)
## Storybook Stories
| Story | Description |
|-------|-------------|
| NoServers | No servers - picker hidden |
| AllDisabled | All disabled servers - picker hidden |
| SingleForceOn | Force-on server with locked toggle |
| SingleDefaultOnNoAuth | Default-on with no auth required |
| SingleDefaultOff | Optional server not selected |
| OAuthNeedsAuth | OAuth2 server needing authentication |
| OAuthConnected | OAuth2 server already connected |
| MixedServers | Multiple servers with mixed availability/auth |
| AllConnected | All OAuth2 servers authenticated |
| Disabled | Picker in disabled state |
| WithDisabledServer | Disabled servers filtered out |
| AllOptedOut | All toggled off except force_on |
| OptionalOAuthNeedsAuth | Optional OAuth2 needing auth |
All map iterations in ConvertState now use sorted helpers instead of
ranging over Go maps directly. Previously only coder_env and
coder_script were sorted (via sortedResourcesByType). This extends
the pattern to coder_agent, coder_devcontainer, coder_agent_instance,
coder_app, coder_metadata, coder_external_auth, and the main
resource output list.
Also fixes generate.sh writing version.txt to the wrong directory
(resources/ instead of testdata/), which caused the Makefile version
check to silently desync and trigger unnecessary regeneration.
Adds TestConvertStateDeterministic that calls ConvertState 10 times
per fixture and asserts byte-identical JSON output without any
post-hoc sorting.
## Context
`./scripts/develop.sh` was failing to build in my dogfood workspace
with:
```
src/testHelpers/handlers.ts(346,35): error TS2345: Argument of type 'NonSharedBuffer'
is not assignable to parameter of type 'ArrayBuffer'.
Type 'Buffer<ArrayBuffer>' is missing the following properties from type 'ArrayBuffer':
maxByteLength, resizable, resize, detached, and 2 more.
```
## Alternatives considered
**`fileBuffer.buffer`** — `.buffer` gives you the underlying
`ArrayBuffer`, but Node pools small buffers into a shared 8 KB slab. A
`Buffer.from("hello")` has `byteOffset: 1472` and `.buffer.byteLength:
8192` — passing `.buffer` to a `Response` sends all 8,192 bytes instead
of 5. It happens to work for `readFileSync` (dedicated allocation,
offset 0), but breaks silently if someone refactors how the buffer is
constructed.
**`fileBuffer.buffer.slice(byteOffset, byteOffset + byteLength)`** — the
safe version of the above. Always correct, but unnecessarily complex.
**`new HttpResponse(fileBuffer)`** (chosen) — `HttpResponse` extends
`Response`, whose constructor accepts `BodyInit` which includes
`Uint8Array`. When you pass a typed array view, `Response` reads only
the bytes within that view (respecting `byteOffset`/`byteLength`), so
it's safe regardless of pooling. `Buffer` is a `Uint8Array` subclass, so
this just works:
```
pooled = Buffer.from("hello") → byteOffset: 1472, .buffer: 8192 bytes
new Response(pooled.buffer) → body: 8192 bytes ✗
new Response(pooled) → body: 5 bytes ✓
```
_Disclaimer:_ _produced_ _by_ _Claude_ _Opus_ _4\.6,_ _reviewed_ _by_ _me._
**This is a breaking change.** Users who are not have `owner` or sitewide `auditor` roles will no longer be able to view interceptions.
Regular users should not need to view this information; in fact, it could be used by a malicious insider to see what information we track and don't track to exfiltrate data or perform actions unobserved.
---
Changed authorization for AI Bridge interception-related operations from system-level permissions to resource-specific permissions. The following functions now authorize against `rbac.ResourceAibridgeInterception` instead of `rbac.ResourceSystem`:
- `ListAIBridgeTokenUsagesByInterceptionIDs`
- `ListAIBridgeToolUsagesByInterceptionIDs`
- `ListAIBridgeUserPromptsByInterceptionIDs`
Updated RBAC roles to grant AI Bridge interception permissions:
- **User/Member roles**: Can create and update AI Bridge interceptions but cannot read them back
- **Service accounts**: Same create/update permissions without read access
- **Owners/Auditors**: Retain full read access to all interceptions
Removed system-level authorization bypass in `populatedAndConvertAIBridgeInterceptions` function, allowing proper resource-level authorization checks.
Updated tests to reflect the new permission model where members cannot view AI Bridge interceptions, even their own, while owners and auditors maintain full visibility.
<!--
If you have used AI to produce some or all of this PR, please ensure you have read our [AI Contribution guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING) before submitting.
-->
_Disclaimer:_ _initially_ _produced_ _by_ _Claude_ _Opus_ _4\.6,_ _heavily_ _modified_ _and_ _reviewed_ _by_ _me._
Closes https://github.com/coder/internal/issues/1360
Adds a new `/api/v2/aibridge/sessions` API which returns "sessions".
Sessions, as defined in the [RFC](https://www.notion.so/coderhq/AI-Bridge-Sessions-Threads-2ccd579be59280f28021d3baf7472fbe?source=copy_link), are a set of interceptions logically grouped by a session key issued by the client.
The API design for this endpoint was done in [this doc](https://github.com/coder/internal/issues/1360).
If the client has not provided a session ID, we will revert to the thread root ID, and if that's not present we use the interception's own ID (i.e. a session of a single interception - which is effectively what we show currently in our `/api/v2/aibridge/interceptions` API).
The SQL query looks gnarly but it's relatively simple, and seems to perform well (~200ms) even when I import dogfood's `aibridge_*` tables into my workspace. If we need to improve performance on this later we can investigate materialized views, perhaps, but for now I don't think it's warranted.
---
_The PR looks large but it's got a lot of generated code; the actual changes aren't huge._
## Summary
- replace dynamic dayjs() date generation in LicenseCard stories with
fixed deterministic timestamps
- preserve story behavior while preventing day-over-day visual drift in
Chromatic
- use shared constants for expired and future date scenarios
## Issue context
On `dev.coder.com`, users could successfully log in, briefly see the web
UI, and then get redirected back to `/login`.
We traced the most reliable repro to viewing Tracy's workspaces on the
`/workspaces` page. That page eagerly issues authenticated per-row
requests such as:
- `POST /api/v2/authcheck`
- `GET /api/v2/workspacebuilds/:workspacebuild/parameters`
One confirmed failing request was for Tracy's workspace
`nav-scroll-fix-1f6b`:
- route: `GET
/api/v2/workspacebuilds/f2104ae6-7d53-457c-a8df-de831bee76db/parameters`
- build owner/workspace: `tracy/nav-scroll-fix-1f6b`
The failing response body was:
- message: `An internal error occurred. Please try again or contact the
system administrator.`
- detail: `Internal error fetching API key by id. fetch object: pq:
password authentication failed for user "coder"`
That showed the request was not actually unauthorized. The server hit an
internal database/authentication problem while resolving the session API
key. The underlying issue was that DB password rotation had been
enabled, it has since been disabled.
However, the logout cascade happened because:
1. `APIKeyFromRequest()` returned `ok=false` for both genuine auth
failures and internal backend failures.
2. `ValidateAPIKey()` wrapped every `!ok` result as `401 Unauthorized`.
3. `RequireAuth.tsx` signs the user out on any `401` response.
So a transient backend/database failure was being misreported as an auth
failure, which made the client forcibly log the user out.
A useful extra clue was that the installed PWA did not repro. The PWA
starts on `/agents`, which avoids the `/workspaces` request fan-out.
That helped narrow the problem to the eager authenticated requests on
the workspace list rather than to cookies or the login flow itself.
## What changed
This PR now fixes the bug without changing the exported
`APIKeyFromRequest()` surface:
- `ValidateAPIKey()` now uses a new internal helper that returns a typed
`ValidateAPIKeyError`
- the exported `APIKeyFromRequest()` helper remains compatible for
existing callers like `userauth.go`
- internal API-key lookup failures are classified as `500 Internal
Server Error` plus `Hard: true`
- internal `UserRBACSubject()` failures now return `500 Internal Server
Error` instead of `401 Unauthorized`
- a focused regression test verifies that an internal `GetAPIKeyByID`
failure surfaces as `500`
This removes the brittle message-based classification and makes the
internal-auth-failure path robust for all API-key lookup failures
handled by auth middleware.
## What
Adds per-user per-model auto-compaction threshold overrides. Users can
now customize the percentage of context window usage that triggers chat
compaction, independently for each enabled model.
## Why
The compaction threshold was previously only configurable at the
deployment level (`chat_model_configs.compression_threshold`). Different
users have different preferences — some want aggressive compaction to
keep costs low, others prefer higher thresholds to retain more context.
This gives users control without requiring admin intervention.
## Architecture
**Storage:** Reuses the existing `user_configs` table (no migration
needed). Overrides are stored as key/value pairs with keys shaped
`chat_compaction_threshold:<modelConfigID>` and integer percent values.
**API:** Three new experimental endpoints under
`/api/experimental/chats/config/`:
- `GET /user-compaction-thresholds` — list all overrides for the current
user
- `PUT /user-compaction-thresholds/{modelConfig}` — upsert an override
(validates model exists and is enabled, validates 0–100 range)
- `DELETE /user-compaction-thresholds/{modelConfig}` — clear an override
(idempotent)
**Runtime resolution:** In `coderd/chatd/chatd.go`, a new
`resolveUserCompactionThreshold()` helper runs at the start of each chat
turn (inside `runChat()`), after the model config is resolved but before
`CompactionOptions` is built. If a valid override exists, it replaces
`modelConfig.CompressionThreshold`. The threshold source
(`user_override` vs `model_default`) is logged with each compaction
event.
**Precedence:** `effectiveThreshold = userOverride ??
modelConfig.CompressionThreshold`
**UI:** New "Context Compaction" subsection in the Agents → Settings →
Behavior tab, placed after Personal Instructions. Shows one row per
enabled model with the system default, a number input for the override,
and Save/Reset controls.
## Testing
- 9 API subtests covering CRUD, validation (boundary values 0/100,
out-of-range rejection), upsert behavior, idempotent delete, user
isolation, and non-existent model config
- 4 dbauthz tests (16 scenarios) verifying `ActionReadPersonal` /
`ActionUpdatePersonal` on all query methods
- 4 Storybook stories with play functions (Default, WithOverrides,
Loading, Error)
<details>
<summary>Implementation plan</summary>
### Phase 1 — Tests
- Backend API tests in `coderd/chats_test.go` (9 subtests)
- Database auth wrapper tests in
`coderd/database/dbauthz/dbauthz_test.go` (4 methods)
- Frontend stories in `UserCompactionThresholdSettings.stories.tsx` (4
stories)
### Phase 2 — Backend preference surface
- 4 SQL queries in `coderd/database/queries/users.sql` (list, get,
upsert, delete)
- `make gen` to propagate into generated artifacts
- Auth/metrics wrappers in dbauthz and dbmetrics
- SDK types and client methods in `codersdk/chats.go`
- HTTP handlers and routes in `coderd/chats.go` and `coderd/coderd.go`
- Key prefix constant shared between handlers and runtime
### Phase 3 — Runtime override
- `resolveUserCompactionThreshold()` helper in `coderd/chatd/chatd.go`
- Override injection in `runChat()` before building `CompactionOptions`
- `threshold_source` field added to compaction log
### Phase 4 — Settings UI
- API client methods and React Query hooks in `site/src/api/`
- `UserCompactionThresholdSettings` component extracted from
`SettingsPageContent`
- Per-model mutation tracking (only the active row disables during save)
- 100% warning, "System default" label, helpful empty state copy
### Phase 5 — Refactor and review fixes
- Consolidated key prefix constant in `codersdk`
- Explicit PUT range validation (not just struct tags)
- GET handler gracefully skips malformed rows instead of 500
- Boundary value, upsert, and non-existent model config tests
- UX improvements: per-model mutation state, aria-live on errors
</details>
## Problem
When adding an external MCP server with `auth_type=oauth2`, admins
currently must manually provide:
- `oauth2_client_id`
- `oauth2_client_secret`
- `oauth2_auth_url`
- `oauth2_token_url`
This requires the admin to manually register an OAuth2 client with the
external MCP server's authorization server first — a friction-heavy
process that contradicts the MCP spec's vision of plug-and-play
discovery.
## Solution
When an admin creates an MCP server config with `auth_type=oauth2` and
omits the OAuth2 fields, Coder now automatically discovers and registers
credentials following the MCP authorization spec:
1. **Protected Resource Metadata (RFC 9728)** — Fetches
`/.well-known/oauth-protected-resource` from the MCP server to discover
its authorization server. Falls back to probing the server URL for a
`WWW-Authenticate` header with a `resource_metadata` parameter.
2. **Authorization Server Metadata (RFC 8414)** — Fetches
`/.well-known/oauth-authorization-server` from the discovered auth
server to find all endpoints.
3. **Dynamic Client Registration (RFC 7591)** — Registers Coder as an
OAuth2 client at the auth server's registration endpoint, obtaining a
`client_id` and `client_secret` automatically.
The discovered/generated credentials are stored in the MCP server
config, and the existing per-user OAuth2 connect flow works unchanged.
### Backward compatibility
- **Manual config still works**: If all three fields
(`oauth2_client_id`, `oauth2_auth_url`, `oauth2_token_url`) are
provided, the existing behavior is unchanged.
- **Partial config is rejected**: Providing some but not all fields
returns a clear error explaining the two options.
- **Discovery failure is clear**: If auto-discovery fails, the error
message explains what went wrong and suggests manual configuration.
## Changes
- **New package `coderd/mcpauth`** — Self-contained discovery and DCR
logic with no `codersdk` dependency
- **Modified `coderd/mcp.go`** — `createMCPServerConfig` handler now
attempts auto-discovery when OAuth2 fields are omitted
- **Tests** — Unit tests for discovery (happy path, WWW-Authenticate
fallback, no registration endpoint, registration failure) and
`parseResourceMetadataParam` helper
The slices package provides type-safe generic replacements for the
old typed sort convenience functions. The codebase already uses
slices.Sort in 43 call sites; this finishes the migration for the
remaining 29.
- sort.Strings(x) -> slices.Sort(x)
- sort.Float64s(x) -> slices.Sort(x)
- sort.StringsAreSorted(x) -> slices.IsSorted(x)
Consolidates invocations of `coderdtest.New` to a single shared instance per
parent for the following tests:
- `TestOAuth2ClientMetadataValidation`
- `TestOAuth2ClientNameValidation`
- `TestOAuth2ClientScopeValidation`
- `TestOAuth2ClientMetadataEdgeCases`
> 🤖 This PR was created with the help of Coder Agents, and was
reviewed by my human. 🧑💻
The test-storybook target uses @vitest/browser-playwright with
Chromium but never installs the browser binaries. pnpm install
only fetches the npm package; the actual browser must be
downloaded separately via playwright install. This mirrors what
test-e2e already does.
2026-03-23 20:57:03 +00:00
1946 changed files with 136303 additions and 38476 deletions
description: "Multi-reviewer code review. Spawns domain-specific reviewers in parallel, cross-checks findings, posts a single structured GitHub review."
---
# Deep Review
Multi-reviewer code review. Spawns domain-specific reviewers in parallel, cross-checks their findings for contradictions and convergence, then posts a single structured GitHub review with inline comments.
- When you want independent perspectives cross-checked against each other, not just a single-pass review.
Use `.claude/skills/code-review/` for focused single-domain changes or quick single-pass reviews.
**Prerequisite:** This skill requires the ability to spawn parallel subagents. If your agent runtime cannot spawn subagents, use code-review instead.
**Severity scales:** Deep-review uses P0–P4 (consequence-based). Code-review uses 🔴🟡🔵. Both are valid; they serve different review depths. Approximate mapping: P0–P1 ≈ 🔴, P2 ≈ 🟡, P3–P4 ≈ 🔵.
## When NOT to use this skill
- Docs-only or config-only PRs (no code to structurally review). Use `.claude/skills/doc-check/` instead.
- Single-file changes under ~50 lines.
- The PR author asked for a quick review.
## 0. Proportionality check
Estimate scope before committing to a deep review. If the PR has fewer than 3 files and fewer than 100 lines changed, suggest code-review instead. If the PR is docs-only, suggest doc-check. Proceed only if the change warrants multi-reviewer analysis.
## 1. Scope the change
**Author independence.** Review with the same rigor regardless of who authored the PR. Don't soften findings because the author is the person who invoked this review, a maintainer, or a senior contributor. Don't harden findings because the author is a new contributor. The review's value comes from honest, consistent assessment.
Create the review output directory before anything else:
```sh
exportREVIEW_DIR="/tmp/deep-review/$(date +%s)"
mkdir -p "$REVIEW_DIR"
```
**Re-review detection.** Check if you or a previous agent session already reviewed this PR:
If a prior agent review exists, you must produce a prior-findings classification table before proceeding. This is not optional — the table is an input to step 3 (reviewer prompts). Without it, reviewers will re-discover resolved findings.
1. Read every author response since the last review (inline replies, PR comments, commit messages).
2. Diff the branch to see what changed since the last review.
3. Engage with any author questions before re-raising findings.
4. Write `$REVIEW_DIR/prior-findings.md` with this format:
- **Resolved**: author pushed a code fix. Verify the fix addresses the finding's specific concern — not just that code changed in the relevant area. Check that the fix doesn't introduce new issues.
- **Acknowledged**: author agreed but deferred.
- **Contested**: author disagreed or raised a constraint. Write their argument in the table.
- **No response**: author didn't address it.
Only **Contested** and **No response** findings carry forward to the new review. Resolved and Acknowledged findings must not be re-raised.
**Scope the diff.** Get the file list from the diff, PR, or user. Skim for intent and note which layers are touched (frontend, backend, database, auth, concurrency, tests, docs).
For each changed file, briefly check the surrounding context:
- Config files (package.json, tsconfig, vite.config, etc.): scan the existing entries for naming conventions and structural patterns.
- New files: check if an existing file could have been extended instead.
- Comments in the diff: do they explain why, or just restate what the code does?
## 2. Pick reviewers
Match reviewer roles to layers touched. The Test Auditor, Edge Case Analyst, and Contract Auditor always run. Conditional reviewers activate when their domain is touched.
- **Modernization Reviewer**: one instance per language present in the diff. Filter by extension:
- Go: `*.go` — reference `.claude/docs/GO.md` before reviewing.
- TypeScript: `*.ts``*.tsx`: reference `.agents/skills/deep-review/references/typescript.md` before reviewing.
- React: `*.tsx``*.jsx`: reference `.agents/skills/deep-review/references/react.md` before reviewing.
`.tsx` files match both TypeScript and React filters. Spawn both instances when the diff contains `.tsx` changes — TS covers language-level patterns; React covers component and hooks patterns. Before spawning, verify each instance's filter produces a non-empty diff. Skip instances whose filtered diff is empty.
Each reviewer writes findings to `$REVIEW_DIR/{role-name}.md` where `{role-name}` is the kebab-cased role name (e.g. `test-auditor`, `go-architect`). For Modernization Reviewer instances, qualify with the language: `modernization-reviewer-go.md`, `modernization-reviewer-ts.md`, `modernization-reviewer-react.md`. The orchestrator does not read reviewer findings from the subagent return text — it reads the files in step 4.
Spawn all Tier 1 and Tier 2 reviewers in parallel. Give each reviewer a reference (PR number, branch name), not the diff content. The reviewer fetches the diff itself. Reviewers are read-only — no worktrees needed.
**Tier 1 prompt:**
```text
Read `AGENTS.md` in this repository before starting.
You are the {Role Name} reviewer. Read your methodology in
For Modernization Reviewer instances, add the language reference after the methodology line:
- **Go:** `Read .claude/docs/GO.md as your Go language reference before reviewing.`
- **TypeScript:** `Read .agents/skills/deep-review/references/typescript.md as your TypeScript language reference before reviewing.`
- **React:** `Read .agents/skills/deep-review/references/react.md as your React language reference before reviewing.`
For re-reviews, append to both Tier 1 and Tier 2 prompts:
> Prior findings and author responses are in {REVIEW_DIR}/prior-findings.md. Read it before reviewing. Do not re-raise Resolved or Acknowledged findings.
## 4. Cross-check findings
### 4a. Read findings from files
Read each reviewer's output file from `$REVIEW_DIR/` one at a time. One file per read — do not batch multiple reviewer files in parallel. Batching causes reviewer voices to blend in the context window, leading to misattribution (grabbing phrasing from one reviewer and attributing it to another).
For each file:
1. Read the file.
2. List each finding with its severity, location, and one-line summary.
3. Note the reviewer's exact evidence line for each finding.
If a file says "No findings," record that and move on. If a file is missing (reviewer crashed or timed out), note the gap and proceed — do not stall or silently drop the reviewer's perspective.
After reading all files, you have a finding inventory. Proceed to cross-check.
### 4b. Cross-check
Handle Tier 1 and Tier 2 findings separately before merging.
**Tier 2 nit findings:** Apply a lighter filter. Drop nits that are purely subjective, that duplicate what a linter already enforces, or that the author clearly made intentionally. Keep nits that have a practical benefit (clearer name, better error message, obsolete stdlib usage). Surviving nits stay as Nit.
**Tier 1 structural findings:** Before producing the final review, look across all findings for:
- **Contradictions.** Two reviewers recommending opposite approaches. Flag both and note the conflict.
- **Interactions.** One finding that solves or worsens another (e.g. a refactor suggestion that addresses a separate cleanup concern). Link them.
- **Convergence.** Two or more reviewers flagging the same function or component from different angles. Don't just merge at max(severity) and don't treat convergence as headcount ("more reviewers = higher confidence in the same thing"). After listing the convergent findings, trace the consequence chain _across_ them. One reviewer flags a resource leak, another flags an unbounded hang, a third flags infinite retries on reconnect — the combination means a single failure leaves a permanent resource drain with no recovery. That combined consequence may deserve its own finding at higher severity than any individual one.
- **Async findings.** When a finding mentions setState after unmount, unused cancellation signals, or missing error handling near an await: (1) find the setState or callback, (2) trace what renders or fires as a result, (3) ask "if this fires after the user navigated away, what do they see?" If the answer is "nothing" (a ref update, a console.log), it's P3. If the answer is "a dialog opens" or "state corrupts," upgrade. The severity depends on what's at the END of the async chain, not the start.
- **Mechanism vs. consequence.** Reviewers describe findings using mechanism vocabulary ("unused parameter", "duplicated code", "test passes by coincidence"), not consequence vocabulary ("dialog opens in wrong view", "attacker can bypass check", "removing this code has no test to catch it"). The Contract Auditor and Structural Analyst tend to frame findings by consequence already — use their framing directly. For mechanism-framed findings from other reviewers, restate the consequence before accepting the severity. Consequences include UX bugs, security gaps, data corruption, and silent regressions — not just things users see on screen.
- **Weak evidence.** Findings that assert a problem without demonstrating it. Downgrade or drop.
- **Unnecessary novelty.** New files, new naming patterns, new abstractions where the existing codebase already has a convention. If no reviewer flagged it but you see it, add it. If a reviewer flagged it as an observation, evaluate whether it should be a finding.
- **Scope creep.** Suggestions that go beyond reviewing what changed into redesigning what exists. Downgrade to P4.
- **Structural alternatives.** One reviewer proposes a design that eliminates a documented tradeoff, while others have zero findings because the current approach "works." Don't discount this as an outlier or scope creep. A structural alternative that removes the need for a tradeoff can be the highest-value output of the review. Preserve it at its original severity — the author decides whether to adopt it, but they need enough signal to evaluate it.
- **Pre-existing behavior.** "Pre-existing" doesn't erase severity. Check whether the PR introduced new code (comments, branches, error messages) that describes or depends on the pre-existing behavior incorrectly. The new code is in scope even when the underlying behavior isn't.
For each finding **and observation**, apply the severity test in **both directions**. Observations are not exempt — a reviewer may underrate a convention violation or a missing guarantee as Obs when the consequence warrants P3+:
- Downgrade: "Is this actually less severe than stated?"
- Upgrade: "Could this be worse than stated?"
When the severity spread among reviewers exceeds one level, note it explicitly. Only credit reviewers at or above the posted severity. A finding that survived 2+ independent reviewers needs an explicit counter-argument to drop. "Low risk" is not a counter when the reviewers already addressed it in their evidence.
Before forwarding a nit, form an independent opinion on whether it improves the code. Before rejecting a nit, verify you can prove it wrong, not just argue it's debatable.
Drop findings that don't survive this check. Adjust severity where the cross-check changes the picture.
After filtering both tiers, check for overlap: a nit that points at the same line as a Tier 1 finding can be folded into that comment rather than posted separately.
### 4c. Quoting discipline
When a finding survives cross-check, the reviewer's technical evidence is the source of record. Do not paraphrase it.
**Convergent findings — sharpest first.** When multiple reviewers flag the same issue:
1. Rank the converging findings by evidence quality.
2. Start from the sharpest individual finding as the base text.
3. Layer in only what other reviewers contributed that the base didn't cover (a concrete detail, a preemptive counter, a stronger framing).
4. Attribute to the 2–3 reviewers with the strongest evidence, not all N who noticed the same thing.
**Single-reviewer findings.** Go back to the reviewer's file and copy the evidence verbatim. The orchestrator owns framing, severity assessment, and practical judgment — those are your words. The technical claim and code-level evidence are the reviewer's words.
A posted finding has two voices:
- **Reviewer voice** (quoted): the specific technical observation and code evidence exactly as the reviewer wrote it.
- **Orchestrator voice** (original): severity framing, practical judgment ("worth fixing now because..."), scenario building, and conversational tone.
If you need to adjust a finding's scope (e.g. the reviewer said "file.go:42" but the real issue is broader), say so explicitly rather than silently rewriting the evidence.
**Attribution must show severity spread.** When reviewers disagree on severity, the attribution should reflect that — not flatten everyone to the posted severity. Show each reviewer's individual severity: `*(Security Reviewer P1, Concurrency Reviewer P1, Test Auditor P2)*` not `*(Security Reviewer, Concurrency Reviewer, Test Auditor)*`.
**Integrity check.** Before posting, verify that quoted evidence in findings actually corresponds to content in the diff. This guards against garbled cross-references from the file-reading step.
## 5. Post the review
When reviewing a GitHub PR, post findings as a proper GitHub review with inline comments, not a single comment dump.
**Review body.** Open with a short, friendly summary: what the change does well, what the overall impression is, and how many findings follow. Call out good work when you see it. A review that only lists problems teaches authors to dread your comments.
```text
Clean approach to X. The Y handling is particularly well done.
A couple things to look at: 1 P2, 1 P3, 3 nits across 5 inline
comments.
```
For re-reviews (round 2+), open with what was addressed:
```text
Thanks for fixing the wire-format break and the naming issue.
Fresh review found one new issue: 1 P2 across 1 inline comment.
```
Keep the review body to 2–4 sentences. Don't use markdown headers in the body — they render oversized in GitHub's review UI.
**Inline comments.** Every finding is an inline comment, pinned to the most relevant file and line. For findings that span multiple files, pin to the primary file (GitHub supports file-level comments when `position` is omitted or set to 1).
Inline comment format:
```text
**P{n}** One-sentence finding *(Reviewer Role)*
> Reviewer's evidence quoted verbatim from their file
Orchestrator's practical judgment: is this worth fixing now, or
is the current tradeoff acceptable? Scenario building, severity
reasoning, fix suggestions — these are your words.
```
For convergent findings (multiple reviewers, same issue):
> *Contract Auditor adds:* Additional detail from their file
Orchestrator's practical judgment.
```
For observations: `**Obs** One-sentence observation *(Role)* ...` For nits: `**Nit** One-sentence finding *(Role)* ...`
P3 findings and observations can be one-liners. Group multiple nits on the same file into one comment when they're co-located.
**Review event.** Always use `COMMENT`. Never use `REQUEST_CHANGES` — this isn't the norm in this repository. Never use `APPROVE` — approval is a human responsibility.
For P0 or P1 findings, add a note in the review body: "This review contains findings that may need attention before merge."
**Posting via GitHub API.**
The `gh api` endpoint for posting reviews routes through GraphQL by default. Field names differ from the REST API docs:
- Use `position` (diff-relative line number), not `line` + `side`. `side` is not a valid field in the GraphQL schema.
-`subject_type: "file"` is not recognized. Pin file-level comments to `position: 1` instead.
- Use `-X POST` with `--input` to force REST API routing.
To compute positions: save the PR diff to a file, then count lines from the first `@@` hunk header of each file's diff section. For new files, position = line number + 1 (the hunk header is position 1, first content line is position 2).
```sh
gh pr diff {number} > /tmp/pr.diff
```
Submit:
```sh
gh api -X POST \
repos/{owner}/{repo}/pulls/{number}/reviews \
--input review.json
```
Where `review.json`:
```json
{
"event":"COMMENT",
"body":"Summary of what's good and what to look at.\n1 P2, 1 P3 across 2 inline comments.",
**Tone guidance.** Frame design concerns as questions: "Could we use X instead?" — be direct only for correctness issues. Hedge design, not bugs. Build concrete scenarios to make concerns tangible. When uncertain, say so. See `.claude/docs/PR_STYLE_GUIDE.md` for PR conventions.
## Follow-up
After posting the review, monitor the PR for author responses. If the author pushes fixes or responds to findings, consider running a re-review (this skill, starting from step 1 with the re-review detection path). Allow time for the author to address multiple findings before re-reviewing — don't trigger on each individual response.
If the filtered diff is empty, say so in one line and stop.
You are a nit reviewer. Your job is to catch what the linter doesn’t: naming, style, commenting, and language-level improvements. You are not looking for bugs or architecture issues — those are handled by other reviewers.
Write all findings to the output file specified in your prompt. Create the directory if it doesn’t exist. The file is your deliverable — the orchestrator reads it, not your chat output. Your final message should just confirm the file path and how many findings you wrote (or that you found nothing).
Use this structure in the file:
---
**Nit**`file.go:42` — One-sentence finding.
Why it matters: brief explanation. If there’s an obvious fix, mention it.
---
Rules:
- Use **Nit** for all findings. Don’t use P0-P4 severity; that scale is for structural reviewers.
- Findings MUST reference specific lines or names. Vague style observations aren’t findings.
- Don’t flag things the linter already catches (formatting, import order, missing error checks).
- Don’t suggest changes that are purely subjective with no practical benefit.
- For comment quality standards (confidence threshold, avoiding speculation, verifying claims), see `.claude/skills/code-review/SKILL.md` Comment Standards section.
- If you find nothing, write a single line to the output file: "No findings."
# Modern React (18–19.2) + Compiler 1.0 — Reference
Reference for writing idiomatic React. Covers what changed, what it replaced, and what to reach for. Includes React Compiler patterns — what the compiler handles automatically, what it changes semantically, and how to verify its behavior empirically. Scope: client-side SPA patterns only. Server Components, `use server`, and `use client` directives are framework-specific and omitted. Check the project's React version and compiler config before reaching for newer APIs.
## How modern React thinks differently
**Concurrent rendering** (18): React can now pause, interrupt, and resume renders. This is the foundation everything else builds on. Most existing code "just works," but components that produce side effects during render (mutations, subscriptions, network calls in the render body) are unsafe and will misbehave. Concurrent features are opt-in — they only activate when you use a concurrent API like `startTransition` or `useDeferredValue`.
**Urgent vs. non-urgent updates** (18): The `startTransition` / `useTransition` API introduces a formal split between updates that must feel immediate (typing, clicking) and updates that can be interrupted (filtering a large list, navigating to a new screen). Non-urgent updates yield to urgent ones mid-render. Use this instead of `setTimeout` or manual debounce when you want the UI to stay responsive during expensive re-renders.
**Actions** (19): Async functions passed to `startTransition` are called "Actions." They automatically manage pending state, error handling, and optimistic updates as a unit. The `useActionState` hook and `<form action={fn}>` prop are built on this. The pattern replaces the hand-rolled `isPending/setIsPending` + `try/catch` + `setError` boilerplate that was previously necessary for every data mutation.
**Automatic batching** (18): State updates are now batched everywhere — inside `setTimeout`, `Promise.then`, native event handlers, etc. Previously batching only happened inside React-managed event handlers. If you genuinely need a synchronous flush, use `flushSync`.
**Automatic memoization** (Compiler 1.0): React Compiler is a build-time Babel plugin that automatically inserts memoization into components and hooks. It replaces manual `useMemo`, `useCallback`, and `React.memo` — including conditional memoization and memoization after early returns, which manual APIs cannot express. The compiler only processes components and hooks, not standalone functions. It understands data flow and mutability through its own HIR (High-level Intermediate Representation), so it can memoize more granularly than a human would. Projects adopt it incrementally — typically via path-based Babel overrides or the `"use memo"` directive. Components that violate the Rules of React are silently skipped (no build error), so the automated lint tools that check compiler compatibility matter.
## Replace these patterns
The left column reflects patterns common before React 18/19. Write the right column instead. The "Since" column tells you the minimum React version required.
| `useTransition()` / `startTransition()` | 18 | Mark a state update as non-urgent so React can interrupt it to handle clicks or keystrokes. The `isPending` boolean lets you show a loading indicator without blocking the UI. |
| `useDeferredValue(value, initialValue?)` | 18 / 19 | Defer re-rendering a slow subtree: pass the deferred value as a prop, wrap the expensive child in `memo`. Unlike debounce, uses no fixed timeout — renders as soon as the browser is idle. The `initialValue` arg (19) avoids a flash on first render. |
| `useId()` | 18 | Generate a stable, SSR-consistent ID for accessibility attributes (`htmlFor`, `aria-describedby`). Do not use for list keys. |
| `useSyncExternalStore(subscribe, getSnapshot, getServerSnapshot?)` | 18 | Subscribe to external (non-React) state stores safely under concurrent rendering. Preferred over `useEffect`-based subscriptions in libraries. |
| `useActionState(action, initialState)` | 19 | Manage an async mutation: returns `[state, wrappedAction, isPending]`. Handles pending, result, and error state as a unit. Replaces the manual `isPending` + `try/catch` + `setError` pattern. |
| `useOptimistic(currentValue)` | 19 | Show a speculative value while an async Action is in flight. Returns `[optimisticValue, setOptimistic]`. React automatically reverts to `currentValue` when the transition settles. |
| `use(promiseOrContext)` | 19 | Read a promise or Context value inside a component or custom hook. Unlike hooks, `use` can be called conditionally (after early returns). Promises must come from a cache — do not create them during render. |
| `useFormStatus()` (from `react-dom`) | 19 | Read `{ pending, data, method, action }` of the nearest parent `<form>` Action. Works across component boundaries without prop drilling — useful for submit buttons inside design-system components. |
| `useEffectEvent(fn)` | 19.2 | Extract a non-reactive callback from an effect. The function sees the latest props/state without being listed in deps, and is never stale. Replaces the `useRef`-and-mutate-in-layout-effect workaround for stable event-like callbacks. The compiler has built-in knowledge of this hook and correctly prunes its return value from effect dependency arrays. Both `useEffectEvent` and the old ref workaround compile cleanly; `useEffectEvent` is preferred for clarity. |
| `<Activity>` | 19.2 | Hide part of the UI while preserving its state and DOM. React deprioritizes updates to hidden content. Use via framework APIs for route prerendering or tab preservation — not a direct replacement for CSS `visibility`. |
| `captureOwnerStack()` | 19.1 | Dev-only API that returns a string showing which components are responsible for rendering the current component (owner stack, not call stack). Useful for custom error overlays. Returns `null` in production. |
| `<form action={fn}>` | 19 | Pass an async function as a form's `action` prop. React handles submission, pending state, and automatic form reset on success. Works with `useActionState` and `useFormStatus`. |
| Ref cleanup function | 19 | Return a cleanup function from a ref callback: `ref={el => { ...; return () => cleanup(); }}`. React calls it on unmount. Replaces the pattern of checking `el === null` in the callback. |
| `<link rel="stylesheet" precedence="default">` | 19 | Declare a stylesheet next to the component that needs it. React deduplicates and inserts it in the correct order before revealing Suspense content. |
| `preinit`, `preload`, `prefetchDNS`, `preconnect` (from `react-dom`) | 19 | Imperatively hint the browser to load resources early. Call from render or event handlers. React deduplicates hints across the component tree. |
| React Compiler (`babel-plugin-react-compiler`) | C 1.0 | Build-time automatic memoization for components and hooks. Install, add to Babel/Vite pipeline. Projects typically start with path-based overrides to compile a subset of files. |
| `"use memo"` directive | C 1.0 | Opt a single function into compilation when using `compilationMode: 'annotation'`. Place at the start of the function body. Module-level `"use memo"` at the top of a file compiles all functions in that file. |
| `"use no memo"` directive | C 1.0 | Temporary escape hatch — skip compilation for a specific component or hook that causes a runtime regression. Not a permanent solution. Place at the start of the function body. |
| Compiler-powered ESLint rules | C 1.0 | Rules for purity, refs, set-state-in-render, immutability, etc. now ship in `eslint-plugin-react-hooks` recommended preset. Surface Rules-of-React violations even without the compiler installed. Note: some projects use Biome instead — check project lint config. |
## Key APIs
### `useTransition` and `startTransition` (18)
`useTransition` returns `[isPending, startTransition]`. Wrap any state update that is not directly tied to the user's current gesture inside `startTransition`. React will render the old UI while computing the new one, and `isPending` is `true` during that window.
In React 19, `startTransition` can accept an async function (an "Action"). React sets `isPending` to `true` for the entire duration of the async work, not just during the synchronous part.
```tsx
// 18: synchronous transition
const[isPending,startTransition]=useTransition();
startTransition(()=>setQuery(input));
// 19: async Action — isPending stays true until the await settles
startTransition(async()=>{
consterr=awaitupdateName(name);
if(err)setError(err);
});
```
Use `startTransition` (the module-level export) when you cannot use the hook (outside a component, in a router callback, etc.).
### `useDeferredValue` (18 / 19)
Creates a "lagging" copy of a value. Pass it to a memoized, expensive component so that React can render the stale UI while computing the updated one.
```tsx
// 19: initialValue shows '' on first render; avoids loading flash
constdeferred=useDeferredValue(searchQuery,"");
return<Resultsquery={deferred}/>;// Results wrapped in memo
```
`deferred !== searchQuery` while the deferred render is in progress — use this to show a "stale" indicator.
### `useActionState` (19)
Replaces the `useState` + `isPending` + `try/catch` + `setError` boilerplate for any async operation that can be retried or submitted as a form.
if(err)returnerr;// returned value becomes next state
redirect("/profile");
returnnull;
},
null,// initialState
);
// Use submitAction as the form's action prop or call it directly
<formaction={submitAction}>
<inputname="name"/>
<buttondisabled={isPending}>Save</button>
{error&&<p>{error}</p>}
</form>;
```
### `useOptimistic` (19)
Shows a speculative value immediately while an async Action is in progress. React automatically reverts to the server-confirmed value when the Action resolves or rejects.
The correct way for libraries (and app code) to subscribe to non-React state. Prevents tearing under concurrent rendering.
```tsx
constvalue=useSyncExternalStore(
store.subscribe,// called when store changes
store.getSnapshot,// returns current value (must be stable reference if unchanged)
store.getServerSnapshot,// optional: for SSR
);
```
## Verifying compiler behavior
The compiler is a black box unless you inspect its output. When reviewing code in compiled paths, run the compiler on the specific code to see what it actually does. Do not guess — verify.
-`if ($[n] === Symbol.for("react.memo_cache_sentinel"))` — one-time initialization. Runs once on first render, cached forever after. This is how the compiler handles expressions with no reactive dependencies.
-`_temp` functions — pure callbacks the compiler hoisted out of the component body.
**Check all compiled files at once:**
```sh
cd site && pnpm run lint:compiler
```
This runs the compiler on every file in the compiled paths and reports CompileError / CompileSkip diagnostics. Zero diagnostics means all functions compiled cleanly.
**What the compiler catches vs. what it does not:**
The compiler emits `CompileError` for mutations of props, state, or hook arguments during render, and for `ref.current` access during render. The project's lint pipeline catches these automatically — do not flag them in review.
The compiler does **not** flag impure function calls during render (`Math.random()`, `Date.now()`, `new Date()`). Instead it silently memoizes them with a sentinel guard, freezing the value after first render. This changes semantics without any diagnostic. Verify suspicious calls by running the compiler and checking for sentinel guards in the output.
## Pitfalls
Things that are easy to get wrong even when you know the modern API exists. Check your output against these.
**Effects run twice in development with StrictMode.** React 18 intentionally mounts → unmounts → remounts every component in dev to surface effects that are not resilient to remounting. This is not a bug. If an effect breaks on the second mount, it is missing a cleanup function. Write `return () => cleanup()` from every effect that sets up a subscription, timer, or external resource.
**Concurrent rendering can call render multiple times.** The render function (component body) may be called more than once before React commits to the DOM. Side effects (mutations, subscriptions, logging) in the render body will run multiple times. Move them into `useEffect` or event handlers.
**Do not create promises during render and pass them to `use()`.** A new promise is created every render, causing an infinite suspend-retry loop. Create the promise outside the component (module level), or use a caching library (SWR, React Query, `cache()` from React) to stabilize it.
**`useOptimistic` reverts automatically — do not fight it.** The optimistic value is a presentation layer only. When the Action settles, React replaces it with the real `currentValue` you passed in. Do not try to sync optimistic state back to your real state; let React handle the revert.
**`flushSync` opts out of automatic batching.** If third-party code or a browser API (e.g. `ResizeObserver`) calls `setState` and you need synchronous DOM flushing, wrap with `flushSync(() => setState(...))`. This is a last resort; prefer letting React batch.
**`forwardRef` still works in React 19 but will be deprecated.** Function components accept `ref` as a plain prop now. New code should use the prop directly. Existing `forwardRef` wrappers continue to work without changes; migrate when convenient.
**`<Activity>` does not unmount.** Content inside a hidden `<Activity>` boundary stays mounted. Effects keep running. Use it for preserving scroll position or form state, not for preventing expensive mounts — use lazy loading for that.
**TypeScript: implicit returns from ref callbacks are now type errors.** In React 19, returning anything other than a cleanup function (or nothing) from a ref callback is rejected by the TypeScript types. The most common case is arrow-function refs that implicitly return the DOM node:
```tsx
// Error in React 19 types:
<divref={el=>(instance=el)}/>
// Fix — use a block body:
<divref={el=>{instance=el;}}/>
```
**TypeScript: `useRef` now requires an argument.**`useRef<T>()` with no argument is a type error. Pass `undefined` for mutable refs or `null` for DOM refs you initialize on mount: `useRef<T>(undefined)` / `useRef<HTMLDivElement | null>(null)`.
**`useId` output format changed across versions.** React 18 produced `:r0:`. React 19.1 changed it to `«r0»`. React 19.2 changed it again to `_r0`. Do not parse or depend on the specific format — treat it as an opaque string.
**`useFormStatus` reads the nearest parent `<form>` with a function `action`.** It does not reflect native HTML form submissions — only React Actions. A submit button that is a sibling of `<form>` (rather than a descendant) will not see the form's status.
**Context as a provider (`<Context>`) requires React 19; `<Context.Provider>` still works.** Do not use `<Context>` shorthand in a codebase that needs to support React 18. The two forms can coexist during migration.
**Compiler freezes impure expressions silently.**`Math.random()`, `Date.now()`, `new Date()`, and `window.innerWidth` in a component body all compile without diagnostics. The compiler wraps them in a sentinel guard (`Symbol.for("react.memo_cache_sentinel")`) that runs the expression once and caches the result forever. The value never updates on re-render. Fix: move to a `useState` initializer (`useState(() => Math.random())`), `useEffect`, or event handler.
**Component granularity affects compiler optimization.** When one pattern in a component causes a `CompileError` (e.g., a necessary `ref.current` read during render), the compiler skips the **entire** component. If the rest of the component would benefit from compilation, extract the non-compilable pattern into a small child component. This keeps the parent compiled.
**The compiler only memoizes components and hooks.** Standalone utility functions (even expensive ones called during render) are not compiled. If a utility function is truly expensive, it still needs its own caching strategy outside of React (e.g., a module-level cache, `WeakMap`, etc.).
**Changing memoization can shift `useEffect` firing.** A value that was unstable before compilation may become stable after, causing an effect that depended on it to fire less often. Conversely, future compiler changes may alter memoization granularity. Effects that use memoized values as dependencies should be resilient to these changes — they should be true synchronization effects, not "run this when X changes" hacks.
## Behavioral changes that affect code
- **Automatic batching** (18): State updates in `setTimeout`, `Promise.then`, `addEventListener` callbacks, etc. are now batched into a single re-render. Previously only React synthetic event handlers were batched. Code that relied on unbatched updates (reading DOM synchronously after each `setState`) must use `flushSync`.
- **StrictMode double-invoke** (18): In development, every component is mounted → unmounted → remounted with the previous state. Every effect runs cleanup → setup twice on initial mount. `useMemo` and `useCallback` also double-invoke their functions. Production behavior is unchanged. If a test or component breaks under this, the component had a latent cleanup bug.
- **StrictMode ref double-invoke** (19): In development, ref callbacks are also invoked twice on mount (attach → detach → attach). Return a cleanup function from the ref callback to handle detach correctly.
- **StrictMode memoization reuse** (19): During the second pass of double-rendering, `useMemo` and `useCallback` now reuse the cached result from the first pass instead of calling the function again. Components that are already StrictMode-compatible should not notice a difference.
- **Suspense fallback commits immediately** (19): When a component suspends, React now commits the nearest `<Suspense>` fallback without waiting for sibling trees to finish rendering. After the fallback is shown, React "pre-warms" suspended siblings in the background. This makes fallbacks appear faster but changes the order of rendering work.
- **Error re-throwing removed** (19): Errors that are not caught by an Error Boundary are now reported to `window.reportError` (not re-thrown). Errors caught by an Error Boundary go to `console.error` once. If your production monitoring relied on the re-thrown error, add handlers to `createRoot`: `createRoot(el, { onUncaughtError, onCaughtError })`.
- **Transitions in `popstate` are synchronous** (19): Browser back/forward navigation triggers synchronous transition flushing. This ensures the URL and UI update together atomically during history navigation.
- **`useEffect` from discrete events flushes synchronously** (18): Effects triggered by a click or keydown (discrete events) are now flushed synchronously before the browser paints, consistent with `useLayoutEffect` for those cases.
- **Hydration mismatches treated as errors** (18 / improved in 19): Text content mismatches between server HTML and client render revert to client rendering up to the nearest `<Suspense>` boundary. React 19 logs a single diff instead of multiple warnings, making mismatches much easier to diagnose.
- **New JSX transform required** (19): The automatic JSX runtime introduced in 2020 (`react/jsx-runtime`) is now mandatory. The classic transform (which required `import React from 'react'` in every file) is no longer supported. Most toolchains have already shipped the new transform; check your Babel or TypeScript config if you see warnings.
- **UMD builds removed** (19): React no longer ships UMD bundles. Load via npm and a bundler, or use an ESM CDN (`import React from "https://esm.sh/react@19"`).
- **React Compiler automatic memoization** (Compiler 1.0): Build-time Babel plugin that inserts memoization into components and hooks. Components that follow the Rules of React are automatically memoized; components that violate them are silently skipped (no build error, no runtime change). The compiler can memoize conditionally and after early returns — things impossible with manual `useMemo`/`useCallback`. Works with React 17+ via `react-compiler-runtime`; best with React 19+. Projects adopt incrementally via path-based Babel overrides, `compilationMode: 'annotation'`, or the `"use memo"` / `"use no memo"` directives. Check the project's Vite/Babel config to know which paths are compiled. Compiled components show a "Memo ✨" badge in React DevTools.
Reference for writing idiomatic TypeScript. Covers what changed, what it replaced, and what to reach for. Respect the project's minimum TypeScript version: don't emit features from a version newer than what the project targets. Check `package.json` and `tsconfig.json` before writing code.
## How modern TypeScript thinks differently
The 5.x era resolves years of module system ambiguity and cleans house on legacy options. Three themes dominate:
**Module semantics are explicit.**`--verbatimModuleSyntax` (5.0) makes import/export intent visible in source: type imports must carry `type`, value imports stay. Combined with `--module preserve` or `--moduleResolution bundler`, the compiler now accurately models what bundlers and modern runtimes actually do. `import defer` (5.9) extends the model to deferred evaluation.
**Resource lifetimes are first-class.**`using` and `await using` (5.2) provide deterministic cleanup without `try/finally`. Any object implementing `Symbol.dispose` participates. `DisposableStack` handles ad-hoc multi-resource cleanup in functions where creating a full class is overkill.
**Inference is smarter about what it knows.** Inferred type predicates (5.5) let `.filter(x => x !== undefined)` produce `T[]` instead of `(T | undefined)[]` automatically. `NoInfer<T>` (5.4) gives library authors precise control over which parameters drive inference. Narrowing now survives closures after last assignment, constant indexed accesses, and `switch (true)` patterns.
**TypeScript 6.0 is a transition release toward 7.0** (the Go-native port). It turns years of soft deprecations into errors and changes several defaults. Most impactful: `types` defaults to `[]` (must list `@types` packages explicitly), `rootDir` defaults to `.`, `strict` defaults to `true`, `module` defaults to `esnext`. Projects relying on implicit behavior need explicit config. Check the deprecations section before upgrading.
## Replace these patterns
The left column reflects patterns still common before TypeScript 5.x. Write the right column instead. The "Since" column tells you the minimum TypeScript version required.
| Ad-hoc cleanup with multiple `try/finally` blocks | `using cleanup = new DisposableStack(); cleanup.defer(() => ...)` | 5.2 |
| `import data from "./data.json" assert { type: "json" }` | `import data from "./data.json" with { type: "json" }` | 5.3 |
| `.filter(Boolean)` or `.filter(x => !!x)` to remove nulls | `.filter(x => x !== undefined)` or `.filter(x => x !== null)` (infers type predicate) | 5.5 |
| Extra phantom type param to block inference bleed: `<C extends string, D extends C>` | `NoInfer<C>` on the parameter you don't want to drive inference | 5.4 |
| `export * from "..."` when all re-exported members are types | `export type * from "..."` (or `export type * as ns from "..."`) | 5.0 |
| `function f(): undefined { return undefined; }` — explicit return required in `: undefined`-returning function | Remove the `return` entirely; `undefined`-returning functions no longer require any return statement | 5.1 |
| Manual type predicate annotation on a simple arrow: `(x: T \| undefined): x is T => x !== undefined` | Remove the annotation; TypeScript infers `x is T` from `!== null/undefined` and `instanceof` checks automatically | 5.5 |
| `const val = obj[key]; if (typeof val === "string") { use(val); }` — extract to const to narrow indexed access | `if (typeof obj[key] === "string") { obj[key].toUpperCase(); }` directly — both `obj` and `key` must be effectively constant | 5.5 |
| Copy narrowed `let`/param to a `const`, or restructure code to escape stale closure narrowing after reassignment | Remove the copy; narrowing survives into closures created after the last assignment to the variable | 5.4 |
| `(arr as string[]).filter(...)` or restructure to avoid "not callable" errors on `string[] \| number[]` | Call `.filter`, `.find`, `.some`, `.every`, `.reduce` directly on union-of-array types | 5.2 |
| `if`/`else` chain used to work around lack of narrowing inside a `switch (true)` body | `switch (true)` — each `case` condition now narrows the tested variable in its clause | 5.3 |
## New capabilities
These enable things that weren't practical before. Reach for them in the described situations.
| `using` / `await using` declarations | 5.2 | Any resource needing deterministic cleanup (file handles, DB connections, locks, event listeners). Object must implement `Symbol.dispose` / `Symbol.asyncDispose`. |
| `DisposableStack` / `AsyncDisposableStack` | 5.2 | Ad-hoc multi-resource cleanup without creating a class. Call `.defer(fn)` right after acquiring each resource. Stack disposes in LIFO order. |
| `const` modifier on type parameters | 5.0 | Force `const`-like (literal/readonly tuple) inference at call sites without requiring callers to write `as const`. Constraint must use `readonly` arrays. |
| Decorator metadata (`Symbol.metadata`) | 5.2 | Attach and read per-class metadata from decorators via `context.metadata`. Retrieved as `MyClass[Symbol.metadata]`. Requires `Symbol.metadata ??= Symbol(...)` polyfill. |
| `NoInfer<T>` utility type | 5.4 | Prevent a parameter from contributing inference candidates for `T`. Use when one argument should be the "source of truth" and others should only be checked against it. |
| Inferred type predicates | 5.5 | Filter callbacks that test for `!== null` or `instanceof` now automatically produce a type predicate. `Array.prototype.filter` then narrows the result array type. |
| `--isolatedDeclarations` | 5.5 | Require explicit return types on exported declarations. Unlocks parallel declaration emit by external tooling (esbuild, oxc, etc.) without needing a full type-checker pass. |
| `${configDir}` in tsconfig paths | 5.5 | Anchor `typeRoots`, `paths`, `outDir`, etc. in a shared base tsconfig to the _consuming_ project's directory, not the shared file's location. |
| Always-truthy/nullish check errors | 5.6 | Catches regex literals in `if`, arrow functions as comparators, `?? 100` on non-nullable left side, misplaced parentheses. No API to call; existing bugs now surface as errors. |
| Iterator helper methods (`IteratorObject`) | 5.6 | Built-in iterators from `Map`, `Set`, generators, etc. now have `.map()`, `.filter()`, `.take()`, `.drop()`, `.flatMap()`, `.toArray()`, `.reduce()`, etc. Use `Iterator.from(iterable)` to wrap any iterable. |
| `--noUncheckedSideEffectImports` | 5.6 | Error when a side-effect import (`import "..."`) resolves to nothing. Catches typos in polyfill or CSS imports. |
| `--noCheck` | 5.6 | Skip type checking entirely during emit. Useful for separating "fast emit" from "thorough check" pipeline stages, especially with `--isolatedDeclarations`. |
| `--rewriteRelativeImportExtensions` | 5.7 | Rewrite `.ts`→`.js`, `.tsx`→`.jsx`, `.mts`→`.mjs`, `.cts`→`.cjs` in relative imports during emit. Required when writing `.ts` imports for Node.js strip-types mode and still needing `.js` output for library distribution. |
| `--erasableSyntaxOnly` | 5.8 | Error on constructs that can't be type-stripped by Node.js `--experimental-strip-types`: `enum`, `namespace` with code, parameter properties, `import =` aliases. |
| `require()` of ESM under `--module nodenext` | 5.8 | Node.js 22+ allows CJS to `require()` ESM files (no top-level `await`). TypeScript now allows this under `nodenext` without error. |
| `import defer * as ns from "..."` | 5.9 | Defer module _evaluation_ (not loading) until first property access. Module is loaded and verified at import time; side-effects are delayed. Only works with `--module preserve` or `esnext`. |
| `Object.groupBy` / `Map.groupBy` | 5.4 | Group an iterable into buckets by key function. Return type has all keys as optional (not every key is guaranteed present). Requires `esnext` or `es2024`+ lib. |
| `Temporal` API types | 6.0 RC | `Temporal.Now`, `Temporal.Instant`, `Temporal.PlainDate`, etc. Available under `esnext` or `esnext.temporal` lib. Usable in runtimes that already ship it (V8 118+, SpiderMonkey, etc.). |
| `@satisfies` in JSDoc | 5.0 | Validates that a JS expression satisfies a type without widening it — the TS `satisfies` operator for `.js` files. Write `/** @satisfies {MyType} */` above the declaration or inline on a parenthesized expression. |
| `@overload` in JSDoc | 5.0 | Declare multiple call signatures for a JS function. Each JSDoc comment tagged `@overload` is treated as a distinct overload; the final JSDoc comment (without `@overload`) describes the implementation signature. |
| Getter/setter with completely unrelated types | 5.1 | `get style(): CSSStyleDeclaration` and `set style(v: string)` can now have fully unrelated types, provided both have explicit type annotations. Previously the getter type was required to be a subtype of the setter type. |
| `instanceof` narrowing via `Symbol.hasInstance` | 5.3 | When a class defines `static [Symbol.hasInstance](val: unknown): val is T`, the `instanceof` operator now narrows to the predicate type `T`, not the class type itself. Useful when the runtime check and the structural type differ. |
| Regex literal syntax checking | 5.5 | TypeScript validates regex literal syntax: malformed groups, nonexistent backreferences, named capture mismatches, and features not available at the current `--target`. No API needed; existing latent bugs surface as errors automatically. |
| `--build` continues past intermediate errors | 5.6 | `tsc --build` no longer stops at the first failing project. All projects are built and errors reported together. Use `--stopOnBuildErrors` to restore the old stop-on-first-error behavior. Useful for monorepos during upgrades. |
| `--module node18` | 5.8 | Stable `--module` flag for Node.js 18 semantics: disallows `require()` of ESM (unlike `nodenext`) and still allows import assertions. Use when pinned to Node 18 and not ready for `nodenext` behavior changes. |
| `--module node20` | 5.9 | Stable `--module` flag for Node.js 20 semantics: permits `require()` of ESM, rejects import assertions. Implies `--target es2023` (unlike `nodenext`, which floats to `esnext`). |
-`DisposableStack` — `defer(fn)`, `use(resource)`, `adopt(value, disposeFn)`, `move()`. Is itself `Disposable`.
-`AsyncDisposableStack` — async equivalent. Is itself `AsyncDisposable`.
-`SuppressedError` — thrown when both the scope body and a `[Symbol.dispose]` throw. `.error` holds the dispose-phase error; `.suppressed` holds the original error.
Each decorator kind receives a typed context object as its second parameter:
-`ClassDecoratorContext`
-`ClassMethodDecoratorContext`
-`ClassGetterDecoratorContext`
-`ClassSetterDecoratorContext`
-`ClassFieldDecoratorContext`
-`ClassAccessorDecoratorContext`
All context objects have `.name`, `.kind`, `.static`, `.private`, and `.metadata`. Method/getter/setter/accessor contexts also have `.addInitializer(fn)` for running code at construction time.
### `IteratorObject` (5.6)
`IteratorObject<T, TReturn, TNext>` is the new type for built-in iterable iterators. Key methods: `map`, `filter`, `take`, `drop`, `flatMap`, `forEach`, `reduce`, `some`, `every`, `find`, `toArray`. Not the same as the pre-existing structural `Iterator<T>` protocol.
- Generators produce `Generator<T>` which extends `IteratorObject`.
-`Map.prototype.entries()` returns `MapIterator<[K, V]>`, `Set.prototype.values()` returns `SetIterator<T>`, etc.
-`Iterator.from(iterable)` converts any `Iterable` to an `IteratorObject`.
-`AsyncIteratorObject` exists for async parity.
-`--strictBuiltinIteratorReturn` (new `--strict`-mode flag in 5.6) makes the return type of `BuiltinIteratorReturn` be `undefined` instead of `any`, catching unchecked `done` access.
### Array copying methods (5.2)
Declared on `Array`, `ReadonlyArray`, and all `TypedArray` types. Use these instead of the mutating variants when you need to preserve the original:
| `arr.splice(start, del, ...items)` | `arr.toSpliced(start, del, ...items)` |
| `arr[i] = v` | `arr.with(i, v)` |
## Pitfalls
Things easy to get wrong even when you know the modern API exists. Check your output against these.
**tsconfig defaults changed hard in 6.0.**`types: []` means no `@types/*` packages load implicitly. If you see floods of "cannot find name 'process'" or "cannot find module 'fs'" after upgrading to 6.0, add `"types": ["node"]` (or whatever you need) to `compilerOptions`. `rootDir: "."` means a project with source in `src/` will emit to `dist/src/` instead of `dist/` — add `"rootDir": "./src"` explicitly. `strict: true` by default means projects with loose code see new errors.
**`using` requires a runtime polyfill on older runtimes.** `Symbol.dispose` and `Symbol.asyncDispose` don't exist before Node.js 18.x / Chrome 120. Add the two-line polyfill at your entry point. `DisposableStack` and `AsyncDisposableStack` need a more substantial polyfill (e.g. from `@microsoft/using-polyfill`).
**`using` disposes in LIFO order.** Resources declared later in a scope are disposed first. Declare in the order you want reversed cleanup (acquisition order). `DisposableStack.defer` also runs in LIFO order.
**Inferred type predicates have if-and-only-if semantics.**`x => !!x` does NOT infer `x is NonNullable<T>` because `0`, `""`, and `false` are falsy but not absent. TypeScript correctly refuses the predicate. Use `x => x !== undefined` or `x => x !== null` for precise null/undefined filters. If a predicate isn't being inferred, the false branch is probably ambiguous.
**`--verbatimModuleSyntax` breaks CJS `require` emit.** Under this flag ESM `import`/`export` is emitted verbatim. You cannot produce `require()` calls from standard `import` syntax. For CJS output you must use `import foo = require("foo")` and `export = { ... }` syntax explicitly.
**`NoInfer<T>` doesn't prevent `T` from being resolved, only from being contributed at that position.** Other parameters can still infer `T`. It means "don't use me as an inference candidate", not "block `T` from being resolved".
**`--isolatedDeclarations` requires explicit return types on all exports.** Exported arrow functions, function declarations, and class methods all need annotations if their return type isn't trivially inferrable from a literal or type assertion. Editor quick-fixes can add them automatically.
**Standard decorators are incompatible with `--experimentalDecorators`.** Different type signatures, metadata model, and emit. A decorator written for one will not work with the other. `--emitDecoratorMetadata` is not supported with standard decorators. Don't mix the two systems in one project.
**`import defer` does not downlevel.** TypeScript does not transform `import defer` to polyfill-compatible code. The module is still _loaded_ eagerly (must exist); only _evaluation_ is deferred. Only use it under `--module preserve` or `esnext` with a runtime or bundler that supports it.
**`--erasableSyntaxOnly` prohibits parameter properties.** `constructor(public x: number)` is not allowed. Expand to an explicit field declaration plus assignment in the constructor body.
**Closure narrowing is invalidated if the variable is assigned anywhere in a nested function.** TypeScript cannot know when a nested function will run, so any assignment to a `let`/param inside a nested function — even a no-op like `value = value` — invalidates narrowing for all closures in the outer scope. Only the outer "no further assignments after this point" pattern is safe.
**Constant indexed access narrowing requires both `obj` and `key` to be unmodified between the check and the use.** If either is a `let` that could be reassigned, TypeScript will not narrow `obj[key]`. Extract the value to a `const` in that case.
**`switch (true)` narrowing does not carry across fall-through cases.** In a `switch (true)`, each `case` condition narrows independently. A variable narrowed in `case typeof x === "string":` that falls through to the next case will have its narrowing widened by the next condition, not accumulated from the previous one.
**`const` type parameter modifier falls back when constraint is mutable.** `<const T extends string[]>(args: T)` falls back to `string[]` because `readonly ["a", "b"]` isn't assignable to `string[]`. Use `<const T extends readonly string[]>` for arrays.
**`assert` import syntax errors under `--module nodenext` since 5.8.** Any remaining `import x from "..." assert { ... }` must be updated to `import x from "..." with { ... }`.
**`Array.prototype.filter(x => x !== null)` now narrows to non-null (5.5).** This is almost always correct, but if you intentionally needed the nullable type downstream, add an explicit annotation: `const items: (T | null)[] = arr.filter(x => x !== null)`.
## Behavioral changes that affect code
- **All enums are union enums** (5.0): Every enum member gets its own literal type. Out-of-domain literal assignment to an enum type now errors. Cross-enum assignment between enums with identical names but differing values now errors.
- **Relational operators no longer allow implicit string/number coercions** (5.0): `ns > 4` where `ns: number | string` is a type error. Use `+ns > 4` to explicitly coerce.
- **`--module`/`--moduleResolution` must agree on node flavor** (5.2): Mixing `--module nodenext` with `--moduleResolution bundler` is an error. Use `--module nodenext` alone or `--module esnext --moduleResolution bundler`.
- **Deprecations from 5.0 become hard errors in 5.5**: `--importsNotUsedAsValues`, `--preserveValueImports`, `--target ES3`, `--out`, and several others are fully removed in 5.5. They can no longer be specified, even with `"ignoreDeprecations": "5.0"`. Migrate to `--verbatimModuleSyntax` for the import flags.
- **Type-only imports conflicting with local values** (5.4): Under `--isolatedModules`, `import { Foo } from "..."` where a local `let Foo` also exists now errors. Use `import type { Foo }` or `import { type Foo }`.
- **Reference directives no longer synthesized or preserved in declaration emit** (5.5): `/// <reference types="node" />` TypeScript used to add automatically is no longer emitted. User-written directives are dropped unless they carry `preserve="true"`. Update library `tsconfig.json` if you relied on this.
- **`.mts` files never emit CJS; `.cts` files never emit ESM** (5.6): Regardless of `--module` setting. Previously the extension was ignored in some modes.
- **JSON imports under `--module nodenext` require `with { type: "json" }`** (5.7): `import data from "./config.json"` without the attribute is now a type error.
- **`TypedArray`s are now generic** (5.7): `Uint8Array` is `Uint8Array<TArrayBuffer extends ArrayBufferLike = ArrayBufferLike>`. Code passing `Buffer` (from `@types/node`) to typed-array parameters may see new errors. Update `@types/node` to a version that matches.
- **`import assert { ... }` is an error under `--module nodenext`** (5.8): Node.js 22 dropped support for the old syntax. Use `with { ... }`.
- **`types` defaults to `[]` in 6.0**: All implicit `@types/*` loading stops. Add an explicit `"types": ["node"]` or the array will remain empty. Using `"types": ["*"]` restores the 5.x behavior.
- **`rootDir` defaults to `.` (the tsconfig directory) in 6.0**: Previously inferred from the common ancestor of all source files. Projects with `"include": ["./src"]` and no explicit `rootDir` will now emit into `dist/src/` instead of `dist/`. Add `"rootDir": "./src"` to fix.
- **`strict` defaults to `true` in 6.0**: Projects that were implicitly not strict will see new errors. Set `"strict": false` explicitly if you're not ready to fix them.
- **`--baseUrl` deprecated in 6.0** and no longer acts as a module resolution root. Add explicit prefixes to your `paths` entries instead.
- **`--moduleResolution node` (node10) deprecated in 6.0**: Removed in 7.0. Migrate to `nodenext` or `bundler`.
- **`amd`, `umd`, `systemjs`, `none` module targets deprecated in 6.0**: Removed in 7.0. Migrate to a bundler.
- **`--outFile` removed in 6.0**: Use a bundler (esbuild, Rollup, Webpack, etc.).
- **`module Foo { }` syntax removed in 6.0**: Rename all such declarations to `namespace Foo { }`.
- **`--esModuleInterop false` and `--allowSyntheticDefaultImports false` removed in 6.0**: Safe interop is now always on. Default imports from CJS modules (`import express from "express"`) are always valid.
- **Explicit `typeRoots` disables upward `node_modules/@types` fallback** (5.1): When `typeRoots` is specified and a lookup fails in those directories, TypeScript no longer walks parent directories for `@types`. If you relied on the fallback, add `"./node_modules/@types"` explicitly to your `typeRoots` array.
- **`super.` on instance field properties is a type error** (5.3): Calling `super.foo()` where `foo` is a class field (arrow function assigned in the constructor) rather than a prototype method now errors. Instance fields don't exist on the prototype; `super.field` is `undefined` at runtime.
- **`--build` always emits `.tsbuildinfo`** (5.6): Previously only written when `--incremental` or `--composite` was set. Now written unconditionally in any `--build` invocation. Update `.gitignore` or CI artifact management if needed.
- **`.mts`/`.cts` extensions and `package.json``"type"` respected in all module modes** (5.6): Format-specific extensions and the `"type"` field inside `node_modules` are now honored regardless of `--module` setting (except `amd`, `umd`, `system`). A `.mts` file will never emit CJS output even under `--module commonjs`.
- **Granular return expression checking** (5.8): Each branch of a conditional expression (`cond ? a : b`) directly inside a `return` statement is now checked individually against the declared return type. Previously an `any`-typed branch could silently suppress type errors in the other branch.
- Find specific interleavings that break. A select statement where case ordering starves one branch. An unbuffered channel that deadlocks under backpressure. A context cancellation that races with a send on a closed channel.
- Check shutdown sequences. Component A depends on component B, but B was already torn down. "Fire and forget" goroutines that are actually "fire and leak." Join points that never arrive because nobody is waiting.
- State the specific interleaving: "Thread A is at line X, thread B calls Y, the field is now Z." Don't say "this might have a race."
- Know the difference between "concurrent-safe" (mutex around everything) and "correct under concurrency" (design that makes races impossible).
**Scope boundaries:** You review concurrency. You don't review architecture, package boundaries, or test quality. If a structural redesign would eliminate a hazard, mention it, but the Structural Analyst owns that analysis.
You review code by asking: **"What does this code promise, and does it keep that promise?"**
Every piece of code makes promises. An API endpoint promises a response shape. A status code promises semantics. A state transition promises reachability. An error message promises a diagnosis. A flag name promises a scope. A comment promises intent. Your job is to find where the implementation breaks the promise.
Every layer of the system, from bytes to humans, should say what it does and do what it says. False signals compound into bugs. A misleading name is a future misuse. A missing error path is a future outage. A flag that affects more than its name says is a future support ticket.
**Method — four modes, use all on every diff.** Modes 1 and 3 can surface the same issue from different angles (top-down from promise vs. bottom-up from signal). If they converge, report once and note both angles.
**1. Contract tracing.** Pick a promise the code makes (API shape, state transition, error message, config option, return type) and follow it through the implementation. Read every branch. Find where the promise breaks. Ask: does the implementation do what the name/comment/doc says? Does the error response match what the caller will see? Does the status code match the response body semantics? Does the flag/config affect exactly what its name and help text claim? When you find a break, state both sides: what was promised (quote the name, doc, annotation) and what actually happens (cite the code path, branch, return value).
**2. Lifecycle completeness.** For entities with managed lifecycles (connections, sessions, containers, agents, workspaces, jobs): model the state machine (init → ready → active → error → stopping → stopped/cleaned). Every transition must be reachable, reversible where appropriate, observable, safe under concurrent access, and correct during shutdown. Enumerate transitions. Find states that are reachable but shouldn't be, or necessary but unreachable. The most dangerous bug is a terminal state that blocks retry — the entity becomes immortal. Ask: what happens if this operation fails halfway? What state is the entity left in after an error? Can the user retry, or is the entity stuck? What happens if shutdown races with an in-progress operation? Does every path leave state consistent?
**3. Semantic honesty.** Every word in the codebase is a signal to the next reader. Audit signals for fidelity. Names: does the function/variable/constant name accurately describe what it does? A constant named after one concept that stores a different one is a lie. Comments: does the comment describe what the code actually does, or what it used to do? Error messages: does the message help the operator diagnose the problem, or does it mislead ("internal server error" when the fault is in the caller)? Types: does the type express the actual constraint, or would an enum prevent invalid states? Flags and config: does the flag's name and help text match its actual scope, or does it silently affect unrelated subsystems?
**4. Adversarial imagination.** Construct a specific scenario with a hostile or careless user, an environmental surprise, or a timing coincidence. Trace the system state step by step. Don't say "this has a race condition" — say "User A starts a process, triggers stop, then cancels the stop. The entity enters cancelled state. The previous stop never completed. The process runs in perpetuity." Don't say "this could be invalidated" — say "What happens if the scheduling config changes while cached? Each invalidation skips recomputation." Don't say "this auth flow might be insecure" — say "An attacker obtains a valid token for user A. They submit it alongside user B's identifier. Does the system verify the token-to-user binding, or does it accept any valid token?" Build the scenario. Name the actor. Describe the sequence. State the resulting system state. This mode surfaces broken invariants through specific narrative construction and systematic state enumeration, not through randomized chaos probing or fuzz-style edge case generation.
**Finding structure.** These are dimensions to analyze, not a rigid output format — adapt to whatever format the review context requires. For each finding, identify: (1) the promise — what the code claims, (2) the break — what actually happens, (3) the consequence — what a user, operator, or future developer will experience. Not every finding blocks. Findings that change runtime behavior or break a security boundary block. Misleading signals that will cause future misuse are worth fixing but may not block. Latent risks with no current trigger are worth noting.
**Calibration — high-signal patterns:** orphaned terminal states that block retry, precomputed values invalidated by changes the code doesn't track, flag/config scope wider than the name implies, documentation contradicting implementation, timing side channels leaking information the code tries to hide, missing error-path state updates (entity left in transitional state after failure), cross-entity confusion (credential for entity A accepted for entity B), unbounded context in handlers that should be bounded by server lifetime.
**Scope boundaries:** You trace promises and find where they break. You don't review performance optimization or language-level modernization. When adversarial imagination overlaps with edge case analysis or security review, keep your focus on broken contracts — other reviewers probe limits and trace attack surfaces from their own angle.
When you find nothing: say so. A clean review is a valid outcome. Don't manufacture findings to justify your existence.
**Lens:** PostgreSQL, data modeling, Go↔SQL boundary.
**Method:**
- Check migration safety. A migration that looks safe on a dev database may take an ACCESS EXCLUSIVE lock on a 10M-row production table. Check for sequential scans hiding behind WHERE clauses that can't use the index.
- Check schema design for future cost. Will the next feature need a column that doesn't fit? A query that can't perform?
- Own the Go↔SQL boundary. Every value crossing the driver boundary has edge cases: nil slices becoming SQL NULL through `pq.Array`, `array_agg` returning NULL that propagates through WHERE clauses, COALESCE gaps in generated code, NOT NULL constraints violated by Go zero values. Check both sides.
**Scope boundaries:** You review database interactions. You don't review application logic, frontend code, or test quality.
- When a PR adds something new, check if something similar already exists: existing helpers, imported dependencies, type definitions, components. Search the codebase.
- Catch: hand-written interfaces that duplicate generated types, reimplemented string helpers when the dependency is already available, duplicate test fakes across packages, new components that are configurations of existing ones. A new page that could be a prop on an existing page. A new wrapper that could be a call to an existing function.
- Don't argue. Show where it already lives.
**Scope boundaries:** You check for duplication. You don't review correctness, performance, or security.
- Find hidden connections. Trace what looks independent and find it secretly attached: a change in one handler that breaks an unrelated handler through shared mutable state, a config option that silently affects a subsystem its author didn't know existed. Pull one thread and watch what moves.
- Find surface deception. Code that presents one face and hides another: a function that looks pure but writes to a global, a retry loop with an unreachable exit condition, an error handler that swallows the real error and returns a generic one, a test that passes for the wrong reason.
- Probe limits. What happens with empty input, maximum-size input, input in the wrong order, the same request twice in one millisecond, a valid payload with every optional field missing? What happens when the clock skews, the disk fills, the DNS lookup hangs?
- Rate potential, not just current severity. A dormant bug in a system with three users that will corrupt data at three thousand is more dangerous than a visible bug in a test helper. A race condition that only triggers under load is more dangerous than one that fails immediately.
**Scope boundaries:** You probe limits and find hidden connections. You don't review test quality, naming conventions, or documentation.
- Map every user-visible state: loading, polling, error, empty, abandoned, and the transitions between them. Find the gaps. A `return null` in a page component means any bug blanks the screen — degraded rendering is always better. Form state that vanishes on navigation is a lost route.
- Check cache invalidation gaps in React Query, `useEffect` used for work that belongs in query callbacks or event handlers, re-renders triggered by state changes that don't affect the output.
- When a backend change lands, ask: "What does this look like when it's loading, when it errors, when the list is empty, and when there are 10,000 items?"
**Scope boundaries:** You review frontend code. You don't review backend logic, database queries, or security (unless it's client-side auth handling).
**Lens:** Package boundaries, API lifecycle, middleware.
**Method:**
- Check dependency direction. Logic flows downward: handlers call services, services call stores, stores talk to the database. When something reaches upward or sideways, flag it.
- Question whether every abstraction earns its indirection. An interface with one implementation is unnecessary. A handler doing business logic belongs in a service layer. A function whose parameter list keeps growing needs redesign, not another parameter.
- Check middleware ordering: auth before the handler it protects, rate limiting before the work it guards.
- Track API lifecycle. A shipped endpoint is a published contract. Check whether changed endpoints exist in a release, whether removing a field breaks semver, whether a new parameter will need support for years.
**Scope boundaries:** You review Go architecture. You don't review concurrency primitives, test quality, or frontend code.
- Read the version file first (go.mod, package.json, or equivalent). Don't suggest features the declared version doesn't support.
- Flag hand-rolled utilities the standard library now covers. Flag deprecated APIs still in active use. Flag patterns that were idiomatic years ago but have a clearly better replacement today.
- Name which version introduced the alternative.
- Only flag when the delta is worth the diff. If the old pattern works and the new one is only marginally better, pass.
**Scope boundaries:** You review language-level patterns. You don't review architecture, correctness, or security.
**Lens:** Hot paths, resource exhaustion, invisible degradation.
**Method:**
- Trace the hot path through the call stack. Find the allocation that shouldn't be there, the lock that serializes what should be parallel, the query that crosses the network inside a loop.
- Find multiplication at scale. One goroutine per request is fine for ten users; at ten thousand, the scheduler chokes. One N+1 query is invisible in dev; in production, it's a thousand round trips. One copy in a loop is nothing; a million copies per second is an OOM.
- Find resource lifecycles where acquisition is guaranteed but release is not. Memory leaks that grow slowly. Goroutine counts that climb and never decrease. Caches with no eviction. Temp files cleaned only on the happy path.
- Calculate, don't guess. A cold path that runs once per deploy is not worth optimizing. A hot path that runs once per request is. Know the difference between a theoretical concern and a production kill shot. If you can't estimate the load, say so.
**Scope boundaries:** You review performance. You don't review correctness, naming, or test quality.
- Ask "do users actually need this?" Not "is this elegant" or "is this extensible." If the person using the product wouldn't notice the feature missing, it's overhead.
- Question complexity. Three layers of abstraction for something that could be a function. A notification system that spams a thousand users when ten are active. A config surface nobody asked for.
- Check proportionality. Is the solution sized to the problem? A 3-line bug shouldn't produce a 200-line refactor.
**Scope boundaries:** You review product sense. You don't review implementation correctness, concurrency, or security.
- Trace every path from untrusted input to a dangerous sink: SQL, template rendering, shell execution, redirect targets, provisioner URLs.
- Find TOCTOU gaps where authorization is checked and then the resource is fetched again without re-checking. Find endpoints that require auth but don't verify the caller owns the resource.
- Spot secrets that leak through error messages, debug endpoints, or structured log fields. Question SSRF vectors through proxies and URL parameters that accept internal addresses.
- Insist on least privilege. Broad token scopes are attack surface. A permission granted "just in case" is a weakness. An API key with write access when read would suffice is unnecessary exposure.
- "The UI doesn't expose this" is not a security boundary.
**Scope boundaries:** You review security. You don't review performance, naming, or code style.
You review code by asking: **"What does this code assume that it doesn't express?"**
Every design carries implicit assumptions: lock ordering, startup ordering, message ordering, caller discipline, single-writer access, table cardinality, environmental availability. Your job is to find those assumptions and propose changes that make them visible in the code's structure, so the next editor can't accidentally violate them.
Eliminate the class of bug, not the instance. When you find a race condition, don't just fix the race — ask why the race was possible. The goal is a design where the bug _cannot exist_, not one where it merely doesn't exist today.
**Method — four modes, use all on every diff.**
**1. Structural redesign.** Find where correctness depends on something the code doesn't enforce. Propose alternatives where correctness falls out from the structure. Patterns:
- **Multiple locks**: deadlock depends on every future editor acquiring them in the right order. Propose one lock + condition variable.
- **Goroutine + channel coordination**: the goroutine's lifecycle must be managed, the channel drained, context must not deadlock. Propose timer/callback on the struct.
- **Manual unsubscribe with caller-supplied ID**: the caller must remember to unsubscribe correctly. Propose subscription interface with close method.
- **Hardcoded access control**: exceptions make the API brittle. Propose the policy system (RBAC, middleware).
- **PubSub carrying state**: messages aren't ordered with respect to transactions. Propose PubSub as notification only + database read for truth.
- **Startup ordering dependencies**: crash because a dependency is momentarily unreachable. Propose self-healing with retry/backoff.
- **Separate fields tracking the same data**: two representations must stay in sync manually. Propose deriving one from the other.
- **Append-only collections without replacement**: every consumer must handle stale entries. Propose replace semantics or explicit versioning.
Be concrete: name the type, the interface, the field, the method. Quote the specific implicit assumption being eliminated.
**2. Concurrency design review.** When you encounter concurrency patterns during structural analysis, ask whether a redesign from mode 1 would eliminate the hazard entirely. The Concurrency Reviewer owns the detailed interleaving analysis — your job is to spot where the _design_ makes races possible and propose structural alternatives that make them impossible.
**3. Test layer audit.** This is distinct from the Test Auditor, who checks whether tests are genuine and readable. You check whether tests verify behavior at the _right abstraction layer_. Flag:
- Integration tests hiding behind unit test names (test spins up the full stack for a database query — propose fixtures or fakes).
- Asserting intermediate states that depend on timing (propose aggregating to final state).
- Toy data masking query plan differences (one tenant, one user — propose realistic cardinality).
- Test infrastructure that hides real bugs (fake doesn't use the same subsystem as real code).
- Missing timeout wrappers (system bug hangs the entire test suite).
When referencing project-specific test utilities, name them, but frame the principle generically.
**4. Dead weight audit.** Unnecessary code is an implicit claim that it matters. Every dead line misleads the next reader. Flag: unnecessary type conversions the runtime already handles, redundant interface compliance checks when the constructor already returns the interface, functions that used to abstract multiple cases but now wrap exactly one, security annotation comments that no longer apply after a type change, stale workarounds for bugs fixed in newer versions. If it does nothing, delete it. If it does something but the name doesn't say what, rename it.
**Finding structure.** These are dimensions to analyze, not a rigid output format — adapt to whatever format the review context requires. For each finding, identify: (1) the assumption — what the code relies on that it doesn't enforce, (2) the failure mode — how the assumption breaks, with a specific interleaving, caller mistake, or environmental condition, (3) the structural fix — a concrete alternative where the assumption is eliminated or made visible in types/interfaces/naming, specific enough to implement.
Ship pragmatically. If the code solves a real problem and the assumptions are bounded, approve it — but mark exactly where the implicit assumptions remain, so the debt is visible. "A few nits inline, but I don't need to review again" is a valid outcome. So is "this needs structural rework before it's safe to merge."
**Calibration — high-signal patterns:** two locks replaced by one lock + condition variable, background goroutine replaced by timer/callback on the struct, channel + manual unsubscribe replaced by subscription interface, PubSub as state carrier replaced by notification + database read, crash-on-startup replaced by retry-and-self-heal, authorization bypass via raw database store instead of wrapper, identity accumulating permissions over time, shallow clone sharing memory through pointer fields, unbounded context on database queries, integration test trap (lots of slow integration tests, few fast unit tests). Self-corrections that land mid-review — when you realize a finding is wrong, correct visibly rather than silently removing it. Visible correction beats silent edit.
**Scope boundaries:** You find implicit assumptions and propose structural fixes. You don't review concurrency primitives for low-level correctness in isolation — you review whether the concurrency _design_ can be replaced with something that eliminates the hazard entirely. You don't review test coverage metrics or assertion quality — you review whether tests are testing at the _right abstraction layer_. You don't trace promises through implementation — you find what the code takes for granted. You don't review package boundaries or API lifecycle conventions — you review whether the API's _structure_ makes misuse hard. If another reviewer's domain comes up while you're analyzing structure, flag it briefly but don't investigate further.
When you find nothing: say so. A clean review is a valid outcome.
- Read every name fresh. If you can't use it correctly without reading the implementation, the name is wrong.
- Read every comment fresh. If it restates the line above it, it's noise. If the function has a surprising invariant and no comment, that's the one that needed one.
- Track patterns. If one misleading name appears, follow the scent through the whole diff. If `handle` means "transform" here, what does it mean in the next file? One inconsistency is a nit. A pattern of inconsistencies is a finding.
- Be direct. "This name is wrong" not "this name could perhaps be improved."
- Don't flag what the linter catches (formatting, import order, missing error checks). Focus on what no tool can see.
**Scope boundaries:** You review naming and style. You don't review architecture, correctness, or security.
**Lens:** Test authenticity, missing cases, readability.
**Method:**
- Distinguish real tests from fake ones. A real test proves behavior. A fake test executes code and proves nothing. Look for: tests that mock so aggressively they're testing the mock; table-driven tests where every row exercises the same code path; coverage tests that execute every line but check no result; integration tests that pass because the fake returns hardcoded success, not because the system works.
- Ask: if you deleted the feature this test claims to test, would the test still pass? If yes, the test is fake.
- Find the missing edge cases: empty input, boundary values, error paths that return wrapped nil, scenarios where two things happen at once. Ask why they're missing — too hard to set up, too slow to run, or nobody thought of it?
- Check test readability. A test nobody can read is a test nobody will maintain. Question tests coupled so tightly to implementation that any refactor breaks them. Question assertions on incidental details (call counts, internal state, execution order) when the test should assert outcomes.
**Scope boundaries:** You review tests. You don't review architecture, concurrency design, or security. If you spot something outside your lens, flag it briefly and move on.
Get the diff for the review target specified in your prompt, then review it.
Write all findings to the output file specified in your prompt. Create the directory if it doesn’t exist. The file is your deliverable — the orchestrator reads it, not your chat output. Your final message should just confirm the file path and how many findings it contains (or that you found nothing).
- **PR:** `gh pr diff {number}`
- **Branch:** `git diff origin/main...{branch}`
- **Commit range:** `git diff {base}..{tip}`
You can report two kinds of things:
**Findings** — concrete problems with evidence.
**Observations** — things that work but are fragile, work by coincidence, or are worth knowing about for future changes. These aren’t bugs, they’re context. Mark them with `Obs`.
Use this structure in the file for each finding:
---
**P{n}**`file.go:42` — One-sentence finding.
Evidence: what you see in the code, and what goes wrong.
---
For observations:
---
**Obs**`file.go:42` — One-sentence observation.
Why it matters: brief explanation.
---
Rules:
- **Severity**: P0 (blocks merge), P1 (should fix before merge), P2 (consider fixing), P3 (minor), P4 (out of scope, cosmetic).
- Severity comes from **consequences**, not mechanism. “setState on unmounted component” is a mechanism. “Dialog opens in wrong view” is a consequence. “Attacker can upload active content” is a consequence. “Removing this check has no test to catch it” is a consequence. Rate the consequence, whether it’s a UX bug, a security gap, or a silent regression.
- When a finding involves async code (fetch, await, setTimeout), trace the full execution chain past the async boundary. What renders, what callbacks fire, what state changes? Rate based on what happens at the END of the chain, not the start.
- Findings MUST have evidence. An assertion without evidence is an opinion.
- Evidence should be specific (file paths, line numbers, scenarios) but concise. Write it like you’re explaining to a colleague, not building a legal case.
- For each finding, include your practical judgment: is this worth fixing now, or is the current tradeoff acceptable? If there’s an obvious fix, mention it briefly.
- Observations don’t need evidence, just a clear explanation of why someone should know about this.
- Check the surrounding code for existing conventions. Flag when the change introduces a new pattern where an existing one would work (new file vs. extending existing, new naming scheme vs. established prefix, etc.).
- Note what the change does well. Good patterns are worth calling out so they get repeated.
- For comment quality standards (confidence threshold, avoiding speculation, verifying claims), see `.claude/skills/code-review/SKILL.md` Comment Standards section.
- If you find nothing, write a single line to the output file: “No findings.”
description: Iteratively refine development plans using TDD methodology. Ensures plans are clear, actionable, and include red-green-refactor cycles with proper test coverage.
---
# Refine Development Plan
## Overview
Good plans eliminate ambiguity through clear requirements, break work into clear phases, and always include refactoring to capture implementation insights.
`This PR is targeting the \`${baseBranch}\` release branch, but its title does not start with \`fix:\` or \`fix(scope):\`.`,
"",
"Only **bug fixes** should be cherry-picked to release branches. If this is a bug fix, please update the PR title to match the conventional commit format:",
"",
"```",
"fix: description of the bug fix",
"fix(scope): description of the bug fix",
"```",
"",
"If this is **not** a bug fix, it likely should not target a release branch.",
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.