The mux module's input variable was renamed from `add-project` to
`add_project`. This updates the dogfood template to use the new name.
Ref:
https://github.com/coder/registry/blob/main/registry/coder/modules/mux/main.tf
(variable `add_project`)
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Summary
Wire VAPID web push notifications into the Agents (chat) system so users
get desktop notifications when an agent finishes running.
### Backend
- Add `webpush.Dispatcher` to `chatd.Server` and pass it through from
`coderd.Options.WebPushDispatcher`
- In `processChat()`'s deferred cleanup, dispatch a web push
notification when the chat reaches a terminal state:
- **`waiting`** (success): "Agent has finished running."
- **`error`** (failure): the error message, or "Agent encountered an
error."
- Sub-agent chats (`ParentChatID.Valid`) are skipped to avoid
notification spam from internal delegation
- Gracefully no-ops when the dispatcher is nil (web push disabled)
### Frontend
- New `WebPushButton` component — a bell icon that uses the existing
`useWebpushNotifications` hook
- Returns `null` when the `web-push` experiment is off
- Three states: loading spinner, green bell (subscribed), muted bell-off
(unsubscribed)
- Tooltip + toast feedback on toggle
- Added to both the Agents page empty state top bar and the AgentDetail
top bar
- The Agents page has its own layout (no standard Navbar), so it needs
its own subscribe button
### End-to-end flow
1. User clicks the bell icon on `/agents` → browser subscribes via VAPID
2. User starts an agent chat → chat enters `running` status
3. Agent finishes → `processChat` defer sets status to `waiting`/`error`
→ dispatches web push
4. Browser service worker shows a desktop notification with the chat
title and status
---------
Co-authored-by: Coder <coder@users.noreply.github.com>
## Summary
Replaces the plain `<TextareaAutosize>` in the Agent chat input
(`AgentChatInput`) with a Lexical-based editor component, matching the
pattern used in [coder/blink](https://github.com/coder/blink).
## What changed
### New component: `ChatMessageInput`
`site/src/components/ChatMessageInput/ChatMessageInput.tsx`
A Lexical-powered text input that behaves as a plain-text editor with:
- **Enter** submits, **Shift+Enter** inserts newline
- Rich-text formatting disabled (Cmd+B/I/U blocked)
- Paste sanitization (strips formatting, inserts plain text)
- Undo/redo via HistoryPlugin
- Imperative ref API: `insertText()`, `clear()`, `focus()`, `getValue()`
### Updated components
- **`AgentChatInput.tsx`** — Swapped `<TextareaAutosize>` for
`<ChatMessageInput>`. Moved from controlled `value`/`onChange` to
ref-based pattern with `initialValue`/`onContentChange`.
- **`AgentDetail.tsx`** — Updated to use `useRef` for input value
tracking and `editorInitialValue` state for editor resets (edit/cancel
flows).
- **`AgentsPage.tsx`** — Updated to use `useRef` + `initialValue`
pattern.
- **`AgentChatInput.stories.tsx`** — Updated prop names.
### Why Lexical?
This lays the groundwork for features that a native `<textarea>` can't
support:
- Ghost text / inline autocomplete suggestions
- @-mentions and slash commands
- Programmatic text insertion (e.g. from speech-to-text)
- Custom inline decorators (chips, pills, badges)
- Syntax-highlighted code blocks
No adornments are added in this PR — it's a drop-in replacement that
matches existing behavior.
---------
Co-authored-by: Coder <coder@coder.com>
When hovering over a running/pending chat in the agents sidebar, the
spinning status icon was being replaced by the expand/collapse chevron
button. This was disorienting because the spinner conveys important "in
progress" state.
## Changes
**`AgentsSidebar.tsx`**:
- Added `group/icon` scoped hover group to the icon container div
- When a chat is executing (`pending`/`running`), the chevron toggle
only appears on hover of the icon area itself, not the entire row
- Non-executing chats retain the original whole-row hover behavior (no
UX change)
**`AgentsSidebar.stories.tsx`**:
- Added `RunningChatPreservesSpinner` story verifying the spinner is
present and the toggle button starts invisible for running chats with
children
Co-authored-by: Coder <coder@users.noreply.github.com>
## Problem
The `agentproc` process manager spawns processes with only
`os.Environ()`, missing agent-level environment variables like
`GIT_ASKPASS`, `CODER_*`, and `GIT_SSH_COMMAND` that are injected by the
agent's `updateCommandEnv` function. This means processes started
through the HTTP process API (used by chat tools) cannot authenticate
git operations via the Coder gitaskpass helper.
By contrast, SSH sessions get the full agent environment because the SSH
server calls `updateCommandEnv` via its `UpdateEnv` config hook.
## Fix
Wire the agent's `updateCommandEnv` hook into the process manager so all
spawned processes receive the full agent environment. The hook is:
- Passed as a parameter through `NewAPI` → `newManager`
- Called in `manager.start()` with `os.Environ()` as the base, producing
the same enriched env that SSH sessions get
- Gracefully falls back to `os.Environ()` if the hook returns an error
Request-level env vars (`req.Env`, set by chat tools) are still appended
last and take precedence.
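The layering described above can be sketched roughly as follows. This is a minimal illustration, not the code in `agent/agentproc/process.go`: the `updateEnvFunc` signature, `buildProcessEnv`, and `lookupEnv` are hypothetical names chosen for the example. It relies on the documented `os/exec` behavior that the last value wins when `Cmd.Env` contains duplicate keys.

```go
package main

import (
	"os"
	"strings"
)

// updateEnvFunc stands in for the agent's updateCommandEnv hook. The
// signature is an assumption made for this sketch.
type updateEnvFunc func(base []string) ([]string, error)

// buildProcessEnv assembles the environment for a spawned process:
// os.Environ() as the base, enriched by the hook, then request-level
// variables appended last so they take precedence.
func buildProcessEnv(hook updateEnvFunc, reqEnv map[string]string) []string {
	base := os.Environ()
	env := base
	if hook != nil {
		// Hand the hook a copy so a failing hook cannot alter the fallback.
		enriched, err := hook(append([]string{}, base...))
		if err == nil {
			env = enriched
		}
		// On error, env stays as plain os.Environ().
	}
	for k, v := range reqEnv {
		env = append(env, k+"="+v)
	}
	return env
}

// lookupEnv returns the effective value of key, scanning from the end
// because the last duplicate entry is the one os/exec uses.
func lookupEnv(env []string, key string) (string, bool) {
	prefix := key + "="
	for i := len(env) - 1; i >= 0; i-- {
		if strings.HasPrefix(env[i], prefix) {
			return env[i][len(prefix):], true
		}
	}
	return "", false
}
```

With a hook that adds `GIT_ASKPASS` and a request that sets the same key as the hook, `lookupEnv` reports the request's value, which is the precedence described above.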
## Changes
- `agent/agentproc/process.go`: Add `updateEnv` field to manager, call
it when building process env
- `agent/agentproc/api.go`: Accept `updateEnv` parameter in `NewAPI`
- `agent/agent.go`: Pass `a.updateCommandEnv` when creating the process
API
- `agent/agentproc/api_test.go`: Add `UpdateEnvHook` and
`UpdateEnvHookOverriddenByReqEnv` tests
Co-authored-by: Coder <coder@coder.com>
## Problem
Chat titles sometimes don't update in the UI. The generated AI title
gets stuck as the fallback (first 6 words of the message) even though
the backend successfully generates a proper title.
## Root Causes
### 1. Cancelable context used during cleanup DB read (P0)
In `processChat`, the deferred cleanup re-reads the chat from the DB to
pick up the AI-generated title for the `status_change` pubsub event. But
it used the cancelable `ctx` instead of `cleanupCtx`:
```go
// Before — ctx may already be canceled here
if freshChat, readErr := p.db.GetChatByID(ctx, chat.ID); readErr == nil {
```
When the context is canceled, the DB read fails silently and the
`status_change` event carries the stale fallback title.
### 2. Title goroutine not tracked by inflight WaitGroup (P2)
The `maybeGenerateChatTitle` goroutine was fire-and-forget — not tracked
by `p.inflight`. During graceful shutdown, the server could exit before
the goroutine completes its DB write or pubsub publish.
### 3. No recovery when watchChats() WebSocket misses events
The frontend relies entirely on the `watchChats()` WebSocket connection for
title updates. If the connection drops or misses events, titles never
recover; the only fix was a full page reload.
## Fixes
1. **Use `cleanupCtx`** for the `GetChatByID` call and logger in the
deferred cleanup block.
2. **Track the title goroutine** with `p.inflight.Add(1)` / `defer
p.inflight.Done()` so shutdown waits for it.
3. **Invalidate chats query** on WebSocket open/close/error events so
missed updates are recovered via refetch. Also enable
`refetchOnWindowFocus` for the chats query.
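As a rough illustration of fix 2, the pattern of tracking a background goroutine in a `sync.WaitGroup` looks like the sketch below. The `processor` type, its `titles` map, and `generateTitleAsync` are hypothetical stand-ins; only the `inflight.Add(1)` / `defer inflight.Done()` pairing comes from the description above.

```go
package main

import "sync"

// processor is a hypothetical stand-in for the chat processor. The map
// plays the role of the database that the title goroutine writes to.
type processor struct {
	inflight sync.WaitGroup
	mu       sync.Mutex
	titles   map[string]string
}

// generateTitleAsync runs generate in the background while registering the
// goroutine with inflight, so Close can wait for its write to finish.
func (p *processor) generateTitleAsync(chatID string, generate func() string) {
	p.inflight.Add(1) // must happen before the goroutine starts
	go func() {
		defer p.inflight.Done()
		title := generate()
		p.mu.Lock()
		defer p.mu.Unlock()
		p.titles[chatID] = title
	}()
}

// Close blocks until every tracked goroutine has completed.
func (p *processor) Close() {
	p.inflight.Wait()
}
```

Calling `Add(1)` outside the goroutine matters: if it ran inside, `Close` could observe a zero counter and return before the goroutine was registered.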
Co-authored-by: Coder <coder@users.noreply.github.com>
When a chatd server shuts down (`Close()`), the server context is
canceled. Previously, in-flight chats would be marked as `error` because
the `context.Canceled` error was not distinguished from actual
processing failures.
This adds `isShutdownCancellation()` to detect when the error is caused
by the server context being canceled (as opposed to a chat-specific
cancellation like `ErrInterrupted`). When detected, the chat status is
set to `pending` with no `last_error`, allowing another replica to pick
it up and retry.
Extracted from #22440 — only the context cancellation bug fix, no
chattest changes.
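A minimal sketch of the distinction, assuming the check combines the error chain with the server context's state. The real `isShutdownCancellation()` may differ; `errInterrupted`, `statusAfterRun`, and `statusOnShutdown` are hypothetical names used only for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errInterrupted is a stand-in for the chat-specific ErrInterrupted.
var errInterrupted = errors.New("chat interrupted")

// isShutdownCancellation reports whether err stems from the server context
// being canceled rather than from a chat-specific cancellation.
func isShutdownCancellation(serverCtx context.Context, err error) bool {
	if err == nil || errors.Is(err, errInterrupted) {
		return false
	}
	return errors.Is(err, context.Canceled) && serverCtx.Err() != nil
}

// statusAfterRun picks the status to persist once processing returns.
func statusAfterRun(serverCtx context.Context, err error) string {
	switch {
	case err == nil:
		return "waiting"
	case isShutdownCancellation(serverCtx, err):
		return "pending" // another replica can pick the chat up and retry
	default:
		return "error"
	}
}

// statusOnShutdown walks the shutdown path: the server context is canceled
// and the run fails with an error that wraps context.Canceled.
func statusOnShutdown() string {
	serverCtx, cancel := context.WithCancel(context.Background())
	cancel()
	return statusAfterRun(serverCtx, fmt.Errorf("stream model: %w", context.Canceled))
}
```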
Inspired by openai/codex's `apply_patch` implementation, this changes
the `edit_files` search-and-replace to use a cascading match strategy
when the exact search string isn't found:
1. **Exact substring match** (byte-for-byte) — existing behavior,
unchanged
2. **Line-by-line match ignoring trailing whitespace** — handles
trailing spaces/tabs the LLM omits
3. **Line-by-line match ignoring all leading/trailing whitespace** —
handles tabs-vs-spaces and wrong indentation depth
## Problem
When the chat agent uses `edit_files`, it generates a search string that
must match the file content exactly. LLMs frequently get whitespace
wrong:
- Emitting spaces when the file uses tabs (or vice versa)
- Getting the indentation depth wrong by one or more levels
- Omitting trailing whitespace that exists in the file
When this happens, the edit silently does nothing, and the agent falls
into a retry loop using `cat -A` to diagnose the exact whitespace
characters.
## Solution
Adopted the same cascading fuzzy match strategy that [openai/codex uses
in
`seek_sequence.rs`](https://github.com/openai/codex/blob/main/codex-rs/apply-patch/src/seek_sequence.rs):
- Pass 1: exact match (existing behavior)
- Pass 2: `TrimRight` each line before comparing (trailing whitespace
tolerance)
- Pass 3: `TrimSpace` each line before comparing (full indentation
tolerance)
When a fuzzy match is found, the matched lines in the original file are
replaced with the replacement text. This preserves surrounding content
exactly.
## Changes
- `agent/agentfiles/files.go`: Replaced `icholy/replace` streaming
transformer with in-memory `fuzzyReplace` + helper functions
(`seekLines`, `spliceLines`)
- `agent/agentfiles/files_test.go`: Added 6 new test cases covering
trailing whitespace, tabs-vs-spaces, different indent depths, exact
match preference, no-match behavior, and mixed whitespace multiline
edits
- Removed `icholy/replace` dependency from go.mod/go.sum
---------
Co-authored-by: Kyle Carberry <kylecarbs@users.noreply.github.com>
The in-memory stream buffer accumulated message-part events for the
entire duration of a chat run. Late-joining subscribers received all
buffered parts even though the backing messages had already been
committed to the database, wasting memory and potentially duplicating
content.
Clear the buffer at the end of each `persistStep` call so that only
in-flight (uncommitted) parts remain in the buffer.
## Summary
Remove the `workspace_agent_id` column from the `chats` table and
dynamically look up the first workspace agent instead.
## Problem
When a workspace is stopped and restarted, the workspace agent gets a
new ID. The `workspace_agent_id` stored on the chat at creation time
becomes stale, making the agent unreachable. This caused chats to break
after workspace restarts.
## Solution
Instead of persisting the agent ID, dynamically look up the first agent
from the workspace's latest build via
`GetWorkspaceAgentsInLatestBuildByWorkspaceID` whenever an agent
connection is needed. The `workspace_id` on the chat remains stable
across restarts.
This behavior may be refined later (e.g., agent selection heuristics),
but picking the first agent resolves the immediate breakage.
## Changes
- **Migration 000425**: Drop `workspace_agent_id` column from `chats`
- **SQL queries**: Remove `workspace_agent_id` from `InsertChat` and
`UpdateChatWorkspace`
- **chatd.go**: `getWorkspaceConn` and `resolveInstructions` now look up
agents dynamically from workspace ID
- **chatd.go**: Remove `refreshChatWorkspaceSnapshot` (no longer needed)
- **createworkspace.go**: Stop persisting agent ID when associating
workspace with chat
- **subagent.go**: Stop passing agent ID to child chats
- **SDK/frontend**: Remove `WorkspaceAgentID` / `workspace_agent_id`
from Chat type
---------
Co-authored-by: Kyle Carberry <kylecarbs@gmail.com>
Two changes:
1. **Gate subagent tools behind `!chat.ParentChatID.Valid`** so child
agents never receive `spawn_agent`, `wait_agent`, `message_agent`, or
`close_agent`. Previously all 4 tools were given to every chat.
`spawn_agent` would fail at runtime ("delegated chats cannot create
child subagents") but the other 3 had no guard at all — meaning a child
could theoretically operate on sibling chats. Removing the tools
entirely is cleaner and saves context window.
2. **Rewrite tool descriptions to explain *when* to use them**, not just
what they do. `spawn_agent` now says to use it for clearly scoped,
independent, self-contained tasks (e.g. fixing a specific bug, writing a
single module, running a migration) and explicitly says *not* to use it
for simple operations you can handle with
`execute`/`read_file`/`write_file`. It also states that child agents
cannot spawn their own subagents. The other 3 tools get similar
guidance-oriented descriptions.
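The gating in change 1 amounts to a single check when the tool list is assembled. The sketch below is illustrative only: `nullID`, `chat`, and `toolsForChat` are hypothetical stand-ins, and the base tool list simply reuses the tool names mentioned above.

```go
package main

// nullID is a stand-in for a nullable parent chat ID.
type nullID struct {
	ID    string
	Valid bool
}

// chat carries only the field relevant to this sketch.
type chat struct {
	ParentChatID nullID
}

var baseTools = []string{"execute", "read_file", "write_file"}

var subagentTools = []string{"spawn_agent", "wait_agent", "message_agent", "close_agent"}

// toolsForChat returns the tools offered to a chat. Child chats, which have
// a valid parent ID, never receive the subagent tools.
func toolsForChat(c chat) []string {
	tools := append([]string{}, baseTools...)
	if !c.ParentChatID.Valid {
		tools = append(tools, subagentTools...)
	}
	return tools
}
```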
Co-authored-by: Coder <coder@users.noreply.github.com>
The shimmer component has an infinitely repeating animation that causes
Chromatic snapshot diffs on every run. Adding `data-chromatic="ignore"`
to prevent false positives, consistent with how other animated
components in the codebase handle this (e.g. `Spinner`, `Alert`,
`SyntaxHighlighter`).
Co-authored-by: Coder <coder@users.noreply.github.com>
## Summary
Fixes four frontend↔backend discrepancies in chat stream state
management that could cause duplicate content, UI flicker, and stale
stream state.
### Backend fixes (`coderd/chatd/chatd.go`)
**1. No-pubsub path double-replayed message_part events**
`Subscribe()` built an `initialSnapshot` containing `message_part`
events from `localSnapshot`, then the no-pubsub goroutine replayed the
same `localSnapshot` into the `mergedEvents` channel. Since `streamChat`
sends the snapshot first then reads the channel, the frontend received
every `message_part` twice. `applyMessagePartToStreamState` doesn't
deduplicate — text gets concatenated, so content appeared doubled.
Fix: Only forward live `localParts` in the no-pubsub goroutine; the
snapshot already contains the historical events.
**2. Snapshot missing status event**
The initial snapshot never included a `status` event. The frontend's
`shouldApplyMessagePart()` gates on status (`pending`/`waiting`), but
the initial status came from a separate REST query via `useEffect`.
During the race window between snapshot arrival and REST resolution,
`message_part` events could be incorrectly accepted or rejected.
Fix: Prepend a `status` event to the snapshot after loading the chat
from DB, so the frontend has the authoritative status from the very
first batch.
### Frontend fixes (`ChatContext.ts`)
**3. Scheduled stream reset not canceled by subsequent message_parts**
When a `message` event arrived, `scheduleStreamReset()` queued
`clearStreamState` via `requestAnimationFrame`. If new `message_part`
events arrived in the next WebSocket frame before the rAF fired, they
were pushed to `pendingMessageParts` without canceling the scheduled
reset. The rAF would fire between frames, clearing stream state, then
the next flush would re-populate it — causing a visible flash.
Fix: Call `cancelScheduledStreamReset()` when accumulating
`message_part` events.
**4. startTransition race with synchronous clearStreamState**
`flushMessageParts` wrapped `applyMessageParts` in `startTransition`,
which React can defer. If a `status: "waiting"` event arrived in the
same batch after `message_part` events, the status handler cleared
stream state synchronously, but the deferred `applyMessageParts`
callback could fire afterward and re-populate it.
Fix: Re-check `shouldApplyMessagePart()` inside the `startTransition`
callback at execution time.
### Tests added
- **Go**: `TestSubscribeSnapshotIncludesStatusEvent` — asserts the first
snapshot event is a status event
- **Go**: `TestSubscribeNoPubsubNoDuplicateMessageParts` — asserts the
events channel doesn't replay snapshot events
- **TS**: `cancels scheduled stream reset when message_part arrives
after message` — verifies stream state survives a [message,
message_part] batch
- **TS**: `does not apply message parts after status changes to waiting`
— verifies deferred applyMessageParts respects status transitions
## Summary
Adds a new agent-side process management HTTP API and rewrites the chat
execute tool to use it instead of SSH sessions.
## What changed
### New agent/agentproc/ package
- **headtail.go** — Thread-safe io.Writer with bounded memory (16KB head
+ 16KB tail ring buffer). Provides LLM-ready output with truncation
metadata and long-line truncation at 2048 bytes.
- **headtail_test.go** — 16 tests including race detector coverage for
concurrent writes.
- **process.go** — Manager + Process types for lifecycle management
using agentexec.Execer for proper OOM/nice scores.
- **api.go** — HTTP API following the agentfiles chi router pattern. 4
endpoints: start, list, output, signal.
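To illustrate the bounded head/tail idea behind `headtail.go`, here is a simplified sketch. It is not the package's implementation: the tail is kept by trimming a slice rather than with a true ring buffer, long-line truncation is omitted, and the names `headTail`, `newHeadTail`, and `Snapshot` are hypothetical.

```go
package main

import "sync"

// headTail keeps at most max bytes from the start of the stream and at most
// max bytes from the end, counting everything written in between.
type headTail struct {
	mu    sync.Mutex
	max   int
	head  []byte
	tail  []byte
	total int
}

func newHeadTail(max int) *headTail {
	return &headTail{max: max}
}

// Write implements io.Writer and is safe for concurrent use.
func (b *headTail) Write(p []byte) (int, error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	n := len(p)
	b.total += n
	// Fill the head first.
	if room := b.max - len(b.head); room > 0 {
		take := room
		if take > len(p) {
			take = len(p)
		}
		b.head = append(b.head, p[:take]...)
		p = p[take:]
	}
	// Everything else goes to the tail, keeping only the last max bytes.
	b.tail = append(b.tail, p...)
	if over := len(b.tail) - b.max; over > 0 {
		copy(b.tail, b.tail[over:])
		b.tail = b.tail[:b.max]
	}
	return n, nil
}

// Snapshot returns the retained head and tail plus the number of bytes
// dropped between them.
func (b *headTail) Snapshot() (head, tail string, omitted int) {
	b.mu.Lock()
	defer b.mu.Unlock()
	return string(b.head), string(b.tail), b.total - len(b.head) - len(b.tail)
}
```

With `max` set to 16KB this gives the 32KB cap mentioned for the execute tool, and the `omitted` count is the kind of truncation metadata an LLM-facing response could report.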
### Agent wiring (agent/agent.go, agent/api.go)
Mounts the process API at /api/v0/processes, mirroring how agentfiles is
mounted.
### SDK (codersdk/workspacesdk/agentconn.go)
4 new AgentConn interface methods + 7 request/response types:
- StartProcess, ListProcesses, ProcessOutput, SignalProcess
### Execute tool rewrite (coderd/chatd/chattool/execute.go)
- Transport: SSH sessions replaced by the Agent API (conn.StartProcess() + conn.ProcessOutput() polling)
- New parameters: workdir, run_in_background
- Structured response: success, exit_code, wall_duration_ms, error,
truncated, note, background_process_id
- Non-interactive env vars: GIT_EDITOR=true, TERM=dumb, NO_COLOR=1,
PAGER=cat, etc.
- Output truncation: HeadTailBuffer caps at 32KB for LLM consumption
- File-dump detection with advisory notes suggesting read_file
- Default timeout: reduced from 60s to 10s
- Foreground polling: 200ms intervals until exit or timeout
## Architecture
State lives on the agent, surviving coderd failover and instance
changes. Any coderd replica can query any agent via HTTP over tailnet.
Adds a nullable `last_error` column to the `chats` table so error
reasons survive page reloads.
**Backend:**
- Migration adds `last_error TEXT` (nullable) to chats
- `UpdateChatStatus` writes the error reason when status transitions to
`error`, clears it (NULL) on recovery
- `convertChat` maps `sql.NullString` to `*string` in the SDK
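A small sketch of the nullable mapping, assuming the conversion is a plain validity check. `nullStringPtr` and `lastErrorForStatus` are hypothetical helper names, not functions from the codebase.

```go
package main

import "database/sql"

// lastErrorForStatus returns the value to store in last_error: the reason
// when the chat enters the error status, NULL otherwise.
func lastErrorForStatus(status, reason string) sql.NullString {
	if status == "error" {
		return sql.NullString{String: reason, Valid: true}
	}
	return sql.NullString{}
}

// nullStringPtr maps a nullable column to an optional SDK field.
func nullStringPtr(ns sql.NullString) *string {
	if !ns.Valid {
		return nil
	}
	s := ns.String
	return &s
}
```

A nil pointer in the SDK type lets the frontend distinguish "no error recorded" from an empty error string.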
**Frontend:**
- Sidebar falls back to `chat.last_error` when no stream error reason is
cached
- Chat detail page does the same for `persistedErrorReason`
- Fixtures updated for new required field
Replaces the hand-rolled LCS diffing in `buildEditDiff` and the
manual patch-string assembly in `buildWriteFileDiff` with
[`Diff.createPatch()`](https://www.npmjs.com/package/diff) from the
`diff` npm package.
Both functions now just call `Diff.createPatch()` and feed the result
straight into `parsePatchFiles()`, removing all the manual line
splitting, prefix tagging, hunk-header arithmetic, and trailing-newline
cleanup.
### Changes
- Add `diff` as a dependency
- `buildWriteFileDiff`: replaced ~20 lines of manual patch assembly
with a single `Diff.createPatch()` call
- `buildEditDiff`: replaced ~60 lines (line splitting, `Diff.diffLines`
→ prefixed strings, hunk counting) with a `Diff.createPatch()` call
per edit
- Removed the `chunkLines` helper and the `diffLines` wrapper +
its test block
Net: +21 / -157 lines across source and tests.
The diff view on the `/agents` page had no way to handle lines wider
than the panel. The `@pierre/diffs` library supports an `overflow`
option — switching it from `"scroll"` (the shared default) to `"wrap"`
for the side panel makes long lines wrap naturally instead of being
clipped.
Also adds a long import line to the Storybook sample diff so the
wrapping behavior is easy to verify visually.
## Summary
Adds a typed-confirmation step before deleting a deployment license to
reduce accidental removals.
<img width="457" height="440" alt="Screenshot 2026-02-13 at 15 31 58"
src="https://github.com/user-attachments/assets/b13320a7-4b10-43fa-ab01-56f3284435b6"
/>
## Changes
- Swapped the license removal dialog from `ConfirmDialog` to
`DeleteDialog`, requiring the admin to type the license ID before
enabling **Remove**.
- Added interaction coverage to verify the confirmation guard.
TemplateVersionEditorPage tests have been flaking since I ported them to
vitest in 99a4ecd. Turns out our test timeout on jest is 20s (presumably
for these sorts of page-level journey tests). I kinda like the current
5s timeout as it forces us to write speedy tests, but I think in this
case it's unavoidable and makes sense to lengthen the timeout just for
these tests.
Hopefully fixes coder/internal#1369
You may want the whitespaceless diff here:
https://github.com/coder/coder/pull/22412/changes?w=1
## Summary
Adds a new `diff_status_change` event kind to the `/chats/watch` pubsub
stream so the sidebar can update diff status (PR created, files changed,
branch info) without a full page reload.
### Problem
When a chat's diff status changes (e.g. PR created via GitHub, git
branch pushed), the sidebar didn't update because:
1. The backend `publishChatPubsubEvent` didn't include diff status data
2. The frontend watch handler only merged `status`, `title`, and
`updated_at` from events
### Solution
A **notify-only** approach: a new `ChatEventKindDiffStatusChange` event
kind tells the frontend "diff status changed for chat X" — the frontend
then invalidates the relevant React Query cache entries to re-fetch.
### Backend changes
- **`coderd/pubsub/chatevent.go`**: New `ChatEventKindDiffStatusChange =
"diff_status_change"` constant
- **`coderd/chatd/chatd.go`**: New `PublishDiffStatusChange(ctx,
chatID)` method on `Server`
- **`coderd/chats.go`**: New `publishChatDiffStatusEvent` helper.
Published from:
- `refreshWorkspaceChatDiffStatuses` — after each chat's diff status is
refreshed via GitHub API
- `storeChatGitRef` — after persisting git branch/origin info from
workspace agent
### Frontend changes
- **`AgentsPage.tsx`**: Handle `diff_status_change` event by
invalidating `chatDiffStatusKey` and `chatDiffContentsKey` queries
- **`ChatContext.ts`**: Remove redundant diff status invalidation that
fired on every chat status change (the new event kind handles this
properly)
## Problem
When sending a message in the agent detail chat, the text lingered in
the input textarea while the HTTP POST round-tripped to the server. Only
after the server responded did the input clear and the message appear in
the timeline (via WebSocket). This created a noticeable delay where the
user couldn't start typing their next message.
## Solution
**Optimistic input clear** (`AgentChatInput.tsx`):
- Clear the textarea and editing state *immediately* on submit, before
awaiting the network call.
- Capture the input text beforehand so it can be restored in the `catch`
block if the request fails.
**Optimistic user bubble** (`AgentDetail.tsx`):
- Inject a temporary `ChatMessage` (with a negative ID) into the chat
store so the user's message bubble appears in the timeline instantly.
- Set chat status to `pending` and clear stream state, mirroring the
existing edit-message path.
- On error, roll back: remove the optimistic message and restore the
previous chat status.
The real message arrives via the WebSocket stream, and
`upsertDurableMessage` inserts it alongside the optimistic entry because
the server message has a positive ID. The optimistic negative-ID message
is then cleaned up when `replaceMessages` is called with the
authoritative message list from the next query invalidation.
## Testing
- Type a message and press Enter — input clears and bubble appears
immediately.
- Simulate a network error — input text is restored, optimistic bubble
is removed.
- Edit an existing message — unchanged behavior (already had optimistic
updates).
- Queue a message while streaming — unchanged behavior.
Adds two keyboard shortcuts to the agents page:
- **Escape** — Interrupts the running agent when viewing a chat detail
page. Only fires when focus is outside text inputs/textareas so it
doesn't conflict with the existing edit-cancel Escape handler in the
chat input.
- **Ctrl+N / Cmd+N** — Navigates to create a new agent. Also skipped
when focus is in a text input/textarea.
Both keybindings are implemented in a new `useAgentsPageKeybindings.ts`
hook file:
- `useAgentsPageKeybindings` — used in `AgentsPage.tsx` for Ctrl+N
- `useAgentDetailKeybindings` — used in `AgentDetail.tsx` for Escape →
interrupt
## Summary
The UI has always labeled the action as "Archive agent" but the backend
was performing a hard `DELETE`, permanently destroying chats and all
their messages.
This change replaces the hard delete with a soft archive, consistent
with the pattern used by template versions.
## Changes
### Database
- **Migration 000423**: Add `archived boolean DEFAULT false NOT NULL`
column to `chats` table
- Replace `DeleteChatByID` query with `ArchiveChatByID` (`UPDATE SET
archived = true`)
- Add `UnarchiveChatByID` query (`UPDATE SET archived = false`)
- Filter archived chats from `GetChatsByOwnerID` (`WHERE archived =
false`)
### API
- Remove `DELETE /api/experimental/chats/{chat}`
- Add `POST /api/experimental/chats/{chat}/archive` — archives a chat
and all its descendants
- Add `POST /api/experimental/chats/{chat}/unarchive` — unarchives a
single chat (API only, no UI yet)
### Backend
- `archiveChatTree()` recursively archives child chats (replaces
`deleteChatTree()` which hard-deleted)
- Chat daemon's `ArchiveChat()` archives the full chat tree in a
transaction
- Authorization uses `ActionUpdate` instead of `ActionDelete`
### SDK
- Replace `DeleteChat()` with `ArchiveChat()` and `UnarchiveChat()`
- Add `Archived` field to `Chat` struct
### Frontend
- `archiveChat` API call uses `POST .../archive` instead of `DELETE`
- No UI changes — the "Archive agent" button now actually archives
instead of deleting
## Design Decision
This follows the **template version archive pattern** (Pattern B in the
codebase):
- `archived boolean` column (not `deleted boolean`)
- Dedicated `POST .../archive` and `POST .../unarchive` routes (not
repurposing `DELETE`)
- Reversible — users can unarchive via the API (UI for this will come
later)
## Problem
`resolveChatGitHubAccessToken` reads the `OAuthAccessToken` directly
from the database without refreshing it. When the token expires, GitHub
returns "bad credentials" and the chat diff features break.
## Fix
Call `config.RefreshToken()` before returning the token — the same code
path used by `provisionerdserver` when handing tokens to provisioners.
- Builds a map of provider ID → `*externalauth.Config` during the
existing config iteration
- After fetching the `ExternalAuthLink` from the DB, calls
`cfg.RefreshToken()` if a matching config exists
- On refresh failure, falls through to the existing token (GitHub tokens
without expiry still work) with a debug log
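The refresh-then-fall-back flow can be sketched as follows. This is a deliberately simplified illustration: `tokenRefresher` and its `Refresh` method are hypothetical stand-ins for `*externalauth.Config` and `RefreshToken()`, whose real signature is not shown in this description, and the debug log is reduced to a comment.

```go
package main

// tokenRefresher is a hypothetical, simplified stand-in for an external
// auth config that can refresh a stored token.
type tokenRefresher interface {
	Refresh(current string) (string, error)
}

// refreshFunc adapts a plain function to tokenRefresher.
type refreshFunc func(current string) (string, error)

func (f refreshFunc) Refresh(current string) (string, error) { return f(current) }

// refreshError is a minimal error type used to simulate refresh failures.
type refreshError string

func (e refreshError) Error() string { return string(e) }

// resolveAccessToken refreshes the stored token when a matching config
// exists and falls through to the stored token otherwise.
func resolveAccessToken(configs map[string]tokenRefresher, providerID, stored string) string {
	cfg, ok := configs[providerID]
	if !ok {
		return stored
	}
	fresh, err := cfg.Refresh(stored)
	if err != nil {
		// The real code logs at debug level here. Tokens without an
		// expiry may still be valid, so the stored token is returned.
		return stored
	}
	return fresh
}
```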
## Problem
Context compaction in chatd persisted durable messages for the
`chat_summarized` tool call and result via `publishMessage`, but never
published `message_part` streaming events via `publishMessagePart`. This
meant connected clients had no streaming representation of the
compaction.
The client's `streamState` (built entirely from `message_part` events in
`streamState.ts`) never saw the compaction tool call, so:
- No **"Summarizing..."** running state was shown to the user during
summary generation (which can take up to 90s).
- The durable `message` events arrived after or interleaved with the
`status: waiting` event, so the tool jumped straight to "Summarized" and
the chat appeared to simply stop.
## Fix
### 1. `CompactionOptions.OnStart` callback (chatloop)
Added an `OnStart` callback to `CompactionOptions`, called in
`maybeCompact` right before `generateCompactionSummary` (the slow LLM
call). This gives `chatd` a hook to publish the tool-call `message_part`
immediately when compaction begins.
### 2. Tool-result streaming part (chatd)
`persistChatContextSummary` now publishes a tool-result `message_part`
before the durable `message` events, so clients transition from
"Summarizing..." to "Summarized" before the status change arrives.
### Event ordering is now:
1. `message_part` (tool call via `OnStart`) — client shows
"Summarizing..."
2. LLM generates summary (up to 90s)
3. `message_part` (tool result) — client shows "Summarized" in stream
state
4. `message` (assistant) — durable message persisted, stream state
resets
5. `message` (tool) — durable tool result persisted
6. `status: waiting` — chat transitions to idle
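The callback ordering can be sketched like this. Only `OnStart` and `maybeCompact` are named in the description above; the other fields, the lowercase `compactionOptions` type, and the threshold comparison are assumptions made for the example.

```go
package main

// compactionOptions is a hypothetical, reduced version of CompactionOptions.
type compactionOptions struct {
	Threshold float64              // fraction of the context window that triggers compaction
	OnStart   func()               // called right before the slow summary generation
	Generate  func() string        // stands in for generateCompactionSummary
	Persist   func(summary string) // stands in for persisting the summary
}

// maybeCompact runs compaction when usage reaches the threshold and reports
// whether it did. OnStart fires before Generate so callers can publish a
// "running" tool-call part while the summary is being produced.
func maybeCompact(usage float64, opts compactionOptions) bool {
	if usage < opts.Threshold {
		return false
	}
	if opts.OnStart != nil {
		opts.OnStart()
	}
	summary := opts.Generate()
	opts.Persist(summary)
	return true
}
```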
## Tests
- **`OnStartFiresBeforePersist`**: Verifies callback ordering is
`on_start` → `generate` → `persist`.
- **`OnStartNotCalledBelowThreshold`**: Verifies `OnStart` is not called
when context usage is below the compaction threshold.
## Problem
The `update workspace, new required, mutable parameter added` e2e test
has been flaking consistently
([internal#1328](https://github.com/coder/internal/issues/1328)). The
error:
```
Error: Timed out 5000ms waiting for expect(locator).toHaveValue(expected)
Locator: getByTestId('parameter-field-Sixth parameter').locator('input')
Expected string: "99"
Received string: ""
```
## Root Cause
A race between page navigation and data hydration in `verifyParameters`:
1. The page navigates with `waitUntil: "domcontentloaded"` which does
not wait for API responses to settle
2. React Query may serve stale cached workspace data initially (from
before the update), causing the form to render with empty/old parameter
values
3. The `toHaveValue` assertion uses the default `actionTimeout` of
5000ms which isn't enough time for fresh data to arrive and the form to
re-render
## Fix
- Switch `verifyParameters` navigation to `waitUntil: "networkidle"` to
ensure API responses (workspace data, build parameters) are settled
before the form renders
- Increase the `toHaveValue` timeout to 15s to handle cases where
dynamic parameters hydrate slowly after initial render
Fixes coder/internal#1328
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
## Problem
When switching between chats on the agents page, stream parts could be
lost or applied to the wrong chat due to several race conditions in
`ChatContext.ts`:
1. **`startTransition` deferred parts escape cleanup** —
`startTransition(() => store.applyMessageParts(parts))` defers the state
update. If a chat switch happens between `flushMessageParts` being
called and the transition executing, old-chat parts could apply after
`resetTransientState()` has already cleared stream state for the new
chat.
2. **`message` event has no `chat_id` filter** — Unlike `message_part`,
`queue_update`, and `status` events, the `message` event handler did not
check `streamEvent.chat_id`. While the server-scoped WebSocket makes
this safe in practice, it's an inconsistency in defensive programming.
3. **Brief stale message window on switch** — Between `chatID` changing
and `replaceMessages()` firing (after the query resolves), the store
held old-chat messages while the new WebSocket was already connected.
## Changes
### `ChatContext.ts`
- Added `activeChatIDRef` to track the currently active chat ID
- Guard `startTransition` callback: check `activeChatIDRef` before
applying message parts, discarding them if the chat has switched
- Added `chat_id` filter to `message` event handler, matching the
pattern used by all other event types
- Added `store.replaceMessages([])` to the chatID-change effect so
messages are cleared immediately on switch
### `ChatContext.test.tsx`
Four new tests covering the chat-switch lifecycle:
- WebSocket closure and state reset when chatID changes
- `message` event filtering by `chat_id`
- `startTransition` deferred parts discarded after switch
- Messages cleared immediately before new query resolves
All 13 tests pass (8 existing + 4 new + 1 existing).
## Problem
Non-admin users of the Agents (chat) feature send `model_config_id:
"00000000-0000-0000-0000-000000000000"` (nil UUID) when creating chats,
because the `GET /api/experimental/chats/model-configs` endpoint
requires `policy.ActionRead` on `rbac.ResourceDeploymentConfig`, which
is only granted to admins.
The flow:
1. `AgentsPage.tsx` calls `useQuery(chatModelConfigs())` → hits
`listChatModelConfigs`
2. Non-admin users get a **403 Forbidden** response
3. `chatModelConfigsQuery.data` is `undefined`, so the
`modelConfigIDByModelID` map is empty
4. `handleCreateChat` falls back to `nilUUID` for `model_config_id`
5. The backend rejects the nil UUID: `"Invalid model config ID."`
## Fix
Changed `listChatModelConfigs` to allow all authenticated users to read
model configs:
- **Admin users** continue to see all configs (including disabled ones)
for management via `GetChatModelConfigs`
- **Non-admin users** now see only enabled configs via
`GetEnabledChatModelConfigs` with a system context, which is sufficient
for using the chat feature
This follows the same pattern as `listChatModels`, which already uses
`dbauthz.AsSystemRestricted(ctx)` to allow all authenticated users to
see available models.
Write endpoints (create/update/delete) retain their existing
`ResourceDeploymentConfig` authorization.
## Testing
- Updated `TestListChatModelConfigs/ForbiddenForOrganizationMember` →
`SuccessForOrganizationMember` to verify non-admin users can list
enabled model configs
- All existing chat tests continue to pass
## Problem
When coderd instances are redeployed (e.g. rolling deployment on
dogfood), in-flight chats get stuck in `running` status permanently. The
UI shows them as "thinking" with a spinning indicator, but no worker is
actually processing them. They never error or resume.
## Root Cause
Two bugs combine to cause this:
### Bug 1: Shutdown cleanup uses a canceled context
The `processChat` defer block updates the chat status in the DB when
processing completes. But it uses `ctx`, which `Close()` cancels
*before* the defer runs. The DB transaction silently fails with
`context.Canceled`, leaving the chat in `status=running` with a dead
`worker_id`.
```go
// Close() calls p.cancel() which cancels ctx
// Then the defer tries to use the now-canceled ctx:
defer func() {
err := p.db.InTx(func(tx database.Store) error {
tx.GetChatByIDForUpdate(ctx, chat.ID) // FAILS
tx.UpdateChatStatus(ctx, ...) // FAILS
}, nil)
}()
```
### Bug 2: Stale recovery runs only once at startup
`recoverStaleChats()` was called only once in `start()`, not
periodically. During a rolling deployment, the new instance starts while
the old one is still alive (fresh heartbeat). By the time the old
instance crashes, no one checks again.
## Fix
1. **Use `context.WithoutCancel(ctx)` in the processChat defer** — the
cleanup transaction now completes even during graceful shutdown.
2. **Run `recoverStaleChats` periodically** — a second ticker in the
`start()` loop checks for stale chats at `inFlightChatStaleAfter / 5`
intervals (default: every 1 minute). This catches orphaned chats even
when the instance that owns them crashes without clean shutdown.
## Tests
- `TestRecoverStaleChatsPeriodically` — Verifies chats orphaned *after*
startup are recovered by the periodic loop (not just the startup check).
- `TestNewReplicaRecoversStaleChatFromDeadReplica` — Verifies a new
replica recovers stale chats on startup.
- `TestWaitingChatsAreNotRecoveredAsStale` — Negative test: `waiting`
chats are not incorrectly modified by recovery.
## Problem
The git diff on the `/agents` page had color issues: the editor
background followed light mode but the syntax highlighting used dark
mode (`github-dark-high-contrast`), and the filename header used
light-colored text on a light background.
The root cause was hardcoded dark theme options in the `FileDiff`
component:
```tsx
themeType: "dark",
theme: "github-dark-high-contrast",
```
## Fix
Uses the same theme-aware pattern as every other diff/file viewer in the
codebase (`WriteFileTool`, `EditFilesTool`, `ReadFileTool`, `Tool`,
`response.tsx`):
1. `useTheme()` from `@emotion/react` to read `palette.mode`
2. `getDiffViewerOptions(isDark)` from the shared `utils.ts` module —
returns `github-light` theme for light mode, `github-dark-high-contrast`
for dark mode
3. Reuses `DIFFS_FONT_STYLE` and `diffViewerCSS` constants instead of
inlining duplicates
## Storybook coverage
Added four new stories with real unified diff content:
- **WithDiffDark** — dark mode with a PR link
- **WithDiffLight** — light mode with a PR link
- **NoPullRequestDark** — dark mode, "Files Changed" header
- **NoPullRequestLight** — light mode, "Files Changed" header
The existing stories only covered empty and parse-error states with no
rendered diff.
## Summary
Adds a new line-based file reading endpoint to the workspace agent,
replacing the unbounded byte-based approach for the `read_file` chat
tool and `coder_workspace_read_file` MCP tool.
**Problem**: The current `read_file` tool returns the entire file
contents with no limits, which can blow up LLM context windows and cause
OOM issues with large files.
**Solution**: Inspired by [`coder/mux`](https://github.com/coder/mux)
and [`openai/codex`](https://github.com/openai/codex), implement a
line-based reader with safety limits.
## Changes
### Agent (`agent/agentfiles/`)
- New `/read-file-lines` endpoint with `HandleReadFileLines` handler
- Line-based `offset` (1-based line number, default: 1) and `limit`
(line count, default: 2000)
- Safety constants:
| Constant | Value | Purpose |
|---|---|---|
| `MaxFileSize` | 1 MB | Reject files larger than this at stat |
| `MaxLineBytes` | 1,024 | Per-line truncation with `... [truncated]` marker |
| `MaxResponseLines` | 2,000 | Max lines per response |
| `MaxResponseBytes` | 32 KB | Max total response size |
| `DefaultLineLimit` | 2,000 | Default when no limit specified |
- Line numbering format: `1\tcontent` (tab-separated)
- Structured JSON response: `{ success, file_size, total_lines,
lines_read, content, error }`
- Hard errors when limits exceeded — tells the LLM to use
`offset`/`limit`
- Existing byte-based `/read-file` endpoint preserved (used by
`instruction.go`)
### SDK (`codersdk/workspacesdk/`)
- `ReadFileLinesResponse` type added
- `ReadFileLines` method added to `AgentConn` interface
- Mock regenerated
### Chat tool (`coderd/chatd/chattool/`)
- `read_file` tool now uses `conn.ReadFileLines()` instead of
`conn.ReadFile()`
- Updated tool description to document line-based parameters
- Response includes `file_size`, `total_lines`, `lines_read` metadata
### MCP tool (`codersdk/toolsdk/`)
- `coder_workspace_read_file` updated to use line-based reading
- Schema descriptions updated for line-based offset/limit
- Removed `maxFileLimit` constant (agent handles limits now)
### Tests
- 13 new test cases for `TestReadFileLines`:
- Path validation (empty, relative, non-existent, directory, no
permissions)
- Empty file handling
- Basic read, offset, limit, offset+limit combinations
- Offset beyond file length
- Long line truncation (>1024 bytes)
- Large file rejection (>1MB)
- All existing tests pass unchanged
## Design decisions
| Decision | Rationale |
|---|---|
| Line-based, not byte-based | Both coder/mux and openai/codex use line-based reads, which matches how LLMs reason about code |
| Default limit of 2000 | Matches codex; prevents accidental full-file dumps while being generous |
| 32 KB response cap | Compromise between mux (16 KB) and codex (no cap) |
| 1024 byte/line truncation with marker | More generous than codex (500), marker helps LLM know data is missing |
| Hard errors on overflow | Matches mux; forces LLM to paginate rather than getting partial data |
| Preserve byte-based endpoint | `instruction.go` needs raw byte access for AGENTS.md |
## Problem
Chat titles revert to the fallback truncated title after briefly showing
the AI-generated title. Even reloading the page doesn't help — the
correct title flashes then gets overwritten.
## Root Cause
Single bug, two symptoms.
In `processChat` (`coderd/chatd/chatd.go`), the `chat` variable is
passed by value. The flow:
1. `processChat(ctx, chat)` receives `chat` with the initial fallback
title (truncated first message).
2. Inside `runChat`, `maybeGenerateChatTitle` generates an AI title,
writes it to the DB via `UpdateChatByID`, and publishes a `title_change`
event. **The DB has the correct title.** The client briefly displays it.
3. `runChat` returns. The **deferred cleanup** in `processChat`
publishes `publishChatPubsubEvent(chat, StatusChange)` — but `chat` here
is the original value copy that still has the **old fallback title**.
4. The frontend receives the `status_change` SSE event and
**unconditionally applies `title` from every event kind** (see
`AgentsPage.tsx` line ~305: `title: updatedChat.title`). This overwrites
the correct AI title with the stale fallback.
**Why reload doesn't help:** If the chat is still processing when the
page reloads, `listChats` loads the correct title from the DB, but then
the deferred `status_change` event arrives moments later and clobbers
it. The title was always in the DB — it was the pubsub event that kept
overwriting it.
## Fix
Re-read the chat from the database in the deferred cleanup before
publishing the final `status_change` event, so it carries the current
(AI-generated) title.
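The pass-by-value problem and the re-read fix can be reproduced in a few lines. The types and functions below (`chat`, `store`, `run`, `process`) are hypothetical simplifications, not the real `chatd` code.

```go
package main

import "fmt"

type chat struct {
	ID    int
	Title string
}

// store is an in-memory stand-in for the database.
type store map[int]chat

// run simulates title generation: the new title is written to the
// store, but the caller's copy of the chat is left untouched.
func run(db store, c chat) {
	c.Title = "AI-generated title"
	db[c.ID] = c
}

// process returns the title that the deferred cleanup would publish.
func process(db store, c chat, reread bool) (published string) {
	defer func() {
		if reread {
			c = db[c.ID] // refresh the stale copy before publishing
		}
		published = c.Title
	}()
	run(db, c)
	return ""
}

func main() {
	db := store{1: {ID: 1, Title: "fallback"}}
	fmt.Println(process(db, db[1], false)) // stale copy
	db[1] = chat{ID: 1, Title: "fallback"}
	fmt.Println(process(db, db[1], true)) // re-read copy
}
```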
When navigating to a specific agent on the Agents page, the browser tab
title now reflects the agent's chat title (e.g. `Fix login bug - Agents
- Coder`). When the title hasn't loaded yet or when navigating away, it
falls back to `Agents - Coder`.
**Changes:**
- Added a `useEffect` in `AgentDetail` that sets `document.title` via
the existing `pageTitle` utility whenever the chat title changes.
- The cleanup function resets the title back to `Agents - Coder` when
unmounting (navigating away from the agent).
When injecting system instructions into the chat prompt, include:
1. **Operating system** and **working directory** from the
`workspace_agents` table
2. **Home-level instructions** from `~/.coder/AGENTS.md` (existing
behavior)
3. **Project-level instructions** from `<pwd>/AGENTS.md` (new)
The XML tag is renamed from `<coder-home-instructions>` to
`<system-instructions>` since it now carries more than just the home
instruction file.
### Example output (both files present)
```xml
<system-instructions>
Operating System: linux
Working Directory: /home/coder/coder
Source: /home/coder/.coder/AGENTS.md
... home instructions ...
Source: /home/coder/coder/AGENTS.md
... project instructions ...
</system-instructions>
```
### Example output (no AGENTS.md files)
```xml
<system-instructions>
Operating System: linux
Working Directory: /home/coder/coder
</system-instructions>
```
### Changes
- **`coderd/chatd/instruction.go`**:
- Renamed types: `homeInstructionContext` → `agentContext`, added
`instructionFile` struct
- Extracted `readInstructionFileAtPath` shared helper
- Added `readWorkingDirectoryInstructionFile` to read `<pwd>/AGENTS.md`
- Replaced `formatHomeInstruction` with `formatInstructions` that
renders both files under `<system-instructions>`
- **`coderd/chatd/chatd.go`**:
- Renamed `resolveHomeInstruction` → `resolveInstructions`; now reads
both home and pwd instruction files
- `resolveAgentContext` returns `agentContext` (renamed from
`homeInstructionContext`)
- pwd file read is skipped gracefully if directory is empty or file
doesn't exist
- **`coderd/chatd/instruction_test.go`**:
- Added `TestReadWorkingDirectoryInstructionFile` (success, not-found,
empty-directory)
- Replaced `TestFormatHomeInstruction` with `TestFormatInstructions`
covering all combinations
- Added ordering test (`AgentContextBeforeFiles`) to verify OS/pwd
appear before file sources
## Summary
The `chattool` `list_templates` tool previously returned all templates
in a single response with no popularity signal. On deployments with many
templates (e.g. 71 on dogfood), this wastes tokens and makes it hard for
the AI to pick the right template for broad user questions.
## Changes
Single file: `coderd/chatd/chattool/listtemplates.go`
- **`page` parameter** — optional, 1-indexed, 10 results per page
- **Popularity sort** — queries
`GetWorkspaceUniqueOwnerCountByTemplateIDs` to get active developer
counts, then sorts descending (most popular first). The DB query returns
templates alphabetically, so this explicit sort is needed.
- **`active_developers`** — included on each template item so the agent
can see the signal
- **Pagination metadata** — `page`, `total_pages`, `total_count` in the
response so the agent knows there are more results
- **Updated tool description** — tells the agent that results are
ordered by popularity and paginated
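As a rough illustration of the sort-then-paginate logic described above (not the actual `listtemplates.go` code), the sketch below uses a hypothetical `templateItem` type and a fixed page size of 10.

```go
package main

import (
	"fmt"
	"sort"
)

// templateItem is an illustrative type, not the tool's real response item.
type templateItem struct {
	Name             string
	ActiveDevelopers int
}

const pageSize = 10

// paginate sorts by popularity (descending) and returns the requested
// 1-indexed page together with the total page count.
func paginate(items []templateItem, page int) ([]templateItem, int) {
	sorted := append([]templateItem(nil), items...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].ActiveDevelopers > sorted[j].ActiveDevelopers
	})
	totalPages := (len(sorted) + pageSize - 1) / pageSize
	if page < 1 {
		page = 1
	}
	start := (page - 1) * pageSize
	if start >= len(sorted) {
		return nil, totalPages
	}
	end := start + pageSize
	if end > len(sorted) {
		end = len(sorted)
	}
	return sorted[start:end], totalPages
}

func main() {
	var items []templateItem
	for i := 0; i < 71; i++ {
		items = append(items, templateItem{Name: fmt.Sprintf("t%d", i), ActiveDevelopers: i})
	}
	first, total := paginate(items, 1)
	fmt.Println(first[0].Name, len(first), total)
}
```

With 71 templates, as in the dogfood example, this yields 8 pages, the last of which holds a single item.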
## Frontend
No frontend changes needed. The renderer already reads `rec.templates`
and `rec.count` from the response — the new fields (`page`,
`total_pages`, `total_count`) are additive and safely ignored.
When switching between chats on the agents page, the scroll position was
preserved from the previous chat instead of resetting to show the most
recent messages.
## Problem
Clicking a different chat in the sidebar loaded the new chat's messages
but kept the scroll container at whatever position the user had scrolled
to in the previous chat. This meant users often landed in the middle of
a conversation instead of at the bottom where the latest messages are.
## Fix
Added a `useEffect` in `AgentDetail` that resets `scrollTop` to `0`
whenever `agentId` changes. The scroll container uses
`flex-col-reverse`, so `scrollTop = 0` corresponds to the bottom (most
recent messages).
Fixes https://github.com/coder/coder/issues/22375
Updates `stringutil.Truncate` to properly handle multi-byte UTF-8
characters.
Adds tests for multi-byte truncation with word boundary.
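For context, one common way to truncate without splitting a multi-byte character is to count runes instead of bytes. The helper below is a hypothetical sketch of that idea; it is not the `stringutil.Truncate` implementation and does not handle word boundaries.

```go
package main

import "fmt"

// truncateRunes keeps at most n runes of s. Ranging over a string
// yields byte offsets that always fall on rune boundaries.
func truncateRunes(s string, n int) string {
	if n <= 0 {
		return ""
	}
	count := 0
	for i := range s {
		if count == n {
			return s[:i]
		}
		count++
	}
	return s
}

func main() {
	fmt.Println(truncateRunes("héllo", 2))
	fmt.Println(truncateRunes("日本語", 2))
}
```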
Created by Mux using Opus 4.6
Resolves cases where the user is entitled to AI Governance but we don't
show them the page because it's not enabled. If for some reason the user
doesn't have AI Bridge enabled anymore but still wants to access the old
logs page, they now can.
Furthermore, we link to the docs regardless of whether they have AI Bridge
enabled; this is in line with our other settings pages.
Replaces the approach in #22061 (with a cleaner `git history`)
This now ensures that we don't attempt to cause a layout shift when the
sidebars pop-in-out of existence (when scroll locking within `radix`).
This element was receiving the provisioner key daemons and then
immediately filtering them. This led to the default state being a table
with nothing rendered rather than the `<TableEmpty />` we would
expect.
<img width="1133" height="608" alt="image"
src="https://github.com/user-attachments/assets/229edb00-b108-4ec3-ac2f-33633c3e5760"
/>
Previously, auditors could view this page even though they can't update
anything. Unlike in #22382, the user is able to see all of this because
they're logged in to the application anyway, so we can simply tell them
`Sorry, no access`.
setup-go has been sporadically failing to download Go, and we were advised
by a member of the Go team that downloading Go from `storage.googleapis.com`
is not guaranteed (which is what setup-go <= v5.6.0 does).
Also remove the use-preinstalled-go optimization for Windows runners.
setup-go v6 sets GOTOOLCHAIN=local, which prevents the pre-installed
Go from auto-downloading the toolchain specified in go.mod. The windows
optimization with v5 relied on GOTOOLCHAIN=auto. setup-go uses the runner
cache, which is a different caching path but should serve the same purpose.
This change adds user-facing feedback when opening apps in a new window
fails due to popup blocking, replacing a silent no-op with a clear
recovery message. It improves reliability and supportability across
app-launch flows by helping users immediately understand and fix the
issue.
Having to reload the entire page when a template got invalidated was a
poor UX decision. We now simply refetch the data so that the update
comes across much more smoothly.
## Description
- Updates `wsbuilder` to return a `BuildError` with
`http.StatusBadRequest` to signify a "validation error" on missing or
invalid parameters
- Adds a short-circuit in `prebuilds.StoreReconciler` to mark presets
for which creating a build returns a "validation error" as "validation
failed" and skip further attempts to reconcile.
- Adds a test to verify the above
- Introduces a new Prometheus metric
`coderd_prebuilt_workspaces_preset_validation_failed` to track the above
Closes: https://github.com/coder/coder/issues/21237
---------
Co-authored-by: Cian Johnston <cian@coder.com>
State updates from setIsPublishingDialogOpen,
setLastSuccessfulPublishedVersion, and navigation were firing after
waitFor resolved, causing sporadic act() warnings and timeouts in the
publish template version tests (or so says Claude Sonnet 4.6).
Fixes coder/internal#1369
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Related to #22367
It was pointed out to me that we actually did regress this mildly by
removing a dividing line in the changes made in #22367. I've restored
it in a better way by taking advantage of `divide-y` and wrapping this
in a proper `<div />`.
<img width="332" height="385" alt="image"
src="https://github.com/user-attachments/assets/2827a9ae-7b54-4c48-aae9-2f6e965e7f8b"
/>
Switch to asserting only on the onChange spy, which is the actual
component contract being tested. Monaco's textarea value is always empty
regardless of model content, so the toHaveValue assertions were
unreliable anyway.
Fixes the new storybook test introduced in #22202
This was a bad smell that was being addressed by the frontend. This type
was generating out to be a `nil`/`null` instead of an empty `License[]`.
Now this returns as an empty array and we can actively check if we have
no licenses with a length of `0`.
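The underlying Go behavior is that `encoding/json` marshals a nil slice as `null` and an empty, non-nil slice as `[]`. A minimal demonstration, using a hypothetical `license` type:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// license is an illustrative type, not the codersdk definition.
type license struct {
	ID int `json:"id"`
}

// encode marshals the slice and returns the JSON text.
func encode(ls []license) string {
	b, err := json.Marshal(ls)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	var none []license               // nil slice
	fmt.Println(encode(none))        // null
	fmt.Println(encode([]license{})) // []
}
```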
This pull-request takes our icons shown in the sidebar tree and shows
them alongside the names of the files in the `Source Code` page of our
templates.
Also does a quick de-mui of this page.
<img width="637" height="345" alt="image"
src="https://github.com/user-attachments/assets/f3013eb6-9572-4d05-a683-10bb99b4e802"
/>
Adds a brief "Structured Logging" section to the [AI Bridge
Setup](https://coder.com/docs/ai-coder/ai-bridge/setup) page documenting
the `--aibridge-structured-logging` /
`CODER_AIBRIDGE_STRUCTURED_LOGGING` flag.
Covers:
- How to enable structured logging (CLI flag, env var, YAML)
- The five `record_type` values emitted (`interception_start`,
`interception_end`, `token_usage`, `prompt_usage`, `tool_usage`) and
their key fields
- How to filter for these records in a logging pipeline
Created on behalf of @dannykopping
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Fixes three bugs that caused `coder update` to always re-prompt for
multi-select (`list(string)`) parameters instead of reusing previous
build values:
1. **`isValidTemplateParameterOption` failed for multi-select values**
(`cli/parameterresolver.go`): It compared the entire JSON array string
(e.g. `["vim","emacs"]`) against individual option values, which never
matched. Now parses the JSON array and validates each element
separately.
2. **`RichParameter` ignored previous build value for multi-select**
(`cli/cliui/parameter.go`): The `list(string)` branch always used the
template's default value instead of the `defaultValue` argument (which
carries the previous build's value). Now uses `defaultValue` when
available, falling back to the template default.
3. **Pre-existing crash when `list(string)` has no default value**
(`cli/cliui/parameter.go`): `json.Unmarshal` on an empty string caused
`unexpected end of JSON input`. Now skips unmarshaling when the default
source is empty.
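The element-wise validation from fix 1, together with the empty-string guard from fix 3, can be sketched as follows. `validMultiSelect` is a hypothetical helper written for illustration; it is not the code in `cli/parameterresolver.go`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// validMultiSelect reports whether every element of a JSON string array
// appears in the allowed options.
func validMultiSelect(value string, options []string) bool {
	if value == "" {
		// Nothing to unmarshal; this avoids "unexpected end of JSON input".
		return false
	}
	var selected []string
	if err := json.Unmarshal([]byte(value), &selected); err != nil {
		return false
	}
	allowed := make(map[string]struct{}, len(options))
	for _, o := range options {
		allowed[o] = struct{}{}
	}
	for _, s := range selected {
		if _, ok := allowed[s]; !ok {
			return false
		}
	}
	return true
}

func main() {
	opts := []string{"vim", "emacs", "nano"}
	fmt.Println(validMultiSelect(`["vim","emacs"]`, opts))
	fmt.Println(validMultiSelect(`["vim","ed"]`, opts))
}
```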
Fixes #19956
The sonner migration (https://github.com/coder/coder/pull/22258) shows
validation errors in both the inline form field and a toast. Scoping the
assertion to the form element avoids flaky matches against the toast.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The Monaco editor wrapper was only calling `onChange` if the template
file has content, but we want to allow saving an empty file.
Fixes #19721
Claude was used to port tests from jest to vitest, and for the stories.
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Kayla はな <mckayla@hey.com>
Claude 3.5 Haiku (`claude-3-5-haiku-20241022`) was retired by Anthropic
on February 19th, 2026. Requests to this model now return errors.
Switch to Claude Haiku 4.5 (`claude-haiku-4-5`), which is the
[recommended
replacement](https://docs.anthropic.com/en/docs/resources/model-deprecations).
---
One-line change in `coderd/taskname/taskname.go` L25:
```diff
- defaultModel = anthropic.ModelClaude3_5HaikuLatest
+ defaultModel = anthropic.ModelClaudeHaiku4_5
```
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
The listen loop in workspaceAgentsExternalAuthListen compared
OAuthExpiry using == which compares `time.Time` internal struct fields
including the `*time.Location` pointer.
`time.LoadLocation` does not cache the returned `*Location` pointer, so
each lib/pq connection gets a distinct pointer for the same timezone.
When `pq.ParseTimestamp()` applies the connection's location to a parsed
timestamp, the resulting time.Time embeds that connection-specific
pointer. If the `sql.DB` pool hands out different connections for the
two GetExternalAuthLink reads, the identical timestamp produces
`time.Time` values where == returns false despite representing the same
instant. This is intermittent because the pool _usually_ reuses the same
connection for sequential queries.
This change uses `.Equal()` to compare instants regardless of location.
Also makes the test's validation call counter atomic to fix a possible
data race between the HTTP server and test goroutines.
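The difference between `==` and `.Equal()` can be reproduced without a database. In this sketch, two separately constructed `*time.Location` values for the same named zone stand in for the per-connection pointers described above; `time.FixedZone` is used here only for illustration, in place of `time.LoadLocation`.

```go
package main

import (
	"fmt"
	"time"
)

// sameInstant builds two values for one instant whose locations describe
// the same zone but are distinct pointers.
func sameInstant() (time.Time, time.Time) {
	locA := time.FixedZone("example", 3600)
	locB := time.FixedZone("example", 3600)
	a := time.Date(2026, 1, 2, 15, 4, 5, 0, locA)
	b := time.Date(2026, 1, 2, 15, 4, 5, 0, locB)
	return a, b
}

func main() {
	a, b := sameInstant()
	fmt.Println("==   :", a == b)     // compares struct fields, including the pointer
	fmt.Println("Equal:", a.Equal(b)) // compares the instant only
}
```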
Replaces our custom `<GlobalSnackbar />` (MUI Snackbar + event emitter)
with [`sonner`](https://github.com/emilkowalski/sonner). Deletes
`GlobalSnackbar/`, the custom event emitter infra, and migrates ~80
source files to `toast.success()` / `toast.error()` from `sonner`.
- ~47 error toasts now surface API error detail via
`getErrorDetail(error)` in the toast description, not just a generic
message. Coincides with #22229.
- Toast messages follow an `{Action} "{entity}" {result}.` format (e.g.
`User "alice" suspended successfully.`) since toasts persist across
navigation now.
- 17 uses of `toast.promise()` for loading → success → error lifecycle.
- Some toasts include action buttons for quick navigation (e.g. "View
task", "View template").
- Multiple toasts can stack and display simultaneously.
---------
Co-authored-by: Kayla はな <mckayla@hey.com>
This pull-request moves our baseline CSS styles from the MUI theme
(`site/src/theme/mui.ts`) definition to `index.css`. As these are global
styles they should live in one dedicated place not two.
This pull-request removes the last instance of `@mui/material/Chip` from
the codebase and removes it from our `vite.config.mts`, so we no longer
have to cache it 🙂
This pull-request implements a simple filtering logic so that we're able
to pick which model the user actually used when logs were sent to AI
Bridge.
- Add `GET /aibridge/models` API endpoint that returns distinct model
names from AI Bridge interceptions, with pagination and search support
- New `ListAIBridgeModels` SQL query using case-sensitive prefix
matching (`LIKE model || '%'`) to allow B-tree index usage
- Hand-written `ListAuthorizedAIBridgeModels` in `modelqueries.go` for
RBAC authorization filter injection
- `AIBridgeModels` search query parser in searchquery/search.go
(defaults bare terms to the `model` field)
- dbauthz wrappers, dbmetrics, and dbmock implementations for the new
query
<img width="292" height="185" alt="image"
src="https://github.com/user-attachments/assets/134771df-2d26-4c54-acc4-27f58128b351"
/>
## Description
When multiple organizations have templates with the same name, the
Prometheus `/metrics` endpoint returns HTTP 500 because Prometheus
rejects duplicate label combinations. The three `coderd_insights_*`
metrics (`coderd_insights_templates_active_users`,
`coderd_insights_applications_usage_seconds`,
`coderd_insights_parameters`) used only `template_name` as a
distinguishing label, so two templates named e.g. `"openstack-v1"` in
different orgs would produce duplicate metric series.
This adds `organization_name` as a label to all three insight metric
descriptors to disambiguate templates across organizations.
## Changes
**`coderd/prometheusmetrics/insights/metricscollector.go`**:
- Added `organization_name` label to all three metric descriptors
- Added `organizationNames` field (template ID → org name) to the
`insightsData` struct
- In `doTick`: after fetching templates, collect unique org IDs, fetch
organizations via `GetOrganizations`, and build a
template-ID-to-org-name mapping
- In `Collect()`: pass the organization name as an additional label
value in every `MustNewConstMetric` call
**`coderd/prometheusmetrics/insights/testdata/insights-metrics.json`**:
Updated golden file to include `organization_name=coder` in all metric
label keys.
Fixes #21748
- Previously, all tests shared the global `http.Transport`, so calling
`Close` in one test could close connections that were presumed idle but
belonged to other tests.
Fixes https://github.com/coder/internal/issues/112
Fixes #22030
## Problem
When a template has `require_active_version = true` and a workspace is
outdated, the web UI always shows "Update and start" as the **only**
button (for all users including admins), but `coder start` starts with
the old version. For admins, this silently succeeds on the stale
version. For non-admins, it goes through a clunky 403→retry path. This
also affects the VS Code extension, which calls `coder start --yes`
under the hood.
## Root Cause
`buildWorkspaceStartRequest()` in `cli/start.go` checks
`workspace.AutomaticUpdates == "always"` but ignores
`workspace.TemplateRequireActiveVersion`. The server-side autostart
already ORs both settings together:
```go
// coderd/autobuild/lifecycle_executor.go
func useActiveVersion(opts, ws) bool {
return opts.RequireActiveVersion || ws.AutomaticUpdates == "always"
}
```
The CLI was missing the `RequireActiveVersion` check.
## Fix
Add `workspace.TemplateRequireActiveVersion` to the existing OR
condition:
```go
// Before:
if workspace.AutomaticUpdates == codersdk.AutomaticUpdatesAlways || action == WorkspaceUpdate {
// After:
if workspace.AutomaticUpdates == codersdk.AutomaticUpdatesAlways || workspace.TemplateRequireActiveVersion || action == WorkspaceUpdate {
```
Now `coder start` and `coder restart` proactively use the active
template version when `require_active_version` is set, matching the web
UI and server autostart behavior. The 403→retry fallback remains as a
safety net but is no longer the primary path for any user.
## Testing
Updated `enterprise/cli/start_test.go` — all user types (owner, template
admin, ACL admin, group ACL admin, member) now expect the active version
when `require_active_version` is set, and verify the 403→retry message
does NOT appear.
When AgentAPI is configured, `WithTaskReporter` unconditionally
overrides all self-reported states to `working`. The intent was to
distrust the agent's `idle` and rely on the screen watcher, but the
override also blocks `failure` and `complete`, which only the agent can
produce (the screen watcher only knows `running`/`stable`). Tasks get
stuck as `working` or `null` forever.
Now only `idle` is overridden to `working`; `failure`, `complete`, and
`working` pass through as-is.
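The new rule amounts to a single mapping. The function below is a hypothetical sketch that uses plain strings for the states; it is not the `WithTaskReporter` implementation.

```go
package main

import "fmt"

// effectiveState applies the override described above: only a
// self-reported "idle" is replaced with "working".
func effectiveState(reported string) string {
	if reported == "idle" {
		return "working"
	}
	return reported
}

func main() {
	for _, s := range []string{"idle", "working", "failure", "complete"} {
		fmt.Printf("%s -> %s\n", s, effectiveState(s))
	}
}
```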
Also:
- Remove misplaced unconditional `"Failed to watch screen events"` log
that fired on every startup
- Add SSE reconnection with exponential backoff (1s-30s) in
`startWatcher` so it recovers from dropped connections instead of dying
silently
- Add `complete` to the `coder_report_task` tool enum, which the
`coder/claude-code` registry module already instructs agents to use but
was missing from the schema
Refs coder/internal#1350
Relates to https://github.com/coder/internal/issues/1259
Adds new database queries and telemetry collection functions to gather
task lifecycle events (pause/resume cycles, idle time) for analytics.
Task events track pause/resume activity, idle duration before pausing,
paused duration, and time from resume to first app status, filtered to
recent activity based on the telemetry snapshot interval.
🤖 Created with Mux (Opus 4.6).
## Summary
Moves expired token filtering from client-side to server-side by adding
an `include_expired` parameter to the `GetAPIKeysByLoginType` and
`GetAPIKeysByUserID` database queries. This is more efficient for large
deployments with many expired/short-lived tokens.
## Changes
- Add `include_expired` parameter to SQL queries using `OR`
short-circuit
- Add `include_expired` query parameter to `GET
/users/{user}/keys/tokens`
- Add `IncludeExpired` field to `codersdk.TokensFilter`
- Remove client-side filtering from CLI `tokens list` command
- Add `TestTokensFilterExpired` test
Fixes coder/internal#1357
part of https://github.com/coder/coder/issues/21335
This moves updating app status (used by Tasks) into the workspace agent
API over dRPC. This will allow us to update the status without having to
re-authenticate each time, like we would with an HTTP PATCH request.
Further PRs in this stack will pipe these requests through from the CLI
MCP server to the agentsock and finally to this dRPC call to coderd.
## Problem
When a template adds a new immutable parameter, `coder update
--parameter param=value` fails with:
```
error: start workspace: parameter "machine_type" is immutable and cannot be updated
```
The interactive prompt handles this correctly (allows setting first-time
immutable params), but the CLI `--parameter` flag path does not.
## Root Cause
In `cli/parameterresolver.go`, `verifyConstraints()` runs before the
interactive prompt and unconditionally rejects any immutable parameter
during updates. It doesn't distinguish between **new** immutable
parameters (first-time use, should be allowed) and **existing** ones
(already set, should be blocked from changing).
## Fix
Added an `isFirstTimeUse` check to the immutable parameter constraint,
matching the logic already used by the interactive prompt path (line
323). New immutable parameters can now be set via `--parameter`, while
existing immutable parameters are still blocked from being changed.
## Testing
Added `TestUpdateValidateRichParameters/NewImmutableParameterViaFlag`
which:
1. Creates a workspace with a mutable parameter
2. Updates the template to add a new immutable parameter
3. Runs `coder update --parameter immutable_param=value`
4. Verifies the update succeeds and the parameter is set correctly
Fixes #22164
The provisioner state for a workspace build was being loaded for every
long-lived agent RPC connection. Since this state can be anywhere from
kilobytes to megabytes, this can cause the `coderd` memory
footprint to grow over time. It's also a lot of unnecessary allocations
for every query that fetches a workspace build, since only a few callers
ever actually reference the provisioner state.
This PR removes it from the returned workspace build and adds a query to
fetch the provisioner state explicitly.
Adds two new icons to the icon library:
- **`anthropic.svg`** — Anthropic logo
- **`gemini-monochrome.svg`** — Gemini logo, monochrome variant
Both use `monochrome` theme handling to adapt for dark and light
backgrounds.
### Changes
- Added `anthropic.svg` and `gemini-monochrome.svg` to
`site/static/icon/`
- Registered both in `site/src/theme/icons.json` (alphabetically sorted)
- Added `monochrome` theme handling for both in
`site/src/theme/externalImages.ts`
---
Created on behalf of @tracyjohnsonux
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Closes https://github.com/coder/internal/issues/1353
Does not solve the issue, but the error is currently opaque. This fails
the test as soon as the init fails, which should surface the underlying error.
Bumps [github.com/gohugoio/hugo](https://github.com/gohugoio/hugo) from
0.155.2 to 0.156.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/gohugoio/hugo/releases">github.com/gohugoio/hugo's
releases</a>.</em></p>
<blockquote>
<h2>v0.156.0</h2>
<p>This release brings significant speedups of <a
href="https://gohugo.io/functions/collections/where/#article">collections.Where</a>
and <a
href="https://gohugo.io/functions/collections/sort/#article">collections.Sort</a>
– but this is mostly a "spring cleaning" release, to make the
API cleaner and simpler to understand/document.</p>
<h2>Deprecated</h2>
<ul>
<li>Site.AllPages is Deprecated</li>
<li>Site.BuildDrafts is Deprecated</li>
<li>Site.Languages is Deprecated</li>
<li>Site.Data is deprecated, use hugo.Data</li>
<li>Page.Sites and Site.Sites is Deprecated, use hugo.Sites</li>
</ul>
<p>See <a
href="https://discourse.gohugo.io/t/deprecations-in-v0-156-0/56732">this
topic</a> for more info.</p>
<h2>Removed</h2>
<p>These have all been deprecated at least since <code>v0.136.0</code>
and any usage have been logged as an error for a long time:</p>
<p>Template functions</p>
<ul>
<li>data.GetCSV / getCSV (use resources.GetRemote)</li>
<li>data.GetJSON / getJSON (use resources.GetRemote)</li>
<li>crypto.FNV32a (use hash.FNV32a)</li>
<li>resources.Babel (use js.Babel)</li>
<li>resources.PostCSS (use css.PostCSS)</li>
<li>resources.ToCSS (use css.Sass)</li>
</ul>
<p>Page methods:</p>
<ul>
<li>.Page.NextPage (use .Page.Next)</li>
<li>.Page.PrevPage (use .Page.Prev)</li>
</ul>
<p>Paginator:</p>
<ul>
<li>.Paginator.PageSize (use .Paginator.PagerSize)</li>
</ul>
<p>Site methods:</p>
<ul>
<li>.Site.LastChange (use .Site.Lastmod)</li>
<li>.Site.Author (use .Site.Params.Author)</li>
<li>.Site.Authors (use .Site.Params.Authors)</li>
<li>.Site.Social (use .Site.Params.Social)</li>
<li>.Site.IsMultiLingual (use hugo.IsMultilingual)</li>
<li>.Sites.First (use .Sites.Default)</li>
</ul>
<p>Site config:</p>
<ul>
<li>paginate (use pagination.pagerSize)</li>
<li>paginatePath (use pagination.path)</li>
</ul>
<p>File caches:</p>
<ul>
<li>getjson cache</li>
<li>getcsv cache</li>
</ul>
<h2>Notes</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/gohugoio/hugo/commit/9d914726dee87b0e8e3d7890d660221bde372eec"><code>9d91472</code></a>
releaser: Bump versions for release of 0.156.0</li>
<li><a
href="https://github.com/gohugoio/hugo/commit/86aa62524f8bc36a04c8e0c0f76d1fd952585509"><code>86aa625</code></a>
hugolib: Move site.Data to hugo.Data, deprecate
Site.AllPages/BuildDrafts/Lan...</li>
<li><a
href="https://github.com/gohugoio/hugo/commit/d8ec0eeeaf2ff078565fddbbab5565a65b86346c"><code>d8ec0ee</code></a>
build(deps): bump google.golang.org/api from 0.255.0 to 0.267.0</li>
<li><a
href="https://github.com/gohugoio/hugo/commit/4148eded9c5f90036c47d241faac73e1d0c6ee70"><code>4148ede</code></a>
hugolib: Add Page.Sites to Site.Sites deprecation notice</li>
<li><a
href="https://github.com/gohugoio/hugo/commit/bba2aed3527e5c6086244c0ab76192b35b6ffa73"><code>bba2aed</code></a>
hugolib: Simplify sites collection</li>
<li><a
href="https://github.com/gohugoio/hugo/commit/29b8e17d29ad38621cf6c7c104309bcedf5c20c5"><code>29b8e17</code></a>
hugolib: Adjust hugo.Sites.Default</li>
<li><a
href="https://github.com/gohugoio/hugo/commit/3c823408ee51bbfbad847d4b9f926ba813097185"><code>3c82340</code></a>
Move common/hugo/HugoInfo to resources/page</li>
<li><a
href="https://github.com/gohugoio/hugo/commit/3f9d0ad2b6045849cbafe133cb9fb82ed5f5ee06"><code>3f9d0ad</code></a>
commands: Fix --panicOnWarning flag having no effect with module version
warn...</li>
<li><a
href="https://github.com/gohugoio/hugo/commit/ab62320d6bceece0faa7029f8bd79d546d0f64be"><code>ab62320</code></a>
hugolib: Add hugo.Sites and .Site.IsDefault(), modify .Site.Sites</li>
<li><a
href="https://github.com/gohugoio/hugo/commit/21be4afd49767eb63e3a2304b4c10816c86f799d"><code>21be4af</code></a>
build(deps): bump github.com/bep/textandbinarywriter</li>
<li>Additional commits viewable in <a
href="https://github.com/gohugoio/hugo/compare/v0.155.2...v0.156.0">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps ubuntu from `c7eb020` to `3ba65aa`.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Summary
Harden the OAuth2 provider with multiple security fixes addressing
`coder/security#121` (CSRF session takeover) and converge on OAuth 2.1
compliance.
### Security Fixes
| Fix | Description | Commits |
|-----|-------------|---------|
| **CSRF on `/oauth2/authorize`** | Enforce CSRF protection on the authorize endpoint POST (consent form submission) | `ba7d646`, `b94a64e` |
| **Clickjacking: `frame-ancestors` CSP** | Prevent consent page from being iframed (`Content-Security-Policy: frame-ancestors 'none'` + `X-Frame-Options: DENY`) | `597aeb2` |
| **Exact redirect URI matching** | Changed from prefix matching to full string exact matching per OAuth 2.1 §4.1.2.1 | `73d64b1`, `93897f1` |
| **Store & verify `redirect_uri`** | Store redirect_uri with auth code in DB, verify at token exchange matches exactly (RFC 6749 §4.1.3) | `50569b9`, `d7ca315` |
| **Mandatory PKCE** | Require `code_challenge` at authorization (for `response_type=code`) + unconditional `code_verifier` verification at token exchange | `d7ca315`, `1cda1a9` |
| **Reject implicit grant** | `response_type=token` now returns `unsupported_response_type` error page (OAuth 2.1 removes implicit flow) | `d7ca315`, `91b8863` |
### Changes by File
**`coderd/httpmw/csrf.go`** — Extended the CSRF `ExemptFunc` to enforce
CSRF on `/oauth2/authorize` in addition to `/api` routes. The consent
form POST is now CSRF-protected to prevent cross-site authorization code
theft.
**`site/site.go`** — Added `Content-Security-Policy: frame-ancestors
'none'` and `X-Frame-Options: DENY` headers to `RenderOAuthAllowPage`
(consent page only — does not affect the SPA/global CSP used by AI
tasks).
**`coderd/httpapi/queryparams.go`** — Changed `RedirectURL` from prefix
matching (`strings.HasPrefix(v.Path, base.Path)`) to full URI exact
matching (`v.String() != base.String()`), comparing scheme, host, path,
and query.
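To illustrate the difference between prefix and exact matching, here is a minimal sketch using only `net/url`. The `redirectMatches` helper and the example URLs are hypothetical; the comparison via `String()` follows the description above, but this is not the actual `RedirectURL` implementation.

```go
package main

import (
	"fmt"
	"net/url"
)

// redirectMatches reports whether the requested redirect URI is exactly the
// registered one. Both are parsed and compared in serialized form, so
// scheme, host, path, and query all have to agree.
func redirectMatches(registered, requested string) (bool, error) {
	base, err := url.Parse(registered)
	if err != nil {
		return false, fmt.Errorf("parse registered URI: %w", err)
	}
	v, err := url.Parse(requested)
	if err != nil {
		return false, fmt.Errorf("parse requested URI: %w", err)
	}
	return v.String() == base.String(), nil
}

func main() {
	registered := "https://app.example.com/callback"
	// A sub-path would have passed a prefix check but fails an exact match.
	fmt.Println(redirectMatches(registered, "https://app.example.com/callback/evil"))
	fmt.Println(redirectMatches(registered, "https://app.example.com/callback"))
}
```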
**`coderd/oauth2provider/authorize.go`** — Added PKCE enforcement:
`code_challenge` is required when `response_type=code` (via a
conditional check, not `RequiredNotEmpty`, so `response_type=token` can
reach the explicit rejection path). `ShowAuthorizePage` (GET) validates
`response_type` before rendering and returns a 400 error page for
unsupported types. `ProcessAuthorize` (POST) stores the `redirect_uri`
with the auth code when explicitly provided.
**`coderd/oauth2provider/tokens.go`** — PKCE verification is now
unconditional (not gated on `code_challenge` being present in DB). If
the stored code has a `redirect_uri`, the token endpoint verifies it
matches exactly — mismatch returns `errBadCode` → `invalid_grant`.
Missing `code_verifier` returns `invalid_grant`.
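For readers unfamiliar with PKCE, the check at token exchange can be sketched as follows, assuming the `S256` challenge method from RFC 7636. The `verifyS256` helper is hypothetical and is not the `VerifyPKCE` function used in the codebase; the sample verifier and challenge are the test vector from RFC 7636 Appendix B.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/base64"
	"fmt"
)

// verifyS256 recomputes the challenge as BASE64URL(SHA256(verifier)) without
// padding and compares it to the stored challenge in constant time. An empty
// verifier or challenge is always rejected.
func verifyS256(challenge, verifier string) bool {
	if challenge == "" || verifier == "" {
		return false
	}
	sum := sha256.Sum256([]byte(verifier))
	computed := base64.RawURLEncoding.EncodeToString(sum[:])
	return subtle.ConstantTimeCompare([]byte(computed), []byte(challenge)) == 1
}

func main() {
	verifier := "dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk"
	challenge := "E9Melhoa2OwvFrEMTJguCHaoeK1t8URWbuGJSstw-cM"
	fmt.Println(verifyS256(challenge, verifier))        // true
	fmt.Println(verifyS256(challenge, "wrong-verifier")) // false
}
```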
**`codersdk/oauth2.go`** — `OAuth2ProviderResponseTypeToken` constant
and `Valid()` acceptance are **kept** so the authorize handler can parse
`response_type=token` and return the proper `unsupported_response_type`
error rather than failing at parameter validation.
**`coderd/database/migrations/000421_*`** — Added `redirect_uri text`
column to `oauth2_provider_app_codes`.
### Design Decisions
**`state` parameter remains optional** — The plan initially required
`state` via `RequiredNotEmpty`, but this was reverted in `376a753` to
avoid breaking existing clients. The `state` is still hashed and stored
when provided (via `state_hash` column), securing clients that opt in.
**`response_type=token` kept in `Valid()`** — Removing it from `Valid()`
would cause the parameter parser to reject the request before the
authorize handler can return the proper `unsupported_response_type`
error. The constant is kept for correct error handling flow.
**CSP scoped to consent page only** — `frame-ancestors 'none'` is set
only on the OAuth consent page renderer, not globally. The SPA/global
CSP was previously changed to allow framing for AI tasks
([#18102](https://github.com/coder/coder/pull/18102)); this change does
not regress that.
### Out of Scope (follow-up PRs)
- Bearer tokens in query strings (needs internal caller audit)
- Scope enforcement on OAuth2 tokens
- Rate limiting on dynamic client registration
---
<details>
<summary>📋 Implementation Plan</summary>
# Plan: Harden OAuth2 Provider — Security Fixes + OAuth 2.1 Compliance
## Context & Why
Security issue `coder/security#121` reports a critical session takeover
via CSRF on the OAuth2 provider. This plan covers all remaining security
fixes from that issue **plus** convergence on OAuth 2.1 requirements.
The goal is a single PR that closes all actionable gaps.
## Current State (already committed on branch `csrf-sjx1`)
| Fix | Status | Commits |
|-----|--------|---------|
| Fix 1: CSRF on `/oauth2/authorize` | ✅ Done | `ba7d646`, `b94a64e` |
| CSRF token in consent form HTML | ✅ Done | `b94a64e` |
| `state_hash` column + storage | ✅ Done (hash stored, but state still optional) | `9167d83`, `b94a64e` |
| Tests for CSRF + state hash | ✅ Done | `e4119b5` |
## Remaining Work
### ~~Fix 2 — Require `state` parameter~~ (DROPPED)
> **Decision:** Do not enforce `state` as required. The `state`
parameter is still hashed and stored when provided (via
`hashOAuth2State` / `state_hash` column from prior commits), but clients
are not forced to supply it. This avoids breaking existing integrations
that omit state.
**Rollback:** Remove `"state"` from the `RequiredNotEmpty` call in
`coderd/oauth2provider/authorize.go:42`:
```go
// BEFORE (current on branch)
p.RequiredNotEmpty("response_type", "client_id", "state", "code_challenge")
// AFTER
p.RequiredNotEmpty("response_type", "client_id", "code_challenge")
```
No test changes needed — tests already pass `state` voluntarily.
### Fix 4 — Exact redirect URI matching
Currently `coderd/httpapi/queryparams.go:233` uses prefix matching:
```go
// CURRENT — prefix match
if v.Host != base.Host || !strings.HasPrefix(v.Path, base.Path) {
```
OAuth 2.1 requires **exact string matching**. Change to:
```go
// AFTER — exact match (OAuth 2.1 §4.1.2.1)
if v.Host != base.Host || v.Path != base.Path {
```
**File: `coderd/httpapi/queryparams.go` — `RedirectURL` method**
Also update the error message from "must be a subset of" to "must
exactly match".
**Additionally**, store `redirect_uri` with the auth code and verify at
the token endpoint (RFC 6749 §4.1.3):
1. **New migration** (same migration file or a new `000421`): Add
`redirect_uri text` column to `oauth2_provider_app_codes`
2. **Update INSERT query** in `coderd/database/queries/oauth2.sql` to
include `redirect_uri`
3. **`coderd/oauth2provider/authorize.go`**: Store
`params.redirectURL.String()` when inserting the code
4. **`coderd/oauth2provider/tokens.go`**: After retrieving the code from
DB, verify that `redirect_uri` from the token request matches the stored
value exactly. Currently `tokens.go:103` calls `p.RedirectURL(vals,
callbackURL, "redirect_uri")` for prefix validation only — it must
compare against the stored redirect_uri from the code, not just the
app's callback URL.
<details>
<summary>Why both exact match AND store+verify?</summary>
Exact matching at the authorize endpoint prevents open redirectors
(attacker can't use a sub-path).
Storing and verifying at the token endpoint prevents code injection — an
attacker who steals a code can't exchange it with a different
redirect_uri than was originally authorized. This is required by RFC
6749 §4.1.3 and OAuth 2.1.
</details>
### Fix 7 — `frame-ancestors` CSP on consent page
The consent page can be iframed by a workspace app (same-site), which is
the attack vector. Add a `Content-Security-Policy` header to prevent
framing.
**File: `site/site.go` — `RenderOAuthAllowPage` function (~line 731)**
Before writing the response, add:
```go
func RenderOAuthAllowPage(rw http.ResponseWriter, r *http.Request, data RenderOAuthAllowData) {
	rw.Header().Set("Content-Type", "text/html; charset=utf-8")
	// Prevent the consent page from being framed to mitigate
	// clickjacking attacks (coder/security#121).
	rw.Header().Set("Content-Security-Policy", "frame-ancestors 'none'")
	rw.Header().Set("X-Frame-Options", "DENY")
	...
```
Both headers for defense-in-depth (CSP for modern browsers,
X-Frame-Options for legacy).
### OAuth 2.1 — Mandatory PKCE
Currently PKCE is checked only when `code_challenge` was provided during
authorization (`tokens.go:258`):
```go
// CURRENT — conditional check
if dbCode.CodeChallenge.Valid && dbCode.CodeChallenge.String != "" {
	// verify PKCE
}
```
OAuth 2.1 requires PKCE for ALL authorization code flows. Change to:
**File: `coderd/oauth2provider/authorize.go`** — Add `"code_challenge"`
to required params:
```go
p.RequiredNotEmpty("response_type", "client_id", "code_challenge")
```
**File: `coderd/oauth2provider/tokens.go:257-265`** — Make PKCE
verification unconditional:
```go
// AFTER — PKCE always required (OAuth 2.1)
if req.CodeVerifier == "" {
	return codersdk.OAuth2TokenResponse{}, errInvalidPKCE
}
if !dbCode.CodeChallenge.Valid || dbCode.CodeChallenge.String == "" {
	// Code was issued without a challenge — should not happen
	// with the authorize endpoint enforcement, but defend in
	// depth.
	return codersdk.OAuth2TokenResponse{}, errInvalidPKCE
}
if !VerifyPKCE(dbCode.CodeChallenge.String, req.CodeVerifier) {
	return codersdk.OAuth2TokenResponse{}, errInvalidPKCE
}
```
**File: `codersdk/oauth2.go`** — Remove
`OAuth2ProviderResponseTypeToken` from the enum or reject it explicitly
in the authorize handler. Currently it's defined at line 216 but the
handler ignores `response_type` and always issues a code. We should
either:
- (a) Remove the `"token"` variant from the enum and reject it with
`unsupported_response_type`, OR
- (b) Add an explicit check in `ProcessAuthorize` that rejects
`response_type=token`
Option (b) is simpler and more backwards-compatible:
```go
// In ProcessAuthorize, after extracting params:
if params.responseType != codersdk.OAuth2ProviderResponseTypeCode {
	httpapi.WriteOAuth2Error(ctx, rw, http.StatusBadRequest,
		codersdk.OAuth2ErrorCodeUnsupportedResponseType,
		"Only response_type=code is supported")
	return
}
```
### OAuth 2.1 — Bearer tokens in query strings
`coderd/httpmw/apikey.go:743` accepts `access_token` from URL query
parameters. OAuth 2.1 prohibits this. However, this may be used
internally (e.g., workspace apps, DERP). Need to audit callers before
removing.
**Approach:** This is a larger change with potential breakage. Mark as a
**separate follow-up issue** rather than including in this PR. Document
the finding.
### OAuth 2.1 — Removed flows
✅ **Already compliant.** `tokens.go` only supports `authorization_code`
and `refresh_token` grant types. The implicit grant
(`response_type=token`) will be explicitly rejected per the PKCE section
above.
### OAuth 2.1 — Refresh token rotation
✅ **Already compliant.** `tokens.go:442` deletes the old API key when a
refresh token is used.
## Migration Plan
All DB changes can go in a single new migration (or extend 000420 if the
branch is rebased before merge). Columns to add:
- `redirect_uri text` on `oauth2_provider_app_codes`
The `state_hash` column is already added by migration 000420.
## Implementation Order
1. **Fix 7** — CSP headers on consent page (isolated, no deps)
2. ~~**Fix 2** — Require `state` parameter~~ (DROPPED — state stays
optional)
3. **Fix 4** — Exact redirect URI matching + store/verify redirect_uri
4. **PKCE mandatory** — Require `code_challenge` + reject
`response_type=token`
5. **Rollback** — Remove `"state"` from `RequiredNotEmpty` in
`authorize.go`
6. **Tests** — Update/add tests for all changes
7. **`make gen`** after DB changes
## Out of Scope (separate PRs)
- Bearer tokens in query strings (needs internal caller audit)
- Scope enforcement on OAuth2 tokens
- Rate limiting / quota on dynamic client registration
</details>
---
_Generated with [`mux`](https://github.com/coder/mux) • Model:
`anthropic:claude-opus-4-6` • Thinking: `xhigh`_
This pull request removes all the magic of `@mui/material/Alert` 🥳 No
alerts are handled by Material UI anymore, so this is dead code.
After a PostgreSQL round-trip, job timestamps lose their monotonic
clock component, so the subtraction falls back to wall-clock time and a
clock adjustment can produce a small negative delta. Floor at 1ms since
a zero or negative queue wait is meaningless. Fixes `TestProvisionerJobQueueWaitMetric`
flakes where small negative values (~ -2ms) are observed.
Use the server-rendered meta tag value as an intermediate fallback for
theme preference, between the JS-fetched value and the default theme.
This ensures the correct theme is applied before the API response loads.
Fixes #20050
Previously, when secret deployment options like CODER_OIDC_CLIENT_SECRET
were populated, the API correctly returned the "secret": "true"
annotation, but the UI did not indicate that these secrets were
configured. The UI would show "Not set" regardless of whether the secret
was set or not.
Now, the UI checks both the secret annotation and the value_source
field. When a secret is configured (value_source is set), it displays
"Set" to indicate the secret is populated. When a secret is not
configured, it displays "Not set".
Fixes #18913
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
`--secure-auth-cookie` now automatically sources its default value from `--access-url`.
If the access URL uses HTTPS, secure is set to `true`.
To revert to the old behavior, set the value explicitly to `false`.
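The defaulting rule can be sketched as below. The `defaultSecureCookie` helper and the example URLs are hypothetical; the sketch only shows deriving the default from the URL scheme, not how the flag is actually wired up.

```go
package main

import (
	"fmt"
	"net/url"
)

// defaultSecureCookie derives the default for the secure cookie setting from
// the access URL: true only when the scheme is https.
func defaultSecureCookie(accessURL string) (bool, error) {
	u, err := url.Parse(accessURL)
	if err != nil {
		return false, fmt.Errorf("parse access URL: %w", err)
	}
	return u.Scheme == "https", nil
}

func main() {
	fmt.Println(defaultSecureCookie("https://coder.example.com"))
	fmt.Println(defaultSecureCookie("http://localhost:3000"))
}
```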
If a deployment has two domains, overriding the OIDC URL allows the OIDC
redirect to differ from the `access_url`.
In response to https://github.com/coder/coder/discussions/21500
**This config setting is hidden by default**
In relation to
[`internal#1281`](https://github.com/coder/internal/issues/1281)
Remove the `soft_limit` field from the `Feature` type and simplify
license limit handling. This change:
- Removes the `soft_limit` field from the API and SDK
- Uses the soft limit value as the single `limit` value in the UI and
API
- Simplifies warning logic to only show warnings when the limit is
exceeded
- Updates tests to reflect the new behavior
- Updates the UI to use the single limit value for display
In relation to
[`internal#1281`](https://github.com/coder/internal/issues/1281)
Managed agent workspace build limits are now advisory only. Breaching
the limit no longer blocks workspace creation — it only surfaces a
warning.
- Removed hard-limit enforcement in `checkAIBuildUsage` so AI task
builds are always permitted regardless of managed agent count.
- Updated the license warning to remove "Further managed agent builds
will be blocked." verbiage.
- Updated tests to assert builds succeed beyond the limit instead of
failing.
- Removed the "Limit" display from the `ManagedAgentsConsumption`
progress bar — the bar is now relative to the included allowance (soft
limit) only, and turns orange when usage exceeds it.
Bonus:
- De-MUI'd `LicenseBannerView` — replaced Emotion CSS and MUI `Link`
with Tailwind classes.
- Added `highlight-orange` color token to the Tailwind theme.
This pull request implements animations for each of our `<ChevronDown />`
(and a few other chevrons) so that everything is uniform with
`<Autocomplete />`.
Based on previous PR reviews it appears we don't want to use these
components anymore. We previously deprecated the use of `<Stack />` in
this way in #20973 so it would be good to take the same approach here.
This PR stops Vite from repeatedly re-optimizing certain MUI modules
during development, which was triggering an HMR feedback loop and
crashing my dev environment on specific pages — most notably
`<LicensesSettingsPage />`.
After some digging, the culprit turned out to be:
```ts
import Paper from "@mui/material/Paper";
```
Importing components this way causes Vite to continuously re-optimize
them during HMR, which leads to the page refreshing over and over until
the dev server taps out and `504 "Outdated Optimize Dep"`'s us.
The fix ensures these modules are computed once at startup instead of
being reprocessed on every hot update. Development is now stable, and
the infinite refresh loop is gone.
I did experiment with using globs to handle this more generically, but
since they’re still early-access in this context, they ended up breaking
things 😔
In short: fewer re-optimizations, no more HMR meltdown, and a much
calmer dev experience.
Continuation of #22186 (without `vitest` addon)
Upgrades the dependency so that we can actively make use of new
features/speed/less-dependencies. Short simple sweet and lovely 🙂
## Summary
Custom roles that can create workspaces on behalf of other users need to
be able to list users to populate the owner dropdown in the workspace
creation UI. Previously, this required a separate `user:read`
permission, causing the dropdown to fail for custom roles.
## Changes
- Modified `GetUsers` in `dbauthz` to check if the user can create
workspaces for any owner (`workspace:create` with `owner_id: *`)
- If the user has this permission, they can list all users without
needing explicit `user:read` permission
- Added tests to verify the new behavior
## Testing
- Updated mock tests to assert the new authorization check
- Added integration tests for both positive and negative cases
Fixes #18203
Parent agents were re-using AuthInstanceID when spawning child agents.
This caused GetWorkspaceAgentByInstanceID to return the most recently
created sub agent instead of the parent when the parent tried to refetch
its own manifest.
Fix by not reusing AuthInstanceID for sub agents, and updating
GetWorkspaceAgentByInstanceID to filter them out entirely.
The existing README for the Azure Linux starter template only mentioned
that the VM is ephemeral and the managed disk is persistent, but did not
explain that the resource group, virtual network, subnet, and network
interface also persist when a workspace is stopped.
This led to confusion where users expected all Azure resources to be
cleaned up on stop, when in reality only the VM is destroyed.
## Changes
- Added the persistent networking/infrastructure resources to the
resource list
- Added "What happens on stop" section explaining which resources
persist and why
- Added "What happens on delete" section confirming all resources are
cleaned up
- Moved the existing note about ephemeral tools/files into a "Workspace
restarts" subsection for clarity
These changes exactly mirror https://github.com/coder/registry/pull/713
since the registry is not yet linked to the starter templates in
`coder/coder`. Once the registry is linked, the starter templates will
pull from the registry and this duplication will no longer be necessary.
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Add a `TaskLogPreview` component that displays the last N messages of AI
chat logs when a task is paused or its build has failed. The preview
fetches log snapshots via a new `getTaskLogs` API method and renders
them in a scrollable panel with `[user]` and `[agent]` labels, colored
left borders on type transitions, and a snapshot timestamp tooltip.
The build-logs auto-scroll in `BuildingWorkspace` was simplified by
replacing the `useRef`/`useLayoutEffect` pattern with a `useCallback`
ref, and client-side message slicing was removed in favor of
server-side limits. `InfoTooltip` now accepts an optional `title` prop.
Updates the reference to `ANTHROPIC_API_KEY` in the Claude Code client
docs to `ANTHROPIC_AUTH_TOKEN`.
**File changed:**
- `docs/ai-coder/ai-bridge/clients/claude-code.md` — configuration
instructions
Created on behalf of @dannykopping
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Since Go 1.22, the loop variable capture issue is resolved. Variables
declared by for loops are now per-iteration rather than per-loop, making
the 'v := v' pattern unnecessary.
`coder templates version list` makes a call to determine the `active`
version:
```
➜ ~ coder templates version list aws-linux-dynamic
NAME                 CREATED AT                 CREATED BY  STATUS     ACTIVE
infallible_feistel2  2025-10-10T10:34:02+11:00  rowansmith  Succeeded  Active
mystifying_almeida1  2025-10-10T10:32:38+11:00  rowansmith  Succeeded
```
but this is not carried across to the `-ojson` output version, so this
PR implements that in order to support programmatic addressing.
It is added as a top-level entry. If it should be nested under
`TemplateVersion` let me know.
```
➜ ~ ./Downloads/coder-cli-templateversions-json-active templates version list aws-linux-dynamic -ojson | jq '.[] | select(.active == true) | { active, id: .TemplateVersion.id }'
{
  "active": true,
  "id": "38f66eae-ec63-49b7-a9d2-cdb79c379d19"
}
➜ ~ ./Downloads/coder-cli-templateversions-json-active templates version list aws-linux-dynamic -ojson |jq '.[] | select(.active == true)'
{
  "TemplateVersion": {
    "id": "38f66eae-ec63-49b7-a9d2-cdb79c379d19",
    "template_id": "1a84ce78-06a6-41ad-99e4-8ea5d9b91e89",
    "organization_id": "35f75f20-890e-4095-95f1-bb8f2ba02e79",
    "created_at": "2025-10-10T10:34:02.254357+11:00",
    "updated_at": "2025-10-10T10:34:46.594032+11:00",
    "name": "infallible_feistel2",
    "message": "Uploaded from the CLI",
    "job": {
      "id": "8afd05ca-b4be-48d5-a6b9-82dcfd12c960",
      "created_at": "2025-10-10T10:34:02.251234+11:00",
      "started_at": "2025-10-10T10:34:02.257301+11:00",
      "completed_at": "2025-10-10T10:34:46.594032+11:00",
      "status": "succeeded",
      "worker_id": "a0940ade-ecdd-47c2-98c6-f2a4e5eb0733",
      "file_id": "05fd653c-3a3f-4e5c-856b-29407732e1b1",
      "tags": {
        "owner": "",
        "scope": "organization"
      },
      "queue_position": 0,
      "queue_size": 0,
      "organization_id": "35f75f20-890e-4095-95f1-bb8f2ba02e79",
      "initiator_id": "d20c05ff-ecf3-4521-a99d-516c8befbaa6",
      "input": {
        "template_version_id": "38f66eae-ec63-49b7-a9d2-cdb79c379d19"
      },
      "type": "template_version_import",
      "metadata": {
        "template_version_name": "",
        "template_id": "00000000-0000-0000-0000-000000000000",
        "template_name": "",
        "template_display_name": "",
        "template_icon": ""
      },
      "logs_overflowed": false
    },
    "readme": "---\ndxxxxx,
    "created_by": {
      "id": "d20c05ff-ecf3-4521-a99d-516c8befbaa6",
      "username": "rowansmith",
      "name": "rowan smith"
    },
    "archived": false,
    "has_external_agent": false
  },
  "active": true
}
```
Closes #21130
Adds documentation for Google Antigravity IDE integration, following the
same pattern as Cursor and Windsurf (dedicated page for desktop IDEs).
**Changes:**
- `docs/user-guides/workspace-access/antigravity.md` — New dedicated
page with install guide, Coder extension setup, and template
configuration example using the [Antigravity registry
module](https://registry.coder.com/modules/coder/antigravity)
- `docs/user-guides/workspace-access/index.md` — Added Antigravity IDE
section alongside Cursor and Windsurf
- `docs/manifest.json` — Added sidebar navigation entry after Windsurf
Antigravity uses the `antigravity://` protocol (added in #20873) and the
built-in `/icon/antigravity.svg` icon (added in #21068). The [registry
module](https://registry.coder.com/modules/coder/antigravity) wraps
`vscode-desktop-core` with `protocol = "antigravity"`.
Created on behalf of @matifali
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
### Notes
- Closes https://github.com/coder/internal/issues/558
- I closed previous attempt with `ptySemaphore`:
https://github.com/coder/coder/pull/21981
- We can consider implementing the retries proposed by Spike in:
https://github.com/coder/coder/pull/21981#pullrequestreview-3783200423,
if increasing the limit isn’t enough.
- I looked into Datadog — this particular test doesn’t seem very flaky
right now. It failed once in the Nightly gauntlet (3 weeks ago), but it
hasn’t failed again in the last 3 months (at least I couldn’t find any
other failures in Datadog).
## Fix PTY exhaustion flake on macOS CI
### Problem
macOS CI runners were experiencing PTY exhaustion during test runs,
causing flakes. The default PTY limit on macOS is 511, which can be
insufficient when running parallel tests.
### Solution
Added a CI step to increase the PTY limit on macOS runners from the
default 511 to the maximum allowed value of 999 before running tests.
### Changes
- Added `Increase PTY limit (macOS)` step in `.github/workflows/ci.yaml`
- Sets `kern.tty.ptmx_max=999` using `sysctl` (maximum value on our CI
runners)
- Runs only on macOS runners before the test-go-pg action
Description:
This PR updates the bundled Terraform binary and related version pins
from 1.14.1 to 1.14.5 (base image, installer fallback, and CI/test
fixtures). Terraform is statically built with an embedded Go runtime.
Moving to 1.14.5 updates the embedded toolchain and is intended to
address Go stdlib CVEs reported by security scanning.
Notes:
- Change is version-only; no functional Coder logic changes.
- Backport-friendly: intended to be cherry-picked to release branches
after merge.
## Summary
coder-logstream-kube and other tools that use the agent token to connect
to the RPC endpoint were incorrectly triggering connection monitoring,
causing false connected/disconnected timestamps on the agent. This led
to VSCode/JetBrains disconnections and incorrect dashboard status.
## Changes
Add a `role` query parameter to `/api/v2/workspaceagents/me/rpc`:
- `role=agent`: triggers connection monitoring (default for the agent
SDK)
- any other value (e.g. `logstream-kube`): skips connection monitoring
- omitted: triggers monitoring for backward compatibility with older
agents
The agent SDK now sends `role=agent` by default. A new `Role` field on
the `agentsdk.Client` allows non-agent callers to specify a different
role.
## Required follow-up
coder-logstream-kube needs to set `client.Role = "logstream-kube"`
before calling `ConnectRPC20()`. Without that change, it will still send
`role=agent` and trigger monitoring.
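The role rules above can be sketched as a small decision function. This is a minimal illustration and not the actual handler code: the function name `shouldMonitor` is hypothetical, and only the three cases listed above (`role=agent`, any other value, omitted) are taken from the description.

```go
package main

import (
	"fmt"
	"net/url"
)

// shouldMonitor reports whether a connection to the agent RPC endpoint should
// trigger connection monitoring. The name is hypothetical; the rules follow
// the three cases described above.
func shouldMonitor(query url.Values) bool {
	// An omitted role keeps the old behavior so older agents are still monitored.
	if !query.Has("role") {
		return true
	}
	// Only the agent itself is monitored; any other role is skipped.
	return query.Get("role") == "agent"
}

func main() {
	for _, raw := range []string{"", "role=agent", "role=logstream-kube"} {
		q, err := url.ParseQuery(raw)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%q -> monitor=%v\n", raw, shouldMonitor(q))
	}
}
```

Treating the omitted case separately from "any other value" is what keeps older agents, which send no `role` at all, working as before.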
Fixes #21625
At present it is not possible to obtain the `id` of the template version
in the table output:
```
➜ ~ coder templates version list -h
coder v2.30.1+16408b1
USAGE:
coder templates versions list [flags] <template>
List all the versions of the specified template
OPTIONS:
-O, --org string, $CODER_ORGANIZATION
Select which organization (uuid or name) to use.
-c, --column [name|created at|created by|status|active|archived] (default: name,created at,created by,status,active)
Columns to display in table output.
➜ ~ coder templates version list aws-linux-dynamic
NAME CREATED AT CREATED BY STATUS ACTIVE
infallible_feistel2 2025-10-10T10:34:02+11:00 rowansmith Succeeded Active
mystifying_almeida1 2025-10-10T10:32:38+11:00 rowansmith Succeeded
```
Adding this because it is useful when wanting to programmatically
retrieve the details of the latest template version, and `-ojson` does
not include `active` details in its output.
```
➜ Downloads ./coder-cli-templateversions-list-id templates version list -h
coder v2.30.1-devel+bab99db9e7
USAGE:
coder templates versions list [flags] <template>
List all the versions of the specified template
OPTIONS:
-O, --org string, $CODER_ORGANIZATION
Select which organization (uuid or name) to use.
-c, --column [id|name|created at|created by|status|active|archived] (default: name,created at,created by,status,active)
Columns to display in table output.
--include-archived bool
Include archived versions in the result list.
-o, --output table|json (default: table)
Output format.
———
Run `coder --help` for a list of global options.
➜ Downloads ./coder-cli-templateversions-list-id templates version list aws-linux-dynamic -c id,name,'created at','created by',status,active
ID NAME CREATED AT CREATED BY STATUS ACTIVE
38f66eae-ec63-49b7-a9d2-cdb79c379d19 infallible_feistel2 2025-10-10T10:34:02+11:00 rowansmith Succeeded Active
aa797ea5-4221-461b-80b0-90c5164f8dc0 mystifying_almeida1 2025-10-10T10:32:38+11:00 rowansmith Succeeded
```
Closes #20965
This pull-request enables a quick permission check that the user is
allowed to view the `<RequestLogsPage />` under the admin panel.
Previously, users with this permission could view this page and browse
their own logs, which was fine. However, because this is an admin page,
we've decided they should only be able to do this via the API/CLI, not
from the main admin panel.
The login page component incorrectly uses client-side routing to handle
redirects to /oauth2/authorize. Since this path is not defined as a
route in the react application but as a backend endpoint for the OAuth2
provider flow, the frontend displays a 404 "Route not found" error.
- resolves #22097
Relates to https://github.com/coder/internal/issues/1252
When a workspace with a TaskID hits its deadline, use
BuildReasonTaskAutoPause instead of BuildReasonAutostop. This allows
downstream systems to distinguish between regular autostop and task
workspace pauses.
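As a rough sketch of the selection described above, assuming a workspace is treated as a task workspace whenever its task ID is non-empty: the type, the constant string values, and the function name `deadlineBuildReason` are hypothetical and only mirror the names mentioned in this description.

```go
package main

import "fmt"

// buildReason and its values are illustrative stand-ins; the string values
// are assumptions, not the real constants.
type buildReason string

const (
	buildReasonAutostop      buildReason = "autostop"
	buildReasonTaskAutoPause buildReason = "task_auto_pause"
)

// deadlineBuildReason picks the reason recorded when a workspace hits its
// deadline: task workspaces are paused, all others are autostopped.
func deadlineBuildReason(taskID string) buildReason {
	if taskID != "" {
		return buildReasonTaskAutoPause
	}
	return buildReasonAutostop
}

func main() {
	fmt.Println(deadlineBuildReason(""))
	fmt.Println(deadlineBuildReason("task-123"))
}
```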
Created by Mux using Opus 4.5.
Remove the warning about JetBrains Toolbox not persisting log level
configuration between restarts.
As of JetBrains Toolbox 3.2, log level configuration now persists
between restarts, making this warning outdated.
Created on behalf of @matifali
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
## Summary
> NOTE: Calling this out as a breaking change in case existing consumers
of the CLI depend on being able to see expired tokens OR being able to
delete tokens immediately.
Updates the `coder tokens rm` command to immediately expire a token by
ID, preserving the token record for audit trail purposes. Tokens can
still be deleted by passing `--delete`.
## Problem
During an incident on dev.coder.com, operators needed to urgently expire
an API key that was stuck in a hot loop. The only way to do this was via
direct database access:
```sql
UPDATE api_keys SET expires_at = NOW() WHERE id = '...';
```
This is not ideal for operators who may not have direct DB access or
want to avoid manual SQL.
## Solution
This PR adds:
- **API endpoint**: `PUT /api/v2/users/{user}/keys/{keyid}/expire` -
Sets the token's `expires_at` to now
- **SDK method**: `ExpireAPIKey(ctx, userID, keyID)`
- **Updates CLI**: `coder tokens rm <name|id|token>` now _expires_ by
default. You can still delete by passing the `--delete` flag. The `coder
tokens list` command now also hides expired tokens by default. You can
`--include-expired` if needed to include them.
- **Audit logging**: The expire action is logged with old and new key
states
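The difference between expiring and deleting can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `apiKey`, `keyStore`, and `demo` are hypothetical stand-ins for the `api_keys` table and do not reflect the real API, SDK, or database code.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// apiKey is a hypothetical, trimmed-down stand-in for a token record.
type apiKey struct {
	ID        string
	ExpiresAt time.Time
}

// keyStore is an in-memory stand-in for the api_keys table.
type keyStore map[string]*apiKey

// expire sets ExpiresAt to now and keeps the record, mirroring the SQL above.
func (s keyStore) expire(id string, now time.Time) bool {
	k, ok := s[id]
	if !ok {
		return false
	}
	k.ExpiresAt = now
	return true
}

// list returns key IDs in sorted order, hiding expired keys unless asked.
func (s keyStore) list(now time.Time, includeExpired bool) []string {
	ids := []string{}
	for id, k := range s {
		if includeExpired || k.ExpiresAt.After(now) {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids)
	return ids
}

// demo expires one of two keys and returns the default and full listings.
func demo() (visible, all []string) {
	now := time.Now()
	s := keyStore{
		"key-a": {ID: "key-a", ExpiresAt: now.Add(time.Hour)},
		"key-b": {ID: "key-b", ExpiresAt: now.Add(time.Hour)},
	}
	s.expire("key-a", now)
	return s.list(now, false), s.list(now, true)
}

func main() {
	visible, all := demo()
	fmt.Println("default listing:", visible)
	fmt.Println("with expired:   ", all)
}
```

Because the expired record is kept, it still shows up when expired keys are explicitly included, which is what preserves the audit trail.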
## Test plan
- Tests cover: owner expiring own token, admin expiring other user's
token, non-admin cannot expire other's token, 404 for non-existent token
Closes #21782
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Closes #20859
This page previously wasn't rendered to the user, however, there is a
possibility that they can navigate to this page and things will end up
in `<Spinner />`s until the requests ultimately fail. We can mitigate
this problem by showing them the `<RequirePermission />` modal.
<img width="1456" height="861" alt="image"
src="https://github.com/user-attachments/assets/57195643-ad55-4340-9c97-f8247b05a13b"
/>
Bumps rust from `760ad1d` to `9663b80`.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Closes #21703
It doesn't make sense to have an `Activity bump` value when the
`Default autostop` is set to `0`. There is nothing to bump if we don't
have a timed stopping mechanism on the container. This is already
present on the backend and now we're describing this to the user on the
frontend.
## Summary
The license removal confirmation dialog always showed:
> Removing this license will disable all Premium features. You add a new
license at any time.
This is misleading when the license being removed is already expired —
an expired license isn't providing any features, so removing it won't
disable anything.
## Changes
- Extracted `isExpired` variable in `LicenseCard` (reusing the existing
expiry check)
- Made the dialog description conditional:
- **Expired license**: "This license has already expired and is not
providing any features. Removing it will not affect your current
entitlements."
- **Active license**: "Removing this license will disable all Premium
features. You can add a new license at any time."
- Also fixed a minor typo in the active license message ("You add" →
"You can add")
- Added two new tests covering both dialog variants
## Testing
All 5 `LicenseCard` tests pass, including the 2 new ones:
- `shows expired removal message for expired licenses`
- `shows disabling features warning for active licenses`
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
## Problem
Site-wide admins (e.g., Owners) could not use `coder create --org <org>`
to create workspaces in organizations they are not members of. The error
was:
```
$ coder create my-workspace -t docker --org data-science
error: organization "data-science" not found, are you sure you are a member of this organization?
```
This was inconsistent with the web UI, where Owners can create
workspaces in any organization.
## Root Cause
The CLI's `OrganizationContext.Selected()` function only checked the
user's membership list, ignoring site-wide RBAC permissions that grant
Owners access to all organizations.
## Solution
Added a fallback in `OrganizationContext.Selected()` that fetches the
org directly via the API when not found in the membership list. This
works because the API endpoint applies RBAC filtering, allowing Owners
to read any org.
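The fallback can be sketched as follows. This is an illustration rather than the real `OrganizationContext.Selected()` code: `org`, `selectOrg`, and the `fetch` callback are hypothetical, with `fetch` standing in for the API call that applies RBAC filtering.

```go
package main

import (
	"errors"
	"fmt"
)

// org is a hypothetical, minimal organization record.
type org struct{ Name string }

var errNotFound = errors.New("not found")

// selectOrg checks the user's memberships first and only then falls back to
// fetching the organization directly. fetch stands in for the API call, which
// is assumed to return an error when RBAC denies read access.
func selectOrg(name string, memberships []org, fetch func(string) (org, error)) (org, error) {
	for _, o := range memberships {
		if o.Name == name {
			return o, nil
		}
	}
	o, err := fetch(name)
	if err != nil {
		return org{}, fmt.Errorf("organization %q not found: %w", name, err)
	}
	return o, nil
}

func main() {
	memberships := []org{{Name: "default"}}
	// An Owner can read any org, so the fallback succeeds.
	ownerFetch := func(name string) (org, error) { return org{Name: name}, nil }
	// A regular member cannot read orgs they do not belong to.
	memberFetch := func(string) (org, error) { return org{}, errNotFound }

	fmt.Println(selectOrg("data-science", memberships, ownerFetch))
	fmt.Println(selectOrg("data-science", memberships, memberFetch))
}
```

Checking the membership list first keeps the common case free of an extra API call.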
## Impact
This fixes `coder create --org` and all other CLI commands that use
`OrganizationContext.Selected()` (29+ commands), including:
- `coder templates push --org <any-org>`
- `coder organizations members add --org <any-org>`
- `coder provisioner list --org <any-org>`
## Testing
Added `TestEnterpriseCreate/OwnerCanCreateInNonMemberOrg` which:
- Creates an Owner user who is NOT a member of a second org
- Verifies they can create a workspace there using `--org`
- Properly fails without the code fix, passes with it
---
*This PR was generated by [mux](https://mux.coder.com) but reviewed by a
human.*
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Closes #16148
This pull-request resolves a few issues with wider displays,
particularly in ensuring the content's container centers as one would
expect and the content of the headings isn't being constrained by a
`max-w-prose`.
**Background**
Reported in #17417, there is a `deleted` query parameter supported by
/api/v2/templates, but we do not respect this field on the client,
showing the "Create Workspace" button for deleted templates.
**Expected Behavior**
Don't show the "Create Workspace" button for deleted templates.
**Notes**
This PR adds a new `deleted` field to the templates API response.
Co-authored-by: Danielle Maywood <danielle@themaywoods.com>
## Description
This PR wires up the metrics scanner in the Makefile to automatically regenerate metrics documentation when source files change.
## Changes
* Add Makefile target `scripts/metricsdocgen/generated_metrics` to run the AST scanner to generate the metrics file
* Update `docs/admin/integrations/prometheus.md` Makefile target to depend on `scripts/metricsdocgen/generated_metrics`
* Add `scripts/metricsdocgen/README.md` documenting the metrics generation process
Closes: https://github.com/coder/coder/issues/13223
## Description
This PR refactors `scripts/metricsdocgen/main.go` to support merging static and generated metrics files for documentation generation.
The static `metrics` file remains necessary for metrics not defined in the coder codebase (`go_*`, `process_*`, `promhttp_*`, `coder_aibridged_*`), as well as **edge cases** the scanner cannot handle (e.g., metrics with runtime-determined labels or function-local variable references for fields). Handling these edge cases in the scanner would make it significantly more complex, so we keep this hybrid approach to accommodate them. In such cases, developers need to update the `metrics` file directly, so there is still a risk of out-of-date information in the documentation. However, this solution should already cover most cases.
Static metrics take priority over generated metrics when both files contain the same metric name, allowing manual overrides without modifying the scanner. Some of these edge cases could be easily fixed by updating the codebase to use one of the supported patterns.
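The priority rule can be sketched as a simple merge keyed by metric name. This is a minimal illustration, not the code in `scripts/metricsdocgen/main.go`: the `metric` type and `mergeMetrics` function are hypothetical, and sorting the result by name is an assumption made here to keep the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// metric is a hypothetical, trimmed-down metric record.
type metric struct {
	Name string
	Help string
}

// mergeMetrics combines both sources. When a name appears in both, the static
// entry wins, which allows manual overrides without changing the scanner.
func mergeMetrics(static, generated []metric) []metric {
	byName := make(map[string]metric, len(static)+len(generated))
	for _, m := range generated {
		byName[m.Name] = m
	}
	for _, m := range static {
		byName[m.Name] = m // static overrides generated
	}
	merged := make([]metric, 0, len(byName))
	for _, m := range byName {
		merged = append(merged, m)
	}
	sort.Slice(merged, func(i, j int) bool { return merged[i].Name < merged[j].Name })
	return merged
}

func main() {
	static := []metric{{Name: "coderd_example_total", Help: "Manually written help."}}
	generated := []metric{
		{Name: "coderd_example_total", Help: "Scanned help."},
		{Name: "coderd_other_total", Help: "Only found by the scanner."},
	}
	for _, m := range mergeMetrics(static, generated) {
		fmt.Printf("%s: %s\n", m.Name, m.Help)
	}
}
```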
## Changes
* Update `scripts/metricsdocgen/main.go` to read from two separate metrics files:
* `metrics`: static, manually maintained metrics (e.g., `go_*`, `process_*`, `promhttp_*`, `coder_aibridged_*`)
* `generated_metrics`: auto-generated by the AST scanner
* Update `metrics` file to contain only static and edge-case metrics
* Skip metrics with empty HELP descriptions in the scanner
* Update `generated_metrics` to reflect skipped metrics
* Update `docs/admin/integrations/prometheus.md` with merged metrics
Related to: https://github.com/coder/coder/issues/13223
**Disclosure:** This PR was mainly developed with Claude Sonnet 4, with iterative review and refinement by @ssncferreira
## Description
This PR implements extraction of metrics defined using `promauto.With()` factory patterns.
## Changes
* Add `extractPromautoMetric()` to handle:
* `promauto.With(reg).NewCounterVec(prometheus.CounterOpts{...}, labels)`
* `factory.NewGaugeVec(prometheus.GaugeOpts{...}, labels)`
* Script generates an updated `scripts/metricsdocgen/generated_metrics` file
Related to: https://github.com/coder/coder/issues/13223
**Disclosure:** This PR was mainly developed with Claude Sonnet 4, with iterative review and refinement by @ssncferreira
## Description
This PR implements extraction of metrics defined using `prometheus.New*()` and `prometheus.New*Vec()` patterns with `*Opts{}` structs.
## Changes
* Add `extractOptsMetric()` to handle:
* `prometheus.NewGauge(prometheus.GaugeOpts{...})`
* `prometheus.NewCounter(prometheus.CounterOpts{...})`
* `prometheus.NewHistogram(prometheus.HistogramOpts{...})`
* `prometheus.NewSummary(prometheus.SummaryOpts{...})`
* `prometheus.New*Vec(prometheus.*Opts{...}, labels)`
* Script generates an updated `scripts/metricsdocgen/generated_metrics` file
Related to: https://github.com/coder/coder/issues/13223
**Disclosure:** This PR was mainly developed with Claude Sonnet 4, with iterative review and refinement by @ssncferreira
## Description
This PR implements extraction of metrics defined using the `prometheus.NewDesc()` pattern.
## Changes
* Add `extractNewDescMetric()` to extract metrics from `prometheus.NewDesc()` calls
* Script generates an updated `scripts/metricsdocgen/generated_metrics` file
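To give a feel for this kind of AST extraction, here is a minimal, standalone sketch using only the Go standard library. It is not the scanner's actual `extractNewDescMetric()`: the function `findNewDescNames` and the sample source are hypothetical, and the sketch only handles calls whose first argument is a plain string literal.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
)

// exampleSrc is a made-up snippet used to exercise the sketch.
const exampleSrc = `package example

import "github.com/prometheus/client_golang/prometheus"

var desc = prometheus.NewDesc("coderd_example_total", "Example help.", nil, nil)
`

// findNewDescNames parses src and returns the name argument of every
// prometheus.NewDesc call whose first argument is a string literal.
func findNewDescNames(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		return nil, err
	}
	var names []string
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok || sel.Sel.Name != "NewDesc" {
			return true
		}
		pkg, ok := sel.X.(*ast.Ident)
		if !ok || pkg.Name != "prometheus" || len(call.Args) == 0 {
			return true
		}
		lit, ok := call.Args[0].(*ast.BasicLit)
		if !ok || lit.Kind != token.STRING {
			return true // non-literal names are out of scope for this sketch
		}
		if name, err := strconv.Unquote(lit.Value); err == nil {
			names = append(names, name)
		}
		return true
	})
	return names, nil
}

func main() {
	names, err := findNewDescNames(exampleSrc)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

Names built from constants, variables, or helper calls would need extra resolution logic, which this sketch deliberately leaves out.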
Related to: https://github.com/coder/coder/issues/13223
**Disclosure:** This PR was mainly developed with Claude Sonnet 4, with iterative review and refinement by @ssncferreira
## Description
This PR adds an AST-based scanner to automatically generate Prometheus metrics documentation from the coder source code.
## Changes
* Add `scripts/metricsdocgen/scanner/scanner.go` with:
* Directory walking for `agent/`, `coderd/`, `enterprise/`, `provisionerd/`
* Go file parsing (skipping `*_test.go` files)
* AST inspection for metric extraction
* `Metric.String()` for Prometheus text exposition format rendering
* `writeMetrics()` to output metrics to stdout
* Placeholder `extractMetricFromCall()` (implemented in subsequent PRs)
* Empty `scripts/metricsdocgen/generated_metrics` placeholder (populated by subsequent PRs)
**Note:** To facilitate the review process, this was separated into scoped stacked PRs. The division was based on the main structure, the different Prometheus patterns currently present in the codebase, and updates to the build process.
Related to: https://github.com/coder/coder/issues/13223
**Disclosure:** This PR was mainly developed with Claude Sonnet 4, with iterative review and refinement by @ssncferreira
This pull-request refactors the `<Combobox />` component from a
monolithic design to a composable compound component pattern, providing
more flexibility and reusability across the codebase
- Migrates `<SelectFilter />` to use the new `<Combobox />` instead of
the legacy `<SelectMenu />` components
- Updates all existing consumers of `<Combobox />` and `<SelectFilter
/>` to use the new API
<img
src="https://github.com/user-attachments/assets/a3336431-590c-48b5-adde-3fc5c16f459d"
/>
The `<Combobox />` component has been refactored to use a compound
component pattern, exposing:
- `Combobox` - Root component with context provider for open/value state
- `ComboboxTrigger` - Trigger wrapper (re-exports PopoverTrigger)
- `ComboboxButton` - Styled button with chevron and selected option
display
- `ComboboxContent` - Popover content with Command wrapper
- `ComboboxInput` - Search input (re-exports CommandInput)
- `ComboboxList` - List container (re-exports CommandList)
- `ComboboxItem` - Individual option with checkmark indicator
- `ComboboxEmpty` - Empty state (re-exports CommandEmpty)
- `useCombobox` - Hook to access combobox context
This pattern allows consumers to compose their own combobox layouts
while sharing consistent behavior and styling.
Furthermore, we had an issue with `CreateWorkspacePageView.stories.tsx`
lacking stories which would let us see the passed parameters and presets
in context. I've added stories to surround this.
### Updated Consumers
- `DynamicParameter.tsx` - Updated to use new Combobox API for parameter
options
- `CreateWorkspacePageView.tsx` - Updated preset combobox usage
- `IdpOrgSyncPageView.tsx` - Updated organization sync form
- `IdpGroupSyncForm.tsx` - Updated group sync form
- `IdpRoleSyncForm.tsx` - Updated role sync form
- `WorkspacesPage/filter/menus.tsx` - Updated workspace filter menus
---------
Co-authored-by: ケイラ <mckayla@hey.com>
This PR adds some metrics to help identify job enqueue rates and
latencies. This work was initiated as a way to help reduce the cost of
the observation/measurement itself for autostart scaletests, which
impacts our ability to identify/reason about the load caused by
autostart. See: https://github.com/coder/internal/issues/1209
I've extended the metrics here to account for regular user initiated
builds, prebuilds, autostarts, etc. IMO there is still the question here
of whether we want to include or need the `transition` label, which is
only present on workspace builds. Including it does lead to an increase
in cardinality, and in the case of the histogram (when not using native
histograms) that's at least a few extra series for every bucket. We
could remove the transition label there but keep it on the counter.
Additionally, the histogram is currently observing latencies for other
jobs, such as template builds/version imports, those do not have a
transition type associated with them.
Tested briefly in a workspace, can see metric values like the following:
- `coderd_workspace_builds_enqueued_total{build_reason="autostart",provisioner_type="terraform",status="success",transition="start"} 1`
- `coderd_provisioner_job_queue_wait_seconds_bucket{build_reason="autostart",job_type="workspace_build",provisioner_type="terraform",transition="start",le="0.025"} 1`
---------
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Closes #21830
Remove redundant icon sizing across the frontend. Components like
`Button`, `DropdownMenuItem`, and `CommandItem` already control child
SVG sizes via CSS selectors (e.g., `[&>svg]:size-icon-lg`), so explicit
`size` props and `className` overrides on icons nested inside them are
unnecessary. This PR strips those out and lets parent components handle
sizing consistently.
As a bonus, also migrates the `DropdownArrow` component from Emotion
CSS-in-JS to Tailwind utilities, replaces raw `<a>` tags with the `<Link
/>` component in the Premium page, and adds Storybook coverage for
`PremiumPageView`.
The AI Bridge setup docs showed `CODER_AIBRIDGE_ENABLED=true coder
server` as a single line, which can confuse users into thinking the env
var is a one-time prefix rather than a persistent setting.
Split this into `export CODER_AIBRIDGE_ENABLED=true` on its own line
followed by `coder server`, which is clearer and consistent with how the
Bedrock credentials section already handles env vars in the same file.
Created on behalf of @dannykopping
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
## Problem
CI failure showed 3 goroutines leaked in the prebuilds reconciler, all
stuck in `select` state:
1) `MetricsCollector.BackgroundFetch` (metrics goroutine)
2) `StoreReconciler.Run` (main reconciliation loop)
3) `StoreReconciler.Run.func3()` (provisioner job publisher goroutine)
All three goroutines were waiting for `ctx.Done()`, which likely means
`cancelFn()` was never called to trigger shutdown.
**Note:** I was unable to reproduce the flake locally. The likely cause
was a race condition between `Run()` and `Stop()` where `Stop()` could
check `running` (seeing `false`), return early, and then `Run()` would
start goroutines that never get cleaned up. This could happen in any
`coderd` test that starts a server with prebuilds enabled.
### Problems identified
1) Missing waitgroup tracking: the provisioner job publisher goroutine
was not tracked in the waitgroup, so `Run defer func()` did not wait for
it during a clean shutdown.
2) The provisioner job publisher goroutine had a redundant `case
<-c.done` that could race with `Stop()` select statement.
3) Race condition between `Run()` and `Stop()`: the `running` and
`stopped` fields were `atomic.Bool` values checked and set
independently, allowing a window where `Stop()` could see
`running=false` and return early, then `Run()` would set `running=true`
and start goroutines that would never be cleaned up. This could happen
in any `coderd` test that starts a server with prebuilds enabled.
## Changes
* Added `wg.Add(1)` and `defer wg.Done()` to track provisioner job
publisher goroutine in waitgroup
* Removed redundant `case <-c.done` from provisioner job publisher
goroutine to eliminate race condition
* Replaced `atomic.Bool` for `running` and `stopped` with a `sync.Mutex`
lifecycle state, also protecting `cancelFn` under the same mutex, to
eliminate the race between `Run()` and `Stop()`
* Added a guard in `Run()` to prevent double-start (`c.stopped ||
c.running`)
* Improved comments in Stop() and Run() to clarify shutdown behavior
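The lifecycle changes listed above can be sketched as follows. This is a simplified illustration, not the real `StoreReconciler`: the `reconciler` type is hypothetical, `Run()` here returns immediately instead of blocking, and a single background goroutine stands in for the real workers.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// reconciler is a hypothetical, stripped-down stand-in for the real type.
type reconciler struct {
	mu       sync.Mutex // guards running, stopped, and cancelFn together
	running  bool
	stopped  bool
	cancelFn context.CancelFunc
	wg       sync.WaitGroup
}

// Run starts the background goroutine unless the reconciler was already
// started or stopped. It reports whether it actually started.
func (r *reconciler) Run() bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.stopped || r.running {
		return false // guard against double-start and start-after-stop
	}
	ctx, cancel := context.WithCancel(context.Background())
	r.cancelFn = cancel
	r.running = true
	r.wg.Add(1) // tracked before the lock is released, so Stop always waits
	go func() {
		defer r.wg.Done()
		<-ctx.Done()
	}()
	return true
}

// Stop marks the reconciler as stopped, cancels any running goroutine, and
// waits for it to exit.
func (r *reconciler) Stop() {
	r.mu.Lock()
	r.stopped = true
	cancel := r.cancelFn
	r.mu.Unlock()
	if cancel != nil {
		cancel()
	}
	r.wg.Wait()
}

func main() {
	r := &reconciler{}
	fmt.Println("started:", r.Run())
	fmt.Println("started again:", r.Run())
	r.Stop()
	fmt.Println("started after stop:", r.Run())
}
```

Because the state check and the state change happen under one lock, `Stop()` either runs before `Run()` (which then refuses to start) or after it (and then sees the cancel function and the waitgroup entry).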
Closes: https://github.com/coder/internal/issues/1116
### Summary
Workspaces created via mode=auto links now require explicit user
confirmation before provisioning. A warning dialog shows all prefilled
param.* values from the URL and blocks creation until the user clicks
`Confirm and Create`. Clicking `Cancel` falls back to the standard form
view.
<img width="820" height="475" alt="auto-create-consent-dialog"
src="https://github.com/user-attachments/assets/8339e3bd-434f-4a04-9385-436bf95f49d7"
/>
### Breaking behavior change
Links using `mode=auto` (e.g., "Open in Coder" buttons) will no longer
silently create workspaces. Users will now see a consent dialog and must
explicitly confirm before the workspace is provisioned. Any existing
integrations or automation relying on `mode=auto` for seamless workspace
creation will now require manual user interaction.
---------
Co-authored-by: Jake Howell <jacob@coder.com>
This change adds Linux support for Desktop VPN by aligning Linux
behavior with the existing Windows daemon implementation and adding a
Linux networking stack implementation.
### What changed
- Consolidated the daemon command implementation into a shared file:
- `cli/vpndaemon_windows_linux.go` (`//go:build windows || linux`)
- Consolidated daemon tests into a shared file:
- `cli/vpndaemon_windows_linux_test.go` (`//go:build windows || linux`)
- Removed Linux-only duplicate daemon files:
- `cli/vpndaemon_linux.go`
- `cli/vpndaemon_linux_test.go`
- Removed unsupported-platform stubs per current supported OS targets:
- `cli/vpndaemon_other.go`
- `vpn/tun.go`
- Kept Linux networking stack implementation in:
- `vpn/tun_linux.go`
### Notes
- Linux now uses the same `rpc-read-handle` / `rpc-write-handle` flags
and env vars as Windows.
- The daemon logs to stderr (via CLI logger sinks), and does not forward
logs over the RPC pipe.
## Problem
The Copilot provider was missing from the AI Bridge logs filter dropdown, so users couldn't filter interceptions by Copilot. Additionally, the `AIBridgeProviderIcon` component didn't handle the copilot provider, so it would render a fallback question mark icon.
<img width="1392" height="333" alt="Screenshot 2026-02-10 at 09 26 16" src="https://github.com/user-attachments/assets/ecb97400-a4dd-4e88-accc-68d7fdf19b2a" />
## Changes
* Added `copilot` case to `AIBridgeProviderIcon`, using the existing `/icon/github.svg`.
* Added Copilot as a provider option in the filter dropdown.
* Added `MockInterceptionAnthropic` and `MockInterceptionCopilot` mock data with sample prompts, and updated the Storybook stories to use one interception per provider.
## Problem
Previously, the AI Bridge model column icon was derived from the provider field. This worked because each provider only served its own models: OpenAI interceptions always used OpenAI models, and Anthropic interceptions always used Anthropic models.
With the introduction of the Copilot provider, this assumption no longer holds. Copilot can forward requests to both OpenAI and Anthropic models, so the provider field alone is not enough to determine the correct model icon. This caused Copilot interceptions to display a fallback question mark icon for the model.
<img width="1337" height="365" alt="Screenshot 2026-02-10 at 09 10 34" src="https://github.com/user-attachments/assets/1efd613d-16c9-4738-8337-6ccf92e610fc" />
## Changes
* Added `AIBridgeModelIcon` component that infers the model family (Claude, OpenAI) from the model name string and renders the appropriate icon.
* Updated `RequestLogsRow` to use `AIBridgeModelIcon` instead of `AIBridgeProviderIcon` in both the table row and the expanded detail view.
This PR fixes a workspace app authentication bug where requests that
include an `Authorization` header (intended for the upstream app) can
cause Coder to ignore the workspace app session cookie
(`coder_subdomain_app_session_token_*` /
`coder_path_app_session_token`). When that happens, Coder fails to mint
or renew `coder_signed_app_token` and redirects to
`/api/v2/applications/auth-redirect` instead of proxying the request to
the workspace.
This commonly shows up when users run a frontend and backend in the same
workspace and the backend requires `Authorization` (for example, `curl
-H "Authorization: bearer ..."` or browser `fetch()` calls).
Related issues / context:
* Primary bug report and repro:
[https://github.com/coder/coder/issues/21467](https://github.com/coder/coder/issues/21467)
* Related symptoms reported as CORS / redirect failures for workspace
apps:
* [https://github.com/coder/coder/issues/20667](https://github.com/coder/coder/issues/20667)
* [https://github.com/coder/coder/issues/19728](https://github.com/coder/coder/issues/19728)
## Root Cause
In `coderd/workspaceapps/cookies.go`, `AppCookies.TokenFromRequest`
checked `httpmw.APITokenFromRequest(r)` first. That helper returns a
token from several places, including `Authorization: Bearer ...`.
As a result, when a request included an upstream `Authorization` header,
that header value was returned as the “session token” for the app proxy,
and `coder_subdomain_app_session_token_*` was never read. Authentication
then failed and the request was treated as signed out.
## Fix
Change the precedence in `AppCookies.TokenFromRequest`:
1. First check the access-method-specific cookie:
* subdomain apps: `coder_subdomain_app_session_token_{hash}`
* path apps: `coder_path_app_session_token`
2. If not present, fall back to `httpmw.APITokenFromRequest(r)` (so
non-browser clients can still authenticate via query, header, or bearer
tokens if they really want to).
This ensures that:
* Backend requests that require `Authorization` still reach the
workspace.
* `coder_signed_app_token` can be renewed from the app session cookie
even when `Authorization` is present.
* `Authorization` is still forwarded to the upstream app (the reverse
proxy code does not strip it).
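The precedence change can be sketched as follows. This is a minimal illustration, not the actual `AppCookies.TokenFromRequest` code: `bearerToken` is a hypothetical, simplified stand-in for `httpmw.APITokenFromRequest` (which reads more sources than the header), and `demoToken` exists only to drive the example.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// pathAppCookie is the path-app cookie name quoted in the description.
const pathAppCookie = "coder_path_app_session_token"

// bearerToken is a simplified, hypothetical stand-in for
// httpmw.APITokenFromRequest that only looks at "Authorization: Bearer".
func bearerToken(r *http.Request) string {
	const prefix = "bearer "
	auth := r.Header.Get("Authorization")
	if len(auth) > len(prefix) && strings.EqualFold(auth[:len(prefix)], prefix) {
		return auth[len(prefix):]
	}
	return ""
}

// tokenFromRequest checks the app session cookie first and only falls
// back to the generic token sources when the cookie is absent.
func tokenFromRequest(r *http.Request, cookieName string) string {
	if c, err := r.Cookie(cookieName); err == nil && c.Value != "" {
		return c.Value
	}
	return bearerToken(r)
}

// demoToken builds a request with the given cookie and Authorization
// values (empty means "not set") and returns the token that wins.
func demoToken(cookie, authorization string) string {
	r, err := http.NewRequest(http.MethodGet, "http://app.example.invalid/", nil)
	if err != nil {
		return ""
	}
	if cookie != "" {
		r.AddCookie(&http.Cookie{Name: pathAppCookie, Value: cookie})
	}
	if authorization != "" {
		r.Header.Set("Authorization", authorization)
	}
	return tokenFromRequest(r, pathAppCookie)
}

func main() {
	// Cookie and upstream Authorization both present: the cookie wins.
	fmt.Println(demoToken("app-session", "Bearer upstream-secret"))
	// No cookie: non-browser clients can still use a bearer token.
	fmt.Println(demoToken("", "Bearer api-token"))
}
```

With the old ordering, the first call would have returned the upstream secret, which is the failure mode described in the root cause.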
Initially, I attempted workarounds
([https://github.com/coder/coder/issues/20667#issuecomment-3868578388](https://github.com/coder/coder/issues/20667#issuecomment-3868578388),
[https://github.com/coder/coder/issues/19728#issuecomment-3868578093](https://github.com/coder/coder/issues/19728#issuecomment-3868578093)),
but adding `/auth-redirect` to the permissive CORS paths and extending
the validity of workspace app auth tokens from 1 minute to 1 hour only
partially masked the issue. After workspace restarts and token expiry, I
no longer saw CORS errors, but the tokens were still not renewed.
After patching my local Nix-based setup on Coder v1.30.0 with this
change, I can no longer observe this behavior.
When discussing the changes needed for #22032 I was complaining about
how the `overflow-hidden` didn't work correctly so we could safely
remove it.
To continue these changes, I've refactored how these triggers behave on
mobile, enabling full truncation and `max-w-*` limits on each piece of
content. Everything stemmed from the `<fieldset />` having
`width: max-content`, which caused the content to extend past the bounds
of the container, with `flex` in tow.
Furthermore, the `(Default)` on `Preset` has been turned into a badge so
that we get the full truncation effect as we do with `Template Version`.
Follow-up improvements here might be to wrap the content of this input
on smaller displays.
### Preview
Top is the old, bottom is the new.
<img width="924" height="594" alt="preview"
src="https://github.com/user-attachments/assets/c1bbf152-03a6-4cad-b925-aad0549536a7"
/>
I was trying to figure out why `goleak` was complaining about a dangling
http2 connection goroutine in tests. Turns out that `taskname.Generate`
will call out to Anthropic if an API key is set, and we're calling it in
`dbgen`. Modified to use testutil method instead.
Closes #22028
This pull-request simply debounces the message sent to our web-socket
backend to ensure we're not overwriting the user's input as they type.
As an added bonus, this will debounce message spam if people are going
crazy on Radio Items or similar.
An extra bit of flavour: this also resolves a good use-case for `cn()`
in diagnostic errors 🙂
This pull-request takes the MUI based components from `<AuditLogRow />`
and its subsidiaries and updates them to use the correct newer Tailwind
based components.
This reverts commit 5224387c5a.
This is causing layout shifts to `0,0` when attempting to open
dropdowns. Something more battle-tested is needed unfortunately, Radix +
Scrollgutters is really annoying.
Add the ability to pause a running task and resume a paused task directly
from the TaskPage. This includes showing contextual messages when a task
is paused (manual vs timeout) and proper error handling with dialogs for
API errors.
- Extract task action logic into reusable mutations (api/queries/tasks.ts)
- Move TaskActionButton to modules/tasks for better organization
- Add pause button to TaskStartingAgent component
- Show appropriate state messages for transitioning states (pausing,
canceling, deleting)
The "Deploy PR manually" image (`deploy-pr-manually.png`) referenced in
the contributing docs has never existed in the repository, resulting in
a broken image on the [docs
site](https://coder.com/docs/about/contributing/CONTRIBUTING#deploying-a-pr).
This PR removes the broken `<Image>` tag and ends the sentence with a
period instead. The `pr-deploy.yaml` workflow link remains intact for
users to navigate to the workflow dispatch page directly.
Created on behalf of @DavidFrawormo
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Bumps the x group with 2 updates:
[golang.org/x/oauth2](https://github.com/golang/oauth2) and
[golang.org/x/sys](https://github.com/golang/sys).
Updates `golang.org/x/oauth2` from 0.34.0 to 0.35.0
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/golang/oauth2/commit/89ff2e1ac388c1a234a687cb2735341cde3f7122"><code>89ff2e1</code></a>
google: add safer credentials JSON loading options.</li>
<li>See full diff in <a
href="https://github.com/golang/oauth2/compare/v0.34.0...v0.35.0">compare
view</a></li>
</ul>
</details>
<br />
Updates `golang.org/x/sys` from 0.40.0 to 0.41.0
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/golang/sys/commit/fc646e489fd944b6f77d327ab77f1a4bab81d5ad"><code>fc646e4</code></a>
cpu: use IsProcessorFeaturePresent to calculate ARM64 on windows</li>
<li><a
href="https://github.com/golang/sys/commit/f11c7bb268eb8a49f5a42afe15387a159a506935"><code>f11c7bb</code></a>
windows: add IsProcessorFeaturePresent and processor feature consts</li>
<li><a
href="https://github.com/golang/sys/commit/d25a7aaff8c2b056b2059fd7065afe1d4132e082"><code>d25a7aa</code></a>
unix: add IoctlSetString on all platforms</li>
<li><a
href="https://github.com/golang/sys/commit/6fb913b30f367555467f08da4d60f49996c9b17a"><code>6fb913b</code></a>
unix: return early on error in Recvmsg</li>
<li>See full diff in <a
href="https://github.com/golang/sys/compare/v0.40.0...v0.41.0">compare
view</a></li>
</ul>
</details>
<br />
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps rust from `df6ca8f` to `760ad1d`.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Resolves the TODO in TestPool by adding TestPool_Expiry which uses Go
1.25's testing/synctest to verify TTL-based cache eviction.
I wanted to get familiar with the new `synctest` package in Go 1.25 and
found this TODO comment, so I decided to take a stab at it 😄
Migrates `ConnectionLogRow` and `ConnectionLogDescription` off MUI and
Emotion. Replaces `@mui/material/Link` with the existing shadcn-based
`Link` component, swaps the deprecated `Stack` wrappers for plain divs
with Tailwind flex utilities, and converts all Emotion `css` prop styles
to Tailwind classes.
Also fixes a pre-existing lint issue where `tabIndex` was set on a
non-interactive div.
Replace all usages of MUI's `visuallyHidden` utility from `@mui/utils`
with Tailwind's `sr-only` class. Both produce identical CSS, so this is
a no-op behaviorally -- just removes another MUI dependency from the
codebase. Also updates the accessibility example in the frontend
contributing docs to match.
closes: https://github.com/coder/internal/issues/1331
Fixes up an issue in the test where we end up calling `FailNow` outside
the main test goroutine. Also adds the ability to name a `ptytest.PTY`
for cases like this one where we start multiple commands. This will help
debugging if we see the issue again.
This doesn't address the root cause of the failure, but I think we
should close the flake issue. I think we'd need like a stacktrace of all
goroutines at the point of failing the test, but that's way too much
effort unless we see this again.
Closes https://github.com/coder/internal/issues/1261.
This pull request adds an endpoint to pause coder tasks by stopping the
underlying workspace.
* The endpoint is currently experimental, so it is not yet exposed as
`POST /api/v2/tasks/{user}/{task}/pause`.
* We do not currently set the build reason to `task_manual_pause`,
because build reasons are currently only used on stop transitions.
This pull-request takes our `@mui/*` dependencies and replaces them with
shiny new Tailwind ones. Furthermore, it resolves an issue with the
`input` where `aria-invalid` wouldn't give it a red-ring like
`<InputGroup />` does.
As an added touch we've applied Formik to `<RequestOTPPage />` so that
we can render an invalid email easily.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
This pull-request finds all of our previous instances of the MUI-based
Latency `color`s and updates them to use the equivalents from the
Tailwind package.
Adds a standalone command that acts as a mock telemetry server,
receiving snapshots and printing them as a JSON stream to stdout. Useful
for local development testing with scripts/develop.sh by setting
CODER_TELEMETRY_ENABLE and CODER_TELEMETRY_URL environment variables.
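A rough sketch of the idea, assuming nothing about the real command: the handler below accepts a JSON body and writes it as one compact line to an `io.Writer`. The `/snapshot` path, the `202 Accepted` response, and the names `snapshotHandler` and `postSnapshot` are all hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// snapshotHandler writes every received JSON body to out as a single
// compact line, producing a JSON stream.
func snapshotHandler(out io.Writer) http.Handler {
	var mu sync.Mutex // keeps concurrent snapshots from interleaving
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "read body", http.StatusBadRequest)
			return
		}
		var line bytes.Buffer
		if err := json.Compact(&line, body); err != nil {
			http.Error(w, "invalid JSON", http.StatusBadRequest)
			return
		}
		line.WriteByte('\n')
		mu.Lock()
		defer mu.Unlock()
		_, _ = out.Write(line.Bytes())
		w.WriteHeader(http.StatusAccepted)
	})
}

// postSnapshot sends payload through the handler in-process and returns
// what was printed plus the HTTP status code.
func postSnapshot(payload string) (string, int) {
	var out bytes.Buffer
	req := httptest.NewRequest(http.MethodPost, "/snapshot", strings.NewReader(payload))
	rec := httptest.NewRecorder()
	snapshotHandler(&out).ServeHTTP(rec, req)
	return out.String(), rec.Code
}

func main() {
	printed, code := postSnapshot(`{ "deployment_id": "abc", "workspaces": [] }`)
	fmt.Print(printed)
	fmt.Println(code)
}
```

In a real command the writer would be `os.Stdout` and the handler would be served with `http.ListenAndServe`.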
Adds coderd_template_workspace_build_duration_seconds histogram that
tracks the full duration from workspace build creation to agent ready.
This captures the complete user-perceived build time including
provisioning and agent startup.
The metric is emitted when the agent reports ready/error/timeout via the
lifecycle API, ensuring each build is counted exactly once per replica.
Previously, UpsertBoundaryUsageStats (INSERT...ON CONFLICT DO UPDATE) and
GetAndResetBoundaryUsageSummary (DELETE...RETURNING) could race during
telemetry period cutover. Without serialization, an upsert concurrent with the
delete could lose data (deleted right after being written) or commit after the
delete (miscounted in the next period). Both operations now acquire
LockIDBoundaryUsageStats within a transaction to ensure a clean cutover.
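As an in-memory analogy only (the real change uses a database lock inside a transaction), the sketch below uses a `sync.Mutex` in place of `LockIDBoundaryUsageStats`. The `usageStore` type is hypothetical, and treating the upsert as an additive counter is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// usageStore is a hypothetical in-memory stand-in for the stats table.
type usageStore struct {
	mu    sync.Mutex // plays the role of the shared lock
	stats map[string]int64
}

func newUsageStore() *usageStore {
	return &usageStore{stats: map[string]int64{}}
}

// upsert adds n to the replica's counter while holding the lock.
func (s *usageStore) upsert(replica string, n int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.stats[replica] += n
}

// getAndReset sums all counters and clears them while holding the same
// lock, so no upsert can land between the read and the delete.
func (s *usageStore) getAndReset() int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	var total int64
	for _, v := range s.stats {
		total += v
	}
	s.stats = map[string]int64{}
	return total
}

// countAcrossCutover runs writers concurrently with one mid-stream reset
// and returns the sum of both periods.
func countAcrossCutover(writers int) int64 {
	s := newUsageStore()
	var wg sync.WaitGroup
	for i := 0; i < writers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.upsert("replica-a", 1)
		}()
	}
	firstPeriod := s.getAndReset() // cutover races with the writers
	wg.Wait()
	secondPeriod := s.getAndReset()
	return firstPeriod + secondPeriod
}

func main() {
	fmt.Println(countAcrossCutover(100))
}
```

Because both operations take the same lock, every write is attributed to exactly one period and the two periods always add up to the number of writes.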
This pull request updates the documentation review workflow in
`.github/workflows/doc-check.yaml` to improve clarity and introduce
sticky comment logic for doc-check reviews. The changes focus on
refining the review context messages and providing detailed instructions
for updating existing doc-check comments, ensuring more consistent and
actionable documentation feedback.
**Workflow message and prompt improvements:**
* Refined the context messages for different PR trigger types to be
clearer and less repetitive, making instructions more concise for the
agent.
**Sticky comment logic and instructions:**
* Updated the task prompt to instruct the agent to look for an existing
doc-check comment containing `<!-- doc-check-sticky -->` and update it
instead of creating a new one, supporting more efficient and organized
review threads.
* Added detailed instructions for how to update sticky comments,
including checking off addressed items, striking through items no longer
needed, adding new items, and warning if changes can't be verified.
* Modified the comment format example to include sticky comment
conventions, such as strikethrough for reverted items, checkboxes for
addressed items, and warnings for unverifiable documentation changes.
* Ensured the `<!-- doc-check-sticky -->` marker is placed at the end of
the comment for easier identification and updates in future runs.
## Description
Fixes an incorrect path in the air-gapped/offline installation
documentation for publishing Coder modules to Artifactory.
The [coder/registry](https://github.com/coder/registry) repo has the
following structure:
```
registry/ # repo root
└── registry/ # subdirectory
└── coder/
└── modules/
```
The documentation previously instructed users to run:
```shell
cd registry/coder/modules
```
But the correct path is:
```shell
cd registry/registry/coder/modules
```
This was causing confusion for users trying to set up Coder modules in
air-gapped environments with Artifactory or similar repository managers.
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Adds a Go wrapper (`scripts/apidocgen/swaginit/main.go`) that calls
swag's Go API with `Strict: true`. The `--strict` flag isn't available
in swag's CLI in any version, so the wrapper is the only way to enable
it.
Also upgrades swag from v1.16.2 to v1.16.6 (better generics support,
precise numeric formats, `x-enum-descriptions`, CVE-2024-45338 fix).
Closes [`internal#1292`](https://github.com/coder/internal/issues/1292)
This pull-request reduces our nesting of the `View Task` button. It's
easier to jump to tasks now as we don't have to wait for the app status
to exist.
Previously we returned 400 Bad Request for all non-active states. This
was semantically incorrect for transitional and paused states where the
request is valid but conflicts with current state.
We now return 409 Conflict for pending/initializing/paused (resolvable
by waiting or resuming) and 400 for error/unknown (actual problems).
This enables client-side auto-resume orchestration per the task
lifecycle RFC.
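The mapping described above could look roughly like this. The `taskState` type, its constants, and `statusForState` are illustrative names, not the actual handler code, and returning `200 OK` for the active state is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
)

// taskState is a hypothetical enum of the states named in the description.
type taskState string

const (
	stateActive       taskState = "active"
	statePending      taskState = "pending"
	stateInitializing taskState = "initializing"
	statePaused       taskState = "paused"
	stateError        taskState = "error"
	stateUnknown      taskState = "unknown"
)

// statusForState returns the HTTP status for a request made while the
// task is in the given state.
func statusForState(s taskState) int {
	switch s {
	case stateActive:
		return http.StatusOK
	case statePending, stateInitializing, statePaused:
		// The request is valid but conflicts with the current state;
		// waiting or resuming resolves it.
		return http.StatusConflict
	default:
		// error, unknown, and anything unrecognized are actual problems.
		return http.StatusBadRequest
	}
}

func main() {
	for _, s := range []taskState{stateActive, statePaused, stateError} {
		fmt.Println(s, statusForState(s))
	}
}
```

A client can then treat `409` as a signal to wait or resume and retry, while `400` remains a hard failure.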
Closes coder/internal#1265
Task snapshots were orphaned when tasks were soft-deleted. The
`task_snapshots` table has an `ON DELETE CASCADE` foreign key, but
that only fires on hard deletes.
Modified DeleteTask to use a CTE that atomically soft-deletes the
task and removes its snapshot in a single transaction. The query now
returns just the task UUID instead of the full row.
Closes coder/internal#1283
Relates to https://github.com/coder/coder/pull/21922 /
https://github.com/coder/internal/issues/1259
* Adds `dbfake.BuilderOption func(*WorkspaceBuildBuilder)`
* Adds `BuilderOption` methods for setting various provisioner job
related fields on `WorkspaceBuildBuilder`.
* Migrates a number of existing tests that previously depended on
provisioner job timing to use these updated methods in the following
packages:
* `coderd/jobreaper`
* `coderd/notifications/reports`
* `enterprise/coderd/schedule`
* `enterprise/coderd/prebuilds`
* `scripts/workspace-runtime-audit`
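For readers unfamiliar with the pattern, here is a minimal sketch of functional options that pin provisioner job timestamps so tests do not depend on wall-clock timing. The `buildBuilder` type and the `withJob*` option names are hypothetical and do not mirror the real `dbfake` API.

```go
package main

import (
	"fmt"
	"time"
)

// buildBuilder is a hypothetical stand-in for WorkspaceBuildBuilder.
type buildBuilder struct {
	createdAt   time.Time
	startedAt   time.Time
	completedAt time.Time
}

// builderOption mutates the builder before the build is created.
type builderOption func(*buildBuilder)

func withJobCreatedAt(t time.Time) builderOption {
	return func(b *buildBuilder) { b.createdAt = t }
}

func withJobStartedAt(t time.Time) builderOption {
	return func(b *buildBuilder) { b.startedAt = t }
}

func withJobCompletedAt(t time.Time) builderOption {
	return func(b *buildBuilder) { b.completedAt = t }
}

// newBuild applies the options in order.
func newBuild(opts ...builderOption) *buildBuilder {
	b := &buildBuilder{}
	for _, opt := range opts {
		opt(b)
	}
	return b
}

// demoRuntimeMinutes builds a job with fixed timestamps and returns how
// long it ran, in whole minutes.
func demoRuntimeMinutes() int {
	base := time.Date(2025, 1, 1, 12, 0, 0, 0, time.UTC)
	b := newBuild(
		withJobCreatedAt(base),
		withJobStartedAt(base.Add(time.Minute)),
		withJobCompletedAt(base.Add(6*time.Minute)),
	)
	return int(b.completedAt.Sub(b.startedAt).Minutes())
}

func main() {
	fmt.Println(demoRuntimeMinutes())
}
```

Because the timestamps are supplied explicitly, a test can describe "a job that ran for five minutes" without sleeping or racing the clock.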
🤖 Created using Mux (Opus 4.5)
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
We attempted to unify these previously in #21914; however, it appears I
missed dropping this a `font-weight` level. This pull-request makes this
very simple change, so it's now in line with the Figma design!
fixes: https://github.com/coder/internal/issues/1300
Adds brotli and zstd compression to the binary cache. Also refactors coderd's streaming encoding middleware to use the same standard set of compression algorithms, so we have them in one place.
relates to: https://github.com/coder/internal/issues/1300
Refactors the options to the site handler to take the cache directory, rather than expecting the caller to call `ExtractOrReadBinFS` and pass the results.
This is important in this stack because we need direct access to the cache directory for compressed file caching.
relates to: https://github.com/coder/internal/issues/1300
Refactors the bin handler to be a `struct` instead of a handlerfunc. The reason we want this is because we are going to introduce a cache of compressed files, so we need somewhere to put this cache.
relates to: https://github.com/coder/internal/issues/1300
Refactors the site binary handler routines to their own file. The `site.go` was getting pretty long and I want to do some refactoring on how the binary handler works.
This PR is literally just moving code from file to file; at the package level nothing is changed.
relates to: https://github.com/coder/internal/issues/1300
Adds a new package called `cachecompress` which takes a `http.FileSystem` and wraps it with an on-disk cache of compressed files. We lazily compress files when they are requested over HTTP.
# Why we want this
With cached compression, we significantly reduce CPU utilization during workspace creation.

This is from a 2k scaletest at the top of this stack of PRs so that it's used to serve `/bin/` files. Previously we pegged the 4-core Coderds, with profiling showing 40% of CPU going to `zstd` compression (c.f. https://github.com/coder/internal/issues/1300).
With this change compression is reduced down to 1s of CPU time (from 7 minutes).
# Implementation details
The basic structure is taken from Chi's Compressor middleware. I've reproduced the `LICENSE` in the directory because it's MIT licensed, not AGPL like the rest of Coder.
I've structured it not as a middleware that calls an arbitrary upstream HTTP handler, but taking an explicit `http.FileSystem`. This is done for safety so we are only caching static files and not dynamically generated content with this.
One limitation is that on first request for a resource, it compresses the whole file before starting to return any data to the client. For large files like the Coder binaries, this can add 1-5 seconds to the time-to-first-byte, depending on the compression used.
I think this is reasonable: it only affects the very first download of the binary with a particular compression for a particular Coderd.
If we later find this unacceptable, we can fix it without changing interfaces. We can poll the file system to figure out how much data is available while the compression is in progress.
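The lazy, compress-on-first-request behaviour can be illustrated with a small sketch. It is not the `cachecompress` implementation: it uses gzip from the standard library instead of brotli or zstd, ignores concurrent requests for the same file, and the names `cachedGzip` and `roundTrip` are made up for the example.

```go
package main

import (
	"compress/gzip"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// cachedGzip returns the path of a gzip copy of src inside cacheDir. The
// whole file is compressed on the first call; later calls are cache hits.
func cachedGzip(cacheDir, src string) (string, error) {
	dst := filepath.Join(cacheDir, filepath.Base(src)+".gz")
	if _, err := os.Stat(dst); err == nil {
		return dst, nil // already compressed
	}
	in, err := os.Open(src)
	if err != nil {
		return "", err
	}
	defer in.Close()
	tmp, err := os.CreateTemp(cacheDir, "compress-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(tmp.Name()) // harmless once the rename succeeded
	zw := gzip.NewWriter(tmp)
	if _, err := io.Copy(zw, in); err != nil {
		tmp.Close()
		return "", err
	}
	if err := zw.Close(); err != nil {
		tmp.Close()
		return "", err
	}
	if err := tmp.Close(); err != nil {
		return "", err
	}
	// Rename last so readers never see a partially written file.
	return dst, os.Rename(tmp.Name(), dst)
}

// roundTrip compresses a temp file twice through the cache and reports
// whether the cached copy decompresses back to content.
func roundTrip(content string) (bool, error) {
	dir, err := os.MkdirTemp("", "cachecompress-sketch")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)
	src := filepath.Join(dir, "coder-binary")
	if err := os.WriteFile(src, []byte(content), 0o600); err != nil {
		return false, err
	}
	first, err := cachedGzip(dir, src)
	if err != nil {
		return false, err
	}
	second, err := cachedGzip(dir, src) // served from the cache
	if err != nil {
		return false, err
	}
	f, err := os.Open(second)
	if err != nil {
		return false, err
	}
	defer f.Close()
	zr, err := gzip.NewReader(f)
	if err != nil {
		return false, err
	}
	got, err := io.ReadAll(zr)
	return first == second && string(got) == content, err
}

func main() {
	fmt.Println(roundTrip("static file contents"))
}
```

The sketch shows the same trade-off as described above: the first caller pays for compressing the whole file before any bytes are available, and every later caller only pays for a file lookup.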
follows on from #21940.
The API endpoints existed for this already, so this PR just adds CLI functionality which uses those API endpoints.
Generated with the help of Mux
## Summary
Updates the AI Governance documentation to explicitly mention that both
Community and Premium deployments include 1,000 Agent Workspace Builds.
Also clarifies that Community deployments do not have access to AI
Bridge or Agent Boundaries.
This is a follow-up to #21943 which made the same clarification in the
Tasks documentation.
## Changes
- Updated the "Agent Workspace Build Limits" section in
`docs/ai-coder/ai-governance.md`
- Added explicit mention that Community deployments lack AI Bridge and
Agent Boundaries access
---
Created on behalf of @mattvollmer
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
## Summary
Fixes flaky `TestServer/BuiltinPostgres` test caused by port conflicts
in CI.
## Fix
Increase retry attempts from 3 to 10 for better odds when port conflicts
occur.
Fixes https://github.com/coder/internal/issues/1017
Adds additional logs for determining what signal the agent receives
prior to shut down. Also helps distinguish whether the signal originated
at the agent or reaper.
## Description
This PR adds documentation for configuring clients to work with AI
Bridge via AI Bridge Proxy, specifically GitHub Copilot.
Preview:
https://coder.com/docs/@docs-aibridge-proxy-client-config/ai-coder/ai-bridge/ai-bridge-proxy/setup#client-configuration
## Changes
* Add Client Configuration section to
`docs/ai-coder/ai-bridge/ai-bridge-proxy/setup.md` covering proxy and CA
certificate configuration
* Add `docs/ai-coder/ai-bridge/clients/copilot.md` with configuration
instructions for: Copilot CLI, VS Code Copilot Extension, JetBrains IDEs
* Update `docs/ai-coder/ai-bridge/clients/index.md`:
* Add introduction explaining base URL vs proxy-based integration
* Add GitHub Copilot to compatibility table
Related to: https://github.com/coder/internal/issues/1188
Context was created before expensive setup operations (building
workspaces, starting agents), leaving insufficient time for the actual
command execution. Split into setupCtx for setup and a fresh ctx for
the command to ensure both get the full timeout.
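A small sketch of the split, under the assumption that setup and command should each get the same timeout. `commandBudget` and `simulateSetup` are hypothetical helpers; the real test builds workspaces and starts agents where this sleeps.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// simulateSetup stands in for expensive setup work.
func simulateSetup(ctx context.Context, cost time.Duration) {
	select {
	case <-time.After(cost):
	case <-ctx.Done():
	}
}

// commandBudget runs setup under setupCtx, then creates a fresh ctx for
// the command and returns how much time the command has left.
func commandBudget(timeout, setupCost time.Duration) time.Duration {
	setupCtx, cancelSetup := context.WithTimeout(context.Background(), timeout)
	defer cancelSetup()
	simulateSetup(setupCtx, setupCost)

	// Created only after setup finished, so setup time is not deducted.
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	deadline, _ := ctx.Deadline()
	return time.Until(deadline)
}

// commandBudgetMillis is a convenience wrapper using milliseconds.
func commandBudgetMillis(timeoutMs, setupMs int64) int64 {
	d := commandBudget(time.Duration(timeoutMs)*time.Millisecond,
		time.Duration(setupMs)*time.Millisecond)
	return d.Milliseconds()
}

func main() {
	// With a single shared context the command would have roughly
	// timeout minus setup left; here it gets close to the full timeout.
	fmt.Println(commandBudgetMillis(1000, 100), "ms left for the command")
}
```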
The API endpoints existed for this already, so this PR just adds CLI
functionality which uses those API endpoints.
closes #21891
Generated with the help of Mux
macOS runners lack GNU toolchain dependencies (bash 4+, GNU getopt, make
4+) required by `scripts/lib.sh`. When any script sources `lib.sh`, it
checks for these dependencies and fails if they're missing.
This caused consistent failures in the `test-go-pg (macos-latest)` job
in `nightly-gauntlet.yaml`, which didn't have the GNU tools setup that
`ci.yaml` had. Commit 9a417df ("ci: add retry logic for Go module
operations") added a macOS GNU tools step to `ci.yaml`, but
`nightly-gauntlet.yaml` was not updated.
This PR adds a reusable `setup-gnu-tools` action and uses it
consistently across all workflows with macOS jobs, replacing the inline
brew install steps.
Closes https://github.com/coder/internal/issues/1133
The Connection Log page has a preset filter "Active SSH connections"
that was using `status:connected`, but the only valid status enum values
are `completed` and `ongoing`. This caused the preset to generate an
invalid query.
This changes the preset to use `status:ongoing type:ssh` and adds a
typed helper function so that invalid enum values will be caught at
compile time.
---
PR generated by [mux](https://mux.coder.com), but reviewed by a human.
Adds support for filtering workspaces by health status using
`healthy:true` or `healthy:false` in the search query.
This is done by changing `has-agent` to accept a list of statuses and
aliasing `healthy:true` to `has-agent:connected` and `healthy:false` to
`has-agent:timeout,disconnected`.
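The aliasing can be pictured as a simple term rewrite. The functions below are illustrative only and do not reflect how the real search-query parser is structured.

```go
package main

import (
	"fmt"
	"strings"
)

// expandHealthy rewrites a healthy:<bool> term into its has-agent form
// and leaves every other term untouched.
func expandHealthy(term string) string {
	switch strings.ToLower(term) {
	case "healthy:true":
		return "has-agent:connected"
	case "healthy:false":
		return "has-agent:timeout,disconnected"
	default:
		return term
	}
}

// expandQuery applies the alias to each whitespace-separated term.
func expandQuery(query string) string {
	terms := strings.Fields(query)
	for i, t := range terms {
		terms[i] = expandHealthy(t)
	}
	return strings.Join(terms, " ")
}

func main() {
	fmt.Println(expandQuery("owner:me healthy:false"))
	fmt.Println(expandQuery("healthy:true"))
}
```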
Fixes #21623
Add the ability to pause and resume tasks directly from the Tasks table,
allowing users to manage workspace resources without navigating to
individual task pages.
This pull-request implements various permission checks to the
`<OAuth2App* />` stories and components. We're trying to ensure that
we're actually allowed to `create`/`view`/`delete` on both Secrets and
Applications before showing them to the user/allowing action.
Furthermore, I've added various stories to catch when a user lacks these
permissions.
I noticed this particularly because I'm only an `Auditor` on our DEV
instance and can't see these fields.
---------
Co-authored-by: coder-tasks[bot] <254784001+coder-tasks[bot]@users.noreply.github.com>
The comments generated are too noisy and not of sufficiently high signal
that we should automatically opt every PR in.
This PR moves the trigger to the `code-review` label _only_.
Signed-off-by: Danny Kopping <danny@coder.com>
This pull-request implements a super simple change, essentially when we
fail to login we'd like to persist the `email` used when attempting to
sign-in. This just speeds up the flow rather than having to type the
email in again.
This PR increases the size of the schedule increment/decrement buttons
([-] [+]) to match the icon button style at size `sm` (same as the Stop,
Restart buttons).
## Changes
- Button dimensions: 20×20px → 32×32px
- Icon size: `size-icon-xs` → `size-icon-sm`
- Border radius: 4px → 6px (consistent with other icon buttons)
## Before
The [-] [+] buttons were tiny (20×20px) and difficult to click.
## After
The buttons now match the icon button style at size `sm` (32×32px),
consistent with other topbar buttons.
---
Created on behalf of @christin
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
> [!NOTE]
> It should be noted that pull-requests #21781, #21807, and #21809 are
required before we can merge this. This will stop us from battling the
`z-index` that is provided by MUI.
This is avoiding the changes that would be required in #21819
This pull-request removes our reliance on controlling the scroll from
within another `<div />`. This means that we can actively make use of
`<ScrollRestoration />`, where the page will return to the top when you
navigate to a new URL.
Updates the multi-model support description in the Coder Research docs
to reference provider companies (Anthropic, xAI, OpenAI) instead of
specific model names (Claude sonnet-4/opus-4, Grok, GPT-5).
This makes the docs more stable as model names change frequently, while
provider names remain constant.
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: Matt Vollmer <matthewjvollmer@outlook.com>
- remove beta labels
- clarify how AWB is measured
- reassurance of no downtimes when limit is reached
---------
Co-authored-by: Atif Ali <atif@coder.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Matt Vollmer <matthewjvollmer@outlook.com>
Adds `add-project` to the `mux` module in the dogfood Coder template so
Mux opens the cloned repo by default.
- Uses `local.repo_dir` (defaults to `/home/coder/coder`) so it stays
correct if the repo base dir parameter changes.
Testing:
- `terraform fmt -check dogfood/coder/main.tf`
Adds a new AI Bridge client configuration page for **Mux** and lists it
in the client compatibility table.
- Add `docs/ai-coder/ai-bridge/clients/mux.md` with a short intro, UI +
env var + `~/.mux/providers.jsonc` examples
- Add Mux to the AI Bridge client compatibility table
- Add the new page to `docs/manifest.json`
Refs: https://mux.coder.com/config/providers#environment-variables
This pull-request ensures that we're using `<DropdownMenu />` in the
`Admin Settings` button, as things weren't uniform before. This is in
line with the Figma design with the darker ("black") background. This
has an added side-benefit of removing some MUI-specific code.
<img
src="https://github.com/user-attachments/assets/4eb9136b-91b3-44ac-81a0-5abd1cf2cdf2"
/>
Update the agent protobuf schema (agent/proto/agent.proto) to include:
- subagent_id field in WorkspaceAgentDevcontainer message
- id field in CreateSubAgentRequest message
Bump the Agent API version from v2.7 to v2.8 and update all client
references throughout the codebase (ConnectRPC27 -> ConnectRPC28,
DRPCAgentClient27 -> DRPCAgentClient28).
## Description
Add documentation for AI Bridge Proxy.
## Changes
This PR adds documentation for AI Bridge Proxy under
`docs/ai-coder/ai-bridge/ai-bridge-proxy/`:
* `index.md`: Overview of AI Bridge Proxy, how it works (MITM vs tunnel
modes), and when to use it
* `setup.md`: Setup guide covering:
* Proxy configuration and required settings
* Security considerations and deployment options
* CA certificate generation (self-signed and organization-signed)
* Upstream proxy chaining configuration
Note: TODO comments in the documentation will be addressed in follow-up
PRs.
Related to: https://github.com/coder/internal/issues/1188
This pull-request refactors filter-related dropdown and input components
from MUI to our Tailwind-based design system. This is more in line with
the Figma design. Controversially, we are changing the button group for
canned filters and input to two separate components.
- **InputGroup**: Complete rewrite to a compound component pattern
(`InputGroup`, `InputGroupAddon`, `InputGroupInput`, `InputGroupButton`)
using Tailwind and CVA, replacing the old CSS-in-JS approach
- **SearchField**: Migrated from MUI TextField to use the new InputGroup
components, with a simplified API and proper ref forwarding
- **Filter/PresetMenu**: Replaced MUI Menu with our DropdownMenu
component, and updated icon to `SlidersHorizontal`
### Changes
| Component | Before | After |
|-----------|--------|-------|
| InputGroup | CSS-in-JS with MUI margin hacks | Compound component with
Tailwind group states |
| SearchField | MUI TextField + InputAdornment | InputGroup +
InputGroupAddon composition |
| PresetMenu | MUI Menu/MenuItem | DropdownMenu/DropdownMenuItem |
| MenuSearch | Complex CSS overrides | Single Tailwind class |
<img
src="https://github.com/user-attachments/assets/5b819027-2dca-4dcc-b6d6-7096fa3775c0"
/>
On Windows, `pty.New()` was creating a `ConPTY` (`PseudoConsole`) even
when no process would be attached. `ConPTY` requires a real process to
function correctly - without one, the pipe handles become invalid
intermittently, causing flaky test failures like `read |0: The handle is
invalid.`
This affected tests using the `ptytest.New()` + `Attach()` pattern for
in-process CLI testing.
The fix splits Windows PTY creation into two paths:
- `newPty()` now returns a simple pipe-based PTY for the `Attach()` use
case
- `newConPty()` creates a real `ConPTY`, called by `Start()` when a
process will be attached
AFAICT this will result in no change in behaviour outside of tests.
Fixes coder/internal#1277
_Disclaimer: investigated and implemented by Claude Opus 4.5, reviewed
by me._
---------
Signed-off-by: Danny Kopping <danny@coder.com>
* Adds support for parameter `format=text` in the following API routes:
* `/api/v2/workspaceagents/:id/logs`
* `/api/v2/workspacebuilds/:id/logs`
* `/api/v2/templateversions/:id/logs`
* `/api/v2/templateversions/:id/dry-run/:id/logs`
* Adds links to view raw logs on the following pages:
* Workspace build page
* Template editor page
* Template version page
* Refactors existing log formatting in `cli/logs.go` to live in `codersdk`.
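As a rough illustration of the new parameter, the sketch below builds a raw-log URL for a workspace build. The deployment URL, the build ID, and the `rawLogsURL` helper are assumptions made for this example; only the route and the `format=text` parameter come from the list above.

```go
package main

import (
	"fmt"
	"net/url"
)

// rawLogsURL is a hypothetical helper (not part of codersdk). It builds the
// workspace build logs route from the list above and adds format=text.
// buildID is assumed to be a plain ID string that needs no escaping.
func rawLogsURL(base, buildID string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/api/v2/workspacebuilds/" + buildID + "/logs"
	q := url.Values{}
	q.Set("format", "text")
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	s, err := rawLogsURL("https://coder.example.com", "1234")
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```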
🤖 Generated with Claude Opus 4.5, reviewed by me.
---------
Co-authored-by: Claude <noreply@anthropic.com>
The AcquireProvisionerJob query only checked started_at IS NULL, allowing
it to acquire jobs that were canceled while pending (which have
completed_at set but started_at still NULL).
Added completed_at IS NULL check to the query to prevent this.
Also fixed JobCompleteBuilder.Do() in dbfake to set started_at when
completing jobs to match production behavior.
Fixes coder/internal#1323
## Summary
Previously, `CODER_PPROF_ADDRESS` and `CODER_PROMETHEUS_ADDRESS` were
hardcoded in the Helm chart template to `0.0.0.0:6060` and
`0.0.0.0:2112` respectively. These values could not be overridden via
`coder.env` values because the hardcoded values were set first in the
template, and Kubernetes uses the first occurrence of duplicate env
vars.
This was a security concern because binding to `0.0.0.0` exposes these
endpoints to any pod in the cluster:
- **pprof** can expose sensitive runtime information (goroutine stacks,
heap profiles, CPU profiles that may contain memory contents)
- **Prometheus metrics** may contain sensitive operational data
## Changes
1. **`helm/coder/templates/_coder.tpl`**: Added logic to check if the
user has set `CODER_PPROF_ADDRESS` or `CODER_PROMETHEUS_ADDRESS` in
`coder.env` before applying the default values. If the user provides a
value, the hardcoded default is skipped.
2. **`helm/coder/values.yaml`**: Updated documentation to:
- Remove these vars from the "cannot be overridden" list
- Add them to a new "can be overridden" section with security
recommendations
3. **Tests**: Added test cases for both override scenarios with
corresponding golden files.
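The override logic in step 1 can be pictured with a small Go sketch. The real change lives in a Helm template, so everything here (the `envVar` type, `defaults`, and `renderEnv`) is a hypothetical stand-in that only shows the "skip the default when the user set the variable" rule.

```go
package main

import "fmt"

type envVar struct{ Name, Value string }

// defaults holds the two values the chart previously hardcoded.
var defaults = []envVar{
	{Name: "CODER_PPROF_ADDRESS", Value: "0.0.0.0:6060"},
	{Name: "CODER_PROMETHEUS_ADDRESS", Value: "0.0.0.0:2112"},
}

// renderEnv emits each default only if the user has not set a variable
// with the same name, then appends the user's variables unchanged.
func renderEnv(user []envVar) []envVar {
	set := make(map[string]bool, len(user))
	for _, e := range user {
		set[e.Name] = true
	}
	out := []envVar{}
	for _, d := range defaults {
		if !set[d.Name] {
			out = append(out, d)
		}
	}
	return append(out, user...)
}

func main() {
	user := []envVar{{Name: "CODER_PPROF_ADDRESS", Value: "127.0.0.1:6060"}}
	for _, e := range renderEnv(user) {
		fmt.Printf("%s=%s\n", e.Name, e.Value)
	}
}
```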
## Usage
Users can now restrict pprof and prometheus to localhost only:
```yaml
coder:
env:
- name: CODER_PPROF_ADDRESS
value: "127.0.0.1:6060"
- name: CODER_PROMETHEUS_ADDRESS
value: "127.0.0.1:2112"
```
## Local Testing
To verify the fix locally:
```bash
# Update helm dependencies
cd helm/coder && helm dependency update
# Test default behavior (should show 0.0.0.0)
helm template coder . -f tests/testdata/default_values.yaml --namespace default | grep -A1 'CODER_PPROF_ADDRESS\|CODER_PROMETHEUS_ADDRESS'
# Test pprof override (should show 127.0.0.1:6060)
helm template coder . -f tests/testdata/pprof_address_override.yaml --namespace default | grep -A1 'CODER_PPROF_ADDRESS'
# Test prometheus override (should show 127.0.0.1:2112)
helm template coder . -f tests/testdata/prometheus_address_override.yaml --namespace default | grep -A1 'CODER_PROMETHEUS_ADDRESS'
# Run Go tests
cd tests && go test . -v
```
Fixes #21713
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: uzair-coder07 <uzair@coder.com>
## Summary
Updates the description for the "Use" role in the workspace sharing
dropdown to explicitly mention that users with this permission can start
and stop the workspace, not just read and access it.
## Changes
- Updated the "Use" role description from "Can read and access this
workspace." to "Can read, access, start, and stop this workspace."
## Context
This clarification helps users understand the full scope of the "Use"
permission, which includes `ActionWorkspaceStart` and
`ActionWorkspaceStop` as defined in `coderd/database/db2sdk/db2sdk.go`.
---
*Created on behalf of @geokat*
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Fixes the state format for Workspace Sharing in `docs/manifest.json`.
Changes `"early_access"` to `"early access"` (with space, no underscore)
to match the format used by other early access entries and to fix builds
on coder/coder.com.
Follow-up to #21797.
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
This pull request adds a new documentation file that defines the
"code-review" skill for use in the project. The document outlines a
standard workflow, severity levels, key areas to focus on during code
reviews, and Coder-specific review guidelines. This aims to standardize
and improve the quality and consistency of code reviews across the team.
Documentation and process standardization:
* Added `.claude/skills/code-review/SKILL.md`, which describes the
code-review skill, including workflow steps, severity levels, what to
look for in reviews, and what not to comment on. It also provides
Coder-specific patterns and best practices for authorization, error
handling, and shell scripting.
This PR changes the shared workspaces documentation page from Beta to
Early Access status.
Changes `docs/manifest.json` to update the state from `["beta"]` to
`["early_access"]` for the Workspace Sharing page.
Ref: https://coder.com/docs/user-guides/shared-workspaces
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
When a workspace has multiple agents (e.g., main + devcontainer), the
build timeline was showing all events duplicated under each agent
instead of filtering by the agent they belong to.
Added agentId to the Stage type and filter timings by workspace_agent_id
so each agent section only shows its own events.
Fixes #18002
These tests use dbfake to set up database state directly and don't
need a provisioner daemon. Removing it fixes a flaky failure on
Windows where the provisioner daemon acquired a job that dbfake had
already "completed", causing the task status to be "error" instead
of "paused".
Fixes coder/internal#1322
Refs coder/internal#1323
Previously there were two issues that could cause incorrect boundary
usage telemetry data.
1. Bad handling across snapshot intervals: After telemetry snapshot deleted
the DB row, the next flush would INSERT the stale cumulative data (which
included already-reported usage). This would then be overwritten by
subsequent UPDATE flushes, causing the delta between the last snapshot
and the reset to be lost (under-reporting usage). Additionally, if there
was no new usage after the reset, the tracker would carry over all usage
from the previous period into the next period (over-reporting usage).
2. Missed usage from a race condition: Track() calls between the first
mutex unlock and second mutex lock in FlushToDB() were lost. The data
wasn't included in the current flush (already snapshotted) and was wiped
by the subsequent reset. This is likely low impact to overall usage
numbers in the real world.
Fix by tracking unique workspace/user deltas separately from cumulative
values and always tracking delta allowed/denied requests. Deltas are used
for INSERT (fresh start after reset), cumulative for UPDATE (accurate unique
counts within a period). All counters reset atomically before the DB operation
so Track() calls during the operation are preserved for the next flush.
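A simplified sketch of the "reset atomically before the DB operation" idea is shown below. It covers only the allowed/denied delta counters, not the unique workspace/user tracking or the INSERT/UPDATE split, and the `tracker`, `counts`, and `Flush` names are assumptions for illustration rather than the actual implementation.

```go
package main

import (
	"fmt"
	"sync"
)

type counts struct{ Allowed, Denied int64 }

// tracker is a hypothetical, reduced version of a usage tracker.
type tracker struct {
	mu    sync.Mutex
	delta counts
}

// Track records one request in the pending delta.
func (t *tracker) Track(allowed bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if allowed {
		t.delta.Allowed++
	} else {
		t.delta.Denied++
	}
}

// Flush copies and resets the delta in a single critical section, then runs
// the (slow) write outside the lock. Track calls that arrive during the
// write land in the fresh delta and are reported by the next flush.
func (t *tracker) Flush(write func(counts)) {
	t.mu.Lock()
	pending := t.delta
	t.delta = counts{}
	t.mu.Unlock()
	write(pending)
}

func main() {
	tr := &tracker{}
	tr.Track(true)
	tr.Track(false)
	tr.Flush(func(c counts) { fmt.Printf("flushed %+v\n", c) })
}
```

Because there is only one lock/unlock pair around the copy and the reset, there is no window between "snapshot" and "reset" in which a `Track()` call could be lost.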
Module archiving now saves as many modules as it can before it hits the limit, so the template keeps working as far as possible instead of failing outright.
## Description
Adds authentication support for upstream proxies in `aibridgeproxyd`.
When credentials are provided in the upstream proxy URL, the
`Proxy-Authorization` header is now included in `CONNECT` requests.
## Changes
* Extract credentials from upstream proxy URL and set
`Proxy-Authorization` header on tunneled `CONNECT` requests
* Support optional user and password
* Fail at startup if both username and password are empty
* Add tests for all auth scenarios
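The credential extraction can be sketched with the standard library. This example assumes the `Basic` scheme, which the description above does not state, and `proxyAuthHeader` is a hypothetical helper rather than the function used in `aibridgeproxyd`.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"net/url"
)

// proxyAuthHeader returns a Proxy-Authorization value derived from the
// userinfo part of an upstream proxy URL. It returns "" when the URL has
// no credentials, and an error when credentials are present but both the
// username and the password are empty.
func proxyAuthHeader(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	if u.User == nil {
		return "", nil // no credentials supplied
	}
	user := u.User.Username()
	pass, _ := u.User.Password()
	if user == "" && pass == "" {
		return "", errors.New("upstream proxy URL has empty credentials")
	}
	token := base64.StdEncoding.EncodeToString([]byte(user + ":" + pass))
	return "Basic " + token, nil // assumption: Basic scheme
}

func main() {
	h, err := proxyAuthHeader("http://user:pass@proxy.example.com:8080")
	if err != nil {
		panic(err)
	}
	fmt.Println(h)
}
```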
Follow-up: https://github.com/coder/internal/issues/1204
Apply optimizations:
* https://github.com/openai/openai-go/pull/602
* https://github.com/coder/aibridge/pull/160
These reduce CPU time and allocation count for OpenAI `chat/completions`
and `responses` APIs, making the use of OpenAI chat models through AI
Bridge more performant.
In order to test these changes, we add scaletesting support for the
responses API.
## Summary
This PR restructures the Agent Boundaries documentation to improve URL
clarity and consistency:
### Changes
- Renames `/docs/ai-coder/boundary/` to
`/docs/ai-coder/agent-boundaries/`
- Renames `agent-boundary.md` to `index.md` for cleaner URLs
- Updates all internal doc references to the new paths
- Updates `manifest.json` with new paths
- Updates prose references from "Boundary" to "Agent Boundaries"
throughout the documentation (33 changes across 4 files)
### New URL structure
| Old URL | New URL |
|---------|----------|
| `/docs/ai-coder/boundary/agent-boundary` | `/docs/ai-coder/agent-boundaries` |
| `/docs/ai-coder/boundary/nsjail` | `/docs/ai-coder/agent-boundaries/nsjail` |
| `/docs/ai-coder/boundary/landjail` | `/docs/ai-coder/agent-boundaries/landjail` |
| `/docs/ai-coder/boundary/rules-engine` | `/docs/ai-coder/agent-boundaries/rules-engine` |
| `/docs/ai-coder/boundary/version` | `/docs/ai-coder/agent-boundaries/version` |
### Follow-up required
Redirects need to be added to `coder/coder.com` for the old URLs:
- `/docs/ai-coder/agent-boundary` → `/docs/ai-coder/agent-boundaries`
(this one is currently 404'ing from Google search results)
- `/docs/ai-coder/boundary/:path*` →
`/docs/ai-coder/agent-boundaries/:path*`
---
Created on behalf of @mattvollmer
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: Matt Vollmer <matthewjvollmer@outlook.com>
Liveness checks are currently causing pods to be killed during
long-running migrations.
They are generally not advisable for our workloads; if a pod becomes
unresponsive we _need_ to know about it (due to a deadlock, etc) and not
paper over the issue by killing the pod.
I've also made all probe settings configurable.
---------
Signed-off-by: Danny Kopping <danny@coder.com>
## Summary
The `lint/actions/zizmor` target flakes in CI due to network
connectivity issues when running on depot runners
(https://github.com/coder/internal/issues/1233). The zizmor tool needs
to reach GitHub's API but intermittently fails with "Connection refused"
errors.
## Changes
- Creates a new `lint-actions` CI job that only runs when `.github/**`
files are touched (using existing `ci` filter)
- Removes zizmor from the main `lint` job
- Uses a Makefile conditional to include actionlint in `make lint`
locally but skip it in CI (where `lint-actions` handles it)
This reduces unnecessary flake exposure for PRs that don't modify GitHub
Actions files.
## Testing
- `actionlint` passes on the modified ci.yaml
- Verified Makefile conditional works: actionlint included locally,
skipped when `CI=true`
Fixes https://github.com/coder/internal/issues/1233
Closes #21044
This pull-request addresses an issue we were seeing where we would
attempt to filter the `<UserCombobox />` by the user's username or email,
not their name (which the rendered options would show).
To highlight this I created three different users, each with a username
that did not contain their `email` or `name`, and attempted to filter.
Searching for `John` wouldn't actually show the user, as his username
was `x`. In fact, even where a subset of users was returned from the
backend for having `john` in the `email`, it would've been filtered out
by the frontend for not being in the `name` field.
| Name | Username |
| --- | --- |
| `Jake` | `z` |
| `Jeff` | `y` |
| `John` | `x` |
| Previously | Now |
| --- | --- |
| <img width="560" height="547" alt="OLD_USER_COMBOBOX" src="https://github.com/user-attachments/assets/a0567264-0034-42ac-aba0-95b05c4f92dd" /> | <img width="580" height="548" alt="NEW_USER_COMBOBOX" src="https://github.com/user-attachments/assets/1aa0c942-d340-4b1c-8dde-b97879525bfb" /> |
## Description
When configuring a From address with a display name (e.g., `Coder System
<system@coder.com>`), the SMTP `MAIL FROM` command was incorrectly
receiving the full address string instead of just the bare email
address, causing `501 Invalid MAIL argument` errors on some SMTP
servers.
## Changes
- Updated `validateFromAddr` to return both:
- `envelopeFrom`: bare email for SMTP `MAIL FROM` command (RFC 5321)
- `headerFrom`: original address with display name for email header (RFC
5322)
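The split between the two addresses can be illustrated with `net/mail`. The `splitFrom` helper below is a hypothetical stand-in, not the updated `validateFromAddr`; it only shows how a bare envelope address can be derived while the original string is kept for the header.

```go
package main

import (
	"fmt"
	"net/mail"
)

// splitFrom parses a configured From value and returns:
//   - envelopeFrom: the bare address for the SMTP MAIL FROM command
//   - headerFrom:   the original value, including any display name
func splitFrom(from string) (envelopeFrom, headerFrom string, err error) {
	addr, err := mail.ParseAddress(from)
	if err != nil {
		return "", "", err
	}
	return addr.Address, from, nil
}

func main() {
	env, hdr, err := splitFrom("Coder System <system@coder.com>")
	if err != nil {
		panic(err)
	}
	fmt.Println("MAIL FROM:", env)
	fmt.Println("From:", hdr)
}
```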
Fixes #20727
## Description
Mark `--ssh-hostname-prefix` flag and `CODER_SSH_HOSTNAME_PREFIX` env
variable as deprecated, recommending users to use
`--workspace-hostname-suffix` / `CODER_WORKSPACE_HOSTNAME_SUFFIX`
instead for consistency with Coder Desktop.
The deprecated option is now hidden from help output and docs but
remains functional for backward compatibility. When used, it will show a
deprecation warning pointing to the recommended alternative.
## Changes
- Added `UseInstead` pointing to `workspace-hostname-suffix` option
(triggers deprecation warning)
- Set `Hidden: true` to hide from CLI help and documentation
- Updated description to mention deprecation
- Regenerated docs and help files via `make gen`
Closes #18156
---
_Originally requested by @matifali in
https://github.com/coder/coder/pull/18085#discussion_r2115594447_
This pull-request addresses the size of the iconography within the
`<SingleSignOnSection />` section component. As a side-effect of the
changes in #21347 we are now rendering this too large.
Furthermore, to catch these issues in future we've introduced two new
stories within `SecurityPageView.stories.tsx` which render both `oidc`
and `github` login routes.
| Old | New |
| --- | --- |
| <img width="520" height="399" alt="OLD_SSO_PROVIDER" src="https://github.com/user-attachments/assets/f6687b9a-d6bc-4bca-859a-0b59a3f6ba03" /> | <img width="520" height="398" alt="NEW_SSO_PROVIDER" src="https://github.com/user-attachments/assets/5beb8149-3e07-4dbc-9e0f-06f9207ecc59" /> |
## Summary
The bottom admin bar (DeploymentBannerView) was showing a thick
scrollbar when content overflowed horizontally. This change applies the
native thin scrollbar style instead.
## Changes
- Added `[scrollbar-width:thin]` Tailwind CSS arbitrary value to the
deployment banner container
This uses the native CSS `scrollbar-width: thin` property which is
supported in modern browsers (Firefox, Chrome, Edge, Safari) and
provides a less obtrusive scrollbar when horizontal scrolling is needed.
## Testing
- The change is purely CSS and was verified with lint and format checks
passing
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Low Risk**
> Purely a CSS styling tweak with no behavioral, data, or security
impact; risk is limited to minor cross-browser appearance differences.
>
> **Overview**
> Updates the dashboard `DeploymentBannerView` bottom admin bar styling
to use the native CSS `scrollbar-width: thin` via Tailwind
(`[scrollbar-width:thin]`), reducing scrollbar thickness when the banner
overflows horizontally.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
ba36e48d66. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
Co-authored-by: Cursor Agent <cursor@coder.com>
This pull-request resolves a really annoying issue with the `<TasksPage
/>` switcher control. Essentially every time I navigated to this page my
eyes were drawn to this button that felt out of place. I finally figured
out why: it's breaking the first rule of nested rounded corners.
We should be using the following math to calculate the roundedness.
```
outerRadius - gap = innerRadius
```
<img width="852" height="596" alt="button-rounding"
src="https://github.com/user-attachments/assets/89de5d98-0891-4c9d-a5aa-66f722796630"
/>
## Summary
Adds support for pre-filling the OAuth2 application creation form via
URL query parameters.
## Query Parameters
| Parameter | Description |
|-----------|-------------|
| `name` | Pre-fills the "Application name" field |
| `callback_url` | Pre-fills the "Callback URL" field |
| `icon` | Pre-fills the "Application icon" field |
## Example
```
/deployment/oauth2-provider/apps/add?name=MyApp&callback_url=https://example.com/callback&icon=/icon/github.svg
```
This allows external tools or documentation to link directly to the
OAuth2 app creation page with pre-populated values.
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Update provisionerdserver to handle the changes introduced to
provisionerd in https://github.com/coder/coder/pull/21602
We now create a relationship between `workspace_agent_devcontainers` and
`workspace_agents` with the newly created `subagent_id`.
## Summary
Clarifies the [AI Bridge client config authentication
section](https://coder.com/docs/ai-coder/ai-bridge/client-config#authentication)
to explicitly state that only **Coder-issued tokens** are accepted.
## Changes
- Changed "API key" to "Coder API key" throughout the Authentication
section
- Added a note clarifying that provider-specific API keys (OpenAI,
Anthropic, etc.) will not work with AI Bridge
Fixes #21790
---
Created on behalf of @dannykopping
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Previously the task logs endpoint only worked when the workspace was
running, leaving users unable to view task history after pausing.
This change adds snapshot retrieval with state-based branching: active
tasks fetch live logs from AgentAPI, paused/initializing/pending tasks
return stored snapshots (providing continuity during pause/resume), and
error/unknown states return HTTP 409 Conflict.
The response includes snapshot metadata (snapshot, snapshot_at) to
indicate whether logs are live or historical.
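The state-based branching can be sketched as a single switch. The status strings and the `logSourceFor` function are assumptions based on the wording above, not the handler's actual types.

```go
package main

import (
	"fmt"
	"net/http"
)

// logSourceFor maps a task status to where its logs come from and the
// HTTP status the endpoint would answer with. The status names are
// assumed from the description, not taken from the codebase.
func logSourceFor(status string) (source string, httpStatus int) {
	switch status {
	case "active":
		return "live", http.StatusOK // fetched from AgentAPI
	case "paused", "initializing", "pending":
		return "snapshot", http.StatusOK // stored snapshot
	default: // error, unknown, or anything unrecognized
		return "", http.StatusConflict
	}
}

func main() {
	for _, s := range []string{"active", "paused", "error"} {
		src, code := logSourceFor(s)
		fmt.Printf("%s -> %q (%d)\n", s, src, code)
	}
}
```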
Closes coder/internal#1254
Operators need to know which API key was used in HTTP requests.
For example, if a key has leaked and a DDoS is underway using that key, operators need a way to identify the key in use and take steps to expire it (see https://github.com/coder/coder/issues/21782).
_Disclaimer: created using Claude Opus 4.5_
During development of #21659 I approved some `<Paywall />` code that had
an extensive props system; however, I wasn't a huge fan of this. This
approach attempts to take it further, like something `shadcn` would,
wherein we define the `<Paywall />` (and its subset of components) and
we wrap around those when needed for `<PaywallAIGovernance />` and
`<PaywallPremium />`.
Theoretically there are no real CSS/design changes here. However, here
are screenshots for posterity.
| Previously | Now |
| --- | --- |
| <img width="2306" height="614" alt="CleanShot 2026-01-29 at 10 56 05@2x" src="https://github.com/user-attachments/assets/83a4aa1b-da74-459d-ae11-fae06c1a8371" /> | <img width="2308" height="622" alt="CleanShot 2026-01-29 at 10 55 05@2x" src="https://github.com/user-attachments/assets/4aa43b09-6705-4af3-86cc-edc0c08e53b1" /> |
---------
Co-authored-by: Ben Potter <me@bpmct.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## Description
Removes the following deprecated Prometheus metrics:
- `coderd_api_workspace_latest_build_total` → use
`coderd_api_workspace_latest_build` instead
- `coderd_oauth2_external_requests_rate_limit_total` → use
`coderd_oauth2_external_requests_rate_limit` instead
These metrics were deprecated in #12976 because gauge metrics should
avoid the `_total` suffix per [Prometheus naming
conventions](https://prometheus.io/docs/practices/naming/).
## Changes
- Removed deprecated metric `coderd_api_workspace_latest_build_total`
from `coderd/prometheusmetrics/prometheusmetrics.go`
- Removed deprecated metric
`coderd_oauth2_external_requests_rate_limit_total` from
`coderd/promoauth/oauth2.go`
- Updated tests to use the non-deprecated metric name
Fixes #12999
The test was creating two template versions without explicit names,
relying on `namesgenerator.NameDigitWith()` which can produce
collisions. When both versions got the same random name, the test failed
with a 409 Conflict error.
Fix by giving each version an explicit name (`v1`, `v2`).
Closes https://github.com/coder/internal/issues/1309
---
*Generated by [mux](https://mux.coder.com)*
Add PeriodStart and PeriodDurationMilliseconds fields to BoundaryUsageSummary
so consumers of telemetry data can understand usage within a particular time window.
## Summary
This PR updates the note on the Tasks documentation page to more clearly
explain the relationship between Premium task limits and the AI
Governance Add-On.
## Problem
The previous wording:
> "Premium Coder deployments are limited to running 1,000 tasks. Contact
us for pricing options or learn more about our AI Governance Add-On to
evaluate all of Coder's AI features."
The "or" in this sentence could be interpreted as two separate paths:
(1) contact sales for custom pricing that might not require the add-on,
OR (2) get AI Governance. This led to confusion about whether higher
task limits could be obtained without the AI Governance Add-On.
## Solution
Updated the note to be explicit about the scaling path:
> "Premium deployments include 1,000 Agent Workspace Builds for
proof-of-concept use. To scale beyond this limit, the AI Governance
Add-On provides expanded usage pools that grow with your user count.
Contact us to discuss pricing."
This makes it clear that:
1. Premium includes 1,000 builds for POC use
2. Scaling beyond that requires the AI Governance Add-On
3. Contact sales to discuss pricing for the add-on
Created on behalf of @mattvollmer
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: Matt Vollmer <matthewjvollmer@outlook.com>
Justification:
- Populating `members` is authorized with `group_member.read` which is
not required to be able to share a workspace
- Populating `total_member_count` is authorized with `group.read` which
is required to be able to share
- The updated helper is only used in template/workspace sharing UIs, so
other pages that might need counts of readable members are unaffected
Related to: https://github.com/coder/internal/issues/1302
## Description
Adds Prometheus metrics to the AI Bridge Proxy for observability into
proxy traffic and performance.
## Changes
* Add Metrics struct with the following metrics:
* `connect_sessions_total`: counts CONNECT sessions by type
(mitm/tunneled)
* `mitm_requests_total`: counts MITM requests by provider
* `inflight_mitm_requests`: gauge tracking in-flight requests by
provider
* `mitm_request_duration_seconds`: histogram of request latencies by
provider
* `mitm_responses_total`: counts responses by status code class
(2XX/3XX/4XX/5XX) and provider
* Register metrics with `coder_aibridgeproxyd_` prefix in CLI
* Unregister metrics on server close to prevent registry leaks
* Add `tunneledMiddleware` to track non-allowlisted CONNECT sessions
* Add tests for metric recording in both MITM and tunneled paths
Closes: https://github.com/coder/internal/issues/1185
Adds a new subcommand to print the current session token for use in
scripts and automation, similar to `gh auth token`.
## Usage
```bash
CODER_SESSION_TOKEN=$(coder login token)
```
Fixes #21515
## Description
Add exponential backoff retries to all `go install` and `go mod
download` commands across CI workflows and actions.
## Why
Fixes
[coder/internal#1276](https://github.com/coder/internal/issues/1276) -
CI fails when `sum.golang.org` returns 500 errors during Go module
verification. This is an infrastructure-level flake that can't be
controlled.
## Changes
- Created `.github/scripts/retry.sh` - reusable retry helper with
exponential backoff (2s, 4s, 8s delays, max 3 attempts), using
`scripts/lib.sh` helpers
- Wrapped all `go install` and `go mod download` commands with retry in:
- `.github/actions/setup-go/action.yaml`
- `.github/actions/setup-sqlc/action.yaml`
- `.github/actions/setup-go-tools/action.yaml`
- `.github/workflows/ci.yaml`
- `.github/workflows/release.yaml`
- `.github/workflows/security.yaml`
- Added GNU tools setup (bash 4+, GNU getopt, make 4+) for macOS in
`test-go-pg` job, since `retry.sh` uses `lib.sh` which requires these
tools
## Summary
Fixes the broken Kilo Code documentation link in the AI Bridge
client-config page.
## Changes
- Updated the Kilo Code link from the old
`/docs/features/api-configuration-profiles` (returns 404) to the current
`/docs/ai-providers/openai-compatible` page
The Kilo Code documentation was restructured and the old URL no longer
exists.
Fixes #21750
Fixes: https://github.com/coder/internal/issues/560
"Select" CLI UI component should ignore "space" when `+Add custom value`
is highlighted. Otherwise it interprets that as a potential option...
and panics.
Fixes: coder/internal#767
Adds two new Prometheus metrics for license health monitoring:
- `coderd_license_warnings` - count of active license warnings
- `coderd_license_errors` - count of active license errors
Metrics endpoint after startup of a deployment with license enabled:
```
...
# HELP coderd_license_errors The number of active license errors.
# TYPE coderd_license_errors gauge
coderd_license_errors 0
...
# HELP coderd_license_warnings The number of active license warnings.
# TYPE coderd_license_warnings gauge
coderd_license_warnings 0
...
```
fixes: https://github.com/coder/internal/issues/1304
Subscribe to heartbeats synchronously on startup of PGCoordinator. This ensures tests that send heartbeats don't race with this subscription.
Closes [#1246](https://github.com/coder/internal/issues/1246)
This PR adds a new component to display AI Governance user entitlements
in the Licenses Settings page. The implementation includes:
- New `AIGovernanceUsersConsumptionChart` component that shows the
number of entitled users for AI Governance features
- Storybook stories for various states (default, disabled, error states)
- Integration with the existing license settings page
- Collapsible "Learn more" section with links to relevant documentation
- Updated the ManagedAgentsConsumption component with clearer
terminology ("Agent Workspace Builds" instead of "Managed AI Agents")
The chart displays the number of users entitled to use AI features like
AI Bridge, Boundary, and Tasks, with a note that additional analytics
are coming soon.
### Preview
<img width="3516" height="2390" alt="CleanShot 2026-01-27 at 22 44
25@2x"
src="https://github.com/user-attachments/assets/cb97a215-f054-45cb-a3e7-3055c249ef04"
/>
<img width="3516" height="2390" alt="CleanShot 2026-01-27 at 22 45
04@2x"
src="https://github.com/user-attachments/assets/d2534189-cffb-4ad2-b2e2-67eb045572e8"
/>
---------
Co-authored-by: Jaayden Halko <jaayden.halko@gmail.com>
This pull request makes a minor update to the documentation check
workflow. It clarifies that a comment should not be posted if there are
no documentation changes needed and simplifies the comment format
instructions.
The reaper (PID 1) now returns the child's exit code instead of always
exiting 0. Signal termination uses the standard Unix convention of 128 +
signal number.
fixes #21661
My previous change to this test did not create another **workspace**
using the template containing `coder_ai_task` resources, meaning that
this test was not actually testing the right thing. This PR addresses
this oversight.
The test occasionally times out at 15s on Windows CI runners.
Investigation of CI logs shows the HTTP request to the agent's
gitsshkey endpoint never appears in server logs, suggesting it
hangs before the request completes (possibly in connection setup,
middleware, or database queries). Increase to 60s to reduce flake
rate.
Fixes coder/internal#770
## Description
Moves the provider lookup from `handleRequest` to `authMiddleware` so
that the provider is determined during the `CONNECT` handshake and
stored in the request context. This enables provider information to be
available earlier in the request lifecycle.
## Changes
* Move `aibridgeProviderFromHost` call from `handleRequest` to
`authMiddleware`
* Store `Provider` in `requestContext` during `CONNECT` handshake
* Add provider validation in `authMiddleware` (reject if no provider
mapping)
* Keep defensive provider check in `handleRequest` for safety
Follow-up from: https://github.com/coder/coder/pull/21617
Closes #20598
This pull request makes a small change to also render the `icon` of a
`Preset` when one has been explicitly defined in the template.
Furthermore, there's a `ⓘ` icon with a description.
### Preview
<img width="984" height="442" alt="CleanShot 2026-01-27 at 20 15 29@2x"
src="https://github.com/user-attachments/assets/d4ceebf9-a5fe-4df4-a8b2-a8355d6bb25e"
/>
2026-01-28 18:51:22 +11:00
1253 changed files with 101573 additions and 19826 deletions
With issues: "## 🔍 Code Review\\n\\nReviewed [5-8 words].\\n\\n**Found X issues** (Y critical, Z nitpicks).\\n\\n---\\n*AI review via [Coder Tasks](https://coder.com/docs/ai-coder/tasks)*"
No issues: "## 🔍 Code Review\\n\\nReviewed [5-8 words].\\n\\n✅ **Looks good** - no production issues found.\\n\\n---\\n*AI review via [Coder Tasks](https://coder.com/docs/ai-coder/tasks)*"
</github_api_documentation>
<critical_rules>
1. Read ENTIRE files before commenting - use read_file or grep to verify
2. Check the EXACT line you're commenting on - does the issue actually exist there?
3. Suggestion block = ONLY replacement lines (never include unchanged surrounding lines)
CONTEXT="This is a NEW PR. Perform a thorough documentation review."
CONTEXT="This is a NEW PR. Perform initial documentation review."
;;
pr_updated)
CONTEXT="This PR was UPDATED with new commits. Only comment if the changes affect documentation needs or address previous feedback."
CONTEXT="This PR was UPDATED with new commits. Check if previous feedback was addressed or if new doc needs arose."
;;
label_requested)
CONTEXT="A documentation review was REQUESTED via label. Perform a thorough documentation review."
CONTEXT="A documentation review was REQUESTED via label. Perform a thorough review."
;;
ready_for_review)
CONTEXT="This PR was marked READY FOR REVIEW (converted from draft). Perform a thorough documentation review."
CONTEXT="This PR was marked READY FOR REVIEW. Perform a thorough review."
;;
manual)
CONTEXT="This is a MANUAL review request. Perform a thorough documentation review."
CONTEXT="This is a MANUAL review request. Perform a thorough review."
;;
*)
CONTEXT="Perform a thorough documentation review."
CONTEXT="Perform a documentation review."
;;
esac
# Build task prompt with PR-specific context
# Build task prompt with sticky comment logic
TASK_PROMPT="Use the doc-check skill to review PR #${PR_NUMBER} in coder/coder.
${CONTEXT}
Use \`gh\` to get PR details, diff, and all comments. Check for previous doc-check comments (from coder-doc-check) and only post a new comment if it adds value.
Use \`gh\` to get PR details, diff, and all comments. Look for an existing doc-check comment containing \`<!-- doc-check-sticky -->\` - if one exists, you'll update it instead of creating a new one.
**Do not comment if no documentation changes are needed.**
If a sticky comment already exists, compare your current findings against it:
- Check off \`[x]\` items that are now addressed
- Strikethrough items no longer needed (e.g., code was reverted)
- Add new unchecked \`[ ]\` items for newly discovered needs
- If an item is checked but you can't verify the docs were added, add a warning note below it
- If nothing meaningful changed, don't update the comment at all
6. Start frontend: `cd site && CODER_HOST=http://127.0.0.1:3000 pnpm dev --host`
7. Access UI at http://localhost:8080 (admin@coder.com / SomeSecurePassword!)
### Building
The `make build` target tries to regenerate code (`make gen`) which requires tools not installed in the Cloud VM (sqlc, protoc, mockgen). Since generated files are already committed, build the binary directly:
```sh
./scripts/build_go.sh --os linux --arch amd64 --output build/coder_linux_amd64
```
For a slim binary (no embedded frontend): add `--slim` flag.
If you need to rebuild the frontend: `cd site && pnpm build`
### Running Go tests
`gotestsum` is installed at `$(go env GOPATH)/bin/gotestsum`. Ensure `$(go env GOPATH)/bin` is on your PATH.
- Single package: `gotestsum --format short-verbose -- -count=1 -timeout=120s ./coderd/some/package/...`
- Full suite (uses `make test`): requires `gotestsum` on PATH.
### Running frontend tests/lint
- Lint: `cd site && pnpm lint:check` (Biome) and `pnpm lint:types` (TypeScript)
"file is %d bytes which exceeds the maximum of %d bytes. Use grep, sed, or awk to extract the content you need, or use offset and limit to read a portion.",
			fileSize, limits.MaxFileSize,
		))
	}
	// Read the entire file (up to MaxFileSize).
	data, err := io.ReadAll(f)
	if err != nil {
		return errResp(fmt.Sprintf("read file: %s", err))
	}
	// Split into lines.
	content := string(data)
	// Handle empty file.
	if content == "" {
		return ReadFileLinesResponse{
			Success:    true,
			FileSize:   fileSize,
			TotalLines: 0,
			LinesRead:  0,
			Content:    "",
		}
	}
	lines := strings.Split(content, "\n")
	totalLines := len(lines)
	// offset is 1-based line number.
	if offset < 1 {
		offset = 1
	}
	if offset > int64(totalLines) {
		return errResp(fmt.Sprintf(
			"offset %d is beyond the file length of %d lines",
			offset, totalLines,
		))
	}
	// Default limit.
	if limit <= 0 {
		limit = int64(limits.MaxResponseLines)
	}
	startIdx := int(offset - 1) // convert to 0-based
	endIdx := startIdx + int(limit)
	if endIdx > totalLines {
		endIdx = totalLines
	}
	var numbered []string
	totalBytesAccumulated := 0
	for i := startIdx; i < endIdx; i++ {
		line := lines[i]
		// Per-line truncation.
		if len(line) > limits.MaxLineBytes {
			line = line[:limits.MaxLineBytes] + "... [truncated]"
		}
		// Format with 1-based line number.
		numberedLine := fmt.Sprintf("%d\t%s", i+1, line)
		lineBytes := len(numberedLine)
		// Check total byte budget.
		newTotal := totalBytesAccumulated + lineBytes
		if len(numbered) > 0 {
			newTotal++ // account for \n joiner
		}
		if newTotal > limits.MaxResponseBytes {
			return errResp(fmt.Sprintf(
				"output would exceed %d bytes. Read less at a time using offset and limit parameters.",
				limits.MaxResponseBytes,
			))
		}
		// Check line count.
		if len(numbered) >= limits.MaxResponseLines {
			return errResp(fmt.Sprintf(
				"output would exceed %d lines. Read less at a time using offset and limit parameters.",
// Context should NOT be canceled since we got an error (not a definitive "not alive")
	require.NoError(t, childCtx.Err(), "context was canceled even though pidExists returned an error")
})
}
func (c *fakeCloser) Close() error {
	*c.closes = append(*c.closes, c)
	return c.err