Replace the createPortal + useRef slot pattern with direct layout
composition. The parent (AgentsPage) now renders all shell chrome
(top bar title, top bar actions, right panel content) directly,
instead of providing empty <div ref={...}> targets that children
portal into.
Changes:
- Flatten /agents route: remove nested Outlet, render AgentDetail
directly inside AgentsPage as a child component with explicit props
- Convert AgentDetail from default export with useOutletContext to
named export with AgentDetailProps interface
- Lift top bar content via onTopBarChange callback + useState in parent
- Move DiffStatsBadge and dropdown menu from TopBarPortals into
AgentDetail's onTopBarChange useEffect
- Change DiffRightPanel to accept children instead of ref
- Move admin button and ConfigureAgentsDialog from AgentsEmptyState
into AgentsPage
- Delete TopBarPortals.tsx entirely
- Simplify Storybook stories (remove portal/outlet wrappers)
Replaces the hand-rolled LCS diffing in `buildEditDiff` and the
manual patch-string assembly in `buildWriteFileDiff` with
[`Diff.createPatch()`](https://www.npmjs.com/package/diff) from the
`diff` npm package.
Both functions now just call `Diff.createPatch()` and feed the result
straight into `parsePatchFiles()`, removing all the manual line
splitting, prefix tagging, hunk-header arithmetic, and trailing-newline
cleanup.
### Changes
- Add `diff` as a dependency
- `buildWriteFileDiff`: replaced ~20 lines of manual patch assembly
with a single `Diff.createPatch()` call
- `buildEditDiff`: replaced ~60 lines (line splitting, `Diff.diffLines`
→ prefixed strings, hunk counting) with a `Diff.createPatch()` call
per edit
- Removed the `chunkLines` helper and the `diffLines` wrapper +
its test block
Net: +21 / -157 lines across source and tests.
The diff view on the `/agents` page had no way to handle lines wider
than the panel. The `@pierre/diffs` library supports an `overflow`
option — switching it from `"scroll"` (the shared default) to `"wrap"`
for the side panel makes long lines wrap naturally instead of being
clipped.
Also adds a long import line to the Storybook sample diff so the
wrapping behavior is easy to verify visually.
## Summary
Adds a typed-confirmation step before deleting a deployment license to
reduce accidental removals.
<img width="457" height="440" alt="Screenshot 2026-02-13 at 15 31 58"
src="https://github.com/user-attachments/assets/b13320a7-4b10-43fa-ab01-56f3284435b6"
/>
## Changes
- Swapped the license removal dialog from `ConfirmDialog` to
`DeleteDialog`, requiring the admin to type the license ID before
enabling **Remove**.
- Added interaction coverage to verify the confirmation guard.
TemplateVersionEditorPage tests have been flaking since I ported them to
vitest in 99a4ecd. Turns out our test timeout on jest is 20s (presumably
for these sorts of page-level journey tests). I kinda like the current
5s timeout as it forces us to write speedy tests, but I think in this
case it's unavoidable and makes sense to lengthen the timeout just for
these tests.
Hopefully fixescoder/internal#1369
You may want the whitespaceless diff here:
https://github.com/coder/coder/pull/22412/changes?w=1
## Summary
Adds a new `diff_status_change` event kind to the `/chats/watch` pubsub
stream so the sidebar can update diff status (PR created, files changed,
branch info) without a full page reload.
### Problem
When a chat's diff status changes (e.g. PR created via GitHub, git
branch pushed), the sidebar didn't update because:
1. The backend `publishChatPubsubEvent` didn't include diff status data
2. The frontend watch handler only merged `status`, `title`, and
`updated_at` from events
### Solution
A **notify-only** approach: a new `ChatEventKindDiffStatusChange` event
kind tells the frontend "diff status changed for chat X" — the frontend
then invalidates the relevant React Query cache entries to re-fetch.
### Backend changes
- **`coderd/pubsub/chatevent.go`**: New `ChatEventKindDiffStatusChange =
"diff_status_change"` constant
- **`coderd/chatd/chatd.go`**: New `PublishDiffStatusChange(ctx,
chatID)` method on `Server`
- **`coderd/chats.go`**: New `publishChatDiffStatusEvent` helper.
Published from:
- `refreshWorkspaceChatDiffStatuses` — after each chat's diff status is
refreshed via GitHub API
- `storeChatGitRef` — after persisting git branch/origin info from
workspace agent
### Frontend changes
- **`AgentsPage.tsx`**: Handle `diff_status_change` event by
invalidating `chatDiffStatusKey` and `chatDiffContentsKey` queries
- **`ChatContext.ts`**: Remove redundant diff status invalidation that
fired on every chat status change (the new event kind handles this
properly)
## Problem
When sending a message in the agent detail chat, the text lingered in
the input textarea while the HTTP POST round-tripped to the server. Only
after the server responded did the input clear and the message appear in
the timeline (via WebSocket). This created a noticeable delay where the
user couldn't start typing their next message.
## Solution
**Optimistic input clear** (`AgentChatInput.tsx`):
- Clear the textarea and editing state *immediately* on submit, before
awaiting the network call.
- Capture the input text beforehand so it can be restored in the `catch`
block if the request fails.
**Optimistic user bubble** (`AgentDetail.tsx`):
- Inject a temporary `ChatMessage` (with a negative ID) into the chat
store so the user's message bubble appears in the timeline instantly.
- Set chat status to `pending` and clear stream state, mirroring the
existing edit-message path.
- On error, roll back: remove the optimistic message and restore the
previous chat status.
The real message arrives via the WebSocket stream and
`upsertDurableMessage` replaces the optimistic entry naturally (the
server message has a positive ID, so it's inserted alongside; the
optimistic negative-ID message gets cleaned up when `replaceMessages` is
called with the authoritative message list from the next query
invalidation).
## Testing
- Type a message and press Enter — input clears and bubble appears
immediately.
- Simulate a network error — input text is restored, optimistic bubble
is removed.
- Edit an existing message — unchanged behavior (already had optimistic
updates).
- Queue a message while streaming — unchanged behavior.
Adds two keyboard shortcuts to the agents page:
- **Escape** — Interrupts the running agent when viewing a chat detail
page. Only fires when focus is outside text inputs/textareas so it
doesn't conflict with the existing edit-cancel Escape handler in the
chat input.
- **Ctrl+N / Cmd+N** — Navigates to create a new agent. Also skipped
when focus is in a text input/textarea.
Both keybindings are implemented in a new `useAgentsPageKeybindings.ts`
hook file:
- `useAgentsPageKeybindings` — used in `AgentsPage.tsx` for Ctrl+N
- `useAgentDetailKeybindings` — used in `AgentDetail.tsx` for Escape →
interrupt
## Summary
The UI has always labeled the action as "Archive agent" but the backend
was performing a hard `DELETE`, permanently destroying chats and all
their messages.
This change replaces the hard delete with a soft archive, consistent
with the pattern used by template versions.
## Changes
### Database
- **Migration 000423**: Add `archived boolean DEFAULT false NOT NULL`
column to `chats` table
- Replace `DeleteChatByID` query with `ArchiveChatByID` (`UPDATE SET
archived = true`)
- Add `UnarchiveChatByID` query (`UPDATE SET archived = false`)
- Filter archived chats from `GetChatsByOwnerID` (`WHERE archived =
false`)
### API
- Remove `DELETE /api/experimental/chats/{chat}`
- Add `POST /api/experimental/chats/{chat}/archive` — archives a chat
and all its descendants
- Add `POST /api/experimental/chats/{chat}/unarchive` — unarchives a
single chat (API only, no UI yet)
### Backend
- `archiveChatTree()` recursively archives child chats (replaces
`deleteChatTree()` which hard-deleted)
- Chat daemon's `ArchiveChat()` archives the full chat tree in a
transaction
- Authorization uses `ActionUpdate` instead of `ActionDelete`
### SDK
- Replace `DeleteChat()` with `ArchiveChat()` and `UnarchiveChat()`
- Add `Archived` field to `Chat` struct
### Frontend
- `archiveChat` API call uses `POST .../archive` instead of `DELETE`
- No UI changes — the "Archive agent" button now actually archives
instead of deleting
## Design Decision
This follows the **template version archive pattern** (Pattern B in the
codebase):
- `archived boolean` column (not `deleted boolean`)
- Dedicated `POST .../archive` and `POST .../unarchive` routes (not
repurposing `DELETE`)
- Reversible — users can unarchive via the API (UI for this will come
later)
## Problem
`resolveChatGitHubAccessToken` reads the `OAuthAccessToken` directly
from the database without refreshing it. When the token expires, GitHub
returns "bad credentials" and the chat diff features break.
## Fix
Call `config.RefreshToken()` before returning the token — the same code
path used by `provisionerdserver` when handing tokens to provisioners.
- Builds a map of provider ID → `*externalauth.Config` during the
existing config iteration
- After fetching the `ExternalAuthLink` from the DB, calls
`cfg.RefreshToken()` if a matching config exists
- On refresh failure, falls through to the existing token (GitHub tokens
without expiry still work) with a debug log
## Problem
Context compaction in chatd persisted durable messages for the
`chat_summarized` tool call and result via `publishMessage`, but never
published `message_part` streaming events via `publishMessagePart`. This
meant connected clients had no streaming representation of the
compaction.
The client's `streamState` (built entirely from `message_part` events in
`streamState.ts`) never saw the compaction tool call, so:
- No **"Summarizing..."** running state was shown to the user during
summary generation (which can take up to 90s).
- The durable `message` events arrived after or interleaved with the
`status: waiting` event, causing the tool to appear as "Summarized" with
the chat appearing to just stop.
## Fix
### 1. `CompactionOptions.OnStart` callback (chatloop)
Added an `OnStart` callback to `CompactionOptions`, called in
`maybeCompact` right before `generateCompactionSummary` (the slow LLM
call). This gives `chatd` a hook to publish the tool-call `message_part`
immediately when compaction begins.
### 2. Tool-result streaming part (chatd)
`persistChatContextSummary` now publishes a tool-result `message_part`
before the durable `message` events, so clients transition from
"Summarizing..." to "Summarized" before the status change arrives.
### Event ordering is now:
1. `message_part` (tool call via `OnStart`) — client shows
"Summarizing..."
2. LLM generates summary (up to 90s)
3. `message_part` (tool result) — client shows "Summarized" in stream
state
4. `message` (assistant) — durable message persisted, stream state
resets
5. `message` (tool) — durable tool result persisted
6. `status: waiting` — chat transitions to idle
## Tests
- **`OnStartFiresBeforePersist`**: Verifies callback ordering is
`on_start` → `generate` → `persist`.
- **`OnStartNotCalledBelowThreshold`**: Verifies `OnStart` is not called
when context usage is below the compaction threshold.
## Problem
The `update workspace, new required, mutable parameter added` e2e test
has been flaking consistently
([internal#1328](https://github.com/coder/internal/issues/1328)). The
error:
```
Error: Timed out 5000ms waiting for expect(locator).toHaveValue(expected)
Locator: getByTestId('parameter-field-Sixth parameter').locator('input')
Expected string: "99"
Received string: ""
```
## Root Cause
A race between page navigation and data hydration in `verifyParameters`:
1. The page navigates with `waitUntil: "domcontentloaded"` which does
not wait for API responses to settle
2. React Query may serve stale cached workspace data initially (from
before the update), causing the form to render with empty/old parameter
values
3. The `toHaveValue` assertion uses the default `actionTimeout` of
5000ms which isn't enough time for fresh data to arrive and the form to
re-render
## Fix
- Switch `verifyParameters` navigation to `waitUntil: "networkidle"` to
ensure API responses (workspace data, build parameters) are settled
before the form renders
- Increase the `toHaveValue` timeout to 15s to handle cases where
dynamic parameters hydrate slowly after initial render
Fixescoder/internal#1328
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
## Problem
When switching between chats on the agents page, stream parts could be
lost or applied to the wrong chat due to several race conditions in
`ChatContext.ts`:
1. **`startTransition` deferred parts escape cleanup** —
`startTransition(() => store.applyMessageParts(parts))` defers the state
update. If a chat switch happens between `flushMessageParts` being
called and the transition executing, old-chat parts could apply after
`resetTransientState()` has already cleared stream state for the new
chat.
2. **`message` event has no `chat_id` filter** — Unlike `message_part`,
`queue_update`, and `status` events, the `message` event handler did not
check `streamEvent.chat_id`. While the server-scoped WebSocket makes
this safe in practice, it's an inconsistency in defensive programming.
3. **Brief stale message window on switch** — Between `chatID` changing
and `replaceMessages()` firing (after the query resolves), the store
held old-chat messages while the new WebSocket was already connected.
## Changes
### `ChatContext.ts`
- Added `activeChatIDRef` to track the currently active chat ID
- Guard `startTransition` callback: check `activeChatIDRef` before
applying message parts, discarding them if the chat has switched
- Added `chat_id` filter to `message` event handler, matching the
pattern used by all other event types
- Added `store.replaceMessages([])` to the chatID-change effect so
messages are cleared immediately on switch
### `ChatContext.test.tsx`
Four new tests covering the chat-switch lifecycle:
- WebSocket closure and state reset when chatID changes
- `message` event filtering by `chat_id`
- `startTransition` deferred parts discarded after switch
- Messages cleared immediately before new query resolves
All 13 tests pass (8 existing + 4 new + 1 existing).
## Problem
Non-admin users of the Agents (chat) feature send `model_config_id:
"00000000-0000-0000-0000-000000000000"` (nil UUID) when creating chats,
because the `GET /api/experimental/chats/model-configs` endpoint
requires `policy.ActionRead` on `rbac.ResourceDeploymentConfig`, which
is only granted to admins.
The flow:
1. `AgentsPage.tsx` calls `useQuery(chatModelConfigs())` → hits
`listChatModelConfigs`
2. Non-admin users get a **403 Forbidden** response
3. `chatModelConfigsQuery.data` is `undefined`, so the
`modelConfigIDByModelID` map is empty
4. `handleCreateChat` falls back to `nilUUID` for `model_config_id`
5. The backend rejects the nil UUID: `"Invalid model config ID."`
## Fix
Changed `listChatModelConfigs` to allow all authenticated users to read
model configs:
- **Admin users** continue to see all configs (including disabled ones)
for management via `GetChatModelConfigs`
- **Non-admin users** now see only enabled configs via
`GetEnabledChatModelConfigs` with a system context, which is sufficient
for using the chat feature
This follows the same pattern as `listChatModels`, which already uses
`dbauthz.AsSystemRestricted(ctx)` to allow all authenticated users to
see available models.
Write endpoints (create/update/delete) retain their existing
`ResourceDeploymentConfig` authorization.
## Testing
- Updated `TestListChatModelConfigs/ForbiddenForOrganizationMember` →
`SuccessForOrganizationMember` to verify non-admin users can list
enabled model configs
- All existing chat tests continue to pass
## Problem
When coderd instances are redeployed (e.g. rolling deployment on
dogfood), in-flight chats get stuck in `running` status permanently. The
UI shows them as "thinking" with a spinning indicator, but no worker is
actually processing them. They never error or resume.
## Root Cause
Two bugs combine to cause this:
### Bug 1: Shutdown cleanup uses a canceled context
The `processChat` defer block updates the chat status in the DB when
processing completes. But it uses `ctx`, which `Close()` cancels
*before* the defer runs. The DB transaction silently fails with
`context.Canceled`, leaving the chat in `status=running` with a dead
`worker_id`.
```go
// Close() calls p.cancel() which cancels ctx
// Then the defer tries to use the now-canceled ctx:
defer func() {
err := p.db.InTx(func(tx database.Store) error {
tx.GetChatByIDForUpdate(ctx, chat.ID) // FAILS
tx.UpdateChatStatus(ctx, ...) // FAILS
}, nil)
}()
```
### Bug 2: Stale recovery runs only once at startup
`recoverStaleChats()` was called only once in `start()`, not
periodically. During a rolling deployment, the new instance starts while
the old one is still alive (fresh heartbeat). By the time the old
instance crashes, no one checks again.
## Fix
1. **Use `context.WithoutCancel(ctx)` in the processChat defer** — the
cleanup transaction now completes even during graceful shutdown.
2. **Run `recoverStaleChats` periodically** — a second ticker in the
`start()` loop checks for stale chats at `inFlightChatStaleAfter / 5`
intervals (default: every 1 minute). This catches orphaned chats even
when the instance that owns them crashes without clean shutdown.
## Tests
- `TestRecoverStaleChatsPeriodically` — Verifies chats orphaned *after*
startup are recovered by the periodic loop (not just the startup check).
- `TestNewReplicaRecoversStaleChatFromDeadReplica` — Verifies a new
replica recovers stale chats on startup.
- `TestWaitingChatsAreNotRecoveredAsStale` — Negative test: `waiting`
chats are not incorrectly modified by recovery.
## Problem
The git diff on the `/agents` page had color issues: the editor
background followed light mode but the syntax highlighting used dark
mode (`github-dark-high-contrast`), and the filename header used
light-colored text on a light background.
The root cause was hardcoded dark theme options in the `FileDiff`
component:
```tsx
themeType: "dark",
theme: "github-dark-high-contrast",
```
## Fix
Uses the same theme-aware pattern as every other diff/file viewer in the
codebase (`WriteFileTool`, `EditFilesTool`, `ReadFileTool`, `Tool`,
`response.tsx`):
1. `useTheme()` from `@emotion/react` to read `palette.mode`
2. `getDiffViewerOptions(isDark)` from the shared `utils.ts` module —
returns `github-light` theme for light mode, `github-dark-high-contrast`
for dark mode
3. Reuses `DIFFS_FONT_STYLE` and `diffViewerCSS` constants instead of
inlining duplicates
## Storybook coverage
Added four new stories with real unified diff content:
- **WithDiffDark** — dark mode with a PR link
- **WithDiffLight** — light mode with a PR link
- **NoPullRequestDark** — dark mode, "Files Changed" header
- **NoPullRequestLight** — light mode, "Files Changed" header
The existing stories only covered empty and parse-error states with no
rendered diff.
## Summary
Adds a new line-based file reading endpoint to the workspace agent,
replacing the unbounded byte-based approach for the `read_file` chat
tool and `coder_workspace_read_file` MCP tool.
**Problem**: The current `read_file` tool returns the entire file
contents with no limits, which can blow up LLM context windows and cause
OOM issues with large files.
**Solution**: Inspired by [`coder/mux`](https://github.com/coder/mux)
and [`openai/codex`](https://github.com/openai/codex), implement a
line-based reader with safety limits.
## Changes
### Agent (`agent/agentfiles/`)
- New `/read-file-lines` endpoint with `HandleReadFileLines` handler
- Line-based `offset` (1-based line number, default: 1) and `limit`
(line count, default: 2000)
- Safety constants:
| Constant | Value | Purpose |
|---|---|---|
| `MaxFileSize` | 1 MB | Reject files larger than this at stat |
| `MaxLineBytes` | 1,024 | Per-line truncation with `... [truncated]`
marker |
| `MaxResponseLines` | 2,000 | Max lines per response |
| `MaxResponseBytes` | 32 KB | Max total response size |
| `DefaultLineLimit` | 2,000 | Default when no limit specified |
- Line numbering format: `1\tcontent` (tab-separated)
- Structured JSON response: `{ success, file_size, total_lines,
lines_read, content, error }`
- Hard errors when limits exceeded — tells the LLM to use
`offset`/`limit`
- Existing byte-based `/read-file` endpoint preserved (used by
`instruction.go`)
### SDK (`codersdk/workspacesdk/`)
- `ReadFileLinesResponse` type added
- `ReadFileLines` method added to `AgentConn` interface
- Mock regenerated
### Chat tool (`coderd/chatd/chattool/`)
- `read_file` tool now uses `conn.ReadFileLines()` instead of
`conn.ReadFile()`
- Updated tool description to document line-based parameters
- Response includes `file_size`, `total_lines`, `lines_read` metadata
### MCP tool (`codersdk/toolsdk/`)
- `coder_workspace_read_file` updated to use line-based reading
- Schema descriptions updated for line-based offset/limit
- Removed `maxFileLimit` constant (agent handles limits now)
### Tests
- 13 new test cases for `TestReadFileLines`:
- Path validation (empty, relative, non-existent, directory, no
permissions)
- Empty file handling
- Basic read, offset, limit, offset+limit combinations
- Offset beyond file length
- Long line truncation (>1024 bytes)
- Large file rejection (>1MB)
- All existing tests pass unchanged
## Design decisions
| Decision | Rationale |
|---|---|
| Line-based, not byte-based | Both coder/mux and openai/codex use
line-based — matches how LLMs reason about code |
| Default limit of 2000 | Matches codex; prevents accidental full-file
dumps while being generous |
| 32 KB response cap | Compromise between mux (16 KB) and codex (no cap)
|
| 1024 byte/line truncation with marker | More generous than codex
(500), marker helps LLM know data is missing |
| Hard errors on overflow | Matches mux; forces LLM to paginate rather
than getting partial data |
| Preserve byte-based endpoint | `instruction.go` needs raw byte access
for AGENTS.md |
## Problem
Chat titles revert to the fallback truncated title after briefly showing
the AI-generated title. Even reloading the page doesn't help — the
correct title flashes then gets overwritten.
## Root Cause
Single bug, two symptoms.
In `processChat` (`coderd/chatd/chatd.go`), the `chat` variable is
passed by value. The flow:
1. `processChat(ctx, chat)` receives `chat` with the initial fallback
title (truncated first message).
2. Inside `runChat`, `maybeGenerateChatTitle` generates an AI title,
writes it to the DB via `UpdateChatByID`, and publishes a `title_change`
event. **The DB has the correct title.** The client briefly displays it.
3. `runChat` returns. The **deferred cleanup** in `processChat`
publishes `publishChatPubsubEvent(chat, StatusChange)` — but `chat` here
is the original value copy that still has the **old fallback title**.
4. The frontend receives the `status_change` SSE event and
**unconditionally applies `title` from every event kind** (see
`AgentsPage.tsx` line ~305: `title: updatedChat.title`). This overwrites
the correct AI title with the stale fallback.
**Why reload doesn't help:** If the chat is still processing when the
page reloads, `listChats` loads the correct title from the DB, but then
the deferred `status_change` event arrives moments later and clobbers
it. The title was always in the DB — it was the pubsub event that kept
overwriting it.
## Fix
Re-read the chat from the database in the deferred cleanup before
publishing the final `status_change` event, so it carries the current
(AI-generated) title.
When navigating to a specific agent on the Agents page, the browser tab
title now reflects the agent's chat title (e.g. `Fix login bug - Agents
- Coder`). When the title hasn't loaded yet or when navigating away, it
falls back to `Agents - Coder`.
**Changes:**
- Added a `useEffect` in `AgentDetail` that sets `document.title` via
the existing `pageTitle` utility whenever the chat title changes.
- The cleanup function resets the title back to `Agents - Coder` when
unmounting (navigating away from the agent).
When injecting system instructions into the chat prompt, include:
1. **Operating system** and **working directory** from the
`workspace_agents` table
2. **Home-level instructions** from `~/.coder/AGENTS.md` (existing
behavior)
3. **Project-level instructions** from `<pwd>/AGENTS.md` (new)
The XML tag is renamed from `<coder-home-instructions>` to
`<system-instructions>` since it now carries more than just the home
instruction file.
### Example output (both files present)
```xml
<system-instructions>
Operating System: linux
Working Directory: /home/coder/coder
Source: /home/coder/.coder/AGENTS.md
... home instructions ...
Source: /home/coder/coder/AGENTS.md
... project instructions ...
</system-instructions>
```
### Example output (no AGENTS.md files)
```xml
<system-instructions>
Operating System: linux
Working Directory: /home/coder/coder
</system-instructions>
```
### Changes
- **`coderd/chatd/instruction.go`**:
- Renamed types: `homeInstructionContext` → `agentContext`, added
`instructionFile` struct
- Extracted `readInstructionFileAtPath` shared helper
- Added `readWorkingDirectoryInstructionFile` to read `<pwd>/AGENTS.md`
- Replaced `formatHomeInstruction` with `formatInstructions` that
renders both files under `<system-instructions>`
- **`coderd/chatd/chatd.go`**:
- Renamed `resolveHomeInstruction` → `resolveInstructions`; now reads
both home and pwd instruction files
- `resolveAgentContext` returns `agentContext` (renamed from
`homeInstructionContext`)
- pwd file read is skipped gracefully if directory is empty or file
doesn't exist
- **`coderd/chatd/instruction_test.go`**:
- Added `TestReadWorkingDirectoryInstructionFile` (success, not-found,
empty-directory)
- Replaced `TestFormatHomeInstruction` with `TestFormatInstructions`
covering all combinations
- Added ordering test (`AgentContextBeforeFiles`) to verify OS/pwd
appear before file sources
## Summary
The `chattool` `list_templates` tool previously returned all templates
in a single response with no popularity signal. On deployments with many
templates (e.g. 71 on dogfood), this wastes tokens and makes it hard for
the AI to pick the right template for broad user questions.
## Changes
Single file: `coderd/chatd/chattool/listtemplates.go`
- **`page` parameter** — optional, 1-indexed, 10 results per page
- **Popularity sort** — queries
`GetWorkspaceUniqueOwnerCountByTemplateIDs` to get active developer
counts, then sorts descending (most popular first). The DB query returns
templates alphabetically, so this explicit sort is needed.
- **`active_developers`** — included on each template item so the agent
can see the signal
- **Pagination metadata** — `page`, `total_pages`, `total_count` in the
response so the agent knows there are more results
- **Updated tool description** — tells the agent that results are
ordered by popularity and paginated
## Frontend
No frontend changes needed. The renderer already reads `rec.templates`
and `rec.count` from the response — the new fields (`page`,
`total_pages`, `total_count`) are additive and safely ignored.
When switching between chats on the agents page, the scroll position was
preserved from the previous chat instead of resetting to show the most
recent messages.
## Problem
Clicking a different chat in the sidebar loaded the new chat's messages
but kept the scroll container at whatever position the user had scrolled
to in the previous chat. This meant users often landed in the middle of
a conversation instead of at the bottom where the latest messages are.
## Fix
Added a `useEffect` in `AgentDetail` that resets `scrollTop` to `0`
whenever `agentId` changes. The scroll container uses
`flex-col-reverse`, so `scrollTop = 0` corresponds to the bottom (most
recent messages).
Fixes https://github.com/coder/coder/issues/22375
Updates `stringutil.Truncate` to properly handle multi-byte UTF-8
characters.
Adds tests for multi-byte truncation with word boundary.
Created by Mux using Opus 4.6
Resolves cases where the user is entitled to AI Governance but we don't
show them the page because its not enabled. If for some reason the user
doesn't have AI Bridge enabled anymore but still wants to access the old
logs page they now can.
Furthermore, we link to the docs regardless of if they have AI Bridge
enabled, this is inline with our other settings pages.
Replaces the approach in #22061 (with a cleaner `git history`)
This now ensures that we don't attempt to cause a layout shift when the
sidebars pop-in-out of existence (when scroll locking within `radix`).
This element was receiving the provisioner key daemons and then
immediately filtering them. This lead to the default state being a table
with nothing rendered rather than the `<TableEmpty />` as we would
expect.
<img width="1133" height="608" alt="image"
src="https://github.com/user-attachments/assets/229edb00-b108-4ec3-ac2f-33633c3e5760"
/>
This previously let auditors view the page though they can't update
anything. In a different fashion to #22382 the user will be able to see
all of this as they're logged in to the application anyway, we can
simply tell them `Sorry, no access`.
setup-go has been sporadically failing to download Go, and we were advised
by a member of the Go team that downloading Go from `storage.googleapis.com`
is not guaranteed (which is what setup-go <= v5.6.0 does).
Also remove the use-preinstalled-go optimization for Windows runners.
setup-go v6 sets GOTOOLCHAIN=local, which prevents the pre-installed
Go from auto-downloading the toolchain specified in go.mod. The windows
optimization with v5 relied on GOTOOLCHAIN=auto. setup-go uses the runner
cache, which is a different caching path but should serve the same purpose.
This change adds user-facing feedback when opening apps in a new window
fails due to popup blocking, replacing a silent no-op with a clear
recovery message. It improves reliability and supportability across
app-launch flows by helping users immediately understand and fix the
issue.
This was a poor UX decision to have to reload the entire page when a
template got invalidated. Simply now we refetch the data so that things
come across way smooother.
## Description
- Updates `wsbuilder` to return a `BuildError` with
`http.StatusBadRequest` to signify a "validation error" on missing or
invalid parameters
- Adds a short-circuit in `prebuilds.StoreReconciler` to mark presets
for which creating a build returns a "validation error" as "validation
failed" and skip further attempts to reconcile.
- Adds a test to verify the above
- Introduces a new Prometheus metric
`coderd_prebuilt_workspaces_preset_validation_failed` to track the above
Closes: https://github.com/coder/coder/issues/21237
---------
Co-authored-by: Cian Johnston <cian@coder.com>
State updates from setIsPublishingDialogOpen,
setLastSuccessfulPublishedVersion, and navigation were firing after
waitFor resolved, causing sporadic act() warnings and timeouts in the
publish template version tests (or so says Claude Sonnet 4.6).
Fixescoder/internal#1369
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Related to #22367
It was pointed out to me that we actually did regress this mildly by
removing a dividing line in the changes made in #22367, I've restored
this in a better way by taking advantage of `divide-y` and wrapping this
in a proper `<div />`.
<img width="332" height="385" alt="image"
src="https://github.com/user-attachments/assets/2827a9ae-7b54-4c48-aae9-2f6e965e7f8b"
/>
Switch to asserting only on the onChange spy, which is the actual
component contract being tested. Monaco's textarea value is always empty
regardless of model content, so the toHaveValue assertions were
unreliable anyway.
Fixes the new storybook test introduced in #22202
This was a bad smell that was being addressed by the frontend. This type
was generating out to be a `nil`/`null` instead of an empty `License[]`.
Now this returns as an empty array and we can actively check if we have
no licenses with a length of `0`.
This pull-request takes our icons shown in the sidebar tree and shows
them alongside the names of the files in the `Source Code` page of our
templates.
Also does a quick de-mui of this page.
<img width="637" height="345" alt="image"
src="https://github.com/user-attachments/assets/f3013eb6-9572-4d05-a683-10bb99b4e802"
/>
Adds a brief "Structured Logging" section to the [AI Bridge
Setup](https://coder.com/docs/ai-coder/ai-bridge/setup) page documenting
the `--aibridge-structured-logging` /
`CODER_AIBRIDGE_STRUCTURED_LOGGING` flag.
Covers:
- How to enable structured logging (CLI flag, env var, YAML)
- The five `record_type` values emitted (`interception_start`,
`interception_end`, `token_usage`, `prompt_usage`, `tool_usage`) and
their key fields
- How to filter for these records in a logging pipeline
Created on behalf of @dannykopping
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Fixes three bugs that caused `coder update` to always re-prompt for
multi-select (`list(string)`) parameters instead of reusing previous
build values:
1. **`isValidTemplateParameterOption` failed for multi-select values**
(`cli/parameterresolver.go`): It compared the entire JSON array string
(e.g. `["vim","emacs"]`) against individual option values, which never
matched. Now parses the JSON array and validates each element
separately.
2. **`RichParameter` ignored previous build value for multi-select**
(`cli/cliui/parameter.go`): The `list(string)` branch always used the
template's default value instead of the `defaultValue` argument (which
carries the previous build's value). Now uses `defaultValue` when
available, falling back to the template default.
3. **Pre-existing crash when `list(string)` has no default value**
(`cli/cliui/parameter.go`): `json.Unmarshal` on an empty string caused
`unexpected end of JSON input`. Now skips unmarshaling when the default
source is empty.
Fixes#19956
The sonner migration (https://github.com/coder/coder/pull/22258) shows
validation errors in both the inline form field and a toast. Scoping the
assertion to the form element avoids flaky matches against the toast.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The Monaco editor wrapper was only calling `onChange` if the template
file has content, but we want to allow saving an empty file.
Fixes#19721
Claude was used to port tests from jest to vitest, and for the stories.
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Kayla はな <mckayla@hey.com>
Claude 3.5 Haiku (`claude-3-5-haiku-20241022`) was retired by Anthropic
on February 19th, 2026. Requests to this model now return errors.
Switch to Claude Haiku 4.5 (`claude-haiku-4-5`), which is the
[recommended
replacement](https://docs.anthropic.com/en/docs/resources/model-deprecations).
---
One-line change in `coderd/taskname/taskname.go` L25:
```diff
- defaultModel = anthropic.ModelClaude3_5HaikuLatest
+ defaultModel = anthropic.ModelClaudeHaiku4_5
```
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
The listen loop in workspaceAgentsExternalAuthListen compared
OAuthExpiry using == which compares `time.Time` internal struct fields
including the `*time.Location` pointer.
`time.LoadLocation` does not cache the returned `*Location` pointer, so
each lib/pq connection gets a distinct pointer for the same timezone.
When `pq.ParseTimestamp()` applies the connection's location to a parsed
timestamp, the resulting time.Time embeds that connection-specific
pointer. If the `sql.DB` pool hands out different connections for the
two GetExternalAuthLink reads, the identical timestamp produces
`time.Time` values where == returns false despite representing the same
instant. This is intermittent because the pool _usually_ reuses the same
connection for sequential queries.
This change uses `.Equal()` to compare instants regardless of location.
Also makes the test's validation call counter atomic to fix a possible
data race between the HTTP server and test goroutines.
Replaces our custom `<GlobalSnackbar />` (MUI Snackbar + event emitter)
with [`sonner`](https://github.com/emilkowalski/sonner). Deletes
`GlobalSnackbar/`, the custom event emitter infra, and migrates ~80
source files to `toast.success()` / `toast.error()` from `sonner`.
- ~47 error toasts now surface API error detail via
`getErrorDetail(error)` in the toast description, not just a generic
message. Coincides with #22229.
- Toast messages follow an `{Action} "{entity}" {result}.` format (e.g.
`User "alice" suspended successfully.`) since toasts persist across
navigation now.
- 17 uses of `toast.promise()` for loading → success → error lifecycle.
- Some toasts include action buttons for quick navigation (e.g. "View
task", "View template").
- Multiple toasts can stack and display simultaneously.
---------
Co-authored-by: Kayla はな <mckayla@hey.com>
This pull-request moves our baseline CSS styles from the MUI theme
(`site/src/theme/mui.ts`) definition to `index.css`. As these are global
styles they should live in one dedicated place not two.
This pull-request removes the last instance of `@mui/material/Chip` from
the codebase. And removes it from our `vite.config.mts` so we no longer
have to cache it 🙂
This pull-request implements a simple filtering logic so that we're able
to pick which model the user actually used when logs were sent to AI
Bridge.
- Add `GET /aibridge/models` API endpoint that returns distinct model
names from AI Bridge interceptions, with pagination and search support
- New `ListAIBridgeModels` SQL query using case-sensitive prefix
matching (`LIKE model || '%'`) to allow B-tree index usage
- Hand-written `ListAuthorizedAIBridgeModels` in `modelqueries.go` for
RBAC authorization filter injection
- `AIBridgeModels` search query parser in searchquery/search.go
(defaults bare terms to the `model` field)
- dbauthz wrappers, dbmetrics, and dbmock implementations for the new
query
<img width="292" height="185" alt="image"
src="https://github.com/user-attachments/assets/134771df-2d26-4c54-acc4-27f58128b351"
/>
## Description
When multiple organizations have templates with the same name, the
Prometheus `/metrics` endpoint returns HTTP 500 because Prometheus
rejects duplicate label combinations. The three `coderd_insights_*`
metrics (`coderd_insights_templates_active_users`,
`coderd_insights_applications_usage_seconds`,
`coderd_insights_parameters`) used only `template_name` as a
distinguishing label, so two templates named e.g. `"openstack-v1"` in
different orgs would produce duplicate metric series.
This adds `organization_name` as a label to all three insight metric
descriptors to disambiguate templates across organizations.
## Changes
**`coderd/prometheusmetrics/insights/metricscollector.go`**:
- Added `organization_name` label to all three metric descriptors
- Added `organizationNames` field (template ID → org name) to the
`insightsData` struct
- In `doTick`: after fetching templates, collect unique org IDs, fetch
organizations via `GetOrganizations`, and build a
template-ID-to-org-name mapping
- In `Collect()`: pass the organization name as an additional label
value in every `MustNewConstMetric` call
**`coderd/prometheusmetrics/insights/testdata/insights-metrics.json`**:
Updated golden file to include `organization_name=coder` in all metric
label keys.
Fixes#21748
- Previously all tests were sharing the global http.Transport meaning on
`Close` it would close connections presumed to be idle for other tests.
fixes https://github.com/coder/internal/issues/112
Fixes#22030
## Problem
When a template has `require_active_version = true` and a workspace is
outdated, the web UI always shows "Update and start" as the **only**
button (for all users including admins), but `coder start` starts with
the old version. For admins, this silently succeeds on the stale
version. For non-admins, it goes through a clunky 403→retry path. This
also affects the VS Code extension, which calls `coder start --yes`
under the hood.
## Root Cause
`buildWorkspaceStartRequest()` in `cli/start.go` checks
`workspace.AutomaticUpdates == "always"` but ignores
`workspace.TemplateRequireActiveVersion`. The server-side autostart
already ORs both settings together:
```go
// coderd/autobuild/lifecycle_executor.go
func useActiveVersion(opts, ws) bool {
return opts.RequireActiveVersion || ws.AutomaticUpdates == "always"
}
```
The CLI was missing the `RequireActiveVersion` check.
## Fix
Add `workspace.TemplateRequireActiveVersion` to the existing OR
condition:
```go
// Before:
if workspace.AutomaticUpdates == codersdk.AutomaticUpdatesAlways || action == WorkspaceUpdate {
// After:
if workspace.AutomaticUpdates == codersdk.AutomaticUpdatesAlways || workspace.TemplateRequireActiveVersion || action == WorkspaceUpdate {
```
Now `coder start` and `coder restart` proactively use the active
template version when `require_active_version` is set, matching the web
UI and server autostart behavior. The 403→retry fallback remains as a
safety net but is no longer the primary path for any user.
## Testing
Updated `enterprise/cli/start_test.go` — all user types (owner, template
admin, ACL admin, group ACL admin, member) now expect the active version
when `require_active_version` is set, and verify the 403→retry message
does NOT appear.
When AgentAPI is configured, `WithTaskReporter` unconditionally
overrides all self-reported states to `working`. The intent was to
distrust the agent's `idle` and rely on the screen watcher, but the
override also blocks `failure` and `complete`, which only the agent can
produce (the screen watcher only knows `running`/`stable`). Tasks get
stuck as `working` or `null` forever.
Now only `idle` is overridden to `working`; `failure`, `complete`, and
`working` pass through as-is.
Also:
- Remove misplaced unconditional `"Failed to watch screen events"` log
that fired on every startup
- Add SSE reconnection with exponential backoff (1s-30s) in
`startWatcher` so it recovers from dropped connections instead of dying
silently
- Add `complete` to the `coder_report_task` tool enum, which the
`coder/claude-code` registry module already instructs agents to use but
was missing from the schema
Refs coder/internal#1350
Relates to https://github.com/coder/internal/issues/1259
Adds new database queries and telemetry collection functions to gather
task lifecycle events (pause/resume cycles, idle time) for analytics.
Task events track pause/resume activity, idle duration before pausing,
paused duration, and time from resume to first app status, filtered to
recent activity based on the telemetry snapshot interval.
🤖 Created with Mux (Opus 4.6).
## Summary
Moves expired token filtering from client-side to server-side by adding
an `include_expired` parameter to the `GetAPIKeysByLoginType` and
`GetAPIKeysByUserID` database queries. This is more efficient for large
deployments with many expired/short-lived tokens.
## Changes
- Add `include_expired` parameter to SQL queries using `OR`
short-circuit
- Add `include_expired` query parameter to `GET
/users/{user}/keys/tokens`
- Add `IncludeExpired` field to `codersdk.TokensFilter`
- Remove client-side filtering from CLI `tokens list` command
- Add `TestTokensFilterExpired` test
Fixescoder/internal#1357
<!--
If you have used AI to produce some or all of this PR, please ensure you
have read our [AI Contribution
guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING)
before submitting.
-->
part of https://github.com/coder/coder/issues/21335
This moves updating app status (used by Tasks) into the workspace agent
API over dRPC. This will allow us to update the status without having to
re-authenticate each time, like we would with an HTTP PATCH request.
Further PRs in this stack will pipe these requests thru from the CLI MCP
server to the agentsock and finally to this dRPC call to coderd.
## Problem
When a template adds a new immutable parameter, `coder update
--parameter param=value` fails with:
```
error: start workspace: parameter "machine_type" is immutable and cannot be updated
```
The interactive prompt handles this correctly (allows setting first-time
immutable params), but the CLI `--parameter` flag path does not.
## Root Cause
In `cli/parameterresolver.go`, `verifyConstraints()` runs before the
interactive prompt and unconditionally rejects any immutable parameter
during updates. It doesn't distinguish between **new** immutable
parameters (first-time use, should be allowed) and **existing** ones
(already set, should be blocked from changing).
## Fix
Added an `isFirstTimeUse` check to the immutable parameter constraint,
matching the logic already used by the interactive prompt path (line
323). New immutable parameters can now be set via `--parameter`, while
existing immutable parameters are still blocked from being changed.
## Testing
Added `TestUpdateValidateRichParameters/NewImmutableParameterViaFlag`
which:
1. Creates a workspace with a mutable parameter
2. Updates the template to add a new immutable parameter
3. Runs `coder update --parameter immutable_param=value`
4. Verifies the update succeeds and the parameter is set correctly
Fixes#22164
The provisioner state for a workspace build was being loaded for every
long-lived agent rpc connection. Since this state can be anywhere from
kilobytes to megabytes this can gradually cause the `coderd` memory
footprint to grow over time. It's also a lot of unnecessary allocations
for every query that fetches a workspace build since only a few callers
ever actually reference the provisioner state.
This PR removes it from the returned workspace build and adds a query to
fetch the provisioner state explicitly.
Adds two new icons to the icon library:
- **`anthropic.svg`** — Anthropic logo
- **`gemini-monochrome.svg`** — Gemini logo, monochrome variant
Both use `monochrome` theme handling to adapt for dark and light
backgrounds.
### Changes
- Added `anthropic.svg` and `gemini-monochrome.svg` to
`site/static/icon/`
- Registered both in `site/src/theme/icons.json` (alphabetically sorted)
- Added `monochrome` theme handling for both in
`site/src/theme/externalImages.ts`
---
Created on behalf of @tracyjohnsonux
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Closes https://github.com/coder/internal/issues/1353
Does not solve the issue, but the error is currently opaque. This fails
the test when the init fails, hopefully raising up the error.
Bumps [github.com/gohugoio/hugo](https://github.com/gohugoio/hugo) from
0.155.2 to 0.156.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/gohugoio/hugo/releases">github.com/gohugoio/hugo's
releases</a>.</em></p>
<blockquote>
<h2>v0.156.0</h2>
<p>This release brings significant speedups of <a
href="https://gohugo.io/functions/collections/where/#article">collections.Where</a>
and <a
href="https://gohugo.io/functions/collections/sort/#article">collections.Sort</a>
– but this is mostly a "spring cleaning" release, to make the
API cleaner and simpler to understand/document.</p>
<h2>Deprecated</h2>
<ul>
<li>Site.AllPages is Deprecated</li>
<li>Site.BuildDrafts is Deprecated</li>
<li>Site.Languages is Deprecated</li>
<li>Site.Data is deprecated, use hugo.Data</li>
<li>Page.Sites and Site.Sites is Deprecated, use hugo.Sites</li>
</ul>
<p>See <a
href="https://discourse.gohugo.io/t/deprecations-in-v0-156-0/56732">this
topic</a> for more info.</p>
<h2>Removed</h2>
<p>These have all been deprecated at least since <code>v0.136.0</code>
and any usage have been logged as an error for a long time:</p>
<p>Template functions</p>
<ul>
<li>data.GetCSV / getCSV (use resources.GetRemote)</li>
<li>data.GetJSON / getJSON (use resources.GetRemote)</li>
<li>crypto.FNV32a (use hash.FNV32a)</li>
<li>resources.Babel (use js.Babel)</li>
<li>resources.PostCSS (use css.PostCSS)</li>
<li>resources.ToCSS (use css.Sass)</li>
</ul>
<p>Page methods:</p>
<ul>
<li>.Page.NextPage (use .Page.Next)</li>
<li>.Page.PrevPage (use .Page.Prev)</li>
</ul>
<p>Paginator:</p>
<ul>
<li>.Paginator.PageSize (use .Paginator.PagerSize)</li>
</ul>
<p>Site methods:</p>
<ul>
<li>.Site.LastChange (use .Site.Lastmod)</li>
<li>.Site.Author (use .Site.Params.Author)</li>
<li>.Site.Authors (use .Site.Params.Authors)</li>
<li>.Site.Social (use .Site.Params.Social)</li>
<li>.Site.IsMultiLingual (use hugo.IsMultilingual)</li>
<li>.Sites.First (use .Sites.Default)</li>
</ul>
<p>Site config:</p>
<ul>
<li>paginate (use pagination.pagerSize)</li>
<li>paginatePath (use pagination.path)</li>
</ul>
<p>File caches:</p>
<ul>
<li>getjson cache</li>
<li>getcsv cache</li>
</ul>
<h2>Notes</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/gohugoio/hugo/commit/9d914726dee87b0e8e3d7890d660221bde372eec"><code>9d91472</code></a>
releaser: Bump versions for release of 0.156.0</li>
<li><a
href="https://github.com/gohugoio/hugo/commit/86aa62524f8bc36a04c8e0c0f76d1fd952585509"><code>86aa625</code></a>
hugolib: Move site.Data to hugo.Data, deprecate
Site.AllPages/BuildDrafts/Lan...</li>
<li><a
href="https://github.com/gohugoio/hugo/commit/d8ec0eeeaf2ff078565fddbbab5565a65b86346c"><code>d8ec0ee</code></a>
build(deps): bump google.golang.org/api from 0.255.0 to 0.267.0</li>
<li><a
href="https://github.com/gohugoio/hugo/commit/4148eded9c5f90036c47d241faac73e1d0c6ee70"><code>4148ede</code></a>
hugolib: Add Page.Sites to Site.Sites deprecation notice</li>
<li><a
href="https://github.com/gohugoio/hugo/commit/bba2aed3527e5c6086244c0ab76192b35b6ffa73"><code>bba2aed</code></a>
hugolib: Simplify sites collection</li>
<li><a
href="https://github.com/gohugoio/hugo/commit/29b8e17d29ad38621cf6c7c104309bcedf5c20c5"><code>29b8e17</code></a>
hugolib: Adjust hugo.Sites.Default</li>
<li><a
href="https://github.com/gohugoio/hugo/commit/3c823408ee51bbfbad847d4b9f926ba813097185"><code>3c82340</code></a>
Move common/hugo/HugoInfo to resources/page</li>
<li><a
href="https://github.com/gohugoio/hugo/commit/3f9d0ad2b6045849cbafe133cb9fb82ed5f5ee06"><code>3f9d0ad</code></a>
commands: Fix --panicOnWarning flag having no effect with module version
warn...</li>
<li><a
href="https://github.com/gohugoio/hugo/commit/ab62320d6bceece0faa7029f8bd79d546d0f64be"><code>ab62320</code></a>
hugolib: Add hugo.Sites and .Site.IsDefault(), modify .Site.Sites</li>
<li><a
href="https://github.com/gohugoio/hugo/commit/21be4afd49767eb63e3a2304b4c10816c86f799d"><code>21be4af</code></a>
build(deps): bump github.com/bep/textandbinarywriter</li>
<li>Additional commits viewable in <a
href="https://github.com/gohugoio/hugo/compare/v0.155.2...v0.156.0">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps ubuntu from `c7eb020` to `3ba65aa`.
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Summary
Harden the OAuth2 provider with multiple security fixes addressing
`coder/security#121` (CSRF session takeover) and converge on OAuth 2.1
compliance.
### Security Fixes
| Fix | Description | Commits |
|-----|-------------|---------|
| **CSRF on `/oauth2/authorize`** | Enforce CSRF protection on the
authorize endpoint POST (consent form submission) | `ba7d646`, `b94a64e`
|
| **Clickjacking: `frame-ancestors` CSP** | Prevent consent page from
being iframed (`Content-Security-Policy: frame-ancestors 'none'` +
`X-Frame-Options: DENY`) | `597aeb2` |
| **Exact redirect URI matching** | Changed from prefix matching to full
string exact matching per OAuth 2.1 §4.1.2.1 | `73d64b1`, `93897f1` |
| **Store & verify `redirect_uri`** | Store redirect_uri with auth code
in DB, verify at token exchange matches exactly (RFC 6749 §4.1.3) |
`50569b9`, `d7ca315` |
| **Mandatory PKCE** | Require `code_challenge` at authorization (for
`response_type=code`) + unconditional `code_verifier` verification at
token exchange | `d7ca315`, `1cda1a9` |
| **Reject implicit grant** | `response_type=token` now returns
`unsupported_response_type` error page (OAuth 2.1 removes implicit flow)
| `d7ca315`, `91b8863` |
### Changes by File
**`coderd/httpmw/csrf.go`** — Extended the CSRF `ExemptFunc` to enforce
CSRF on `/oauth2/authorize` in addition to `/api` routes. The consent
form POST is now CSRF-protected to prevent cross-site authorization code
theft.
**`site/site.go`** — Added `Content-Security-Policy: frame-ancestors
'none'` and `X-Frame-Options: DENY` headers to `RenderOAuthAllowPage`
(consent page only — does not affect the SPA/global CSP used by AI
tasks).
**`coderd/httpapi/queryparams.go`** — Changed `RedirectURL` from prefix
matching (`strings.HasPrefix(v.Path, base.Path)`) to full URI exact
matching (`v.String() != base.String()`), comparing scheme, host, path,
and query.
**`coderd/oauth2provider/authorize.go`** — Added PKCE enforcement:
`code_challenge` is required when `response_type=code` (via a
conditional check, not `RequiredNotEmpty`, so `response_type=token` can
reach the explicit rejection path). `ShowAuthorizePage` (GET) validates
`response_type` before rendering and returns a 400 error page for
unsupported types. `ProcessAuthorize` (POST) stores the `redirect_uri`
with the auth code when explicitly provided.
**`coderd/oauth2provider/tokens.go`** — PKCE verification is now
unconditional (not gated on `code_challenge` being present in DB). If
the stored code has a `redirect_uri`, the token endpoint verifies it
matches exactly — mismatch returns `errBadCode` → `invalid_grant`.
Missing `code_verifier` returns `invalid_grant`.
**`codersdk/oauth2.go`** — `OAuth2ProviderResponseTypeToken` constant
and `Valid()` acceptance are **kept** so the authorize handler can parse
`response_type=token` and return the proper `unsupported_response_type`
error rather than failing at parameter validation.
**`coderd/database/migrations/000421_*`** — Added `redirect_uri text`
column to `oauth2_provider_app_codes`.
### Design Decisions
**`state` parameter remains optional** — The plan initially required
`state` via `RequiredNotEmpty`, but this was reverted in `376a753` to
avoid breaking existing clients. The `state` is still hashed and stored
when provided (via `state_hash` column), securing clients that opt in.
**`response_type=token` kept in `Valid()`** — Removing it from `Valid()`
would cause the parameter parser to reject the request before the
authorize handler can return the proper `unsupported_response_type`
error. The constant is kept for correct error handling flow.
**CSP scoped to consent page only** — `frame-ancestors 'none'` is set
only on the OAuth consent page renderer, not globally. The SPA/global
CSP was previously changed to allow framing for AI tasks
([#18102](https://github.com/coder/coder/pull/18102)); this change does
not regress that.
### Out of Scope (follow-up PRs)
- Bearer tokens in query strings (needs internal caller audit)
- Scope enforcement on OAuth2 tokens
- Rate limiting on dynamic client registration
---
<details>
<summary>📋 Implementation Plan</summary>
# Plan: Harden OAuth2 Provider — Security Fixes + OAuth 2.1 Compliance
## Context & Why
Security issue `coder/security#121` reports a critical session takeover
via CSRF on the OAuth2 provider. This plan covers all remaining security
fixes from that issue **plus** convergence on OAuth 2.1 requirements.
The goal is a single PR that closes all actionable gaps.
## Current State (already committed on branch `csrf-sjx1`)
| Fix | Status | Commits |
|-----|--------|---------|
| Fix 1: CSRF on `/oauth2/authorize` | ✅ Done | `ba7d646`, `b94a64e` |
| CSRF token in consent form HTML | ✅ Done | `b94a64e` |
| `state_hash` column + storage | ✅ Done (hash stored, but state still
optional) | `9167d83`, `b94a64e` |
| Tests for CSRF + state hash | ✅ Done | `e4119b5` |
## Remaining Work
### ~~Fix 2 — Require `state` parameter~~ (DROPPED)
> **Decision:** Do not enforce `state` as required. The `state`
parameter is still hashed and stored when provided (via
`hashOAuth2State` / `state_hash` column from prior commits), but clients
are not forced to supply it. This avoids breaking existing integrations
that omit state.
**Rollback:** Remove `"state"` from the `RequiredNotEmpty` call in
`coderd/oauth2provider/authorize.go:42`:
```go
// BEFORE (current on branch)
p.RequiredNotEmpty("response_type", "client_id", "state", "code_challenge")
// AFTER
p.RequiredNotEmpty("response_type", "client_id", "code_challenge")
```
No test changes needed — tests already pass `state` voluntarily.
### Fix 4 — Exact redirect URI matching
Currently `coderd/httpapi/queryparams.go:233` uses prefix matching:
```go
// CURRENT — prefix match
if v.Host != base.Host || !strings.HasPrefix(v.Path, base.Path) {
```
OAuth 2.1 requires **exact string matching**. Change to:
```go
// AFTER — exact match (OAuth 2.1 §4.1.2.1)
if v.Host != base.Host || v.Path != base.Path {
```
**File: `coderd/httpapi/queryparams.go` — `RedirectURL` method**
Also update the error message from "must be a subset of" to "must
exactly match".
**Additionally**, store `redirect_uri` with the auth code and verify at
the token endpoint (RFC 6749 §4.1.3):
1. **New migration** (same migration file or a new `000421`): Add
`redirect_uri text` column to `oauth2_provider_app_codes`
2. **Update INSERT query** in `coderd/database/queries/oauth2.sql` to
include `redirect_uri`
3. **`coderd/oauth2provider/authorize.go`**: Store
`params.redirectURL.String()` when inserting the code
4. **`coderd/oauth2provider/tokens.go`**: After retrieving the code from
DB, verify that `redirect_uri` from the token request matches the stored
value exactly. Currently `tokens.go:103` calls `p.RedirectURL(vals,
callbackURL, "redirect_uri")` for prefix validation only — it must
compare against the stored redirect_uri from the code, not just the
app's callback URL.
<details>
<summary>Why both exact match AND store+verify?</summary>
Exact matching at the authorize endpoint prevents open redirectors
(attacker can't use a sub-path).
Storing and verifying at the token endpoint prevents code injection — an
attacker who steals a code can't exchange it with a different
redirect_uri than was originally authorized. This is required by RFC
6749 §4.1.3 and OAuth 2.1.
</details>
### Fix 7 — `frame-ancestors` CSP on consent page
The consent page can be iframed by a workspace app (same-site), which is
the attack vector. Add a `Content-Security-Policy` header to prevent
framing.
**File: `site/site.go` — `RenderOAuthAllowPage` function (~line 731)**
Before writing the response, add:
```go
func RenderOAuthAllowPage(rw http.ResponseWriter, r *http.Request, data RenderOAuthAllowData) {
rw.Header().Set("Content-Type", "text/html; charset=utf-8")
// Prevent the consent page from being framed to mitigate
// clickjacking attacks (coder/security#121).
rw.Header().Set("Content-Security-Policy", "frame-ancestors 'none'")
rw.Header().Set("X-Frame-Options", "DENY")
...
```
Both headers for defense-in-depth (CSP for modern browsers,
X-Frame-Options for legacy).
### OAuth 2.1 — Mandatory PKCE
Currently PKCE is checked only when `code_challenge` was provided during
authorization (`tokens.go:258`):
```go
// CURRENT — conditional check
if dbCode.CodeChallenge.Valid && dbCode.CodeChallenge.String != "" {
// verify PKCE
}
```
OAuth 2.1 requires PKCE for ALL authorization code flows. Change to:
**File: `coderd/oauth2provider/authorize.go`** — Add `"code_challenge"`
to required params:
```go
p.RequiredNotEmpty("response_type", "client_id", "code_challenge")
```
**File: `coderd/oauth2provider/tokens.go:257-265`** — Make PKCE
verification unconditional:
```go
// AFTER — PKCE always required (OAuth 2.1)
if req.CodeVerifier == "" {
return codersdk.OAuth2TokenResponse{}, errInvalidPKCE
}
if !dbCode.CodeChallenge.Valid || dbCode.CodeChallenge.String == "" {
// Code was issued without a challenge — should not happen
// with the authorize endpoint enforcement, but defend in
// depth.
return codersdk.OAuth2TokenResponse{}, errInvalidPKCE
}
if !VerifyPKCE(dbCode.CodeChallenge.String, req.CodeVerifier) {
return codersdk.OAuth2TokenResponse{}, errInvalidPKCE
}
```
**File: `codersdk/oauth2.go`** — Remove
`OAuth2ProviderResponseTypeToken` from the enum or reject it explicitly
in the authorize handler. Currently it's defined at line 216 but the
handler ignores `response_type` and always issues a code. We should
either:
- (a) Remove the `"token"` variant from the enum and reject it with
`unsupported_response_type`, OR
- (b) Add an explicit check in `ProcessAuthorize` that rejects
`response_type=token`
Option (b) is simpler and more backwards-compatible:
```go
// In ProcessAuthorize, after extracting params:
if params.responseType != codersdk.OAuth2ProviderResponseTypeCode {
httpapi.WriteOAuth2Error(ctx, rw, http.StatusBadRequest,
codersdk.OAuth2ErrorCodeUnsupportedResponseType,
"Only response_type=code is supported")
return
}
```
### OAuth 2.1 — Bearer tokens in query strings
`coderd/httpmw/apikey.go:743` accepts `access_token` from URL query
parameters. OAuth 2.1 prohibits this. However, this may be used
internally (e.g., workspace apps, DERP). Need to audit callers before
removing.
**Approach:** This is a larger change with potential breakage. Mark as a
**separate follow-up issue** rather than including in this PR. Document
the finding.
### OAuth 2.1 — Removed flows
✅ **Already compliant.** `tokens.go` only supports `authorization_code`
and `refresh_token` grant types. The implicit grant
(`response_type=token`) will be explicitly rejected per the PKCE section
above.
### OAuth 2.1 — Refresh token rotation
✅ **Already compliant.** `tokens.go:442` deletes the old API key when a
refresh token is used.
## Migration Plan
All DB changes can go in a single new migration (or extend 000420 if the
branch is rebased before merge). Columns to add:
- `redirect_uri text` on `oauth2_provider_app_codes`
The `state_hash` column is already added by migration 000420.
## Implementation Order
1. **Fix 7** — CSP headers on consent page (isolated, no deps)
2. ~~**Fix 2** — Require `state` parameter~~ (DROPPED — state stays
optional)
3. **Fix 4** — Exact redirect URI matching + store/verify redirect_uri
4. **PKCE mandatory** — Require `code_challenge` + reject
`response_type=token`
5. **Rollback** — Remove `"state"` from `RequiredNotEmpty` in
`authorize.go`
6. **Tests** — Update/add tests for all changes
7. **`make gen`** after DB changes
## Out of Scope (separate PRs)
- Bearer tokens in query strings (needs internal caller audit)
- Scope enforcement on OAuth2 tokens
- Rate limiting / quota on dynamic client registration
</details>
---
_Generated with [`mux`](https://github.com/coder/mux) • Model:
`anthropic:claude-opus-4-6` • Thinking: `xhigh`_
This pull-request removes all the magic of `@mui/material/Alert` 🥳 We're
officially free of any alerts that are being handled by Material UI so
this is dead code.
After a PostgreSQL round-trip, job timestamps lose their monotonic
clock component, making the subtraction susceptible to wall-clock
adjustments producing a small negative delta. Floor at 1ms since
a zero or negative queue wait is meaningless. Fixes TestProvisionerJobQueueWaitMetric
flakes where small negative values (~ -2ms) are observed.
Use the server-rendered meta tag value as an intermediate fallback for
theme preference, between the JS-fetched value and the default theme.
This ensures the correct theme is applied before the API response loads.
Fixes#20050
Previously, when secret deployment options like CODER_OIDC_CLIENT_SECRET
were populated, the API correctly returned the "secret": "true"
annotation, but the UI did not indicate that these secrets were
configured. The UI would show "Not set" regardless of whether the secret
was set or not.
Now, the UI checks both the secret annotation and the value_source
field. When a secret is configured (value_source is set), it displays
"Set" to indicate the secret is populated. When a secret is not
configured, it displays "Not set".
Fixes#18913
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
`--secure-auth-cookie` now automatically sources it's default value from `--access-url`
If the access url uses HTTPS, secure is set to `true`.
To revert to old behavior, set the value explicitly to `false`
If a deployment has 2 domains, overriding the oidc url allows the oidc
redirect to differ from the access_url
response to https://github.com/coder/coder/discussions/21500
**This config setting is hidden by default**
In relation to
[`internal#1281`](https://github.com/coder/internal/issues/1281)
Remove the `soft_limit` field from the `Feature` type and simplify
license limit handling. This change:
- Removes the `soft_limit` field from the API and SDK
- Uses the soft limit value as the single `limit` value in the UI and
API
- Simplifies warning logic to only show warnings when the limit is
exceeded
- Updates tests to reflect the new behavior
- Updates the UI to use the single limit value for display
In relation to
[`internal#1281`](https://github.com/coder/internal/issues/1281)
Managed agent workspace build limits are now advisory only. Breaching
the limit no longer blocks workspace creation — it only surfaces a
warning.
- Removed hard-limit enforcement in `checkAIBuildUsage` so AI task
builds are always permitted regardless of managed agent count.
- Updated the license warning to remove "Further managed agent builds
will be blocked." verbiage.
- Updated tests to assert builds succeed beyond the limit instead of
failing.
- Removed the "Limit" display from the `ManagedAgentsConsumption`
progress bar — the bar is now relative to the included allowance (soft
limit) only, and turns orange when usage exceeds it.
Bonus:
- De-MUI'd `LicenseBannerView` — replaced Emotion CSS and MUI `Link`
with Tailwind classes.
- Added `highlight-orange` color token to the Tailwind theme.
This pull-request implement animations for each of our `<ChevronDown />`
(and a few other chevrons) so that everything is uniform with
`<Autocomplete />`.
Based on previous PR reviews it appears we don't want to use these
components anymore. We previously deprecated the use of `<Stack />` in
this way in #20973 so it would be good to take the same approach here.
This PR stops Vite from repeatedly re-optimizing certain MUI modules
during development, which was triggering an HMR feedback loop and
crashing my dev environment on specific pages — most notably
`<LicensesSettingsPage />`.
After some digging, the culprit turned out to be:
```ts
import Paper from "@mui/material/Paper";
```
Importing components this way causes Vite to continuously re-optimize
them during HMR, which leads to the page refreshing over and over until
the dev server taps out and `504 "Outdated Optimize Dep"`'s us.
The fix ensures these modules are computed once at startup instead of
being reprocessed on every hot update. Development is now stable, and
the infinite refresh loop is gone.
I did experiment with using globs to handle this more generically, but
since they’re still early-access in this context, they ended up breaking
things 😔
In short: fewer re-optimizations, no more HMR meltdown, and a much
calmer dev experience.
Continuation of #22186 (without `vitest` addon)
Upgrades the dependency so that we can actively make use of new
features/speed/less-dependencies. Short simple sweet and lovely 🙂
## Summary
Custom roles that can create workspaces on behalf of other users need to
be able to list users to populate the owner dropdown in the workspace
creation UI. Previously, this required a separate `user:read`
permission, causing the dropdown to fail for custom roles.
## Changes
- Modified `GetUsers` in `dbauthz` to check if the user can create
workspaces for any owner (`workspace:create` with `owner_id: *`)
- If the user has this permission, they can list all users without
needing explicit `user:read` permission
- Added tests to verify the new behavior
## Testing
- Updated mock tests to assert the new authorization check
- Added integration tests for both positive and negative cases
Fixes#18203
Parent agents were re-using AuthInstanceID when spawning child agents.
This caused GetWorkspaceAgentByInstanceID to return the most recently
created sub agent instead of the parent when the parent tried to refetch
its own manifest.
Fix by not reusing AuthInstanceID for sub agents, and updating
GetWorkspaceAgentByInstanceID to filter them out entirely.
The existing README for the Azure Linux starter template only mentioned
that the VM is ephemeral and the managed disk is persistent, but did not
explain that the resource group, virtual network, subnet, and network
interface also persist when a workspace is stopped.
This led to confusion where users expected all Azure resources to be
cleaned up on stop, when in reality only the VM is destroyed.
## Changes
- Added the persistent networking/infrastructure resources to the
resource list
- Added "What happens on stop" section explaining which resources
persist and why
- Added "What happens on delete" section confirming all resources are
cleaned up
- Moved the existing note about ephemeral tools/files into a "Workspace
restarts" subsection for clarity
These changes exactly mirror https://github.com/coder/registry/pull/713
since the registry is not yet linked to the starter templates in
`coder/coder`. Once the registry is linked, the starter templates will
pull from the registry and this duplication will no longer be necessary.
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Add a `TaskLogPreview` component that displays the last N messages of AI
chat logs when a task is paused or its build has failed. The preview
fetches log snapshots via a new `getTaskLogs` API method and renders
them in a scrollable panel with `[user]` and `[agent]` labels, colored
left borders on type transitions, and a snapshot timestamp tooltip.
The build-logs auto-scroll in `BuildingWorkspace` was simplified by
replacing the `useRef`/`useLayoutEffect` pattern with a `useCallback`
ref, and client-side message slicing was removed in favor of
server-side limits. `InfoTooltip` now accepts an optional `title` prop.
Updates the reference to `ANTHROPIC_API_KEY` in the Claude Code client
docs to `ANTHROPIC_AUTH_TOKEN`.
**File changed:**
- `docs/ai-coder/ai-bridge/clients/claude-code.md` — configuration
instructions
Created on behalf of @dannykopping
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Since Go 1.22, the loop variable capture issue is resolved. Variables
declared by for loops are now per-iteration rather than per-loop, making
the 'v := v' pattern unnecessary.
`coder templates version list` makes a call to determine the `active`
version:
```
➜ ~ coder templates version list aws-linux-dynamic
NAME CREATED AT CREATED BY STATUS ACTIVE
infallible_feistel2 2025-10-10T10:34:02+11:00 rowansmith Succeeded Active
mystifying_almeida1 2025-10-10T10:32:38+11:00 rowansmith Succeeded
```
but this is not carried across to the `-ojson` output version, so this
PR implements that in order to support programattic addressing.
It is added a top level entry. If it should be nested under
`TemplateVersion` let me know.
```
➜ ~ ./Downloads/coder-cli-templateversions-json-active templates version list aws-linux-dynamic -ojson | jq '.[] | select(.active == true) | { active, id: .TemplateVersion.id }'
{
"active": true,
"id": "38f66eae-ec63-49b7-a9d2-cdb79c379d19"
}
➜ ~ ./Downloads/coder-cli-templateversions-json-active templates version list aws-linux-dynamic -ojson |jq '.[] | select(.active == true)'
{
"TemplateVersion": {
"id": "38f66eae-ec63-49b7-a9d2-cdb79c379d19",
"template_id": "1a84ce78-06a6-41ad-99e4-8ea5d9b91e89",
"organization_id": "35f75f20-890e-4095-95f1-bb8f2ba02e79",
"created_at": "2025-10-10T10:34:02.254357+11:00",
"updated_at": "2025-10-10T10:34:46.594032+11:00",
"name": "infallible_feistel2",
"message": "Uploaded from the CLI",
"job": {
"id": "8afd05ca-b4be-48d5-a6b9-82dcfd12c960",
"created_at": "2025-10-10T10:34:02.251234+11:00",
"started_at": "2025-10-10T10:34:02.257301+11:00",
"completed_at": "2025-10-10T10:34:46.594032+11:00",
"status": "succeeded",
"worker_id": "a0940ade-ecdd-47c2-98c6-f2a4e5eb0733",
"file_id": "05fd653c-3a3f-4e5c-856b-29407732e1b1",
"tags": {
"owner": "",
"scope": "organization"
},
"queue_position": 0,
"queue_size": 0,
"organization_id": "35f75f20-890e-4095-95f1-bb8f2ba02e79",
"initiator_id": "d20c05ff-ecf3-4521-a99d-516c8befbaa6",
"input": {
"template_version_id": "38f66eae-ec63-49b7-a9d2-cdb79c379d19"
},
"type": "template_version_import",
"metadata": {
"template_version_name": "",
"template_id": "00000000-0000-0000-0000-000000000000",
"template_name": "",
"template_display_name": "",
"template_icon": ""
},
"logs_overflowed": false
},
"readme": "---\ndxxxxx,
"created_by": {
"id": "d20c05ff-ecf3-4521-a99d-516c8befbaa6",
"username": "rowansmith",
"name": "rowan smith"
},
"archived": false,
"has_external_agent": false
},
"active": true
}
```
Closes#21130
Adds documentation for Google Antigravity IDE integration, following the
same pattern as Cursor and Windsurf (dedicated page for desktop IDEs).
**Changes:**
- `docs/user-guides/workspace-access/antigravity.md` — New dedicated
page with install guide, Coder extension setup, and template
configuration example using the [Antigravity registry
module](https://registry.coder.com/modules/coder/antigravity)
- `docs/user-guides/workspace-access/index.md` — Added Antigravity IDE
section alongside Cursor and Windsurf
- `docs/manifest.json` — Added sidebar navigation entry after Windsurf
Antigravity uses the `antigravity://` protocol (added in #20873) and the
built-in `/icon/antigravity.svg` icon (added in #21068). The [registry
module](https://registry.coder.com/modules/coder/antigravity) wraps
`vscode-desktop-core` with `protocol = "antigravity"`.
Created on behalf of @matifali
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
### Notes
- Closes https://github.com/coder/internal/issues/558
- I closed previous attempt with `ptySemaphore`:
https://github.com/coder/coder/pull/21981
- We can consider implementing the retries proposed by Spike in:
https://github.com/coder/coder/pull/21981#pullrequestreview-3783200423,
if increasing the limit isn’t enough.
- I looked into Datadog — this particular test doesn’t seem very flaky
right now. It failed once in the Nightly gauntlet (3 weeks ago), but it
hasn’t failed again in the last 3 months (at least I couldn’t find any
other failures in Datadog).
## Fix PTY exhaustion flake on macOS CI
### Problem
macOS CI runners were experiencing PTY exhaustion during test runs,
causing flakes. The default PTY limit on macOS is 511, which can be
insufficient when running parallel tests.
### Solution
Added a CI step to increase the PTY limit on macOS runners from the
default 511 to the maximum allowed value of 999 before running tests.
### Changes
- Added `Increase PTY limit (macOS)` step in `.github/workflows/ci.yaml`
- Sets `kern.tty.ptmx_max=999` using `sysctl` (maximum value on our CI
runners)
- Runs only on macOS runners before the test-go-pg action
Description:
This PR updates the bundled Terraform binary and related version pins
from 1.14.1 to 1.14.5 (base image, installer fallback, and CI/test
fixtures). Terraform is statically built with an embedded Go runtime.
Moving to 1.14.5 updates the embedded toolchain and is intended to
address Go stdlib CVEs reported by security scanning.
Notes:
- Change is version-only; no functional Coder logic changes.
- Backport-friendly: intended to be cherry-picked to release branches
after merge.
## Summary
coder-logstream-kube and other tools that use the agent token to connect
to the RPC endpoint were incorrectly triggering connection monitoring,
causing false connected/disconnected timestamps on the agent. This led
to VSCode/JetBrains disconnections and incorrect dashboard status.
## Changes
Add a `role` query parameter to `/api/v2/workspaceagents/me/rpc`:
- `role=agent`: triggers connection monitoring (default for the agent
SDK)
- any other value (e.g. `logstream-kube`): skips connection monitoring
- omitted: triggers monitoring for backward compatibility with older
agents
The agent SDK now sends `role=agent` by default. A new `Role` field on
the `agentsdk.Client` allows non-agent callers to specify a different
role.
## Required follow-up
coder-logstream-kube needs to set `client.Role = "logstream-kube"`
before calling `ConnectRPC20()`. Without that change, it will still send
`role=agent` and trigger monitoring.
Fixes#21625
At present it is not possible to obtain the `id` of the template version
in the table output:
```
➜ ~ coder templates version list -h
coder v2.30.1+16408b1
USAGE:
coder templates versions list [flags] <template>
List all the versions of the specified template
OPTIONS:
-O, --org string, $CODER_ORGANIZATION
Select which organization (uuid or name) to use.
-c, --column [name|created at|created by|status|active|archived] (default: name,created at,created by,status,active)
Columns to display in table output.
➜ ~ coder templates version list aws-linux-dynamic
NAME CREATED AT CREATED BY STATUS ACTIVE
infallible_feistel2 2025-10-10T10:34:02+11:00 rowansmith Succeeded Active
mystifying_almeida1 2025-10-10T10:32:38+11:00 rowansmith Succeeded
```
Adding this because it is useful when wanting to programatically
retrieve the details of the latest template version, and `-ojson` does
not include `active` details in it's output.
```
➜ Downloads ./coder-cli-templateversions-list-id templates version list -h
coder v2.30.1-devel+bab99db9e7
USAGE:
coder templates versions list [flags] <template>
List all the versions of the specified template
OPTIONS:
-O, --org string, $CODER_ORGANIZATION
Select which organization (uuid or name) to use.
-c, --column [id|name|created at|created by|status|active|archived] (default: name,created at,created by,status,active)
Columns to display in table output.
--include-archived bool
Include archived versions in the result list.
-o, --output table|json (default: table)
Output format.
———
Run `coder --help` for a list of global options.
➜ Downloads ./coder-cli-templateversions-list-id templates version list aws-linux-dynamic -c id,name,'created at','created by',status,active
ID NAME CREATED AT CREATED BY STATUS ACTIVE
38f66eae-ec63-49b7-a9d2-cdb79c379d19 infallible_feistel2 2025-10-10T10:34:02+11:00 rowansmith Succeeded Active
aa797ea5-4221-461b-80b0-90c5164f8dc0 mystifying_almeida1 2025-10-10T10:32:38+11:00 rowansmith Succeeded
```
Closes#20965
This pull-request enables a quick permission check that the user is
allowed to view the `<RequestLogsPage />` under the admin panel.
Previously, users would be able to view this page and browse their own
logs if they had this permission (which was fine), however now we've
decided as this is an admin page, they should only be able to do this
via the API/CLI not from the main admin panel.
The login page component incorrectly uses client-side routing to handle
redirects to /oauth2/authorize. Since this path is not defined as a
route in the react application but as a backend endpoint for the OAuth2
provider flow, the frontend displays a 404 "Route not found" error.
- resolves#22097
<!--
If you have used AI to produce some or all of this PR, please ensure you
have read our [AI Contribution
guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING)
before submitting.
-->
Relates to https://github.com/coder/internal/issues/1252
When a workspace with a TaskID hits its deadline, use
BuildReasonTaskAutoPause instead of BuildReasonAutostop. This allows
downstream systems to distinguish between regular autostop and task
workspace pauses.
Created by Mux using Opus 4.5.
Remove the warning about JetBrains Toolbox not persisting log level
configuration between restarts.
As of JetBrains Toolbox 3.2, log level configuration now persists
between restarts, making this warning outdated.
Created on behalf of @matifali
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
## Summary
> NOTE: Calling this out as a breaking change in case existing consumers
of the CLI depend on being able to see expired tokens OR being able to
delete tokens immediately.
Updates the `coder tokens rm` command to immediately expire a token by
ID, preserving the token record for audit trail purposes. Tokens can
still be deleted by passing `--delete`.
## Problem
During an incident on dev.coder.com, operators needed to urgently expire
an API key that was stuck in a hot loop. The only way to do this was via
direct database access:
```sql
UPDATE api_keys SET expires_at = NOW() WHERE id = '...';
```
This is not ideal for operators who may not have direct DB access or
want to avoid manual SQL.
## Solution
This PR adds:
- **API endpoint**: `PUT /api/v2/users/{user}/keys/{keyid}/expire` -
Sets the token's `expires_at` to now
- **SDK method**: `ExpireAPIKey(ctx, userID, keyID)`
- **Updates CLI**: `coder tokens rm <name|id|token>` now _expires_ by
default. You can still delete by passing the `--delete` flag. The `coder
tokens list` command now also hides expired tokens by default. You can
`--include-expired` if needed to include them.
- **Audit logging**: The expire action is logged with old and new key
states
## Test plan
- Tests cover: owner expiring own token, admin expiring other user's
token, non-admin cannot expire other's token, 404 for non-existent token
Closes#21782🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Closes#20859
This page previously wasn't rendered to the user, however, there is a
possibility that they can navigate to this page and things will end up
in `<Spinner />`s until the requests ultimately fail. We can mitigate
this problem by showing them the `<RequirePermission />` modal.
<img width="1456" height="861" alt="image"
src="https://github.com/user-attachments/assets/57195643-ad55-4340-9c97-f8247b05a13b"
/>
Bumps rust from `760ad1d` to `9663b80`.
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Closes#21703
This doesn't make sense to have an `Activity bump` value when the
`Default autostop` is set to `0`. There is nothing to bump if we don't
have a timed stopping mechanism on the container. This is already
present on the backend and now we're describing this to the user on the
frontend.
## Summary
The license removal confirmation dialog always showed:
> Removing this license will disable all Premium features. You add a new
license at any time.
This is misleading when the license being removed is already expired —
an expired license isn't providing any features, so removing it won't
disable anything.
## Changes
- Extracted `isExpired` variable in `LicenseCard` (reusing the existing
expiry check)
- Made the dialog description conditional:
- **Expired license**: "This license has already expired and is not
providing any features. Removing it will not affect your current
entitlements."
- **Active license**: "Removing this license will disable all Premium
features. You can add a new license at any time."
- Also fixed a minor typo in the active license message ("You add" →
"You can add")
- Added two new tests covering both dialog variants
## Testing
All 5 `LicenseCard` tests pass, including the 2 new ones:
- `shows expired removal message for expired licenses`
- `shows disabling features warning for active licenses`
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
## Problem
Site-wide admins (e.g., Owners) could not use `coder create --org <org>`
to create workspaces in organizations they are not members of. The error
was:
```
$ coder create my-workspace -t docker --org data-science
error: organization "data-science" not found, are you sure you are a member of this organization?
```
This was inconsistent with the web UI, where Owners can create
workspaces in any organization.
## Root Cause
The CLI's `OrganizationContext.Selected()` function only checked the
user's membership list, ignoring site-wide RBAC permissions that grant
Owners access to all organizations.
## Solution
Added a fallback in `OrganizationContext.Selected()` that fetches the
org directly via the API when not found in the membership list. This
works because the API endpoint applies RBAC filtering, allowing Owners
to read any org.
## Impact
This fixes `coder create --org` and all other CLI commands that use
`OrganizationContext.Selected()` (29+ commands), including:
- `coder templates push --org <any-org>`
- `coder organizations members add --org <any-org>`
- `coder provisioner list --org <any-org>`
## Testing
Added `TestEnterpriseCreate/OwnerCanCreateInNonMemberOrg` which:
- Creates an Owner user who is NOT a member of a second org
- Verifies they can create a workspace there using `--org`
- Properly fails without the code fix, passes with it
---
*This PR was generated by [mux](https://mux.coder.com) but reviewed by a
human.*
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Closes#16148
This pull-request resolves a few issues with wider displays.
Particularly in ensuring the content's container center's as one would
expect and the content of the headings isn't being contained into a
`max-w-prose`.
**Background**
Reported in #17417, there is a `deleted` query parameter supported by
/api/v2/templates, but we do not respect this field on the client,
showing the "Create Workspace" button for deleted templates.
**Expected Behavior**
Don't show the "Create Workspace" button for deleted templates.
**Notes**
This PR adds a new `deleted` field to the templates API response.
Co-authored-by: Danielle Maywood <danielle@themaywoods.com>
## Description
This PR wires up the metrics scanner in the Makefile to automatically regenerate metrics documentation when source files change.
## Changes
* Add Makefile target `scripts/metricsdocgen/generated_metrics` to run the AST scanner to generate the metrics file
* Update `docs/admin/integrations/prometheus.md` Makefile target to depend on `scripts/metricsdocgen/generated_metrics`
* Add `scripts/metricsdocgen/README.md` documenting the metrics generation process
Closes: https://github.com/coder/coder/issues/13223
## Description
This PR refactors `scripts/metricsdocgen/main.go` to support merging static and generated metrics files for documentation generation.
The static `metrics` file remains necessary for metrics not defined in the coder codebase (`go_*`, `process_*`, `promhttp_*`, `coder_aibridged_*`), as well as **edge cases** the scanner cannot handle (e.g., such as metrics with runtime-determined labels or function-local variable references for fields, ...). Handling these edge cases in the scanner would make it significantly more complex, so we keep this hybrid approach to accommodate them. This means that in such cases, developers need to update the `metrics` file directly, meaning there is still a risk of out-of-date information in the documentation. However, this solution should already encompass most cases.
Static metrics take priority over generated metrics when both files contain the same metric name, allowing manual overrides without modifying the scanner. Some of these edge cases could be easily fixed by updating the codebase to use one of the supported patterns.
## Changes
* Update `scripts/metricsdocgen/main.go` to read from two separate metrics files:
* `metrics`: static, manually maintained metrics (e.g., `go_*`, `process_*`, `promhttp_*`, `coder_aibridged_*`)
* `generated_metrics`: auto-generated by the AST scanner
* Update `metrics` file to contain only static and edge-case metrics
* Skip metrics with empty HELP descriptions in the scanner
* Update `generated_metrics` to reflect skipped metrics
* Update `docs/admin/integrations/prometheus.md` with merged metrics
Related to: https://github.com/coder/coder/issues/13223
**Disclosure:** This PR was mainly developed with Claude Sonnet 4, with iterative review and refinement by @ssncferreira
## Description
This PR implements extraction of metrics defined using `promauto.With()` factory patterns.
## Changes
* Add `extractPromautoMetric()` to handle:
* `promauto.With(reg).NewCounterVec(prometheus.CounterOpts{...}, labels)`
* `factory.NewGaugeVec(prometheus.GaugeOpts{...}, labels)`
* Script generates an updated `scripts/metricsdocgen/generated_metrics` file
Related to: https://github.com/coder/coder/issues/13223
**Disclosure:** This PR was mainly developed with Claude Sonnet 4, with iterative review and refinement by @ssncferreira
## Description
This PR implements extraction of metrics defined using `prometheus.New*()` and `prometheus.New*Vec()` patterns with `*Opts{}` structs.
## Changes
* Add `extractOptsMetric()` to handle:
* `prometheus.NewGauge(prometheus.GaugeOpts{...})`
* `prometheus.NewCounter(prometheus.CounterOpts{...})`
* `prometheus.NewHistogram(prometheus.HistogramOpts{...})`
* `prometheus.NewSummary(prometheus.SummaryOpts{...})`
* `prometheus.New*Vec(prometheus.*Opts{...}, labels)`
* Script generates an updated `scripts/metricsdocgen/generated_metrics` file
Related to: https://github.com/coder/coder/issues/13223
**Disclosure:** This PR was mainly developed with Claude Sonnet 4, with iterative review and refinement by @ssncferreira
## Description
This PR implements extraction of metrics defined using the `prometheus.NewDesc()` pattern.
## Changes
* Add `extractNewDescMetric()` to extract metrics from `prometheus.NewDesc()` calls
* Script generates an updated `scripts/metricsdocgen/generated_metrics` file
Related to: https://github.com/coder/coder/issues/13223
**Disclosure:** This PR was mainly developed with Claude Sonnet 4, with iterative review and refinement by @ssncferreira
## Description
This PR adds an AST-based scanner to automatically generate Prometheus metrics documentation from the coder source code.
## Changes
* Add `scripts/metricsdocgen/scanner/scanner.go` with:
* Directory walking for `agent/`, `coderd/`, `enterprise/`, `provisionerd/`
* Go file parsing (skipping `*_test.go` files)
* AST inspection for metric extraction
* `Metric.String()` for Prometheus text exposition format rendering
* `writeMetrics()` to output metrics to stdout
* Placeholder `extractMetricFromCall()` (implemented in subsequent PRs)
* Empty `scripts/metricsdocgen/generated_metrics` placeholder (populated by subsequent PRs)
**Note:** To facilitate the review process, this was separated into scoped stacked PRs. The division was based on the main structure, the different Prometheus patterns currently present in the codebase, and updates to the build process.
Related to: https://github.com/coder/coder/issues/13223
**Disclosure:** This PR was mainly developed with Claude Sonnet 4, with iterative review and refinement by @ssncferreira
This pull-request refactors the `<Combobox />` component from a
monolithic design to a composable compound component pattern, providing
more flexibility and reusability across the codebase
- Migrates `<SelectFilter />` to use the new `<Combobox />` instead of
the legacy `<SelectMenu />` components
- Updates all existing consumers of `<Combobox />` and `<SelectFilter
/>` to use the new API
<img
src="https://github.com/user-attachments/assets/a3336431-590c-48b5-adde-3fc5c16f459d"
/>
The `<Combobox />` component has been refactored to use a compound
component pattern, exposing:
- `Combobox` - Root component with context provider for open/value state
- `ComboboxTrigger` - Trigger wrapper (re-exports PopoverTrigger)
- `ComboboxButton` - Styled button with chevron and selected option
display
- `ComboboxContent` - Popover content with Command wrapper
- `ComboboxInput` - Search input (re-exports CommandInput)
- `ComboboxList` - List container (re-exports CommandList)
- `ComboboxItem` - Individual option with checkmark indicator
- `ComboboxEmpty` - Empty state (re-exports CommandEmpty)
- `useCombobox` - Hook to access combobox context
This pattern allows consumers to compose their own combobox layouts
while sharing consistent behavior and styling.
Furthermore, we had an issue with `CreateWorkspacePageView.stories.tsx`
lacking stories which would let us see the passed parameters and presets
in context. I've added stories to surround this.
### Updated Consumers
- `DynamicParameter.tsx` - Updated to use new Combobox API for parameter
options
- `CreateWorkspacePageView.tsx` - Updated preset combobox usage
- `IdpOrgSyncPageView.tsx` - Updated organization sync form
- `IdpGroupSyncForm.tsx` - Updated group sync form
- `IdpRoleSyncForm.tsx` - Updated role sync form
- `WorkspacesPage/filter/menus.tsx` - Updated workspace filter menus
---------
Co-authored-by: ケイラ <mckayla@hey.com>
This PR adds some metrics to help identify job enqueue rates and
latencies. This work was initiated as a way to help reduce the cost of
the observation/measurement itself for autostart scaletests, which
impacts our ability to identify/reason about the load caused by
autostart. See: https://github.com/coder/internal/issues/1209
I've extended the metrics here to account for regular user initiated
builds, prebuilds, autostarts, etc. IMO there is still the question here
of whether we want to include or need the `transition` label, which is
only present on workspace builds. Including it does lead to an increase
in cardinality, and in the case of the histogram (when not using native
histograms) that's at least a few extra series for every bucket. We
could remove the transition label there but keep it on the counter.
Additionally, the histogram is currently observing latencies for other
jobs, such as template builds/version imports, those do not have a
transition type associated with them.
Tested briefly in a workspace, can see metric values like the following:
-
`coderd_workspace_builds_enqueued_total{build_reason="autostart",provisioner_type="terraform",status="success",transition="start"}
1`
-
`coderd_provisioner_job_queue_wait_seconds_bucket{build_reason="autostart",job_type="workspace_build",provisioner_type="terraform",transition="start",le="0.025"}
1`
---------
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Closes#21830
Remove redundant icon sizing across the frontend. Components like
`Button`, `DropdownMenuItem`, and `CommandItem` already control child
SVG sizes via CSS selectors (e.g., `[&>svg]:size-icon-lg`), so explicit
`size` props and `className` overrides on icons nested inside them are
unnecessary. This PR strips those out and lets parent components handle
sizing consistently.
As a bonus, also migrates the `DropdownArrow` component from Emotion
CSS-in-JS to Tailwind utilities, replaces raw `<a>` tags with the `<Link
/>` component in the Premium page, and adds Storybook coverage for
`PremiumPageView`.
The AI Bridge setup docs showed `CODER_AIBRIDGE_ENABLED=true coder
server` as a single line, which can confuse users into thinking the env
var is a one-time prefix rather than a persistent setting.
Split this into `export CODER_AIBRIDGE_ENABLED=true` on its own line
followed by `coder server`, which is clearer and consistent with how the
Bedrock credentials section already handles env vars in the same file.
Created on behalf of @dannykopping
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
## Problem
CI failure showed 3 goroutines leaked in the prebuilds reconciler, all
stuck in `select` state:
1) `MetricsCollector.BackgroundFetch` (metrics goroutine)
2) `StoreReconciler.Run` (main reconciliation loop)
3) `StoreReconciler.Run.func3()` (provisioner job publisher goroutine)
All three goroutines were waiting for `ctx.Done()`, which likely means
`cancelFn()` was never called to trigger shutdown.
**Note:** I was unable to reproduce the flake locally. The likely cause
was a race condition between `Run()` and `Stop()` where `Stop()` could
check `running` (seeing `false`), return early, and then `Run()` would
start goroutines that never get cleaned up. This could happen in any
`coderd` test that starts a server with prebuilds enabled.
### Problems identified
1) Missing waitgoroup tracking: provisioner job publisher goroutine was
not tracked in the waitgroup, therefore, this goroutine was not tracked
for a clean shutdown in `Run defer func()`.
2) The provisioner job publisher goroutine had a redundant `case
<-c.done` that could race with `Stop()` select statement.
3) Race condition between `Run()` and `Stop()`: the `running` and
`stopped` fields were `atomic.Bool` values checked and set
independently, allowing a window where `Stop()` could see
`running=false` and return early, then `Run()` would set `running=true`
and start goroutines that would never be cleaned up. This could happen
in any `coderd` test that starts a server with prebuilds enabled.
## Changes
* Added `wg.Add(1)` and `defer wg.Done()` to track provisioner job
publisher goroutine in waitgroup
* Removed redundant `case <-c.done` from provisioner job publisher
goroutine to eliminate race condition
* Replaced `atomic.Bool` for `running` and `stopped` with a `sync.Mutex`
lifecycle state, also protecting `cancelFn` under the same mutex, to
eliminate the race between `Run()` and `Stop()`
* Added a guard in `Run()` to prevent double-start (`c.stopped ||
c.running`)
* Improved comments in Stop() and Run() to clarify shutdown behavior
Closes: https://github.com/coder/internal/issues/1116
### Summary
Workspace created via mode=auto links now require explicit user
confirmation before provisioning. A warning dialog shows all prefilled
param.* values from the URL and blocks creation until the user clicks
`Confirm and Create`. Clicking `Cancel` falls back to the standard form
view.
<img width="820" height="475" alt="auto-create-consent-dialog"
src="https://github.com/user-attachments/assets/8339e3bd-434f-4a04-9385-436bf95f49d7"
/>
### Breaking behavior change
Links using `mode=auto` (e.g., "Open in Coder" buttons) will no longer
silently create workspaces. Users will now see a consent dialog and must
explicitly confirm before the workspace is provisioned. Any existing
integrations or automation relying on `mode=auto` for seamless workspace
creation will now require manual user interaction.
---------
Co-authored-by: Jake Howell <jacob@coder.com>
This change adds Linux support for Desktop VPN by aligning Linux
behavior with the existing Windows daemon implementation and adding a
Linux networking stack implementation.
### What changed
- Consolidated the daemon command implementation into a shared file:
- `cli/vpndaemon_windows_linux.go` (`//go:build windows || linux`)
- Consolidated daemon tests into a shared file:
- `cli/vpndaemon_windows_linux_test.go` (`//go:build windows || linux`)
- Removed Linux-only duplicate daemon files:
- `cli/vpndaemon_linux.go`
- `cli/vpndaemon_linux_test.go`
- Removed unsupported-platform stubs per current supported OS targets:
- `cli/vpndaemon_other.go`
- `vpn/tun.go`
- Kept Linux networking stack implementation in:
- `vpn/tun_linux.go`
### Notes
- Linux now uses the same `rpc-read-handle` / `rpc-write-handle` flags
and env vars as Windows.
- The daemon logs to stderr (via CLI logger sinks), and does not forward
logs over the RPC pipe.
## Problem
The Copilot provider was missing from the AI Bridge logs filter dropdown, so users couldn't filter interceptions by Copilot. Additionally, the `AIBridgeProviderIcon` component didn't handle the copilot provider, so it would render a fallback question mark icon.
<img width="1392" height="333" alt="Screenshot 2026-02-10 at 09 26 16" src="https://github.com/user-attachments/assets/ecb97400-a4dd-4e88-accc-68d7fdf19b2a" />
## Changes
* Added `copilot` case to `AIBridgeProviderIcon`, using the existing `/icon/github.svg`.
* Added Copilot as a provider option in the filter dropdown.
* Added `MockInterceptionAnthropic` and `MockInterceptionCopilot` mock data with sample prompts, and updated the Storybook stories to use one interception per provider.
## Problem
Previously, the AI Bridge model column icon was derived from the provider field. This worked because each provider only served its own models: OpenAI interceptions always used OpenAI models, and Anthropic interceptions always used Anthropic models.
With the introduction of the Copilot provider, this assumption no longer holds. Copilot can forward requests to both OpenAI and Anthropic models, so the provider field alone is not enough to determine the correct model icon. This caused Copilot interceptions to display a fallback question mark icon for the model.
<img width="1337" height="365" alt="Screenshot 2026-02-10 at 09 10 34" src="https://github.com/user-attachments/assets/1efd613d-16c9-4738-8337-6ccf92e610fc" />
## Changes
* Added `AIBridgeModelIcon` component that infers the model family (Claude, OpenAI) from the model name string and renders the appropriate icon.
* Updated `RequestLogsRow` to use `AIBridgeModelIcon` instead of `AIBridgeProviderIcon` in both the table row and the expanded detail view.
This PR fixes a workspace app authentication bug where requests that
include an `Authorization` header (intended for the upstream app) can
cause Coder to ignore the workspace app session cookie
(`coder_subdomain_app_session_token_*` /
`coder_path_app_session_token`). When that happens, Coder fails to mint
or renew `coder_signed_app_token` and redirects to
`/api/v2/applications/auth-redirect` instead of proxying the request to
the workspace.
This commonly shows up when users run a frontend and backend in the same
workspace and the backend requires `Authorization` (for example, `curl
-H "Authorization: bearer ..."` or browser `fetch()` calls).
Related issues / context:
* Primary bug report and repro:
[https://github.com/coder/coder/issues/21467](https://github.com/coder/coder/issues/21467)
* Related symptoms reported as CORS / redirect failures for workspace
apps:
*
[https://github.com/coder/coder/issues/20667](https://github.com/coder/coder/issues/20667)
*
[https://github.com/coder/coder/issues/19728](https://github.com/coder/coder/issues/19728)
## Root Cause
In `coderd/workspaceapps/cookies.go`, `AppCookies.TokenFromRequest`
checked `httpmw.APITokenFromRequest(r)` first. That helper returns a
token from several places, including `Authorization: Bearer ...`.
As a result, when a request included an upstream `Authorization` header,
that header value was returned as the “session token” for the app proxy,
and `coder_subdomain_app_session_token_*` was never read. Authentication
then failed and the request was treated as signed out.
## Fix
Change the precedence in `AppCookies.TokenFromRequest`:
1. First check the access-method-specific cookie:
* subdomain apps: `coder_subdomain_app_session_token_{hash}`
* path apps: `coder_path_app_session_token`
2. If not present, fall back to `httpmw.APITokenFromRequest(r)` (so
non-browser clients can still authenticate via query, header, or bearer
tokens if they really want to).
This ensures that:
* Backend requests that require `Authorization` still reach the
workspace.
* `coder_signed_app_token` can be renewed from the app session cookie
even when `Authorization` is present.
* `Authorization` is still forwarded to the upstream app (the reverse
proxy code does not strip it).
Initially, I attempted workarounds
([https://github.com/coder/coder/issues/20667#issuecomment-3868578388](https://github.com/coder/coder/issues/20667#issuecomment-3868578388),
[https://github.com/coder/coder/issues/19728#issuecomment-3868578093](https://github.com/coder/coder/issues/19728#issuecomment-3868578093)),
but adding `/auth-redirect` to the permissive CORS paths and extending
the validity of workspace app auth tokens from 1 minute to 1 hour only
partially masked the issue. After workspace restarts and token expiry, I
no longer saw CORS errors, but the tokens were still not renewed.
After patching my local Nix-based setup on Coder v1.30.0 with this
change, I can no longer observe this behavior.
When discussing the changes needed for #22032 I was complaining about
how the `overflow-hidden` didn't work correctly so we could safely
remove it.
To continue these changes, I've refactored down how we work on mobile
within these triggers and enable full truncating and `max-w-`'s on each
of the content. Everything stemmed from the `<fieldset />` having a
`width: max-content` causing the content to extend past the bounds of
the container with `flex` in-toe.
Furthermore, the `(Default)` on `Preset` has been turned into a badge so
that we get the full truncation effect as we do with `Template Version`.
Follow-up improvements here might be to wrap the content of this input
on smaller displays.
### Preview
Top is the old, bottom is the new.
<img width="924" height="594" alt="preview"
src="https://github.com/user-attachments/assets/c1bbf152-03a6-4cad-b925-aad0549536a7"
/>
I was trying to figure out why `goleak` was complaining about a dangling
http2 connection goroutine in tests. Turns out that `taskname.Generate`
will call out to Anthropic if an API key is set, and we're calling it in
`dbgen`. Modified to use testutil method instead.
Closes#22028
This pull-request simply takes debounces the message sent to our
web-socket backend and debounces it to ensure we're not overwriting the
users input as they type. As an added bonus this will debounce message
spam if people are going crazy on Radio Items or similar.
An extra flavour bit of flavour with resolving a good use-case for
`cn()` in diagnostic errors 🙂
This pull-request takes the MUI based components from `<AuditLogRow />`
and its subsidiaries and updates them to use the correct newer Tailwind
based components.
This reverts commit 5224387c5a.
This is causing layout shifts to `0,0` when attempting to open
dropdowns. Something more battle-tested is needed unfortunately, Radix +
Scrollgutters is really annoying.
Add the ability to pause a running task and resume a paused task directly
from the TaskPage. This includes showing contextual messages when a task
is paused (manual vs timeout) and proper error handling with dialogs for
API errors.
- Extract task action logic into reusable mutations (api/queries/tasks.ts)
- Move TaskActionButton to modules/tasks for better organization
- Add pause button to TaskStartingAgent component
- Show appropriate state messages for transitioning states (pausing,
canceling, deleting)
The "Deploy PR manually" image (`deploy-pr-manually.png`) referenced in
the contributing docs has never existed in the repository, resulting in
a broken image on the [docs
site](https://coder.com/docs/about/contributing/CONTRIBUTING#deploying-a-pr).
This PR removes the broken `<Image>` tag and ends the sentence with a
period instead. The `pr-deploy.yaml` workflow link remains intact for
users to navigate to the workflow dispatch page directly.
Created on behalf of @DavidFrawormo
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Bumps the x group with 2 updates:
[golang.org/x/oauth2](https://github.com/golang/oauth2) and
[golang.org/x/sys](https://github.com/golang/sys).
Updates `golang.org/x/oauth2` from 0.34.0 to 0.35.0
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/golang/oauth2/commit/89ff2e1ac388c1a234a687cb2735341cde3f7122"><code>89ff2e1</code></a>
google: add safer credentials JSON loading options.</li>
<li>See full diff in <a
href="https://github.com/golang/oauth2/compare/v0.34.0...v0.35.0">compare
view</a></li>
</ul>
</details>
<br />
Updates `golang.org/x/sys` from 0.40.0 to 0.41.0
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/golang/sys/commit/fc646e489fd944b6f77d327ab77f1a4bab81d5ad"><code>fc646e4</code></a>
cpu: use IsProcessorFeaturePresent to calculate ARM64 on windows</li>
<li><a
href="https://github.com/golang/sys/commit/f11c7bb268eb8a49f5a42afe15387a159a506935"><code>f11c7bb</code></a>
windows: add IsProcessorFeaturePresent and processor feature consts</li>
<li><a
href="https://github.com/golang/sys/commit/d25a7aaff8c2b056b2059fd7065afe1d4132e082"><code>d25a7aa</code></a>
unix: add IoctlSetString on all platforms</li>
<li><a
href="https://github.com/golang/sys/commit/6fb913b30f367555467f08da4d60f49996c9b17a"><code>6fb913b</code></a>
unix: return early on error in Recvmsg</li>
<li>See full diff in <a
href="https://github.com/golang/sys/compare/v0.40.0...v0.41.0">compare
view</a></li>
</ul>
</details>
<br />
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps rust from `df6ca8f` to `760ad1d`.
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Resolves the TODO in TestPool by adding TestPool_Expiry which uses Go
1.25's testing/synctest to verify TTL-based cache eviction.
I wanted to get familiar with the new `synctest` package in Go 1.25 and
found this TODO comment, so I decided to take a stab at it 😄
Migrates `ConnectionLogRow` and `ConnectionLogDescription` off MUI and
Emotion. Replaces `@mui/material/Link` with the existing shadcn-based
`Link` component, swaps the deprecated `Stack` wrappers for plain divs
with Tailwind flex utilities, and converts all Emotion `css` prop styles
to Tailwind classes.
Also fixes a pre-existing lint issue where `tabIndex` was set on a
non-interactive div.
Replace all usages of MUI's `visuallyHidden` utility from `@mui/utils`
with Tailwind's `sr-only` class. Both produce identical CSS, so this is
a no-op behaviorally -- just removes another MUI dependency from the
codebase. Also updates the accessibility example in the frontend
contributing docs to match.
closes: https://github.com/coder/internal/issues/1331
Fixes up an issue in the test where we end up calling `FailNow` outside
the main test goroutine. Also adds the ability to name a `ptytest.PTY`
for cases like this one where we start multiple commands. This will help
debugging if we see the issue again.
This doesn't address the root cause of the failure, but I think we
should close the flake issue. I think we'd need like a stacktrace of all
goroutines at the point of failing the test, but that's way too much
effort unless we see this again.
Closes https://github.com/coder/internal/issues/1261.
This pull request adds an endpoint to pause coder tasks by stopping the
underlying workspace.
* Instead of `POST /api/v2/tasks/{user}/{task}/pause`, the endpoint is
currently experimental.
* We do not currently set the build reason to `task_manual_pause`,
because build reasons are currently only used on stop transitions.
This pull-request takes our `@mui/*` dependencies and replaces them with
shiny new Tailwind ones. Furthermore, it resolves an issue with the
`input` where `aria-invalid` wouldn't give it a red-ring like
`<InputGroup />` does.
As an added touch we've applied Formik to `<RequestOTPPage />` so that
we can render an invalid email easily.
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
This pull-request finds all of our previous instances of the MUI-based
Latency `color`'s and updates them to use the equivalents form the
Tailwind package.
Adds a standalone command that acts as a mock telemetry server,
receiving snapshots and printing them as a JSON stream to stdout. Useful
for local development testing with scripts/develop.sh by setting
CODER_TELEMETRY_ENABLE and CODER_TELEMETRY_URL environment variabless.
Adds coderd_template_workspace_build_duration_seconds histogram that
tracks the full duration from workspace build creation to agent ready.
This captures the complete user-perceived build time including
provisioning and agent startup.
The metric is emitted when the agent reports ready/error/timeout via the
lifecycle API, ensuring each build is counted exactly once per replica.
Previously, UpsertBoundaryUsageStats (INSERT...ON CONFLICT DO UPDATE) and
GetAndResetBoundaryUsageSummary (DELETE...RETURNING) could race during
telemetry period cutover. Without serialization, an upsert concurrent with the
delete could lose data (deleted right after being written) or commit after the
delete (miscounted in the next period). Both operations now acquire
LockIDBoundaryUsageStats within a transaction to ensure a clean cutover.
This pull request updates the documentation review workflow in
`.github/workflows/doc-check.yaml` to improve clarity and introduce
sticky comment logic for doc-check reviews. The changes focus on
refining the review context messages and providing detailed instructions
for updating existing doc-check comments, ensuring more consistent and
actionable documentation feedback.
**Workflow message and prompt improvements:**
* Refined the context messages for different PR trigger types to be
clearer and less repetitive, making instructions more concise for the
agent.
**Sticky comment logic and instructions:**
* Updated the task prompt to instruct the agent to look for an existing
doc-check comment containing `<!-- doc-check-sticky -->` and update it
instead of creating a new one, supporting more efficient and organized
review threads.
* Added detailed instructions for how to update sticky comments,
including checking off addressed items, striking through items no longer
needed, adding new items, and warning if changes can't be verified.
* Modified the comment format example to include sticky comment
conventions, such as strikethrough for reverted items, checkboxes for
addressed items, and warnings for unverifiable documentation changes.
* Ensured the `<!-- doc-check-sticky -->` marker is placed at the end of
the comment for easier identification and updates in future runs.
## Description
Fixes an incorrect path in the air-gapped/offline installation
documentation for publishing Coder modules to Artifactory.
The [coder/registry](https://github.com/coder/registry) repo has the
following structure:
```
registry/ # repo root
└── registry/ # subdirectory
└── coder/
└── modules/
```
The documentation previously instructed users to run:
```shell
cd registry/coder/modules
```
But the correct path is:
```shell
cd registry/registry/coder/modules
```
This was causing confusion for users trying to set up Coder modules in
air-gapped environments with Artifactory or similar repository managers.
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Adds a Go wrapper (`scripts/apidocgen/swaginit/main.go`) that calls
swag's Go API with `Strict: true`. The `--strict` flag isn't available
in swag's CLI in any version, so the wrapper is the only way to enable
it.
Also upgrades swag from v1.16.2 to v1.16.6 (better generics support,
precise numeric formats, `x-enum-descriptions`, CVE-2024-45338 fix).
Closes [`internal#1292`](https://github.com/coder/internal/issues/1292)
This pull-request reduces our nesting of the `View Task` button. Its
easier to jump to tasks now as we don't have to wait for the app status
to exist.
Previously we returned 400 Bad Request for all non-active states. This
was semantically incorrect for transitional and paused states where the
request is valid but conflicts with current state.
We now return 409 Conflict for pending/initializing/paused (resolvable
by waiting or resuming) and 400 for error/unknown (actual problems).
This enables client-side auto-resume orchestration per the task
lifecycle RFC.
Closescoder/internal#1265
Task snapshots were orphaned when tasks were soft-deleted. The
`task_snapshots` table has an `ON DELETE CASCADE` foreign key, but
that only fires on hard deletes.
Modified DeleteTask to use a CTE that atomically soft-deletes the
task and removes its snapshot in a single transaction. The query now
returns just the task UUID instead of the full row.
Closescoder/internal#1283
Relates to https://github.com/coder/coder/pull/21922 /
https://github.com/coder/internal/issues/1259
* Adds `dbfake.BuilderOption func(*WorkspaceBuildBuilder)`
* Adds `BuilderOption` methods for setting various provisioner job
related fields on `WorkspaceBuildBuilder`.
* Migrates a number of existing tests that previously dependeded on
provisioner job timing to use these updated methods in the following
packages:
* `coderd/jobreaper`
* `coderd/notifications/reports`
* `enterprise/coderd/schedule`
* `enterprise/coderd/prebuilds`
* `scripts/workspace-runtime-audit`
🤖 Created using Mux (Opus 4.5)
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
We attempted to unify these previously in #21914 however it appears I
missed dropping this a `font-weight` level. This pull-request makes this
very simple change, its now inline with the Figma design!
fixes: https://github.com/coder/internal/issues/1300
Adds brotli and zstd compression to the binary cache. Also refactors coderd's streaming encoding middleware to use the same standard set of compression algorithms, so we have them in one place.
relates to: https://github.com/coder/internal/issues/1300
Refactors the options to the site handler to take the cache directory, rather than expecting the caller to call `ExtractOrReadBinFS` and pass the results.
This is important in this stack because we need direct access to the cache directory for compressed file caching.
relates to: https://github.com/coder/internal/issues/1300
Refactors the bin handler to be a `struct` instead of a handlerfunc. The reason we want this is because we are going to introduce a cache of compressed files, so we need somewhere to put this cache.
relates to: https://github.com/coder/internal/issues/1300
Refactors the site binary handler routines to their own file. The `site.go` was getting pretty long and I want to do some refactoring on how the binary handler works.
This PR is literally just moving code from file to file; at the package level nothing is changed.
relates to: https://github.com/coder/internal/issues/1300
Adds a new package called `cachecompress` which takes a `http.FileSystem` and wraps it with an on-disk cache of compressed files. We lazily compress files when they are requested over HTTP.
# Why we want this
With cached compress, we reduce CPU utilization during workspace creation significantly.

This is from a 2k scaletest at the top of this stack of PRs so that it's used to server `/bin/` files. Previously we pegged the 4-core Coderds, with profiling showing 40% of CPU going to `zstd` compression (c.f. https://github.com/coder/internal/issues/1300).
With this change compression is reduced down to 1s of CPU time (from 7 minutes).
# Implementation details
The basic structure is taken from Chi's Compressor middleware. I've reproduced the `LICENSE` in the directory because it's MIT licensed, not AGPL like the rest of Coder.
I've structured it not as a middleware that calls an arbitrary upstream HTTP handler, but taking an explicit `http.FileSystem`. This is done for safety so we are only caching static files and not dynamically generated content with this.
One limitation is that on first request for a resource, it compresses the whole file before starting to return any data to the client. For large files like the Coder binaries, this can add 1-5 seconds to the time-to-first-byte, depending on the compression used.
I think this is reasonable: it only affects the very first download of the binary with a particular compression for a particular Coderd.
If we later find this unacceptible, we can fix it without changing interfaces. We can poll the file system to figure out how much data is available while the compression is inprogress.
follows on from #21940.
The API endpoints existed for this already, so this PR just adds CLI functionality which uses those API endpoints.
Generated with the help of Mux
## Summary
Updates the AI Governance documentation to explicitly mention that both
Community and Premium deployments include 1,000 Agent Workspace Builds.
Also clarifies that Community deployments do not have access to AI
Bridge or Agent Boundaries.
This is a follow-up to #21943 which made the same clarification in the
Tasks documentation.
## Changes
- Updated the "Agent Workspace Build Limits" section in
`docs/ai-coder/ai-governance.md`
- Added explicit mention that Community deployments lack AI Bridge and
Agent Boundaries access
---
Created on behalf of @mattvollmer
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
## Summary
Fixes flaky `TestServer/BuiltinPostgres` test caused by port conflicts
in CI.
## Fix
Increase retry attempts from 3 to 10 for better odds when port conflicts
occur.
Fixes https://github.com/coder/internal/issues/1017
Adds additional logs for determining what signal the agent receives
prior to shut down. Also helps distinguish whether the signal originated
at the agent or reaper.
## Description
This PR adds documentation for configuring clients to work with AI
Bridge via AI Bridge Proxy, specifically GitHub Copilot.
Preview:
https://coder.com/docs/@docs-aibridge-proxy-client-config/ai-coder/ai-bridge/ai-bridge-proxy/setup#client-configuration
## Changes
* Add Client Configuration section to
`docs/ai-coder/ai-bridge/ai-bridge-proxy/setup.md` covering proxy and CA
certificate configuration
* Add `docs/ai-coder/ai-bridge/clients/copilot.md` with configuration
instructions for: Copilot CLI, VS Code Copilot Extension, JetBrains IDEs
* Update `docs/ai-coder/ai-bridge/clients/index.md`:
* Add introduction explaining base URL vs proxy-based integration
* Add GitHub Copilot to compatibility table
Related to: https://github.com/coder/internal/issues/1188
Context was created before expensive setup operations (building
workspaces, starting agents), leaving insufficient time for the actual
command execution. Split into setupCtx for setup and a fresh ctx for
the command to ensure both get the full timeout.
The API endpoints existed for this already, so this PR just adds CLI
functionality which uses those API endpoints.
closes#21891
Generated with the help of Mux
macOS runners lack GNU toolchain dependencies (bash 4+, GNU getopt, make
4+) required by `scripts/lib.sh`. When any script sources `lib.sh`, it
checks for these dependencies and fails if they're missing.
This caused consistent failures in the `test-go-pg (macos-latest)` job
in `nightly-gauntlet.yaml`, which didn't have the GNU tools setup that
`ci.yaml` had. Commit 9a417df ("ci: add retry logic for Go module
operations") added a macOS GNU tools step to `ci.yaml`, but
`nightly-gauntlet.yaml` was not updated.
This PR adds a reusable `setup-gnu-tools` action and uses it
consistently across all workflows with macOS jobs, replacing the inline
brew install steps.
Closes https://github.com/coder/internal/issues/1133
The Connection Log page has a preset filter "Active SSH connections"
that was using `status:connected`, but the only valid status enum values
are `completed` and `ongoing`. This caused the preset to generate an
invalid query.
This changes the preset to use `status:ongoing type:ssh` and adds a
typed helper function so that invalid enum values will be caught at
compile time.
---
PR generated by [mux](https://mux.coder.com), but reviewed by a human.
Adds support for filtering workspaces by health status using
healthy:true or healthy:false in the search query.
This is done by changing `has-agent` to accept a list of statuses and
aliasing `health:true` to `has-agent:connected` and `healthy:false` to
`has-agent:timeout,disconnected`.
Fixes#21623
Add the ability to pause and resume tasks directly from the Tasks table,
allowing users to manage workspace resources without navigating to
individual task pages.
This pull-request implements various permission checks to the
`<OAuth2App* />` stories and components. We're trying to ensure that
we're actually allowed to `create`/`view`/`delete` on both Secrets and
Applications before showing them to the user/allowing action.
Furthermore, I've added various stories to catch when a user lacks these
permissions.
I noticed this particularly because I'm only an `Auditor` on our DEV
instance and can't see these fields.
---------
Co-authored-by: coder-tasks[bot] <254784001+coder-tasks[bot]@users.noreply.github.com>
The comments generated are too noisy and not of sufficiently high signal
that we should automatically opt every PR in.
This PR moves the trigger to the `code-review` label _only_.
Signed-off-by: Danny Kopping <danny@coder.com>
This pull-request implements a super simple change, essentially when we
fail to login we'd like to persist the `email` used when attempting to
sign-in. This just speeds up the flow rather than having to type the
email in again.
This PR increases the size of the schedule increment/decrement buttons
([-] [+]) to match the icon button style at size `sm` (same as the Stop,
Restart buttons).
## Changes
- Button dimensions: 20×20px → 32×32px
- Icon size: `size-icon-xs` → `size-icon-sm`
- Border radius: 4px → 6px (consistent with other icon buttons)
## Before
The [-] [+] buttons were tiny (20×20px) and difficult to click.
## After
The buttons now match the icon button style at size `sm` (32×32px),
consistent with other topbar buttons.
---
Created on behalf of @christin
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
> [!NOTE]
> It should be noted that these #21781#21807#21809 pull-request are
required before we can merge this. This will stop us to battling the
`z-index` that is provided by MUI.
This is avoiding the changes that would be required in #21819
This pull-request removes on our reliance to control the scroll from
within another`<div />`, this means that we can actively make use of
`<ScrollRestoration />` where the page will return the top of the page
when you navigate to a new URL.
Updates the multi-model support description in the Coder Research docs
to reference provider companies (Anthropic, xAI, OpenAI) instead of
specific model names (Claude sonnet-4/opus-4, Grok, GPT-5).
This makes the docs more stable as model names change frequently, while
provider names remain constant.
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: Matt Vollmer <matthewjvollmer@outlook.com>
- remove beta labels
- clarify how AWB is measured
- reassurance of no downtimes when limit is reached
---------
Co-authored-by: Atif Ali <atif@coder.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Matt Vollmer <matthewjvollmer@outlook.com>
Adds `add-project` to the `mux` module in the dogfood Coder template so
Mux opens the cloned repo by default.
- Uses `local.repo_dir` (defaults to `/home/coder/coder`) so it stays
correct if the repo base dir parameter changes.
Testing:
- `terraform fmt -check dogfood/coder/main.tf`
Adds a new AI Bridge client configuration page for **Mux** and lists it
in the client compatibility table.
- Add `docs/ai-coder/ai-bridge/clients/mux.md` with a short intro, UI +
env var + `~/.mux/providers.jsonc` examples
- Add Mux to the AI Bridge client compatibility table
- Add the new page to `docs/manifest.json`
Refs: https://mux.coder.com/config/providers#environment-variables
This pull-request ensures that we're using `<DropdownMenu />` in the
`Admin Settings` button as things weren't uniform before. This is inline
with the Figma design with the darker ("black") background. This has an
added side-benefit of removing some MUI-specific code.
<img
src="https://github.com/user-attachments/assets/4eb9136b-91b3-44ac-81a0-5abd1cf2cdf2"
/>
Update the agent protobuf schema (agent/proto/agent.proto) to include:
- subagent_id field in WorkspaceAgentDevcontainer message
- id field in CreateSubAgentRequest message
Bump the Agent API version from v2.7 to v2.8 and update all client
references throughout the codebase (ConnectRPC27 -> ConnectRPC28,
DRPCAgentClient27 -> DRPCAgentClient28).
## Description
Add documentation for AI Bridge Proxy.
## Changes
This PR adds documentation for AI Bridge Proxy under
`docs/ai-coder/ai-bridge/ai-bridge-proxy/`:
* `index.md`: Overview of AI Bridge Proxy, how it works (MITM vs tunnel
modes), and when to use it
* `setup.md`: Setup guide covering:
* Proxy configuration and required settings
* Security considerations and deployment options
* CA certificate generation (self-signed and organization-signed)
* Upstream proxy chaining configuration
Note: TODO comments in the documentation will be addressed in follow-up
PRs.
Related to: https://github.com/coder/internal/issues/1188
This pull-request refactors filter-related dropdown and input components
from MUI to our Tailwind-based design system. This is more inline with
the Figma design, controversially we are changing the button group for
canned filters and input to two seperate components.
- **InputGroup**: Complete rewrite to a compound component pattern
(`InputGroup`, `InputGroupAddon`, `InputGroupInput`, `InputGroupButton`)
using Tailwind and CVA, replacing the old CSS-in-JS approach
- **SearchField**: Migrated from MUI TextField to use the new InputGroup
components, with a simplified API and proper ref forwarding
- **Filter/PresetMenu**: Replaced MUI Menu with our DropdownMenu
component, and updated icon to `SlidersHorizontal`
### Changes
| Component | Before | After |
|-----------|--------|-------|
| InputGroup | CSS-in-JS with MUI margin hacks | Compound component with
Tailwind group states |
| SearchField | MUI TextField + InputAdornment | InputGroup +
InputGroupAddon composition |
| PresetMenu | MUI Menu/MenuItem | DropdownMenu/DropdownMenuItem |
| MenuSearch | Complex CSS overrides | Single Tailwind class |
<img
src="https://github.com/user-attachments/assets/5b819027-2dca-4dcc-b6d6-7096fa3775c0"
/>
On Windows, `pty.New()` was creating a `ConPTY` (`PseudoConsole`) even
when no process would be attached. `ConPTY` requires a real process to
function correctly - without one, the pipe handles become invalid
intermittently, causing flaky test failures like `read |0: The handle is
invalid.`
This affected tests using the `ptytest.New()` + `Attach()` pattern for
in-process CLI testing.
The fix splits Windows PTY creation into two paths:
- `newPty()` now returns a simple pipe-based PTY for the `Attach()` use
case
- `newConPty()` creates a real `ConPTY`, called by `Start()` when a
process will be attached
AFAICT this will result in no change in behaviour outside of tests.
Fixescoder/internal#1277
_Disclaimer: investigated and implemented by Claude Opus 4.5, reviewed
by me._
---------
Signed-off-by: Danny Kopping <danny@coder.com>
* Adds support for parameter `format=text` in the following API routes:
* `/api/v2/workspaceagents/:id/logs`
* `/api/v2/workspacebuilds/:id/logs`
* `/api/v2/templateversions/:id/logs`
* `/api/v2/templateversions/:id/dry-run/:id/logs`
* Adds links to view raw logs on the following pages:
* Workspace build page
* Template editor page
* Template version page
* Refactors existing log formatting in `cli/logs.go` to live in `codersdk`.
🤖 Generated with Claude Opus 4.5, reviewed by me.
---------
Co-authored-by: Claude <noreply@anthropic.com>
The AcquireProvisionerJob query only checked started_at IS NULL, allowing
it to acquire jobs that were canceled while pending (which have
completed_at set but started_at still NULL).
Added completed_at IS NULL check to the query to prevent this.
Also fixed JobCompleteBuilder.Do() in dbfake to set started_at when
completing jobs to match production behavior.
Fixescoder/internal#1323
## Summary
Previously, `CODER_PPROF_ADDRESS` and `CODER_PROMETHEUS_ADDRESS` were
hardcoded in the Helm chart template to `0.0.0.0:6060` and
`0.0.0.0:2112` respectively. These values could not be overridden via
`coder.env` values because the hardcoded values were set first in the
template, and Kubernetes uses the first occurrence of duplicate env
vars.
This was a security concern because binding to `0.0.0.0` exposes these
endpoints to any pod in the cluster:
- **pprof** can expose sensitive runtime information (goroutine stacks,
heap profiles, CPU profiles that may contain memory contents)
- **Prometheus metrics** may contain sensitive operational data
## Changes
1. **`helm/coder/templates/_coder.tpl`**: Added logic to check if the
user has set `CODER_PPROF_ADDRESS` or `CODER_PROMETHEUS_ADDRESS` in
`coder.env` before applying the default values. If the user provides a
value, the hardcoded default is skipped.
2. **`helm/coder/values.yaml`**: Updated documentation to:
- Remove these vars from the "cannot be overridden" list
- Add them to a new "can be overridden" section with security
recommendations
3. **Tests**: Added test cases for both override scenarios with
corresponding golden files.
## Usage
Users can now restrict pprof and prometheus to localhost only:
```yaml
coder:
env:
- name: CODER_PPROF_ADDRESS
value: "127.0.0.1:6060"
- name: CODER_PROMETHEUS_ADDRESS
value: "127.0.0.1:2112"
```
## Local Testing
To verify the fix locally:
```bash
# Update helm dependencies
cd helm/coder && helm dependency update
# Test default behavior (should show 0.0.0.0)
helm template coder . -f tests/testdata/default_values.yaml --namespace default | grep -A1 'CODER_PPROF_ADDRESS\|CODER_PROMETHEUS_ADDRESS'
# Test pprof override (should show 127.0.0.1:6060)
helm template coder . -f tests/testdata/pprof_address_override.yaml --namespace default | grep -A1 'CODER_PPROF_ADDRESS'
# Test prometheus override (should show 127.0.0.1:2112)
helm template coder . -f tests/testdata/prometheus_address_override.yaml --namespace default | grep -A1 'CODER_PROMETHEUS_ADDRESS'
# Run Go tests
cd tests && go test . -v
```
Fixes#21713
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: uzair-coder07 <uzair@coder.com>
## Summary
Updates the description for the "Use" role in the workspace sharing
dropdown to explicitly mention that users with this permission can start
and stop the workspace, not just read and access it.
## Changes
- Updated the "Use" role description from "Can read and access this
workspace." to "Can read, access, start, and stop this workspace."
## Context
This clarification helps users understand the full scope of the "Use"
permission, which includes `ActionWorkspaceStart` and
`ActionWorkspaceStop` as defined in `coderd/database/db2sdk/db2sdk.go`.
---
*Created on behalf of @geokat*
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Fixes the state format for Workspace Sharing in `docs/manifest.json`.
Changes `"early_access"` to `"early access"` (with space, no underscore)
to match the format used by other early access entries and to fix builds
on coder/coder.com.
Follow-up to #21797.
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
This pull request adds a new documentation file that defines the
"code-review" skill for use in the project. The document outlines a
standard workflow, severity levels, key areas to focus on during code
reviews, and Coder-specific review guidelines. This aims to standardize
and improve the quality and consistency of code reviews across the team.
Documentation and process standardization:
* Added `.claude/skills/code-review/SKILL.md`, which describes the
code-review skill, including workflow steps, severity levels, what to
look for in reviews, and what not to comment on. It also provides
Coder-specific patterns and best practices for authorization, error
handling, and shell scripting.
This PR changes the shared workspaces documentation page from Beta to
Early Access status.
Changes `docs/manifest.json` to update the state from `["beta"]` to
`["early_access"]` for the Workspace Sharing page.
Ref: https://coder.com/docs/user-guides/shared-workspaces
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
When a workspace has multiple agents (e.g., main + devcontainer), the
build timeline was showing all events duplicated under each agent
instead of filtering by the agent they belong to.
Added agentId to the Stage type and filter timings by workspace_agent_id
so each agent section only shows its own events.
Fixes#18002
These tests use dbfake to set up database state directly and don't
need a provisioner daemon. Removing it fixes a flaky failure on
Windows where the provisioner daemon acquired a job that dbfake had
already "completed", causing the task status to be "error" instead
of "paused".
Fixescoder/internal#1322
Refs coder/internal#1323
Previously there were two issues that could cause incorrect boundary
usage telemetry data.
1. Bad handling across snapshot intervals: After telemetry snapshot deleted
the DB row, the next flush would INSERT the stale cumulative data (which
included already-reported usage). This would then be overwritten by
subsequent UPDATE flushes, causing the delta between the last snapshot
and the reset to be lost (under-reporting usage). Additionally, if there
was no new usage after the reset, the tracker would carry over all usage
from the previous period into the next period (over-reporting usage).
2. Missed usage from a race condition: Track() calls between the first
mutex unlock and second mutex lock in FlushToDB() were lost. The data
wasn't included in the current flush (already snapshotted) and was wiped
by the subsequent reset. This is likely low impact to overall usage
numbers in the real world.
Fix by tracking unique workspace/user deltas separately from cumulative
values and always tracking delta allowed/denied requests. Deltas are used
for INSERT (fresh start after reset), cumulative for UPDATE (accurate unique
counts within a period). All counters reset atomically before the DB operation
so Track() calls during the operation are preserved for the next flush.
Archiving modules attempts to save as many modules as it can before it hits the limit. Enabling the template as much as it can, rather than a hard failure.
## Description
Adds authentication support for upstream proxies in `aibridgeproxyd`.
When credentials are provided in the upstream proxy URL, the
`Proxy-Authorization` header is now included in `CONNECT` requests.
## Changes
* Extract credentials from upstream proxy URL and set
`Proxy-Authorization` header on tunneled `CONNECT` requests
* Support optional user and password
* Fail at startup if both username and password are empty
* Add tests for all auth scenarios
Follow-up: https://github.com/coder/internal/issues/1204
Apply optimizations:
* https://github.com/openai/openai-go/pull/602
* https://github.com/coder/aibridge/pull/160
These reduce CPU time and allocation count for OpenAI `chat/completions`
and `responses` APIs, making the use of OpenAI chat models through AI
Bridge more performant.
In order to test these changes, we add scaletesting support for the
responses API.
## Summary
This PR restructures the Agent Boundaries documentation to improve URL
clarity and consistency:
### Changes
- Renames `/docs/ai-coder/boundary/` to
`/docs/ai-coder/agent-boundaries/`
- Renames `agent-boundary.md` to `index.md` for cleaner URLs
- Updates all internal doc references to the new paths
- Updates `manifest.json` with new paths
- Updates prose references from "Boundary" to "Agent Boundaries"
throughout the documentation (33 changes across 4 files)
### New URL structure
| Old URL | New URL |
|---------|----------|
| `/docs/ai-coder/boundary/agent-boundary` |
`/docs/ai-coder/agent-boundaries` |
| `/docs/ai-coder/boundary/nsjail` |
`/docs/ai-coder/agent-boundaries/nsjail` |
| `/docs/ai-coder/boundary/landjail` |
`/docs/ai-coder/agent-boundaries/landjail` |
| `/docs/ai-coder/boundary/rules-engine` |
`/docs/ai-coder/agent-boundaries/rules-engine` |
| `/docs/ai-coder/boundary/version` |
`/docs/ai-coder/agent-boundaries/version` |
### Follow-up required
Redirects need to be added to `coder/coder.com` for the old URLs:
- `/docs/ai-coder/agent-boundary` → `/docs/ai-coder/agent-boundaries`
(this one is currently 404'ing from Google search results)
- `/docs/ai-coder/boundary/:path*` →
`/docs/ai-coder/agent-boundaries/:path*`
---
Created on behalf of @mattvollmer
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: Matt Vollmer <matthewjvollmer@outlook.com>
Liveness checks are currently causing pods to be killed during
long-running migrations.
They are generally not advisable for our workloads; if a pod becomes
unresponsive we _need_ to know about it (due to a deadlock, etc) and not
paper over the issue by killing the pod.
I've also made all probe settings configurable.
---------
Signed-off-by: Danny Kopping <danny@coder.com>
## Summary
The `lint/actions/zizmor` target flakes in CI due to network
connectivity issues when running on depot runners
(https://github.com/coder/internal/issues/1233). The zizmor tool needs
to reach GitHub's API but intermittently fails with "Connection refused"
errors.
## Changes
- Creates a new `lint-actions` CI job that only runs when `.github/**`
files are touched (using existing `ci` filter)
- Removes zizmor from the main `lint` job
- Uses a Makefile conditional to include actionlint in `make lint`
locally but skip it in CI (where `lint-actions` handles it)
This reduces unnecessary flake exposure for PRs that don't modify GitHub
Actions files.
## Testing
- `actionlint` passes on the modified ci.yaml
- Verified Makefile conditional works: actionlint included locally,
skipped when `CI=true`
Fixes https://github.com/coder/internal/issues/1233
Closes#21044
This pull-request addresses an issue we were seeing where we would
attempt to filter the `<UserCombobox />` by the users username or email
not their username (which the rendered options would show).
To highlight this I created three different users. Each with a username
that did not contain their `email` or `name` and attempted to filter.
Attempting to search for `John` wouldn't actually show the user as his
username was `x`, and infact whereas a subset of users might be returned
from the backend for having `john` in the `email` it would've been
filtered by the frontend for not being in the `name` field.
| Name | Username |
| --- | --- |
| `Jake` | `z` |
| `Jeff` | `y` |
| `John` | `x` |
| Previously | Now |
| --- | --- |
| <img width="560" height="547" alt="OLD_USER_COMBOBOX"
src="https://github.com/user-attachments/assets/a0567264-0034-42ac-aba0-95b05c4f92dd"
/> | <img width="580" height="548" alt="NEW_USER_COMBOBOX"
src="https://github.com/user-attachments/assets/1aa0c942-d340-4b1c-8dde-b97879525bfb"
/> |
## Description
When configuring a From address with a display name (e.g., `Coder System
<system@coder.com>`), the SMTP `MAIL FROM` command was incorrectly
receiving the full address string instead of just the bare email
address, causing `501 Invalid MAIL argument` errors on some SMTP
servers.
## Changes
- Updated `validateFromAddr` to return both:
- `envelopeFrom`: bare email for SMTP `MAIL FROM` command (RFC 5321)
- `headerFrom`: original address with display name for email header (RFC
5322)
Fixes#20727
## Description
Mark `--ssh-hostname-prefix` flag and `CODER_SSH_HOSTNAME_PREFIX` env
variable as deprecated, recommending users to use
`--workspace-hostname-suffix` / `CODER_WORKSPACE_HOSTNAME_SUFFIX`
instead for consistency with Coder Desktop.
The deprecated option is now hidden from help output and docs but
remains functional for backward compatibility. When used, it will show a
deprecation warning pointing to the recommended alternative.
## Changes
- Added `UseInstead` pointing to `workspace-hostname-suffix` option
(triggers deprecation warning)
- Set `Hidden: true` to hide from CLI help and documentation
- Updated description to mention deprecation
- Regenerated docs and help files via `make gen`
Closes#18156
---
_Originally requested by @matifali in
https://github.com/coder/coder/pull/18085#discussion_r2115594447_
This pull-request addresses the size of the iconography within the
`<SingleSignOnSection />` section component. As a side-effect of the
changes in #21347 we are now rendering this too large.
Furthermore, to catch these issues in future we've introduced two new
stories within `SecurityPageView.stories.tsx` which render both `oidc`
and `github` login routes.
| Old | New |
| --- | --- |
| <img width="520" height="399" alt="OLD_SSO_PROVIDER"
src="https://github.com/user-attachments/assets/f6687b9a-d6bc-4bca-859a-0b59a3f6ba03"
/> | <img width="520" height="398" alt="NEW_SSO_PROVIDER"
src="https://github.com/user-attachments/assets/5beb8149-3e07-4dbc-9e0f-06f9207ecc59"
/> |
## Summary
The bottom admin bar (DeploymentBannerView) was showing a thick
scrollbar when content overflowed horizontally. This change applies the
native thin scrollbar style instead.
## Changes
- Added `[scrollbar-width:thin]` Tailwind CSS arbitrary value to the
deployment banner container
This uses the native CSS `scrollbar-width: thin` property which is
supported in modern browsers (Firefox, Chrome, Edge, Safari) and
provides a less obtrusive scrollbar when horizontal scrolling is needed.
## Testing
- The change is purely CSS and was verified with lint and format checks
passing
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Low Risk**
> Purely a CSS styling tweak with no behavioral, data, or security
impact; risk is limited to minor cross-browser appearance differences.
>
> **Overview**
> Updates the dashboard `DeploymentBannerView` bottom admin bar styling
to use the native CSS `scrollbar-width: thin` via Tailwind
(`[scrollbar-width:thin]`), reducing scrollbar thickness when the banner
overflows horizontally.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
ba36e48d66. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
Co-authored-by: Cursor Agent <cursor@coder.com>
This pull-request resolves a really annoying issue with the `<TasksPage
/>` switcher control. Essentially every time I navigated to this page my
eyes were drawn to this button that felt out of place. I finally figured
out why and its that its breaking the first rules of nested rounded
corners.
We should be using the following math to calculate the roundedness.
```
outerRadius - gap = innerRadius
```
<img width="852" height="596" alt="button-rounding"
src="https://github.com/user-attachments/assets/89de5d98-0891-4c9d-a5aa-66f722796630"
/>
## Summary
Adds support for pre-filling the OAuth2 application creation form via
URL query parameters.
## Query Parameters
| Parameter | Description |
|-----------|-------------|
| `name` | Pre-fills the "Application name" field |
| `callback_url` | Pre-fills the "Callback URL" field |
| `icon` | Pre-fills the "Application icon" field |
## Example
```
/deployment/oauth2-provider/apps/add?name=MyApp&callback_url=https://example.com/callback&icon=/icon/github.svg
```
This allows external tools or documentation to link directly to the
OAuth2 app creation page with pre-populated values.
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Update provisionerdserver to handle the changes introduced to
provisionerd in https://github.com/coder/coder/pull/21602
We now create a relationship between `workspace_agent_devcontainers` and
`workspace_agents` with the newly created `subagent_id`.
## Summary
Clarifies the [AI Bridge client config authentication
section](https://coder.com/docs/ai-coder/ai-bridge/client-config#authentication)
to explicitly state that only **Coder-issued tokens** are accepted.
## Changes
- Changed "API key" to "Coder API key" throughout the Authentication
section
- Added a note clarifying that provider-specific API keys (OpenAI,
Anthropic, etc.) will not work with AI Bridge
Fixes#21790
---
Created on behalf of @dannykopping
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Previously the task logs endpoint only worked when the workspace was
running, leaving users unable to view task history after pausing.
This change adds snapshot retrieval with state-based branching: active
tasks fetch live logs from AgentAPI, paused/initializing/pending tasks
return stored snapshots (providing continuity during pause/resume), and
error/unknown states return HTTP 409 Conflict.
The response includes snapshot metadata (snapshot, snapshot_at) to
indicate whether logs are live or historical.
Closescoder/internal#1254
Operators need to know which API key was used in HTTP requests.
For example, if a key is leaking and a DDOS is underway using that key, operators need a way to identify the key in use and take steps to expire the key (see https://github.com/coder/coder/issues/21782).
_Disclaimer: created using Claude Opus 4.5_
During development of #21659 I approved some `<Paywall />` code that had
an extensive props system, however, I wasn't a huge fan of this. This
approach attempts to take it further like something `shadcn` would,
where-in we define the `<Paywall />` (and its subset of components) and
we wrap around those when needed for `<PaywallAIGovernance />` and
`<PaywallPremium />`.
Theoretically there is no real CSS/Design changes here. However
screenshot for prosperity.
| Previously | Now |
| --- | --- |
| <img width="2306" height="614" alt="CleanShot 2026-01-29 at 10 56
05@2x"
src="https://github.com/user-attachments/assets/83a4aa1b-da74-459d-ae11-fae06c1a8371"
/> | <img width="2308" height="622" alt="CleanShot 2026-01-29 at 10 55
05@2x"
src="https://github.com/user-attachments/assets/4aa43b09-6705-4af3-86cc-edc0c08e53b1"
/> |
---------
Co-authored-by: Ben Potter <me@bpmct.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## Description
Removes the following deprecated Prometheus metrics:
- `coderd_api_workspace_latest_build_total` → use
`coderd_api_workspace_latest_build` instead
- `coderd_oauth2_external_requests_rate_limit_total` → use
`coderd_oauth2_external_requests_rate_limit` instead
These metrics were deprecated in #12976 because gauge metrics should
avoid the `_total` suffix per [Prometheus naming
conventions](https://prometheus.io/docs/practices/naming/).
## Changes
- Removed deprecated metric `coderd_api_workspace_latest_build_total`
from `coderd/prometheusmetrics/prometheusmetrics.go`
- Removed deprecated metric
`coderd_oauth2_external_requests_rate_limit_total` from
`coderd/promoauth/oauth2.go`
- Updated tests to use the non-deprecated metric name
Fixes#12999
The test was creating two template versions without explicit names,
relying on `namesgenerator.NameDigitWith()` which can produce
collisions. When both versions got the same random name, the test failed
with a 409 Conflict error.
Fix by giving each version an explicit name (`v1`, `v2`).
Closes https://github.com/coder/internal/issues/1309
---
*Generated by [mux](https://mux.coder.com)*
Add PeriodStart and PeriodDurationMilliseconds fields to BoundaryUsageSummary
so consumers of telemetry data can understand usage within a particular time window.
## Summary
This PR updates the note on the Tasks documentation page to more clearly
explain the relationship between Premium task limits and the AI
Governance Add-On.
## Problem
The previous wording:
> "Premium Coder deployments are limited to running 1,000 tasks. Contact
us for pricing options or learn more about our AI Governance Add-On to
evaluate all of Coder's AI features."
The "or" in this sentence could be interpreted as two separate paths:
(1) contact sales for custom pricing that might not require the add-on,
OR (2) get AI Governance. This led to confusion about whether higher
task limits could be obtained without the AI Governance Add-On.
## Solution
Updated the note to be explicit about the scaling path:
> "Premium deployments include 1,000 Agent Workspace Builds for
proof-of-concept use. To scale beyond this limit, the AI Governance
Add-On provides expanded usage pools that grow with your user count.
Contact us to discuss pricing."
This makes it clear that:
1. Premium includes 1,000 builds for POC use
2. Scaling beyond that requires the AI Governance Add-On
3. Contact sales to discuss pricing for the add-on
Created on behalf of @mattvollmer
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: Matt Vollmer <matthewjvollmer@outlook.com>
Justification:
- Populating `members` is authorized with `group_member.read` which is
not required to be able to share a workspace
- Populating `total_member_count` is authorized with `group.read` which
is required to be able to share
- The updated helper is only used in template/workspace sharing UIs, so
other pages that might need counts of readable members are unaffected
Related to: https://github.com/coder/internal/issues/1302
## Description
Adds Prometheus metrics to the AI Bridge Proxy for observability into
proxy traffic and performance.
## Changes
* Add Metrics struct with the following metrics:
* `connect_sessions_total`: counts CONNECT sessions by type
(mitm/tunneled)
* `mitm_requests_total`: counts MITM requests by provider
* `inflight_mitm_requests`: gauge tracking in-flight requests by
provider
* `mitm_request_duration_seconds`: histogram of request latencies by
provider
* `mitm_responses_total`: counts responses by status code class
(2XX/3XX/4XX/5XX) and provider
* Register metrics with `coder_aibridgeproxyd_` prefix in CLI
* Unregister metrics on server close to prevent registry leaks
* Add `tunneledMiddleware` to track non-allowlisted CONNECT sessions
* Add tests for metric recording in both MITM and tunneled paths
Closes: https://github.com/coder/internal/issues/1185
Adds a new subcommand to print the current session token for use in
scripts and automation, similar to `gh auth token`.
## Usage
```bash
CODER_SESSION_TOKEN=$(coder login token)
```
Fixes#21515
## Description
Add exponential backoff retries to all `go install` and `go mod
download` commands across CI workflows and actions.
## Why
Fixes
[coder/internal#1276](https://github.com/coder/internal/issues/1276) -
CI fails when `sum.golang.org` returns 500 errors during Go module
verification. This is an infrastructure-level flake that can't be
controlled.
## Changes
- Created `.github/scripts/retry.sh` - reusable retry helper with
exponential backoff (2s, 4s, 8s delays, max 3 attempts), using
`scripts/lib.sh` helpers
- Wrapped all `go install` and `go mod download` commands with retry in:
- `.github/actions/setup-go/action.yaml`
- `.github/actions/setup-sqlc/action.yaml`
- `.github/actions/setup-go-tools/action.yaml`
- `.github/workflows/ci.yaml`
- `.github/workflows/release.yaml`
- `.github/workflows/security.yaml`
- Added GNU tools setup (bash 4+, GNU getopt, make 4+) for macOS in
`test-go-pg` job, since `retry.sh` uses `lib.sh` which requires these
tools
## Summary
Fixes the broken Kilo Code documentation link in the AI Bridge
client-config page.
## Changes
- Updated the Kilo Code link from the old
`/docs/features/api-configuration-profiles` (returns 404) to the current
`/docs/ai-providers/openai-compatible` page
The Kilo Code documentation was restructured and the old URL no longer
exists.
Fixes#21750
Fixes: https://github.com/coder/internal/issues/560
"Select" CLI UI component should ignore "space" when `+Add custom value`
is highlighted. Otherwise it interprets that as a potential option...
and panics.
Fixes: coder/internal#767
Adds two new Prometheus metrics for license health monitoring:
- `coderd_license_warnings` - count of active license warnings
- `coderd_license_errors` - count of active license errors
Metrics endpoint after startup of a deployment with license enabled:
```
...
# HELP coderd_license_errors The number of active license errors.
# TYPE coderd_license_errors gauge
coderd_license_errors 0
...
# HELP coderd_license_warnings The number of active license warnings.
# TYPE coderd_license_warnings gauge
coderd_license_warnings 0
...
```
fixes: https://github.com/coder/internal/issues/1304
Subscribe to heartbeats synchronously on startup of PGCoordinator. This ensures tests that send heartbeats don't race with this subscription.
Closes [#1246](https://github.com/coder/internal/issues/1246)
This PR adds a new component to display AI Governance user entitlements
in the Licenses Settings page. The implementation includes:
- New `AIGovernanceUsersConsumptionChart` component that shows the
number of entitled users for AI Governance features
- Storybook stories for various states (default, disabled, error states)
- Integration with the existing license settings page
- Collapsible "Learn more" section with links to relevant documentation
- Updated the ManagedAgentsConsumption component with clearer
terminology ("Agent Workspace Builds" instead of "Managed AI Agents")
The chart displays the number of users entitled to use AI features like
AI Bridge, Boundary, and Tasks, with a note that additional analytics
are coming soon.
### Preview
<img width="3516" height="2390" alt="CleanShot 2026-01-27 at 22 44
25@2x"
src="https://github.com/user-attachments/assets/cb97a215-f054-45cb-a3e7-3055c249ef04"
/>
<img width="3516" height="2390" alt="CleanShot 2026-01-27 at 22 45
04@2x"
src="https://github.com/user-attachments/assets/d2534189-cffb-4ad2-b2e2-67eb045572e8"
/>
---------
Co-authored-by: Jaayden Halko <jaayden.halko@gmail.com>
This pull request makes a minor update to the documentation check
workflow. It clarifies that a comment should not be posted if there are
no documentation changes needed and simplifies the comment format
instructions.
The reaper (PID 1) now returns the child's exit code instead of always
exiting 0. Signal termination uses the standard Unix convention of 128 +
signal number.
fixes#21661
My previous change to this test did not create another **workspace**
using the template containing `coder_ai_task` resources, meaning that
this test was not actually testing the right thing. This PR addresses
this oversight.
The test occasionally times out at 15s on Windows CI runners.
Investigation of CI logs shows the HTTP request to the agent's
gitsshkey endpoint never appears in server logs, suggesting it
hangs before the request completes (possibly in connection setup,
middleware, or database queries). Increase to 60s to reduce flake
rate.
Fixescoder/internal#770
## Description
Moves the provider lookup from `handleRequest` to `authMiddleware` so
that the provider is determined during the `CONNECT` handshake and
stored in the request context. This enables provider information to be
available earlier in the request lifecycle.
## Changes
* Move `aibridgeProviderFromHost` call from `handleRequest` to
`authMiddleware`
* Store `Provider` in `requestContext` during `CONNECT` handshake
* Add provider validation in `authMiddleware` (reject if no provider
mapping)
* Keep defensive provider check in `handleRequest` for safety
Follow-up from: https://github.com/coder/coder/pull/21617
Closes#20598
This pull-request implements a very basic change to also render the
`icon` of the `Preset` when we've specifically defined one within the
template. Furthermore, theres a `ⓘ` icon with a description.
### Preview
<img width="984" height="442" alt="CleanShot 2026-01-27 at 20 15 29@2x"
src="https://github.com/user-attachments/assets/d4ceebf9-a5fe-4df4-a8b2-a8355d6bb25e"
/>
This migration converts all tailnet coordination tables to UNLOGGED:
- `tailnet_coordinators`
- `tailnet_peers`
- `tailnet_tunnels`
UNLOGGED tables skip Write-Ahead Log (WAL) writes, significantly
improving performance for high-frequency updates like coordinator
heartbeats and peer state changes.
The trade-off is that UNLOGGED tables are truncated on crash recovery
and are not replicated to standby servers. This is acceptable for these
tables because the data is ephemeral:
1. Coordinators re-register on startup
2. Peers re-establish connections on reconnect
3. Tunnels are re-created based on current peer state
**Migration notes:**
- Child tables must be converted before the parent table because LOGGED
child tables cannot reference UNLOGGED parent tables (but the reverse is
allowed)
- The down migration reverses the order: parent first, then children
Fixes https://github.com/coder/coder/issues/21333
Implements telemetry for boundary usage tracking across all Coder
replicas and reports them via telemetry.
Changes:
- Implement Tracker with Track(), FlushToDB(), and StartFlushLoop() methods
- Add telemetry integration via collectBoundaryUsageSummary()
- Use telemetry lock to ensure only one replica collects per period
The tracker accumulates unique workspaces, unique users, and request
counts (allowed/denied) in memory, then flushes to the database
periodically. During telemetry collection, stats are aggregated across
all replicas and reset for the next period.
Closes#21593
Various `<PopoverContent>`'s among the application were found that when
the screen-size was too small we weren't able to actually see the full
content unless we resized the window. This pull-request ensures that the
content is never going to extend past that of the
`--radix-popper-available-height` without having an appropriate
scrollbar.
| Before | After |
| --- | --- |
| <img width="948" height="960" alt="CleanShot 2026-01-21 at 20 56
48@2x"
src="https://github.com/user-attachments/assets/5d15fbf9-1c62-427b-bbed-81239922a6bc"
/> | <img width="896" height="906" alt="CleanShot 2026-01-21 at 21 19
03@2x"
src="https://github.com/user-attachments/assets/cfa5baa5-2ec1-438c-9454-bf3073dc6534"
/> |
Only task workspaces have the checks in wsbuilder for violating the
managed agent caps in the license.
Stopped tasks that are resumed with a regular workspace start **still
count as usage**.
Clarified the definition of Agent Workspace Builds and updated the
previous term used.
<!--
If you have used AI to produce some or all of this PR, please ensure you
have read our [AI Contribution
guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING)
before submitting.
-->
feat: add boundary usage telemetry database schema and RBAC
Adds the foundation for tracking boundary usage telemetry across Coder
replicas. This includes:
- Database schema: `boundary_usage_stats` table with per-replica stats
(unique workspaces, unique users, allowed/denied request counts)
- Database queries: upsert stats, get aggregated summary, reset stats,
delete by replica ID
- RBAC: `boundary_usage` resource type with read/update/delete actions,
accessible only via system `BoundaryUsageTracker` subject (not regular
user roles)
- Tracker skeleton + docs: stub implementation in `coderd/boundaryusage/`
The tracker accumulates stats in memory and periodically flushes to the
database. Stats are aggregated across replicas for telemetry reporting,
then reset when a new reporting period begins. The tracker implementation
and plumbing will be done in a subsequent commit/PR.
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Changes:
- bump claude-code module version
- add docs for version compatibility (for customers using older version
of coder or claude-code module; before GA)
This undeprecates the `allow-workspace-renames` flag. IIUC, the 'danger'
with using this flag is that the workspace name might have been used in
the definition of some other terraform resources within template code,
so a rename could cause problems such as with persistent disks.
for https://github.com/coder/coder/issues/21628
---------
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Use transition-specific actions when authorizing workspace build
parameter inserts in the database layer so start/stop/delete do not
require workspace.update.
Related to: https://github.com/coder/internal/issues/1299
This pull request updates the `.github/workflows/doc-check.yaml`
workflow to improve how documentation reviews are triggered and handled,
particularly for pull requests that are converted from draft to ready
for review. The changes ensure that documentation checks are performed
at the appropriate times and clarify the workflow's behavior.
**Workflow trigger and logic enhancements:**
* Added support for triggering the documentation check when a pull
request is marked as "ready for review" (converted from draft), both in
the workflow triggers and in the workflow logic.
[[1]](diffhunk://#diff-46e6065a312f35e5d294476e7865089afd10e6072fed80ac77b257e090def149R9)
[[2]](diffhunk://#diff-46e6065a312f35e5d294476e7865089afd10e6072fed80ac77b257e090def149R24)
[[3]](diffhunk://#diff-46e6065a312f35e5d294476e7865089afd10e6072fed80ac77b257e090def149L39-R48)
* Updated the internal context and trigger type handling to recognize
and describe the "ready_for_review" event, providing more accurate
context for the agent.
[[1]](diffhunk://#diff-46e6065a312f35e5d294476e7865089afd10e6072fed80ac77b257e090def149R138-R140)
[[2]](diffhunk://#diff-46e6065a312f35e5d294476e7865089afd10e6072fed80ac77b257e090def149R171-R173)
**Workflow behavior adjustment:**
* Changed the `comment-on-issue` setting to `false`, so the workflow
will no longer automatically comment on the PR issue when running which
was creating unnecessary noise.
Relates to https://github.com/coder/internal/issues/1282
Updates tracking of managed agents to be predicated instead on the
presence of a related `task_id` instead of the presence of a
`coder_ai_task` resource.
## Description
When `CONNECT` requests are missing or have invalid
`Proxy-Authorization` credentials, the proxy now returns a proper `407
Proxy Authentication Required` response with a `Proxy-Authenticate`
challenge header instead of rejecting the connection without an HTTP
response.
Some clients (e.g. Copilot in VS Code) do not send the
`Proxy-Authorization` header on the initial request and rely on
receiving a `407 challenge` to prompt for credentials. Without this fix,
those clients would fail to connect.
## Changes
* Added `newProxyAuthRequiredResponse` helper function to create
consistent `407` responses with the appropriate `Proxy-Authenticate`
header.
* Updated `authMiddleware` to return a `407` challenge instead of
rejecting unauthenticated `CONNECT` requests without an HTTP response
* Refactored `handleRequest` to use the same helper for consistency
* Updated `TestProxy_Authentication` to verify the `407` response
status, `Proxy-Authenticate` header, and response body
Related to: https://github.com/coder/internal/issues/1235
Closes [#1227](https://github.com/coder/internal/issues/1227)
Added support for license addons, starting with AI Governance, to enable
dynamic feature grouping without requiring license reissuance.
### What changed?
- Introduced a new `Addon` type to represent groupings of features that
can be added to licenses
- Created the first addon `AddonAIGovernance` which includes AI Bridge
and Boundary features
- Added validation for addon dependencies to ensure required features
are present
- Added new features: `FeatureBoundary` and
`FeatureAIGovernanceUserLimit`
- Updated license entitlement logic to handle addons and their features
- Added helper methods to check if features belong to addons
- Updated tests to verify addon functionality
### Why make this change?
This change introduces a more flexible licensing model that allows
features to be grouped into addons that can be added to licenses without
requiring reissuance when new features are added to an addon. This is
particularly useful for specialized feature sets like AI Governance,
where related features can be bundled together and sold as a separate
SKU. The addon approach allows for better organization of features and
more granular control over entitlements.
## Summary
Adds documentation for the "disable automatic updates" feature in Coder
Desktop.
This adds a new "Administrator Configuration" section to the Coder
Desktop docs that documents:
- **macOS**: Setting the `disableUpdater` UserDefaults key via MDM or
`defaults` command
- **Windows**: Setting the `Updater:Enable` registry value in
`HKLM\SOFTWARE\Coder Desktop\App`
The feature already exists in both platforms but was not documented in
the user-facing docs.
## Changes
- Added new "Administrator Configuration" section before
"Troubleshooting"
- Documented macOS MDM configuration for disabling updates
- Documented Windows registry configuration for disabling updates
- Mentioned the `ForcedChannel` option for locking update channels
---
Created on behalf of @ethanndickson
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
## Description
Improves logging in `aibridgeproxyd` to provide better observability for
proxy requests. Adds structured logging with request correlation IDs and
propagates request context through the proxy chain.
## Changes
* Add `requestContext` struct to propagate metadata (token, provider,
session ID) through the proxy request/response chain
* ~Add `handleTunnelRequest` to log passthrough requests for
non-allowlisted domains at debug level~ (removed due to verbosity)
* Add `handleResponse` to log responses from `aibridged`
* Log MITM requests routed to `aibridged` at info level, tunneled
requests at debug level
Related to: https://github.com/coder/internal/issues/1185
Closes https://github.com/coder/internal/issues/1167
Previously we were checking that start != end time; this was flaking on
Windows.
On Windows, `time.Now()` has limited resolution (~1ms with Go runtime's
`timeBeginPeriod`, or ~15.6ms in default system resolution). When two
`time.Now()` calls execute within the same clock tick, they return
identical timestamps, causing `StartedAt.Before(EndedAt)` to return
`false`.
**References:**
- [Go issue #8687](https://github.com/golang/go/issues/8687) - Windows
system clock resolution issue
- [Go issue #67066](https://github.com/golang/go/issues/67066) -
time.Now precision on Windows (still open)
Instead, we're changing the assertion to (the more semantically correct)
"end not before start".
A possible future enhancement could be to plumb coder/quartz through the
recording mechanism, but it's unnecessary for now.
Signed-off-by: Danny Kopping <danny@coder.com>
This change adds a POST /workspaceagents/me/tasks/{task}/log-snapshot
endpoint for agents to upload task conversation history during
workspace shutdown. This allows users to view task logs even when the
workspace is stopped.
The endpoint accepts agentapi format payloads (typically last 10
messages, max 64KB), wraps them in a format envelope, and upserts to the
task_snapshots table. Uses agent token auth and validates the task
belongs to the agent's workspace.
Closescoder/internal#1253
The workspace sharing autocomplete was using the site-wide /api/v2/users
endpoint which requires user:read permission. Regular org members don't
have this permission, so they couldn't see other members to share with.
## Sharing Scope
* This iteration of shared workspaces is slated for beta, and the
currently understood use case does not include cross-org workspace
sharing. This can be addressed later if necessary.
## What's Changed
* Changed to use /api/v2/organizations/{org}/members instead, which only
requires organization_member:read permission (already granted to org
members when workspace sharing is enabled).
* Added `OrganizationMemberWithUserData` to the `UserLike` union to
allow for more flexibility in differentiating groups from users
Bumps rust from `bf3368a` to `df6ca8f`.
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Relates to https://github.com/coder/coder/pull/21676
* Replaces all existing usages of `httpapi.Heartbeat` with `httpapi.HeartbeatClose`
* Removes `httpapi.HeartbeatClose`
When running multiple instances of the coder/coder devcontainer, the
`postStartCommand` fails with exit code 1:
```
postStartCommand from devcontainer.json failed with exit code 1
Command failed: ./.devcontainer/scripts/post_start.sh
```
`service docker start` returns exit code 1 when Docker is already
running:
| Docker State | Exit Code |
|--------------|-----------|
| Not running | 0 |
| Already running | **1** |
## Fix
Check if Docker is already running before attempting to start it:
```bash
sudo service docker status >/dev/null 2>&1 || sudo service docker start
```
---
*This PR and description were generated by
[Mux](https://mux.coder.com).*
Removes the legacy tailnet v1 API tables (`tailnet_clients`, `tailnet_agents`, `tailnet_client_subscriptions`) and their associated queries, triggers, and functions. These were superseded by the v2 tables (`tailnet_peers`, `tailnet_tunnels`) in migration 000168, and the v1 API code was removed in commit d6154c4310, but the database artifacts were never cleaned up.
**Changes:**
- New migration `000410_remove_tailnet_v1_tables` to drop the unused tables
- Removed 11 unused queries from `tailnet.sql`
- Removed associated manual wrapper methods in `dbauthz` and `dbmetrics`
- ~930 lines deleted across 11 files
## Summary
AI Bridge is moving to General Availability in v2.30 and will require
the AI Governance Add-On license in future versions. This adds a soft
warning for deployments using AI Bridge via Premium/Enterprise
FeatureSet without an explicit AI Bridge add-on license.
Relates to: https://github.com/coder/internal/issues/1226
## Changes
- Track whether AI Bridge was explicitly granted via license Features
(add-on) vs inherited from FeatureSet
- Show soft warning when AI Bridge is enabled and entitled via
FeatureSet but not via explicit add-on
- Changed AI Bridge enablement from hardcoded `true` to check
`CODER_AIBRIDGE_ENABLED` deployment config
## Behavior Change
AI Bridge is now only marked as "enabled" in entitlements when
`CODER_AIBRIDGE_ENABLED=true` is set in the deployment config.
Previously, it was always enabled for Premium/Enterprise licenses
regardless of the config setting.
This change ensures that users who do not use AI Bridge will not see the
soft warning about the upcoming license requirement.
## Warning Message
> AI Bridge is now Generally Available in v2.30. In a future Coder
version, your deployment will require the AI Governance Add-On to
continue using this feature. Please reach out to your account team or
sales@coder.com to learn more.
## Behavior
| Condition | Warning Shown |
|-----------|---------------|
| AI Bridge disabled | ❌ No |
| AI Bridge enabled + explicit add-on license | ❌ No |
| AI Bridge enabled + Premium/Enterprise FeatureSet (no add-on) | ✅ Yes
|
## Screenshots
### 1. No license
<img width="1708" height="577" alt="image"
src="https://github.com/user-attachments/assets/cbdbfd4d-55de-4d70-8abf-2665f458e96f"
/>
### 2. No license + CODER_AIBRIDGE_ENABLED=true
<img width="1716" height="513" alt="image"
src="https://github.com/user-attachments/assets/344aae76-7703-485f-b568-1f13a1efa48f"
/>
### 3. Premium license + CODER_AIBRIDGE_ENABLED=false
<img width="1687" height="389" alt="image"
src="https://github.com/user-attachments/assets/c2be12b0-1c0f-438d-a293-f9ec9fe6a736"
/>
### 4. Premium license + CODER_AIBRIDGE_ENABLED=true
<img width="1707" height="525" alt="image"
src="https://github.com/user-attachments/assets/1a4640e1-e656-4f9b-bed0-9390cb5d6a84"
/>
## Notes
- TODO comments added to mark code that should be removed when AI Bridge
enforcement is added
- Feature continues to work - this is just a transitional warning (soft
enforcement)
Relates to https://github.com/coder/coder/issues/19715
This is similar to https://github.com/coder/coder/pull/19711
This endpoint works by doing the following:
- Subscribing to the database's with pubsub
- Accepts a WebSocket upgrade
- Starts a `httpapi.Heartbeat`
- Creates a json encoder
- **Infinitely loops waiting for notification until request context
cancelled**
The critical issue here is that `httpapi.Heartbeat` silently fails when
the client has disconnected. This means we never cancel the request
context, leaving the WebSocket alive until we receive a notification
from the database and fail to write that down the pipe.
By replacing usage of `httpapi.Heartbeat` with `httpapi.HeartbeatClose`,
we cancel the context _when the heartbeat fails to write_ due to the
client disconnecting. This allows us to cleanup without waiting for a
notification to come through the pubsub channel.
## Summary
Updates the bulk action checkbox style in workspace and task lists to
use the new Shadcn Checkbox component that aligns with the Coder design
system (as specified in
[Figma](https://www.figma.com/design/WfqIgsTFN2BscBSSyXWF8/Coder-kit?node-id=489-4187&t=KRtpi391rVPHRXJI-1)).
## Changes
- **WorkspacesTable.tsx**: Replace MUI Checkbox with Shadcn Checkbox
component
- **TasksTable.tsx**: Replace MUI Checkbox with Shadcn Checkbox
component
- **WorkspacesPageView.stories.tsx**: Add `WithCheckedWorkspaces` story
to showcase the new design
## Key Improvements
The new checkbox design features:
- ✨ Consistent 20px × 20px sizing (vs. old larger MUI checkbox)
- 🎨 Clean inverted color scheme (light background unchecked, dark when
checked)
- ✅ Proper indeterminate state support for "select all" functionality
- 🎯 Smooth hover and focus states with proper ring indicators
- 📐 Better alignment with Coder design language from Figma
## API Changes
Updated from MUI Checkbox API to Radix UI Checkbox API:
- `onChange={(e) => ...}` → `onCheckedChange={(checked) => ...}`
- Removed MUI-specific `size` props (`xsmall`, `small`)
- Updated `checked` prop to support boolean | "indeterminate"
## Testing
### Storybook
The checkbox changes can be reviewed in Storybook:
1. **Checkbox Component** - `components/Checkbox` - Base component
examples
2. **Workspaces Page** - `pages/WorkspacesPage/WithCheckedWorkspaces` -
Shows workspace list with selections
3. **Tasks Page** - `pages/TasksPage/BatchActionsSomeSelected` - Shows
task list with selections
### Feature Flag Requirement
**Note**: The bulk action checkboxes require the
`workspace_batch_actions` or `task_batch_actions` feature flags to be
enabled (Premium feature). To test in a live environment, you'll need a
valid license that includes these features.
## Screenshots
Before: Old MUI checkbox with prominent blue styling and larger size
After: New Shadcn checkbox with refined design matching Coder's design
system
_(Screenshots can be viewed in Storybook at the URLs above)_
Closes#21444🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Jake Howell <jake@hwll.me>
Co-authored-by: Jaayden Halko <jaayden@coder.com>
This pull request updates the `.github/workflows/doc-check.yaml`
workflow to improve its handling of required secrets. The workflow now
checks for the presence of necessary secrets before proceeding, and
conditionally skips all subsequent steps if the secrets are unavailable.
This prevents failures on pull requests where secrets are not accessible
(such as from forks), and provides clear messaging for maintainers about
manual triggering options.
Key improvements:
**Secret availability checks and conditional execution:**
* Added an explicit step at the start of the workflow to check if
required secrets (`DOC_CHECK_CODER_URL` and
`DOC_CHECK_CODER_SESSION_TOKEN`) are available, and set an output flag
(`skip`) accordingly.
* Updated all subsequent workflow steps to include a conditional (`if:
steps.check-secrets.outputs.skip != 'true'`), ensuring they only run if
the secrets are present. This includes setup, context extraction, task
creation, waiting, and summary steps.
[[1]](diffhunk://#diff-46e6065a312f35e5d294476e7865089afd10e6072fed80ac77b257e090def149R59-R82)
[[2]](diffhunk://#diff-46e6065a312f35e5d294476e7865089afd10e6072fed80ac77b257e090def149R140)
[[3]](diffhunk://#diff-46e6065a312f35e5d294476e7865089afd10e6072fed80ac77b257e090def149R205)
[[4]](diffhunk://#diff-46e6065a312f35e5d294476e7865089afd10e6072fed80ac77b257e090def149R215)
[[5]](diffhunk://#diff-46e6065a312f35e5d294476e7865089afd10e6072fed80ac77b257e090def149R232)
[[6]](diffhunk://#diff-46e6065a312f35e5d294476e7865089afd10e6072fed80ac77b257e090def149R250)
* Modified the "Fetch Task Logs", "Cleanup Task", and "Write Final
Summary" steps to combine their existing `always()` condition with the
new secrets check, preventing unnecessary errors when secrets are
missing.
[[1]](diffhunk://#diff-46e6065a312f35e5d294476e7865089afd10e6072fed80ac77b257e090def149L314-R340)
[[2]](diffhunk://#diff-46e6065a312f35e5d294476e7865089afd10e6072fed80ac77b257e090def149L327-R353)
[[3]](diffhunk://#diff-46e6065a312f35e5d294476e7865089afd10e6072fed80ac77b257e090def149L339-R365)
**Documentation and messaging:**
* Added comments at the top of the workflow file to explain the secret
requirements and the expected behavior for PRs without secrets,
including instructions for maintainers on manual triggering.…se on pr's
originating from forks.
<!--
If you have used AI to produce some or all of this PR, please ensure you
have read our [AI Contribution
guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING)
before submitting.
-->
Source code changes:
- Added a wrapper for the boundary subcommand that checks feature
entitlement before executing the underlying command.
- Added a helper that returns the Boundary version using the
runtime/debug package, which reads this information from the go.mod
file.
- Added FeatureBoundary to the corresponding enum.
- Move boundary command from AGPL to enterprise.
`NOTE`: From now on, the Boundary version will be specified in go.mod
instead of being defined in AI modules.
## Description
Fixes a panic that occurs when the prebuilds feature is toggled by
adding/removing a license. The `StoreReconciler` was not unregistering
the `reconciliationDuration` histogram, causing a "duplicate metrics
collector registration attempted" panic when a new reconciler was
created.
## Changes
* Unregister the `reconciliationDuration` histogram in `Stop()`
alongside the existing metrics collector
* Change log level when stopping the reconciler with a cause, since
"entitlements change" is not an error condition
* Add `TestReconcilerLifecycle` to verify the reconciler can be stopped
and recreated with the same prometheus registry
Related to internal slack thread:
https://codercom.slack.com/archives/C07GRNNRW03/p1769116582171379
## Summary
AI Bridge is moving out of Premium as a separate add-on (GA in Feb 3).
Closes https://github.com/coder/internal/issues/1226
## Changes
- Excludes `FeatureAIBridge` from `Enterprise()` and
`FeatureSetPremium.Features()`
- Adds soft warning for deployments with AI Bridge enabled but not
entitled
- Warning is displayed to Auditor/Owner roles in UI banner and CLI
headers
## Warning Message
When AI Bridge is enabled (`CODER_AIBRIDGE_ENABLED=true`) but the
license doesn't include the entitlement:
> AI Bridge has reached General Availability and your Coder deployment
is not entitled to run this feature. Contact your account team
(https://coder.com/contact) for information around getting a license
with AI Bridge.
## Behavior
- The feature remains usable in v2.30 (soft warning only)
- Future versions may include hard enforcement
Relates to
https://github.com/coder/aibridge/pull/143/changes#r2720659638
We previously had been returning the following when attempting to delete
failed due to lack of permissions.
```
500 Internal error deleting template: unauthorized: rbac: forbidden
```
This PR updates the handler to return our usual 403 forbidden response.
fixes https://github.com/coder/internal/issues/1286
We can get blank IP address from the net connection if the client has
already disconnected, as was the case in this flake. Fix is to only log
error if we get something non-empty we can't parse.
---------
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
Relates to https://github.com/coder/internal/issues/1289
I was able to reproduce the issue locally -- it appears to sometimes
just take 25 seconds to get all of the test dependencies stood up:
```
t.go:111: 2026-01-22 16:39:15.388 [debu] pubsub: pubsub dialing postgres network=tcp address=127.0.0.1:5432 timeout_ms=0
...
t.go:111: 2026-01-22 16:39:38.789 [info] agent.net.tailnet.tcp: accepted connection src=[fd7a:115c:a1e0:44b1:8901:8f09:e605:d019]:55406 dst=[fd7a:115c:a1e0:4cfd:a892:e4e2:8cad:8534]:1
...
ssh_test.go:1208:
Error Trace: /Users/cian/src/coder/coder/testutil/chan.go:74
/Users/cian/src/coder/coder/cli/ssh_test.go:1208
Error: SoftTryReceive: context expired
Test: TestSSH/StdioExitOnParentDeath
ssh_test.go:1212:
```
Hopefully bumping the timeout should fix it.
Adds template_version_id to re-emitted boundary audit logs to allow
filtering and analysis by specific template versions iin addition to the
existing template_id field. Since boundary policies are defined in the
template, the template version is critical to figuring out which policy
was responsible for boundaries decision in a workspace.
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Also remove the "restart" button. The language is still there, and the
restart button is already on this page. That button being present had
users clicking it when waiting was the correct solution.
So the "Restart" option is not as pushed.
The removal of that permission from the role broke valid use cases (e.g.
a site owner user creating a workspace owned by a system account and
then trying to share it with another user).
The bulk of the PR is made up of the rollbacks of the previously
introduced test updates necessitated by the removal.
Related to: https://github.com/coder/internal/issues/1285
Simplifies the CTA text from "Share workspace" to "Share" for better UX.
Users don't need to understand the underlying infrastructure when
working on tasks, so the shorter, more direct text is clearer.
**Affected locations:**
- Tasks table dropdown menu
- Tasks sidebar dropdown menu
Fixes https://github.com/coder/coder/issues/21599🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Boundary policies are currently defined at the template level, so
including the template ID in re-emitted logs by the control plane
allows policy creators to filter and observe boundary activity for
specific templates. This makes it easier to verify that policies are
working as expected and to debug issues with specific template
configurations.
AI agents report status via patchWorkspaceAgentAppStatus, but this wasn't
extending workspace deadlines. This prevented proper task auto-pause behavior,
causing tasks to pause mid-execution when there were no human connections.
Now we call ActivityBumpWorkspace when agents report status, using the same
logic as SSH/IDE connections. We bump when transitioning to or from the working
state.
Closescoder/internal#1251
## Summary
Updates our existing doc-check workflow to utilize new claude-skills
that exist in the repository for better contextual behavior with less
prompting requirements in the workflow
## Changes
### New: Claude Skill (`.claude/skills/doc-check/SKILL.md`)
Defines the doc-check workflow for Claude:
- Check PR title first to skip irrelevant reviews (refactors, tests,
chores)
- Read code changes and search existing docs for related content
- Post structured comments with specific recommendations
### Updated: GitHub Workflow (`.github/workflows/doc-check.yaml`)
- **Triggers**: PR opened, updated (synchronize), `doc-check` label
added, or manual dispatch
- **Task lifecycle**: Creates task → monitors completion → fetches logs
→ cleans up
- **Context-aware prompts**: Tells Claude if this is a new PR, update,
or manual request
- Uses `coder-username` parameter to run as `doc-check-bot` service
account
## Description
Introduces a new `X-Coder-Token` header for authenticating requests from
AI Proxy to AI Bridge. Previously, the proxy overwrote the
`Authorization` header with the Coder token, which prevented the
original authentication headers from flowing through to upstream
providers.
With this change, AI Proxy sets the Coder token in a separate header,
preserving the original `Authorization` and `X-Api-Key` headers. AI
Bridge uses this header for authentication and removes it before
forwarding requests to upstream providers. For requests that don't come
through AI Proxy, AI Bridge continues to use `Authorization` and
`X-Api-Key` for authentication.
## Changes
* Add `HeaderCoderAuth` constant and update `ExtractAuthToken` to check
headers in the following order: `X-Coder-Token` > `Authorization` >
`X-Api-Key`
* Update AI Proxy to set `X-Coder-Token` instead of overwriting
`Authorization`
* Remove `X-Coder-Token` in AI Bridge before forwarding to upstream
providers
* Add tests for header handling and token extraction priority
Related to: https://github.com/coder/internal/issues/1235
Relates to https://github.com/coder/internal/issues/1217
Adds a background goroutine in `--stdio` mode to check if the parent PID
is still alive and exit if it is no longer present.
🤖 Implemented using Mux + Claude Opus 4.5, reviewed and refactored by
me.
Agents were losing authentication during workspace shutdown, causing
shutdown scripts to fail. The auth query required agents to belong to
the latest build, but during shutdown a `stop` build becomes latest while
the `start` build's agents are still running.
Modified the auth query to allow `start` build agents to authenticate
temporarily during `stop` execution. The query allows auth when:
- Agent's `start` build job succeeded
- Latest build is `stop` with `pending`/`running` job status
- Builds are adjacent (`stop` is `build_number + 1`)
- Template versions match
Auth closes once `stop` completes.
Renamed `GetWorkspaceAgentAndLatestBuildByAuthToken` to
`GetAuthenticatedWorkspaceAgentAndBuildByAuthToken` since it returns the
agent's build (not always latest) during shutdown.
Closes coder/internal#1249
Fixes#19467
## Description
This PR addresses database connection pool exhaustion during prebuilds
reconciliation by introducing two changes:
* `CanSkipReconciliation`: Filters out presets that don't need
reconciliation before spawning goroutines. This ensures we only create
goroutines for presets that will (_most likely_) perform database
operations, avoiding unnecessary connection pool usage.
* Dynamic `eg.SetLimit`: Limits concurrent goroutines based on the
configured database connection pool size (`CODER_PG_CONN_MAX_OPEN / 2`).
This replaces the previous hardcoded limit of 5, ensuring the
reconciliation loop scales appropriately with the configured pool size
while leaving capacity for other database operations.
## Changes
* Add `CanSkipReconciliation()` method to `PresetSnapshot` that returns
true for inactive presets with no running workspaces, no pending jobs,
or expired prebuilds.
* Add `maxDBConnections` parameter to `NewStoreReconciler` and compute
`reconciliationConcurrency` as half the pool size (minimum 1).
* Add `ReconciliationConcurrency()` getter method to `StoreReconciler`.
* Add `eg.SetLimit(c.reconciliationConcurrency)` to bound concurrent
reconciliation goroutines.
* Add `PresetsTotal` and `PresetsReconciled` to `ReconcileStats` for
observability.
* Add `TestCanSkipReconciliation` unit tests.
* Add `TestReconciliationConcurrency` unit tests.
* Add benchmark tests for reconciliation performance.
## Benchmarks
* `BenchmarkReconcileAll_NoOps`: Tests presets with no reconciliation
actions. All presets are filtered by `CanSkipReconciliation`, resulting
in no goroutines spawned and no database connections used.
* `BenchmarkReconcileAll_ConnectionContention`: Tests presets where all
require reconciliation actions. All presets spawn goroutines, but
concurrency is limited by `eg.SetLimit(reconciliationConcurrency)`.
* `BenchmarkReconcileAll_Mix`: Simulates a realistic scenario with a
large subset of inactive presets (filtered by `CanSkipReconciliation`)
and a smaller subset requiring reconciliation (limited by
`eg.SetLimit`).
Closes: https://github.com/coder/coder/issues/20606
Creates migration 000409 with the database foundation for pausing and
resuming task workspaces.
The task_snapshots table stores conversation history (AgentAPI messages)
so users can view task logs even when the workspace is stopped. Each task
gets one snapshot, overwritten on each pause.
Three new build_reason values (task_auto_pause, task_manual_pause,
task_resume) let us distinguish task lifecycle events in telemetry and
audit logs from regular workspace operations.
Uses a regular table rather than UNLOGGED for snapshots. While UNLOGGED
would be faster, losing snapshots on database crash creates user confusion
(logs disappear until next pause). We can switch to UNLOGGED post-GA if
write performance becomes a problem.
Closescoder/internal#1250
Closes#21440
The `TestDBPurgeAuthorization` test was overfitting by calling each
purge method individually, which reimplemented dbpurge logic in the test
and created a maintenance burden. When new purge steps are added, they
either need to be reflected in the test or there will be a testing
blindspot.
This change extracts the `doTick` closure into an exported `PurgeTick`
function that returns an error, making the core purge logic testable.
The test now calls `PurgeTick` directly to exercise the actual dbpurge
behavior rather than reimplementing it. Retention values are configured
to ensure all purge operations run, so we test RBAC permissions for all
code paths.
- Tests actual dbpurge behavior instead of reimplementing it
- Automatically covers new purge steps when they're added
- Still validates that all operations have proper RBAC permissions
The test focuses on authorization (checking for RBAC errors) rather than
verifying deletion behavior, which is already covered by other tests
like `TestDeleteExpiredAPIKeys` and `TestDeleteOldAuditLogs`.
## Description
Adds startup validation to ensure all allowlisted domains have
corresponding AI Bridge provider mappings. This prevents a
misconfiguration where a domain could be MITM'd (decrypted) but have no
route to aibridge.
Previously, if a domain was in the allowlist but had no provider
mapping, requests would be decrypted and forwarded to the original
destination, a potential privacy concern. Now the server fails to start
if this misconfiguration is detected.
## Summary
Add circuit breaker support for AI Bridge to protect against cascading
failures from upstream AI provider rate limits (HTTP 429, 503, and
Anthropic's 529 overloaded responses).
## Changes
- Add 5 new CLI options for circuit breaker configuration:
- `--aibridge-circuit-breaker-enabled` (default: false)
- `--aibridge-circuit-breaker-failure-threshold` (default: 5)
- `--aibridge-circuit-breaker-interval` (default: 10s)
- `--aibridge-circuit-breaker-timeout` (default: 30s)
- `--aibridge-circuit-breaker-max-requests` (default: 3)
- Update aibridge dependency to include circuit breaker support
- Add tests for pool creation with circuit breaker providers
## Notes
- Circuit breaker is **disabled by default** for backward compatibility
- When enabled, applies to both OpenAI and Anthropic providers
- Uses sony/gobreaker internally via the aibridge library
## Testing
```
make test RUN=TestPoolWithCircuitBreakerProviders
```
- Adds pprof collection support now that we have the listeners
automatically starting (requires Coder server 2.28.0+, includes a
version check). Collects heap, allocs, profile (30s), block, mutex,
goroutine, threadcreate, trace (30s), cmdline, symbol. Performs capture
for 30 seconds and emits a log line stating as such. Enable capture by
supplying the `--pprof` flag or `CODER_SUPPORT_BUNDLE_PPROF` env var.
Collection of pprof data from both coderd and the Coder agent occurs.
- Adds collection of Prometheus metrics, also requires 2.28.0+
- Adds the ability to include a template in the bundle independently of
supplying the details of a running workspace by supplying the
`--template` flag or `CODER_SUPPORT_BUNDLE_TEMPLATE` env var
- Captures a list of workspaces the user has access to. Defaults to a
max of 10, configurable via `--workspaces-total-cap` /
`CODER_SUPPORT_BUNDLE_WORKSPACES_TOTAL_CAP`
- Collects additional stats from the coderd deployment (aggregated
workspace/session metrics), as well as entitlements via license and
dismissed health checks.
created with help from mux
Bumps rust from `6cff8a3` to `bf3368a`.
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps ubuntu from `104ae83` to `c7eb020`.
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Relates to https://github.com/coder/internal/issues/1214
The `ExtractWorkspaceAgentParam` middleware ends up making 4 database
queries to follow the chain of `WorkspaceAgent` -> `WorkspaceResource`
-> `ProvisionerJob` -> `WorkspaceBuild` -- but then dropping all that
hard work on the floor. The `api.workspaceAgent` handler that references
this middleware then has to do all of that work again, plus one more
query to get the related `User` so we can get the username. This pattern
is also mirrored in `getDatabaseTerminal` but without the middleware.
This PR:
* Adds a new query `GetWorkspaceAgentAndWorkspaceByID` to fetch all
this information at once to avoid the multiple round-trips,
* Updates the existing usage of `GetWorkspaceAgentByID` to this new
query instead,
* Updates `ExtractWorkspaceAgentParam` to also store the workspace in
the request context
Dalibo: [0.63ms](https://explain.dalibo.com/plan/40bb597f3539gc6c)
Increases the interval of running `du` on `/home/coder` and
`/var/lib/docker` to 1h.
Also decreases the timout to 1m; having `du` run for longer is likely
not great.
This PR:
- Removes the host-related agent metadata. It's not particularly useful
and the hosts have dedicated monitoring via Netdata.
- Adds two metdata blocks to expose the sizes of `/home/coder` and
`/var/lib/docker`
## Description
Adds upstream proxy support for AI Bridge Proxy passthrough requests.
This allows aiproxy to forward non-allowlisted requests through an
upstream proxy. Currently, the only supported configuration is when
aiproxy is the first proxy in the chain (client → aiproxy → upstream
proxy).
## Changes
* Add `--aibridge-proxy-upstream` option to configure an upstream
HTTP/HTTPS proxy URL for passthrough requests
* Add `--aibridge-proxy-upstream-ca` option to trust custom CA
certificates for HTTPS upstream proxies
* Passthrough requests (non-allowlisted domains) are forwarded through
the upstream proxy
* MITM'd requests (allowlisted domains) continue to go directly to
aibridge, not through the upstream proxy
* Add tests for upstream proxy configuration and request routing
Closes: https://github.com/coder/internal/issues/1204
This makes it so we can test it directly without having to go through
Tailnet, which appears to be causing flakes in CI where the requests
time out and never make it to the agent.
Takes inspiration from the container-related API endpoints.
Would probably make sense to refactor the ls tests to also go through
the API (rather than be internal tests like they are currently) but I
left those alone for now to keep the diff minimal.
This PR improves the usability of `coder show`:
- Adds a header with workspace owner/name, latest build status and time
since, and template name / version name.
- Updates `namedWorkspace` to allow looking up by UUID
- Also improves associated `TestShow` to respect context deadlines.
Fixes shellcheck warning reported in
https://github.com/coder/coder/pull/21496#discussion_r2696470065
## Problem
The `error()` function in `lib.sh` already calls `exit 1`, so the `exit
1` on line 17 of `check_pg_schema.sh` was unreachable:
```
In ./scripts/check_pg_schema.sh line 17:
exit 1
^----^ SC2317 (info): Command appears to be unreachable.
```
## Solution
Remove the redundant `exit 1` since `error()` already handles exiting.
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Relates to https://github.com/coder/internal/issues/272
This flake has been persisting for a while, and unfortunately there's no
detail on which healthcheck in particular is holding things up.
This PR adds a concurrency-safe `healthcheck.Progress` and wires it
through `healthcheck.Run`. If the healthcheck times out, it will provide
information on which healthchecks are completed / running, and how long
they took / are still taking.
🤖 Claude Opus 4.5 completed the first round of this implementation,
which I then refactored.
## Summary
Moves the stop action from the icon-button shortcuts to the kebab menu
(WorkspaceMoreActions) in the workspaces list view.
## Problem
The stop icon was difficult to recognize without context in the
workspace list view. Users couldn't easily identify what the stop button
did based on the icon alone.
## Solution
- The stop action is not a primary action and doesn't need to be
highlighted in the icon-button view
- Moved the stop action into the kebab (⋮) menu
- The start button remains as a primary action when the workspace is
offline, since starting a workspace is a more common and expected action
## Changes
- `WorkspaceMoreActions`: Added optional `onStop` and `isStopPending`
props to conditionally render a "Stop" menu item
- `WorkspacesTable`: Removed the stop `PrimaryAction` button and instead
passes the stop callback to `WorkspaceMoreActions` when the workspace
can be stopped
## Testing
- TypeScript compiles without errors
- All existing tests pass
- Manually verified that the stop action appears in the kebab menu when
the workspace is running
Fixes#21516
---
Created on behalf of @jacobhqh1
---------
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: Jake Howell <jake@hwll.me>
## Problem
Migration 000401 introduced a hardcoded `public.` schema qualifier which
broke deployments using non-public schemas (see #21493). We need to
prevent this from happening again.
## Solution
Adds a new `lint/migrations` Make target that validates database
migrations do not hardcode the `public` schema qualifier. Migrations
should rely on `search_path` instead to support deployments using
non-public schemas.
## Changes
- Added `scripts/check_migrations_schema.sh` - a linter script that
checks for `public.` references in migration files (excluding test
fixtures)
- Added `lint/migrations` target to the Makefile
- Added `lint/migrations` to the main `lint` target so it runs in CI
## Testing
- Verified the linter **fails** on current `main` (which has the
hardcoded `public.` in migration 000401)
- Verified the linter **passes** after applying the fix from #21493
```bash
# On main (fails)
$ make lint/migrations
ERROR: Migrations must not hardcode the 'public' schema. Use unqualified table names instead.
# After fix (passes)
$ make lint/migrations
Migration schema references OK
```
## Depends on
- #21493 must be merged first (or this PR will fail CI until it is)
---------
Signed-off-by: Danny Kopping <danny@coder.com>
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: Danny Kopping <danny@coder.com>
Adds a new Prometheus metric `coderd_db_query_counts_total` that tracks
the total number of queries by route, method, and query name. This is
aimed at helping us track down potential optimization candidates for
HTTP handlers that may trigger a number of queries. It is expected to be
used alongside `coderd_api_requests_processed_total` for correlation.
Depends upon new middleware introduced in
https://github.com/coder/coder/pull/21498
Relates to https://github.com/coder/internal/issues/1214
Extracts part of the prometheus middleware that stores the route
information in the request context into its own middleware. Also adds
request method information to context.
Relates to https://github.com/coder/internal/issues/1214
Add comprehensive OAuth2 enum types to codersdk following RFC specifications:
- OAuth2ProviderGrantType (RFC 6749)
- OAuth2ProviderResponseType (RFC 6749)
- OAuth2TokenEndpointAuthMethod (RFC 7591)
- OAuth2PKCECodeChallengeMethod (RFC 7636)
- OAuth2TokenType (RFC 6749, RFC 9449)
- OAuth2RevocationTokenTypeHint (RFC 7009)
- OAuth2ErrorCode (RFC 6749, RFC 7009, RFC 8707)
Add OAuth2TokenRequest, OAuth2TokenResponse, OAuth2TokenRevocationRequest,
and OAuth2Error structs to the SDK. Update OAuth2ClientRegistrationRequest,
OAuth2ClientRegistrationResponse, OAuth2ClientConfiguration, and
OAuth2AuthorizationServerMetadata to use typed enums instead of raw strings.
This makes codersdk the single source of truth for OAuth2 types, eliminating
duplication between SDK and server-side structs.
Closes#21476
Adds a per-organization setting to disable workspace sharing. When enabled,
all existing workspace ACLs in the organization are cleared and the workspace
ACL mutation API endpoints return `403 Forbidden`.
This complements the existing site-wide `--disable-workspace-sharing` flag by
providing more granular control at the organization level.
Closes https://github.com/coder/internal/issues/1073 (part 2)
---------
Co-authored-by: Steven Masley <Emyrk@users.noreply.github.com>
With issues: "## 🔍 Code Review\\n\\nReviewed [5-8 words].\\n\\n**Found X issues** (Y critical, Z nitpicks).\\n\\n---\\n*AI review via [Coder Tasks](https://coder.com/docs/ai-coder/tasks)*"
No issues: "## 🔍 Code Review\\n\\nReviewed [5-8 words].\\n\\n✅ **Looks good** - no production issues found.\\n\\n---\\n*AI review via [Coder Tasks](https://coder.com/docs/ai-coder/tasks)*"
</github_api_documentation>
<critical_rules>
1. Read ENTIRE files before commenting - use read_file or grep to verify
2. Check the EXACT line you're commenting on - does the issue actually exist there?
3. Suggestion block = ONLY replacement lines (never include unchanged surrounding lines)
Use GitHub MCP tools to get PR title, body, and diff
Or use: git diff main...pr-${PR_NUMBER}
Use \`gh\` to get PR details, diff, and all comments. Look for an existing doc-check comment containing \`<!-- doc-check-sticky -->\` - if one exists, you'll update it instead of creating a new one.
3. Understand Changes
Read the diff and identify what changed
Ask: Is this user-facing? Does it change behavior? Is it a new feature?
**Do not comment if no documentation changes are needed.**
"file is %d bytes which exceeds the maximum of %d bytes. Use grep, sed, or awk to extract the content you need, or use offset and limit to read a portion.",
fileSize,limits.MaxFileSize,
))
}
// Read the entire file (up to MaxFileSize).
data,err:=io.ReadAll(f)
iferr!=nil{
returnerrResp(fmt.Sprintf("read file: %s",err))
}
// Split into lines.
content:=string(data)
// Handle empty file.
ifcontent==""{
returnReadFileLinesResponse{
Success:true,
FileSize:fileSize,
TotalLines:0,
LinesRead:0,
Content:"",
}
}
lines:=strings.Split(content,"\n")
totalLines:=len(lines)
// offset is 1-based line number.
ifoffset<1{
offset=1
}
ifoffset>int64(totalLines){
returnerrResp(fmt.Sprintf(
"offset %d is beyond the file length of %d lines",
offset,totalLines,
))
}
// Default limit.
iflimit<=0{
limit=int64(limits.MaxResponseLines)
}
startIdx:=int(offset-1)// convert to 0-based
endIdx:=startIdx+int(limit)
ifendIdx>totalLines{
endIdx=totalLines
}
varnumbered[]string
totalBytesAccumulated:=0
fori:=startIdx;i<endIdx;i++{
line:=lines[i]
// Per-line truncation.
iflen(line)>limits.MaxLineBytes{
line=line[:limits.MaxLineBytes]+"... [truncated]"
}
// Format with 1-based line number.
numberedLine:=fmt.Sprintf("%d\t%s",i+1,line)
lineBytes:=len(numberedLine)
// Check total byte budget.
newTotal:=totalBytesAccumulated+lineBytes
iflen(numbered)>0{
newTotal++// account for \n joiner
}
ifnewTotal>limits.MaxResponseBytes{
returnerrResp(fmt.Sprintf(
"output would exceed %d bytes. Read less at a time using offset and limit parameters.",
limits.MaxResponseBytes,
))
}
// Check line count.
iflen(numbered)>=limits.MaxResponseLines{
returnerrResp(fmt.Sprintf(
"output would exceed %d lines. Read less at a time using offset and limit parameters.",
Description:"Address to bind the mock LLM API server. Can include a port (e.g., 'localhost:8080' or ':8080'). Uses a random port if no port is specified.",
Description:"Size in bytes of the response payload. If 0, uses default context-aware responses.",
Value:serpent.Int64Of(&responsePayloadSize),
},
{
Flag:"pprof-enable",
Env:"CODER_SCALETEST_LLM_MOCK_PPROF_ENABLE",
Default:"false",
Description:"Serve pprof metrics on the address defined by pprof-address.",
Value:serpent.BoolOf(&pprofEnable),
},
{
Flag:"pprof-address",
Env:"CODER_SCALETEST_LLM_MOCK_PPROF_ADDRESS",
Default:"127.0.0.1:6060",
Description:"The bind address to serve pprof.",
Value:serpent.StringOf(&pprofAddress),
},
{
Flag:"trace-enable",
Env:"CODER_SCALETEST_LLM_MOCK_TRACE_ENABLE",
Default:"false",
Description:"Whether application tracing data is collected. It exports to a backend configured by environment variables. See: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/exporter.md.",
Title:fmt.Sprintf("%s/%s (%s since %s) %s:%s",workspace.OwnerName,workspace.Name,workspace.LatestBuild.Status,time.Since(workspace.LatestBuild.CreatedAt).Round(time.Second).String(),workspace.TemplateName,workspace.LatestBuild.TemplateVersionName),
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.