This pull-request addresses a few design things within the `<AgentRow
/>` element. This is a follow-on from the previous work done with
implementing tabs.
- Workspace border can no longer be red, will always be orange (this was
done in a previous PR but not stated).
- Warnings have been moved to inside the Agent Logs collapsible.
- Warning badge has been added to the Agent Logs collapsible trigger.
- Collapsible is now open by default when there is an error inside of
the agent.
- Agent disconnected is no longer prominent by default.
Closes#16332
Previously `coder provisioner jobs list` showed no indication of what a workspace
build job was doing (i.e., start, stop, or delete). This adds
`workspace_build_transition` to the provisioner job metadata, exposed in
both the REST API and CLI. Template and workspace name columns were also
added, both available via `-c`.
```
$ coder provisioner jobs list -c id,type,status,"workspace build transition"
ID TYPE STATUS WORKSPACE BUILD TRANSITION
95f35545-a59f-4900-813d-80b8c8fd7a33 template_version_import succeeded
0a903bbe-cef5-4e72-9e62-f7e7b4dfbb7a workspace_build succeeded start
```
The "By model" and "Pull requests" tables on the PR Insights page
(`/agents/settings/insights`) were side-by-side at `lg` breakpoints, and
the Pull requests table was hard-capped at 20 rows by the backend.
- Replaced `lg:grid-cols-2` with a single-column stacked layout so both
tables span the full content width.
- Removed the `LIMIT 20` from the `GetPRInsightsRecentPRs` SQL query so
all PRs in the selected time range are returned.
- Can add this back if we need it. If we do, we should add a little
subheader above this table to indicate that we're not showing all PRs
within the selected timeframe.
- Added client-side pagination to the Pull requests table using
`PaginationWidgetBase` (page size 10), matching the existing pattern in
`ChatCostSummaryView`.
- Renamed the section heading from "Recent" to "Pull requests" since it
now shows the full set for the time range.
<img width="1481" height="1817" alt="image"
src="https://github.com/user-attachments/assets/0066c42f-4d7b-4cee-b64b-6680848edc68"
/>
> 🤖 PR generated with Coder Agents
Previously, editing a past user message in Agents chat waited for the
PATCH round-trip and cache reconciliation before the conversation
visibly settled. The edited bubble and truncated tail could briefly fall
back to older fetched state, and a failed edit did not restore the full
local editing context cleanly.
Keep history editing optimistic end-to-end: update the edited user
bubble and truncate the tail immediately, preserve that visible
conversation until the authoritative replacement message and cache catch
up, and restore the draft/editor/attachment state on failure. The route
already scopes each `agentId` to a keyed `AgentChatPage` instance with
its own store/cache-writing closures, so navigating between chats does
not need an extra post-await active-chat guard to keep one chat's edit
response out of another chat.
Fixes https://github.com/coder/internal/issues/1440
- Convert `OrganizationAutocomplete` to a purely presentational, fully
controlled component
- Accept `value`, `onChange`, `options` from parent; remove internal
state, data fetching, and permission filtering
- Update `CreateTemplateForm` and `CreateUserForm` to own org fetching,
permission checks, auto-select, and invalid-value clearing inline
- Memoize `orgOptions` in callers for stable `useEffect` deps
- Rewrite Storybook stories for the new controlled API
> 🤖 Written by a Coder Agent. Reviewed by a human.
Closes CODAGT-123
Assistant messages containing only source parts (no markdown or
reasoning)
were missing the bottom spacer that normally fills the gap left by the
hidden
action bar, causing them to sit flush against the next user bubble.
The existing fallback spacer guarded on `Boolean(parsed.reasoning)`, so
it
only fired for thinking-only replies. Replace that guard with the
broader
`hasRenderableContent` flag (which covers blocks, tools, and sources)
and
extract a named `needsAssistantBottomSpacer` boolean so future content
types
inherit consistent spacing without re-reading compound conditions.
Adds a `SourcesOnlyAssistantSpacing` Storybook story mirroring the
existing
`ThinkingOnlyAssistantSpacing` pattern for regression coverage.
Closes CODAGT-124
When a streaming assistant response finishes and moves from the live
stream
tail into the conversation timeline, the message jumps 4px upward. This
happens because the outer layout wrapper and live-stream section both
used
`gap-3` (12px), while the committed-message list used `gap-2` (8px).
Unify all three containers to `gap-2` so the gap between messages stays
at 8px regardless of whether they're streaming or committed, eliminating
the layout shift.
A Storybook story with play-function assertions locks the invariant: it
renders both committed messages and an active stream, then verifies both
the outer and inner containers report `rowGap === "8px"`.
Fixes a regression introduced in #24060 that could crash the frontend.
`thunk` is created by `useEffectEvent()`, and React 19.2 enforces that
effect-event functions are not invoked during render. The previous code
called `thunk()` inside a `setState` updater function, and React
executes updater
functions during render, so this became an illegal render-phase call.
The fix computes `next` in the interval callback (`const next =
thunk()`) and then stores it via `setComputedValue(() => next)`. This
keeps the `useEffectEvent` call outside render and also preserves
correct behavior when `func` returns a function value, because React
stores `next` instead of treating it as a functional updater.
Go's html/template has a built-in security filter (urlFilter) that only
allows http, https, and mailto URL schemes. Any other scheme gets
replaced with #ZgotmplZ.
The OAuth2 app's callback URL uses custom URI scheme which the filter
considers unsafe. For example the Coder JetBrains plugin exposes a
callback URI with the scheme jetbrains:// - which was effectively
changed by the template engine into #ZgotmplZ. Of course this is not an
actual callback. When users clicked the cancel button nothing happened.
The fix was simple - we now wrap the apps registered callback URI into
htmltemplate.URL. Usually this needs some validation otherwise the
linter will complain about it. The callback URI used by the Cancel logic
is actually validated by our backend when the client app
programmatically registered via the dynamic OAuth2 registration
endpoints, so we refactored the validation around that code and re-used
some of it in the Cancel handling to make sure we don't allow URIs like
`javascript` and `data`, even though in theory these URIs were already
validated.
In addition, while testing this PR with
https://github.com/coder/coder-jetbrains-toolbox/pull/209 I discovered
that we are also not compliant with
https://www.rfc-editor.org/rfc/rfc6749#section-4.1.2.1 which requires
the server to attach the local state if it was provided by the client in
the original request. Also it is optional but generally a good practice
to include `error_description` in the error responses. In fact we follow
this pattern for the other types of error responses. So this is not a
one off.
- resolves#20323
<img width="1485" height="771" alt="Cancel_page_with_invalid_uri"
src="https://github.com/user-attachments/assets/5539d234-9ce3-4dda-b421-d023fc9aa99e"
/>
<img width="486" height="746" alt="Coder Toolbox handling the Cancel
button"
src="https://github.com/user-attachments/assets/acab71a6-d29c-4fa9-80ba-3c0095bbdc8f"
/>
<!--
If you have used AI to produce some or all of this PR, please ensure you
have read our [AI Contribution
guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING)
before submitting.
-->
\`InlineMarkdown\` and \`MemoizedInlineMarkdown\` lived in
\`Markdown.tsx\`
alongside a static \`import { Prism as SyntaxHighlighter } from
"react-syntax-highlighter"\` — the full PrismJS build with ~300 language
grammars. Because \`DashboardLayout\` eagerly imports
\`AnnouncementBannerView → InlineMarkdown\`, every authenticated page
loaded and evaluated the entire Prism/refractor bundle on startup even
though syntax highlighting is only used in secondary views.
This PR moves \`InlineMarkdown\` and \`MemoizedInlineMarkdown\` into
their
own \`InlineMarkdown.tsx\` file that depends only on \`react-markdown\`
and
updates all six consumers to import from the new module.
\`Markdown.tsx\`
keeps the PrismJS import for the full \`Markdown\` component, which is
only reached through lazy-loaded routes.
> 🤖 Generated by Coder Agents
## Summary
Exposes `credential_kind` and `credential_hint` on AI Bridge session
threads, making credential metadata visible in the session detail API.
Each thread in the `/api/v2/aibridge/sessions/{session_id}` response now
includes:
- `credential_kind`: `centralized` or `byok`
- `credential_hint`: masked credential (e.g. `sk-a...pgAA`)
Values are taken from the thread's root interception.
## Changes
- `codersdk/aibridge.go`: Added `CredentialKind` and `CredentialHint`
fields to `AIBridgeThread`
- `coderd/database/db2sdk/db2sdk.go`: Populated from root interception
in `buildAIBridgeThread`
- `SessionTimeline.stories.tsx`: Added fields to mock thread data
the current page has an "Agentic loop completed" block that doesn't
really contain any valuable info that isn't available elsewhere. replace
this with a status indicator
<img width="507" height="300" alt="Screenshot 2026-04-08 at 2 47 40 PM"
src="https://github.com/user-attachments/assets/09cf3772-a52d-485d-a15e-b2257b2d9003"
/>
couple of little design tweaks to make the UI of the Request Logs page
and Sessions pages more consistent:
- decrease size of Request Logs page chevron
- copy Request Logs page chevron animation for Sessions expandable
sections
- use TokenBadges component in RequestLogsRow
- wrap tool call counts in badges
<img width="1393" height="210" alt="Screenshot 2026-04-08 at 1 56 10 PM"
src="https://github.com/user-attachments/assets/97e7acb6-71c7-48d6-b0df-a102c7602cc0"
/>
Frontend for https://github.com/coder/coder/pull/24022.
From that PR's description:
> The agents chat interface displays thumbnails for videos recorded by
the computer use agent. Currently, to display a thumbnail, the frontend
downloads the entire video and shows the first frame.
#24022 adds a thumbnail file id to `wait_agent` tool results, and this
PR displays it instead of fetching the entire video.
## Motivation
During the April 2 dogfood incident, a pod OOM-kill triggered a
reconnection storm: hundreds of chat-stream and agent-RPC websockets all
attempted to reconnect at the same deterministic backoff intervals (1 s,
2 s, 4 s, …). Because every browser tab computed the same delay, the
surviving replicas received a synchronized wall of new connections at
each retry tick, amplifying the overload that caused the first OOM in
the first place.
The root cause of the memory blowup (chatd serialization cost) is a
separate issue. This change addresses the secondary blast-radius
problem: when N clients reconnect in lockstep, the retry storm itself
becomes a capacity threat.
## Change
The shared `createReconnectingWebSocket` utility now applies symmetric
jitter (default ±30%) to the capped exponential-backoff delay before
scheduling the reconnect timer. With 100 clients and a 1 s base delay,
reconnects spread over the 700 ms–1300 ms window instead of all landing
at exactly 1000 ms, and once retries hit `maxMs` the scheduler still
preserves downward spread instead of collapsing back to a single tick.
Two new options are accepted by callers:
- **`jitter`** (0–1 fraction, default `0.3`) — controls the jitter
window. Values are clamped to `[0, 1]`; `0` preserves exact legacy
timing.
- **`random`** (`() => number`, default `Math.random`) — injectable RNG,
primarily a deterministic test seam. Non-finite output falls back to the
midpoint (`0.5`).
The `retryingAt` timestamp surfaced to `ChatStatusCallout` is computed
from the jittered delay, so the countdown shown to users reflects the
actual retry time. The scheduler also keeps `maxMs` as a hard ceiling on
the final delay and saturates exponential overflow at that cap instead
of dropping to `0ms` retries.
No production callers need changes — the default jitter activates
automatically for all four call sites (`AgentsPage` chat-list watcher,
`AgentChatPage` workspace watcher, `useChatStore` per-chat stream,
`useGitWatcher`). The two downstream tests that asserted exact reconnect
timing now pin `Math.random()` to `0.5` so those expectations stay
deterministic.
This pull-request makes a few changes to our `<Badge />` component to
bring it inline with Figma.
* Added all variants to the stories of Figma (they can vary per
badge-type, so its better we track everything).
* Removed the `border` variant of the component, border variants should
be on all `sm` and `md`.
* Added a hover effect to the `default` variant (per-design).
* Resolved issue with sizings of `xs` and `sm` plus resolved
iconography.
* Resolved issue with icons not showing at all on `xs` variants.
closes CODAGT-122
Add a spacer div that renders only when an assistant message lacks the
action bar, matching the height the action bar would provide.
> 🤖 Generated by Coder Agents
## Problem
Workspaces showed as "Healthy" immediately after creation while the
agent was still downloading, starting, or connecting. If the agent never
connected, the workspace stayed "Healthy" for the entire connection
timeout (~120s), then abruptly flipped to "Unhealthy".
## Root cause
In `db2sdk.WorkspaceAgent`, the health switch had no case for
`WorkspaceAgentConnecting`. Agents in `connecting` status with a
non-`off` lifecycle (e.g. `created` after a fresh build) fell through to
the `default` case and were marked `Healthy = true`.
## Fix
Add an explicit case for `WorkspaceAgentConnecting` that sets `Healthy =
false` with reason `"agent has not yet connected"`. The case is placed
after the existing `!connected + off` case (which correctly catches
stopped agents as "not running") and before the `timeout`/`disconnected`
cases.
```
Status + Lifecycle → Health reason
──────────────────────────────────────────────────────
any !connected + off → "agent is not running"
connecting + created/starting → "agent has not yet connected" ← NEW
timeout + any → "agent is taking too long to connect"
disconnected + any → "agent has lost connection"
connected + start_error → "agent startup script exited with an error"
connected + shutting_down → "agent is shutting down"
connected + ready/starting → healthy
```
The frontend already handles this case — `getAgentHealthIssue()` returns
"Workspace agent is still connecting" with `severity: "info"` for
unhealthy workspaces with connecting agents.
## Test changes
- **Healthy test**: now actually connects the agent via `agenttest.New`
before asserting health (previously passed due to the bug).
- **New Connecting test**: verifies a never-connected agent is correctly
marked unhealthy.
- **Mixed health test**: connects a1 and waits for the mixed state
(`a1.Healthy && !workspace.Healthy`) to avoid a race where both agents
are initially connecting.
- **Sub-agent excluded test**: connects the parent agent and waits for
it to be healthy before creating the sub-agent.
- **TestWorkspaceAgent/Connect**: flipped assertion to `Health.Healthy
== false` for a `dbfake` agent that never connects.
<details>
<summary>Review notes</summary>
### Known follow-up
The `healthy:false` workspace search filter maps to `[disconnected,
timeout]` and does not include `connecting`. This is a pre-existing gap
that is now more consequential — a workspace unhealthy solely due to a
connecting agent won't appear in `healthy:false` results. Worth a
follow-up issue.
### Deep review findings addressed
| Finding | Severity | Status |
|---------|----------|--------|
| Mixed health test race (all 3 reviewers) | P2 | Fixed — tightened
`Eventually` condition |
| `TestWorkspaceAgent/Connect` assertion break | P1 | Fixed — flipped
assertion |
| CLI renders red for connecting agents | Obs | Acknowledged — design
trade-off, accurate but visually strong for transient state |
| Switch case ordering overlap | Obs | Documented with inline comment |
</details>
> 🤖 This PR was created with the help of Coder Agents, and needs a human
review. 🧑💻
Replace Tooltip with `HelpPopover` in the "New workspace" page header.
`HelpPopover` supports interactive content like links and provides
better layout control, making it a better fit for this use case.
Adds backend validation for user secret environment variable names and file paths.
Env name validation enforces POSIX naming rules and blocks a deliberately aggressive denylist of reserved names and prefixes. The denylist errs on the side of blocking too much since it's easier to remove entries later than to add them after users have created conflicting secrets.
File path validation requires paths to start with ~/ or /.
Dev and RC builds now show diagonal warning stripes in the navbar plus a
centered version badge, making it impossible to miss which build you're
running.
**Devel build:** amber "warning" from theme
**RC build:** sky "pending" from theme
> 🤖 Written by a Coder Agent. Will be reviewed by a human.
Adds an optional `CreatedAt` timestamp to `tool-call` and `tool-result`
`ChatMessagePart` variants so the frontend can compute tool execution
duration (`result.created_at - call.created_at`).
Timestamps are recorded at the correct moments in the chatloop:
- **Tool-call**: when the model stream emits the tool call
- **Tool-result**: when tool execution completes (or is interrupted)
These are passed through `PersistedStep.PartCreatedAt` so the
persistence layer can apply accurate timestamps to stored parts.
SSE-published parts also carry `CreatedAt` for real-time display.
Old persisted messages without `created_at` deserialize to `nil` — fully
backward compatible.
<details><summary>Implementation notes (Coder Agents
generated)</summary>
### Why not stamp in `PartFromContent`?
`PartFromContent` is called both for SSE publishing (correct timing) and
during persistence (wrong timing — both tool-call and tool-result would
get the same "persistence time" timestamp, yielding ~0 duration).
Instead, timestamps are captured in the chatloop at the right moments
and carried through `PersistedStep.PartCreatedAt` as a
`map[string]time.Time` keyed by `"call:<id>"` / `"result:<id>"`.
### Interrupted tool calls
`persistInterruptedStep` also stamps `CreatedAt` on synthetic error
results for cancelled/interrupted tool calls, so partial duration is
available.
### Files changed
| File | Change |
|------|--------|
| `codersdk/chats.go` | Add `CreatedAt *time.Time` field |
| `codersdk/chats_test.go` | JSON round-trip test |
| `coderd/database/dbtime/dbtime.go` | Add `TimePtr` helper |
| `coderd/x/chatd/chatloop/chatloop.go` | Track timestamps, pass through
`PersistedStep` |
| `coderd/x/chatd/chatd.go` | Apply timestamps during persistence |
| `coderd/x/chatd/chatprompt/chatprompt_test.go` | Verify
`PartFromContent` does NOT stamp |
| `site/src/api/typesGenerated.ts` | Auto-generated |
</details>
---------
Co-authored-by: Ethan <39577870+ethanndickson@users.noreply.github.com>
Adds client-executed dynamic tools to the chat API. Dynamic tools are
declared by the client at chat creation time, presented to the LLM
alongside built-in tools, but executed by the client rather than chatd.
This enables external systems (Slack bots, IDE extensions, Discord bots,
CI/CD integrations) to plug custom tools into the LLM chat loop without
modifying chatd's built-in tool set.
Modeled after OpenAI's Assistants API: the chat pauses with
`requires_action` status when the LLM calls a dynamic tool, the client
POSTs results back via `POST /chats/{id}/tool-results`, and the chat
resumes.
See [this example](https://github.com/coder/coder-slackbot-poc) as a
reference for how this is used. It's highly-configurable, which would
enable creating chats from webhooks, periodically polling, or running as
a Slackbot.
<details>
<summary>Design context</summary>
### Architecture
The chatloop **exits** when it encounters dynamic tools and
**re-enters** when results arrive. No blocking channels, no pubsub for
tool results, no in-memory registry. The DB is the only coordination
mechanism.
```
Phase 1 (chatloop):
LLM response → execute built-in tools only →
Persist(assistant + built-in results) →
status = requires_action → chatloop exits
Phase 2 (POST /tool-results):
Persist(dynamic tool results) →
status = pending → wakeCh → chatloop re-enters
```
### Validation (POST /tool-results)
1. Chat status must be `requires_action` (409 if not)
2. Read chat's `dynamic_tools` → set of dynamic tool names
3. Read last assistant message → extract tool-call parts matching
dynamic tool names
4. Submitted tool_call_ids must match exactly (400 for missing/extra)
5. Persist tool-result message parts, set status to `pending`, signal
wake
### Idempotency
Tool call IDs scoped per LLM step. State machine (`requires_action` →
`pending`) is the guard. First POST wins, subsequent get 409.
### Mixed tool calls
When the LLM calls both built-in and dynamic tools in one step, built-in
tools execute immediately. Their results are persisted in phase 1.
Dynamic tool results arrive via POST in phase 2. The LLM sees all
results when the chatloop resumes.
</details>
> 🤖 Generated by Coder Agents
- Remove Kyleosophy alternative completion chimes (keeps original chime
intact)
- Extract 5 sub-components from the 717-line god component:
- `PersonalInstructionsSettings` — user prompt textarea form
- `SystemInstructionsSettings` — admin system prompt + TextPreviewDialog
- `VirtualDesktopSettings` — admin desktop toggle
- `WorkspaceAutostopSettings` — admin autostop toggle + duration form
- `RetentionPeriodSettings` — admin retention toggle + number input
- Parent is now a ~160-line layout shell
- `isAnyPromptSaving` coupling preserved via prop
- Add `docs/plans/` to `.gitignore`
> 🤖 Written by a Coder Agent. Reviewed by a human.
Fixes https://github.com/coder/coder/issues/23910
Adds periodic cleanup of chats and chat files to the dbpurge background
goroutine, with a configurable retention period exposed in the Agent
settings UI.
> 🤖 Written by a Coder Agent. Reviewed by a human.