---
title: agent.ts
description: Set the agent's runtime config in agent.ts with defineAgent, including the model and compaction.
---

# agent.ts



An agent's `agent.ts` calls `defineAgent` (from `eve`) to set its runtime config.

## Set the model

A typical config selects a model:

```ts title="agent/agent.ts"
import { defineAgent } from "eve";

export default defineAgent({
  model: "anthropic/claude-opus-4.8",
});
```

The root `agent.ts` can be omitted when no runtime config is needed. In that case, eve defaults
to `anthropic/claude-sonnet-4.6`. When `agent.ts` is present, `model` is required.

`model` accepts a gateway model id string, which routes through the [Vercel AI Gateway](https://vercel.com/docs/ai-gateway). To call a provider directly and configure the model in code, pass a provider-authored `LanguageModel`.

Provider-specific AI SDK packages are regular project dependencies. A fresh `eve init` app includes the core `ai` package, but it does not install every provider package. Install the provider package you import and set that provider's API key in the environment (for example, `ANTHROPIC_API_KEY` for `@ai-sdk/anthropic`):

```bash
npm install @ai-sdk/anthropic
```

```ts title="agent/agent.ts"
import { anthropic } from "@ai-sdk/anthropic";
import { defineAgent } from "eve";

export default defineAgent({
  model: anthropic("claude-opus-4.8"),
});
```

Model use is subject to the terms, data-processing commitments, retention behavior, and available controls of the selected provider and routing path. Review the [AI Gateway model catalog](https://vercel.com/ai-gateway/models) for gateway-routed models, and review the provider's terms when you configure a direct `LanguageModel`.

## Compaction

Compaction summarizes older turns as the conversation approaches the model's context window limit. It's on by default, so the only setting to tune is when it kicks in. Lower `thresholdPercent` to compact sooner:

```ts title="agent/agent.ts"
import { defineAgent } from "eve";

export default defineAgent({
  model: "anthropic/claude-opus-4.8",
  compaction: {
    thresholdPercent: 0.75, // default 0.9
  },
});
```

See [Default harness](./concepts/default-harness#compaction) for how the loop applies it.
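
As a rough mental model, and assuming the threshold is compared against the fraction of the context window already in use, the trigger could be sketched like this. `shouldCompact` and its parameters are illustrative names, not part of eve's API, and the real token accounting may differ:

```ts
// Hypothetical helper, not an eve export. It only illustrates how a
// fractional threshold maps onto a token count.
function shouldCompact(
  usedTokens: number,
  contextWindow: number,
  thresholdPercent = 0.9, // the documented default
): boolean {
  if (contextWindow <= 0) {
    throw new RangeError("contextWindow must be positive");
  }
  // Assumption: compaction starts once usage reaches the threshold.
  return usedTokens / contextWindow >= thresholdPercent;
}

// With a 200,000-token window and 160,000 tokens in use (80%):
const atDefault = shouldCompact(160000, 200000); // false, 0.9 not reached
const atLower = shouldCompact(160000, 200000, 0.75); // true, 0.75 passed
```

Under this reading, lowering the value trades a little summarization earlier for more headroom before the window fills.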

## Workflow world

By default, eve selects the Workflow SDK world for the host: Vercel Workflow on
Vercel, and the SDK's local world in local development or `eve start`. Advanced
self-hosted deployments can select the Workflow world package to use from the
root `agent.ts`:

```ts title="agent/agent.ts"
import { defineAgent } from "eve";

export default defineAgent({
  model: "anthropic/claude-opus-4.8",
  experimental: {
    workflow: {
      world: "@workflow/world-postgres",
    },
  },
});
```

Install that package in your app. It should export a default factory or
`createWorld()` function.

Put credentials and host-specific options in runtime environment variables read
by the world package, not in `agent.ts`. If the installed package must stay
external in hosted output, list it in `build.externalDependencies`.

## Other defineAgent fields

`defineAgent` takes a few more fields, all optional. For the exported types, see the [TypeScript API](./reference/typescript-api).

| Field          | Type                                                    | Default     | Description                                                                                                                                                                                                                                                                                                                                                                       |
| -------------- | ------------------------------------------------------- | ----------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `modelOptions` | `AgentModelOptionsDefinition`                           | none        | Provider option overrides forwarded to the model call.                                                                                                                                                                                                                                                                                                                            |
| `experimental` | `{ codeMode?: boolean; workflow?: { world?: string } }` | flags unset | Opt-in flags that can change or disappear in any release. Treat them as unstable. `codeMode` routes executable tools through a sandboxed code-execution wrapper, where the model writes JavaScript that calls the tools inside the [sandbox](./sandbox). `workflow.world` selects the Workflow world package backing session state, queues, hooks, and streams on the root agent. |
| `outputSchema` | Standard Schema or a JSON Schema object                 | none        | Structured return type for task-mode runs (a subagent, schedule, or remote job). Interactive conversation turns ignore it unless the client supplies a per-message schema.                                                                                                                                                                                                        |
| `build`        | `{ externalDependencies?: string[] }`                   | none        | Hosted-build packaging controls. `externalDependencies` keeps listed packages external while eve compiles authored modules such as tools and channels, and traces those packages into the hosted output.                                                                                                                                                                          |

`codeMode` is experimental and may change or be removed.

`externalDependencies` is a packaging control only. It keeps selected packages as runtime dependencies in the hosted output; it does not authorize, configure, or review any third-party service those packages may call.

## Where adjacent settings live

| Concern                       | Lives in                                                                         |
| ----------------------------- | -------------------------------------------------------------------------------- |
| Instructions prompt           | `agent/instructions.md`, [Instructions](./instructions)                          |
| Per-tool approval (HITL)      | `agent/tools/*.ts`, [Tools](./tools)                                             |
| Inbound auth & network policy | the channel layer, [Auth & route protection](./guides/auth-and-route-protection) |
| Sandbox / workspace           | `agent/sandbox/`, [Sandbox](./sandbox)                                           |
| Telemetry & debugging         | `agent/instrumentation.ts`, [Instrumentation](./guides/instrumentation)          |

## What to read next

* [Default harness](./concepts/default-harness) for the loop and built-in tools this config drives
* [TypeScript API](./reference/typescript-api) for every `defineAgent` field and type
* [Subagents](./subagents) for the `description` requirement and child-agent config


---

For a semantic overview of all documentation, see [/sitemap.md](/sitemap.md)

For an index of all available documentation, see [/llms.txt](/llms.txt)

For agent-facing discovery, including API and MCP surfaces, see [/agents.md](/agents.md)

---
title: Connections
description: Expose external MCP and OpenAPI servers to the model, with connection tokens the model never sees.
---

# Connections



A connection wires an agent into an external server you don't author, either an MCP server (Linear, GitHub, a warehouse) or any HTTP API with an OpenAPI document. eve handles the parts you'd otherwise hand-roll: discovering the remote tools, surfacing them to the model, and brokering auth.

Connections live under `agent/connections/`. The runtime name comes from the filename, so `agent/connections/linear.ts` registers as `"linear"`. The model never sees a connection's URL or credentials. It discovers tools through the built-in `connection__search` and calls them by their qualified name, `connection__<connection>__<tool>` (e.g. `connection__linear__list_issues`).
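
The naming rules above can be sketched as two small helpers. These are illustrative functions written for this page, not eve exports. They only reproduce the documented pattern so you can predict the name the model will call:

```ts
// Illustrative helpers, not part of eve.

// "agent/connections/linear.ts" -> "linear"
function connectionNameFromPath(filePath: string): string {
  const fileName = filePath.split("/").pop() ?? "";
  return fileName.replace(/\.ts$/, "");
}

// Builds the documented connection__<connection>__<tool> form.
function qualifiedToolName(connection: string, tool: string): string {
  return `connection__${connection}__${tool}`;
}

const connection = connectionNameFromPath("agent/connections/linear.ts");
const toolName = qualifiedToolName(connection, "list_issues");
// toolName is "connection__linear__list_issues"
```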

## MCP connections

`defineMcpClientConnection` points at an MCP server. Supply a `url` and a `description`:

```ts title="agent/connections/linear.ts"
import { defineMcpClientConnection } from "eve/connections";

export default defineMcpClientConnection({
  url: "https://mcp.linear.app/sse",
  description: "Linear workspace: issues, projects, cycles, and comments.",
  auth: {
    getToken: async () => ({ token: process.env.LINEAR_API_TOKEN! }),
  },
});
```

The `url` must speak Streamable HTTP or SSE. Write the `description` for the model, not for yourself. It shows up in `connection__search`, and the model uses it to decide which connection to query.

### Static-token auth

`getToken` returns a `TokenResult` (`{ token, expiresAt? }`), and eve sends it as `Authorization: Bearer <token>` on every request. Because it runs on each connection attempt, you can mint a fresh token from wherever you keep secrets, including an env var, a secrets manager, an internal vault, or your own OAuth exchange. If the token has a known TTL, set `expiresAt` (milliseconds since epoch) and eve refreshes ahead of time rather than waiting for a `401`.
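
Because `expiresAt` is an absolute timestamp in milliseconds, a token source that reports a relative TTL in seconds needs a conversion. The sketch below assumes a hypothetical secret store that returns `{ value, ttlSeconds }`. Both that shape and the `toTokenResult` helper are illustrative, and `TokenResult` is redeclared locally rather than imported:

```ts
// Local stand-in for the documented `{ token, expiresAt? }` shape.
type TokenResult = { token: string; expiresAt?: number };

// Hypothetical response from your own secret store or OAuth exchange.
type IssuedToken = { value: string; ttlSeconds: number };

// Converts a relative TTL in seconds into the absolute millisecond
// timestamp that `expiresAt` expects.
function toTokenResult(
  issued: IssuedToken,
  nowMs: number = Date.now(),
): TokenResult {
  return {
    token: issued.value,
    expiresAt: nowMs + issued.ttlSeconds * 1000,
  };
}

const result = toTokenResult(
  { value: "example-token", ttlSeconds: 3600 },
  1700000000000,
);
// result.expiresAt is 1700003600000, one hour after the given start time
```

A `getToken` implementation could return this value directly after fetching the token.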

When `getToken` is the only auth, `principalType` defaults to `"app"`: one credential shared across all sessions. Switch to `principalType: "user"` when each end-user carries their own token.

eve resolves and caches connection tokens per step; they never land in conversation history or reach the model.

### No auth

Drop `auth` entirely for servers that need no token, such as a localhost server during development or a public one:

```ts
import { defineMcpClientConnection } from "eve/connections";

export default defineMcpClientConnection({
  url: "http://localhost:3001/mcp",
  description: "Local dev server.",
});
```

We recommend using no-auth connections only for services that are intentionally public, local-only, or otherwise protected outside eve. Do not use no-auth connections for sensitive third-party services.

### Headers

Use `headers` when the server wants a non-Bearer scheme (an API-key header) or extra configuration. Headers stack on top of `auth`:

```ts
import { defineMcpClientConnection } from "eve/connections";

export default defineMcpClientConnection({
  url: "https://example.com/mcp",
  description: "Example service.",
  headers: { "X-Api-Key": process.env.EXAMPLE_API_KEY! },
});
```

### Tool filters

To narrow which remote tools the model sees, set exactly one of `tools.allow` or `tools.block`. Filtered-out tools do not appear in `connection__search`:

```ts
import { defineMcpClientConnection } from "eve/connections";

export default defineMcpClientConnection({
  url: "https://mcp.linear.app/sse",
  description: "Linear: read-only.",
  auth: { getToken: async () => ({ token: process.env.LINEAR_API_TOKEN! }) },
  tools: { allow: ["search_issues", "get_issue"] },
});
```
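
To make the "exactly one of" rule concrete, here is a minimal sketch of how an allow-list or block-list narrows a set of remote tool names. `ToolFilter` and `visibleTools` are illustrative names declared for this example. eve's own validation and matching may differ in detail:

```ts
// Local stand-in for the documented `tools` option, not an eve export.
type ToolFilter = { allow?: string[]; block?: string[] };

function visibleTools(remoteTools: string[], filter?: ToolFilter): string[] {
  if (filter === undefined) return remoteTools;
  const { allow, block } = filter;
  if (allow !== undefined && block === undefined) {
    // Allow-list: keep only the named tools.
    return remoteTools.filter((name) => allow.indexOf(name) !== -1);
  }
  if (block !== undefined && allow === undefined) {
    // Block-list: keep everything except the named tools.
    return remoteTools.filter((name) => block.indexOf(name) === -1);
  }
  throw new Error("Set exactly one of tools.allow or tools.block");
}

const remote = ["search_issues", "get_issue", "create_issue"];
const readOnly = visibleTools(remote, { allow: ["search_issues", "get_issue"] });
// readOnly is ["search_issues", "get_issue"]
const noWrites = visibleTools(remote, { block: ["create_issue"] });
// noWrites is ["search_issues", "get_issue"]
```

An allow-list is usually the safer choice for third-party servers, because tools the server adds later stay hidden until you list them.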

### Per-connection approval

To put every tool a connection serves behind a human, use the helpers from `eve/tools/approval`:

```ts
import { defineMcpClientConnection } from "eve/connections";
import { once } from "eve/tools/approval";

export default defineMcpClientConnection({
  url: "https://mcp.linear.app/sse",
  description: "Linear workspace.",
  auth: { getToken: async () => ({ token: process.env.LINEAR_API_TOKEN! }) },
  approval: once(),
});
```

`never()` lets every call through, `once()` asks for approval the first time in a session, and `always()` asks every time. The pause and resume is the same human-in-the-loop flow covered in [Tools](./tools).
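
The three policies reduce to a small decision, sketched below. This models only the behavior described above and assumes the runtime tracks whether an approval was already granted in the current session. `ApprovalPolicy` and `needsApproval` are illustrative names, not eve exports:

```ts
// Illustrative model of the documented policies, not eve's implementation.
type ApprovalPolicy = "never" | "once" | "always";

function needsApproval(
  policy: ApprovalPolicy,
  approvedEarlierInSession: boolean,
): boolean {
  switch (policy) {
    case "never":
      return false; // every call goes through
    case "once":
      return !approvedEarlierInSession; // prompt only the first time
    case "always":
      return true; // prompt on every call
  }
}

const firstCall = needsApproval("once", false); // true
const laterCall = needsApproval("once", true); // false
```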

For connection tools that can create, modify, delete, transmit, purchase, message, or access sensitive data, use approval, tool allow-lists, or other safeguards appropriate to the action.

## OpenAPI connections

`defineOpenAPIConnection` turns any OpenAPI 3.x document into connection tools, one per operation. Pass an HTTPS URL eve fetches at runtime, or an inline parsed object:

```ts title="agent/connections/petstore.ts"
import { defineOpenAPIConnection } from "eve/connections";

export default defineOpenAPIConnection({
  spec: "https://petstore3.swagger.io/api/v3/openapi.json",
  description: "Pet store inventory and orders.",
  auth: { getToken: async () => ({ token: process.env.PETSTORE_TOKEN! }) },
});
```

Each operation becomes `connection__<connection>__<operationId>` (e.g. `connection__petstore__getInventory`). When an operation has no `operationId`, eve derives a deterministic `<method>_<sanitized-path>` name instead.
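
The exact sanitization rules are not spelled out here, so the following is only a sketch of what a deterministic fallback could look like. It assumes the method is lowercased and every run of non-alphanumeric characters in the path collapses to a single underscore. Names that eve actually generates may differ, so confirm them with `connection__search`:

```ts
// Hypothetical fallback naming, written for illustration only.
function fallbackOperationName(method: string, path: string): string {
  const sanitizedPath = path
    .replace(/[^A-Za-z0-9]+/g, "_") // "/", "{", "}" and similar become "_"
    .replace(/^_+|_+$/g, ""); // trim leading and trailing underscores
  return `${method.toLowerCase()}_${sanitizedPath}`;
}

function operationToolName(
  connection: string,
  method: string,
  path: string,
  operationId?: string,
): string {
  // Prefer the document's operationId; derive a name only when it is absent.
  const name = operationId ?? fallbackOperationName(method, path);
  return `connection__${connection}__${name}`;
}

const named = operationToolName("petstore", "GET", "/store/inventory", "getInventory");
// named is "connection__petstore__getInventory"
const derived = operationToolName("petstore", "GET", "/pet/{petId}");
// derived is "connection__petstore__get_pet_petId" under this sketch's assumed rule
```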

`auth`, `headers`, and `approval` work exactly as they do for MCP. There are two fields specific to OpenAPI:

| Field        | Purpose                                                                                                                 |
| ------------ | ----------------------------------------------------------------------------------------------------------------------- |
| `baseUrl`    | Base URL operation paths resolve against. Optional; defaults to the document's first usable `servers` entry.            |
| `operations` | Filter keyed on `operationId` (`allow` or `block`). Mirrors `tools` on MCP connections, but names operations not tools. |

## Interactive OAuth via Vercel Connect

When the server uses OAuth and you want each end-user to sign in through their own browser, turn on interactive authorization with [Vercel Connect](https://vercel.com/docs/connect). The `connect()` helper from `@vercel/connect/eve` handles consent, encrypted token storage, and refresh, then hooks all of that into eve's authorization flow:

```ts title="agent/connections/linear.ts"
import { connect } from "@vercel/connect/eve";
import { defineMcpClientConnection } from "eve/connections";

export default defineMcpClientConnection({
  url: "https://mcp.linear.app/sse",
  description: "Linear workspace: issues, projects, cycles, and comments.",
  auth: connect("linear"),
});
```

`"linear"` is the UID you chose when registering the Connect client. Connect-managed OAuth is user-scoped by default, so the runtime resolves the per-user token before each tool call. The full setup (Connect client provisioning, project linking, the runtime consent flow) lives in [Auth & route protection](./guides/auth-and-route-protection).

## Self-hosted interactive OAuth

To run your own OAuth, use `defineInteractiveAuthorization` from `eve/connections`, which takes a three-method form and needs no Vercel Connect. eve mints a callback URL, parks (durably suspends) the turn on a framework-owned webhook, and resumes once the token comes back. Interactive auth is always `principalType: "user"`, and the factory pins that for you.

```ts title="agent/connections/linear.ts"
import {
  ConnectionAuthorizationRequiredError,
  defineInteractiveAuthorization,
  defineMcpClientConnection,
} from "eve/connections";

export default defineMcpClientConnection({
  url: "https://mcp.linear.app/sse",
  description: "Linear workspace.",
  auth: defineInteractiveAuthorization<{ verifier: string }>({
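    // Probed before every tool call. Return a token to run the tool;
    // throw `ConnectionAuthorizationRequiredError` to start the consent flow.
    // `lookupCachedToken` (like `makePkceVerifier`, `buildAuthorizeUrl`, and
    // `exchangeCode` below) is your own helper, not an eve export.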
    getToken: async ({ principal }) => {
      const token = await lookupCachedToken(principal);
      if (!token) throw new ConnectionAuthorizationRequiredError("linear");
      return { token };
    },
    // Runs in a durable step. Return the user-facing `challenge` and
    // an optional `resume` value the runtime journals across the park.
    startAuthorization: async ({ callbackUrl }) => {
      const verifier = makePkceVerifier();
      return {
        challenge: { url: buildAuthorizeUrl(callbackUrl, verifier) },
        resume: { verifier },
      };
    },
    // Runs when the provider redirects to the callback URL. `resume` is
    // typed as `{ verifier: string } | undefined`; `callback.params`
    // holds the IdP's returned query/body params.
    completeAuthorization: async ({ resume, callback }) => {
      const token = await exchangeCode(resume!.verifier, callback.params.code!);
      return { token };
    },
  }),
});
```

`getToken` runs before every tool call. `startAuthorization` and `completeAuthorization` are both-or-neither: provide one without the other and you get a definition error. The `challenge` rides along verbatim on the `authorization.required` event. Its fields:

| Field          | Purpose                                                                                   |
| -------------- | ----------------------------------------------------------------------------------------- |
| `url`          | The authorize URL for redirect or device flows.                                           |
| `userCode`     | The code the user enters, for device flows.                                               |
| `instructions` | The call to action when there's no URL.                                                   |
| `displayName`  | Human-readable provider name channels show on the sign-in affordance (e.g. "Salesforce"). |

Drop `resume` when the provider keeps flow state server-side, so nothing has to cross the step boundary.

`displayName` is presentation-only. The connection's path-derived name still keys the authorization scope, token cache, and callback URL. You can also set `displayName` on the `auth` definition itself (e.g. `auth: { ...connect("sfdc"), displayName: "Salesforce" }`); that definition-level value wins over one the strategy stamps on the challenge, and channels fall back to title-casing the connection name when neither is set.

### Signaling authorization state

Two error classes drive the consent flow. Throw them from `getToken` or `completeAuthorization`; both are exported from `eve/connections`.

* `ConnectionAuthorizationRequiredError(connectionName)`: the user must authorize. Throw it from `getToken` to emit `authorization.required` and kick off the flow.
* `ConnectionAuthorizationFailedError(connectionName, { reason?, retryable? })`: authorization failed. `reason` is a stable machine-readable code (e.g. `"access_denied"`) that shows up on the `authorization.completed` event and the failed tool result. `retryable` defaults to `true`; set it to `false` for terminal cases like user denial so the runtime stops re-prompting.

```ts
import { ConnectionAuthorizationFailedError } from "eve/connections";

throw new ConnectionAuthorizationFailedError("linear", {
  reason: "access_denied",
  retryable: false,
});
```

To narrow a caught error, use `isConnectionAuthorizationRequiredError(err)` and `isConnectionAuthorizationFailedError(err)`. They match on `err.name`, which is why they survive the class-identity split `instanceof` can hit after bundling.
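
The class-identity split is easy to reproduce without eve. The sketch below creates two copies of a same-named error class, as can happen when a bundler duplicates a module, and compares `instanceof` with a name-based check. The class here is a local stand-in and `isRequiredErrorByName` is an illustrative guard, not eve's implementation:

```ts
// Each call returns a distinct class object with the same name, which
// mimics one module being bundled twice.
function makeErrorClass() {
  return class ConnectionAuthorizationRequiredError extends Error {
    constructor(public connectionName: string) {
      super(`Authorization required for ${connectionName}`);
      this.name = "ConnectionAuthorizationRequiredError";
    }
  };
}

const CopyA = makeErrorClass();
const CopyB = makeErrorClass();

// Name-based guard: does not depend on which copy created the error.
function isRequiredErrorByName(err: unknown): boolean {
  return (
    err instanceof Error &&
    err.name === "ConnectionAuthorizationRequiredError"
  );
}

const thrown: unknown = new CopyA("linear");
const matchesByInstance = thrown instanceof CopyB; // false: different class object
const matchesByName = isRequiredErrorByName(thrown); // true
```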

### Handling a revoked token mid-call

`getToken` only runs *before* a tool call, so a grant revoked while a tool is mid-flight first surfaces as a downstream `401` inside your `execute`. A plain throw there is only a tool error, so the model sees a failure and the cached bearer sticks around. Instead, map a provider `401` to `ctx.requireAuth()` (or rethrow `ConnectionAuthorizationRequiredError`). eve then evicts the rejected token from its per-step cache and re-runs the consent flow with a fresh one, exactly as it does for a connection whose server rejects the bearer.

```ts title="agent/tools/list_issues.ts"
import { defineTool } from "eve/tools";
import { z } from "zod";

export default defineTool({
  description: "List open Linear issues.",
  inputSchema: z.object({}),
  async execute(_input, ctx) {
    const { token } = await ctx.getToken();
    const res = await fetch("https://api.linear.app/graphql", {
      headers: { authorization: `Bearer ${token}` },
    });
    // The grant was revoked since getToken ran: re-challenge instead of
    // returning a dead-token error to the model.
    if (res.status === 401) ctx.requireAuth();
    return await res.json();
  },
});
```

### Authorization and approval together

A tool can require both sign-in (`auth`) and a human approval. The approval gate runs before the tool's `execute`, so the order the user sees is **approve, then sign in**. eve records the approval on session state the moment it's granted, and that record survives the sign-in park, so when the turn resumes after authorization the tool is not put through approval again. You get one approval and one sign-in, never a double prompt.

## What to read next

* [Integrations](/integrations): browse every channel and connection eve ships, in one gallery.
* [Tools](./tools): authored tools live alongside connection-provided tools; the same approval helpers apply.
* [Auth & route protection](./guides/auth-and-route-protection): the full interactive-OAuth flow with Vercel Connect.
* [Security model](./concepts/security-model): how connection credentials stay out of the model's reach.



---
title: Getting Started
description: Install eve, scaffold your first agent, give it a tool, and run it locally.
---

# Getting Started



eve is a filesystem-first framework for durable agents. You write capabilities under `agent/`, and eve runs the model loop, persists every session, and serves the agent over HTTP and platform channels. You'll scaffold an app, add a tool, run it locally, then create, stream, and continue a session over HTTP.

<Callout>
  eve is currently in beta and subject to the [Vercel beta
  terms](https://vercel.com/docs/release-phases/public-beta-agreement); the framework, APIs,
  documentation, and behavior may change before general availability.
</Callout>

## Prerequisites

* Node 24 or newer
* npm (bundled with Node)
* A model credential (see below)

The scaffold's default model is `anthropic/claude-sonnet-4.6`, which routes through the Vercel AI Gateway. Set one of these before you run the agent:

* A gateway model id needs `AI_GATEWAY_API_KEY`, or a `VERCEL_OIDC_TOKEN` pulled with `vercel link`.
* A direct provider model uses that provider's AI SDK package and API key. For example, `anthropic("claude-...")` from `@ai-sdk/anthropic` needs `ANTHROPIC_API_KEY`.

You are responsible for selecting a model, provider, and channel appropriate for your data and use case, and for complying with each provider's terms (as listed per model) and data-processing requirements.

If you skip this, the dev TUI flags the missing credential and its `/model` command walks you through pasting a key or linking a project.

## Quick start

`npx` runs `eve init` without installing eve first:

```bash
npx eve@latest init my-agent
```

The command:

* Creates a child directory, chooses the package manager from the current workspace or the launcher, and sets eve's default model
* Installs dependencies and initializes Git
* Starts the development server and opens the interactive [terminal UI](./guides/dev-tui)

Type a message and watch the model loop run. Pass `--channel-web-nextjs` to add the Web Chat application. Every app ships the built-in HTTP channel (`agent/channels/eve.ts`) regardless.

`eve init` holds the terminal, so stop it with Ctrl+C to get your shell back before editing the generated agent. The command does not create a Vercel project or deploy.

To add eve to an existing project, run `eve init .` from a directory that already has a `package.json` and no `agent/` files yet. eve adds the missing `eve`, `ai`, and `zod` dependencies without touching anything else the project owns. The eve dependency and the Node engine come from the same release. eve pins `engines.node` to the lowest major that release supports (for example `24.x`). It keeps an existing range only when every version that range allows stays within that major; otherwise it replaces the range and prints a warning.

## Manual installation

To wire eve into an existing app by hand instead of using `eve init`, first declare a compatible Node runtime in `package.json`:

```json
{
  "engines": {
    "node": "24.x"
  }
}
```

Then install the dependencies and author the two files the runtime needs. The `eve init` scaffold adds `ai` and `zod` for you; by hand you install all three:

```bash
npm install eve@latest ai zod
```

### Project files

A minimal agent is two files; you add tools as you need them.

`agent/instructions.md` is the always-on system prompt:

```md
You are a concise assistant. Use tools when they are available.
```

`agent/agent.ts` holds runtime config:

```ts
import { defineAgent } from "eve";

export default defineAgent({
  model: "anthropic/claude-sonnet-4.6",
});
```

Before using real customer data, confirm the selected model provider's terms, routing path, and retention settings are appropriate for that data.

Even at this size the agent can already do real work. The default harness gives it file, shell, web, and delegation tools out of the box. See [Default harness](./concepts/default-harness) for the full set and how to override or disable any of them.

### Add your first tool

The filename becomes the tool name the model sees, and it must be snake\_case ASCII. Create `agent/tools/get_weather.ts`:

```ts
import { defineTool } from "eve/tools";
import { z } from "zod";

// The model sees this tool as `get_weather`, from the filename.
export default defineTool({
  description: "Get the current weather for a city.",
  inputSchema: z.object({ city: z.string().min(1) }),
  async execute({ city }) {
    return { city, condition: "Sunny", temperatureF: 72 };
  },
});
```

Tools run in your app runtime with full `process.env`, not inside the [sandbox](./sandbox). More in [Tools](./tools).
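
If you generate tool files from a script, it can help to check the filename rule up front. The sketch below assumes "snake\_case ASCII" means lowercase letters and digits separated by single underscores, starting with a letter. That regular expression is an interpretation, not eve's published validation, and `toolNameFromFile` is an illustrative helper:

```ts
// Assumed reading of "snake_case ASCII"; eve's own check may be stricter
// or looser than this pattern.
const SNAKE_CASE_ASCII = /^[a-z][a-z0-9]*(_[a-z0-9]+)*$/;

// "agent/tools/get_weather.ts" -> "get_weather"
function toolNameFromFile(filePath: string): string {
  const fileName = filePath.split("/").pop() ?? "";
  const name = fileName.replace(/\.ts$/, "");
  if (!SNAKE_CASE_ASCII.test(name)) {
    throw new Error(`Tool filename "${name}" is not snake_case ASCII`);
  }
  return name;
}

const weatherTool = toolNameFromFile("agent/tools/get_weather.ts");
// weatherTool is "get_weather"
```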

## Run the app

A scaffolded app has a `dev` script, so from the app root run:

```bash
npm run dev
```

The manual path doesn't create a `dev` script. Run the binary through `npx` instead:

```bash
npx eve dev
```

Other commands the eve binary provides (prefix each with `npx`, or add a matching package.json script):

* `eve info`: show the active routes and compiled artifacts
* `eve build`: compile the agent into `.eve/` and build the host output
* `eve start`: serve the built output
* `eve dev`: start the local runtime and open the interactive [terminal UI](./guides/dev-tui)

In the dev TUI, type a message that needs the tool and watch the steps arrive in order: first the `get_weather` call, then its result, then the reply.

The same CLI can point at a deployment. `npx eve dev https://your-app.vercel.app` drives a deployed app, which is handy for preview and production smoke tests. See [Deployment](./guides/deployment).

## Send a message

Every eve app exposes the same stable HTTP API. Start a durable session:

```bash
curl -X POST http://127.0.0.1:3000/eve/v1/session \
  -H 'content-type: application/json' \
  -d '{"message":"What is the weather in Brooklyn?"}'
```

The response comes back with two things you'll reuse:

* a `continuationToken` in the JSON body, to resume this conversation
* an `x-eve-session-id` header that identifies the run to stream
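
In a script, pulling both values out of the response could look like the sketch below. It relies only on the header name and body field named above. `readSessionHandles` and the structural `HeaderReader` type are illustrative, chosen so the function works with any fetch-style headers object:

```ts
// Minimal structural type, satisfied by the `headers` of a fetch Response.
type HeaderReader = { get(name: string): string | null };

type SessionHandles = { sessionId: string; continuationToken: string };

function readSessionHandles(
  headers: HeaderReader,
  body: unknown,
): SessionHandles {
  const sessionId = headers.get("x-eve-session-id");
  const token =
    typeof body === "object" && body !== null
      ? (body as { continuationToken?: unknown }).continuationToken
      : undefined;
  if (!sessionId || typeof token !== "string") {
    throw new Error(
      "Response is missing the x-eve-session-id header or continuationToken",
    );
  }
  return { sessionId, continuationToken: token };
}
```

With `fetch`, this would be called as `readSessionHandles(res.headers, await res.json())`.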

## Stream the session

Attach to the session stream:

```bash
curl http://127.0.0.1:3000/eve/v1/session/<sessionId>/stream
```

The stream is NDJSON, served as `application/x-ndjson; charset=utf-8`. For this run you'll see a handful of lifecycle events:

* `session.started`
* `actions.requested` (the `get_weather` call)
* `action.result`
* `message.completed` (the reply)
* `session.completed`

`reasoning.appended` and `message.appended` are optional live-streaming events. Clients that can't surface incremental output can ignore them and rely on `reasoning.completed` and `message.completed`.
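
A client that skips incremental output might process the stream roughly as follows. This sketch assumes each NDJSON line is a JSON object whose event name sits in a string `type` field. That field name is an assumption, so check the event shapes in Sessions, runs and streaming before relying on it:

```ts
// Assumed event shape: the `type` field name is a guess for illustration.
type StreamEvent = { type: string; [key: string]: unknown };

// Parses complete NDJSON text: one JSON object per non-empty line.
function parseNdjson(text: string): StreamEvent[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as StreamEvent);
}

// The optional live-streaming events named above.
const LIVE_ONLY = ["reasoning.appended", "message.appended"];

function withoutLiveEvents(events: StreamEvent[]): StreamEvent[] {
  return events.filter((event) => LIVE_ONLY.indexOf(event.type) === -1);
}

const sample = [
  '{"type":"session.started"}',
  '{"type":"message.appended"}',
  '{"type":"message.completed"}',
].join("\n");

const kept = withoutLiveEvents(parseNdjson(sample)).map((event) => event.type);
// kept is ["session.started", "message.completed"]
```

A real client reads the body incrementally, so it also needs to buffer a partial trailing line until the next chunk completes it.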

Note: consider the privacy, confidentiality, and user-experience implications of displaying, storing, or transmitting reasoning events in your application.

The full set covers more lifecycle, human-in-the-loop, and authorization events, including `input.requested`, `turn.failed`, `authorization.required`, and `authorization.completed`. See [Sessions, runs and streaming](./concepts/sessions-runs-and-streaming) for every event and its data shape.

## Send a follow-up message

When the session is waiting for the next user message, post a follow-up with the token:

```bash
curl -X POST http://127.0.0.1:3000/eve/v1/session/<sessionId> \
  -H 'content-type: application/json' \
  -d '{"continuationToken":"<token>","message":"Now do Queens."}'
```

See [Sessions, runs and streaming](./concepts/sessions-runs-and-streaming) for the full contract.

## Setting up with a coding agent

If a coding agent (Claude Code, Cursor, and the like) is doing the setup, hand it this prompt:

<CopyPrompt text="Set up an eve agent for the user. eve is a filesystem-first TypeScript framework for durable agents, published as the npm package eve. Read its docs: once eve is installed they are bundled in the package at node_modules/eve/docs; before eve is installed, read the published Introduction and Getting Started pages. If the project has no eve app, scaffold one with `npx eve@latest init <name>`; add `--channel-web-nextjs` only when the user wants Web Chat. The init command installs dependencies, initializes Git, and starts the dev server, so run it in a controllable process and stop it with Ctrl+C before editing. To add eve to an existing app, run `eve init .`, or install the dependencies by hand with `npm install eve@latest ai zod` (init adds ai and zod; the by-hand path needs all three). Make sure agent/agent.ts and agent/instructions.md exist, then add a first typed tool at agent/tools/get_weather.ts using defineTool from eve/tools with a Zod inputSchema and an inline execute. Start the dev server again, then exercise the HTTP API: create a session with POST /eve/v1/session, attach to GET /eve/v1/session/:id/stream, and send a follow-up with the returned continuationToken. Verify with the project's typecheck, adapt model and provider choices to the project, and do not commit unless the user asks.">
  Set up an eve agent: read the eve docs (bundled at node\_modules/eve/docs once eve is
  installed), scaffold with `npx eve@latest init <name>` (or `npm install eve@latest ai zod` in an existing app), add
  a typed tool at agent/tools/get\_weather.ts, run it with `npm run dev`, then create a session, stream
  it, and send a follow-up.
</CopyPrompt>

Once `eve` is a dependency, the package bundles the full docs, so the agent can read them locally at `node_modules/eve/docs/` without fetching anything.

To add a platform channel after setup, run `eve channels add slack` from an interactive terminal. The init flags are covered in [Quick start](#quick-start).

## What to read next

* [Instructions](./instructions) and [Tools](./tools): the core building blocks
* [Channels](./channels/overview): reach the agent from Slack, Discord, or a web UI
* [Frontend](./guides/frontend/overview): browser chat with `useEveAgent`
* [TypeScript SDK](./guides/client/overview): call the agent from scripts or server-side code
* [Sessions, runs and streaming](./concepts/sessions-runs-and-streaming): the durable session model
* [Build an agent](./tutorial/first-agent): the full end-to-end walkthrough



---
title: Instructions
description: Author the agent's always-on system prompt with instructions.md or instructions.ts.
---

# Instructions



Instructions are the always-on system prompt, the agent's permanent identity rather than a procedure it pulls in when the moment calls for it. Use them for anything that should hold on every turn, such as a rule, a persona, or a constraint. eve prepends the instructions to every model call in the session.

## Author instructions

At minimum, instructions are a markdown file at the agent root. Whatever you write is the prompt:

```md title="agent/instructions.md"
You are a concise assistant. Use tools when they are available.
```

Keep this file to stable behavior such as identity, tone, and standing rules.

## Markdown vs TypeScript

A static prompt belongs in markdown (`agent/instructions.md`). Switch to a TypeScript module (`agent/instructions.ts`) once you need to build the prompt from typed helpers, `lib/` code, or build-time values.

```ts title="agent/instructions.ts"
import { defineInstructions } from "eve/instructions";
import { buildInstructionsPrompt } from "./lib/prompts.js";

export default defineInstructions({
  markdown: buildInstructionsPrompt(),
});
```

`defineInstructions` takes one field, `markdown`, the resolved prompt text. A module-backed prompt runs once at build time. eve captures the resulting markdown into the compiled manifest, so the runtime serves the same prompt every session and never re-runs the module.
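As a rough sketch of what the imported helper might look like, the module below assembles the prompt from typed values. Only the name `buildInstructionsPrompt` comes from the example above; the `PromptOptions` shape, the default text, and the file location `agent/lib/prompts.ts` are assumptions made for illustration.

```ts
// Hypothetical agent/lib/prompts.ts. The function name matches the import above;
// the PromptOptions shape and the default rules are illustrative assumptions.
interface PromptOptions {
  persona: string;
  rules: readonly string[];
}

const defaults: PromptOptions = {
  persona: "You are a concise assistant.",
  rules: ["Use tools when they are available.", "Answer in plain language."],
};

export function buildInstructionsPrompt(options: PromptOptions = defaults): string {
  // Render each standing rule as a markdown bullet under the persona line.
  const bullets = options.rules.map((rule) => `- ${rule}`).join("\n");
  return `${options.persona}\n\n## Rules\n\n${bullets}\n`;
}
```

Because the module runs once at build time, anything it reads should be available during the build rather than per session.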

## Split instructions across a directory

For more than one file, add an `agent/instructions/` directory. eve reads its entries non-recursively and accepts both `.md` files and `.ts` modules (a `.ts` file can wrap `defineInstructions` or `defineDynamic`). Entries combine in alphabetical order by filename (`localeCompare`).

A flat `agent/instructions.md` (or `.ts`) at the agent root and the directory can coexist. The root file's content comes first, then the sorted directory entries. You cannot author both `instructions.md` and `instructions.ts` at the root; that pairing is a build error.
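The ordering rule can be modeled in a few lines. This is a sketch of the documented behavior, not eve's loader; the function name and the sample filenames are hypothetical.

```ts
// Illustrative only: models the documented ordering, not eve's actual loader.
function orderInstructionSources(
  rootFile: string | null,
  directoryEntries: readonly string[],
): string[] {
  // Directory entries combine in alphabetical order by filename (localeCompare).
  const sorted = [...directoryEntries].sort((a, b) => a.localeCompare(b));
  // The root file, when present, always comes first.
  return rootFile === null ? sorted : [rootFile, ...sorted];
}

const order = orderInstructionSources("instructions.md", [
  "20-tone.md",
  "10-identity.md",
  "30-safety.ts",
]);
// order: ["instructions.md", "10-identity.md", "20-tone.md", "30-safety.ts"]
```

Zero-padded numeric prefixes are one way to make the intended order explicit, since a plain alphabetical comparison would place `10-` before `9-`.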

## Instructions vs skills

Instructions and [skills](./skills) both feed text into the model's context. The difference is timing:

|                           | Loaded                                       | Use for                                              |
| ------------------------- | -------------------------------------------- | ---------------------------------------------------- |
| `instructions.md` / `.ts` | Always on, every turn                        | Permanent identity and standing rules                |
| `agent/skills/*`          | On demand, when the model calls `load_skill` | Optional procedures that should not bloat every turn |

Keep instructions short and stable. Long or situational procedures belong in [skills](./skills), where they only enter context when the request calls for them.

Instructions never run code. When you need typed executable behavior, reach for a [tool](./tools).

## Dynamic instructions

To resolve the prompt at runtime from session context (auth, tenant, or channel), wrap `defineInstructions` in a `defineDynamic` resolver. See [Dynamic capabilities](./guides/dynamic-capabilities).

## Disclaimer

As the deployer, it is your responsibility to ensure your agent complies with applicable laws.

Where an eve agent communicates with people, applicable law may require you to disclose that they are interacting with an automated AI system. eve does not add this disclosure automatically; configure it in your instructions and/or channel responses.

## What to read next

* [Tools](./tools): typed actions, the next capability to add
* [Context control](./concepts/context-control): all the levers for what the model sees
* [Skills](./skills): on-demand procedures, the counterpart to always-on instructions



---
title: Introduction
description: How an eve agent is laid out as files, what runs when a message arrives, and the building blocks you add as it grows.
---

# Introduction



eve is a framework for building durable agents as ordinary files in a TypeScript project.

Instead of one large configuration object, each part of your agent gets a clear home. Instructions go in one file, tools in one folder, channels in another. eve discovers that structure and turns it into an agent that runs locally, serves HTTP, connects to other platforms, and keeps working across many turns.

<Callout>
  eve is currently in beta and subject to the [Vercel beta
  terms](https://vercel.com/docs/release-phases/public-beta-agreement); the framework, APIs,
  documentation, and behavior may change before general availability.
</Callout>

## An eve project at a glance

A small eve app looks like this:

```text
my-agent/
├── package.json
└── agent/
    ├── agent.ts
    ├── instructions.md
    ├── tools/
    │   └── get_weather.ts
    ├── skills/
    │   └── plan_a_trip.md
    └── channels/
        └── slack.ts
```

You can understand most eve projects by reading that tree:

* `instructions.md` tells the agent who it is and how it should behave.
* [`agent.ts`](./agent-config) chooses the model and configures runtime options.
* [`tools/`](./tools) holds typed functions the model can call.
* [`skills/`](./skills) holds longer procedures the model loads only when they are useful.
* [`channels/`](./channels/overview) connect the agent to HTTP clients, Slack, Discord, and the other places people talk to it.

Start with only `instructions.md` and `agent.ts`. Add the other folders when the agent needs them.

## The files are the interface

eve is [filesystem-first](./reference/project-layout). A file's location says what it does, and its path usually gives it a name. For example, this file:

```text
agent/tools/get_weather.ts
```

defines a tool named `get_weather`:

```ts
import { defineTool } from "eve/tools";
import { z } from "zod";

export default defineTool({
  description: "Get the weather for a city.",
  inputSchema: z.object({ city: z.string() }),
  async execute({ city }) {
    return { city, condition: "Sunny" };
  },
});
```

There is no separate registry to keep in sync. Add the file and eve discovers it; move or rename it and its identity moves with it. See [Tools](./tools) for the complete API.

## What happens when a message arrives

The same flow runs whether a message comes from a web app, the terminal, or Slack. eve turns the platform input into a message, gives the model its instructions, skills, tools, and conversation history, runs the work (calling tools and subagents as needed), saves the session and streams events, then delivers the result back in the form the platform expects.

That keeps agent behavior portable. Your weather tool does not need to know whether the question came from a browser or from Slack.

## Durable by default

An eve session is more than one request and one response. It can:

* Stream progress while work is happening
* Call tools and subagents
* Pause for [approval or a human answer](./tools)
* Resume after that answer arrives
* Keep durable state across turns

Under the hood, eve uses the open-source [Workflow SDK](https://workflow-sdk.dev) to make sessions durable, resumable, and crash-safe. eve handles that machinery so your tools focus on the work itself.

## Grow the project by adding capabilities

As the agent grows, each concern still has a predictable home:

| Path                            | Add it when you need...                          |
| ------------------------------- | ------------------------------------------------ |
| [`connections/`](./connections) | Tools from external MCP servers                  |
| [`hooks/`](./guides/hooks)      | Code that reacts to lifecycle and stream events  |
| [`sandbox/`](./sandbox)         | A controlled workspace for files and commands    |
| [`subagents/`](./subagents)     | Specialist agents the root agent can delegate to |
| [`schedules/`](./schedules)     | Recurring or scheduled work                      |
| `lib/`                          | Shared code imported by the other agent files    |

The result stays readable before it runs. The directory tells you what the agent can do.

## What to read next

* [Getting started](./getting-started): scaffold and run your first agent
* [Tools](./tools): the typed actions your agent calls
* [Instructions](./instructions): the always-on system prompt that shapes behavior
* [Channels](./channels/overview): reach the agent from Slack, Discord, or a web UI
* [Connections](./connections): pull in tools from external MCP servers
* [Project layout](./reference/project-layout): every authored slot under `agent/`



---
title: Responsible Use
description: Deployer responsibility and safeguards to review before using eve with sensitive, regulated, or production data.
---

# Responsible Use



As the deployer, it is your responsibility to ensure your agent complies with applicable laws.

You are responsible for configuring approval policies, tool restrictions, connection scopes, route/session authorization, sandbox controls, telemetry exports, and other safeguards appropriate for your use case.

Before using eve with non-public, sensitive, regulated, or production data, review which default tools, custom tools, MCP tools, shell/file/web tools, connected services, subagents, schedules, and external actions are available to the agent.

Require human approval or other safeguards for sensitive, irreversible, regulated, financial, healthcare, employment, housing, legal, safety-impacting, user-impacting, or external side-effecting actions.

Unless you configure stricter controls, eve agents may operate with permissive settings, including tool execution without human approval where approval is omitted and sandbox network egress that is not deny-all. Do not rely on model behavior alone to prevent sensitive or irreversible actions.



---
title: Sandbox
description: The agent's isolated bash environment, including built-in file tools, a seeded /workspace, backends, lifecycle, and network policy.
---

# Sandbox



The sandbox is the agent's isolated bash environment: a filesystem rooted at `/workspace` where it can run shell commands, execute scripts, and read or write files without ever touching your app runtime. Every eve agent has exactly one. The built-in `bash`, `read_file`, `write_file`, `glob`, and `grep` tools already target it, and your authored code can too.

A working sandbox exists by default, with nothing to author. Override it only to add setup, seed files, pick a backend, or lock down the network.

The default sandbox is not a substitute for configuring network policy, credentials, retention, deletion, or other controls your application requires.

## Using the sandbox

The model already has shell and file access through the default tools:

| Tool                       | Does                                |
| -------------------------- | ----------------------------------- |
| `bash`                     | run a shell command in the sandbox  |
| `read_file` / `write_file` | read/write files under `/workspace` |
| `glob`                     | find files by pattern               |
| `grep`                     | search file contents                |

All of them run with `/workspace` as the working directory. Any authored runtime function (a tool, a step, a model callback) can get a live sandbox handle with `ctx.getSandbox()`.

```ts title="agent/tools/run_analysis.ts"
import { defineTool } from "eve/tools";
import { z } from "zod";

export default defineTool({
  description: "Run a Python analysis script and return its output.",
  inputSchema: z.object({ script: z.string() }),
  async execute({ script }, ctx) {
    const sandbox = await ctx.getSandbox();
    await sandbox.writeTextFile({ path: "analysis/run.py", content: script });
    const result = await sandbox.run({ command: "python analysis/run.py" });
    return { stdout: result.stdout };
  },
});
```

`ctx.getSandbox()` takes no arguments, is async, and only works inside authored runtime execution.

`/workspace` is one namespace across every backend, so `/workspace/foo` points at the same file whether the backend is local or Vercel. When you need to interpolate a path into a generated command, `sandbox.resolvePath("repo/build.py")` anchors a relative path to its absolute `/workspace/repo/build.py` form.
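To make the anchoring rule concrete, here is a standalone approximation using Node's POSIX path helpers. It is not eve's implementation; `resolveWorkspacePath` is a hypothetical stand-in for `sandbox.resolvePath`, assuming relative paths anchor at `/workspace` and absolute paths are returned unchanged.

```ts
import { posix } from "node:path";

// Sketch of the documented rule, not eve's implementation:
// relative paths anchor at /workspace, absolute paths are returned as-is.
function resolveWorkspacePath(path: string): string {
  return posix.isAbsolute(path) ? path : posix.join("/workspace", path);
}

// Interpolate the anchored path into a generated command string.
const script = resolveWorkspacePath("repo/build.py");
const command = `python ${script}`; // "python /workspace/repo/build.py"
```

Inside a tool, the equivalent call would go through the live handle returned by `ctx.getSandbox()`.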

The handle does more than `run` and `writeTextFile`. In every method, relative paths resolve from `/workspace` and absolute paths pass through untouched:

| Method                                   | Does                                                                                            |
| ---------------------------------------- | ----------------------------------------------------------------------------------------------- |
| `run({ command })`                       | run one command, block until it exits, return `{ stdout, stderr, ... }`                         |
| `spawn(options)`                         | launch a long-running process (server, watcher) and return a `SandboxProcess` handle            |
| `readTextFile` / `writeTextFile`         | read/write a UTF-8 (or specified encoding) file; `readTextFile` supports 1-based line ranges    |
| `readBinaryFile` / `writeBinaryFile`     | read/write raw bytes (images, archives, anything non-text)                                      |
| `readFile` / `writeFile`                 | stream a file in/out as bytes                                                                   |
| `removePath({ path, force, recursive })` | delete one file or directory; `force` ignores missing paths, `recursive` removes non-empty dirs |
| `resolvePath(path)`                      | anchor a relative path to its absolute `/workspace/...` form                                    |
| `setNetworkPolicy(policy)`               | change egress policy mid-turn (backend-dependent; see [Network policy](#network-policy))        |

Since `run` blocks until the command exits, use `spawn` when the process should keep running while the agent does other work:

```ts
const sandbox = await ctx.getSandbox();
const server = await sandbox.spawn({ command: "python -m http.server 8000" });
// ...do other work against the server...
await server.kill();
```

A `SandboxProcess` exposes `stdout`/`stderr` byte streams, `wait()` (resolves with the exit code), and `kill()` (idempotent).

`sandbox.id` is a stable per-session identifier that persists across reconnects to the same logical session. Use it as the cache key for per-session state that must outlive individual step executions.
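A minimal sketch of that pattern follows. It uses a process-local `Map`, which is an assumption: if steps can run in different processes, the same key would need to point at an external store instead. The `stateFor` helper and the state shape are hypothetical.

```ts
// Hypothetical helper: a process-local cache keyed by a sandbox id string.
// Treat this as a memo, not durable storage.
interface PerSessionState {
  startedAt: number;
  counter: number;
}

const perSessionState = new Map<string, PerSessionState>();

function stateFor(sandboxId: string): PerSessionState {
  let state = perSessionState.get(sandboxId);
  if (state === undefined) {
    // First use for this session: create and remember the entry.
    state = { startedAt: Date.now(), counter: 0 };
    perSessionState.set(sandboxId, state);
  }
  return state;
}

// Inside a tool, the key would come from the live handle:
//   const sandbox = await ctx.getSandbox();
//   stateFor(sandbox.id).counter += 1;
```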

The option types (`SandboxSpawnOptions`, `SandboxReadBinaryFileOptions`, `SandboxWriteBinaryFileOptions`, and so on) are named exports from `eve/sandbox`, alongside `SandboxProcess`.

## Seeding `/workspace`

Mount authored files into the sandbox at session start by placing them under `agent/sandbox/workspace/`. This requires the folder layout (`agent/sandbox/sandbox.ts`), not the top-level shorthand:

```text
agent/sandbox/
  sandbox.ts                ← optional override (see below)
  workspace/
    schema.sql              ← lands at /workspace/schema.sql
    scripts/run.sh          ← lands at /workspace/scripts/run.sh
```

Every file under `workspace/` mirrors into the sandbox cwd with its structure intact, and eve lists the top-level entries to the model in the prompt automatically. One subtree is off limits. Skill discovery already seeds skill files under `/workspace/skills/`, so authoring `agent/sandbox/workspace/skills/...` is rejected; put those under `agent/skills/` instead.
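The mapping and the reserved subtree can be expressed as a small function. This models the rule for illustration only; eve performs the real mirroring and validation, and `seedTarget` is a hypothetical name.

```ts
// Illustrative model of the seeding rule; not eve's own validation code.
const SEED_ROOT = "agent/sandbox/workspace/";

function seedTarget(authoredPath: string): string {
  if (!authoredPath.startsWith(SEED_ROOT)) {
    throw new Error(`not a seed file: ${authoredPath}`);
  }
  const relative = authoredPath.slice(SEED_ROOT.length);
  // The skills subtree is reserved for skill discovery.
  if (relative === "skills" || relative.startsWith("skills/")) {
    throw new Error("workspace/skills is reserved; author skills under agent/skills/");
  }
  return `/workspace/${relative}`;
}

seedTarget("agent/sandbox/workspace/scripts/run.sh"); // "/workspace/scripts/run.sh"
```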

## Overriding the sandbox

To add setup, seed files, or pick a backend, author `defineSandbox`. There are two layouts:

* `agent/sandbox.ts`: shorthand. Use it when you need only a definition, no seeded files.
* `agent/sandbox/sandbox.ts`: folder layout. Use it when you also seed `agent/sandbox/workspace/**`. If both exist, the folder layout wins.

```ts title="agent/sandbox/sandbox.ts"
import { defineSandbox } from "eve/sandbox";
import { vercel } from "eve/sandbox/vercel";

export default defineSandbox({
  backend: vercel({ runtime: "node24", resources: { vcpus: 2 } }),
  revalidationKey: () => "repo-bootstrap-v1",
  async bootstrap({ use }) {
    const sandbox = await use();
    await sandbox.run({ command: "apt-get install -y jq" });
  },
  async onSession({ use }) {
    await use({ networkPolicy: "deny-all" });
  },
});
```

`defineSandbox` and `defaultBackend` live on `eve/sandbox`. Omit `backend` and the runtime falls back to `defaultBackend()` (see [Backends](#backends)).

## Backends

The backend decides where the sandbox runs. eve ships four pinned factories from nested `eve/sandbox/*` imports plus an availability-aware default from `eve/sandbox`:

| Backend            | Runs the sandbox                                                                               |
| ------------------ | ---------------------------------------------------------------------------------------------- |
| `vercel()`         | on [Vercel Sandbox](https://vercel.com/docs/sandbox).                                          |
| `docker()`         | locally in a Docker container, driven through the `docker` CLI.                                |
| `microsandbox()`   | locally in a lightweight [microsandbox](https://www.npmjs.com/package/microsandbox) VM.        |
| `justbash()`       | locally in the pure-JS `just-bash` interpreter (no daemon or VM, but no real binaries either). |
| `defaultBackend()` | picks the best available: Vercel Sandbox on hosted Vercel → Docker → microsandbox → just-bash. |

Configuring a pinned factory uses that backend unconditionally. `docker()` always requires a reachable Docker daemon, and `vercel()` always creates hosted sandboxes (including from local dev, with Vercel credentials).

With `backend` omitted, eve uses `defaultBackend()`, which resolves on first use in priority order:

1. **Vercel Sandbox** when deploying on Vercel (`process.env.VERCEL` is set), since local container/VM runtimes can't run there.
2. **Docker** when a daemon is reachable through a Docker-compatible `docker` CLI (Docker Desktop, OrbStack, Colima, Podman via its docker-compatible CLI; override the binary with `EVE_DOCKER_PATH`).
3. **microsandbox** when the host supports it: macOS on Apple Silicon, or glibc Linux with KVM enabled.
4. **just-bash** as the dependency-free fallback.
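
The priority order above amounts to a first-match chain. The sketch below models it with a hypothetical `HostProbe` record; how eve actually detects each condition is not shown in these docs, so the probe fields are assumptions.

```ts
type BackendName = "vercel" | "docker" | "microsandbox" | "just-bash";

// Hypothetical probe results; eve's real detection logic is not shown here.
interface HostProbe {
  onVercel: boolean; // e.g. process.env.VERCEL is set
  dockerReachable: boolean; // a docker-compatible CLI can reach a daemon
  microsandboxSupported: boolean; // Apple Silicon macOS, or glibc Linux with KVM
}

function pickBackend(probe: HostProbe): BackendName {
  // First match wins, in the documented priority order.
  if (probe.onVercel) return "vercel";
  if (probe.dockerReachable) return "docker";
  if (probe.microsandboxSupported) return "microsandbox";
  return "just-bash";
}
```

A laptop with Docker Desktop running would resolve to `"docker"` under this model even if microsandbox is also supported, because Docker sits higher in the order.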

`defaultBackend()` also accepts a keyed bag so each inner backend gets its own typed create options:

```ts
import { defaultBackend, defineSandbox } from "eve/sandbox";

export default defineSandbox({
  backend: defaultBackend({
    vercel: { networkPolicy: "deny-all", resources: { vcpus: 4 } },
    docker: { image: "ghcr.io/vercel/eve:latest" },
    microsandbox: { memoryMiB: 2048 },
  }),
});
```

### Docker

`docker()` drives the Docker CLI directly. The default base image is `ghcr.io/vercel/eve:latest`, eve's published sandbox runtime image. eve creates `/workspace` and verifies Bash during framework setup, before authored bootstrap code runs. Configure it through `docker({ image, env, pullPolicy, networkPolicy })`, and install authored runtime tools in sandbox bootstrap or provide them through a custom image. Templates are committed as local Docker images and reused across sessions when the sandbox source, seed files, `revalidationKey`, and Docker backend options still match. Sessions run as long-lived containers whose filesystems persist `/workspace` changes across turns for the same durable session. `eve dev` prunes stale template images in the background.

### microsandbox

`microsandbox()` runs each sandbox in a lightweight local VM with snapshot-backed templates, a `vercel-sandbox` user, and a firewall capable of domain-level network policies and credential brokering. It is the closest local match to hosted Vercel Sandbox. The default base image is `ghcr.io/vercel/eve:latest`, eve's published sandbox runtime image. During framework setup, before authored bootstrap code runs, eve verifies Bash and creates `/workspace` and the sandbox user. Install authored runtime tools in sandbox bootstrap or provide them through a custom image. Supported hosts are macOS on Apple Silicon, or Linux (glibc) with KVM. The `microsandbox` npm package and its VM runtime are not bundled with eve, so `eve dev` installs both automatically when missing (disable with `setup: { autoInstall: false }`); production processes fail with actionable install errors instead.

### just-bash

`justbash()` needs no daemon or VM, but commands run in a simulated bash with a virtual filesystem under `.eve/sandbox-cache/`, with no real binaries (`git`, `node`, package managers) and no network isolation. The `just-bash` package is an optional peer dependency, so `eve dev` installs it into your application automatically when missing (disable with `autoInstall: false`); production processes fail with an actionable install error instead.

You can also write your own backend. A `SandboxBackend` is an adapter object with a `name`, a `create`, and an optional `prewarm`. It can point at your own container runner, VM pool, internal sandbox service, or another isolation layer, as long as it returns the `SandboxSession` operations eve needs. See the `SandboxBackend*` types on `eve/sandbox`.

## Lifecycle

There are two hooks, scoped differently:

* **`bootstrap({ use })`** is template-scoped and runs once when the template is built. Put reusable setup here that every later session inherits, such as cloning a baseline repo, installing dependencies, or seeding files. Call `use()` to get a `SandboxSession`. Only template filesystem state and supported backend metadata carry into later sessions; config like network policy does not. If external inputs affect what bootstrap produces, set `revalidationKey: () => string` so eve knows when to rebuild the template (authored sandbox source and seed contents are already tracked for you).
* **`onSession({ use, ctx })`** is durable-session-scoped and runs once per session. Put per-session setup here, including network policy, resources, timeout, per-user credentials, and one-time markers. Because it runs inside the active runtime context, it can read `ctx.session` and derive the current principal without baking credentials into the template. Call `use(opts?)` to get a `SandboxSession`; `opts` flow to the backend's update path after create.

If you require a network policy or other configuration for every session, configure it on the backend factory or in `onSession`; do not rely on bootstrap-only configuration.

```ts
import { defineSandbox } from "eve/sandbox";
import { vercel } from "eve/sandbox/vercel";

export default defineSandbox({
  backend: vercel(),
  async onSession({ use, ctx }) {
    const sandbox = await use({ networkPolicy: "deny-all" });
    const user = ctx.session.auth.current;
    if (user === null) return;
    await sandbox.writeTextFile({ path: "SESSION_USER.txt", content: `${user.principalId}\n` });
  },
});
```
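
For the `revalidationKey` mentioned under `bootstrap`, one possible approach is to hash the external inputs that bootstrap depends on, so the key changes exactly when they do. The helper name, the input names, and the key format below are hypothetical.

```ts
import { createHash } from "node:crypto";

// Hypothetical helper: turn the external inputs bootstrap depends on into a stable key.
function revalidationKeyFor(inputs: Record<string, string>): string {
  const hash = createHash("sha256");
  // Sort the names so the same inputs always produce the same key.
  for (const name of Object.keys(inputs).sort()) {
    hash.update(`${name}=${inputs[name]}\n`);
  }
  return `bootstrap-${hash.digest("hex").slice(0, 12)}`;
}

// Usage sketch (input names are placeholders):
//   defineSandbox({
//     revalidationKey: () => revalidationKeyFor({ baselineRef: "v1.4.0", jq: "1.7" }),
//     ...
//   })
```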

Sessions are persistent, and how the underlying runtime idles out depends on the backend. On the Vercel backend, the VM times out after a period of inactivity (default 30 minutes); eve preserves the filesystem and resumes the sandbox on the next message as if nothing happened, even days later. The Docker backend keeps a long-lived container per durable session and persists `/workspace` across turns without that timeout, and the just-bash backend stores its virtual filesystem under `.eve/sandbox-cache/`. In every case, `/workspace` survives between turns for the same session.

## Network policy

Egress rules go on the backend factory or in `onSession`'s `use()`. There are three forms:

```ts
networkPolicy: "allow-all"; // default
networkPolicy: "deny-all";  // block all egress, including DNS

networkPolicy: {
  allow: ["ai-gateway.vercel.sh", "*.github.com"],
  subnets: { deny: ["10.0.0.0/8"] },
};
```

Default egress is `allow-all`. For non-public, sensitive, regulated, or production workloads, configure `deny-all` or an explicit allow-list before running untrusted tools or handling sensitive data.

Set it on the factory (`vercel({ networkPolicy: "deny-all" })`) and it applies before authored `bootstrap` code runs; framework-owned base setup may briefly keep egress open to install required packages. Set it in `onSession`'s `use()` to override per-session. The common pattern combines both: leave the factory open so `bootstrap` can `git clone`, then lock down in `onSession`. To change the policy mid-turn, call `sandbox.setNetworkPolicy(...)` on the live handle.

Domain-level allow-lists and credential brokering are supported by `vercel()` and `microsandbox()`. The Docker backend honors only `"allow-all"` and `"deny-all"` (at creation and via `setNetworkPolicy`); the just-bash backend rejects `setNetworkPolicy` entirely.
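
The support matrix for mid-turn changes can be summarized as a lookup. This is a simplified model of the paragraph above: the `NetworkPolicy` type is reduced to the two string forms plus an opaque object form, and `canSetNetworkPolicy` is a hypothetical helper, not an eve export.

```ts
// Simplified policy type for illustration; the real object form carries more structure.
type NetworkPolicy = "allow-all" | "deny-all" | { allow: unknown };

type Backend = "vercel" | "microsandbox" | "docker" | "just-bash";

// Mirrors the documented behavior of setNetworkPolicy per backend.
function canSetNetworkPolicy(backend: Backend, policy: NetworkPolicy): boolean {
  switch (backend) {
    case "vercel":
    case "microsandbox":
      // Domain-level allow-lists and the string forms are both supported.
      return true;
    case "docker":
      // Only "allow-all" and "deny-all" are honored.
      return typeof policy === "string";
    case "just-bash":
      // setNetworkPolicy is rejected entirely.
      return false;
  }
}
```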

## Credential brokering

Secrets never enter the sandbox. Instead, the network policy's per-domain `transform` injects credentials at the firewall, so a header can authenticate egress to a host while the secret stays out of the sandbox process entirely:

```ts
async onSession({ use }) {
  await use({
    networkPolicy: {
      allow: {
        "github.com": [{ transform: [{ headers: { authorization: "Basic your_base64_credentials_here" } }] }],
        "*": [],
      },
    },
  });
}
```

The `"*": []` catch-all keeps general egress open while the `transform` applies only to `github.com`. For mid-turn brokering, call `setNetworkPolicy` with the same shape. The [Vercel Sandbox docs](https://vercel.com/docs/sandbox) cover the brokering mechanism itself.
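
The `Basic ...` placeholder in the example stands for a standard HTTP Basic value, which is the base64 encoding of `username:password`. A small helper can build it in the app runtime; the function name is hypothetical, and where the secret comes from (for example an environment variable) is left as an assumption.

```ts
// Builds an HTTP Basic authorization value from a username and a token.
// Intended to run in the app runtime, where the secret lives.
function basicAuthHeader(username: string, token: string): string {
  const encoded = Buffer.from(`${username}:${token}`, "utf8").toString("base64");
  return `Basic ${encoded}`;
}

// basicAuthHeader("user", "pass") === "Basic dXNlcjpwYXNz"
```

The resulting string would take the place of the literal `authorization` value in the `transform` entry.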

## What to read next

* [Subagents](./subagents): each subagent gets its own sandbox, independent of its parent.
* [Tools](./tools): authored tools run in the app runtime (full `process.env`); only sandbox tools run in the sandbox.
* [Security model](./concepts/security-model): the app-runtime/sandbox trust boundary in full.
* [Vercel Sandbox](https://vercel.com/docs/sandbox): platform docs, including credential brokering and persistence limits.



---
title: Schedules
description: Run an agent on a cron cadence, either a fire-and-forget prompt or a handler that hands work off to a channel.
---

# Schedules



A schedule starts the agent on its own clock instead of waiting for an inbound message. Use one for daily digests, data syncs, cleanup sweeps, heartbeats, or anything that should fire on a cadence. Each one is a single file under `agent/schedules/` carrying a cron expression. Schedules are root-only, so declared subagents cannot have a `schedules/` directory.

The name comes from the path under `schedules/` (`agent/schedules/billing/sweep.ts` → `"billing/sweep"`), and nested directories are fine.
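
The path-to-name rule can be sketched as a pure function. This mirrors the documented mapping for illustration and is not eve's discovery code; `scheduleNameFromPath` is a hypothetical name, and it assumes schedule files end in `.ts` or `.md`.

```ts
// Illustrative: mirrors the documented path-to-name rule.
function scheduleNameFromPath(filePath: string): string {
  const prefix = "agent/schedules/";
  if (!filePath.startsWith(prefix)) {
    throw new Error(`not a schedule file: ${filePath}`);
  }
  // Drop the directory prefix and the .ts or .md extension.
  return filePath.slice(prefix.length).replace(/\.(ts|md)$/, "");
}

scheduleNameFromPath("agent/schedules/billing/sweep.ts"); // "billing/sweep"
```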

## `defineSchedule`

Every schedule provides a `cron` and exactly one of `markdown` or `run`:

```ts
interface ScheduleDefinition {
  cron: string;
  markdown?: string; // fire-and-forget prompt (task mode)
  run?: (args: ScheduleHandlerArgs) => Promise<void> | void; // handler
}

interface ScheduleHandlerArgs {
  receive: CrossChannelReceiveFn; // hand the work off to a channel
  waitUntil: (task: Promise<unknown>) => void; // keep the cron task alive past return
  appAuth: SessionAuthContext; // pre-built app principal
}
```

`defineSchedule` is a type-level pass-through. The compiler is what enforces the one-of rule.

`cron` is a standard 5-field string (`minute hour day-of-month month day-of-week`) with minute granularity. On Vercel, each schedule becomes a Vercel Cron Job, and Vercel evaluates the expression in UTC, so `"0 9 * * 1-5"` fires at 09:00 UTC on weekdays. `eve dev` never fires schedules on their cron cadence. A built app served with `eve start` does run production scheduled tasks. To trigger one while iterating in dev, use the dispatch route below.
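
To make the field order concrete, the sketch below splits an expression into its five named fields. It checks only the field count, not the values inside each field, and `splitCron` is a hypothetical helper rather than part of eve.

```ts
interface CronFields {
  minute: string;
  hour: string;
  dayOfMonth: string;
  month: string;
  dayOfWeek: string;
}

function splitCron(expression: string): CronFields {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== 5) {
    throw new Error(`expected 5 cron fields, got ${parts.length}`);
  }
  // The length check above guarantees all five entries exist.
  const [minute, hour, dayOfMonth, month, dayOfWeek] = parts as [
    string, string, string, string, string,
  ];
  return { minute, hour, dayOfMonth, month, dayOfWeek };
}

splitCron("0 9 * * 1-5");
// { minute: "0", hour: "9", dayOfMonth: "*", month: "*", dayOfWeek: "1-5" }
```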

## Markdown form (fire-and-forget)

This is the minimal schedule. eve runs the agent on the prompt and throws away the output, though the agent can still call tools, write to backends, and log along the way. We call this task mode. A task-mode session runs to completion or fails, and cannot park to wait for a person or an OAuth sign-in.

```ts title="agent/schedules/heartbeat.ts"
import { defineSchedule } from "eve/schedules";

export default defineSchedule({
  cron: "*/5 * * * *",
  markdown: "Pull open Linear issues and POST a summary to the metrics endpoint.",
});
```

You can write the same thing as a plain `.md` file: its frontmatter takes `cron` and nothing else, and the body is the prompt.

`agent/schedules/cleanup.md`:

```md
---
cron: "0 0 * * 0"
---

Sweep stale workflow state.
```

## Handler form (`run`)

Use a handler when the schedule needs to deliver to a channel, branch on conditions, or compute its arguments at fire time. The handler is in full control. It has no channel of its own, so it passes the work to one with `receive`.

```ts title="agent/schedules/daily-digest.ts"
import { defineSchedule } from "eve/schedules";

import slack from "../channels/slack.js";

export default defineSchedule({
  cron: "0 9 * * 1-5",
  async run({ receive, waitUntil, appAuth }) {
    waitUntil(
      receive(slack, {
        message: "Summarize yesterday's activity and post the digest.",
        target: { channelId: "C0123ABC" },
        auth: appAuth,
      }),
    );
  },
});
```

* `receive(channel, { message, target, auth })`: starts a session on another channel. Same contract as a route handler's `args.receive`.
* `waitUntil(promise)`: extends the cron task's lifetime so the parked session and any in-flight fetches settle before the task ends. Wrap the `receive` call in it.
* `appAuth`: the app principal (`{ authenticator: "app", principalId: "eve:app", principalType: "runtime" }`). Pass it as `receive(..., { auth: appAuth })` for work the agent does on its own behalf.

A handler-form session runs on the same durable runtime engine as any other session, so it can park (durably suspend), for instance when the channel handoff is waiting for a Slack reply. Only markdown task mode is barred from waiting.

## Trigger a schedule while iterating

The dev server mounts a one-shot dispatch route that fires a schedule by name, out of band, exactly once. Since `eve dev` never runs schedules on their cron cadence, this is how you trigger one without waiting for the next production tick.

```sh
curl -X POST http://localhost:3000/eve/v1/dev/schedules/heartbeat
# -> { "scheduleId": "heartbeat", "sessionIds": ["..."] }
```

`:scheduleId` is the path-derived schedule name (`agent/schedules/heartbeat.ts` → `heartbeat`; URL-encode the `/` in nested names). It runs the exact dispatch path the production cron handler uses and returns the started session ids as JSON, so you can subscribe to each one's [stream](./concepts/sessions-runs-and-streaming) at `GET /eve/v1/session/:sessionId/stream`. An unknown id comes back `404` with `availableScheduleIds`, listing the schedules the app actually defines.
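
From a script, the same dispatch might look like the sketch below. It assumes the dev server listens on `http://localhost:3000` as in the curl example; `dispatchUrl` and `fireSchedule` are hypothetical helper names, and the response shape is taken from the sample output above.

```ts
// Assumes the dev server address from the curl example.
function dispatchUrl(scheduleName: string, base = "http://localhost:3000"): string {
  // encodeURIComponent turns the "/" in nested names into %2F.
  return `${base}/eve/v1/dev/schedules/${encodeURIComponent(scheduleName)}`;
}

async function fireSchedule(scheduleName: string): Promise<string[]> {
  const response = await fetch(dispatchUrl(scheduleName), { method: "POST" });
  if (!response.ok) {
    throw new Error(`dispatch failed with status ${response.status}`);
  }
  const body = (await response.json()) as { scheduleId: string; sessionIds: string[] };
  return body.sessionIds;
}

dispatchUrl("billing/sweep");
// "http://localhost:3000/eve/v1/dev/schedules/billing%2Fsweep"
```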

The route is dev-only. Production builds never mount it, and it needs no auth since the dev server is local-only.

## On Vercel

Hosted Vercel builds turn every `defineSchedule(...)` into a Vercel Cron Job, with each `cron` written as an entry in `.vercel/output/config.json`. Vercel evaluates these expressions in UTC. Confirm discovery under **Settings → Cron Jobs** and watch execution history under **Observability → Cron Jobs**. Per-run logs land under **Observability → Logs**.

## Self-deployed hosts

Production builds register schedules as Nitro scheduled tasks. On Vercel, Nitro's Vercel preset wires those task registrations into Vercel Cron for you. Outside Vercel, the standard `eve build && eve start` path serves Nitro's Node output and starts Nitro's schedule runner, so the tasks fire on their cron cadence while that process is running.

The gotcha is custom hosting. If you adapt the generated output to a process manager, container platform, or Nitro preset that only serves HTTP and does not start Nitro's scheduled task runner, the schedule definitions still compile, but they will not fire automatically. In that case, run eve through `eve start`, use a host that supports Nitro scheduled tasks, or trigger the same work from your own scheduler through an authenticated route, channel handoff, or application-specific job runner. The dev dispatch route above is only for `eve dev`; production builds do not mount it.

## What to read next

* [Channels](./channels/overview): deliver schedule output to users.
* [Sessions, runs & streaming](./concepts/sessions-runs-and-streaming): inspect a schedule run.


---

For a semantic overview of all documentation, see [/sitemap.md](/sitemap.md)

For an index of all available documentation, see [/llms.txt](/llms.txt)

For agent-facing discovery, including API and MCP surfaces, see [/agents.md](/agents.md)

---
title: Skills
description: Author load-on-demand procedures the model pulls into context with load_skill.
---

# Skills



A skill is a model-loadable procedure that follows the `SKILL.md` convention. It is a markdown document, optionally a packaged directory with supporting files, that the model pulls into context on demand rather than carrying on every turn. eve advertises each skill's description, and the model loads the full body only when a turn calls for it. This is progressive disclosure, the same model the broader Agent Skills standard uses, so a skill authored against that standard ports over as-is.

## How loading works

eve scans the files under `agent/skills/` and exposes each one's description to the model alongside a framework-owned `load_skill` tool. When a request matches a skill's description (or you name the skill outright), the model calls `load_skill`, and eve appends that skill's markdown to the active turn's context.

The description is a routing hint, not a label. Write it as the task that should trigger activation:

```md
Use when the user needs a release checklist or changelog workflow.
```

Loading a skill adds instructions, never a new execution surface. Tools stay visible whether a skill is loaded or not. If you need typed runtime behavior, reach for a [tool](./tools) instead.

## Markdown vs `defineSkill`

The smallest skill is a flat markdown file. The content is the procedure, and the name comes from the path.

```md title="agent/skills/forecast.md"
Use the weather tool before answering forecast or temperature questions.
```

A flat markdown skill can skip the `description` frontmatter. When it does, eve advertises the first non-empty, non-code-fence line of the body with any leading `#`, `>`, `*`, or `-` marker stripped. If the body has no such line, eve falls back to the literal `Instructions for the <name> skill.`, which is a weak routing hint, so add a `description` when you want the model to route on intent.
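To make the fallback rule concrete, here is a minimal sketch of it. This is not eve's implementation: `fallbackDescription` is a hypothetical name, and the sketch assumes that a "code-fence line" means a line starting with three backticks and that a line made only of markers is skipped.

```typescript
// Sketch of the documented fallback rule; not eve's actual implementation.
const FENCE = "`".repeat(3);

function fallbackDescription(name: string, body: string): string {
  for (const rawLine of body.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and code-fence marker lines (assumption: fence lines
    // are those starting with three backticks).
    if (line === "" || line.startsWith(FENCE)) continue;
    // Strip a leading run of #, >, *, or - markers and the whitespace after it.
    const stripped = line.replace(/^[#>*-]+\s*/, "");
    if (stripped !== "") return stripped;
  }
  // No usable line: fall back to the literal, which is a weak routing hint.
  return `Instructions for the ${name} skill.`;
}

console.log(fallbackDescription("release", "\n# Release checklist\n\nRun the steps below."));
console.log(fallbackDescription("empty", ""));
```

Running a skill body through a check like this before committing it shows what the model will see when `description` is omitted.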

A packaged skill is a directory with a `SKILL.md` plus sibling files like `references/`, `assets/`, and `scripts/`. The packaged `SKILL.md` must carry `description` frontmatter; it has no filename slug to fall back on.

```md title="agent/skills/research/SKILL.md"
---
description: Research unfamiliar topics before answering with confidence.
---

When the task is novel or ambiguous, gather evidence first, then answer with the
key facts and the remaining uncertainty.
```

When markdown can't express what you need (typed values, generated content, or inline sibling files), author the skill in TypeScript with `defineSkill` from `eve/skills`:

```ts title="agent/skills/research.ts"
import { defineSkill } from "eve/skills";

export default defineSkill({
  description: "Research unfamiliar topics before answering with confidence.",
  markdown:
    "When the task is novel or ambiguous, gather evidence first, then answer with the key facts and the remaining uncertainty.",
  files: {
    "references/checklist.md": "# Checklist\n\n- Find primary sources.\n",
  },
});
```

eve generates `SKILL.md` from `markdown`, and each `files` entry becomes a package-relative sibling. Start with plain markdown and move to `defineSkill` only when you hit its limits.

## Skills are scoped per agent

Skills are scoped to the agent that declares them. A [subagent](./subagents)'s `skills/` are invisible to the root agent, and the reverse holds too. There's no shared-skill mechanism, so put shared executable helpers in `lib/`.

## Read skill files at runtime

Loading a skill adds its `SKILL.md` to context. To reach a packaged skill's sibling files (references, assets, scripts) from inside a tool or hook, use `ctx.getSkill(id)`:

```ts
const research = ctx.getSkill("research");
const checklist = await research.file("references/checklist.md").text();
```

The handle exposes the skill's `name` and `file(relativePath)`; file content is read lazily from the active sandbox.

## Dynamic skills

To serve a different skill per principal, tenant, or channel (the caller's own team playbook, say), wrap `defineSkill` in a `defineDynamic` resolver keyed on `ctx.session.auth`. See [Dynamic capabilities](./guides/dynamic-capabilities).

## What to read next

* [Connections](./connections): add tools from external MCP and OpenAPI servers
* [Dynamic capabilities](./guides/dynamic-capabilities): resolve skills per caller with `defineDynamic`
* [Context control](./concepts/context-control): how skills fit the full context model



---
title: Subagents
description: Delegate work to child agents, either a copy of the agent itself or declared specialists with their own sandbox and skills.
---

# Subagents



A subagent is a child agent that one agent delegates a focused subtask to. Split work out to a subagent to run it in parallel, to give the child a narrower set of tools, or to give a specialist its own identity. There are two kinds: the built-in `agent` tool (a copy of the agent itself) and declared subagents (specialists with their own directory).

## The built-in `agent` tool

Every agent gets an `agent` tool by default. The model calls it to delegate a subtask to a copy of itself:

```ts
{
  message: string;       // everything the child needs; it does not see the parent's history
  outputSchema?: object; // when set, the child runs in task mode and returns structured output
}
```

The copy shares the parent's sandbox and tools, and a child's file writes are immediately visible to the parent. That is what makes parallel calls natural: fan out a few copies to fix different files at once. The copy inherits auth and connections, but starts with fresh conversation history and fresh state. If a declared subagent calls `agent`, the child is a copy of *that* subagent, not the root.

The parent transfers data to the child through the `message` input it gives the subagent. Do not include sensitive data in a subagent request unless that child and its inherited tools, connections, sandbox, and telemetry path are appropriate for that data.

An authored tool at `agent/tools/agent.ts` takes priority over the built-in.

## Declared subagents

A declared subagent lives under `agent/subagents/<id>/` and uses the same `defineAgent` helper as the root. Its location under `subagents/` is the only thing that marks it as a subagent. Declare one when the child needs a clearly different prompt, role, or tool surface.

```ts title="agent/subagents/researcher/agent.ts"
import { defineAgent } from "eve";

export default defineAgent({
  description: "Investigate ambiguous questions before the parent agent responds.",
  model: "anthropic/claude-opus-4.8",
});
```

`description` is required. The parent reads it to decide whether to delegate, so the compiler rejects any subagent whose `agent.ts` leaves it out.

Directory layout (only `agent.ts` is required):

```text
agent/subagents/researcher/
├── agent.ts            # required (must export a description)
├── instructions.md     # or instructions.ts, optional
├── tools/              # optional, its own tools
├── skills/             # optional, its own skills
├── sandbox/            # optional, its own sandbox + workspace seed
└── subagents/          # optional, nested subagents
```

`schedules/` is not supported inside a declared subagent. Schedules are root-only.

## The isolation boundary

A declared subagent inherits nothing from the root's authored slots. Discovery treats its directory as its own agent root, so it has only the instructions, tools, connections, skills, sandbox, hooks, and nested subagents authored under `agent/subagents/<id>/`. An absent slot falls back to the framework default, not to the root's version.

| Slot         | Built-in `agent` tool         | Declared subagent                      |
| ------------ | ----------------------------- | -------------------------------------- |
| Instructions | Inherited (copy of the agent) | Own `instructions.{md,ts}`, optional   |
| Tools        | Inherited                     | Own `tools/`                           |
| Connections  | Inherited                     | Own `connections/`                     |
| Skills       | Inherited                     | Own `skills/`                          |
| Sandbox      | Shared with parent            | Own `sandbox/`, else framework default |
| Hooks        | Inherited                     | Own `hooks/`                           |
| State        | Fresh                         | Fresh                                  |
| Channels     | Root-only                     | Root-only                              |
| Schedules    | Root-only                     | Root-only                              |

For a declared subagent this means duplicating anything the child needs. When two subagents need the same procedure, copy the markdown under each `skills/` directory, or share typed helpers via `lib/`. The sandbox does not inherit from the parent; it falls back to the framework default unless the subagent authors `subagents/<id>/sandbox.ts` or seeds files via `subagents/<id>/sandbox/workspace/`.

The built-in `agent` tool is the exception. Its children share the parent's sandbox and tools because they are copies of the same agent working on the same files.

`defineState` is never shared, for either kind. Each child starts with fresh durable state.

## What the parent sees

eve lowers every subagent (built-in copy, declared, or [remote](./guides/remote-agents)) into a model-visible tool with the same `{ message, outputSchema? }` shape. The parent packs `message` with everything the child needs, since the child never sees the parent's history. Set `outputSchema` to run the child in task mode, returning structured output as the tool result.

A declared subagent's tool name is the bare path-derived name, with no prefix. `agent/subagents/researcher/` registers as the tool `researcher`. Unlike connection tools (`connection__<connection>__<tool>`), it carries no namespace, so the model, approvals, logs, and evals all reference it by that name. Its input schema is:

```ts
{
  message: string;       // all context the child needs; it never sees the parent's history
  outputSchema?: object; // when set, the child runs in task mode and returns structured output
}
```

Because the name lives in the same runtime tool namespace as authored tools, a subagent named `researcher` collides with a tool named `researcher`. eve rejects the build rather than picking a winner, so keep subagent directory names distinct from tool names.
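If you want to catch a collision before the build does, a check along these lines would work. It is a sketch only: `findToolNameCollisions` is a hypothetical helper, not an eve API, and it assumes you have already collected the authored tool names and subagent directory names yourself.

```typescript
// Hypothetical pre-build check (not an eve API): subagent directory names
// and authored tool names share one namespace, so any overlap is an error.
function findToolNameCollisions(toolNames: string[], subagentDirs: string[]): string[] {
  const tools = new Set(toolNames);
  return subagentDirs.filter((dir) => tools.has(dir)).sort();
}

// Example names are made up for illustration.
const collisions = findToolNameCollisions(
  ["weather", "researcher"],
  ["researcher", "summarizer"],
);

if (collisions.length > 0) {
  console.log(`rename these subagents or tools: ${collisions.join(", ")}`);
}
```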

Do not rely on subagent delegation by itself as an approval boundary. Put sensitive tools behind `needsApproval`, connection approval, route/session authorization, or other controls wherever those tools can be called.

Each delegated subagent spins up its own child session and stream. The parent stream carries only the control-plane events `subagent.called` and `subagent.completed`. To follow the child's full progress, read `subagent.called.data.childSessionId` and subscribe at `GET /eve/v1/session/:childSessionId/stream`.
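As an illustration, the sketch below picks child session ids out of already-parsed parent events and builds the stream path for each. Only `type` and `data.childSessionId` come from the description above; the `ParentEvent` type, the `childStreamPaths` helper, and the sample ids are assumptions made for this example.

```typescript
// Minimal event shape assumed for this sketch: only `type` and, for
// `subagent.called`, `data.childSessionId` are taken from the text above.
type ParentEvent = { type: string; data?: { childSessionId?: string } };

function childStreamPaths(events: ParentEvent[]): string[] {
  const paths: string[] = [];
  for (const event of events) {
    const childId =
      event.type === "subagent.called" ? event.data?.childSessionId : undefined;
    if (childId) {
      paths.push(`/eve/v1/session/${encodeURIComponent(childId)}/stream`);
    }
  }
  return paths;
}

// Sample events with made-up session ids.
const parentEvents: ParentEvent[] = [
  { type: "turn.started" },
  { type: "subagent.called", data: { childSessionId: "ses_child_1" } },
  { type: "subagent.completed" },
];

console.log(childStreamPaths(parentEvents));
```

Each returned path can then be fetched against your deployment's base URL to follow that child's progress.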

## When to split

Split out a subagent when the task needs a different prompt or specialist role, a narrower tool surface, or its own runtime context. Don't reach for one when a [skill](./skills) would do. If the agent can keep its identity and needs only an optional procedure, a skill is the lighter choice.

## What to read next

* [Remote agents](./guides/remote-agents): call another eve deployment as a subagent.
* [Dynamic workflows](./guides/dynamic-workflows): have the model orchestrate its subagents programmatically (fan-out, map-reduce).



---
title: Custom Channels
description: Author custom HTTP and WebSocket channels with routes, events, metadata, continuation tokens, and file uploads.
---

# Custom Channels



When eve doesn't ship a channel for your surface, you build one. Custom channels expose HTTP or WebSocket endpoints, parse incoming requests, start or resume sessions, observe runtime events, and own delivery back to your platform.

## File location and identity

Custom channels live in `agent/channels/` at the root agent. Local subagents do not declare channels today.

The channel file stem becomes the channel id, so `agent/channels/internal-webhook.ts` is addressed as `internal-webhook`. Export the channel definition as the module's default export.

## Define a channel

```ts
import { defineChannel, GET, POST } from "eve/channels";

export default defineChannel({
  routes: [
    POST("/message", async (req, { send }) => {
      const body = await req.json();
      const session = await send(body.message, {
        auth: null,
        continuationToken: body.token,
      });

      return Response.json({ sessionId: session.id });
    }),
    GET("/sessions/:sessionId/stream", async (_req, { getSession, params }) => {
      const session = getSession(params.sessionId);
      const stream = await session.getEventStream();

      return new Response(stream, {
        headers: { "content-type": "application/x-ndjson; charset=utf-8" },
      });
    }),
  ],
  events: {
    "message.completed"(event, channel, ctx) {
      // deliver completed messages back to the surface that owns this channel
    },
  },
});
```

Declare routes with the `POST()` and `GET()` helpers. Each route handler receives the raw `Request` and a helpers object:

* `send(message, { auth, continuationToken, state? })` starts or resumes a session. Returns a `Session`.
* `getSession(sessionId)` looks up an existing session. The returned `Session` exposes `getEventStream({ startIndex? })` for streaming.
* `receive(channel, ...)` hands inbound work to a different channel for cross-channel hand-off.
* `params` holds route parameters extracted from the path pattern.
* `waitUntil(promise)` extends the request lifetime for background work.
* `requestIp` is the client IP, or `null` when the host cannot provide it.

Event handlers like `"message.completed"` are declared under the `events` key. They receive `(eventData, channel, ctx)`, where `eventData` is the event payload, `channel` carries platform handles and session continuation operations, and `ctx` is the eve `SessionContext`. Every channel kind shares this signature. The one exception is `session.failed`, which receives only `(eventData, channel)` with no `ctx`.

## WebSocket routes

Use `WS()` when a custom channel needs a WebSocket endpoint. The route handler runs once per upgrade request and returns lifecycle hooks for that connection:

```ts
import { defineChannel, WS } from "eve/channels";

export default defineChannel({
  routes: [
    WS("/voice/ws", async (_req, { send }) => ({
      async message(_peer, message) {
        await send(message.text(), {
          auth: null,
          continuationToken: "voice-demo",
        });
      },
    })),
  ],
});
```

`WS()` handlers receive the same helpers as HTTP route handlers: `send`, `getSession`, `receive`, `params`, `waitUntil`, and `requestIp`. The returned hooks are eve-owned structural types compatible with Nitro/H3 websocket routing, including `upgrade`, `open`, `message`, `close`, and `error`.

### Node upgrade server escape hatch

Prefer the `WS()` lifecycle hooks above when you own the websocket behavior. eve also exposes `createWebSocketUpgradeServer()` for the narrower case where a third-party SDK or framework expects to bind directly to a Node `http.Server` with `server.on("upgrade", ...)`.

```ts
import { defineChannel, WS, createWebSocketUpgradeServer } from "eve/channels";

const bridge = createWebSocketUpgradeServer();

// `thirdPartySdk` stands in for a library that binds to a Node `http.Server`.
thirdPartySdk.attach(bridge.server);

export default defineChannel({
  routes: [WS("/vendor/ws", bridge.route)],
});
```

The bridge server does not listen on its own port. It receives only upgrade events that matched the eve route, and only on hosts where Nitro exposes the raw Node upgrade request, socket, and head. Treat it as a compatibility adapter for libraries with server-binding APIs, not the primary way to build websocket channels in eve.

## Cross-channel hand-off

Route handlers can start a session on a different channel via `args.receive(channel, ...)`. Use this when an inbound request on one channel should pivot the conversation onto another, such as an incident webhook that opens an investigation thread in Slack.

```ts
import { defineChannel, POST } from "eve/channels";
import slack from "./slack.js";

export default defineChannel({
  routes: [
    POST("/incident", async (req, args) => {
      const incident = await req.json();

      args.waitUntil(
        args.receive(slack, {
          message: `Investigate ${incident.reference}: ${incident.title}`,
          target: { channelId: "C0123ABC" },
          auth: {
            authenticator: "incidentio",
            principalType: "service",
            principalId: incident.actor.id,
            attributes: { reference: incident.reference, severity: incident.severity },
          },
        }),
      );

      return new Response("ok");
    }),
  ],
});
```

Semantics:

* The target channel's authored `receive(input, { send })` hook owns the continuation-token format and initial state. Callers supply only `{ message, target, auth }`.
* `auth` flows through to `session.auth.initiator` so the target's event handlers and the agent's tools can read who started the session.
* Calling `args.receive(...)` does not also start a session on the current channel. The inbound channel's response is whatever the route handler returns explicitly.
* The first argument is the target channel module's default export. Import it directly from `agent/channels/<name>.ts`. Identity is matched by reference.

## Channel metadata

A channel can project a subset of its adapter state as metadata, available to instrumentation resolvers, dynamic tool resolvers, and dynamic skill or instruction resolvers. Define a `metadata(state)` function on the channel config:

```ts
import { defineChannel, POST } from "eve/channels";

export default defineChannel({
  state: {
    topic: null as string | null,
    contextMessages: [] as string[],
    internalCounter: 0,
  },

  metadata(state) {
    return {
      topic: state.topic,
      contextMessages: state.contextMessages,
    };
  },

  routes: [
    POST("/start", async (req, { send }) => {
      const body = await req.json();
      await send(body.message, {
        auth: null,
        continuationToken: body.token,
        state: { topic: body.topic, contextMessages: body.context, internalCounter: 0 },
      });

      return new Response("ok");
    }),
  ],
  events: {
    "turn.started"(eventData, channel) {
      channel.state.internalCounter += 1;
    },
  },
});
```

The projection is re-evaluated whenever adapter state changes after channel event handlers run. Dynamic tool resolvers read it via `ctx.channel.metadata` and narrow it with `isChannel`. See [Dynamic capabilities](../guides/dynamic-capabilities) for the full consumption pattern.

When a parent agent dispatches a subagent, the framework forwards the parent's channel metadata projection to the child. The same `metadata(state)` projector also serves instrumentation metadata resolvers.

## Continuation tokens

Each call to `send(message, { auth, continuationToken, state? })` from a channel route addresses a session by its channel-local raw token. The framework prepends the channel name, derived from the file stem under `agent/channels/`, before handing the token to the runtime. Built-in channels export helpers that build the raw token from their platform's identity fields:

```ts
import { slackContinuationToken } from "eve/channels/slack";
import { twilioContinuationToken } from "eve/channels/twilio";

slackContinuationToken("C0123ABC", "1800000000.001234"); // "C0123ABC:1800000000.001234"
twilioContinuationToken("+15551234567", "+15557654321"); // "+15551234567:+15557654321"
```

Custom channels write their own function that joins the identity fields. The framework derives nothing for you; the channel owns its token format.
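A custom token function can be as small as the sketch below. The ticketing channel, the helper names, and the choice of `:` as separator are all assumptions for illustration; the separator simply mirrors the built-in helpers, and any format works as long as the channel builds and parses it consistently.

```typescript
// Hypothetical token helpers for a custom ticketing channel (not an eve API).
function ticketContinuationToken(workspaceId: string, ticketId: string): string {
  // Reject fields containing the separator so the token parses unambiguously.
  if (workspaceId.includes(":") || ticketId.includes(":")) {
    throw new Error("identity fields must not contain ':'");
  }
  return `${workspaceId}:${ticketId}`;
}

function parseTicketContinuationToken(
  token: string,
): { workspaceId: string; ticketId: string } {
  const [workspaceId, ticketId, ...extra] = token.split(":");
  if (!workspaceId || !ticketId || extra.length > 0) {
    throw new Error(`malformed continuation token: ${token}`);
  }
  return { workspaceId, ticketId };
}

const ticketToken = ticketContinuationToken("acme", "T-42");
console.log(ticketToken, parseTicketContinuationToken(ticketToken));
```

Pass the returned string as `continuationToken` in `send(...)`; the framework adds the channel namespace on top of it.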

When the identity that should address a session is not known until later, the channel can re-key the parked session by calling `session.setContinuationToken(...)`. Pass the channel-local raw token; the runtime preserves the current channel namespace.

The `context(state, session)` config option builds the per-step `channel` argument handed to every event handler. It receives the channel's live adapter `state` and a `SessionHandle`, and returns the channel-owned context (thread handles, API clients, late-bound callbacks). The framework injects [`ChannelSessionOps`](#define-a-channel) and passes the result as the second positional argument to each handler. Closing over `session` lets the factory register callbacks that re-key the session later. State mutations made through the returned context are written back to adapter state.

```ts
import { defineChannel } from "eve/channels";

import { mintRef } from "./refs.js";

defineChannel<{ ref: string | null }>({
  state: { ref: null },
  context(state, session) {
    return {
      state,
      registerAnchor(ref: string) {
        state.ref = ref;
        session.setContinuationToken(ref);
      },
    };
  },
  events: {
    "message.completed"(eventData, channel) {
      if (!channel.state.ref) channel.registerAnchor(mintRef());
    },
  },
  routes: [
    /* ... */
  ],
});
```

The workflow runtime disposes the current park hook at the next step boundary and registers a new one at the new token. Inbound deliveries already addressed to the old token are dropped, so coordinate with your senders to use the new token.

## File uploads

`send()` accepts `string | UserContent`. To include file attachments, pass a `UserContent` array mixing text and file parts:

```ts
await send(
  [
    { type: "text", text: body.message },
    { type: "file", data: imageBytes, mediaType: "image/png" },
  ],
  { auth, continuationToken },
);
```

For platforms like Slack where files sit behind authenticated URLs, put a `URL` object in `FilePart.data` and declare `fetchFile` on the channel config:

```ts
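import { defineChannel, POST } from "eve/channels";

// `token`, `message`, `auth`, `continuationToken`, and `state` come from the
// surrounding channel code, elided here.
defineChannel({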
  fetchFile(url) {
    if (!url.startsWith("https://files.slack.com/")) return null;
    return fetch(url, { headers: { authorization: `Bearer ${token}` } })
      .then((r) => r.arrayBuffer())
      .then((b) => ({ bytes: Buffer.from(b) }));
  },

  routes: [
    POST("/webhook", async (req, { send }) => {
      await send(
        [
          { type: "text", text: message.text },
          ...message.attachments.map((a) => ({
            type: "file" as const,
            data: new URL(a.url),
            mediaType: a.mediaType,
          })),
        ],
        { auth, continuationToken, state },
      );
    }),
  ],
});
```

The `URL` object survives the queue boundary as a string and is reconstituted inside the workflow step. The staging pipeline calls `fetchFile` with the URL serialized as a string (the URL's `href`), which is why the example matches on `url.startsWith(...)`. Return bytes to stage the file to the sandbox, or `null` to let the URL pass through to the model provider.
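The string round-trip can be seen in isolation with the standard `URL` class. This sketch does not touch eve; `shouldFetchWithAuth` is a hypothetical predicate and the file URL is made up.

```typescript
// Sketch of the round-trip described above: a URL object is serialized to
// its href string, matched as a string, and can be rebuilt with `new URL`.
const SLACK_FILES_PREFIX = "https://files.slack.com/";

function shouldFetchWithAuth(href: string): boolean {
  return href.startsWith(SLACK_FILES_PREFIX);
}

// Made-up file URL for illustration.
const originalUrl = new URL("https://files.slack.com/files-pri/T000-F000/report.pdf");
const serializedHref = originalUrl.href; // the string form that crosses the queue
const rebuiltUrl = new URL(serializedHref); // reconstituted on the other side

console.log(shouldFetchWithAuth(serializedHref), rebuiltUrl.href === originalUrl.href);
```

Because `href` is the normalized form (lowercase scheme and host), a prefix check against a lowercase origin with a trailing slash behaves predictably.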

The framework handles staging bytes to the sandbox, enforcing upload policy, hydrating files for the model call, and reconstituting `URL` objects after queue serialization.

## What to read next

* [Channels overview](./overview)
* [Dynamic capabilities](../guides/dynamic-capabilities)
* [Auth & route protection](../guides/auth-and-route-protection)



---
title: Discord
description: Reach your agent from Discord HTTP Interactions, including slash commands, components, and modals.
type: integration
---

# Discord



The Discord channel wires your agent into Discord's HTTP Interactions, including slash and application commands, message components, and modal submissions. Discord enforces a three-second ACK deadline, so the channel verifies the Ed25519 signature headers, acknowledges the command right away, and runs the eve work in the background. See [Channels](./overview) for the contract this builds on.

## Add the channel

```ts title="agent/channels/discord.ts"
import { discordChannel } from "eve/channels/discord";

export default discordChannel();
```

```bash
DISCORD_PUBLIC_KEY=...      # verifies X-Signature-Ed25519 + X-Signature-Timestamp
DISCORD_APPLICATION_ID=...  # edits the deferred response and sends followups
DISCORD_BOT_TOKEN=...       # proactive messages + fallback + typing indicators
```

To skip env vars, pass the same values through `credentials: { applicationId, botToken, publicKey }`. The route is `POST /eve/v1/discord` by default. Paste that public URL into your Discord application's Interactions Endpoint URL.

## Register a command

Registering commands is on you, not the channel. Use Discord's API or the Developer Portal. A string option named `message` lines up with eve's default prompt extraction:

```bash
curl -X PUT "https://discord.com/api/v10/applications/$DISCORD_APPLICATION_ID/commands" \
  -H "Authorization: Bot $DISCORD_BOT_TOKEN" -H "Content-Type: application/json" \
  -d '[{"name":"ask","description":"Ask the eve agent","type":1,
    "options":[{"name":"message","description":"What should the agent do?","type":3,"required":true}]}]'
```

Use guild commands during development for faster propagation.

## How the channel handles messages

### Dispatch

`onCommand(ctx, interaction)` decides whether to dispatch and under what `auth`. Return `{ auth }` to proceed or `null` to drop the interaction. By default, auth comes from the invoking user. Event handlers receive `(eventData, channel, ctx)`, with Discord platform handles on `channel.discord`:

```ts
import { discordChannel } from "eve/channels/discord";

export default discordChannel({
  onCommand: (ctx, interaction) => ({
    auth: {
      principalId: interaction.user.id,
      principalType: "user",
      authenticator: "discord",
      attributes: { channel_id: interaction.channelId, guild_id: interaction.guildId ?? "" },
    },
  }),
  events: {
    "message.completed"(eventData, channel, ctx) {
      if (eventData.finishReason === "tool-calls") return;
      if (eventData.message) channel.discord.post(eventData.message);
    },
  },
});
```

### Delivery

The default `message.completed` handler edits the deferred response for the first reply and sends followups after that. If the interaction token is rejected, it falls back to a bot-authenticated channel message. Long text is split to Discord's 2000-char limit, and generated messages default to `allowed_mentions: { parse: [] }`.
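If you replace the default handler and post text yourself, you need your own splitting. The following is one possible approach, not the channel's actual algorithm: `splitForDiscord` is a hypothetical helper that prefers to break at the last newline inside each window and hard-cuts otherwise.

```typescript
// Hypothetical splitter (not the channel's built-in logic): break text into
// chunks of at most `limit` characters, preferring newline boundaries.
function splitForDiscord(text: string, limit = 2000): string[] {
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    const breakAt = rest.slice(0, limit).lastIndexOf("\n");
    if (breakAt > 0) {
      // Cut at the newline and drop the newline itself.
      chunks.push(rest.slice(0, breakAt));
      rest = rest.slice(breakAt + 1);
    } else {
      // No usable newline in the window: hard cut at the limit.
      chunks.push(rest.slice(0, limit));
      rest = rest.slice(limit);
    }
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}

console.log(splitForDiscord("a".repeat(4500)).map((chunk) => chunk.length));
```

A hard cut counts UTF-16 code units, so it can split a surrogate pair; a production splitter may want to guard against that.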

Typing fires on `turn.started` and `actions.requested`, but only when a bot token is present. In custom hooks, call `channel.discord.startTyping()` yourself.

### Human-in-the-loop (HITL)

HITL renders as Discord components. Confirmations and options become buttons, `display: "select"` becomes a string select, and freeform input becomes a button that opens a modal. When the user responds, the parked session (paused awaiting input) resumes.

### Proactive sessions

Start a session without an inbound interaction through `receive(discord, { message, target, auth })` from a schedule `run` handler, or `args.receive(discord, ...)` from another channel. The proactive target shape is `{ channelId, conversationId?, initialMessage? }`. Either path needs `DISCORD_BOT_TOKEN`.

### Attachments

Inbound file attachments are not supported on this channel today.

## What to read next

* [Channels overview](./overview): the channel contract and every built-in channel
* [Auth & route protection](../guides/auth-and-route-protection): authenticating inbound traffic



---
title: eve
description: The default HTTP API for an agent, covering session routes, auth, and customization.
---

# eve



The eve channel is the framework's default HTTP API. It's what the terminal UI, [`useEveAgent`](../guides/frontend/overview), `curl`, and any SDK client talk to when they start sessions, send messages, and stream events. `eveChannel()` mounts the canonical session routes under `/eve/v1/session*`, and they are enabled by default even when `agent/channels/eve.ts` does not exist.

Reach for it when something needs HTTP access to your agent, including local tooling, a browser frontend, the terminal UI, or another API client. Most apps never write this file. Add `agent/channels/eve.ts` only to override the defaults, usually the route auth policy.

```ts title="agent/channels/eve.ts"
import { eveChannel } from "eve/channels/eve";
import { localDev, vercelOidc } from "eve/channels/auth";

export default eveChannel({
  auth: [localDev(), vercelOidc()],
});
```

## Routes

The channel exposes routes that create sessions, send follow-ups, and stream events:

* `GET /eve/v1/health`
* `POST /eve/v1/session` (start a session)
* `POST /eve/v1/session/:sessionId` (send a follow-up)
* `GET /eve/v1/session/:sessionId/stream` (stream events, NDJSON)

Start a session with a minimal body. The response returns `sessionId` and the `continuationToken` you reuse for follow-ups:

```bash
curl -X POST https://<deployment>/eve/v1/session \
  -H "Content-Type: application/json" \
  -d '{"message":"What is the weather in Paris?"}'
# {"continuationToken":"eve:7f3c...","ok":true,"sessionId":"ses_01h..."}
```

Stream that session's events as newline-delimited JSON (`application/x-ndjson; charset=utf-8`), one event object per line:

```bash
curl -N https://<deployment>/eve/v1/session/ses_01h.../stream
# {"type":"turn.started",...}
# {"type":"text.delta","delta":"It is "}
# {"type":"message.completed",...}
```

See [Sessions, runs & streaming](../concepts/sessions-runs-and-streaming) for the full request and stream flow, including the complete event set.
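For a hand-rolled client, the stream has to be decoded line by line, and a network chunk can end in the middle of an event. Below is a minimal sketch of an incremental decoder. `createNdjsonDecoder` and the `StreamEvent` type are hypothetical names, and only the `type` field is assumed on each event.

```typescript
// Minimal incremental NDJSON decoder. Only the `type` field is assumed on
// each event; every other field is passed through untouched.
type StreamEvent = { type: string; [key: string]: unknown };

function createNdjsonDecoder(): (chunk: string) => StreamEvent[] {
  let buffer = "";
  return (chunk) => {
    buffer += chunk;
    const lines = buffer.split("\n");
    // The last element is either "" or an incomplete line; keep it buffered
    // until the next chunk completes it.
    buffer = lines.pop() ?? "";
    return lines
      .map((line) => line.trim())
      .filter((line) => line !== "")
      .map((line) => JSON.parse(line) as StreamEvent);
  };
}

const decodeChunk = createNdjsonDecoder();
// The second event arrives split across two chunks.
const firstBatch = decodeChunk('{"type":"turn.started"}\n{"type":"text.del');
const secondBatch = decodeChunk('ta","delta":"It is "}\n');
console.log(firstBatch, secondBatch);
```

Feed it the decoded text chunks of the response body in arrival order. The TypeScript SDK already handles this, along with the stream cursor and reconnects, so a decoder like this is only needed for custom clients.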

## Authentication

The `auth` option decides who can call these routes. The built-in helpers cover development and trusted infrastructure:

* `localDev()` accepts requests during local development.
* `vercelOidc()` lets the local CLI reach a deployed agent, and lets other internal deployments from your team call it.

Neither admits browser users or external clients in production. For a public app, wire the channel to your own auth (Clerk, Auth.js, your own OIDC/JWT verification, an API-key verifier, or any custom `AuthFn`). Vercel OIDC is optional; use it only when Vercel-issued deployment tokens are part of your trust model.

`eve init` scaffolds an `agent/channels/eve.ts` with a production placeholder so you replace it before going live. The generated channel allows Vercel OIDC and localhost, and includes `placeholderAuth()`, which returns a setup-focused 401 in production until you swap it for real auth. Delete the file and eve falls back to `[localDev(), vercelOidc()]`, which still does not admit browser users in production.

For the full auth model and helper list, see [Auth & route protection](../guides/auth-and-route-protection).

## Customization

Use `onMessage` to add request-specific context before the agent sees the user message, and `events` to observe stream events from sessions this channel created:

```ts title="agent/channels/eve.ts"
import { eveChannel, defaultEveAuth } from "eve/channels/eve";
import { localDev, vercelOidc } from "eve/channels/auth";

export default eveChannel({
  auth: [localDev(), vercelOidc()],
  onMessage(ctx, message) {
    const callerId = ctx.eve.caller?.principalId ?? "anonymous";
    return {
      auth: defaultEveAuth(ctx),
      context: [`HTTP caller ${callerId} sent: ${message}`],
    };
  },
  events: {
    "message.completed"(eventData, channel, ctx) {
      console.log("eve response completed", {
        continuationToken: channel.continuationToken,
        sessionId: ctx.session.id,
      });
    },
  },
});
```

## Clients

The browser side of this API lives in the [Frontend](../guides/frontend/overview) docs, where `useEveAgent` drives the eve channel from React UI.

For scripts, server-to-server calls, evals, tests, and custom clients, use the [TypeScript SDK](../guides/client/overview). It wraps the session routes, continuation token, stream cursor, and reconnect loop.

## What to read next

* [Frontend](../guides/frontend/overview): drive the eve channel from browser UI with `useEveAgent`
* [TypeScript SDK](../guides/client/overview): call the eve channel from TypeScript
* [Auth & route protection](../guides/auth-and-route-protection): the route auth policy
* [Sessions, runs & streaming](../concepts/sessions-runs-and-streaming): the routes this channel exposes


---

For a semantic overview of all documentation, see [/sitemap.md](/sitemap.md)

For an index of all available documentation, see [/llms.txt](/llms.txt)

For agent-facing discovery, including API and MCP surfaces, see [/agents.md](/agents.md)

---
title: GitHub
description: Reach your agent from GitHub App webhooks, with @mention dispatch, PR diff context, and sandbox checkout.
type: integration
---

# GitHub



The GitHub channel lets the agent work directly on a repository. Someone `@mentions` it in an issue, PR, or review comment, and the agent answers right there in the thread, with the PR diff already in context and the repo checked out into the sandbox. It takes GitHub App webhooks at `/eve/v1/github`, checks the signature, derives auth from whoever triggered the event, and replies on the native surface. See [Channels](./overview) for the contract this builds on.

## Add the channel

```ts title="agent/channels/github.ts"
import { githubChannel } from "eve/channels/github";

export default githubChannel({
  botName: "my-agent",
  credentials: {
    appId: process.env.GITHUB_APP_ID,
    privateKey: process.env.GITHUB_APP_PRIVATE_KEY,
    webhookSecret: process.env.GITHUB_WEBHOOK_SECRET,
  },
});
```

Every field falls back to an env var, so you can drop the `credentials` block entirely once these are set:

```bash
GITHUB_APP_ID=...            # GitHub App id
GITHUB_APP_PRIVATE_KEY=...   # GitHub App private key (PEM)
GITHUB_WEBHOOK_SECRET=...    # verifies the webhook signature
GITHUB_APP_SLUG=...          # supplies botName when it is not set in config
```

`appId`/`privateKey`/`webhookSecret` also take a lazy resolver function if you'd rather fetch them on demand.

Point the GitHub App webhook URL at `https://<deployment>/eve/v1/github`. For mention-driven turns, subscribe to `issue_comment` and `pull_request_review_comment`; add `issues` / `pull_request` if you wire up their opt-in hooks. A comment that `@mention`s `botName` starts a turn.
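If you replace the default gate with your own `onComment` hook, you may still want a mention check of your own. The sketch below shows one way to test a comment body for an `@botName` handle. It is an illustration, not the channel's internal matcher; `commentMentionsBot` and `escapeRegExp` are hypothetical helpers, and case-insensitive matching is an assumption based on GitHub handles being case-insensitive.

```ts
// Escape regex metacharacters so the bot name is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// True when `body` contains "@botName" as a whole handle.
// Assumption: handles consist of letters, digits, and hyphens.
function commentMentionsBot(body: string, botName: string): boolean {
  const pattern = new RegExp(
    `(^|[^A-Za-z0-9-])@${escapeRegExp(botName)}(?![A-Za-z0-9-])`,
    "i",
  );
  return pattern.test(body);
}
```

The boundary checks keep `@my-agent-2` and `foo@my-agent` from counting as mentions of `my-agent`.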

## How the channel handles messages

### Dispatch

Inbound hooks return `{ auth }` to dispatch, or `null` to ignore. Use `defaultGitHubAuth(ctx)` to derive auth from the actor.

```ts
import { defaultGitHubAuth, githubChannel } from "eve/channels/github";

export default githubChannel({
  botName: "my-agent",
  // Replaces the @mention gate. ctx.conversation.kind is "issue", "pull_request", or "review_thread".
  onComment: (ctx, comment) => ({ auth: defaultGitHubAuth(ctx) }),
  // Opt in; no default dispatch on these events.
  onIssue: (ctx, issue) => (issue.action === "opened" ? { auth: defaultGitHubAuth(ctx) } : null),
  onPullRequest: (ctx, pr) => (pr.action === "opened" ? { auth: defaultGitHubAuth(ctx) } : null),
});
```

### Delivery

When a turn starts, the channel adds an `eyes` reaction to the triggering comment (turn this off with `progress: { reactions: false }`). The reply comes back as a comment, on the timeline or in the review thread, and splits across multiple comments when it runs long. If the turn fails, you get a short error comment carrying an error id.

### Human-in-the-loop (HITL)

GitHub comments have no interactive button or card affordance. A human-in-the-loop (HITL) `input.requested` event is posted as a comment prompt, and the user's reply comment maps back to the pending input request. Declare an `events["input.requested"]` handler to customize the prompt.

### Proactive sessions

Start a session without an inbound mention through `receive(github, { message, target, auth })` from a schedule `run` handler, or `args.receive(github, ...)` from another channel. The target requires `owner`, `repo`, and exactly one of `issueNumber` or `pullRequestNumber`.
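The "exactly one of" rule for the target can be expressed in types and checked at runtime. This is a sketch under assumptions: `GitHubProactiveTarget` and `isGitHubProactiveTarget` are hypothetical names for illustration, and eve's exported types may be shaped differently.

```ts
interface GitHubTargetBase {
  owner: string;
  repo: string;
}

// Hypothetical union: each branch forbids the other number with `never`.
type GitHubProactiveTarget =
  | (GitHubTargetBase & { issueNumber: number; pullRequestNumber?: never })
  | (GitHubTargetBase & { pullRequestNumber: number; issueNumber?: never });

// Runtime guard for untyped input, such as a target loaded from JSON.
function isGitHubProactiveTarget(value: unknown): value is GitHubProactiveTarget {
  if (typeof value !== "object" || value === null) return false;
  const fields = value as Record<string, unknown>;
  if (typeof fields["owner"] !== "string" || typeof fields["repo"] !== "string") return false;
  const issue = fields["issueNumber"];
  const pullRequest = fields["pullRequestNumber"];
  const hasIssue = issue !== undefined;
  const hasPullRequest = pullRequest !== undefined;
  // Exactly one of the two numbers must be present.
  if (hasIssue === hasPullRequest) return false;
  return typeof (hasIssue ? issue : pullRequest) === "number";
}
```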

### Attachments

Inbound file attachments are not supported on this channel today. Repository contents reach the agent through the sandbox checkout below, not as message attachments.

### PR context

Summon the agent on a PR and it always sees the diff. PR metadata and the changed-file patch land in `context`. Large generated files still appear in the list, but their patch body is dropped; add more paths to the skip list with `pullRequestContext.excludedFiles`.

### Sandbox checkout

Before the first model call, every triggered turn checks out the relevant ref into the sandbox, so `read_file`/`glob`/`grep`/`bash` all run against the real tree. The installation token never enters the sandbox. `git` fetches a token-free URL, and the platform injects auth on egress at the firewall. That requires a firewall-capable backend (Vercel); the local backend skips checkout. Within a session, checkout is incremental across turns.

### Arbitrary API calls

For anything the channel doesn't wrap, call `ctx.github.request({ method, path, body })`. It carries installation-token auth.

## What to read next

* [Channels overview](./overview): the channel contract and every built-in channel
* [Auth & route protection](../guides/auth-and-route-protection): authenticating inbound traffic



---
title: Linear
description: Reach your agent through Linear Agent Sessions, with native Agent Activities for progress, questions, and responses.
type: integration
---

# Linear



The Linear channel uses Linear's Agent Session surface rather than ordinary comments. Users delegate work to the agent from Linear, eve receives `AgentSessionEvent` webhooks at `/eve/v1/linear`, and the channel replies with native Agent Activities, including `thought`, `action`, `elicitation`, `response`, and `error`. See [Channels](./overview) for the contract this builds on.

## Add the channel

```ts title="agent/channels/linear.ts"
import { linearChannel } from "eve/channels/linear";

export default linearChannel({
  credentials: {
    accessToken: process.env.LINEAR_AGENT_ACCESS_TOKEN,
    webhookSecret: process.env.LINEAR_WEBHOOK_SECRET,
  },
});
```

```bash
LINEAR_AGENT_ACCESS_TOKEN=lin_api_... # posts Agent Activities and creates proactive sessions
LINEAR_WEBHOOK_SECRET=...             # verifies Linear-Signature
```

The sample passes credentials explicitly. To rely on env vars instead, drop the `credentials` block: the access token falls back to `LINEAR_AGENT_ACCESS_TOKEN`, `LINEAR_ACCESS_TOKEN`, `LINEAR_API_KEY`, or `LINEAR_API_TOKEN`, and the webhook secret falls back to `LINEAR_WEBHOOK_SECRET`. Both fields also accept lazy resolver functions.
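To make the fallback concrete, here is a sketch of how such a lookup could work. It assumes the variables are tried in the order listed above and that empty values are skipped; neither detail is confirmed by this page. `resolveLinearAccessToken` is a hypothetical helper, not an eve export.

```ts
// Assumption: precedence follows the order in which the names are listed.
const LINEAR_TOKEN_ENV_ORDER = [
  "LINEAR_AGENT_ACCESS_TOKEN",
  "LINEAR_ACCESS_TOKEN",
  "LINEAR_API_KEY",
  "LINEAR_API_TOKEN",
] as const;

// Return the first non-empty token found, or undefined when none is set.
function resolveLinearAccessToken(
  env: Record<string, string | undefined>,
): string | undefined {
  for (const name of LINEAR_TOKEN_ENV_ORDER) {
    const value = env[name];
    if (value !== undefined && value !== "") return value;
  }
  return undefined;
}
```

Setting only `LINEAR_AGENT_ACCESS_TOKEN` avoids any ambiguity about which variable wins.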

## Configure Linear

Create a Linear OAuth app, enable Agent Session events, and point the webhook URL at:

```text
https://<deployment>/eve/v1/linear
```

For Linear's agent surface, configure the OAuth authorize URL with `actor=app` and grant the app scopes that let it appear as an agent in Linear, including `app:assignable` and `app:mentionable`. Subscribe to the `AgentSessionEvent` webhook category so Linear sends `created` events when the agent is delegated or mentioned and `prompted` events when the user continues the session.

Linear sends webhook signatures in `Linear-Signature`; eve verifies the HMAC over the raw body and rejects stale `webhookTimestamp` values. If a trusted gateway verifies Linear before the request reaches eve, pass `credentials.webhookVerifier` instead of a webhook secret.

## How the channel handles messages

### Dispatch

The default hook dispatches `created` and `prompted` Agent Session events. eve adds a Linear context block with the agent session, issue, comment, and organization identifiers, then continues the same session with `agent-session:<id>`.

### Delivery

Turn start posts an ephemeral `thought`, tool calls post ephemeral `action` activities, final assistant text posts a durable `response`, and failures post `error` activities. When the model emits text before a tool call, eve buffers the first non-empty line and uses it as the next ephemeral Linear `thought`, mirroring Slack's typing-status behavior.

### Human-in-the-loop (HITL)

Human-in-the-loop (HITL) input requests render as Linear `elicitation` activities. When the user replies to the Agent Session, the channel resolves that prompt back to the pending eve input request and resumes with `inputResponses`.

### Proactive sessions

Start a session without an inbound webhook with `receive(linear, { target })`. See [Proactive sessions](#proactive-sessions) below for the target shape and examples.

### Attachments

Inbound file attachments are not supported on this channel today.

### API handle

Event handlers receive `channel.linear`, which exposes `createActivity`, `listActivities`, and `updateSession` for custom Agent Activity delivery and Agent Session metadata.

## Custom hooks

Return `{ auth }` to dispatch, or `null` to acknowledge without waking the agent.

```ts
import { defaultLinearAuth, linearChannel } from "eve/channels/linear";

export default linearChannel({
  onAgentSession: (_ctx, event) => {
    if (event.action !== "created" && event.action !== "prompted") return null;
    return { auth: defaultLinearAuth(event) };
  },
});
```

Restrict dispatch to a subset of Linear teams or projects by inspecting `event.agentSession.issue` in `onAgentSession`. Add extra context by returning `context` alongside `auth`.

```ts
import { defaultLinearAuth, linearChannel } from "eve/channels/linear";

export default linearChannel({
  onAgentSession: (_ctx, event) => {
    if (event.agentSession.issue?.identifier?.startsWith("OPS-") !== true) return null;
    return {
      auth: defaultLinearAuth(event),
      context: ["Only make reversible changes unless the issue says otherwise."],
    };
  },
});
```

Override event delivery when you want more specific Agent Activities.

```ts
import { linearChannel } from "eve/channels/linear";

export default linearChannel({
  events: {
    async "message.completed"(eventData, channel) {
      if (eventData.finishReason === "tool-calls" || !eventData.message) return;
      await channel.linear.createActivity({
        body: `Done.\n\n${eventData.message}`,
        type: "response",
      });
    },
    async "input.requested"(eventData, channel) {
      await channel.linear.createActivity({
        body: eventData.requests.map((request) => request.prompt).join("\n\n"),
        type: "elicitation",
      });
    },
  },
});
```

Add session-level links when your agent creates an external artifact.

```ts
await channel.linear.updateSession({
  addedExternalUrls: [{ label: "Run log", url: "https://example.com/runs/123" }],
});
```

## Proactive sessions

Use the channel's `receive` target to continue an existing Agent Session or create one from a Linear issue or root comment. The target accepts an existing `agentSessionId`, or an `issueId` or root `commentId` to create a new session before sending the message. The example below runs from a schedule; a route handler uses the same target shape through its own `receive` helper.

```ts
import { defineSchedule } from "eve/schedules";

import linear from "../channels/linear.js";

export default defineSchedule({
  cron: "0 14 * * 1",
  async run({ receive, waitUntil, appAuth }) {
    waitUntil(
      receive(linear, {
        auth: appAuth,
        message: "Post a concise status update with blockers and next actions.",
        target: {
          issueId: "EVE-123",
          initialActivity: "Preparing the status update.",
        },
      }),
    );
  },
});
```

For issue or comment targets, the channel calls Linear's proactive Agent Session mutations before starting the eve turn. For an existing `agentSessionId`, it skips session creation and only seeds the continuation token.
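The branching described above can be sketched as a small planning function. This is illustrative only: `LinearProactiveTargetLike`, `LinearTargetPlan`, and `planLinearTarget` are hypothetical names, and the precedence applied when more than one id is set is an assumption.

```ts
// Hypothetical target shape using the fields named in this section.
interface LinearProactiveTargetLike {
  agentSessionId?: string;
  issueId?: string;
  commentId?: string;
  initialActivity?: string;
}

type LinearTargetPlan =
  | { kind: "continue"; agentSessionId: string }
  | { kind: "create-on-issue"; issueId: string }
  | { kind: "create-on-comment"; commentId: string };

// Decide whether to reuse an Agent Session or create one first.
function planLinearTarget(target: LinearProactiveTargetLike): LinearTargetPlan {
  // Assumption: an existing session id takes precedence over issue and comment ids.
  if (target.agentSessionId) {
    return { kind: "continue", agentSessionId: target.agentSessionId };
  }
  if (target.issueId) return { kind: "create-on-issue", issueId: target.issueId };
  if (target.commentId) return { kind: "create-on-comment", commentId: target.commentId };
  throw new Error("target needs agentSessionId, issueId, or commentId");
}
```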

## What to read next

* [Channels overview](./overview): the channel contract and every built-in channel
* [Connections](../connections): use the Linear MCP connection when the agent needs to inspect or edit Linear data from another channel



---
title: Overview
description: "How users reach your agent: the channel contract, the base eve HTTP channel, and authoring custom channels."
---

# Overview



A channel is the edge adapter between a platform and your agent. It does three things:

* Normalizes platform input into a user message.
* Owns the `continuationToken`, the resume handle for a conversation on that surface.
* Decides delivery, meaning how, where, and whether a response goes back.

eve ships a base HTTP channel plus first-class platform channels, and you can author your own. Browse the full set in the [Integrations](/integrations) gallery.

Each channel has its own provider terms, data flow, auth model, and user-consent expectations. Before sending non-public, sensitive, regulated, or production data through a channel, confirm that the channel provider and your configured scopes, signature checks, route auth, and delivery behavior are appropriate for your use case.

## Where channels live

Channel files live under `agent/channels/` in the root agent. The file stem is the channel id: `agent/channels/intake.ts` is addressed as `intake`. Export the channel as the module's default export. Local subagents do not declare channels.

```text
agent/
  agent.ts
  channels/
    eve.ts
    slack.ts
    intake.ts
```
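The file-stem rule is simple enough to show in a few lines. The helper below is a hypothetical illustration of the naming convention, not part of eve's API.

```ts
// Derive a channel id from a channel file path by taking the file stem.
function channelIdFromPath(filePath: string): string {
  const fileName = filePath.split("/").pop() ?? filePath;
  const dot = fileName.lastIndexOf(".");
  // Keep the whole name when there is no extension.
  return dot > 0 ? fileName.slice(0, dot) : fileName;
}
```

With the tree above, `agent/channels/intake.ts` maps to `intake` and `agent/channels/slack.ts` maps to `slack`.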

Scaffold a channel file with `eve channels add` (interactive), or pass a kind: `eve channels add slack` or `eve channels add web`. You can also author the file by hand.

## The eve HTTP channel (default)

The eve channel is the framework's default HTTP session API, the routes the terminal UI, [`useEveAgent`](../guides/frontend/overview), and `curl` all talk to. It is enabled by default, even with no `agent/channels/eve.ts` file. Add that file only to override the defaults, most often the route auth policy. See [HTTP channel](./eve) for routes, auth, and customization.

## Custom channels

When eve doesn't ship a channel for your surface, build one with `defineChannel` from `eve/channels`. A custom channel declares route handlers (`GET`, `POST`, `PUT`, `PATCH`, `DELETE`, `WS`) and an `events` map, and calls `send` inside a handler to start or resume a session. See [Custom channels](./custom) for the full walkthrough, including WebSocket routes, cross-channel hand-off, channel metadata, continuation tokens, and file uploads.

## Relationship to the Chat SDK

eve uses the Chat SDK's **card-builder components** (Cards, Buttons, Actions, etc.) for composing rich Slack messages. When you build a card with the [Slack channel](./slack), the underlying primitives come from the Chat SDK and get converted to Slack Block Kit at post time.

eve does **not** use the Chat SDK's runtime. The `Chat`, `Adapter`, and `Thread` primitives are never imported or reachable through eve's public API. eve implements its own channel layer (webhook handling, signature verification, event parsing, and thread management). Building Slack messages works like Chat SDK cards, but wiring a channel means authoring against eve's `defineChannel(...)` API, not a Chat SDK adapter.

## Which channel?

| You want…                                   | Use                                                        |
| ------------------------------------------- | ---------------------------------------------------------- |
| A web app / browser chat UI                 | eve channel + [`useEveAgent`](../guides/frontend/overview) |
| Local tooling, SDK clients, `curl`          | eve channel (default)                                      |
| Slack mentions, DMs, buttons                | [Slack](./slack)                                           |
| Discord slash commands, components          | [Discord](./discord)                                       |
| Microsoft Teams messages + Adaptive Cards   | [Teams](./teams)                                           |
| Telegram bot messages                       | [Telegram](./telegram)                                     |
| SMS or speech-transcribed phone calls       | [Twilio](./twilio)                                         |
| GitHub @mentions, PR review with checkout   | [GitHub](./github)                                         |
| Linear issue delegation and Agent Sessions  | [Linear](./linear)                                         |
| Anything else (internal webhook, WebSocket) | Custom channel (`defineChannel`, above)                    |

## Disclaimer

As the deployer, it is your responsibility to ensure your agent complies with applicable laws.

Where an eve agent communicates with people, applicable law may require you to disclose that they are interacting with an automated AI system. eve does not add this disclosure automatically; configure it in your instructions and/or channel responses.

## What to read next

* [Slack](./slack): the most common platform channel, end to end
* [Custom channels](./custom): build a channel for any surface with `defineChannel`
* [Frontend](../guides/frontend/overview): browser chat on the eve channel with `useEveAgent`
* [Integrations](/integrations): browse every built-in channel and connection in one gallery



---
title: Slack
description: Reach your agent from Slack app mentions and DMs, with thread anchoring, buttons, and Vercel Connect credentials.
type: integration
---

# Slack



The Slack channel puts your agent inside a workspace. It answers `@mentions` and DMs, replies in threads, shows typing indicators, and turns human-in-the-loop (HITL) prompts into buttons. Use it when the conversation should happen where your team already works. Credentials run through [Vercel Connect](../guides/auth-and-route-protection), which handles both the outbound bot token and inbound webhook verification, so there's no `SLACK_BOT_TOKEN` or `SLACK_SIGNING_SECRET` for you to manage. See [Channels](./overview) for the contract this builds on.

## Set up Connect

Create a Slack Connect client and copy its UID (e.g. `slack/my-agent`), then attach this project as the trigger destination at eve's Slack route:

```bash
npm install -g vercel@latest && export FF_CONNECT_ENABLED=1
vercel connect create slack --triggers
vercel connect detach <uid> --yes
vercel connect attach <uid> --triggers --trigger-path /eve/v1/slack --yes
```

`FF_CONNECT_ENABLED=1` turns on the Connect commands, which are feature-flagged in the Vercel CLI today. The `create` step provisions a destination at the default Connect path. `detach` then `attach --trigger-path /eve/v1/slack` re-points the trigger at the eve Slack route, since eve does not serve the default Connect path. `--triggers` turns on Slack Event Subscriptions; without it, Slack never delivers `app_mention` or `message.im`. You can also create the client from the [Connect dashboard](https://vercel.com/d?to=/%5Bteam%5D/~/connect\&title=Go+to+Connect).

## Add the channel

Scaffold the channel and its dependency with `eve channels add slack`, or set it up by hand:

```bash
npm install @vercel/connect
```

```ts title="agent/channels/slack.ts"
import { connectSlackCredentials } from "@vercel/connect/eve";
import { slackChannel } from "eve/channels/slack";

export default slackChannel({
  credentials: connectSlackCredentials("slack/my-agent"),
});
```

`connectSlackCredentials` returns `{ botToken, webhookVerifier }`, keeping token rotation, multi-workspace tenancy, and request verification inside Connect rather than your code. Deploy once the trigger destination and channel file are ready:

```bash
VERCEL_USE_EXPERIMENTAL_FRAMEWORKS=1 vercel deploy --prod
```

`VERCEL_USE_EXPERIMENTAL_FRAMEWORKS=1` lets the Vercel CLI recognize eve as a framework during the build. eve's own setup commands set the same flag.

## How the channel handles messages

### Dispatch

Inbound hooks decide whether to dispatch a turn and with what `auth`. Return `{ auth }` to dispatch, `null` to drop, or `{ auth, context }` to inject background into history.

* `onAppMention(ctx, message)` handles `app_mention` events. The default derives workspace-scoped auth and posts a `Thinking…` indicator.
* `onDirectMessage(ctx, message)` handles `message.im` events (needs `im:history` scope). Bot-authored messages and edits are filtered out first.
* `onInteraction(action, ctx)` handles `block_actions` callbacks not consumed by HITL.

You get the triggering mention by default, but not the earlier replies in the thread. Pull them in with `loadThreadContextMessages` and return them as `context`, which eve appends to history as user messages the model sees on every later turn. Use `since: "last-agent-reply"` so repeated mentions in one thread inject only what is new:

```ts
import { defaultSlackAuth, loadThreadContextMessages, slackChannel } from "eve/channels/slack";
import { connectSlackCredentials } from "@vercel/connect/eve";

export default slackChannel({
  credentials: connectSlackCredentials("slack/my-agent"),
  async onAppMention(ctx, message) {
    const auth = defaultSlackAuth(message, ctx);
    const prior = await loadThreadContextMessages(ctx.thread, message, {
      since: "last-agent-reply",
    });
    if (prior.length === 0) return { auth };
    const transcript = prior
      .map((m) => `${m.isMe ? "you" : (m.user ?? "user")}: ${m.markdown}`)
      .join("\n");
    return { auth, context: [`Recent thread messages since your last reply:\n\n${transcript}`] };
  },
});
```

### Delivery

The default handlers reply in-thread and show progress. Typing indicators post automatically: `Thinking…` on inbound, `Working…` on `turn.started`, a truncated reasoning snippet on `reasoning.appended`, and tool status on `actions.requested`. Reasoning snippets build progressively: extensions of at least four characters appear immediately, while smaller streamed deltas use the five-second refresh interval to avoid one Slack request per token. Override `events["reasoning.appended"]` if you prefer generic wording. Override `onAppMention` or the `events` handlers to customize.
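One way to read the snippet refresh rule is as a small decision function. The sketch below is an interpretation of the sentence above, not eve's implementation. `SnippetState` and `shouldRefreshSnippet` are hypothetical names; the two constants come from the text.

```ts
const MIN_IMMEDIATE_EXTENSION = 4; // characters that trigger an immediate update
const REFRESH_INTERVAL_MS = 5000; // five-second refresh for smaller deltas

// What was last shown in Slack and when.
interface SnippetState {
  postedLength: number;
  postedAtMs: number;
}

// Decide whether the reasoning snippet should be re-posted now.
function shouldRefreshSnippet(
  state: SnippetState,
  currentLength: number,
  nowMs: number,
): boolean {
  const extension = currentLength - state.postedLength;
  if (extension <= 0) return false; // nothing new to show
  if (extension >= MIN_IMMEDIATE_EXTENSION) return true; // large enough to show at once
  // Small deltas wait for the refresh interval to limit Slack requests.
  return nowMs - state.postedAtMs >= REFRESH_INTERVAL_MS;
}
```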

When a session starts without a `threadTs` (say, from a schedule or `receive(slack, ...)`), it anchors on the first agent post, and later posts and mentions resume that same session. Pass `initialMessage` with a `Card` to land a structured anchor first instead. `threadTs` and `initialMessage` are mutually exclusive.

The example below overrides `onAppMention` to gate on an authored message and posts the completed reply to the thread. Event handlers receive `(eventData, channel, ctx)`, with Slack platform handles on `channel.thread` and `channel.slack`:

```ts
import { defaultSlackAuth, slackChannel } from "eve/channels/slack";
import { connectSlackCredentials } from "@vercel/connect/eve";

export default slackChannel({
  credentials: connectSlackCredentials("slack/my-agent"),
  onAppMention: (ctx, message) =>
    message.author ? { auth: defaultSlackAuth(message, ctx) } : null,
  events: {
    async "message.completed"(eventData, channel, ctx) {
      if (eventData.finishReason === "tool-calls") return;
      // Await the post so the handler does not finish before Slack receives the reply.
      if (eventData.message) await channel.thread.post(eventData.message);
    },
  },
});
```

### Human-in-the-loop (HITL)

HITL renders as Slack buttons and selects. When the user responds, the parked session (paused awaiting input) resumes.

Authorization prompts are private. A sign-in challenge (OAuth URL, device code) is a credential. Anyone who completes it binds their identity to the session's connection. The default `authorization.required` handler delivers the challenge ephemerally to the triggering user, device code included, and posts a public link-free status only when it has no user to target. The handler receives a private-delivery context with `postEphemeral`, `postDirectMessage` (needs the `im:write` scope), and `state`. There is, intentionally, no public `post` and no raw API access.

```ts
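import { slackChannel } from "eve/channels/slack";
import { connectSlackCredentials } from "@vercel/connect/eve";

export default slackChannel({
  credentials: connectSlackCredentials("slack/my-agent"),
  events: {
    // Deliver the sign-in link privately to whoever triggered the turn.
    "authorization.required"(eventData, channel) {
      const userId = channel.state.triggeringUserId;
      if (!userId || !eventData.authorization?.url) return;
      return channel.postDirectMessage(userId, `Sign in to continue: ${eventData.authorization.url}`);
    },
  },
});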
```

### Proactive sessions

Start a session without an inbound message through `receive(slack, { message, target, auth })` from a schedule `run` handler, or `args.receive(slack, ...)` from another channel. The proactive target shape is `{ channelId }`.

### Attachments

Inbound files behind authenticated Slack URLs are staged with `fetchFile`. See [File uploads](./custom#file-uploads) for the `fetchFile` contract.

## What to read next

* [Channels overview](./overview): the channel contract and every built-in channel
* [Auth & route protection](../guides/auth-and-route-protection): authenticating inbound traffic



---
title: Microsoft Teams
description: Reach your agent from Microsoft Teams via the Bot Framework Activity protocol, with Adaptive Card human-in-the-loop prompts.
type: integration
---

# Microsoft Teams



The Teams channel runs your agent inside Microsoft Teams as a bot. It takes Bot Framework Activity POSTs, checks the Bot Connector bearer JWT on each one, and routes message activities to your agent. Human-in-the-loop (HITL) prompts come back as Adaptive Cards, and replies go out over the Bot Framework Connector REST API. See [Channels](./overview) for the contract this builds on.

## Add the channel

```ts title="agent/channels/teams.ts"
import { teamsChannel } from "eve/channels/teams";

export default teamsChannel();
```

```bash
MICROSOFT_APP_ID=...
MICROSOFT_APP_PASSWORD=...
MICROSOFT_TENANT_ID=...   # optional, single-tenant bots
```

By default the channel mounts at `POST /eve/v1/teams`. Point your Azure Bot or Teams app messaging endpoint at that public URL. To mount somewhere else, pass `route: "/api/teams/activity"`.

## How the channel handles messages

### Dispatch

The default `onMessage` handles two cases: personal-chat messages, and channel or group-chat messages that mention the bot directly. Ambient resource-specific-consent messages are dropped unless you override `onMessage`. Before dispatch, eve strips the mention, adds a `<teams_context>` block, and scopes channel and group threads by root activity id (`replyToId ?? id`).

```ts
import { defaultTeamsAuth, teamsChannel } from "eve/channels/teams";

export default teamsChannel({
  onMessage(ctx, message) {
    if (message.scope !== "personal" && !message.isBotMentioned) return null;
    return { auth: defaultTeamsAuth(message) };
  },
});
```
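The two pre-dispatch steps that touch the activity itself, thread scoping and mention stripping, can be sketched as plain functions. These are hypothetical helpers for illustration: eve performs both steps for you, and the `<at>…</at>` tag format is an assumption based on how Teams embeds mentions in message text.

```ts
// Minimal activity shape for this sketch.
interface TeamsActivityLike {
  id: string;
  replyToId?: string;
}

// Replies share the id of the root activity; a root message uses its own id.
function teamsThreadRootId(activity: TeamsActivityLike): string {
  return activity.replyToId ?? activity.id;
}

// Escape regex metacharacters in the bot's display name.
function escapeForRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Remove the bot's own <at>…</at> mention tag and trim the remaining text.
function stripTeamsMention(text: string, botDisplayName: string): string {
  const pattern = new RegExp(`<at>${escapeForRegExp(botDisplayName)}</at>`, "gi");
  return text.replace(pattern, "").trim();
}
```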

### Delivery

Replies post as Markdown (`textFormat: "markdown"`), with oversized text split across messages and a typing indicator sent on turn start and action requests.

### Human-in-the-loop (HITL)

A human-in-the-loop (HITL) `input.requested` event renders as an Adaptive Card. Buttons and options map to `Action.Submit`, selects to `Input.ChoiceSet`, and freeform to `Input.Text`. When the user submits, the activity converts to eve `inputResponses` for you. For invokes that aren't HITL, handle them in `onInvoke(ctx, activity)`.
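The mapping above can be sketched as a function that builds Adaptive Card JSON. Everything about the request shape here is hypothetical (`HitlRequestLike`, its `kind` values, and the `requestId` field in `data`), and the card version is an assumption. The element types `Action.Submit`, `Input.ChoiceSet`, and `Input.Text` are the ones named above.

```ts
// Hypothetical request shape; eve's real input request type may differ.
type HitlRequestLike =
  | { id: string; kind: "options"; prompt: string; options: string[] }
  | { id: string; kind: "select"; prompt: string; options: string[] }
  | { id: string; kind: "freeform"; prompt: string };

interface AdaptiveCardLike {
  type: "AdaptiveCard";
  version: string;
  body: Array<Record<string, unknown>>;
  actions: Array<Record<string, unknown>>;
}

function hitlRequestToCard(request: HitlRequestLike): AdaptiveCardLike {
  const card: AdaptiveCardLike = {
    type: "AdaptiveCard",
    version: "1.4", // assumption: any version your Teams clients support
    body: [{ type: "TextBlock", text: request.prompt, wrap: true }],
    actions: [],
  };
  const submit = { type: "Action.Submit", title: "Submit", data: { requestId: request.id } };
  switch (request.kind) {
    case "options":
      // One submit button per option; the chosen value travels in `data`.
      for (const option of request.options) {
        card.actions.push({
          type: "Action.Submit",
          title: option,
          data: { requestId: request.id, value: option },
        });
      }
      break;
    case "select":
      card.body.push({
        type: "Input.ChoiceSet",
        id: "value",
        choices: request.options.map((option) => ({ title: option, value: option })),
      });
      card.actions.push(submit);
      break;
    case "freeform":
      card.body.push({ type: "Input.Text", id: "value", isMultiline: true });
      card.actions.push(submit);
      break;
  }
  return card;
}
```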

### Proactive sessions

Proactive sessions need an existing conversation reference, because the Bot Framework v1 surface cannot create new chats by Azure Active Directory (AAD) user id. Pass `serviceUrl`, `conversationId`, and the other reference fields to `receive(teams, { target })`.

### Attachments

Inbound files are off by default. Opt in to allow personal-scope downloads and public media URLs:

```ts
import { teamsChannel } from "eve/channels/teams";

export default teamsChannel({
  files: { enabled: true, allowedHosts: ["contoso.sharepoint.com"] },
});
```

## What to read next

* [Channels overview](./overview): the channel contract and every built-in channel
* [Auth & route protection](../guides/auth-and-route-protection): authenticating inbound traffic



---
title: Telegram
description: Reach your agent from Telegram bot webhooks, with inline-keyboard human-in-the-loop prompts and attachments.
type: integration
---

# Telegram



The Telegram channel puts your agent behind a Telegram bot. It takes Bot API webhooks, checks the `X-Telegram-Bot-Api-Secret-Token` header before trusting anything, routes the messages it cares about (private chats plus group messages that address the bot) to your agent, and replies over `sendMessage`. See [Channels](./overview) for the contract this builds on.

## Add the channel

```ts title="agent/channels/telegram.ts"
import { telegramChannel } from "eve/channels/telegram";

export default telegramChannel({
  botUsername: "my_bot",
});
```

```bash
TELEGRAM_BOT_TOKEN=123456:...        # replies, typing, callbacks, proactive sends
TELEGRAM_WEBHOOK_SECRET_TOKEN=...    # must match the secret_token you register
```

You can pass the same values via `credentials: { botToken, webhookSecretToken }`. The channel mounts `POST /eve/v1/telegram`. Register the deployed URL yourself; eve does not call `setWebhook`:

```bash
curl -X POST "https://api.telegram.org/bot$TELEGRAM_BOT_TOKEN/setWebhook" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://your-app.example.com/eve/v1/telegram",
       "secret_token":"'"$TELEGRAM_WEBHOOK_SECRET_TOKEN"'",
       "allowed_updates":["message","callback_query"]}'
```

## How the channel handles messages

### Dispatch

In a private chat, text, captions, photos, and documents all go through. Groups are stricter. Only three things wake the bot: a command (`/ask`, `/ask@my_bot`), an `@my_bot` mention (when `botUsername` is set), or a reply to one of the bot's own messages. Everything else is ignored.

Forum topics carry `message_thread_id` in the continuation token, so each topic stays on its own thread.

To customize auth or filtering, override `onMessage`. Group privacy mode itself lives in BotFather, not here.
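The group rules above can be pictured as a single predicate. This is a minimal sketch, not eve's implementation: `TgMessageSketch` and `shouldWakeBot` are hypothetical names, the message shape is a trimmed-down subset of a Bot API message, and matching on raw text (rather than message entities) and on the replied-to username (rather than the bot's id) are simplifications.

```ts
// Hypothetical, trimmed-down view of a Telegram Bot API message.
interface TgMessageSketch {
  chat: { type: "private" | "group" | "supergroup" | "channel" };
  text?: string;
  reply_to_message?: { from?: { username?: string } };
}

// Returns true when the message should wake the bot under the rules above.
function shouldWakeBot(msg: TgMessageSketch, botUsername?: string): boolean {
  // Private chats always go through.
  if (msg.chat.type === "private") return true;

  const text = msg.text ?? "";
  const bot = botUsername?.toLowerCase();

  // 1. A command: "/ask" or "/ask@my_bot". A command aimed at another bot does not count.
  const command = /^\/[A-Za-z0-9_]+(?:@([A-Za-z0-9_]+))?(?=\s|$)/.exec(text);
  if (command) {
    const addressedTo = command[1]?.toLowerCase();
    return addressedTo === undefined || addressedTo === bot;
  }

  // 2. An @mention, only when a bot username is configured.
  if (bot !== undefined) {
    const mentions = text.toLowerCase().match(/@[a-z0-9_]+/g) ?? [];
    if (mentions.includes(`@${bot}`)) return true;
  }

  // 3. A reply to one of the bot's own messages (identified by username in this sketch).
  const repliedTo = msg.reply_to_message?.from?.username?.toLowerCase();
  return bot !== undefined && repliedTo === bot;
}
```

A custom `onMessage` could apply a predicate like this before deciding whether to dispatch, though the real callback signature is not shown here.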

### Delivery

The default `message.completed` handler sends plain text via `sendMessage`. It passes no `parse_mode`, so any Markdown shows up literally. Replies longer than Telegram's 4096-char limit are split across messages. Custom handlers use `channel.telegram`.
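If you write a custom handler, you take over the splitting yourself. The sketch below shows one way to chunk a long reply; `splitForTelegram` is a hypothetical helper, not an eve export, and it measures length in JavaScript string units, which is an approximation of how Telegram counts characters.

```ts
const TELEGRAM_MAX_MESSAGE_LENGTH = 4096;

// Splits text into chunks no longer than `limit`, preferring to break at a newline.
function splitForTelegram(text: string, limit: number = TELEGRAM_MAX_MESSAGE_LENGTH): string[] {
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    const head = rest.slice(0, limit);
    const lastNewline = head.lastIndexOf("\n");
    if (lastNewline > 0) {
      // Break at the newline and drop it so the next chunk does not start with it.
      chunks.push(rest.slice(0, lastNewline));
      rest = rest.slice(lastNewline + 1);
    } else {
      // No usable newline in range: hard-cut at the limit.
      chunks.push(head);
      rest = rest.slice(limit);
    }
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

Each chunk would then go out as its own `sendMessage` call, in order.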

### Human-in-the-loop (HITL)

eve turns option requests into inline-keyboard buttons and freeform requests into `ForceReply` prompts. Telegram caps `callback_data` at 64 bytes, so eve sends only a compact callback id on each button and keeps those ids in channel state. It acknowledges its own callbacks with `answerCallbackQuery`; anything it doesn't recognize goes to `onCallbackQuery`.
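To see why compact ids help, consider a registry that hands out short ids and keeps the real payload elsewhere. This is an illustrative sketch only: `CallbackRegistrySketch` is hypothetical, it uses an in-memory map where eve uses channel state, and the `cb:` id format is an assumption.

```ts
const CALLBACK_DATA_MAX_BYTES = 64;

// Hypothetical stand-in for channel state: compact id -> full payload.
class CallbackRegistrySketch {
  private readonly payloads = new Map<string, unknown>();
  private nextId = 0;

  // Stores the payload and returns a compact id that fits in callback_data.
  register(payload: unknown): string {
    const id = `cb:${(this.nextId++).toString(36)}`;
    if (new TextEncoder().encode(id).length > CALLBACK_DATA_MAX_BYTES) {
      throw new Error("callback id exceeds the 64-byte callback_data limit");
    }
    this.payloads.set(id, payload);
    return id;
  }

  // Looks up the payload for an incoming callback; undefined means the id is not ours.
  resolve(callbackData: string): unknown {
    return this.payloads.get(callbackData);
  }
}
```

The button carries only the id, so the payload can be arbitrarily large, and an unknown id is easy to tell apart from one the registry issued.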

### Proactive sessions

Start a session without an inbound message through `receive(telegram, { message, target, auth })` from a schedule `run` handler, or `args.receive(telegram, ...)` from another channel. `target.chatId` is required. Add `messageThreadId` to land in a specific forum topic.

### Attachments

Inbound photos and documents are supported. eve fetches them on demand via `getFile`, only when an upload policy allows the type:

```ts
import { telegramChannel } from "eve/channels/telegram";

export default telegramChannel({
  botUsername: "my_bot",
  uploadPolicy: { allowedMediaTypes: ["image/*", "application/pdf"], maxBytes: 10 * 1024 * 1024 },
});
```

## What to read next

* [Channels overview](./overview): the channel contract and every built-in channel
* [Auth & route protection](../guides/auth-and-route-protection): authenticating inbound traffic


---
title: Twilio
description: Reach your agent over SMS and speech-transcribed phone calls with Twilio.
type: integration
---

# Twilio



The Twilio channel puts your agent on a phone number, so people can text it or call it. Inbound SMS arrives as a webhook. Inbound calls are answered with TwiML `<Gather input="speech">`, and the resulting transcript feeds the same eve session that SMS uses, so a caller and a texter look identical downstream. Every request is checked against `X-Twilio-Signature` before anything else runs. The raw continuation token is `From:To`. See [Channels](./overview) for the contract this builds on.

## Add the channel

```ts title="agent/channels/twilio.ts"
import { twilioChannel } from "eve/channels/twilio";

export default twilioChannel({
  allowFrom: "+15551234567",
  messaging: { from: "+15557654321" },
});
```

```bash
TWILIO_ACCOUNT_SID=AC...   # required for default outbound SMS
TWILIO_AUTH_TOKEN=...      # required for inbound signature verification
```

To skip env vars, pass the same values via `credentials: { accountSid, authToken }`. The channel mounts three routes:

* `POST /eve/v1/twilio/messages`: Messaging webhook
* `POST /eve/v1/twilio/voice`: inbound call webhook
* `POST /eve/v1/twilio/voice/transcription`: speech transcript callback

Point your Twilio number's Messaging webhook at `/messages` and Voice webhook at `/voice`, using the exact public URL Twilio will call.

## How the channel handles messages

### Dispatch

`allowFrom` is required. It gates who can reach the inbound hooks. Pass a single number, a list, an async resolver, or `"*"`. The wildcard is dangerous; only use it with an explicit check inside `onText`/`onVoice`.

```ts
export default twilioChannel({ allowFrom: ["+15551234567", "+15557654321"] });
```

`onText` and `onVoiceTranscription` decide dispatch and `auth`. Return `{ auth }` to proceed, or `null` to drop the message. `onVoice` fires the moment a call comes in. Return `null` to reject it, or return an object to override the spoken prompt, language, `<Say voice>`, and speech-recognition options.

```ts
import { twilioChannel } from "eve/channels/twilio";

export default twilioChannel({
  allowFrom: ["+15551234567"],
  onText: (ctx, message) => ({
    auth: {
      principalId: message.from,
      principalType: "user",
      authenticator: "twilio",
      attributes: { to: message.to ?? "" },
    },
  }),
});
```

### Delivery

The default `message.completed` handler sends the reply as SMS through Twilio's Messages API. A reply to an inbound message can reuse the webhook's `To` as the sender, but a proactive send has nothing to reuse, so it needs `messaging.from` or `messaging.messagingServiceSid`. Behind a proxy, set `webhookUrl` so signature verification matches the exact configured URL, and `publicBaseUrl` so voice TwiML can build absolute callback URLs.
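The exact URL matters because Twilio's signature covers it. Under Twilio's published scheme, the signature is an HMAC-SHA1 over the full request URL followed by the POST parameters, sorted by name and concatenated as name then value, keyed with the auth token and base64-encoded. The sketch below illustrates that scheme; the function names are hypothetical and it is not a copy of eve's internal verifier.

```ts
import { createHmac, timingSafeEqual } from "node:crypto";

// Computes the signature Twilio would send for this URL and form body.
function computeTwilioSignature(
  authToken: string,
  url: string,
  params: Record<string, string>,
): string {
  // Append each parameter as name + value, sorted by name, with no separators.
  const data = Object.keys(params)
    .sort()
    .reduce((acc, key) => acc + key + params[key], url);
  return createHmac("sha1", authToken).update(data, "utf8").digest("base64");
}

// Compares the X-Twilio-Signature header against the expected value in constant time.
function isValidTwilioSignature(
  authToken: string,
  url: string,
  params: Record<string, string>,
  headerValue: string,
): boolean {
  const expected = Buffer.from(computeTwilioSignature(authToken, url, params), "utf8");
  const received = Buffer.from(headerValue, "utf8");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

If a proxy rewrites the host or scheme, the URL the server sees differs from the one Twilio signed, and the check fails even for a genuine request. That is the case `webhookUrl` addresses.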

### Human-in-the-loop (HITL)

SMS and voice have no native button or card affordance, so HITL prompts do not render as interactive controls. The agent's `input.requested` event reaches your `events["input.requested"]` handler if you declare one. Handle it by sending the prompt as text and mapping the caller's reply back to the input request yourself.
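One way to do that mapping is to number the options in the outgoing text and parse the number back out of the reply. The sketch below assumes a hypothetical `InputRequestSketch` shape with a `prompt` and optional string `options`; the real `input.requested` payload may differ, so treat the field names as placeholders.

```ts
// Hypothetical shape for an input request; not the actual event payload.
interface InputRequestSketch {
  prompt: string;
  options?: string[];
}

// Renders the prompt as plain text, numbering any options.
function renderPromptAsSms(request: InputRequestSketch): string {
  const options = request.options ?? [];
  if (options.length === 0) return request.prompt;
  const lines = options.map((option, index) => `${index + 1}. ${option}`);
  return `${request.prompt}\n${lines.join("\n")}\nReply with a number.`;
}

// Maps the texter's reply back to an answer, or null when it cannot be matched.
function parseSmsReply(request: InputRequestSketch, reply: string): string | null {
  const trimmed = reply.trim();
  const options = request.options ?? [];
  // Freeform request: any non-empty reply is the answer.
  if (options.length === 0) return trimmed.length > 0 ? trimmed : null;
  // Numeric reply: treat it as a 1-based index into the options.
  if (/^\d+$/.test(trimmed)) {
    const picked = options[Number(trimmed) - 1];
    return picked === undefined ? null : picked;
  }
  // Otherwise accept the option text itself, ignoring case.
  const match = options.find((option) => option.toLowerCase() === trimmed.toLowerCase());
  return match === undefined ? null : match;
}
```

A `null` result is the cue to re-send the prompt rather than guess.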

### Proactive sessions

Start a session without an inbound message through `receive(twilio, { message, target, auth })` from a schedule `run` handler, or `args.receive(twilio, ...)` from another channel. `target.phoneNumber` is required, and the channel needs `messaging.from` or `messaging.messagingServiceSid` for the outbound sender.

### Attachments

Inbound media attachments are not supported on this channel today.

## Disclaimer

As the deployer, it is your responsibility to ensure your agent complies with applicable laws.

For example, you may be required to inform callers and texters that calls are recorded/transcribed and processed by an automated AI system, and obtain consent where required (including two-party-consent jurisdictions). For outbound SMS or calls you initiate, you may be required to get prior express consent, honor STOP/opt-out and quiet-hour rules, and complete required carrier registration.

## What to read next

* [Channels overview](./overview): the channel contract and every built-in channel
* [Auth & route protection](../guides/auth-and-route-protection): authenticating inbound traffic


---
title: Context Control
description: Control what an eve agent's model sees and when, across instructions, skills, the workspace, and subagents.
---

# Context Control



eve gives you a few levers for controlling what the model sees and when. `instructions.md` (or `instructions.ts`) is always on, `skills/` are available but loaded on demand, and the workspace and sandbox are visible through tools rather than pasted into the prompt.

## Base identity with `instructions.md`

Use `instructions.md` for the core contract of the agent.

```md
You are a careful support assistant. Be concise, verify facts before replying, and explain when you
used a tool.
```

Keep this file focused on stable behavior that should apply on every turn.

## Compose instructions in TypeScript with `instructions.ts`

To build the instructions prompt from typed helpers, lib code, or environment-derived values, author it as a module instead of markdown.

```ts title="agent/instructions.ts"
import { defineInstructions } from "eve/instructions";
import { buildInstructionsPrompt } from "./lib/prompts.js";

export default defineInstructions({
  markdown: buildInstructionsPrompt(),
});
```

Module-backed instructions run once at build time. eve captures the resulting markdown into the compiled manifest, so the runtime serves the same prompt every session without re-running the module.

## Load procedures on demand with `skills/`

Skills stay out of the always-on prompt by default, which keeps rich procedures available without bloating every turn. eve advertises the available skills and adds a framework-owned `load_skill` tool. When the request clearly matches a skill description, or the user names a skill explicitly, the model activates that skill, and eve appends the skill's markdown to the active instructions for the work that follows.

### Flat skill

```md title="agent/skills/get-weather.md"
Use the weather tool before answering forecast or temperature questions.
```

### Packaged skill

```md title="agent/skills/research/SKILL.md"
---
description: Research unfamiliar topics before answering with confidence.
---

When the task is novel or ambiguous, gather evidence first, then answer with the key facts and the
remaining uncertainty.
```

Packaged skills are useful when you also want sibling files such as `references/`, `assets/`, or `scripts/` under the same skill directory. Those paths show up under the runtime workspace root, so the model can inspect them with the normal file or shell tools instead of pasting their content into the prompt.

See [Skills](../skills) for the full authoring model and install notes.

## Put runtime files in the workspace, not the prompt

eve does not inline the entire authored surface into the prompt. Instead, it gives the model a shallow workspace hint and runtime tools to inspect deeper when needed. Skill files are available under the active workspace root, and the model inspects them with the shared `bash` tool, which keeps prompts smaller and makes file and command work explicit.

See [Sandbox](../sandbox) for the workspace and sandbox model.

## Delegate to a specialist with a subagent

If a task deserves its own prompt and tool surface, use a local subagent instead of overloading the root agent. Subagents are a context-control lever too. They get their own `instructions.md`, tools, and sandbox, and they run inside their own delegated context instead of extending the root agent inline.

See [Subagents](../subagents).

## Dynamic context with `defineDynamic`

The levers above are static, authored once and the same on every session. When the right context depends on who is calling (their team, tenant, plan, or feature flags), resolve it at runtime instead. `defineDynamic` in `agent/instructions/` returns the per-session system prompt, and `defineDynamic` in `agent/skills/` returns the set of skills a caller can load. Both read `ctx.session.auth` or channel metadata, so a caller on the billing team gets the billing instructions and playbook while no one else sees them. See [Dynamic capabilities](../guides/dynamic-capabilities) for the resolver API and when each event fires.

## Recommended context layout

Pick the lever by what the context is for:

* `instructions.md` for the agent's permanent identity. Keep it short and stable.
* `instructions.ts` when you need to compose the prompt from typed helpers at build time.
* `skills/` for optional procedures that should load only when needed. Move long procedures here instead of into the always-on prompt.
* `tools/` to expose typed integrations.
* a subagent when the task needs a different specialist surface; use one only for real specialization boundaries.
* the workspace or sandbox when the model should inspect files or run commands instead of relying on pasted instructions.

## What to read next

* [Tools](../tools): expose typed integrations the model can call.
* [Skills](../skills): the full authoring model for on-demand procedures.
* [Subagents](../subagents): delegate to a specialist with its own prompt and tools.
* [Dynamic capabilities](../guides/dynamic-capabilities): resolve per-session instructions and skills with `defineDynamic`.
* [Hooks](../guides/hooks): run code on session events to update the channel state that dynamic resolvers read.


---
title: The Harness
description: The out-of-the-box eve agent loop and the built-in tools every agent ships with, plus how to override or disable them.
---

# The Harness



The default harness is what every eve agent ships with. It includes the framework-owned agent loop plus a set of built-in tools the model can call without you writing a line. You extend it with capabilities specific to your agent. The loop itself, how a turn runs and checkpoints and resumes, lives in [Execution model and durability](./execution-model-and-durability).

## Compaction

The harness keeps a long session from overflowing the model's context window. Once the conversation crosses a fraction of the window (`thresholdPercent`, `0.9` by default), it summarizes the older turns into a compact form and keeps going. The summary uses the active turn model unless you override it. Tune when and how it kicks in under [`compaction`](../agent-config#compaction) in `agent.ts`:

```ts title="agent/agent.ts"
import { defineAgent } from "eve";

export default defineAgent({
  model: "anthropic/claude-opus-4.8",
  compaction: {
    thresholdPercent: 0.75,
  },
});
```

Compaction also preserves the framework's own tool state automatically. It resets read-before-write tracking (so a write afterward re-reads the file whose read evidence was summarized away) and re-injects the active todo list, so the model keeps its task list across the summary. There is no per-tool hook to configure.
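The threshold itself is simple arithmetic. As a rough sketch, with a hypothetical `shouldCompactSketch` helper and illustrative token counts (how the harness counts tokens, and whether the comparison is inclusive, are assumptions here):

```ts
// Returns true once usage reaches the configured fraction of the context window.
function shouldCompactSketch(
  usedTokens: number,
  contextWindowTokens: number,
  thresholdPercent: number = 0.9, // matches the documented default
): boolean {
  return usedTokens >= contextWindowTokens * thresholdPercent;
}

// Illustrative numbers: with a 200000-token window and thresholdPercent 0.75,
// the trigger point is 200000 * 0.75 = 150000 tokens.
const triggersAtThreshold = shouldCompactSketch(150000, 200000, 0.75);
const staysBelowThreshold = shouldCompactSketch(149999, 200000, 0.75);
```

Lowering `thresholdPercent` moves the trigger point earlier, which leaves more headroom for the summary and the next turn.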

## Built-in tools

These ship with every agent, no imports. The harness shows the model the tool descriptors first, then executes only what the model actually calls; discovery never runs them. The shell and file tools (`bash`, `read_file`, `write_file`, `glob`, `grep`) live in the app runtime and proxy their work into the agent's single [sandbox](../sandbox). The rest run in the app runtime, except `web_search`, which the model provider runs. The "Where it runs" column below names where each tool's effect lands.

| Tool                | Does                                                                                                                                                                                                                | Where it runs |
| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------- |
| `bash`              | Run a shell command.                                                                                                                                                                                                | Sandbox       |
| `read_file`         | Read a text file with line-numbered output (enables read-before-write).                                                                                                                                             | Sandbox FS    |
| `write_file`        | Write a complete file; enforces read-before-write and stale-read detection.                                                                                                                                         | Sandbox FS    |
| `glob`              | Find files by glob pattern.                                                                                                                                                                                         | Sandbox FS    |
| `grep`              | Search file contents by regex.                                                                                                                                                                                      | Sandbox FS    |
| `web_fetch`         | Fetch a URL.                                                                                                                                                                                                        | App runtime   |
| `web_search`        | Search the web (provider-managed; resolved from the model provider).                                                                                                                                                | Provider      |
| `todo`              | Maintain a durable per-session todo list.                                                                                                                                                                           | App runtime   |
| `ask_question`      | Ask the user a clarifying question or a choice mid-turn and park until they answer. No `execute`; the model calls it with `{ prompt, options?, allowFreeform? }`. See [Human-in-the-loop](/docs/human-in-the-loop). | App runtime   |
| `agent`             | Delegate a subtask to a copy of itself (shares the parent sandbox + tools, fresh history/state).                                                                                                                    | App runtime   |
| `load_skill`        | Pull an on-demand [skill](../skills)'s instructions into the current turn. Present only when the agent declares skills.                                                                                             | App runtime   |
| `connection_search` | Discover tools across declared [connections](../connections); matched tools become directly callable. Present only when the agent declares connections.                                                             | App runtime   |

Notes:

* **`agent`** runs a copy of the current agent on a focused task. It inherits the same tools, connections, and instructions, but starts with fresh conversation history and fresh [state](../guides/state). The child shares the parent's sandbox filesystem, so anything it writes is visible to the parent. See [Subagents](../subagents).
* **`load_skill`** only pulls instructions into context. It adds no new execution surface, because behavior still comes from the tools the agent already has.
* **`connection_search`** is the model-facing `connection__search` tool. A search surfaces a connection's tools by their qualified name (e.g. `connection__linear__list_issues`), and the model can then call them directly. It's registered only when the agent has connections.
* **`web_search`** has no local executor; the provider runs it. To supply your own implementation, override it with `defineTool()`.

Review these built-in tools before production use. Disable, wrap, restrict, or require approval for any tool that can access the filesystem, network, shell, or sensitive data.

## Override a default

Author a tool at the same slug and it takes over the built-in of that name. The file `agent/tools/write_file.ts` replaces the built-in `write_file` by existing:

```ts title="agent/tools/write_file.ts"
import { defineTool } from "eve/tools";
import { writeFile } from "eve/tools/defaults";

export default defineTool({
  ...writeFile, // keep the default description, schema, and executor
  async execute(input, ctx) {
    console.log("[write_file]", input.path);
    return writeFile.execute(input, ctx);
  },
});
```

The framework defaults are importable from `eve/tools/defaults` (`bash`, `readFile`, `writeFile`, `glob`, `grep`, `webFetch`, `webSearch`, `todo`, `loadSkill`), so you can spread, wrap, or patch them. Skip the spread and your replacement owns its own context. A fresh `defineTool` for `todo` won't inherit the framework's durable state key.

## Disable a default

Export a `disableTool()` sentinel from a file named after the tool's slug. The filename is what picks the default to remove:

```ts title="agent/tools/bash.ts"
import { disableTool } from "eve/tools";

export default disableTool();
```

If the filename matches no known framework tool, resolution fails instead of silently doing nothing, so a typo surfaces at build time rather than removing the wrong tool.

## When to override, disable, or author a new tool

Three moves shape the harness. The right one depends on whether the model should keep the built-in capability.

* **Override** when you want the same capability with different behavior. Spread the default from `eve/tools/defaults` and wrap it (logging, an extra guard, a different backend), and the model still sees a tool by that name. Spreading keeps the default's description, schema, and any framework state, such as the `todo` tool's durable state key. Drop the spread and your replacement owns its own context, losing that wiring.
* **Disable** when the model should not have the capability at all. A `disableTool()` sentinel removes the built-in, and the model never sees it. Reach for this to lock down `bash` or `web_fetch` in an agent that should not run shell commands or fetch arbitrary URLs.
* **Author a new tool** when you want a capability the harness does not ship. Give it a fresh slug under `agent/tools/` and it joins the built-ins instead of replacing one. See [Tools](../tools) for the authoring model.

## The opt-in `Workflow` tool

An experimental `Workflow` tool ships but stays off by default. To turn it on, re-export the opt-in marker from `agent/tools/workflow.ts`:

```ts
export { ExperimentalWorkflow as default } from "eve/tools";
```

With it on, the model can orchestrate the agent's own subagents from model-authored JavaScript, all as one durable step. See [Dynamic workflows](../guides/dynamic-workflows).

## What to read next

* [Tools](../tools): define your own tools, gate them on approval, and shape their output with `toModelOutput`
* [Dynamic capabilities](../guides/dynamic-capabilities): generate the tool set per session with `defineDynamic`
* [Sandbox](../sandbox): the sandbox the shell and file tools run in


---
title: Execution Model and Durability
description: How an eve session runs. Durable conversations, turns that checkpoint at steps, and parked work that resumes later.
---

# Execution Model and Durability



An eve session is a durable conversation. It can run for days and survives process restarts and redeploys without any work on your part. You write the capabilities (tools, instructions, channels) and eve runs the loop.

## Sessions, turns, and steps

Work nests in three levels:

* **session**: the whole durable conversation or task. It's long-lived and can span many requests over days or weeks without losing context.
* **turn**: one user message and all the work it triggers (model calls, tool calls, reasoning) until the agent produces its response.
* **step**: a durable checkpoint inside a turn (one model call and the tool calls it makes).

Every turn runs as a durable workflow, built on the open-source [Workflow SDK](https://workflow-sdk.dev/) (Vercel Workflow when you deploy on Vercel). eve checkpoints progress and serializes durable state at each step boundary. Your code runs inside a managed step, so tools, the sandbox, and subagents feel synchronous even though the session underneath them is durable.

The Workflow SDK is not inherently tied to Vercel. In local development and in a self-deployed `eve start` process, eve uses the SDK's local world by default; that world persists workflow runs on disk, normally under `.workflow-data`, and dispatches through the same Nitro-hosted workflow routes. On Vercel, the same workflow code runs against Vercel Workflow instead, which adds platform features such as latest production deployment routing and dashboard run metadata.

Nitro hosts the HTTP routes and workflow entrypoints. It does not supply the workflow state store or the sandbox runtime. Those are separate adapters: Workflow uses the active world implementation, and Sandbox uses the backend from `agent/sandbox` or `defaultBackend()`.

For advanced self-hosted deployments, the root `agent.ts` can select the installed Workflow world package to use with `experimental.workflow.world`:

```ts title="agent/agent.ts"
import { defineAgent } from "eve";

export default defineAgent({
  model: "anthropic/claude-opus-4.8",
  experimental: {
    workflow: {
      world: "@workflow/world-postgres",
    },
  },
});
```

The world package backs workflow state, queues, hooks, and streams. Keep secrets and deployment-specific options in runtime environment variables read by that package, not in `agent.ts`. See [agent.ts](../agent-config#workflow-world) and [Workflow Worlds](https://workflow-sdk.dev/worlds).

## Resuming after a crash

Crash the process, hit a timeout, or redeploy mid-turn, and the run picks up from the last completed step rather than replaying the whole turn. Completed steps never re-run; eve replays the recorded result. A step interrupted mid-execution does re-run, so guard side effects such as charges or emails with an idempotency key, or gate them with approval.
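A common way to make a side effect safe to re-run is an idempotency key derived from stable inputs. The sketch below is illustrative: `PaymentProviderSketch` is a hypothetical in-memory stand-in for a provider that honors such keys, and building the key from a session id and an order id is an assumption about what your tool has available.

```ts
// Hypothetical stand-in for a payment provider that honors idempotency keys.
class PaymentProviderSketch {
  private readonly seen = new Map<string, { chargeId: string; amountCents: number }>();
  chargeCount = 0;

  charge(idempotencyKey: string, amountCents: number): { chargeId: string; amountCents: number } {
    const previous = this.seen.get(idempotencyKey);
    // A retry of the same logical charge returns the first result instead of charging again.
    if (previous !== undefined) return previous;
    this.chargeCount += 1;
    const result = { chargeId: `ch_${this.chargeCount}`, amountCents };
    this.seen.set(idempotencyKey, result);
    return result;
  }
}

// Derive the key from stable inputs so a re-run step produces the same key.
function chargeKey(sessionId: string, orderId: string): string {
  return `charge:${sessionId}:${orderId}`;
}
```

If the step is interrupted after the provider call but before it completes, the re-run sends the same key and the provider returns the original charge.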

There's nothing to configure. eve owns the workflow lifecycle, and sessions are durable by default.

You don't write workflow code directly. Workflow primitives (`start()`, `resumeHook()`, etc.) are an implementation detail of eve's runtime layer; channels, tools, and hooks never touch them. Two surfaces give your own code session data: tools read the current session's metadata (id, turn, auth, parent lineage) via `ctx.session`, and [`defineState`](../guides/state) reads or writes session-scoped durable state. See [State](../guides/state) for the read/write model.

## Parked work

Some work has to wait, including a human approving a [tool](../tools), an interactive OAuth sign-in for a [connection](../connections), or a long-running [subagent](../subagents). At those points the turn parks durably. The workflow suspends and holds no compute until the input it's waiting on arrives (a click, a callback, a child completing), even if that's much later. When it does, the conversation picks up exactly where it left off.

## Message delivery and queueing

eve does not maintain a durable FIFO queue of user messages for a session. The `continuationToken` is a resume handle for the session's current workflow hook, not a general message-queue address.

When a session is waiting, a delivery to the current continuation token wakes the session and starts the next turn. When a turn is already active, the hook may accept additional deliveries, but the runtime only drains them at specific workflow boundaries. If more than one delivery is ready when the driver checks, eve may fold them into the next turn; that drain is best-effort and depends on workflow and transport timing.

So don't rely on concurrent sends to the same session behaving like a typical ordered chat queue. For deterministic behavior, send one user turn at a time and wait for `session.waiting` before sending the next message to the same session. If your channel can receive bursts while the agent is working, keep your own per-session queue in the channel or app layer, then deliver the next message after the session parks again. Separate sessions still run independently.
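A per-session queue of that kind can be small. In this sketch, `PerSessionQueueSketch` and its `deliver` callback are hypothetical: `deliver` stands in for whatever sends a message to the session, and `onSessionWaiting` is meant to be called when you observe `session.waiting`. It keeps state in memory, so a production version would need durable storage.

```ts
type DeliverFn = (sessionId: string, message: string) => void;

class PerSessionQueueSketch {
  private readonly pending = new Map<string, string[]>();
  private readonly busy = new Set<string>();
  private readonly deliver: DeliverFn;

  constructor(deliver: DeliverFn) {
    this.deliver = deliver;
  }

  // Call when the channel receives a user message for a session.
  enqueue(sessionId: string, message: string): void {
    if (this.busy.has(sessionId)) {
      // A turn is in flight: hold the message until the session parks again.
      const queue = this.pending.get(sessionId) ?? [];
      queue.push(message);
      this.pending.set(sessionId, queue);
      return;
    }
    this.busy.add(sessionId);
    this.deliver(sessionId, message);
  }

  // Call when the session reports that it is waiting for input.
  onSessionWaiting(sessionId: string): void {
    const next = this.pending.get(sessionId)?.shift();
    if (next === undefined) {
      this.busy.delete(sessionId);
      return;
    }
    this.deliver(sessionId, next);
  }
}
```

Only one message per session is in flight at a time, and other sessions are unaffected because each has its own queue and busy flag.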

## Subagents

A turn can hand work off to a [subagent](../subagents). Each subagent gets its own context and its own durable session; a declared subagent also gets its own sandbox, skills, and state. Nothing crosses the boundary implicitly.

## How eve orders session history

Conversation history within a session is append-only. Turns land in order, and the tool calls inside a turn (plus their results) keep their order too. Read a session back and you see events in the order they happened.

## What to read next

* [Sessions and streaming](./sessions-runs-and-streaming): the handles you hold and the event stream you watch.
* [Security model](./security-model): the trust boundaries the runtime enforces.
* [State](../guides/state): durable per-session memory that persists across step boundaries.


---
title: Security Model
description: eve's trust boundaries, where secrets live, how credentials reach hosts, and what fails closed by default.
---

# Security Model



Your eve agent runs across two contexts, with a trust boundary between them and every secret kept on the trusted side. Use this mental model when deciding what an agent (and the model driving it) is allowed to reach.

## Trust boundaries

|                         | App runtime  | Sandbox               |
| ----------------------- | ------------ | --------------------- |
| `process.env` / secrets | Yes          | No                    |
| Your Node.js code       | Yes          | No                    |
| Network                 | Unrestricted | Controlled by policy  |
| Filesystem              | App's own    | Isolated `/workspace` |

The app runtime is the trusted side. Your tool implementations, model calls, connections, state, and durable execution all run here, with `process.env` and full Node.js available. (On Vercel, this is a Vercel Function.)

The sandbox is the isolated side. The model runs shell commands there through the built-in `bash`, `read_file`, `write_file`, `glob`, and `grep` tools. It gets its own `/workspace` filesystem, but no `process.env`, no secrets, and no path back into the app runtime. (On Vercel, each sandbox is a [Vercel Sandbox](https://vercel.com/docs/sandbox) microVM with hardware-level isolation.) Only shell commands execute in the sandbox. Even the built-in `bash`/`read_file`/`write_file` tools live in the app runtime and *proxy* into the sandbox. The model sees tool definitions and results, never your secrets.

A concrete trace makes the boundary clear. When the model calls a custom `charge_card` tool, its `execute` runs in the app runtime, reads `process.env.STRIPE_KEY`, calls Stripe, and returns `{ ok: true }`. The model sees only `{ ok: true }`: the key never leaves the app runtime, and nothing about the call touches the sandbox. The built-in `write_file` is the mirror image, running in the app runtime and proxying the write into the sandbox `/workspace`. Either way the model drives the work through tool calls and their results, never by holding a credential or reaching the runtime directly.

## Data flow at a glance

<Mermaid
  chart="flowchart LR
  User[&#x22;User or channel provider&#x22;] --> Channel[&#x22;Channel route and route auth&#x22;]
  Channel --> Runtime[&#x22;eve app runtime and durable session&#x22;]
  Runtime --> Model[&#x22;Configured model provider or Vercel AI Gateway&#x22;]
  Runtime --> Tools[&#x22;Authored tools and connections&#x22;]
  Tools --> Services[&#x22;Customer-selected external services&#x22;]
  Runtime --> Sandbox[&#x22;Per-session sandbox&#x22;]
  Sandbox --> Egress[&#x22;Allowed sandbox network egress&#x22;]
  Runtime --> Telemetry[&#x22;Configured telemetry or eval provider&#x22;]"
/>

eve sends data where your agent configuration and runtime choices send it:

* Inbound channel data flows through the channel provider you configure, then into the eve app runtime.
* Model inputs and outputs flow to the model or routing path selected in `agent.ts`, such as a Vercel AI Gateway model id or a provider-authored `LanguageModel`.
* Tool and connection calls flow to the external services, MCP servers, OpenAPI endpoints, and channels you configure.
* Sandbox commands can reach network destinations allowed by the sandbox network policy.
* Telemetry and eval data flows to the exporters and providers you configure in `instrumentation.ts` or eval settings.

eve stores durable session and workflow state needed to resume conversations, stream events, replay completed steps, and show run observability. You are responsible for deciding whether the selected channels, model providers, connected services, sandbox egress destinations, telemetry exporters, retention settings, and deletion controls are appropriate for your data and use case.

## Credential brokering

Credential brokering gives the model *authenticated* network access from inside the sandbox, like a `git clone` of a private repo or an authenticated `curl`, when there's no [tool](../tools) or [connection](../connections) to route it through. On the Vercel Sandbox backend, auth headers get injected at the sandbox's network firewall for matching domains. The secret stays in the app runtime; the sandbox process only ever sees the response. See [Vercel Sandbox Credential Brokering](https://vercel.com/docs/sandbox/concepts/firewall#credentials-brokering) for the platform mechanism, and [Sandbox](../sandbox) for the eve policy API.

## Connection credentials

[Connection](../connections) tokens (MCP and OpenAPI) come from either `getToken()` or an interactive OAuth flow, and eve injects the resolved token into every outbound request. The token is cached per step and never serialized to durable state.

## Channel verification

A [channel](../channels/overview) is your agent's front door, so authenticating inbound traffic is its job. The built-in platform channels follow two rules, and so must any channel you write yourself:

* **Verify signatures in constant time.** Platform channels (Slack, GitHub,
  Telegram, Twilio) verify the platform's HMAC signature over the raw request body
  with a constant-time comparison, so timing the response can't reveal a forged
  signature. Use a constant-time compare for any secret you check, never `===` on
  a signature.
* **Don't trust body-supplied identity.** Derive the caller from a *verified*
  signature or token, never from a `principalId` (or similar) the request body
  claims. A body field is attacker-controlled; treating it as identity is
  cross-user impersonation.

A custom channel that accepts dashboard-style webhooks should follow the same shape: authenticate the raw body with an HMAC, compare signatures in constant time, and trust any body-supplied principal only after the signature verifies.
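
The two rules can be sketched with Node's built-in `crypto` module. This is a minimal sketch, not eve's channel code: `verifySignature` is a hypothetical helper, and it assumes the sender signs the raw body with HMAC-SHA256 and sends the digest hex-encoded. Real platforms differ in header name, encoding, and signed payload.

```ts
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: returns true only when `signatureHex` is the
// HMAC-SHA256 of `rawBody` under `secret`.
function verifySignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on a length mismatch, so reject that case first.
  if (received.length !== expected.length) return false;
  // Constant-time comparison: never `===` on a signature.
  return timingSafeEqual(received, expected);
}
```

Run the check against the raw request bytes before parsing JSON, and only then decide who the caller is.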

## Authored markdown is data

[Skill](../skills) and [schedule](../schedules) files are markdown with YAML frontmatter, and eve treats that frontmatter strictly as data. The code-capable engines (`---js` / `---javascript`, which would `eval()` the frontmatter body the moment the file is parsed) are disabled, so such a fence throws rather than running. Frontmatter has to parse to a plain YAML object.

## Auth fails closed

Routes reject unauthenticated traffic by default. If no `AuthFn` in the walk accepts the request, it gets a `401`, and admitting anonymous callers takes an explicit `none()`. The scaffold's `placeholderAuth()` keeps a half-configured app closed in production until you replace it. See [Auth & route protection](../guides/auth-and-route-protection) for the full walk and verifiers.
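
The fail-closed walk can be pictured as a loop that returns `401` unless some function accepts. The types and helpers below (`AuthAttempt`, `authorize`, `bearer`, `anonymous`) are hypothetical and exist only for illustration; eve's real `AuthFn` signature may differ.

```ts
// Hypothetical shapes for illustration; not eve's real AuthFn type.
type Principal = { id: string };
type AuthAttempt = (headers: Record<string, string>) => Principal | null;

// Walk the chain in order; the first function that returns a principal wins.
// If none accepts, the request is rejected with 401 (fail closed).
function authorize(
  chain: AuthAttempt[],
  headers: Record<string, string>,
): { status: 200; principal: Principal } | { status: 401 } {
  for (const attempt of chain) {
    const principal = attempt(headers);
    if (principal) return { status: 200, principal };
  }
  return { status: 401 };
}

// A toy bearer-token check, and an explicit anonymous opt-in that
// stands in for something like `none()`.
const bearer = (token: string): AuthAttempt => (headers) =>
  headers["authorization"] === `Bearer ${token}` ? { id: "service" } : null;
const anonymous: AuthAttempt = () => ({ id: "anonymous" });
```

An empty chain rejects everything, which is the point: anonymous access only exists when you add it on purpose.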

## Pre-production checklist

Before exposing an agent to real traffic:

* [ ] Replace `placeholderAuth()` in `agent/channels/eve.ts` with a real
  `AuthFn` (`vercelOidc()`, `httpBasic()`, `oidc()`, or your own). Verify an
  unauthenticated production request gets `401`.
* [ ] Verify channel signatures. Each platform channel needs its signing
  secret set; custom channels must verify signatures in constant time and never
  trust body-supplied identity.
* [ ] Keep secrets in `process.env`, never in compiled artifacts, never
  passed into the sandbox. Route privileged calls through tools or connections.
* [ ] Scope connection tokens to the least privilege the agent needs; they
  reach hosts but never the model.
* [ ] Set a sandbox network policy tighter than `allow-all` if the model
  shouldn't have open egress; use credential brokering for authenticated egress.
* [ ] Don't surface untrusted text as markup. Model- or user-controlled
  strings rendered into a channel UI should be escaped for that surface.

## What to read next

* [Auth & route protection](../guides/auth-and-route-protection): the full auth walk and verifier helpers
* [Sandbox](../sandbox): backends, network policy, and brokering config
* [Execution model and durability](./execution-model-and-durability): how durable sessions run
* [Connections](../connections): static-token and OAuth connections


---

For a semantic overview of all documentation, see [/sitemap.md](/sitemap.md)

For an index of all available documentation, see [/llms.txt](/llms.txt)

For agent-facing discovery, including API and MCP surfaces, see [/agents.md](/agents.md)

---
title: Sessions, Runs & Streaming
description: "The session and run contract you touch: continuation tokens, stream handles, the NDJSON event stream, and reconnecting."
---

# Sessions, Runs & Streaming



Every eve app speaks the same stable HTTP API to a [durable session](./execution-model-and-durability). This page is the contract you hold: the handles you get back, the events you stream, and how to reconnect.

## The two handles

Two handles do two jobs, and mixing them up is the most common mistake. One handle creates and resumes a session; a different one streams and inspects it.

* **`continuationToken`**: the resume handle. Use it to send a follow-up message to the same conversation. Owned by the channel.
* **`sessionId` / `runId`**: the stream-and-inspect handle. Use it to attach to the event stream and watch a run. Owned by the runtime.

A session has one active continuation at a time: each follow-up uses the current `continuationToken`, and a stale one is rejected.

React, Vue, and Svelte apps reach for [`useEveAgent()`](../guides/frontend/overview) instead of calling these routes by hand. Next.js and Nuxt apps can proxy them to the eve runtime from the same origin.

## Start a session

```bash
curl -X POST http://127.0.0.1:3000/eve/v1/session \
  -H 'content-type: application/json' \
  -d '{"message":"Summarize the latest forecast."}'
```

eve responds right away. The JSON body carries a `sessionId` and a `continuationToken`, and the `x-eve-session-id` header names the durable session to stream.
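
The same call from TypeScript might look like the sketch below. `startSession` and `FetchLike` are hypothetical names; the route and the `sessionId`/`continuationToken` fields come from the description above. The fetch function is passed in so the call can be stubbed in tests.

```ts
// Narrow structural type so any fetch-compatible function can be passed in.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

type StartedSession = { sessionId: string; continuationToken: string };

async function startSession(
  fetchImpl: FetchLike,
  baseUrl: string,
  message: string,
): Promise<StartedSession> {
  const res = await fetchImpl(`${baseUrl}/eve/v1/session`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ message }),
  });
  if (!res.ok) throw new Error(`start failed with HTTP ${res.status}`);
  const body = (await res.json()) as StartedSession;
  // Keep both handles: sessionId to stream, continuationToken to follow up.
  return { sessionId: body.sessionId, continuationToken: body.continuationToken };
}
```

In Node 18+ you would call it as `startSession(fetch, "http://127.0.0.1:3000", "Summarize the latest forecast.")`.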

## Stream a session

```bash
curl http://127.0.0.1:3000/eve/v1/session/<sessionId>/stream
```

The stream is newline-delimited JSON (NDJSON), one event per line:

| Event                     | Meaning                                                                                                          |
| ------------------------- | ---------------------------------------------------------------------------------------------------------------- |
| `session.started`         | A durable session was created.                                                                                   |
| `turn.started`            | A new turn began.                                                                                                |
| `message.received`        | An inbound user message was accepted.                                                                            |
| `step.started`            | A model step began.                                                                                              |
| `actions.requested`       | The model requested tool calls.                                                                                  |
| `action.result`           | A tool call returned.                                                                                            |
| `input.requested`         | The run paused for human input ([HITL](/docs/human-in-the-loop) approval or `ask_question`); carries `requests`. |
| `subagent.called`         | A subagent was delegated; carries `childSessionId` to attach to.                                                 |
| `subagent.completed`      | A delegated subagent finished.                                                                                   |
| `reasoning.appended`      | A reasoning delta (incremental, with cumulative text so far).                                                    |
| `reasoning.completed`     | The finalized reasoning block.                                                                                   |
| `message.appended`        | An assistant text delta (incremental, with cumulative text so far).                                              |
| `message.completed`       | A finalized assistant text block.                                                                                |
| `result.completed`        | The finalized structured result for a turn that requested an output schema; carries `result`.                    |
| `compaction.requested`    | Context-window compaction began; carries `modelId`, `sessionId`, `turnId`, `usageInputTokens`.                   |
| `compaction.completed`    | A compaction checkpoint was written to durable history.                                                          |
| `authorization.required`  | A connection needs OAuth; carries `name`, `description`, and an `authorization` challenge.                       |
| `authorization.completed` | A connection's authorization resolved; carries `outcome`.                                                        |
| `step.completed`          | A model step finished; carries `finishReason` and usage.                                                         |
| `step.failed`             | A model step failed; carries `{ code, message, details? }`.                                                      |
| `turn.completed`          | The turn finished.                                                                                               |
| `turn.failed`             | The turn failed; carries `{ code, message, details? }`.                                                          |
| `session.waiting`         | The session parked, waiting for the next input (a message, an answer).                                           |
| `session.failed`          | The session failed.                                                                                              |
| `session.completed`       | The session reached a terminal end.                                                                              |

`reasoning.appended` and `message.appended` stream deltas as they arrive, and each one carries both the new delta and the cumulative text for the current block. The finalized block shows up on `message.completed` and `reasoning.completed`, which is the compatibility path for clients that don't render incremental streaming.

Note: consider the privacy, confidentiality, and user-experience implications of displaying, storing, or transmitting reasoning events in your application.

`message.completed` can fire more than once in a turn: the agent often emits interim assistant text before a tool call. To tell tool-call narration from a terminal reply, check `message.completed.data.finishReason`. `step.completed.data.finishReason` mirrors the step outcome, and usage lives on `step.completed`.

A delegated subagent publishes progress on its own child-session stream. The parent only emits `subagent.called` with a `childSessionId`, which a client uses to attach.

`step.failed` and `turn.failed` carry `{ code, message, details? }` for the failed step or turn, and `session.failed` is the terminal session-level variant. When a turn requested an output schema, the finalized payload lands on `result.completed` as `data.result` before the turn boundary. `authorization.required` carries the sign-in challenge (`data.authorization` may include `url`, `userCode`, `expiresAt`, `instructions`), and `authorization.completed` carries `data.outcome` (`"authorized" | "declined" | "failed" | "timed-out"`).
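
If you read the stream by hand, the main job is splitting NDJSON correctly: a network chunk can end mid-line, so buffer until a newline arrives. The sketch below is a generic NDJSON reader, not part of eve; `StreamEvent` and `parseNdjson` are hypothetical names, and the event shape is trimmed to `type` and `data`.

```ts
// Minimal event shape; real events carry more fields than shown here.
type StreamEvent = { type: string; data?: Record<string, unknown> };

// Turn arbitrary text chunks into parsed events, one per complete line.
async function* parseNdjson(chunks: AsyncIterable<string>): AsyncGenerator<StreamEvent> {
  let buffer = "";
  for await (const chunk of chunks) {
    buffer += chunk;
    let newline = buffer.indexOf("\n");
    while (newline !== -1) {
      const line = buffer.slice(0, newline).trim();
      buffer = buffer.slice(newline + 1);
      if (line) yield JSON.parse(line) as StreamEvent;
      newline = buffer.indexOf("\n");
    }
  }
  // Flush a final line that arrived without a trailing newline.
  if (buffer.trim()) yield JSON.parse(buffer) as StreamEvent;
}
```

Feed it decoded text chunks from the response body and switch on `event.type` as events arrive.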

## Send a follow-up message

Once the session is waiting (you'll see `session.waiting`), POST your follow-up to the session endpoint with the stored continuation token:

```bash
curl -X POST http://127.0.0.1:3000/eve/v1/session/<sessionId> \
  -H 'content-type: application/json' \
  -d '{"continuationToken":"<token>","message":"Now send the short version."}'
```

The follow-up reuses the same durable session: same history, same state.

For deterministic ordering, send one follow-up at a time and wait for the next `session.waiting` event before sending another message to the same session. See [message delivery and queueing](./execution-model-and-durability#message-delivery-and-queueing) for the current runtime contract.

## Reconnect and rewind

The stream is durable. Every event is recorded before a step completes, so the whole stream is replayable. Pass `startIndex` to reconnect by event count and pick up where you dropped off, or rewind to the start:

```bash
curl "http://127.0.0.1:3000/eve/v1/session/<sessionId>/stream?startIndex=<count>"
```
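
A reconnect loop only needs to count the events it has handled. The sketch below assumes `startIndex` is the number of events already consumed, so passing the count resumes at the next event; `OpenStream` and `consumeWithResume` are hypothetical names, and `open` stands in for whatever opens the stream URL above.

```ts
type ReplayEvent = { type: string };
// Hypothetical opener: returns the event stream starting at `startIndex`.
type OpenStream = (startIndex: number) => AsyncIterable<ReplayEvent>;

// Consume a stream to its end, reconnecting by event count after a drop.
async function consumeWithResume(
  open: OpenStream,
  onEvent: (event: ReplayEvent) => void,
  maxReconnects = 3,
): Promise<number> {
  let seen = 0; // events handled so far; also the next startIndex
  for (let attempt = 0; ; attempt++) {
    try {
      for await (const event of open(seen)) {
        onEvent(event);
        seen++;
      }
      return seen; // stream ended cleanly
    } catch (error) {
      // Give up once the reconnect budget is spent.
      if (attempt >= maxReconnects) throw error;
    }
  }
}
```

Passing `0` replays the session from the start, which is the rewind case.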

## Use the client from TypeScript

For scripts, server-to-server calls, tests, evals, and custom UIs, `eve/client` wraps these routes in a typed client so you don't hand-roll the POST and NDJSON stream loop.

Start with the [TypeScript SDK](../guides/client/overview) guide. It covers basic usage, sending messages, continuations, streaming, and per-turn `outputSchema` results.

## Inspect the agent over HTTP

`GET /eve/v1/info` returns a JSON inspection snapshot for the running agent: model, instructions, authored and framework tools, skills, channels, schedules, subagents, sandbox, connections, hooks, workflow, and workspace metadata. Local development accepts loopback requests; deployed Vercel targets require the route's OIDC auth.

```bash
curl http://127.0.0.1:3000/eve/v1/info
```

The route uses the same default auth chain as the eve channel (`[localDev(), vercelOidc()]`). Locally it answers anonymously; a deployed Vercel target requires a valid OIDC bearer, with a same-project bypass for in-deployment callers. See [auth & route protection](../guides/auth-and-route-protection).

## Dispatch order

Every stream event runs four steps, in this order:

1. **Channel handler**: the channel's event handler runs and can mutate adapter state.
2. **Metadata projection**: the framework re-evaluates the channel's `metadata(state)` and stores the result.
3. **Hooks**: authored [hooks](../guides/hooks) subscribed to the event fire.
4. **Dynamic resolvers**: [dynamic](../guides/dynamic-capabilities) tool, skill, and instruction resolvers fire, and `ctx.channel.metadata` already holds the freshly projected metadata from step 2.

The order is structural, not incidental. By the time a resolver or hook reads channel metadata, the channel has already updated its state and the projection is current.
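
As a mental model, the four steps reduce to a fixed sequence of calls. Everything in this sketch is hypothetical (`ChannelSketch`, `dispatch`, and the hook and resolver signatures are not eve's internal types); it only shows why hooks and resolvers always observe freshly projected metadata.

```ts
// Hypothetical shapes for illustration only; not eve's internal types.
type ChannelState = { threadId?: string };
type DispatchEvent = { type: string; threadId?: string };
type Metadata = Record<string, unknown>;

interface ChannelSketch {
  state: ChannelState;
  onEvent(event: DispatchEvent, state: ChannelState): void; // step 1
  metadata(state: ChannelState): Metadata; // step 2
}

function dispatch(
  event: DispatchEvent,
  channel: ChannelSketch,
  hooks: Array<(event: DispatchEvent, metadata: Metadata) => void>,
  resolvers: Array<(metadata: Metadata) => void>,
): Metadata {
  channel.onEvent(event, channel.state); // 1. channel handler mutates state
  const metadata = channel.metadata(channel.state); // 2. re-project metadata
  for (const hook of hooks) hook(event, metadata); // 3. hooks see fresh metadata
  for (const resolve of resolvers) resolve(metadata); // 4. resolvers run last
  return metadata;
}
```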

## What to read next

* [Execution model & durability](./execution-model-and-durability): what makes a session durable and how parked work resumes.
* [Channels](../channels/overview): what owns the continuation token and delivery.
* [TypeScript SDK](../guides/client/overview): call these routes from scripts and server-side code.
* [Frontend](../guides/frontend/overview): `useEveAgent` instead of raw routes.



---
title: Assertions
description: Run-level methods, t.check value assertions, the matcher mini-language, and gate vs soft severity.
---

# Assertions



Assertions are how an eval grades what its `test(t)` function produced. Each one **records** a result onto `t` and returns a chainable handle. The runner reads the recorded results to compute the verdict, so a single run reports every failing assertion rather than dying on the first. There are two deterministic surfaces: run-level methods on `t`, and `t.check` for grading a specific value. For model-graded assertions, see [Judge](./judge).

## Run-level assertions

Run-level assertions read the whole run, so they take no value. They are methods on `t` and gate by default. Several key off whether a run **parked**: paused on an unanswered human-in-the-loop (HITL) input request, waiting for an approval or answer before it can continue.

| Assertion                                           | Asserts                                                                                 |
| --------------------------------------------------- | --------------------------------------------------------------------------------------- |
| `t.completed()`                                     | The run did not fail and did not park on unanswered HITL input                          |
| `t.didNotFail()`                                    | No terminal failure and no `turn.failed`/`step.failed` events (parked runs pass)        |
| `t.waiting()`                                       | The run parked on HITL input (for approval-shaped evals)                                |
| `t.messageIncludes(token)`                          | Joined assistant text contains `token` (string or RegExp)                               |
| `t.outputEquals(value)` / `t.outputMatches(schema)` | Deep equality or Standard Schema (e.g. Zod) validation of the agent's structured output |
| `t.calledTool(name, opts?)`                         | A matching tool call happened (`input`, `output`, `isError`, `times` constraints)       |
| `t.loadedSkill(skill, opts?)`                       | Sugar for `t.calledTool("load_skill", { input: { skill }, ...opts })`                   |
| `t.notCalledTool(name)`                             | No call to `name`                                                                       |
| `t.toolOrder([...names])`                           | Tool names appear in order (other calls may interleave)                                 |
| `t.usedNoTools()`                                   | No tool calls at all                                                                    |
| `t.maxToolCalls(n)`                                 | At most `n` tool calls                                                                  |
| `t.noFailedActions()`                               | No tool, subagent, or skill action reported a failure                                   |
| `t.calledSubagent(name, opts?)`                     | A subagent delegation happened (`remoteUrl`, `output` constraints)                      |
| `t.event(predicate, label)`                         | Escape hatch: any predicate over the typed event stream                                 |

`t.completed()` subsumes `t.didNotFail()`, so reach for `completed` unless you specifically want to allow a parked run. `t.outputEquals` and `t.outputMatches` read the agent's structured output (see the [output schema guide](../guides/client/output-schema)).

```ts
await t.send("What is the weather in Brooklyn?");
t.completed();
t.calledTool("get_weather");
```

`t.calledTool` and `t.usedNoTools` are mutually exclusive; assert one or the other, never both in the same run.
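
The "in order, other calls may interleave" rule for `t.toolOrder` is a subsequence check. The sketch below illustrates that rule on plain arrays; `appearsInOrder` is a hypothetical helper, not the runner's implementation.

```ts
// True when `expected` appears in `calls` in order, allowing other calls between.
function appearsInOrder(calls: string[], expected: string[]): boolean {
  let next = 0; // index of the expected name we are still looking for
  for (const name of calls) {
    if (next < expected.length && name === expected[next]) next++;
  }
  return next === expected.length;
}
```

So a run that called `glob`, `read_file`, `grep`, then `write_file` satisfies the order `["read_file", "write_file"]`, while the reverse order does not.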

## Value assertions with `t.check`

`t.check(value, assertion)` grades an explicit value against a builder from `eve/evals/expect`. The value can be `t.reply`, a turn's `.message`, parsed JSON, or any local you computed:

```ts
import { includes, equals, matches, similarity } from "eve/evals/expect";

t.check(t.reply, includes("sunny")); // substring (gate)
t.check(parsed, equals({ city: "Brooklyn" })); // deep structural equality (gate)
t.check(parsed, matches(WeatherSchema)); // Standard Schema, e.g. Zod (gate)
t.check(t.reply, similarity("Sunny, 72F")); // fuzzy 0–1 Levenshtein (soft)
```

| Builder                | Scores                                           | Default |
| ---------------------- | ------------------------------------------------ | ------- |
| `includes(substring)`  | value (coerced to string) contains `substring`   | gate    |
| `equals(value)`        | deep structural equality                         | gate    |
| `matches(schema)`      | validates against a Standard Schema              | gate    |
| `similarity(expected)` | normalized Levenshtein similarity, 1 = identical | soft    |

Pick the cheapest builder that captures what "correct" means. When exact match is too strict but a judge model is overkill, `similarity` is the middle ground. For nuanced grading, reach for the [judge](./judge).
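
To build intuition for what a `similarity` score means, here is a standalone sketch of normalized Levenshtein similarity. The normalization (one minus edit distance over the longer length) is an assumption; eve's builder may normalize or preprocess differently. `levenshtein` and `normalizedSimilarity` are hypothetical names.

```ts
// Classic dynamic-programming edit distance, keeping one row at a time.
function levenshtein(a: string, b: string): number {
  let previous = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const current = [i];
    for (let j = 1; j <= b.length; j++) {
      const substitution = previous[j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1);
      current[j] = Math.min(previous[j] + 1, current[j - 1] + 1, substitution);
    }
    previous = current;
  }
  return previous[b.length];
}

// Assumed normalization: 1 - distance / longer length, so 1 means identical.
function normalizedSimilarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}
```

Under this definition a single-character typo in a long reply barely moves the score, which is why a threshold near `0.8` tolerates small wording drift.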

## The matcher mini-language

`t.calledTool` and `t.calledSubagent` take a matcher object: `{ input, output, isError, times }` for tools, `{ remoteUrl, output }` for subagents. Each field accepts a literal (objects partial-deep-match), a RegExp, or a function. A matcher function receives the value and returns either a boolean (acts as a predicate) or an expected value to compare against (handy for runner-assigned values like environment-provided URLs):

```ts
t.calledTool("bash", { input: { command: /^pwd/ }, isError: false, times: 1 });

t.calledTool("echo", { output: (value) => String(value).includes(marker) });

t.calledSubagent("weather", {
  remoteUrl: () => process.env.WEATHER_AGENT_URL!,
  output: /72F/,
});
```
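
The field rules above can be sketched as two small functions. This illustrates the described semantics, not the runner's code: `partialMatch` and `matchField` are hypothetical names, and comparing arrays index by index under the same partial rule is an assumption.

```ts
// Partial deep match: every key in `expected` must match in `actual`;
// extra keys on `actual` are ignored. RegExps are tested against the
// stringified value. Arrays are compared index by index (assumption).
function partialMatch(actual: unknown, expected: unknown): boolean {
  if (expected instanceof RegExp) return expected.test(String(actual));
  if (expected !== null && typeof expected === "object") {
    if (actual === null || typeof actual !== "object") return false;
    const record = actual as Record<string, unknown>;
    return Object.entries(expected).every(([key, value]) => partialMatch(record[key], value));
  }
  return Object.is(actual, expected);
}

// A function matcher acts as a predicate when it returns a boolean;
// any other return value is treated as the expected value to compare against.
function matchField(actual: unknown, matcher: unknown): boolean {
  if (typeof matcher === "function") {
    const result = (matcher as (value: unknown) => unknown)(actual);
    return typeof result === "boolean" ? result : partialMatch(actual, result);
  }
  return partialMatch(actual, matcher);
}
```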

## Run state and derived facts

Beyond the raw `t.events` stream, the runner derives typed facts the assertions read: tool calls (name, input, output, error state), subagent calls, and HITL input requests. A turn that leaves the session open for a next message is the normal end state of a successful turn; parking on unanswered HITL input is tracked separately, and that is what `t.completed()` and `t.waiting()` key off.

The built-in assertions cover almost everything. When you need to read the stream directly, `t.event(predicate, label)` is the escape hatch:

```ts
t.event(
  (events) =>
    events.some((e) => e.type === "message.completed" && e.data.message?.includes(marker)),
  "assistant reply includes the marker",
);
```

## Severity

Every assertion returns a chainable handle. Severity rides on the assertion, so there is no separate thresholds map to keep in sync.

* `.gate(threshold?)` is hard. A miss marks the eval `failed` and `eve eval` exits non-zero.
* `.soft(threshold?)` is tracked data. A below-threshold miss marks the eval `scored`, fatal only under `--strict`. With no threshold, it is tracked-only and never fails.
* `.atLeast(threshold)` is soft with a bar (equivalent to `.soft(threshold)`).

The defaults are chosen so you rarely set severity. Run-level methods and `includes`/`equals`/`matches` are gates; `similarity` and every `t.judge.*` assertion are soft. Annotate only when you deviate:

```ts
t.calledTool("get_weather").soft(); // record the tool call as a metric, don't gate
t.check(t.reply, similarity("Sunny")).atLeast(0.8); // soft bar at 0.8, fatal only under --strict
t.check(t.reply, includes("error")).soft(); // track without failing the build
```
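
One way to picture how recorded results roll up into a verdict is the sketch below. The record shape, the `passed` verdict name, the exit code `1`, and the rule that a gate with no threshold needs a full score are all assumptions made for illustration; only `failed`, `scored`, and the `--strict` behavior come from the rules above.

```ts
// Hypothetical record shapes; "passed" is an assumed name for the clean verdict.
type Recorded = { score: number; severity: "gate" | "soft"; threshold?: number };
type Verdict = "passed" | "scored" | "failed";

function verdictFor(results: Recorded[]): Verdict {
  let verdict: Verdict = "passed";
  for (const { score, severity, threshold } of results) {
    // Assumption: a gate without an explicit threshold requires a full score.
    const bar = threshold ?? (severity === "gate" ? 1 : undefined);
    if (bar === undefined || score >= bar) continue; // tracked-only, or bar met
    if (severity === "gate") return "failed"; // a gate miss is always fatal
    verdict = "scored"; // a soft miss is recorded, not fatal by default
  }
  return verdict;
}

// Only `failed` breaks the build, unless --strict also treats `scored` as fatal.
function exitCode(verdict: Verdict, strict: boolean): number {
  return verdict === "failed" || (strict && verdict === "scored") ? 1 : 0;
}
```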

## What to read next

* [Judge](./judge): LLM-graded assertions with thresholds
* [Cases](./cases): where assertions attach
* [Running evals](./running): how verdicts map to exit codes



---
title: Cases
description: Author single-turn and multi-turn evals with test(t), and fan one file out over a dataset.
---

# Cases



Each eval file is one graded case by default, and a single file can fan out over a dataset by default-exporting an array (covered below). The runner executes each `test(t)` function against the target, captures every event, and computes a verdict from the [assertions](./assertions) you recorded. Every eval shares one shape, whether single-turn, multi-turn, human-in-the-loop (HITL), or dataset-driven: one `async test(t)` function that drives the agent and asserts inline.

## Single-turn evals

The common case sends one turn and asserts on the reply. `t.send(input)` resolves once the turn settles, and `t.reply` is the last assistant message:

```ts title="evals/weather/brooklyn-forecast.eval.ts"
import { defineEval } from "eve/evals";
import { includes } from "eve/evals/expect";

export default defineEval({
  async test(t) {
    await t.send("What is the weather in Brooklyn?");
    t.completed();
    t.check(t.reply, includes("Sunny"));
  },
});
```

Some evals only care about behavior, not text. Assert on the run and skip the content check entirely:

```ts title="evals/weather/no-tools-for-greetings.eval.ts"
import { defineEval } from "eve/evals";

export default defineEval({
  async test(t) {
    await t.send("Hello!");
    t.completed();
    t.notCalledTool("get_weather");
  },
});
```

## Organizing with directories

Identity is the file path, so directories are the grouping mechanism. `evals/weather/brooklyn-forecast.eval.ts` gets the id `weather/brooklyn-forecast`, and `eve eval weather` runs everything under `evals/weather/`. Shared constants and helpers live in sibling non-eval files (any name that doesn't end in `.eval.ts`):

```text
evals/
├── weather/
│   ├── shared.ts                    # helpers, not an eval
│   ├── brooklyn-forecast.eval.ts
│   └── no-tools-for-greetings.eval.ts
└── smoke.eval.ts
```
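
The path-to-id mapping is simple enough to sketch. The rule below is inferred from the single example above (drop the leading `evals/` and the `.eval.ts` suffix), and both function names are hypothetical; the runner's own resolution may handle more cases.

```ts
// Assumed rule: drop the leading `evals/` directory and the `.eval.ts` suffix.
function evalIdFromPath(path: string): string {
  return path.replace(/^evals\//, "").replace(/\.eval\.ts$/, "");
}

// A directory filter such as `weather` selects ids under that directory.
function selectedBy(id: string, filter: string): boolean {
  return id === filter || id.startsWith(`${filter}/`);
}
```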

## Multi-turn evals

Drive several turns in sequence for branching, HITL approvals, structured output, attachments, or multiple sessions. Because assertions live in the function, an intermediate value is a local variable. Judge a draft before the next turn overwrites it, then keep going.

```ts title="evals/draft-then-send.eval.ts"
import { defineEval } from "eve/evals";
import { includes } from "eve/evals/expect";

export default defineEval({
  async test(t) {
    const draft = await t.send("Draft the follow-up email.");
    t.check(draft.message, includes("Best regards"));
    t.judge.autoevals.closedQA("professional tone", { on: draft.message }).atLeast(0.6);

    await t.send("Now send it.");
    t.calledTool("send_email");
  },
});
```

For a precondition no built-in assertion expresses, `throw`. A thrown error marks the eval `failed` with the message in the result:

```ts title="evals/session-continuity.eval.ts"
import { defineEval } from "eve/evals";
import { includes } from "eve/evals/expect";

export default defineEval({
  async test(t) {
    await t.send("My favorite word is marigold.");
    const firstSessionId = t.sessionId;

    const second = await t.send("Thanks for remembering.");
    second.expectOk();
    if (t.sessionId !== firstSessionId) {
      throw new Error(`Expected one session; got ${firstSessionId} then ${t.sessionId}.`);
    }

    t.completed();
    t.check(second.message, includes("Thanks for remembering."));
  },
});
```

## The drive API

`t` drives the primary session; `t.newSession()` returns an independent `EveEvalSession` against the same target, whose events feed the same run-level assertions.

* `t.send(input)` sends a turn and waits for it to settle. It accepts the same input as `ClientSession.send()` (a string or a structured message) and resolves to a turn carrying `.message` and `.expectOk()`.
* `t.sendFile(text, path, mediaType?)` attaches a local file as a data URL.
* `t.expectInputRequests(filter?)` asserts the previous turn parked on HITL input and returns the pending requests.
* `t.respond(...responses)` answers specific pending input requests and sends them as the next turn.
* `t.respondAll(optionId)` answers every pending input request with the same option and sends the responses as the next turn.
* `t.reply` is the last assistant message (or `null`); `t.sessionId` is the current session id; `t.events` is the full typed event stream captured so far.

Each `send` (and `respond`/`respondAll`) resolves to a turn whose `expectOk()` throws only when the turn ended in failure. A session left open for a next message is the normal end state of a successful turn.

Events from every session are captured in the result and artifacts. `t.log(message)` records debug lines into the eval artifact; `--verbose` also streams them to stdout as evals run. `t.signal` is an `AbortSignal` that fires on timeout.

For driving sessions created outside the eval, by a channel webhook or a schedule, see [Targets](./targets).

## Datasets: exporting an array

To fan one file out over a dataset, default-export an array of `defineEval(...)` values. Eval modules are ESM, so top-level `await` can load anything. Ids derive from the file name plus a zero-padded index in array order (`sql/0000`, `sql/0001`, and so on). The loaders (`loadJson`, `loadYaml` from `eve/evals/loaders`) parse fixture files relative to the app root:

```ts title="evals/sql.eval.ts"
import { defineEval } from "eve/evals";
import { loadYaml } from "eve/evals/loaders";
import { equals } from "eve/evals/expect";

const doc = await loadYaml("evals/data/cases.yaml");
const rows = doc.evals as readonly { task: string; prompt: string; sql: string }[];

export default rows.map((row) =>
  defineEval({
    description: row.task,
    async test(t) {
      await t.send(row.prompt);
      t.completed();
      t.check(t.reply, equals(row.sql));
    },
  }),
);
```

The loaders are meant for fixtures, not runtime agent code.
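
To predict which id a given row gets, the numbering can be sketched as follows. The four-digit width is inferred from the `sql/0000` example, and `datasetIds` is a hypothetical helper, not a runner API.

```ts
// Four digits matches the `sql/0000` example; the width is inferred from it.
function datasetIds(fileId: string, count: number): string[] {
  return Array.from(
    { length: count },
    (_, index) => `${fileId}/${String(index).padStart(4, "0")}`,
  );
}
```

Because ids follow array order, reordering or inserting rows in the fixture shifts the ids of every later case.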

## What to read next

* [Assertions](./assertions): assert on what the eval did
* [Judge](./judge): grade quality with an LLM judge
* [TypeScript client](../guides/client/messages): the send/turn protocol eval sessions build on



---
title: Judge
description: Grade evals with an LLM judge via t.judge.autoevals, set thresholds on the assertion, and configure the judge model.
---

# Judge



When no deterministic [assertion](./assertions) captures what "good" means (factual correctness, summary quality, free-form criteria), grade the run with an LLM judge. The `t.judge.*` assertions are the only model-backed ones, and they use a judge model that is resolved separately from the agent under test. eve only uses it for scoring, never to swap out the agent.

```ts
import { defineEval } from "eve/evals";

export default defineEval({
  async test(t) {
    await t.send("Explain quantum tunneling to a 10-year-old.");
    t.completed();
    t.judge.autoevals.closedQA("uses no math beyond arithmetic").atLeast(0.8);
  },
});
```

## The graders

The judges live under `t.judge.autoevals`. The namespace names the [Braintrust autoevals](https://github.com/braintrustdata/autoevals) grader family, so the factuality and closedQA semantics are autoevals', not eve-invented. Each grader scores `t.reply` by default and is soft by default (tracked, no gate):

| Grader                                   | Grades                                                                                 |
| ---------------------------------------- | -------------------------------------------------------------------------------------- |
| `t.judge.autoevals.factuality(expected)` | Factual consistency of the reply against an expected answer (A–E buckets)              |
| `t.judge.autoevals.summarizes(expected)` | How well the reply summarizes the expected text                                        |
| `t.judge.autoevals.closedQA(criteria)`   | Whether the reply satisfies a free-form yes/no criterion (no expected answer to match) |
| `t.judge.autoevals.sql(expected)`        | Semantic equivalence of two SQL statements                                             |

The reference or criteria is the positional argument. An options object follows:

* `on` is the value to grade, defaulting to `t.reply`. Pass an intermediate draft or parsed value to grade it instead.
* `model` and `modelOptions` are a per-call judge override (see below).

```ts
const draft = await t.send("Draft the welcome email.");
t.judge.autoevals.closedQA("professional tone", { on: draft.message }).atLeast(0.6);
```

## Soft scoring and thresholds

Judge assertions are soft, so the threshold rides on the assertion handle. There is no separate thresholds map:

* **No threshold** is tracked-only. The score lands in reports and artifacts and never fails the eval. Use it to watch a metric without gating on it.
* `.atLeast(threshold)` is a soft bar. A below-threshold score marks the eval `scored`, fatal only under `eve eval --strict`.
* `.gate(threshold)` promotes a judge to a hard gate that fails the eval outright.

```ts
t.judge.autoevals.closedQA("cites a source"); // tracked, never fails
t.judge.autoevals.closedQA("cites a source").atLeast(0.6); // soft, fails under --strict below 0.6
t.judge.autoevals.factuality(reference).gate(0.8); // hard gate at 0.8
```
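
The three severities can be modeled as a small pure function. This is an illustrative sketch, not eve's implementation: the `Severity` and `Outcome` types and `classifyJudgeScore` are hypothetical names, the `passed` label is an assumption (the text above only names `scored` and a failed gate), and the inclusive `>=` comparison is assumed from the name `atLeast`.

```ts
// Illustrative model only; these names are hypothetical, not eve's API.
type Severity =
  | { kind: "tracked" } // no threshold: recorded, never fails
  | { kind: "atLeast"; threshold: number } // soft bar
  | { kind: "gate"; threshold: number }; // hard gate

type Outcome = "passed" | "scored" | "failed";

function classifyJudgeScore(score: number, severity: Severity): Outcome {
  switch (severity.kind) {
    case "tracked":
      return "passed";
    case "atLeast":
      // A miss is visible as `scored` but is not a failure on its own.
      return score >= severity.threshold ? "passed" : "scored";
    case "gate":
      return score >= severity.threshold ? "passed" : "failed";
  }
}
```

In this model, whether a `scored` outcome is fatal is decided separately, by `--strict`.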

A judge runs once per assertion and burns tokens, so reach for one only when nothing deterministic will do. Several slow judge calls in one eval can fan out with `await Promise.all([...])`.

## Configuring the judge model

The judge model is resolved once when the runner builds `t`. It is **never** the model under test. Three levels apply, and the innermost wins:

1. **Per-call**: `t.judge.autoevals.closedQA("…", { model, modelOptions })`.
2. **Per-eval**: `defineEval({ judge: { model, modelOptions }, test })`.
3. **Project default**: `defineEvalConfig({ judge: { model, modelOptions } })` in `evals.config.ts`.

```ts title="evals/evals.config.ts"
import { defineEvalConfig } from "eve/evals";

export default defineEvalConfig({
  judge: { model: "openai/gpt-5.4-mini" }, // the default judge for every eval in this tree
});
```

```ts title="evals/quantum.eval.ts"
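import { defineEval } from "eve/evals";

// Example reference answer for the factuality judge (illustrative text).
const reference =
  "Quantum tunneling lets a particle pass through a barrier it lacks the energy to cross.";
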
export default defineEval({
  judge: { model: "anthropic/claude-opus-4.8" }, // a stronger judge for this eval
  async test(t) {
    await t.send("Explain quantum tunneling to a 10-year-old.");
    t.judge.autoevals.factuality(reference).atLeast(0.7);
    t.judge.autoevals.closedQA("is concise", { model: "anthropic/claude-haiku-4.5" }); // cheaper, per-call
  },
});
```

`judge` in `evals.config.ts` is optional, and a tree of fully deterministic evals can omit it. Calling `t.judge.*` with no judge model resolved records a failed gate: the runner scores the assertion after the `test` function runs, the missing model throws, and the eval fails with that message.
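
The innermost-wins precedence can be sketched as a simple fallback chain. This is a minimal model under stated assumptions: `JudgeConfig` and `resolveJudgeModel` are hypothetical names, only `model` is modeled, and how `modelOptions` combines across levels is not covered here.

```ts
// Hypothetical helper that mirrors the documented precedence; not part of eve.
type JudgeModel = string | object; // gateway id string, or an AI SDK LanguageModel instance

interface JudgeConfig {
  model?: JudgeModel;
}

function resolveJudgeModel(
  perCall?: JudgeConfig,
  perEval?: JudgeConfig,
  projectDefault?: JudgeConfig,
): JudgeModel | undefined {
  // Innermost wins: per-call, then per-eval, then the project default.
  return perCall?.model ?? perEval?.model ?? projectDefault?.model;
}

const projectDefault: JudgeConfig = { model: "openai/gpt-5.4-mini" };
const perEval: JudgeConfig = { model: "anthropic/claude-opus-4.8" };

// Per-call override beats both outer levels.
console.log(resolveJudgeModel({ model: "anthropic/claude-haiku-4.5" }, perEval, projectDefault));
// No per-call override: the per-eval judge applies.
console.log(resolveJudgeModel(undefined, perEval, projectDefault));
// Nothing configured at any level: no judge model is resolved.
console.log(resolveJudgeModel());
```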

A **string model id** (e.g. `"anthropic/claude-opus-4.8"`) routes through the Vercel AI Gateway and needs `AI_GATEWAY_API_KEY` or `VERCEL_OIDC_TOKEN` in the environment. An **AI SDK `LanguageModel` instance** is used directly. With a model configured but no credentials, a judge-backed eval **skips visibly** rather than failing, so the run reports the skip instead of a spurious error. For provider-specific judge settings, use `modelOptions.providerOptions`.

## What to read next

* [Assertions](./assertions): deterministic run-level and value assertions
* [Reporters](./reporters): ship judged scores to Braintrust experiments
* [Targets](./targets): local vs remote targets for judge-backed evals


---

For a semantic overview of all documentation, see [/sitemap.md](/sitemap.md)

For an index of all available documentation, see [/llms.txt](/llms.txt)

For agent-facing discovery, including API and MCP surfaces, see [/agents.md](/agents.md)

---
title: Overview
description: Define repeatable scored checks for an eve agent with defineEval and run them with eve eval.
---

# Overview



An eval is a scored check that runs your agent against real sessions and grades the result, catching regressions when you change a prompt or a tool. Drive the agent through one or more turns, assert on what it did (the run completed, the right tool ran, the reply contains the right text), and optionally ship the results to Braintrust.

Evals exercise the same HTTP surface your users hit. The runner boots (or targets) a real agent server, drives sessions through the [TypeScript client](../guides/client/overview) protocol, and grades what comes back, so a passing eval means the agent booted, accepted a request, and produced the result you asserted.

## `defineEval`

eve discovers evals under the app-root `evals/` directory, in `.eval.ts` files. Each file is one eval by default. A file can also default-export an array to fan out over a dataset (see [Cases](./cases)). The file path is the eval's identity, so you don't author an `id` or `name`. Directories group related evals (`evals/weather/brooklyn-forecast.eval.ts` becomes id `weather/brooklyn-forecast`).

```text
my-agent/
├── agent/
├── evals/
│   ├── evals.config.ts
│   ├── smoke.eval.ts
│   └── weather/
│       ├── brooklyn-forecast.eval.ts
│       └── no-tools-for-greetings.eval.ts
└── package.json
```
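
The path-to-id rule can be pictured with a small helper. This is a sketch only: `evalIdFromPath` is a hypothetical function, and it assumes paths are given relative to the app root.

```ts
// Hypothetical helper: derives an eval id from a path relative to the app root.
function evalIdFromPath(filePath: string): string {
  const normalized = filePath.replace(/\\/g, "/"); // tolerate Windows separators
  const match = /^evals\/(.+)\.eval\.ts$/.exec(normalized);
  const id = match?.[1];
  if (id === undefined) {
    throw new Error(`Not an eval file under evals/: ${filePath}`);
  }
  return id;
}

console.log(evalIdFromPath("evals/weather/brooklyn-forecast.eval.ts"));
console.log(evalIdFromPath("evals/smoke.eval.ts"));
```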

An eval is a single `async test(t)` function. You drive the agent with `t` and assert on the run with the same `t`:

```ts title="evals/weather/brooklyn-forecast.eval.ts"
import { defineEval } from "eve/evals";
import { includes } from "eve/evals/expect";

export default defineEval({
  description: "Basic message and tool-usage coverage for the weather agent.",
  async test(t) {
    await t.send("What is the weather in Brooklyn?");
    t.completed();
    t.calledTool("get_weather");
    t.check(t.reply, includes("Sunny"));
  },
});
```

`test` is the only required field. The rest are optional: `description`, `judge`, `tags`, `metadata`, `timeoutMs`, and `reporters`. The init template adds `evals/**/*.ts` to `tsconfig.json`, so your eval code type-checks alongside the app.

## `evals.config.ts`

Every `evals/` directory needs exactly one `evals.config.ts` at its root. It declares the defaults every eval shares:

```ts title="evals/evals.config.ts"
import { defineEvalConfig } from "eve/evals";
import { Braintrust } from "eve/evals/reporters";

export default defineEvalConfig({
  judge: { model: "openai/gpt-5.4-mini" },
  reporters: [Braintrust({ projectName: "my-agent" })],
});
```

Every field is optional. `judge` sets the default model for [LLM-as-judge](./judge) assertions (`t.judge.*`); a tree of fully deterministic evals can omit it. `reporters`, `maxConcurrency`, and `timeoutMs` round out the defaults. Config `reporters` observe every eval in the run, so set one `Braintrust()` here instead of adding it to each eval. CLI flags (`--max-concurrency`, `--timeout`) and per-eval values take precedence over the config defaults.

## The `t` context

`t` is both the driver and the assertion surface. There are no separate `input`, `run`, `checks`, or `scores` fields. You write ordinary control flow, sending turns and asserting inline.

* **Drive** the agent: `t.send(...)`, `t.respond(...)`, `t.respondAll(...)`, `t.sendFile(...)`, `t.expectInputRequests(...)`, `t.newSession()`. Read what came back with `t.reply` (the last assistant message), `t.sessionId`, and `t.events`. See [Cases](./cases).
* **Assert** with three surfaces, covered next.

## Three assertion surfaces

Each surface matches a genuinely different kind of judgment:

* **Run-level methods** read the whole run, like `t.completed()`, `t.calledTool("get_weather")`, `t.usedNoTools()`, and `t.toolOrder([...])`. They take no value because they observe the run itself. See [Assertions](./assertions).
* **`t.check(value, assertion)`** grades an explicit value with a deterministic builder from `eve/evals/expect`, such as `t.check(t.reply, includes("sunny"))`. Grade `t.reply`, an intermediate draft, parsed JSON, or anything else. See [Assertions](./assertions).
* **`t.judge.autoevals.*`** is the LLM-as-judge surface, like `t.judge.autoevals.closedQA("cites a source")`. It grades `t.reply` by default and uses the configured judge model, never the agent under test. See [Judge](./judge).

## Gate vs soft

Every assertion returns a chainable handle, so severity rides on the assertion itself. There is no separate thresholds map.

* **Gates** are hard. A failed gate marks the eval `failed` and `eve eval` exits non-zero. Run-level methods, `includes`, `equals`, and `matches` are gates by default.
* **Soft** assertions are tracked data. They land in reports and artifacts, and a below-threshold soft assertion marks the eval `scored` (visible but not fatal, unless you pass `--strict`). `similarity` and every `t.judge.*` assertion are soft by default. A soft assertion with no threshold is tracked-only and never fails.

Override per assertion: `.gate(threshold?)` promotes to a hard gate, `.soft(threshold?)` demotes to tracked, and `.atLeast(threshold)` is a soft assertion with a bar.

```ts
t.completed(); // gate
t.calledTool("get_weather").soft(); // record as a metric, don't gate
t.judge.autoevals.closedQA("cites a source"); // soft, tracked (no threshold)
t.judge.autoevals.factuality(reference).atLeast(0.7); // soft, fails under --strict below 0.7
```

## Run evals with `eve eval`

```bash
eve eval                       # run all discovered evals against a local dev server
eve eval weather               # run one eval, or every eval under evals/weather/
eve eval --url https://<app>   # target an existing server or deployment
```

Exit code `0` means every eval passed its gates. See [Running evals](./running) for the full flag list, exit codes, and CI guidance.

## A good baseline

Most apps do fine with a few small smoke evals. Assert behavior with `t.completed()` plus one or two content checks, keep dataset fixtures in `evals/data/`, and reach for a judge or Braintrust only when you need fuzzy grading or shared result review. In CI, run `eve eval --strict` so soft threshold misses fail the build too.

## What to read next

The rest of this section covers each piece:

* [Cases](./cases): single-turn evals, scripted multi-turn evals, and dataset fan-out
* [Assertions](./assertions): run-level methods and `t.check` value assertions, with matchers and severity
* [Judge](./judge): LLM-as-judge grading and the judge model
* [Targets](./targets): local vs remote targets for the same eval files
* [Reporters](./reporters): Braintrust experiments and JUnit XML
* [Running evals](./running): the `eve eval` CLI, exit codes, and artifacts
* [Tools](../tools): the surface most evals assert on



---
title: Reporters
description: Ship eval results to Braintrust experiments or JUnit XML. eve runs and scores everything itself.
---

# Reporters



eve runs and grades everything itself; reporters ship the results out. The CLI prints a console summary by default (one line per eval, with failed assertions and their messages), and reporters from `eve/evals/reporters` add destinations on top.

You are responsible for ensuring any observability or eval provider is approved for the data exported to it.

Reporters attach in two places. Declare them in `evals.config.ts` to observe **every** eval in the run. This is the usual choice for a shared destination like one Braintrust experiment, because you don't repeat the reporter in each file. Alternatively, list them on an individual eval's `reporters` to scope a destination to that eval (or to a group of evals that share one instance).

## Braintrust

`Braintrust(...)` uploads eval results to Braintrust experiments. Put one instance in the config so it covers the whole run:

```ts title="evals/evals.config.ts"
import { defineEvalConfig } from "eve/evals";
import { Braintrust } from "eve/evals/reporters";

export default defineEvalConfig({
  judge: { model: "openai/gpt-5.4-mini" },
  reporters: [Braintrust({ projectName: "weather-agent" })],
});
```

Need a destination for only some evals? Attach it per eval instead:

```ts title="evals/brooklyn-forecast.eval.ts"
import { defineEval } from "eve/evals";
import { Braintrust } from "eve/evals/reporters";

export default defineEval({
  reporters: [Braintrust({ projectName: "weather-agent" })],
  async test(t) {
    await t.send("What is the weather in Brooklyn?");
    t.completed();
  },
});
```

The reporter config takes an optional `projectName` and `experimentName`, plus a base experiment (by name or id) to diff against. Gate assertions log as binary scores under a `gate:` prefix so experiments diff gate regressions the same way they diff soft-score regressions. Eval `metadata` rides along to reporters.

A reporter instance observes the evals that reference it. Share one instance across several evals (the config, a `shared.ts` export, or every entry of a dataset array) and their results land in a single experiment. Listing the same config reporter on an eval too does not double-report it.

Braintrust needs its SDK installed in the app and credentials in the environment: install the `braintrust` package (`npm install braintrust`) and set `BRAINTRUST_API_KEY`. Pass `--skip-report` to run the eval without shipping results, which also suppresses config reporters and is useful locally when iterating.

## JUnit

`JUnit({ filePath })` writes JUnit XML for CI annotations. The `--junit <path>` CLI flag does the same thing without touching the eval file, usually the better fit because CI owns the output path, not the eval:

```bash
eve eval --strict --junit .eve/junit.xml
```

Each eval becomes one `<testcase>` named by its path-derived id; failed gates and execution errors land as failure messages on the matching test case, so CI surfaces them inline.

## Custom reporters

A reporter implements the `EvalReporter` interface from `eve/evals/reporters` and receives the same structured results the built-ins do. The runner calls three lifecycle methods, each of which may return a promise for async work like a remote upload:

```ts
interface EvalReporter {
  onRunStart(evaluations: readonly EveEval[], target: EveEvalTarget): void | Promise<void>;
  onEvalComplete(result: EveEvalResult): void | Promise<void>;
  onRunComplete(summary: EveEvalRunSummary): void | Promise<void>;
}
```

`onRunStart` fires once before any eval runs, `onEvalComplete` fires after each observed eval with its checks, scores, and verdict, and `onRunComplete` fires once with the aggregated summary. Reach for a custom reporter only when a destination isn't covered. The per-run artifacts under `.eve/evals/` already capture everything for ad-hoc inspection.
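
As a rough illustration of the lifecycle, here is a reporter that counts verdicts. It is a sketch under stated assumptions: `EvalLike`, `ResultLike`, and `ReporterLike` are local stand-ins for the real eve types (which are not reproduced here), and the `id` and `verdict` field names on a result are assumed for the example.

```ts
// Stand-in shapes; the real EveEval / EveEvalResult types come from eve.
interface EvalLike {
  id: string;
}

interface ResultLike {
  id: string;
  verdict: string; // assumed field name for the per-eval verdict
}

interface ReporterLike {
  onRunStart(evaluations: readonly EvalLike[], target: unknown): void | Promise<void>;
  onEvalComplete(result: ResultLike): void | Promise<void>;
  onRunComplete(summary: unknown): void | Promise<void>;
}

class VerdictCounter implements ReporterLike {
  readonly counts = new Map<string, number>();
  expected = 0;

  // Fires once before any eval runs: reset state and remember the eval count.
  onRunStart(evaluations: readonly EvalLike[]): void {
    this.expected = evaluations.length;
    this.counts.clear();
  }

  // Fires after each observed eval: tally its verdict.
  onEvalComplete(result: ResultLike): void {
    this.counts.set(result.verdict, (this.counts.get(result.verdict) ?? 0) + 1);
  }

  // Fires once at the end: print one summary line.
  onRunComplete(): void {
    const parts = Array.from(this.counts.entries()).map(([verdict, n]) => `${verdict}=${n}`);
    console.log(`${this.expected} evals: ${parts.join(", ")}`);
  }
}
```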

## What to read next

* [Running evals](./running): console output, `--json`, and artifacts
* [Judge](./judge): what the reported numbers mean



---
title: Running Evals
description: "The eve eval CLI: flags, filters, exit codes, artifacts, and how to wire evals into CI."
---

# Running Evals



`eve eval` discovers every `.eval.ts` file under `evals/`, boots a local dev server (or targets a remote one), runs the evals concurrently, and prints a per-eval summary.

```bash
eve eval                       # run all discovered evals locally
eve eval weather smoke         # run selected evals (an id, or a directory prefix)
eve eval --url https://<app>   # target a remote app instead of a local host
eve eval --tag fast            # only evals carrying a tag
eve eval --strict              # soft below-threshold assertions also fail the exit code
eve eval --timeout 60000       # per-eval timeout in milliseconds
eve eval --max-concurrency 4   # cap concurrent eval executions (default 8)
eve eval --junit .eve/junit.xml  # write JUnit XML
eve eval --list                # print discovered evals without running
eve eval --verbose             # stream per-eval t.log lines to stdout
eve eval --json                # machine-readable output
eve eval --skip-report         # skip config and eval-defined reporters (e.g. Braintrust)
```

Positional ids match exactly or by directory prefix: `eve eval weather` runs `evals/weather.eval.ts`, every eval under `evals/weather/`, and every entry of an array-exported `weather.eval.ts`.
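
The exact-or-prefix rule can be sketched as follows. `matchesSelector` is a hypothetical helper for illustration; it works on path-derived ids and does not model how entries of an array-exported file are identified.

```ts
// Hypothetical helper: does a positional selector pick up this eval id?
function matchesSelector(evalId: string, selector: string): boolean {
  const trimmed = selector.replace(/\/+$/, ""); // treat "weather/" like "weather"
  // Exact id match, or the selector names a directory that contains the eval.
  return evalId === trimmed || evalId.startsWith(`${trimmed}/`);
}

console.log(matchesSelector("weather", "weather")); // exact id
console.log(matchesSelector("weather/brooklyn-forecast", "weather")); // directory prefix
console.log(matchesSelector("weather-extended", "weather")); // shared text, different directory
```

Matching on `selector + "/"` rather than on the raw string keeps `weather` from selecting a sibling such as `weather-extended`.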

## Exit codes

| Code | Means                                                                           |
| ---- | ------------------------------------------------------------------------------- |
| `0`  | Every eval passed its gates (and soft thresholds, under `--strict`)             |
| `1`  | Any eval failed (a failed gate, an execution error, or a strict threshold miss) |
| `2`  | Configuration error                                                             |
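
A rough model of the table, assuming each eval ends in one of three verdicts. The `Verdict` type and `exitCodeFor` are hypothetical names for illustration, `passed` is an assumed label, and execution errors are folded into `failed` here.

```ts
// Illustrative model of the exit-code table; not eve's implementation.
type Verdict = "passed" | "scored" | "failed";

function exitCodeFor(
  verdicts: readonly Verdict[],
  options: { strict?: boolean; configError?: boolean } = {},
): 0 | 1 | 2 {
  if (options.configError === true) return 2; // configuration error
  const anyFailure = verdicts.some(
    // A soft threshold miss (`scored`) only counts as a failure under --strict.
    (verdict) => verdict === "failed" || (options.strict === true && verdict === "scored"),
  );
  return anyFailure ? 1 : 0;
}

console.log(exitCodeFor(["passed", "scored"])); // soft miss tolerated without --strict
console.log(exitCodeFor(["passed", "scored"], { strict: true })); // soft miss is fatal under --strict
```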

## Artifacts

Each run drops artifacts under `.eve/evals/<timestamp>/`: a run `summary.json`, a `results.jsonl` index, and per-eval assertion results, verdicts, captured event streams, and `t.log` lines under `evals/`. The console output stays tight on purpose; when an eval fails, the artifact has the full story.

## CI

A solid CI invocation is strict and machine-reportable:

```bash
eve eval --strict --junit .eve/junit.xml
```

* `--strict` turns soft threshold misses into failures, so score regressions block the merge.
* `--junit` gives the CI provider per-eval annotations; upload the `.eve/evals/` directory as a failure artifact for the full event streams.

Evals run against a live model, so the CI environment must provide the model-provider credentials. Against a deployed app, add `--url`:

```bash
eve eval --strict --url "$DEPLOY_URL" --junit .eve/junit.xml
```

## What to read next

* [Targets](./targets): what `--url` interacts with
* [Reporters](./reporters): Braintrust and JUnit output
* [CLI reference](../reference/cli): the rest of the `eve` CLI



---
title: Targets
description: Point evals at a local dev server or a deployment with the same eval files.
---

# Targets



An eval target is always an HTTP URL. `eve eval` starts a local dev server, while `eve eval --url <url>` runs against an existing server or deployment. The same eval files work for both, which is what makes evals usable as end-to-end tests in CI.

The runner polls `/eve/v1/health`, verifies `/eve/v1/info`, and exposes the live target as `t.target` inside the `test` function.

## Target helpers

```ts title="evals/heartbeat.eval.ts"
import { defineEval } from "eve/evals";

export default defineEval({
  async test(t) {
    const { sessionIds } = await t.target.dispatchSchedule("heartbeat");
    await t.target.attachSession(sessionIds[0]!);
    t.completed();
    t.calledTool("send_report");
  },
});
```

* `t.target.fetch(path, init)` performs an authenticated fetch against the target, useful for channel and webhook ingress. See [Authentication](#authentication) for how the runner authenticates.
* `t.target.dispatchSchedule(id)` triggers a [schedule](../schedules) through the dev-only schedule route and returns the session ids it created. It works only against a target with dev routes enabled (the local `eve eval` dev server, or a deployment running in development mode), and throws otherwise.
* `t.target.attachSession(sessionId, { startIndex? })` consumes one turn from a session created outside the eval, by a channel or a schedule, so its events feed the run-level assertions. `startIndex` skips events before that position, so a session already partway through its stream resumes from where you left off rather than replaying from the start.

Sessions attached this way are full `EveEvalSession`s: you can keep driving them with `send` and read their event streams. The run-level assertions on `t` (`t.completed()`, `t.calledTool(...)`) read the whole run, including attached sessions.

## Authentication

Local targets send no auth: `eve eval` owns the dev server it boots. A remote `--url` target connects with the same credentials as every other development client, resolved in this order:

* A Vercel OIDC trusted-IDP token, sent as a per-request header. It bypasses Deployment Protection without a per-project secret, so a CI job with a pulled OIDC token reaches a protected preview deployment without extra setup.
* An `x-vercel-protection-bypass` header, added when `VERCEL_AUTOMATION_BYPASS_SECRET` is set.
* A bearer token resolved from the same OIDC cascade.
* `EVE_EVAL_AUTH_TOKEN`, which overrides the bearer with a static token for targets whose auth is not OIDC-based.

`t.target.fetch(path, init)` carries these same credentials, so channel and webhook ingress you exercise through it authenticates the same way the session protocol does.

## What to read next

* [Running evals](./running): `--url` and the rest of the CLI in practice
* [Schedules](../schedules): the surface `dispatchSchedule` drives
* [Channels](../channels/overview): ingress you can exercise with `target.fetch`



---
title: Auth & Route Protection
description: Secure your agent's HTTP routes with an ordered auth walk, verifier helpers, and connection OAuth via Vercel Connect.
---

# Auth & Route Protection



eve has two independent auth systems:

* **Route auth** (inbound) decides who can reach your agent's HTTP routes. It runs at the channel layer, gating the request before any model work runs.
* **Tool and connection auth** (outbound) is how your agent signs in to an external service it calls, like an OAuth MCP server. It happens later, when a tool or connection actually reaches out.

Start with route auth.

## Route auth

The route-auth policy lives on the HTTP channel factory (`agent/channels/eve.ts`) and guards three routes:

* `POST /eve/v1/session`
* `POST /eve/v1/session/:sessionId`
* `GET /eve/v1/session/:sessionId/stream`

These routes are protected by the channel's auth policy. eve fails closed by default: production browser traffic is rejected unless you configure an authenticator that accepts it, and anonymous access requires an explicit `none()`.

`GET /eve/v1/health` is always public and skips the walk entirely, so load balancers and uptime monitors can probe it without credentials.

```ts title="agent/channels/eve.ts"
import { eveChannel } from "eve/channels/eve";
import { localDev, vercelOidc } from "eve/channels/auth";

export default eveChannel({
  auth: [localDev(), vercelOidc()],
});
```

`vercelOidc()` is a convenience for Vercel-hosted agents and Vercel-to-Vercel callers, not a requirement. If your app already has users, sessions, API keys, or an identity provider, put that authenticator in the `auth` walk instead. Custom `AuthFn` entries are first-class and can fully replace Vercel OIDC.

## The ordered auth walk

`auth` takes a single `AuthFn` or an array that eve walks in order. Each entry has three possible outcomes:

* returns a `SessionAuthContext`: accept the request and stop the walk
* returns `null` / `undefined`: skip to the next entry
* **throws**: reject with a specific status

If every entry skips, the request gets a `401`. An empty array `auth: []` rejects everything.

```ts
import { type AuthFn, localDev, vercelOidc } from "eve/channels/auth";
import { eveChannel } from "eve/channels/eve";
import { getSession } from "@/lib/auth";

function appSession(): AuthFn<Request> {
  return async (request) => {
    const session = await getSession(request);
    if (!session) return null; // skip; fall through to the next entry
    return {
      attributes: { providerId: session.providerId },
      authenticator: "app",
      principalId: session.userId,
      principalType: "user",
    };
  };
}

export default eveChannel({
  auth: [appSession(), localDev(), vercelOidc()],
});
```

Put your own providers ahead of the catch-all helpers. Any entry that doesn't recognize the caller returns `null`, and the walk moves on. On non-Vercel hosts, omit `vercelOidc()` unless you specifically want to accept Vercel-issued tokens.

To reject with a precise status instead of skipping, throw:

```ts
import { ForbiddenError, UnauthenticatedError } from "eve/channels/auth";

throw new UnauthenticatedError({
  code: "authentication_required",
  message: "Sign in to continue.",
}); // 401
throw new ForbiddenError({ message: "Not allowed on this workspace." }); // 403
```

Any other thrown error follows the normal channel failure path. When building a custom channel on `defineChannel`, call `routeAuth(request, auth)` from `eve/channels/auth` to reuse the same walk semantics.
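
The walk semantics described in this section can be modeled in a few lines. This is a simplified sketch, not the code behind `routeAuth`: `AuthContext`, `AuthEntry`, `WalkRejected`, and `walkAuth` are hypothetical stand-ins, and the context shape is reduced to two fields.

```ts
// Simplified stand-ins for illustration; not eve's types.
interface AuthContext {
  authenticator: string;
  principalId: string;
}

type AuthEntry<R> = (
  request: R,
) => AuthContext | null | undefined | Promise<AuthContext | null | undefined>;

class WalkRejected extends Error {
  readonly status: number;
  constructor(status: number, message: string) {
    super(message);
    this.status = status;
  }
}

async function walkAuth<R>(
  request: R,
  auth: AuthEntry<R> | readonly AuthEntry<R>[],
): Promise<AuthContext> {
  const entries = typeof auth === "function" ? [auth] : auth;
  for (const entry of entries) {
    // A throw from an entry propagates and rejects the request immediately.
    const result = await entry(request);
    if (result) return result; // accepted: stop the walk
    // null / undefined: skip to the next entry
  }
  // Every entry skipped (or the array was empty): reject with 401.
  throw new WalkRejected(401, "authentication_required");
}
```

Because an empty array never enters the loop, `auth: []` falls straight through to the `401`, which matches the "rejects everything" behavior described above.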

## Verifier helpers

`eve/channels/auth` ships these channel-auth helpers:

| Helper           | Use when                                                                  |
| ---------------- | ------------------------------------------------------------------------- |
| `localDev()`     | Local development. Accepts requests addressed to a loopback hostname.     |
| `vercelOidc()`   | The common Vercel deployment path. Verifies a Vercel OIDC bearer JWT.     |
| `none()`         | You want to accept anonymous traffic explicitly (use as the final entry). |
| `httpBasic(...)` | Operator or service access via a shared username/password.                |
| `jwtHmac(...)`   | You control a shared-secret JWT signer.                                   |
| `jwtEcdsa(...)`  | You verify asymmetric JWTs minted by another system.                      |
| `oidc(...)`      | You want eve to verify OIDC-issued tokens from an arbitrary issuer.       |

Exercise caution for agents that process non-public, sensitive, regulated, or production data unless you have implemented other access controls.

### `localDev()`

Authenticates a synthetic `local-dev` principal, but only when the inbound request is addressed to a loopback hostname (`localhost`, `*.localhost`, `127.0.0.0/8`, or `::1`). The check keys off the request URL's hostname rather than the bare `process.env.VERCEL` flag, and that's deliberate: a deployment outside Vercel leaves `VERCEL` unset, so sniffing that flag alone would wave through all public traffic. There's one process-level exception. `vercel dev`, detected by `VERCEL=1` and `VERCEL_ENV=development` together, opens the local dev server even when it serves over a non-loopback host. Every other non-loopback request returns `null` and falls through.

`localDev()` trusts the advertised hostname, so an attacker who can inject a `Host` header (no normalizing proxy in front of your origin) can spoof it. Always layer a real authenticator on top; never run on `localDev()` alone.
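
To make the loopback rule concrete, here is a sketch of a hostname check covering the four cases listed above. `isLoopbackHostname` is a hypothetical function, not eve's implementation, and it does not model the `vercel dev` exception.

```ts
// Hypothetical check for illustration; mirrors the loopback cases in the text.
function isLoopbackHostname(hostname: string): boolean {
  // URL hostnames keep square brackets around IPv6 literals, so strip them.
  const host = hostname.toLowerCase().replace(/^\[|\]$/g, "");
  if (host === "localhost" || host.endsWith(".localhost")) return true;
  if (host === "::1") return true;
  // 127.0.0.0/8: a dotted-quad IPv4 address whose first octet is 127.
  const octets = host.split(".");
  return (
    octets.length === 4 &&
    octets[0] === "127" &&
    octets.every((octet) => /^\d{1,3}$/.test(octet) && Number(octet) <= 255)
  );
}

console.log(isLoopbackHostname("app.localhost")); // loopback subdomain
console.log(isLoopbackHostname("127.0.0.1.example.com")); // public name that merely starts with 127.0.0.1
```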

### `vercelOidc()`

Verifies a bearer JWT against the [Vercel OIDC issuer](https://vercel.com/docs/oidc). Tokens minted for the current `VERCEL_PROJECT_ID` are always accepted, which is why internal subagent and runtime callers authenticate with zero configuration. Tokens carrying an `external_sub` authenticate as user callers, but only when their `project_id` matches `VERCEL_PROJECT_ID` and their environment matches `VERCEL_TARGET_ENV` / `VERCEL_ENV`. In that case `external_sub` becomes the session subject, and the profile claims (`name`, `picture`, `email`) show up in `ctx.session.auth.current.attributes`. To admit tokens minted by other Vercel projects, pass `subjects: [...]` (AWS IAM-style `*` wildcards).

Auth fails closed: routes reject unauthenticated traffic by default, and the OIDC user branch checks a token that carries `external_sub` against `VERCEL_PROJECT_ID` and the deployment environment, returning `false` when either is unset. An external-subject token therefore cannot authenticate on a deployment that hasn't pinned its project.

#### `subjects` patterns and `vercelSubject(...)`

Each `subjects` entry is matched against the token's `sub` claim, which Vercel shapes as `owner:<team>:project:<name>:environment:<env>`. Hand-writing that string is a footgun: a typo silently rejects every caller, and an over-broad `*` wildcard silently lets unrelated ones in. Build the pattern with `vercelSubject(...)` instead. It rejects malformed input at construction time, and defaults `environment` to `"production"` when you omit it, so an unspecified environment cannot silently accept preview or development tokens:

```ts
import { vercelOidc, vercelSubject } from "eve/channels/auth";

vercelOidc({
  subjects: [
    vercelSubject({ teamSlug: "partner", projectName: "data" }), // environment defaults to "production"
    vercelSubject({ teamSlug: "acme", projectName: "agent", environment: "*" }),
  ],
});
```

`teamSlug` and `projectName` are the human-readable slugs Vercel embeds in `sub` (not the stable `team_…` / `prj_…` IDs), so they can't contain `:` or `*`. `environment` is `"production" | "preview" | "development" | "*"`. Only hand-write the subject string yourself when you actually mean to match across teams with a wildcard.
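
The subject shape and wildcard matching can be sketched as below. This is an illustration, not the code behind `vercelSubject(...)` or `vercelOidc()`: `buildSubject` and `subjectMatches` are hypothetical names, and the sketch assumes `*` matches any run of characters.

```ts
// Hypothetical helpers for illustration; not eve's implementation.
type Environment = "production" | "preview" | "development" | "*";

function buildSubject(input: {
  teamSlug: string;
  projectName: string;
  environment?: Environment;
}): string {
  for (const slug of [input.teamSlug, input.projectName]) {
    // Slugs can't contain the separator or the wildcard character.
    if (slug.length === 0 || /[:*]/.test(slug)) {
      throw new Error(`Invalid slug: ${JSON.stringify(slug)}`);
    }
  }
  const environment = input.environment ?? "production"; // safe default
  return `owner:${input.teamSlug}:project:${input.projectName}:environment:${environment}`;
}

function subjectMatches(pattern: string, sub: string): boolean {
  // Escape regex metacharacters in each literal part, then join parts with ".*".
  const source = pattern
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${source}$`).test(sub);
}

const productionOnly = buildSubject({ teamSlug: "partner", projectName: "data" });
console.log(productionOnly);
console.log(subjectMatches(productionOnly, "owner:partner:project:data:environment:preview"));
```

Under these assumptions, the production-only pattern does not match a preview token, which is the failure mode the default is meant to prevent.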

### Custom verifiers

When none of the shipped helpers fit, write your own `AuthFn` (the array example above) or call the low-level verifiers directly. Each verifier is the pure function sitting behind the matching strategy helper, and returns `{ ok: true, sessionAuth }` or `{ ok: false }`:

| Verifier                               | Behind         | Input                            |
| -------------------------------------- | -------------- | -------------------------------- |
| `verifyHttpBasic(header, credentials)` | `httpBasic()`  | raw `Authorization` header value |
| `verifyJwtHmac(token, config)`         | `jwtHmac()`    | bearer token (HMAC-signed JWT)   |
| `verifyJwtEcdsa(token, config)`        | `jwtEcdsa()`   | bearer token (ECDSA-signed JWT)  |
| `verifyOidc(token, config)`            | `oidc()`       | bearer token (OIDC, any issuer)  |
| `verifyVercelOidc(token, opts)`        | `vercelOidc()` | bearer token (Vercel OIDC)       |

Pull the token with `extractBearerToken(request.headers.get("authorization"))` before you hand it to the JWT/OIDC verifiers. The configs (`VerifyJwtHmacConfig`, `VerifyJwtEcdsaConfig`, `VerifyOidcConfig`) take `issuer`, `audiences`, the signing material (`secret` / `publicKey` / `discoveryUrl`), and optional `subjects` / `claims` matchers.

```ts
import { extractBearerToken, verifyJwtHmac, type AuthFn } from "eve/channels/auth";

function hmacAuth(): AuthFn<Request> {
  return async (request) => {
    const token = extractBearerToken(request.headers.get("authorization"));
    if (!token) return null; // no bearer token: skip to the next entry
    const result = await verifyJwtHmac(token, {
      algorithm: "HS256",
      issuer: "https://auth.example.com",
      audiences: ["agent"],
      secret: process.env.JWT_SECRET!,
    });
    return result.ok ? result.sessionAuth : null;
  };
}
```

### Failure responses in custom `defineChannel` routes

If a `defineChannel` route handler runs its own checks instead of `routeAuth`, it can still emit a framework-shaped failure with `createUnauthorizedResponse(...)`. You get back a `Response` with `cache-control: no-store`, a `{ ok: false, code, error }` JSON body, and one `www-authenticate` header per challenge:

```ts title="agent/channels/intake.ts"
import { defineChannel, POST } from "eve/channels";
import { createUnauthorizedResponse } from "eve/channels/auth";

export default defineChannel({
  routes: [
    POST("/message", async (req, { send }) => {
      // isAllowed is your own request check (not provided by eve)
      if (!isAllowed(req)) {
        return createUnauthorizedResponse({
          status: 403, // defaults to 401; code defaults to "forbidden" / "unauthorized"
          message: "Not allowed on this workspace.",
          challenges: [{ scheme: "Bearer" }],
        });
      }
      // authenticated: handle the request
    }),
  ],
});
```

`UnauthenticatedError` and `ForbiddenError` wrap this builder (status `401` / `403`). Throw those from an `AuthFn` that `routeAuth` walks. Call `createUnauthorizedResponse` directly only when you're returning a `Response` from a hand-rolled route.

## Network policy

`eve/channels/auth` exports `createIpAllowList(...)` and `isIpAllowed(...)` for cutting off requests before any model work starts. A request that fails the network policy is dropped ahead of both auth and runtime execution.
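
The signatures of `createIpAllowList` and `isIpAllowed` are not shown here, so the sketch below illustrates only the underlying idea: an IPv4 CIDR membership check run before any other work. `ipv4InCidr` and `isIpv4Allowed` are hypothetical names, and IPv6 and proxy-header handling are out of scope.

```ts
// Hypothetical helpers for illustration; not eve's createIpAllowList / isIpAllowed.

// Converts dotted-quad IPv4 text to an integer, or null when malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// True when `ip` falls inside `cidr`. A bare address is treated as a /32.
function ipv4InCidr(ip: string, cidr: string): boolean {
  const [base = "", bitsText = "32"] = cidr.split("/");
  if (!/^\d{1,2}$/.test(bitsText)) return false;
  const bits = Number(bitsText);
  if (bits > 32) return false;
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(base);
  if (ipInt === null || baseInt === null) return false;
  // Two addresses share a prefix when they land in the same block of this size.
  const blockSize = 2 ** (32 - bits);
  return Math.floor(ipInt / blockSize) === Math.floor(baseInt / blockSize);
}

// Fails closed: an empty list or a malformed address allows nothing.
function isIpv4Allowed(ip: string, allowList: readonly string[]): boolean {
  return allowList.some((cidr) => ipv4InCidr(ip, cidr));
}
```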

## Replace `placeholderAuth` before production

`eve init` scaffolds `agent/channels/eve.ts` with a `placeholderAuth()` guardrail:

```ts
import { eveChannel } from "eve/channels/eve";
import { localDev, placeholderAuth, vercelOidc } from "eve/channels/auth";

export default eveChannel({
  auth: [localDev(), vercelOidc(), placeholderAuth()],
});
```

In production, `placeholderAuth()` returns a structured `401` so a generated web chat app can say "auth isn't configured yet" instead of throwing an internal error. Replace it before a browser caller submits a production request: swap in your app's `AuthFn` or one of the shipped helpers. Delete the authored channel file entirely and eve falls back to the framework default `[localDev(), vercelOidc()]`, which also rejects production browser traffic.

You do not have to keep `vercelOidc()` in the final policy. For a self-hosted app, an app-embedded frontend, or any deployment that uses a non-Vercel identity system, use `httpBasic()`, `jwtHmac()`, `jwtEcdsa()`, generic `oidc()`, or a custom `AuthFn` that maps your verified user/session/API key into a `SessionAuthContext`.

Keep secret values (`ROUTE_AUTH_BASIC_PASSWORD`, signing keys) in environment variables. Route-auth secrets never land in compiled artifacts. The runtime re-materializes them from the authored channel definition at boot.

## What reaches `ctx.session.auth`

Inside runtime code, `ctx.session.auth` carries the result of the channel's route auth (the walk above) forward as the caller snapshot:

* `auth.current`: the caller on the active inbound turn.
* `auth.initiator`: the caller that started the durable session.
* A follow-up message updates `auth.current` but leaves `auth.initiator` alone. When a different caller follows up on the same session, `auth.current` tracks the new caller for that turn while `auth.initiator` stays pinned to whoever started it.
* Both are `null` only on internal runtime paths (subagents, for instance) that never went through an authored route. HTTP traffic always populates `auth.current`, since the walk either accepts with a `SessionAuthContext` or returns `401`.

Use the principal on `auth.current` (or `auth.initiator`) to scope tools, resolve [dynamic capabilities](./dynamic-capabilities) per principal, or enforce tenant boundaries. There's no second per-session ownership ACL stacked on top of route auth. Access is decided at the HTTP boundary, and the durable session carries the caller snapshot forward into your runtime code.

Route auth does not enforce session ownership. If multiple users or tenants can reach the same route, you must implement the per-user, per-tenant, or per-session authorization your application requires.
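
One way to layer that authorization on top of the caller snapshot is sketched below. The `Caller` and `SessionAuth` types are hypothetical stand-ins, since the real `SessionAuthContext` fields are not shown on this page, and `tenantId` is an assumed claim. `checkSessionAccess` is likewise an illustrative name.

```ts
// Hypothetical shapes for illustration: the real SessionAuthContext fields
// are not shown on this page, and `tenantId` is an assumed claim.
type Caller = { subject: string; tenantId: string };
type SessionAuth = { current: Caller | null; initiator: Caller | null };
type AccessDecision = { allowed: true } | { allowed: false; reason: string };

function checkSessionAccess(
  auth: SessionAuth,
  resourceTenantId: string,
  options: { requireInitiator?: boolean } = {},
): AccessDecision {
  const { current, initiator } = auth;
  // Both are null only on internal runtime paths; this sketch denies those.
  if (current === null) {
    return { allowed: false, reason: "no caller on this turn" };
  }
  // Tenant boundary: the active caller must belong to the resource's tenant.
  if (current.tenantId !== resourceTenantId) {
    return { allowed: false, reason: "tenant mismatch" };
  }
  // Optional session ownership: only the caller who started the session continues it.
  if (options.requireInitiator && initiator?.subject !== current.subject) {
    return { allowed: false, reason: "caller did not start this session" };
  }
  return { allowed: true };
}
```

Calling a check like this at the top of a tool's `execute` keeps the decision next to the data it protects. Whether internal paths with a `null` caller should be denied is an application choice.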

## Tool and connection auth

Tool and connection auth is how your agent reaches an external service that wants an interactive sign-in, like an OAuth MCP server. Both a connection and an individual tool can declare an `auth` strategy; eve drives the sign-in, caches the token per step, and re-runs the call once the caller authorizes.

### On a connection

Attach `connect()` from `@vercel/connect/eve` to the connection:

```ts title="agent/connections/linear.ts"
import { connect } from "@vercel/connect/eve";
import { defineMcpClientConnection } from "eve/connections";
import { once } from "eve/tools/approval";

export default defineMcpClientConnection({
  url: "https://mcp.linear.app/mcp",
  description: "Linear: project management, issue tracking, and team workflows.",
  auth: connect("oauth/linear"),
  approval: once(),
});
```

The first call that needs the connection kicks off an OAuth sign-in, surfaced as an authorization challenge (a URL the caller visits). [Vercel Connect](https://vercel.com/docs/connect) brokers the flow and holds the credentials, which are resolved and cached per workflow step, never serialized into history, and never shown to the model. For non-interactive connections, pass a static token in place of `connect()`. [Connections](../connections) covers both shapes.

### On a single tool

When one tool calls a service behind OAuth, it can declare its own `auth` and skip the separate connection. `auth` takes the same shapes: `connect("...")` for Vercel Connect-backed OAuth, a custom interactive definition, or a plain `{ getToken }` for static credentials.

```ts title="agent/tools/list_okta_groups.ts"
import { defineTool } from "eve/tools";
import { connect } from "@vercel/connect/eve";
import { z } from "zod";

export default defineTool({
  description: "List the caller's Okta groups.",
  inputSchema: z.object({}),
  auth: connect("okta"),
  async execute(_input, ctx) {
    const { token } = await ctx.getToken();
    const res = await fetch("https://api.okta-proxy.internal/groups", {
      headers: { authorization: `Bearer ${token}` },
    });
    return res.json();
  },
});
```

Declaring `auth` adds two accessors to the tool's `ctx`:

* `ctx.getToken()` resolves the bearer for the declared strategy, checking the per-step token cache first. With an interactive strategy, a cache miss suspends the turn on a framework-owned callback URL, shows a "Sign in" affordance, and re-runs the tool once the OAuth callback completes.
* `ctx.requireAuth()` throws `ConnectionAuthorizationRequiredError` to gate the tool on authorization before any token resolves. The runtime turns that into the same consent prompt.

Throw `ConnectionAuthorizationRequiredError` anywhere in `execute` (directly, via `requireAuth()`, or implicitly from `getToken()`) and you trigger the consent flow, keyed by the tool's name. Calling either accessor on a tool that does not declare `auth` throws.

By default the sign-in affordance title-cases the tool's path-derived name, so a tool file named `sfdc_lookup.ts` renders "Sign in with Sfdc\_lookup". Set `displayName` on the `auth` definition to control what users see instead, for example `auth: { ...connect("sfdc"), displayName: "Salesforce" }`. It is presentation-only. The tool's name still keys the authorization scope, token cache, and callback URL, and a definition-level `displayName` wins over one the strategy stamps on the challenge.
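
The label precedence can be pictured with the short sketch below. It is not eve's implementation: `signInLabel` is a hypothetical name, "title-case" is read here as upper-casing the first character (inferred from the `Sfdc_lookup` example), and a strategy-stamped name beating the fallback is an assumption.

```ts
// Sketch of the label precedence described above; not eve's implementation.
function signInLabel(
  toolName: string,
  definitionDisplayName?: string,
  challengeDisplayName?: string, // name the strategy stamped on the challenge, if any
): string {
  // Assumption: title-casing upper-cases only the first character.
  const fallback = toolName.charAt(0).toUpperCase() + toolName.slice(1);
  // A definition-level displayName wins over the strategy's, which wins over the fallback.
  const name = definitionDisplayName ?? challengeDisplayName ?? fallback;
  return `Sign in with ${name}`;
}
```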

## What to read next

* [Security model](../concepts/security-model): trust boundaries and the pre-production checklist
* [Connections](../connections): connection auth shapes (`connect()` vs static token)
* [Deployment](./deployment): where route-auth secrets live in production


---

For a semantic overview of all documentation, see [/sitemap.md](/sitemap.md)

For an index of all available documentation, see [/llms.txt](/llms.txt)

For agent-facing discovery, including API and MCP surfaces, see [/agents.md](/agents.md)

---
title: Deployment
description: A production checklist for shipping an eve agent on Vercel or your own host, covering build output, env and secrets, sandbox backend, auth, deploy, and verify.
---

# Deployment



eve runs the same way locally, on Vercel, and on a long-running Node host, so taking an agent from `eve dev` to production is mostly mechanical. Work through this checklist in order.

## 1. Build

`eve build` compiles the agent and writes the host output:

```bash
eve build
```

When `VERCEL` is set (every hosted Vercel build sets it), `eve build` writes the [Vercel Build Output](https://vercel.com/docs/build-output-api) bundle under `.vercel/output`. A plain local `eve build` skips that bundle. Either way you get eve's compiled framework artifacts under `.eve/`, including the discovery manifest, compiled manifest, diagnostics, and module map. Open those to see which authored surface a deployment will load. For the artifact guide and what to do when `eve build` fails, see [Observability](./instrumentation).

### How portability works

Nitro is the HTTP host layer. It gives eve a build artifact that can serve the health, session, stream, channel, callback, and schedule routes outside the dev server. Workflow execution and sandbox execution are separate runtime adapters; they are not hidden Vercel dependencies inside Nitro.

On Vercel, eve emits Vercel Build Output, the Workflow SDK runs on Vercel Workflow, and `defaultBackend()` selects Vercel Sandbox. Outside Vercel, `eve start` serves the standard Nitro Node output, the Workflow SDK uses its local world by default, and `defaultBackend()` selects a local sandbox backend in availability order. That local workflow world persists run state on disk and has no direct coupling to Vercel; Vercel-only behavior such as latest-deployment routing and dashboard run attributes is additive.

Advanced self-hosted deployments can select a different installed Workflow world package in the root `agent.ts`:

```ts title="agent/agent.ts"
import { defineAgent } from "eve";

export default defineAgent({
  model: "anthropic/claude-opus-4.8",
  experimental: {
    workflow: {
      world: "@acme/eve-workflow-world",
    },
  },
});
```

The world package should read credentials and host-specific options from runtime environment variables. It should export a default factory or `createWorld()` function. See [Workflow Worlds](https://workflow-sdk.dev/worlds) for the underlying SDK abstraction.

## 2. Environment variables and secrets

Set these in your deployment environment or secret manager, never in source or compiled artifacts:

* **A model credential.** The lowest-setup Vercel option is the Vercel AI Gateway. Link a Vercel project, and gateway model ids like `anthropic/claude-opus-4.8` authenticate through Vercel OIDC, with no provider keys to manage. Outside Vercel, either set `AI_GATEWAY_API_KEY` for gateway-routed models or configure a direct provider model with an [AI SDK provider package](https://ai-sdk.dev/docs/foundations/providers-and-models) and set that provider's key, for example `OPENAI_API_KEY` or `ANTHROPIC_API_KEY`.
* **Route-auth secrets**, for example `ROUTE_AUTH_BASIC_PASSWORD` and any JWT/OIDC signing keys referenced by your channel's `auth` (see [Auth and route protection](./auth-and-route-protection)).

Route-auth secrets are never serialized into the compiled discovery or module-map artifacts. The runtime re-materializes them from the authored channel definition instead. If your deployment sits behind Vercel preview protection and you want to drive it with `eve dev`, set `VERCEL_AUTOMATION_BYPASS_SECRET` locally before launching.
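
A small startup check can catch a missing variable before the first request does. The sketch below is a generic helper, not an eve API, and `missingEnvVars` is a hypothetical name. The list of required names depends on your model and auth choices.

```ts
// Generic helper for illustration; not part of eve.
type Env = Record<string, string | undefined>;

// Returns the names from `required` that are unset or blank in `env`.
function missingEnvVars(env: Env, required: readonly string[]): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

For a gateway-routed model behind Basic auth on a non-Vercel host, you might call `missingEnvVars(process.env, ["AI_GATEWAY_API_KEY", "ROUTE_AUTH_BASIC_PASSWORD"])` at boot and exit with an error that names the missing variables, never their values.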

## 3. Model routing

The shape of `model` in `agent/agent.ts` decides whether eve calls the Vercel AI Gateway or a provider endpoint directly.

A string model id is gateway-routed:

```ts title="agent/agent.ts"
import { defineAgent } from "eve";

export default defineAgent({
  model: "anthropic/claude-opus-4.8",
});
```

That works on Vercel with project OIDC or anywhere else with `AI_GATEWAY_API_KEY`. Passing a provider key through `modelOptions.providerOptions.gateway.byok` still sends the request through the Gateway; it only changes which upstream key the Gateway uses.

To avoid the Gateway entirely, install the [AI SDK package](https://ai-sdk.dev/docs/foundations/providers-and-models) for the provider you want to call, pass that provider's model object, and set that provider's normal environment variable:

```bash
npm install @ai-sdk/anthropic
```

```ts title="agent/agent.ts"
import { anthropic } from "@ai-sdk/anthropic";
import { defineAgent } from "eve";

export default defineAgent({
  model: anthropic("claude-opus-4.8"),
});
```

With that shape, the model call goes directly to Anthropic and the runtime reads `ANTHROPIC_API_KEY`. The same pattern works for OpenAI after installing `@ai-sdk/openai`, using `openai("...")`, and setting `OPENAI_API_KEY`. This is the usual choice when self-deploying without any Vercel-managed services.

## 4. Sandbox backend

On Vercel, the [sandbox](../sandbox) runs on hosted [Vercel Sandbox](https://vercel.com/docs/sandbox) infrastructure. Attach the backend on the sandbox definition:

```ts title="agent/sandbox/sandbox.ts"
import { defineSandbox } from "eve/sandbox";
import { vercel } from "eve/sandbox/vercel";

export default defineSandbox({
  backend: vercel(),
});
```

Leave `backend` off and eve falls back to `defaultBackend()`, which picks the Vercel backend on hosted builds and the local backend everywhere else. One definition, both environments.

For a self-deployed process, leave `defaultBackend()` in place or choose an explicit non-Vercel backend such as Docker or microsandbox. If those do not match your infrastructure, write a custom `SandboxBackend` adapter that creates sessions in your own container, VM, or isolation service. Do not pin `vercel()` unless that process should create hosted Vercel sandboxes.

## 5. Build-time sandbox prewarm

During hosted builds, eve prewarms reusable Vercel sandbox templates so the first session doesn't pay the cold-start cost:

* Prewarm runs only when both `VERCEL` and `VERCEL_DEPLOYMENT_ID` are present.
* A sandbox with no `bootstrap()` and no workspace seed files gets skipped.
* Seed-only templates are keyed by skills and workspace file contents, so unchanged seeds reuse a template across deploys.
* Templates with a `bootstrap()` are keyed by the optional resolved `revalidationKey()` plus the authored sandbox source and seed contents, so matching inputs reuse a template across deploys.
* Each template shows up in the build log as either `reused cached` or `built`.
* Prewarming only covers template construction. `onSession()` still runs at runtime, once per session.
* **If build-time prewarm fails, the build fails.**

If `VERCEL` is set but `VERCEL_DEPLOYMENT_ID` is missing, eve warns that it skipped prewarming. Do not deploy that build with `vercel deploy --prebuilt`; its output may reference sandbox templates that were never provisioned. Run `vercel deploy` instead so Vercel builds the source in its hosted build environment.
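
The rules above reduce to a small decision table, sketched here. This is not eve's build code: `decidePrewarm` and its types are hypothetical, and the order in which the checks run is an assumption.

```ts
// Decision-table sketch of the prewarm rules above; not eve's build code.
type PrewarmEnv = { VERCEL?: string; VERCEL_DEPLOYMENT_ID?: string };
type SandboxInputs = { hasBootstrap: boolean; seedFileCount: number };

type PrewarmDecision =
  | "prewarm"
  | "skip-not-hosted"
  | "skip-with-warning"
  | "skip-nothing-to-build";

function decidePrewarm(env: PrewarmEnv, sandbox: SandboxInputs): PrewarmDecision {
  // Prewarm is a hosted-build step: without VERCEL there is nothing to do.
  if (!env.VERCEL) return "skip-not-hosted";
  // VERCEL without VERCEL_DEPLOYMENT_ID: skipped with a warning, and the
  // output should not be shipped with `vercel deploy --prebuilt`.
  if (!env.VERCEL_DEPLOYMENT_ID) return "skip-with-warning";
  // No bootstrap() and no workspace seed files: no template to build.
  if (!sandbox.hasBootstrap && sandbox.seedFileCount === 0) return "skip-nothing-to-build";
  return "prewarm";
}
```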

## 6. Auth

Swap any scaffolded `placeholderAuth()` for your real policy before the first production browser request hits the app. Both the framework default and the placeholder reject production browser traffic, so an unconfigured app fails closed rather than serving open routes. The production policy can be a shipped helper (`httpBasic()`, `jwtHmac()`, `jwtEcdsa()`, `oidc()`, `vercelOidc()`) or a custom `AuthFn` that validates your own sessions, API keys, or identity provider. See [Auth and route protection](./auth-and-route-protection) for the ordered auth walk and the fail-closed guarantee.

If you self-deploy outside Vercel, do not rely on `vercelOidc()` as the only production authenticator. Use your own route policy, such as Basic auth, JWT/OIDC verification for your identity provider, or a custom verifier.

## 7. Deploy on Vercel

Deploy with the [Vercel CLI](https://vercel.com/docs/cli) or by pushing to a Git-connected project:

```bash
vercel deploy
```

The deployed app serves the same stable health, session, and stream routes you've been hitting locally.

## 8. Deploy without Vercel

eve can also run as a normal Node service behind your own process manager, container platform, or reverse proxy:

```bash
eve build
PORT=3000 eve start --host 0.0.0.0
```

eve writes the standard Nitro output under `.output/` instead of Vercel Build Output. `eve start` serves that built app and respects `PORT`, or the `--port` flag. Put TLS, routing, autoscaling, and log collection around that process the same way you would for any other Node HTTP service.

Self-deployed agents should make the Vercel-specific choices explicit:

* Let the Workflow SDK use its default local world, which stores workflow state under `.workflow-data`, and configure your host so that directory is on persistent storage. Alternatively, select another world with `experimental.workflow.world` in the root `agent.ts`.
* Install the AI SDK package for your provider, then use a direct provider model object and `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` when you want no Gateway dependency.
* Use `AI_GATEWAY_API_KEY` if you still want Gateway routing from a non-Vercel host.
* Replace `vercelOidc()` with auth that your host can verify.
* Use `defaultBackend()`, a pinned non-Vercel sandbox backend such as Docker or microsandbox, or your own `SandboxBackend` adapter.
* If the agent defines schedules, the default `eve build && eve start` path starts Nitro's schedule runner, and Vercel wires schedules to Vercel Cron automatically. If you adapt the output to a custom HTTP-only host or preset, make sure it also runs Nitro scheduled tasks, or trigger the same work from your own scheduler.
* Treat Vercel Cron, Vercel Sandbox prewarm, Vercel Deployment Protection bypass, and the Agent Runs dashboard as Vercel-only conveniences.

The HTTP contract is unchanged: health, session creation, streaming, channels, tools, and subagents use the same routes. Any client that can reach and authenticate to those routes can talk to the agent.

## 9. Verify the deployment

Smoke-test the live routes. Health first:

```bash
curl https://<your-app>/eve/v1/health
```

Then a real turn:

```bash
curl -X POST https://<your-app>/eve/v1/session \
  -H 'content-type: application/json' \
  -d '{"message":"Hello from production"}'
```

The POST returns a JSON body whose `sessionId` identifies the new session. Attach to that session's stream with it:

```bash
curl https://<your-app>/eve/v1/session/<sessionId>/stream
```

Or drive the deployment interactively with the dev TUI, which is handy for preview and production smoke tests:

```bash
eve dev https://<your-app>
```

(Set `VERCEL_AUTOMATION_BYPASS_SECRET` locally first if the deployment uses preview protection.)
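
If you would rather script the curl sequence, the sketch below builds the same three requests as plain data. `smokeRequests` and `SmokeRequest` are hypothetical names. The paths and the `{ "message": ... }` body come from the curl examples above, and any auth headers your route policy requires are left out, as they are there.

```ts
// Builds the smoke-test requests shown in the curl examples; names are hypothetical.
type SmokeRequest = {
  method: "GET" | "POST";
  url: string;
  headers?: Record<string, string>;
  body?: string;
};

function smokeRequests(baseUrl: string, message: string) {
  // Drop trailing slashes so paths join cleanly.
  const base = baseUrl.replace(/\/+$/, "");

  const health: SmokeRequest = { method: "GET", url: `${base}/eve/v1/health` };

  const createSession: SmokeRequest = {
    method: "POST",
    url: `${base}/eve/v1/session`,
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ message }),
  };

  // The stream URL needs the sessionId returned by the POST above.
  const stream = (sessionId: string): SmokeRequest => ({
    method: "GET",
    url: `${base}/eve/v1/session/${encodeURIComponent(sessionId)}/stream`,
  });

  return { health, createSession, stream };
}
```

Each entry maps directly onto `fetch(request.url, { method, headers, body })`. Run them in order: health, then session creation, then the stream for the returned `sessionId`.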

## View runs in the dashboard

Once the agent is deployed, the platform auto-detects `eve` as the framework and surfaces an **Agent Runs** tab under your project's **Observability** view in the Vercel dashboard. From there you can browse sessions and drill into each conversation's trace.

> The Agent Runs tab is currently gated. Your Vercel team needs the feature enabled before it appears. If you don't see it, reach out to your Vercel contact to get your team enabled.

Agent Runs is separate from the OpenTelemetry exporters configured in [Observability](./instrumentation). Those still work and are the recommended path if you want spans in Braintrust, Datadog, or another third-party backend.

## How eve sits behind a host framework

You can deploy an eve app on its own, or mount it inside a host web framework that owns the rest of the site (marketing pages, a dashboard, other API routes). The host keeps its own routing and serves eve's routes through the framework integration. Either way, the agent surface and HTTP contract are identical. For mounting eve in Next.js (`withEve`) and the other supported frameworks, see [Frontend](./frontend/nextjs).

## Checklist

* [ ] `eve build` succeeds, and writes `.vercel/output` when `VERCEL` is set.
* [ ] Provider keys and route-auth secrets are set in the deployment environment.
* [ ] The sandbox backend matches the environment (`vercel()` or `defaultBackend()`).
* [ ] On Vercel, build-time prewarm reused or built templates without failing.
* [ ] `placeholderAuth()` is replaced with your real policy.
* [ ] `vercel deploy` succeeds, or your self-hosted process starts with `eve start`.
* [ ] The health, session, and stream routes respond on the deployment URL.

## What to read next

* [Auth and route protection](./auth-and-route-protection): secure the routes you deployed
* [Observability](./instrumentation): tracing, run tags, and common failures
* [Sandbox](../sandbox): backends, lifecycle, and credential brokering



---
title: Dev TUI
description: Drive an eve agent locally in an interactive terminal UI. Chat, stream, approve tools, answer questions, tune the display, and point it at a deployment.
---

# Dev TUI



`eve dev` boots the local runtime and drops you into an interactive terminal UI. You chat with the agent, watch it stream, approve its tool calls, and answer the questions it asks back.

```bash
eve dev
```

On startup the TUI prints a brand line with your agent's name, plus a rotating tip (local sessions only).

```text
 eve weather-agent
 Use /channels to add more ways to reach your agent.
```

If agent discovery reported problems, an error and warning count renders between the two lines. Instructions, tools, skills, and subagents are one `eve info` away, and `/help` lists every command. The TUI also runs a startup check. A missing model-provider setup surfaces as an attention line (`⚠ 1 setup issue: model provider not linked · /model`) so the fix is visible before the first message fails, with each command's outcome hanging under it on a `⎿` connector.

## Reading the transcript

The conversation streams straight into your terminal's normal scrollback, so you keep native scrolling, copy and paste, and a transcript that persists after you exit. The scrollback holds your prompts, the agent's replies, reasoning, tool calls, nested subagents, connection-authorization prompts, and any captured `stdout`, `stderr`, or sandbox lifecycle lines.

Each turn renders without boxes. A colored gutter glyph marks who is speaking, tool calls collapse to a one-line summary (`✓ get_weather  city="SF" → 73°F`), and a subagent's work is indented beneath its `◆` header. When input is ready, the prompt stays bare until you type. While a turn or setup action owns the terminal, only its live status shows.

A persistent line beneath the prompt or status shows the model, the session's token flow (`↑ 394.4K ↓ 4.3K`), the linked Vercel project and team (`▲ my-agent (acme)`), and a yellow `/deploy pending` marker once a channel added this session still needs `/deploy`. The Vercel segment stays hidden until the directory is linked.

Errors render compactly with docs links highlighted. A code bug escaping your agent's own code shows its stack trace dim beneath the error headline. Dev-server rebuilds condense into one status row that updates in place (`tui/setup-panel.ts changed · rebuilding…`, then `· rebuilt`); only the latest rebuild shows, and paths shrink to their last two components.

## Slash commands

Each command echoes as an invocation line, asks through a bordered panel that takes the input area's place (one question at a time, separate from the chat transcript), and finishes with a one-line `⎿` result. Loading states stay on the ephemeral status line instead of piling into the transcript.

| Command     | Does                                                                                                                              |
| ----------- | --------------------------------------------------------------------------------------------------------------------------------- |
| `/model`    | Opens a configure menu that loops until Done (or Esc). See [Configure the model and provider](#configure-the-model-and-provider). |
| `/channels` | Shows the agent's channel list and adds the one you pick. See [Add a channel](#add-a-channel).                                    |
| `/deploy`   | Ships the agent to Vercel production, linking the directory first when it is unlinked.                                            |
| `/loglevel` | Switches which logs the transcript shows. See [Control what logs show](#control-what-logs-show).                                  |
| `/new`      | Starts a fresh session.                                                                                                           |
| `/exit`     | Quits the TUI.                                                                                                                    |
| `/help`     | Lists every command.                                                                                                              |

`/model`, `/channels`, and `/deploy` manage the project and are available only when `eve dev` runs the server locally, not when connected to a remote server with `--url`.

### Configure the model and provider

Bare `/model` opens the configure menu. "Change model" runs the same searchable model picker setup uses (the Vercel AI Gateway catalog, pre-selected on the model the runtime is serving). A model change is written into your agent's authored source, and the command reports success only after eve confirms the new id. `/model <provider/model-id>` applies one directly, skipping the menu.

The provider row opens the provider questions: which model provider to use, and how to connect. Picking something other than Vercel AI Gateway shows wiring instructions for your own provider and stops there, leaving any existing setup untouched. For Vercel AI Gateway, you either paste your own `AI_GATEWAY_API_KEY` (saved straight to `.env.local`) or connect via a project. Connecting via a project asks for a Vercel team, opens that team's existing-project list (picking again re-links), then pulls the project's environment so an AI Gateway credential lands in `.env.local`. The dev server reloads env files automatically, with no restart needed.

The provider row demands attention (a bold yellow "Configure provider" with "Required to enable the agent") until a link or gateway credential is detected, then names the connection afterward (for example "AI Gateway (Linked to my-project in my-team)"). Each action's latest outcome stays visible beneath the menu (for example "✓ Model changed to openai/gpt-5.5"). When a turn fails because AI Gateway authentication is missing or stale, the error points you at `/model` directly.

### Add a channel

`/channels` shows the agent's channel list. Already-registered channels render as checked, focusable rows with an "Already installed" hint. Picking one adds it (including the Slack Connect provisioning), then installs the dependencies the scaffold added so the dev server can load the new channels right away. After each addition the list repaints with the channel checked, until Done (or Esc) leaves the flow.

## Keyboard shortcuts

Chat and freeform `ask_question` inputs behave like a shell line editor.

| Key                                            | Action                                                                                                            |
| ---------------------------------------------- | ----------------------------------------------------------------------------------------------------------------- |
| `Enter`                                        | Submit the message or question response.                                                                          |
| `Shift+Enter`                                  | Insert a newline without sending (needs a terminal that reports modified keys).                                   |
| `Ctrl+C`                                       | Interrupt a running turn. At the chat or freeform-question prompt, clear non-empty input; when empty, quit.       |
| `↑` / `↓`                                      | Move between input lines; at a chat-buffer edge, navigate messages you have sent this session.                    |
| `←` / `→`, `Home` / `End`, `Ctrl+A` / `Ctrl+E` | Move the caret; Home/End stay within the current line.                                                            |
| `Ctrl+U` / `Ctrl+K` / `Ctrl+W`                 | Delete to the start of the line, to its end, or through the previous word.                                        |
| `Ctrl+L`                                       | Cycle the log display mode (`none → all → stderr → sandbox → none`) and briefly show the mode in the status line. |
| `Ctrl+R`                                       | Redraw the screen.                                                                                                |

In terminals that support bracketed paste, pasting multi-line text into chat or a freeform question inserts it intact and renders one row per line rather than submitting at the first line. `Shift+Enter` adds a line by hand. The input grows down to the available terminal height, then scrolls to keep the caret visible; `Enter` submits the whole response.

If a turn fails terminally (the server session dies or the connection drops), the TUI starts a fresh session and notes it inline so you can keep going. Server-side context resets with the old session.

## Answer the agent inline

When the agent needs something from you, the TUI asks inline.

* Tool approvals are a `y` or `n`.
* Option questions let you pick with `↑` / `↓` and `Enter`, or you can compose a multi-line freeform answer.
* If a tool needs an authorized [connection](../connections), the URL shows up right in the transcript, and the turn picks back up once you finish the flow.

## Control what logs show

By default, `eve dev` shows `stderr` and keeps `stdout` and sandbox lines buffered but hidden. Captured server `stdout` and `stderr` render as dim, indented log runs behind a `│` rule (consecutive lines from the same source share one label), while sandbox lifecycle lines use their own label.

* `/loglevel <all|stderr|sandbox|none>` switches what the transcript shows, retroactively. Bare `/loglevel` reports the current mode.
* `--logs <all|stderr|sandbox|none>` sets the starting mode at launch (default `stderr`).
* `Ctrl+L` at the idle prompt cycles `none → all → stderr → sandbox → none`.
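
The `Ctrl+L` cycle is a fixed ring over the four modes. A minimal sketch, with `nextLogMode` as a hypothetical name rather than the TUI's own code:

```ts
// Sketch of the Ctrl+L cycle (none → all → stderr → sandbox → none).
type LogMode = "none" | "all" | "stderr" | "sandbox";

const NEXT_LOG_MODE: Record<LogMode, LogMode> = {
  none: "all",
  all: "stderr",
  stderr: "sandbox",
  sandbox: "none",
};

function nextLogMode(current: LogMode): LogMode {
  return NEXT_LOG_MODE[current];
}
```

Starting from the default `stderr`, one press lands on `sandbox`, and four presses return to `stderr`.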

## Display flags

Density flags control how much of each section renders. They accept `full`, `collapsed`, `auto-collapsed`, or `hidden`.

```bash
eve dev --tools full --assistant-response-stats tokens --context-size 200000
```

| Flag                                | Values                                             | Effect                                                  |
| ----------------------------------- | -------------------------------------------------- | ------------------------------------------------------- |
| `--tools <mode>`                    | `full` / `collapsed` / `auto-collapsed` / `hidden` | How tool calls render (default `auto-collapsed`).       |
| `--reasoning <mode>`                | `full` / `collapsed` / `auto-collapsed` / `hidden` | How reasoning renders (default `full`).                 |
| `--subagents <mode>`                | `full` / `collapsed` / `auto-collapsed` / `hidden` | How subagent sections render.                           |
| `--connection-auth <mode>`          | `full` / `collapsed` / `auto-collapsed` / `hidden` | How connection authorization renders.                   |
| `--assistant-response-stats <mode>` | `tokens` / `tokensPerSecond`                       | Which statistic the assistant header shows.             |
| `--context-size <tokens>`           | a token count                                      | Model context window size, shown as a usage percentage. |
| `--logs <mode>`                     | `all` / `stderr` / `sandbox` / `none`              | Which server and agent logs to show (default `stderr`). |

Connection flags: `--host` and `--port` bind the local server, and `--no-ui` runs headless (also the automatic fallback when stdout is not a TTY). See the [CLI](../reference/cli) for the full flag list.

## Remote: `eve dev <url>`

Pass a URL and the TUI talks to a running deployment instead of starting a local server, which is handy for a Vercel preview or your production app.

```bash
eve dev https://<your-app>
```

The bare URL is shorthand for `--url`. `--host`, `--port`, and `--no-ui` are ignored against a remote target. If the deployment sits behind Vercel preview protection, set `VERCEL_AUTOMATION_BYPASS_SECRET` locally first. See [Deployment](./deployment) for the smoke-test flow.

## What to read next

* [Observability](./instrumentation): OpenTelemetry, run tags, and common failures.
* [CLI](../reference/cli): every command and flag.


---

For a semantic overview of all documentation, see [/sitemap.md](/sitemap.md)

For an index of all available documentation, see [/llms.txt](/llms.txt)

For agent-facing discovery, including API and MCP surfaces, see [/agents.md](/agents.md)

---
title: Dynamic Capabilities
description: Resolve tools, skills, and instructions at runtime with defineDynamic: the resolver events, execution order, and how dynamic tools survive step boundaries.
---

# Dynamic Capabilities



`defineDynamic` resolves tools, skills, and instructions at runtime from a session event instead of declaring them up front. Reach for it when the right capabilities aren't known until the session starts, because they hinge on who the caller is, what tenant they belong to, feature flags, or external data. The [tools](../tools), [skills](../skills), and [instructions](../instructions) guides each point here for their dynamic form.

## Dynamic tools

Pass `defineDynamic` an `events` object whose handlers return a single `defineTool(...)`, a `Record<string, defineTool(...)>`, or `null` for no tools. Wrap every entry in `defineTool()`: the wrapper stamps each tool so its `execute` function survives workflow step boundaries.

The example below builds one tool per warehouse table. A map return names tools `slug__key`, so the model sees `query__orders`, `query__users`, and so on.

```ts title="agent/tools/query.ts"
import { defineDynamic, defineTool } from "eve/tools";
import { z } from "zod";
import { listTables, runReadOnly } from "../lib/warehouse.js";

export default defineDynamic({
  events: {
    "session.started": async (_event, ctx) =>
      Object.fromEntries(
        (await listTables()).map((t) => [
          t.name,
          defineTool({
            description: `Query ${t.name}. Columns: ${t.columns.join(", ")}`,
            inputSchema: z.object({ sql: z.string() }),
            execute: ({ sql }) => runReadOnly(t.name, sql),
          }),
        ]),
      ),
  },
});
```

### `execute` must be an inline function

Write `execute` as an inline function expression, arrow, or method shorthand placed directly as the property value. The bundler transform does not detect `execute: myFn` or `execute: makeFn()`, so those tools work on the first step but do not survive replay (re-running a step after a crash or resume; see [Execution model & durability](../concepts/execution-model-and-durability)). On later steps the transform reconstructs each `execute` from its stored closure variables instead of re-running the resolver, which is why it has to be inline.
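
To make the rule concrete, the sketch below puts the accepted and rejected shapes side by side. `defineTool` here is a local identity stand-in so the snippet runs on its own; it does not reproduce the real wrapper from `eve/tools`, and `runReadOnly` is a hypothetical helper. Which shapes the transform detects is taken from the paragraph above, not from the transform's source.

```ts
// Local stand-ins so the sketch is self-contained. The real `defineTool`
// comes from "eve/tools"; `runReadOnly` is a hypothetical helper.
type ToolSpec = {
  description: string;
  execute: (input: { sql: string }) => string;
};
const defineTool = (spec: ToolSpec): ToolSpec => spec;
const runReadOnly = (sql: string): string => `rows for: ${sql}`;

// Detected: an arrow function written directly as the property value.
const arrowTool = defineTool({
  description: "Arrow function",
  execute: ({ sql }) => runReadOnly(sql),
});

// Detected: a function expression written directly as the property value.
const functionTool = defineTool({
  description: "Function expression",
  execute: function ({ sql }) {
    return runReadOnly(sql);
  },
});

// Detected: method shorthand.
const methodTool = defineTool({
  description: "Method shorthand",
  execute({ sql }) {
    return runReadOnly(sql);
  },
});

// Not detected: a reference to a function defined elsewhere.
const myFn = ({ sql }: { sql: string }): string => runReadOnly(sql);
const referenceTool = defineTool({ description: "Reference", execute: myFn });

// Not detected: the result of calling a factory.
const makeFn = () => ({ sql }: { sql: string }): string => runReadOnly(sql);
const factoryTool = defineTool({ description: "Factory call", execute: makeFn() });
```

All five behave the same when called directly, which is why the last two only fail on replay rather than on the first step.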

### Naming

| Return shape              | File                       | Tool name(s)                      |
| ------------------------- | -------------------------- | --------------------------------- |
| single `defineTool`       | `agent/tools/analytics.ts` | `analytics`                       |
| map `{ export, query }`   | `agent/tools/tenant.ts`    | `tenant__export`, `tenant__query` |
| map `{ run }` (one entry) | `agent/tools/search.ts`    | `search__run`                     |

A single return produces one tool named after the file slug, identical to a static tool. A map always uses `slug__key`, even when it holds a single entry, so adding a second entry later never renames the first.
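
The naming rule can be written as a small function. This is only an illustration of the table above: `toolNames` and the `Resolved` type are hypothetical and not part of eve's API.

```ts
// What a resolver returned: one tool, or a keyed map of tools.
type Resolved =
  | { kind: "single" }
  | { kind: "map"; keys: string[] };

// Hypothetical helper that mirrors the naming table.
function toolNames(slug: string, resolved: Resolved): string[] {
  if (resolved.kind === "single") {
    // A single return is named after the file slug, like a static tool.
    return [slug];
  }
  // A map always uses slug__key, even with one entry.
  return resolved.keys.map((key) => `${slug}__${key}`);
}

const analyticsNames = toolNames("analytics", { kind: "single" });
const tenantNames = toolNames("tenant", { kind: "map", keys: ["export", "query"] });
const searchNames = toolNames("search", { kind: "map", keys: ["run"] });
```

Because the one-entry map already yields `search__run`, adding a second key later leaves that name unchanged.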

### Events

| Event             | Resolver runs          | Tools available for             |
| ----------------- | ---------------------- | ------------------------------- |
| `session.started` | Once per session       | Every model call in the session |
| `turn.started`    | Once per turn          | Every model call in the turn    |
| `step.started`    | Before each model call | That model call                 |

### Execution order

When a stream event fires, three things happen in order:

1. The channel adapter handler runs and the event is written to the durable stream.
2. Stream-event [hooks](./hooks) fire.
3. Dynamic tool resolvers subscribed to that event run and update the tool set.

The tool loop reads the current set right before each model call, so a mid-turn update is visible on the next call.

A single file can declare handlers for several events, and the most recently fired one owns that file's tool set. Re-resolve on `turn.started` to replace what `session.started` returned:

```ts title="agent/tools/catalog.ts"
import { defineDynamic, defineTool } from "eve/tools";
import { z } from "zod";
import { runReadOnly, searchCatalog } from "../lib/catalog.js";

export default defineDynamic({
  events: {
    "session.started": async (_event, ctx) => ({
      query: defineTool({
        description: "Run a read-only query.",
        inputSchema: z.object({ sql: z.string() }),
        execute: ({ sql }) => runReadOnly(sql),
      }),
    }),
    // On each turn, re-resolve. Replaces this file's session.started tools for later calls.
    "turn.started": async (_event, ctx) => ({
      search: defineTool({
        description: "Search the catalog.",
        inputSchema: z.object({ term: z.string() }),
        execute: ({ term }) => searchCatalog(term),
      }),
    }),
  },
});
```

Resolvers across files run concurrently.
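
The ownership rule is easier to see in a toy model. The sketch below is not eve's implementation: `createToolSet` is a hypothetical helper, handlers return only the keys of the map they would resolve, and files are processed in a simple loop rather than concurrently. It shows that each file's tool set is replaced by whichever of its handlers fired most recently, while other files are left alone.

```ts
type EventName = "session.started" | "turn.started" | "step.started";

// Each handler returns the keys of the map it resolves (tool bodies omitted).
type FileResolvers = { [E in EventName]?: () => string[] };

// Hypothetical model of the per-file tool set.
function createToolSet(files: Record<string, FileResolvers>) {
  const byFile: Record<string, string[]> = {};
  return {
    // Fire an event: every file with a handler for it replaces its own tools.
    fire(eventName: EventName): void {
      for (const slug of Object.keys(files)) {
        const resolve = files[slug][eventName];
        if (resolve) {
          byFile[slug] = resolve().map((key) => `${slug}__${key}`);
        }
      }
    },
    // What the tool loop would read right before a model call.
    current(): string[] {
      const names: string[] = [];
      for (const slug of Object.keys(byFile)) {
        names.push(...byFile[slug]);
      }
      return names.sort();
    },
  };
}

const toolSet = createToolSet({
  catalog: {
    "session.started": () => ["query"],
    "turn.started": () => ["search"],
  },
  query: {
    "session.started": () => ["orders", "users"],
  },
});

toolSet.fire("session.started");
const afterSession = toolSet.current();

toolSet.fire("turn.started");
const afterTurn = toolSet.current();
```

After `turn.started`, `catalog__query` is gone and `catalog__search` takes its place, while the `query` file keeps the tools it resolved on `session.started`.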

## Dynamic skills

A dynamic skills file resolves which [skill](../skills) a caller can load, keyed on the principal. It resolves on `session.started` and `turn.started` only (`step.started` is reserved for dynamic tools). Read `ctx.session.auth` or channel metadata and return a `defineSkill(...)` (named after the file slug) or `null`:

```ts title="agent/skills/team_playbook.ts"
import { defineDynamic, defineSkill } from "eve/skills";
import { PLAYBOOKS } from "../lib/playbooks.js";

export default defineDynamic({
  events: {
    "session.started": (_event, ctx) => {
      const team = ctx.session.auth.current?.attributes.team;
      const markdown = team ? PLAYBOOKS[team] : undefined;
      return markdown ? defineSkill({ markdown }) : null;
    },
  },
});
```

The caller's team gets its own playbook advertised as a loadable skill; everyone else gets nothing.

Skills follow the same naming rule as tools: a single `defineSkill(...)` is named after the file slug, while a map return names each entry `slug__key`, even when the map holds one entry, so adding a second skill later never renames the first.

## Dynamic instructions

A dynamic instructions file resolves the per-session system prompt the same way, returning `defineInstructions(...)` built from the principal, tenant, or external data:

```ts title="agent/instructions/persona.ts"
import { defineDynamic, defineInstructions } from "eve/instructions";

export default defineDynamic({
  events: {
    "session.started": (_event, ctx) => {
      const plan = ctx.session.auth.current?.attributes.plan ?? "free";
      return defineInstructions({
        markdown: `The caller is on the ${plan} plan. Match the depth of your answers to it.`,
      });
    },
  },
});
```

Dynamic skills and dynamic instructions both resolve before the prompt is assembled, so the model sees the right instructions and skill set for whoever is calling, without that context reaching anyone else.

## What to read next

* The static tool basics this builds on → [Tools](../tools)
* The built-in tools and how to override them → [Default harness](../concepts/default-harness)
* Authenticate a tool or connection to an external service → [Auth & route protection](./auth-and-route-protection)
* Durable per-session memory for resolvers to read → [State](./state)



---
title: Dynamic Workflows
description: The experimental Workflow tool: let the model orchestrate its own subagents from model-authored JavaScript as one durable step.
---

# Dynamic Workflows



The experimental `Workflow` tool lets the model write JavaScript that coordinates the agent's own subagents as a single durable step. The program can run them in sequence, feed one result into the next, fan out over a list, and combine the results. You enable the capability and the model decides and runs the orchestration. It is the agents-only slice of [code mode](../agent-config#other-defineagent-fields) (the broader `codeMode` flag that routes all of an agent's tools through model-authored JavaScript).

A single turn can already call several subagents, and parallel tool calls dispatch concurrently. What a workflow adds is *programmatic* coordination. The program decides how many subagents to run based on an earlier result, which output feeds which call, and how to combine everything. That is logic the model cannot express as a few one-off calls.

## Enable the Workflow tool

Re-export the opt-in marker as the default export of `agent/tools/workflow.ts`. The marker name carries the "experimental" warning, but the tool the model actually sees is named `Workflow`.

```ts title="agent/tools/workflow.ts"
export { ExperimentalWorkflow as default } from "eve/tools";
```

Without that file, the `Workflow` tool stays off. It earns its keep only when the agent has subagents (or the built-in `agent`) worth coordinating:

```ts title="agent/subagents/analyst/agent.ts"
import { defineAgent } from "eve";

export default defineAgent({
  description: "Analyzes one metric: queries, computes, writes a short finding.",
  model: "anthropic/claude-opus-4.8",
});
```

When asked for a weekly business review, the model picks the metrics, runs one `analyst` per metric in parallel, and combines the findings. The program below is the kind of JavaScript the model authors. It fans `analyst` out over a runtime-decided list of metrics and merges the results:

```js
const metrics = ["revenue", "signups", "churn"];
const findings = await Promise.all(
  metrics.map((metric) => tools.analyst({ message: `Summarize last week's ${metric}.` })),
);
return findings.join("\n\n");
```

Each `tools.analyst(...)` call dispatches a child subagent, so the parent stream records one `subagent.called` per metric and one `subagent.completed` as each finishes:

```json
{ "type": "subagent.called", "data": { "name": "analyst", "toolName": "analyst", "callId": "call_1", "childSessionId": "ses_a1", "sequence": 0 } }
{ "type": "subagent.called", "data": { "name": "analyst", "toolName": "analyst", "callId": "call_2", "childSessionId": "ses_a2", "sequence": 1 } }
{ "type": "subagent.called", "data": { "name": "analyst", "toolName": "analyst", "callId": "call_3", "childSessionId": "ses_a3", "sequence": 2 } }
{ "type": "subagent.completed", "data": { "subagentName": "analyst", "callId": "call_1", "output": "..." } }
{ "type": "subagent.completed", "data": { "subagentName": "analyst", "callId": "call_2", "output": "..." } }
{ "type": "subagent.completed", "data": { "subagentName": "analyst", "callId": "call_3", "output": "..." } }
```
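
The fan-out above uses a fixed list. A sketch of the other case, where an earlier result decides how many subagents run, could look like the following. It is written in TypeScript so it can run on its own; the `tools` object is a local stub standing in for the bridged agent functions, and `planner` is a hypothetical subagent that is not part of the example agent above.

```ts
type AgentFn = (input: { message: string }) => Promise<string>;

// Local stub for the bridged `tools` object. In a real workflow these calls
// dispatch subagents; `planner` is a hypothetical subagent for illustration.
const tools: Record<"planner" | "analyst", AgentFn> = {
  planner: async () => "revenue\nsignups",
  analyst: async ({ message }) => `finding for: ${message}`,
};

async function weeklyReview(): Promise<string> {
  // Step 1: one call whose output decides the fan-out.
  const plan = await tools.planner({
    message: "List the metrics worth reviewing this week, one per line.",
  });
  const metrics = plan
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  // Step 2: one analyst per metric, dispatched together.
  const findings = await Promise.all(
    metrics.map((metric) =>
      tools.analyst({ message: `Summarize last week's ${metric}.` }),
    ),
  );

  // Step 3: combine the results.
  return findings.join("\n\n");
}
```

The number of `analyst` calls is not known until `planner` returns, which is the kind of coordination a fixed set of one-off calls cannot express.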

## What a workflow can orchestrate

A workflow reaches only this agent's own agents: the built-in `agent` (a copy of itself), declared [subagents](../subagents), and [remote agents](./remote-agents). That is the whole list. No files, network, shell, skills, or connections. A workflow is a coordination layer over subagents, not a place to do other work. Each call can still request structured output via `outputSchema`, exactly like a direct subagent delegation.

## Where the JavaScript runs

The orchestration code never touches the agent's process. The runtime hands the program text to a small isolated JavaScript engine (a QuickJS sandbox) and runs it there. Nothing from the host realm crosses in, so there is no `process`, no `globalThis` from the agent, and no `import`/`require`. The program can reach exactly two things: the agent functions bridged in as `tools.<name>`, and the ordinary language built-ins.

That is an allowlist, not a denylist. The sandbox cannot read files, open a socket, or see an environment variable because those are not present, not because each one is blocked in turn. When the program calls an agent function, that call bridges back out to the runtime, which dispatches it exactly like a direct delegation. The orchestration glue stays inside the sandbox.

## Durability, approvals, and observability

* **Durable.** The whole orchestration counts as one step. Subagents dispatched together run concurrently, and if a run parks (suspends durably without holding compute; see [Execution model & durability](../concepts/execution-model-and-durability)) on a long-running or human-gated child, it resumes where it left off after a restart.
* **Approval-safe.** A subagent that needs human approval (HITL, human-in-the-loop) mid-run surfaces its request to the user, and the workflow picks back up once that is answered, same as direct delegation.
* **Observable.** Every orchestrated subagent emits the usual `subagent.called` / `subagent.completed` events on the parent stream and gets its own child session and stream. The telemetry matches direct delegation, so existing dashboards and cost attribution keep working.

## Relationship to code mode

[Code mode](../agent-config#other-defineagent-fields) is the broader version, where the model drives *all* of an agent's tools (files, shell, web, and agents) from JavaScript. A workflow covers only the subagents. The two do not interfere. Enabling the `Workflow` tool leaves code mode untouched, and an agent can run both at once.

`codeMode` is experimental and may change or be removed.

## What to read next

* Declare the subagents a workflow orchestrates → [Subagents](../subagents)
* Call another deployment as one of those agents → [Remote agents](./remote-agents)
* The `agent/tools/` opt-in mechanism → [Default harness](../concepts/default-harness)



---
title: Hooks
description: Subscribe to runtime stream events from agent/hooks/.
---

# Hooks



Hooks are eve's authored extension points for the runtime event stream. A hook subscribes to stream events and runs side effects after each event is durably recorded, such as audit logging, metrics and alerting, or persisting every session and message to your own database for analytics. Reach for one to observe what the agent does without writing a tool, a context provider (a value made available across a step), or a channel adapter handler (a handler defined on a channel's adapter; see [Channels](../channels)).

## Define a hook

```ts title="agent/hooks/audit.ts"
import { defineHook } from "eve/hooks";

export default defineHook({
  events: {
    async "session.started"(_event, ctx) {
      console.info("session started", { sessionId: ctx.session.id });
    },
    async "message.completed"(event) {
      console.info("model finished", { length: event.data.message?.length ?? 0 });
    },
  },
});
```

The slug is the path-relative basename. `agent/hooks/audit.ts` becomes `"audit"`, and `agent/hooks/auth/load-profile.ts` becomes `"auth/load-profile"`.
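
As an illustration of the slug rule, the hypothetical helper below derives the slug from a file path. eve computes slugs itself; `hookSlug` exists only to show the mapping and assumes hook files end in `.ts`.

```ts
// Hypothetical helper mirroring the slug rule described above.
function hookSlug(filePath: string): string {
  const hooksRoot = "agent/hooks/";
  const start = filePath.indexOf(hooksRoot);
  if (start === -1) {
    throw new Error(`not a hook file: ${filePath}`);
  }
  // Keep the path below agent/hooks/ and drop the .ts extension.
  return filePath.slice(start + hooksRoot.length).replace(/\.ts$/, "");
}

const auditSlug = hookSlug("agent/hooks/audit.ts");
const nestedSlug = hookSlug("agent/hooks/auth/load-profile.ts");
```

Here `auditSlug` is `"audit"` and `nestedSlug` is `"auth/load-profile"`, matching the two examples above.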

`defineHook`, `HookDefinition`, and `HookContext` live on `eve/hooks`.

A hook file declares stream-event subscribers under the `events` map, keyed by event type, with `*` matching every event. Subscribe to any event in the runtime stream vocabulary documented in [Sessions, runs and streaming](../concepts/sessions-runs-and-streaming), including the lifecycle events `session.started`, `turn.completed`, `message.completed`, and `action.result`. Handlers are observe-only. They cannot inject model context. To contribute runtime model messages, use `defineDynamic` and `defineInstructions` in `agent/instructions/`.

## Hook structure and context

Every handler receives the same `HookContext`:

```ts
interface HookContext {
  readonly agent: { readonly name: string; readonly nodeId?: string };
  readonly channel: { readonly kind?: string; readonly continuationToken?: string };
  readonly session: { readonly id: string };
}
```

### Narrowing tool results

`toolResultFrom` narrows an `action.result` event to a specific authored tool or MCP connection and returns typed output. Import it from `eve/tools`:

```ts
import { defineHook } from "eve/hooks";
import { toolResultFrom } from "eve/tools";
import getWeather from "../tools/get-weather";
import linear from "../connections/linear";

export default defineHook({
  events: {
    "action.result"(event) {
      // Authored tool: output is typed as the tool's return type
      const weather = toolResultFrom(event.data.result, getWeather);
      if (weather) {
        console.log(weather.output.temperature);
      }

      // MCP connection: output is unknown, toolName is qualified
      const linearResult = toolResultFrom(event.data.result, linear);
      if (linearResult) {
        console.log(linearResult.connectionToolName, linearResult.output);
      }
    },
  },
});
```

`toolResultFrom` returns `undefined` when the result doesn't match, or when `isError` is `true`. For authored tools the return includes `{ output, toolName, callId }` with `output` typed as the tool's `TOutput`. For connections it includes `{ output, toolName, connectionToolName, callId }` with `output` as `unknown`.

## Execution order

When a stream event fires, three things happen in order:

1. Emit. The channel adapter handler runs, then the event is written to the durable stream.
2. Hooks. Stream-event hooks fire (typed handlers first, then the `*` wildcard). Return values are ignored.
3. Dynamic tool resolvers. Resolvers subscribed to the event type run and update the tool set.

Hooks always run after the event is durably recorded, so if a hook throws, the stream stays consistent.
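
The ordering can be modeled in a few lines. This is a simplified, synchronous sketch rather than eve's emit composer: the `Pipeline` shape and the `emitEvent` function are hypothetical, and each stage only records that it ran.

```ts
type StageHandler = (eventType: string) => void;

// Hypothetical shape holding everything subscribed to the stream.
interface Pipeline {
  adapterHandler: StageHandler;
  writeToStream: StageHandler;
  typedHooks: Record<string, StageHandler[]>;
  wildcardHooks: StageHandler[];
  resolvers: Record<string, StageHandler[]>;
}

function emitEvent(pipeline: Pipeline, eventType: string): void {
  // 1. Emit: the adapter handler runs, then the event is durably written.
  pipeline.adapterHandler(eventType);
  pipeline.writeToStream(eventType);

  // 2. Hooks: typed handlers first, then the wildcard. Return values are ignored.
  for (const hook of pipeline.typedHooks[eventType] ?? []) hook(eventType);
  for (const hook of pipeline.wildcardHooks) hook(eventType);

  // 3. Dynamic tool resolvers subscribed to this event type.
  for (const resolve of pipeline.resolvers[eventType] ?? []) resolve(eventType);
}

const callOrder: string[] = [];
const record = (label: string): StageHandler => (eventType) => {
  callOrder.push(`${label}:${eventType}`);
};

emitEvent(
  {
    adapterHandler: record("adapter"),
    writeToStream: record("stream"),
    typedHooks: { "turn.started": [record("typed-hook")] },
    wildcardHooks: [record("wildcard-hook")],
    resolvers: { "turn.started": [record("resolver")] },
  },
  "turn.started",
);
```

After the call, `callOrder` lists the adapter, the stream write, the typed hook, the wildcard hook, and the resolver, in that order.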

## What happens when a hook throws

A thrown handler propagates through the emit composer and surfaces as `turn.failed`. If a hook subscribed to a failure-cascade event also throws, it escalates to `session.failed`. eve treats a thrown hook as a real failure, so wrap the body in `try`/`catch` when a failing side effect should not fail the turn.
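
One way to apply the `try`/`catch` advice without repeating it in every handler is a small wrapper. The sketch below is an assumption-laden illustration: `contained` is a hypothetical helper, not part of `eve/hooks`, and the event and context shapes are placeholders.

```ts
type HookHandler<E, C> = (event: E, ctx: C) => void | Promise<void>;

// Hypothetical wrapper: runs the handler and reports any error instead of
// letting it propagate out of the hook.
function contained<E, C>(
  label: string,
  handler: HookHandler<E, C>,
  onError: (label: string, error: unknown) => void,
): (event: E, ctx: C) => Promise<void> {
  return async (event, ctx) => {
    try {
      // Awaiting covers both synchronous throws and rejected promises.
      await handler(event, ctx);
    } catch (error) {
      onError(label, error);
    }
  };
}

const reportedErrors: string[] = [];

const auditHandler = contained(
  "audit",
  async (_event: { type: string }, _ctx: { session: { id: string } }) => {
    // Placeholder for a side effect that fails, such as a metrics write.
    throw new Error("metrics backend unavailable");
  },
  (label, error) => {
    reportedErrors.push(`${label}: ${(error as Error).message}`);
  },
);
```

Swallowing errors hides real failures, so this only suits side effects that are safe to lose. A hook whose failure should stop the turn can stay unwrapped.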

## Subagent isolation

Subagents may carry their own `agent/hooks/` directory. Subagent hooks fire only inside the subagent scope. Parent-agent hooks do not fire for subagent turns, and subagent hooks see only the subagent's own context.

## Hook vs tool vs provider

| Need                                              | Use                                            |
| ------------------------------------------------- | ---------------------------------------------- |
| Observe runtime events (audit, metrics, alerting) | `events.<type>` (or a channel adapter handler) |
| Provide structured input to the model on demand   | a tool                                         |
| Make a value available across the entire step     | a context provider                             |
| Subscribe to platform-specific events             | a channel adapter handler                      |

Stream-event hooks and channel adapter event handlers are structurally identical. Choose the channel adapter handler when you are authoring adapter-specific behavior, and choose `events.*` when you are authoring agent-level behavior that should fire across every channel. Both fire when both are registered.

## What to read next

* [Tools](../tools)
* [Context control](../concepts/context-control)
* [Session context](../reference/typescript-api)
* [Sessions, runs and streaming](../concepts/sessions-runs-and-streaming)



---
title: instrumentation.ts
description: Trace an agent with OpenTelemetry in instrumentation.ts, read the workflow run tags eve emits, and debug discovery with eve info and the common-failures table.
---

# instrumentation.ts



`instrumentation.ts` is where you configure how an eve agent is observed. The framework auto-discovers `agent/instrumentation.ts` and runs it at server startup before any agent code. Its presence implicitly enables telemetry, so there is no separate `isEnabled` toggle.

If you intend to export telemetry, review the exporter destination, data categories, and required legal approvals before enabling telemetry.

## Three observability surfaces

eve observes an agent through three distinct surfaces. They do not all live in this file, and they write to different places:

| Surface                          | Configured in `instrumentation.ts`?                         | What it is                                                                                                                                                    |
| -------------------------------- | ----------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Workflow run tags** (`$eve.*`) | No (automatic)                                              | Framework-owned attributes on each Vercel Workflow run. Let dashboards stitch session, turn, and subagent runs into a tree and surface model and token usage. |
| **OpenTelemetry export**         | Yes: `setup`, `recordInputs`, `recordOutputs`, `functionId` | Where AI SDK spans are exported and what they record.                                                                                                         |
| **Runtime context events**       | Yes: `events["step.started"]`                               | Per-model-call values written into the AI SDK's runtime context, which the AI SDK carries onto its spans.                                                     |

The two configurable surfaces send AI SDK spans to your OpenTelemetry backend. Workflow run tags are a separate system, queryable in the Workflow dashboard rather than on your OTel spans. The sections below cover what you configure here; [Workflow run tags](#workflow-run-tags) documents what eve emits on its own.

## Define instrumentation

```ts title="agent/instrumentation.ts"
import { BraintrustExporter } from "@braintrust/otel";
import { defineInstrumentation } from "eve/instrumentation";
import { registerOTel } from "@vercel/otel";

export default defineInstrumentation({
  setup: ({ agentName }) =>
    registerOTel({
      serviceName: agentName,
      traceExporter: new BraintrustExporter({
        parent: `project_name:${agentName}`,
        filterAISpans: true,
      }),
    }),
});
```

Export the result of `defineInstrumentation` as the default export.

## OpenTelemetry

Use the `setup` callback to register your OTel provider (for example `registerOTel` from `@vercel/otel`). The framework invokes it at server startup with the resolved agent name. `context.agentName` is resolved at compile time from your project (the package's `name`, falling back to the app directory name), so you never hard-code a service name.

Any OTel-compatible backend works (Braintrust, Honeycomb, Datadog, Jaeger). Install the exporter package you need and configure it in the callback.

Three more fields control what the AI SDK records inside those spans (see the AI SDK's [telemetry reference](https://ai-sdk.dev/docs/ai-sdk-core/telemetry)):

* `recordInputs` records full message history on each step span (defaults to `true`). Set it to `false` if inputs contain sensitive content or you want to reduce span payload size.
* `recordOutputs` records model outputs on spans (defaults to `true`). Set it to `false` to disable output recording.
* `functionId` overrides the function name on spans (defaults to the agent name).

For sensitive, regulated, or production data, set `recordInputs` and `recordOutputs` to `false` unless you have reviewed the exporter and its data-retention path.

You are responsible for ensuring any observability or eval provider is approved for the data exported to it.

The third configurable surface, [runtime context events](#runtime-context), attaches per-model-call values to these spans.

## Runtime context

*Runtime context* is an [AI SDK concept](https://ai-sdk.dev/docs/reference/ai-sdk-core/stream-text): a user-defined object that flows through a generation lifecycle. eve exposes it through `events["step.started"]`, a callback that runs once eve has assembled the model input for an attempt and returns `{ runtimeContext }`. Because eve registers the AI SDK's OpenTelemetry integration with runtime context enabled, those returned values ride onto the model-call span and its children. The field is named `runtimeContext`, not `metadata`, because AI SDK v7 carries per-call attributes on runtime context rather than a dedicated metadata field.

Use it when the values depend on the current session, turn, step, channel, or model input:

```ts
import { defineInstrumentation, isChannel } from "eve/instrumentation";
import supportChannel from "./channels/support.js";

export default defineInstrumentation({
  events: {
    "step.started"(input) {
      if (!isChannel(input.channel, supportChannel)) {
        return undefined;
      }

      return {
        runtimeContext: {
          "support.channel_id": input.channel.metadata.channelId ?? "",
          "support.user_id": input.channel.metadata.triggeringUserId ?? "",
        },
      };
    },
  },
});
```

The callback receives:

* `session`: the session id, current and initiator auth, and parent session lineage when this is a child run
* `turn`: the stream turn id and sequence, for example `turn_0`
* `step`: the zero-based step index inside the turn
* `channel`: the channel's `kind` and the metadata projected by the active channel
* `modelInput`: the final instructions and messages passed to the model call

A channel exposes its identity through `kind`, the discriminant you narrow on. For authored channels it is `channel:<name>`, where `<name>` is the channel's filename under `agent/channels/`, so `agent/channels/support.ts` is `channel:support`. Framework channels use `http`, `schedule`, or `subagent`, and an unrecognized or absent kind normalizes to `unknown`. The kind is also emitted as the `eve.channel.kind` span attribute. eve emits compiler-owned typings keyed by the channel filename, so you can narrow either by checking `input.channel.kind === "channel:support"` or by using `isChannel(input.channel, supportChannel)`.
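
The kind rules can be summarized in code. Both functions below are hypothetical and simplified: `normalizeKind` treats any non-empty `channel:<name>` value as recognized, whereas the real runtime presumably checks against the channels it compiled.

```ts
const FRAMEWORK_KINDS = ["http", "schedule", "subagent"];
const AUTHORED_PREFIX = "channel:";

// Kind for an authored channel file under agent/channels/.
function authoredKind(filePath: string): string {
  const match = /agent\/channels\/([^/]+)\.ts$/.exec(filePath);
  if (!match) {
    throw new Error(`not an authored channel file: ${filePath}`);
  }
  return `${AUTHORED_PREFIX}${match[1]}`;
}

// Simplified normalization: framework kinds and authored kinds pass through,
// anything else (including an absent kind) becomes "unknown".
function normalizeKind(kind: string | undefined): string {
  if (kind === undefined) return "unknown";
  if (FRAMEWORK_KINDS.indexOf(kind) !== -1) return kind;
  if (kind.indexOf(AUTHORED_PREFIX) === 0 && kind.length > AUTHORED_PREFIX.length) {
    return kind;
  }
  return "unknown";
}

const supportKind = authoredKind("agent/channels/support.ts");
const kinds = [
  normalizeKind(supportKind),
  normalizeKind("http"),
  normalizeKind("carrier-pigeon"),
  normalizeKind(undefined),
];
```

With these inputs, `kinds` is `["channel:support", "http", "unknown", "unknown"]`.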

Channel metadata is channel-owned. Built-in channels expose only the fields they choose to make observable; Slack, for example, projects `channelId`, `teamId`, `threadTs`, and `triggeringUserId` from its durable channel state. User-authored channels expose their own projection by returning `metadata(state)` from `defineChannel`. Runtime instrumentation never falls back to raw channel state.

## Trace hierarchy

When telemetry is enabled, each turn produces a trace like:

```text
ai.eve.turn  {eve.session.id}
  +-- ai.streamText                           step 1
  |     +-- ai.streamText.doStream            model call
  |     +-- ai.toolCall  {toolName: search}   tool exec
  +-- ai.streamText                           step 2
  |     +-- ai.streamText.doStream
  |     +-- ai.toolCall  {toolName: read}
  +-- ai.streamText                           step 3 (final text)
```

eve creates the `ai.eve.turn` parent span per turn and passes enriched telemetry to the AI SDK so model calls and tool executions are traced automatically. Session, turn, step, and channel context is injected as the framework half of the runtime context (`eve.version`, `eve.session.id`, `eve.environment`, `eve.turn.id`, `eve.turn.sequence`, `eve.step.index`, `eve.channel.kind`) and rides onto the spans alongside any values your `events["step.started"]` callback returns under `runtimeContext`.

## Workflow run tags

Separately from OpenTelemetry, eve tags every workflow run with reserved `$eve.*` attributes. These live on the Vercel Workflow run, queryable in the Workflow dashboard, not on OTel spans, and you do not configure them: they are framework-owned and emitted automatically on every session, turn, and subagent run, whether or not an `instrumentation.ts` file is present. Authored code cannot set or override the `$eve.` namespace.

They let a dashboard reconstruct the tree of runs behind a single agent invocation and surface model and token usage without reading run bodies.

Structural tags describe each run's place in the tree:

* `$eve.type`: `"session"`, `"turn"`, or `"subagent"`
* `$eve.parent`: session id of the immediate parent
* `$eve.root`: session id of the root session in the chain (group a whole tree with `$eve.root=<id>`)
* `$eve.subagent`: compiled graph node id (subagent runs only)
* `$eve.trigger`: the channel kind that started the run
* `$eve.title`: truncated title derived from the first user message

Per-turn usage tags are rewritten on each step of a turn with cumulative totals (last write wins):

* `$eve.model`: model id for the turn
* `$eve.input_tokens`, `$eve.output_tokens`, `$eve.cache_read_tokens`: running token counts
* `$eve.tool_count`: number of tools available to the turn
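
A sketch of how cumulative, last-write-wins tags behave across the steps of one turn follows. It is a model for reading the dashboard values, not eve's implementation: `StepUsage`, `usageWrites`, and `finalTags` are hypothetical names, and the token numbers are made up.

```ts
interface StepUsage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
}

type UsageTags = Record<string, string | number>;

// One tag write per step, each carrying the running totals so far.
function usageWrites(modelId: string, toolCount: number, steps: StepUsage[]): UsageTags[] {
  const writes: UsageTags[] = [];
  let input = 0;
  let output = 0;
  let cacheRead = 0;
  for (const step of steps) {
    input += step.inputTokens;
    output += step.outputTokens;
    cacheRead += step.cacheReadTokens;
    writes.push({
      "$eve.model": modelId,
      "$eve.input_tokens": input,
      "$eve.output_tokens": output,
      "$eve.cache_read_tokens": cacheRead,
      "$eve.tool_count": toolCount,
    });
  }
  return writes;
}

// Last write wins: later writes overwrite earlier values for the same tag.
function finalTags(writes: UsageTags[]): UsageTags {
  return writes.reduce<UsageTags>((merged, write) => ({ ...merged, ...write }), {});
}

const turnWrites = usageWrites("anthropic/claude-opus-4.8", 4, [
  { inputTokens: 100, outputTokens: 20, cacheReadTokens: 0 },
  { inputTokens: 150, outputTokens: 30, cacheReadTokens: 80 },
]);
const turnTags = finalTags(turnWrites);
```

Because every write already carries the running total, the value left on the run after the last step is the total for the turn, and a dashboard never needs to sum across writes.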

Tag writes are best-effort: a failure is logged once per process and then swallowed, so a broken tag emit never breaks the agent.

These tags power the **Agent Runs** tab in the Vercel dashboard. When you deploy on Vercel, the platform auto-detects `eve` as the framework and surfaces an Agent Runs view under your project's **Observability** tab, where you can browse sessions and drill into each conversation's trace, with no `instrumentation.ts` required. The tab is currently gated per team. See [Deployment](./deployment#view-runs-in-the-dashboard) for enablement. Agent Runs is separate from the OpenTelemetry export above. Use OTel when you want spans in Braintrust, Datadog, or another third-party backend.

Note: By default, telemetry records full message history and model outputs. If you enable it, you may need to disclose these data flows in your privacy materials.

## Debugging

`eve info` is the fastest way to see what eve actually picked up: the active tools, skills, subagents, schedules, routes, and discovery diagnostics. eve also writes inspectable artifacts under `.eve/`, kept even when discovery hits errors:

| Artifact                        | Tells you                                   |
| ------------------------------- | ------------------------------------------- |
| `agent-discovery-manifest.json` | what eve found on disk                      |
| `diagnostics.json`              | authored-shape errors and warnings          |
| `compiled-agent-manifest.json`  | the serialized surface eve loads at runtime |
| `module-map.mjs`                | compiled module entrypoints eve imports     |

When `eve build` fails on discovery errors, the CLI prints the full diagnostics report (severity, message, source path) and the path to the diagnostics artifact.

### Common failures

| Symptom                                       | Likely cause and fix                                                                                                                                                                                                                                                        |
| --------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Tool not discovered (the model never sees it) | Run `eve info`. Confirm the file is in the right slot (`agent/tools/<name>.ts`) and default-exports `defineTool(...)`, and check `.eve/diagnostics.json` for shape errors. `schedules/` are root-only.                                                                      |
| Model won't call a tool it should             | Tighten the tool `description` and `inputSchema`; put procedural guidance in a [skill](../skills), not the description. Confirm it's in the active set with `eve info`.                                                                                                     |
| Stuck on `session.waiting`                    | The turn is parked on an approval, a question, or a connection sign-in. Answer it, or POST a follow-up with the `continuationToken` (a stale token is rejected).                                                                                                            |
| 401 on production routes                      | Expected: auth fails closed. Replace `placeholderAuth()` with your route policy. Use `vercelOidc()` only for Vercel-issued tokens; otherwise configure `httpBasic()`, JWT/OIDC helpers, or a custom `AuthFn`. See [Auth and route protection](./auth-and-route-protection). |
| Build fails with discovery errors             | Read the printed diagnostics and `.eve/diagnostics.json`; confirm the root-vs-subagent boundary is valid and secrets come from env vars.                                                                                                                                    |

## What to read next

* [`agent.ts`](../agent-config)
* [Hooks](./hooks): observe the runtime event stream
* [Local Development](./dev-tui): drive the agent locally
* [Evals](../evals/overview): repeatable scored checks


---

For a semantic overview of all documentation, see [/sitemap.md](/sitemap.md)

For an index of all available documentation, see [/llms.txt](/llms.txt)

For agent-facing discovery, including API and MCP surfaces, see [/agents.md](/agents.md)

---
title: Remote Agents
description: "Call another eve deployment as a subagent with defineRemoteAgent: same lowered tool shape, outbound auth, durable callback dispatch."
---

# Remote Agents



`defineRemoteAgent` calls a separately deployed eve agent as if it were a local subagent. Reach for it when the specialist you delegate to is a separately owned agent behind its own URL rather than a directory in your repo.

The file lives under `agent/subagents/`, so its tool name is derived from the path. There's no `name` field.

```ts title="agent/subagents/weather.ts"
import { defineRemoteAgent } from "eve";
import { vercelOidc } from "eve/agents/auth";

export default defineRemoteAgent({
  url: "https://weather-agent.example.com",
  description: "Answers weather, temperature, forecast, wind, rain, and snow questions.",
  auth: vercelOidc(),
});
```

`defineRemoteAgent` accepts:

| Parameter      | Type                            | Required | Default           | Description                                                                                                                                     |
| -------------- | ------------------------------- | -------- | ----------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| `url`          | `string`                        | Yes      | n/a               | Base URL of the remote eve deployment to call.                                                                                                  |
| `description`  | `string`                        | Yes      | n/a               | Model-visible delegation description.                                                                                                           |
| `auth`         | `OutboundAuthFn`                | No       | none              | Outbound auth hook from `eve/agents/auth`.                                                                                                      |
| `headers`      | `HeadersValue`                  | No       | none              | Static or lazily resolved request headers.                                                                                                      |
| `path`         | `string`                        | No       | `/eve/v1/session` | Route appended to `url` for the create-session request.                                                                                         |
| `outputSchema` | `StandardSchema \| JSON Schema` | No       | none              | Structured return type the caller requires. Lowered to JSON Schema at compile time and enforced by the remote like any task-mode output schema. |
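
To make the `url` and `path` rows concrete, here is a minimal sketch of how the create-session endpoint could be derived from them. Plain concatenation with a single slash is an assumption about how eve appends `path`, and `/internal/session` is a made-up route used only for illustration.

```typescript
// Default create-session route, taken from the parameter table above.
const DEFAULT_SESSION_PATH = "/eve/v1/session";

// Join the base URL and the route with exactly one slash between them.
// Plain concatenation is an assumption, not eve's confirmed behavior.
function sessionEndpoint(url: string, path: string = DEFAULT_SESSION_PATH): string {
  const base = url.replace(/\/+$/, ""); // drop trailing slashes on the base
  const route = path.startsWith("/") ? path : `/${path}`;
  return `${base}${route}`;
}

const defaultEndpoint = sessionEndpoint("https://weather-agent.example.com");
const customEndpoint = sessionEndpoint("https://weather-agent.example.com/", "/internal/session");
```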

## The lowered tool

A remote agent lowers to the same `{ message, outputSchema? }` tool shape as a local subagent. The remote never sees the parent's history, so the parent packs everything the remote needs into `message`. When you set `outputSchema` (here or per call), the remote runs in task mode and returns structured output as the tool result. Task mode is a single-shot delegation that returns one structured result instead of an open conversation; see [Subagents](../subagents).
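
Because the remote starts without the parent's history, the `message` has to be self-contained. The sketch below shows one way to think about that packing. `DelegationInput` and `buildDelegation` are hypothetical names that only mirror the `{ message, outputSchema? }` shape; in practice the model writes the message itself.

```typescript
// Hypothetical type mirroring the lowered tool input `{ message, outputSchema? }`.
interface DelegationInput {
  message: string;
  outputSchema?: Record<string, unknown>; // a JSON Schema object
}

// Pack the task plus the facts the remote needs into one self-contained message.
function buildDelegation(
  task: string,
  facts: string[],
  outputSchema?: Record<string, unknown>,
): DelegationInput {
  const context = facts.map((f) => `- ${f}`).join("\n");
  const message = facts.length > 0 ? `${task}\n\nContext:\n${context}` : task;
  // Only include outputSchema when the caller wants structured (task-mode) output.
  return outputSchema ? { message, outputSchema } : { message };
}

const structuredCall = buildDelegation(
  "What is the forecast for tomorrow?",
  ["City: Lisbon", "Units: metric"],
  { type: "object", properties: { summary: { type: "string" } }, required: ["summary"] },
);
const plainCall = buildDelegation("Is it raining right now in Lisbon?", []);
```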

## Outbound auth

`auth` is an `OutboundAuthFn` from `eve/agents/auth` that attaches request headers to the outbound dispatch:

| Helper                          | Header                                                                       |
| ------------------------------- | ---------------------------------------------------------------------------- |
| `vercelOidc(opts?)`             | `Authorization: Bearer <Vercel OIDC token>` (deployment-to-deployment trust) |
| `bearer(token)`                 | `Authorization: Bearer <token>` (static or lazily resolved)                  |
| `basic({ username, password })` | `Authorization: Basic …`                                                     |

If you are calling another Vercel-deployed eve agent, reach for `vercelOidc()`. The remote verifies the OIDC token to authorize the caller. See [Auth & route protection](./auth-and-route-protection) for the receiving side.
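
For reference, the `bearer` and `basic` rows correspond to standard HTTP `Authorization` header formats. The sketch below builds those header values by hand. `bearerHeader` and `basicHeader` are illustrative names, not the helpers exported from `eve/agents/auth`, and the credentials are placeholders.

```typescript
// Header-only sketches of two rows in the table above; not eve's implementation.
type AuthHeaders = { Authorization: string };

function bearerHeader(token: string): AuthHeaders {
  return { Authorization: `Bearer ${token}` };
}

function basicHeader(creds: { username: string; password: string }): AuthHeaders {
  // HTTP Basic auth is base64("username:password"). btoa handles Latin-1 input only,
  // so this sketch assumes ASCII credentials.
  const encoded = btoa(`${creds.username}:${creds.password}`);
  return { Authorization: `Basic ${encoded}` };
}

const bearerExample = bearerHeader("example-token");
const basicExample = basicHeader({ username: "user", password: "pass" });
```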

## How remote dispatch and callbacks work

A local subagent runs inline. A remote one runs in its own deployment, so dispatch is asynchronous:

1. The parent starts a task-mode session on the remote's `POST /eve/v1/session`, passing a framework callback URL.
2. The parent turn parks (suspends durably without holding compute; see [Execution model & durability](../concepts/execution-model-and-durability)) until the remote posts a terminal callback.
3. When the callback arrives, the parent resumes and surfaces the result.

The parent stream carries the same `subagent.called`, `action.result`, and `subagent.completed` events as local delegation. For a remote call, `subagent.called.data.remote.url` records the target.

Both failure paths surface to the parent as a failed tool result, so the caller can explain or recover within the same session. A failed *start* returns the error inline. A remote that starts and then fails posts a terminal failure callback, which the parent receives as an errored subagent result carrying the remote's error (or `REMOTE_AGENT_FAILED` when none is supplied). Terminal callback delivery runs as a durable step on the underlying workflow engine (see [Execution model & durability](../concepts/execution-model-and-durability)). A failed callback POST is rethrown rather than marking the task complete, so the engine retries it.
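
The mapping from a terminal callback to the parent's tool result can be sketched as follows. The `TerminalCallback` and `ToolResult` shapes are hypothetical, and treating `REMOTE_AGENT_FAILED` as an error code (rather than a message) is an assumption; only the fallback behavior itself comes from the description above.

```typescript
// Hypothetical shapes; eve's real callback payload and tool result types may differ.
type TerminalCallback =
  | { status: "completed"; output: unknown }
  | { status: "failed"; error?: { code: string; message: string } };

type ToolResult =
  | { ok: true; output: unknown }
  | { ok: false; code: string; message: string };

function toToolResult(callback: TerminalCallback): ToolResult {
  if (callback.status === "completed") {
    return { ok: true, output: callback.output };
  }
  // Carry the remote's error when present; otherwise fall back to REMOTE_AGENT_FAILED.
  return {
    ok: false,
    code: callback.error?.code ?? "REMOTE_AGENT_FAILED",
    message: callback.error?.message ?? "Remote agent failed without an error payload.",
  };
}

const remoteSuccess = toToolResult({ status: "completed", output: { tempC: 21 } });
const remoteFailure = toToolResult({ status: "failed" });
```

Either way the parent receives an ordinary tool result, which is what lets the model explain or recover within the same session.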

## What to read next

* Local delegation and the isolation boundary → [Subagents](../subagents)
* Have the model orchestrate remote agents programmatically → [Dynamic workflows](./dynamic-workflows)
* Securing the receiving deployment → [Auth & route protection](./auth-and-route-protection)



---
title: Session Context
description: "Runtime helpers: ctx.session, ctx.getSandbox, ctx.getSkill, and defineState."
---

# Session Context



eve exposes runtime state through the `ctx` parameter passed to tool `execute`, hook handlers, and channel event handlers:

* `ctx.session`: session metadata, turn, auth, and parent lineage
* `ctx.getSandbox()`: live sandbox handle for the current agent
* `ctx.getSkill(identifier)`: handle for a named skill visible to the current agent
* `defineState(name, initial)`: typed durable state with `get()` and `update()` (imported from `eve/context`)

These APIs work only inside active authored runtime execution, including tools, channel event handlers, and authored hooks. They throw when called outside a managed context.

## `ctx.session`

`ctx.session` exposes durable runtime metadata about the current execution.

```ts title="agent/tools/who_called_me.ts"
import { defineTool } from "eve/tools";
import { z } from "zod";

export default defineTool({
  description: "Return the active session metadata.",
  inputSchema: z.object({}),
  async execute(_input, ctx) {
    return {
      sessionId: ctx.session.id,
      turnId: ctx.session.turn.id,
      turnSequence: ctx.session.turn.sequence,
      currentCaller: ctx.session.auth.current?.principalId,
      initiator: ctx.session.auth.initiator?.principalId,
      parentSessionId: ctx.session.parent?.sessionId,
      parentCallId: ctx.session.parent?.callId,
    };
  },
});
```

Public session fields:

* `auth.current`
* `auth.initiator`
* `id`
* `turn.id`
* `turn.sequence`
* optional `parent`

Behavior:

* `auth.current` is the caller for the active inbound turn.
* `auth.initiator` is the caller that started the durable session.
* Unprotected agents expose both as `null`.
* Top-level schedule sessions expose the framework app principal (`principalId: "eve:app"`, `principalType: "runtime"`).
* `parent` is present for child subagent sessions and includes the parent `callId`, `sessionId`, `rootSessionId`, and `turn`.
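
Since `auth.current` and `auth.initiator` can both be `null`, code that reads them should handle each case in the list above. The following sketch uses a minimal structural type, `SessionLike`, that covers only the fields it touches; it is a stand-in for illustration, not eve's exported session type, and `describeCaller` is a hypothetical helper.

```typescript
// Minimal structural types covering only the fields used here.
interface Principal {
  principalId: string;
  principalType?: string;
}
interface SessionLike {
  auth: { current: Principal | null; initiator: Principal | null };
  parent?: { sessionId: string; callId: string };
}

function describeCaller(session: SessionLike): string {
  const { current, initiator } = session.auth;
  // Unprotected agents expose both principals as null.
  if (current === null && initiator === null) return "unprotected";
  // Top-level schedule sessions run as the framework app principal.
  if (current?.principalId === "eve:app") return "schedule";
  const who = current?.principalId ?? "unknown";
  // `parent` is only present for child subagent sessions.
  return session.parent ? `${who} (via parent ${session.parent.sessionId})` : who;
}
```

Inside a tool you would pass `ctx.session` to a helper like this, as long as its fields match the assumed shape.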

## `ctx.getSandbox()`

`ctx.getSandbox()` returns a live handle for the current agent's sandbox.

```ts
const sandbox = await ctx.getSandbox();
const result = await sandbox.run({ command: "npm test" });
```

Behavior:

* It takes no arguments. Each agent has exactly one sandbox.
* It is async because eve binds or restores sandbox state lazily.
* It only works when sandbox access is attached to the active runtime path.
* Visibility is node-local. A subagent sees its own sandbox, not the parent's.

`SandboxSession` also exposes `resolvePath(path)`, which returns the live backend-native path for a logical `/workspace/...` location. Use it when authored code needs that path before passing it to shell code or a child process.

See [Sandbox](../sandbox) for lifecycle details.

## `ctx.getSkill(identifier)`

`ctx.getSkill(identifier)` returns a handle for a named skill visible to the current agent.

```ts
const skill = ctx.getSkill("research");
const notes = await skill.file("references/checklist.md").text();
```

Behavior:

* It is synchronous. File content is read lazily from the active sandbox.
* It only works when sandbox access is attached to the active runtime path.
* `identifier` is the path-derived skill id.
* Visibility follows the current agent's sandbox.
* A missing skill is not reported by `getSkill` itself. It surfaces later, when a file accessor reads a path that does not exist in the sandbox.
* The returned handle exposes `name` and `file(relativePath)`.

See [Skills](../skills) for the full authoring model.

## Custom state with `defineState`

Use `defineState` when your agent needs durable typed state that tools, hooks, and channel handlers can share. State survives workflow step boundaries. Declare the handle at module scope so every importer shares it:

```ts title="agent/lib/budget.ts"
import { defineState } from "eve/context";

interface BudgetState {
  readonly count: number;
  readonly cap: number;
}

export const budget = defineState<BudgetState>("myapp.budget", () => ({
  count: 0,
  cap: 25,
}));
```

`get()` reads the current value (returning `initial()` on first access), and `update(fn)` applies a function to it. Both throw outside a managed scope. See [State](./state) for the full read/write model and examples from tools and hooks.

## Where these APIs work

Safe places:

* inside `defineTool(...).execute(input, ctx)`
* inside authored callbacks that eve invokes within the runtime, such as hook handlers and channel event handlers
* after asynchronous boundaries inside the same authored execution chain

Unsafe places:

* top-level module evaluation
* build scripts
* discovery-time code paths

If you call them outside an active eve runtime context, they throw immediately with a message explaining the required scope.

## How it works

The framework sets up a context container before invoking authored code:

1. The runtime populates durable seed values (auth, session id, compiled bundle).
2. Before each step, the framework derives step-local values (session metadata, sandbox access, skill access) from the durable state.
3. Authored code runs inside the managed scope, so `ctx` and `defineState` accessors resolve automatically.
4. After the step, the framework commits mutable state (for example sandbox changes) back to the durable session.

The framework manages this lifecycle. Authored code only uses `ctx` and the public accessors.
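
One common way to build this kind of managed scope in Node.js is `AsyncLocalStorage`. The sketch below is a mental model only: the docs do not say which mechanism eve uses, and `StepContext`, `runStep`, and `currentSessionId` are hypothetical names. It shows an accessor that throws outside the scope yet keeps working after an `await` inside it.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical step-local context; a real container would hold far more.
interface StepContext {
  sessionId: string;
}

const scope = new AsyncLocalStorage<StepContext>();

// Accessor that throws outside a managed scope.
function currentSessionId(): string {
  const store = scope.getStore();
  if (!store) throw new Error("currentSessionId() must be called inside a managed scope");
  return store.sessionId;
}

// The "framework" side: set up the scope, then invoke authored code inside it.
function runStep<T>(ctx: StepContext, authored: () => T): T {
  return scope.run(ctx, authored);
}

const insideScope = runStep({ sessionId: "s_1" }, () => currentSessionId());

// The scope also survives an await inside the same execution chain.
const afterAwait = runStep({ sessionId: "s_2" }, async () => {
  await Promise.resolve();
  return currentSessionId();
});

// Top-level module evaluation is outside any scope, so the accessor throws.
let outsideError = "";
try {
  currentSessionId();
} catch (err) {
  outsideError = (err as Error).message;
}
```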

## What to read next

* [State](./state)
* [Sessions, runs & streaming](../concepts/sessions-runs-and-streaming)
* [Subagents](../subagents)
* [Skills](../skills)



---
title: State
description: "Durable per-session memory with defineState: get() and update(), persisted across step boundaries."
---

# State



`defineState` is a typed, named slot of durable per-session memory for an agent. Use it when the agent has to remember something between conversation turns (a running budget, a glossary, a checklist) and you don't want to stand up an external store for it. The values survive workflow step boundaries, so they outlast crashes, redeploys, and days-long sessions.

```ts
import { defineState } from "eve/context";

const budget = defineState("my-agent.budget", () => ({ count: 0, cap: 25 }));
```

Pass `defineState(name, initial)` a stable string `name` (namespace it to your agent) and an `initial` function that produces the starting value the first time the slot is read. You get back a `StateHandle<T>`:

* `get()`: read the current value. Returns `initial()` on first access within a context.
* `update(fn)`: replace the value with `fn(current)`.
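
The `get()`/`update()` contract can be pictured with a toy in-memory model. This is not eve's implementation: it has no durability and no context checks, and `defineStateSketch` and `StateHandleSketch` are hypothetical names. It only illustrates lazy initialization and functional updates.

```typescript
// Toy, in-memory model of the handle's semantics.
interface StateHandleSketch<T> {
  get(): T;
  update(fn: (current: T) => T): void;
}

function defineStateSketch<T>(name: string, initial: () => T): StateHandleSketch<T> {
  const slots = new Map<string, T>(); // stands in for the per-session durable store
  const read = (): T => {
    if (!slots.has(name)) slots.set(name, initial()); // initial() runs on first access only
    return slots.get(name) as T;
  };
  return {
    get: read,
    update: (fn) => {
      slots.set(name, fn(read())); // replace the value with fn(current)
    },
  };
}

const counter = defineStateSketch("demo.counter", () => ({ count: 0, cap: 25 }));
const countBefore = counter.get().count;
counter.update((s) => ({ ...s, count: s.count + 1 }));
const countAfter = counter.get().count;
```

Note that `update` receives the current value and returns a new one instead of mutating it in place, which matches the spread-and-replace style used in the examples on this page.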

Declare the handle once at module scope and import it wherever you read or write the slot. Use it from inside a tool, hook, or other framework-managed runtime code:

```ts title="agent/lib/budget.ts"
import { defineState } from "eve/context";

export const budget = defineState("my-agent.budget", () => ({ count: 0, cap: 25 }));
```

```ts title="agent/tools/spend.ts"
import { defineTool } from "eve/tools";
import { z } from "zod";
import { budget } from "../lib/budget.js";
import { runQuery } from "../lib/warehouse.js";

export default defineTool({
  description: "Run a query, counting it against the session budget.",
  inputSchema: z.object({ sql: z.string() }),
  async execute({ sql }) {
    const { count, cap } = budget.get();
    if (count >= cap) throw new Error("Query budget exhausted for this session.");
    budget.update((s) => ({ ...s, count: s.count + 1 }));
    return runQuery(sql);
  },
});
```

`get()` and `update()` require an active eve context. Calling them outside tools, hooks, or framework-managed code throws.

## Reset state between turns

State is durable by default and does not reset between turns. If you want a clean slate every turn, overwrite it from a lifecycle [hook](./hooks) on `turn.started`:

```ts title="agent/hooks/reset-budget.ts"
import { defineHook } from "eve/hooks";
import { budget } from "../lib/budget.js";

export default defineHook({
  events: {
    async "turn.started"() {
      budget.update(() => ({ count: 0, cap: 25 }));
    },
  },
});
```

The hook imports the same module-scope `budget` handle as the tool, so both read and write the same slot.

## State is never shared with subagents

Every [subagent](../subagents) starts with its own fresh state, whether it's a built-in `agent` copy or a declared specialist. `defineState` values never cross the parent/child boundary, even when the child is a copy of the same agent.

## State vs. connection-side storage

`defineState` holds conversation-scoped working memory that lives and dies with the session, including counters, the current plan, and what the user has told you this conversation. It is the agent's short-term memory, persisted durably for the life of the session. Anything that has to outlive the session, be shared across sessions or users, or be queried independently of a turn belongs in an external store, either a [connection](../connections) or your own database.

## What to read next

* Read state inside dynamic resolvers → [Dynamic capabilities](./dynamic-capabilities)
* How step durability works → [Execution model & durability](../concepts/execution-model-and-durability)
* The `ctx` accessors available alongside state → [TypeScript API](../reference/typescript-api)



---
title: CLI
description: "Reference for every eve CLI command: init, info, build, start, dev, link, deploy, eval, and channels."
---

# CLI



The `eve` binary (`bin: eve`) runs from your app root, and every command first loads `.env`/`.env.local` from that root. Running `eve` with no command runs `eve dev`.

## Commands

| Command                   | Description                                                                                                                                           |
| ------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
| `eve init [target]`       | Scaffold a new agent, or add one to an existing project directory                                                                                     |
| `eve info`                | Print the resolved application, including discovered tools, skills, subagents, schedules, channels, routes, artifact paths, and discovery diagnostics |
| `eve build`               | Compile `.eve/` artifacts and build the host output; prints the output directory                                                                      |
| `eve start`               | Serve the built `.output/` app; prints the listening URL                                                                                              |
| `eve dev`                 | Start the local dev server and open the terminal UI                                                                                                   |
| `eve dev <url>`           | Connect the UI to an existing server URL (e.g. a remote deployment) instead of booting a local server                                                 |
| `eve link`                | Link the directory to a Vercel project and pull AI Gateway credentials                                                                                |
| `eve deploy`              | Deploy the agent to Vercel production (links first if needed)                                                                                         |
| `eve eval`                | Run evals against the local app or a remote target                                                                                                    |
| `eve channels add [kind]` | Scaffold a channel interactively, or by kind (`slack` \| `web`)                                                                                       |
| `eve channels list`       | List user-authored channels                                                                                                                           |

When `eve build` fails on discovery errors, it prints the full diagnostics report (severity, message, source path) and the diagnostics artifact path.

## `eve init`

```bash
eve init [target] [--channel-web-nextjs]
```

The optional `target` decides the mode:

* A name (`eve init my-agent`) scaffolds a fresh project in a new `my-agent/` directory.
* An existing directory, including `.` for the current one (`eve init .`), adds an agent to that project. The project needs a `package.json`, the `agent/` files must not exist yet, and the missing `eve`, `ai`, and `zod` dependencies are added without touching anything else.
* Omitting the target scaffolds or updates the current directory, the same as `eve init .`.

Either mode installs dependencies, initializes Git, and runs `eve dev` through the detected project package manager. Fresh projects inherit a parent workspace manager when one is present; otherwise they use the manager that launched `eve init`.

| Flag                   | Type | Default | Description                                                                                                                            |
| ---------------------- | ---- | ------- | -------------------------------------------------------------------------------------------------------------------------------------- |
| `--channel-web-nextjs` | flag | off     | Add the Web Chat application (a Next.js app). Rejected when adding to an existing project; run `eve channels add web` there afterward. |

## `eve info`

```bash
eve info [--json]
```

| Flag     | Type | Default | Description  |
| -------- | ---- | ------- | ------------ |
| `--json` | flag | off     | Emit as JSON |

Run this first when something behaves unexpectedly. It confirms a file was discovered, lists the active surface, and surfaces discovery diagnostics, all faster than booting the dev server.

## `eve build`

```bash
eve build
```

No flags. Compiles to `.eve/` and builds the host output, then prints the built output path.

Useful artifacts written under `.eve/` (preserved even on partial failure):

| Artifact                                       | Description                                          |
| ---------------------------------------------- | ---------------------------------------------------- |
| `.eve/discovery/agent-discovery-manifest.json` | What eve found on disk                               |
| `.eve/discovery/diagnostics.json`              | Authored-shape errors and warnings                   |
| `.eve/compile/compiled-agent-manifest.json`    | The serialized authored surface eve loads at runtime |
| `.eve/compile/compile-metadata.json`           | Build-time metadata and paths                        |
| `.eve/compile/module-map.mjs`                  | Compiled module entrypoints eve imports at runtime   |

## `eve start`

```bash
eve start [--host <host>] [--port <port>]
```

| Flag            | Type   | Default            | Description            |
| --------------- | ------ | ------------------ | ---------------------- |
| `--host <host>` | string | all interfaces     | Host interface to bind |
| `--port <port>` | number | `$PORT`, then 3000 | Port to listen on      |

Serves the previously built output. Prints the listening URL.
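
The port precedence in the table (the `--port` flag, then `$PORT`, then 3000) can be sketched as a small function. `resolvePort` is a hypothetical helper, and treating an unparsable `$PORT` as unset is an assumption rather than documented behavior.

```typescript
// Sketch of the documented precedence: --port flag, then $PORT, then 3000.
function resolvePort(flag: number | undefined, envPort: string | undefined): number {
  if (flag !== undefined) return flag; // an explicit flag always wins
  const parsed = Number.parseInt(envPort ?? "", 10);
  // Assumption: an empty or non-numeric $PORT falls through to the default.
  return Number.isNaN(parsed) ? 3000 : parsed;
}

const portFromFlag = resolvePort(8080, "4000");
const portFromEnv = resolvePort(undefined, "4000");
const portDefault = resolvePort(undefined, undefined);
```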

## `eve dev`

```bash
eve dev [options]
eve dev https://your-app.vercel.app
```

Pass a bare URL as the only argument and the UI connects to that server instead of booting a local one (same as `--url`), which lets you smoke-test a preview or production deployment. The interactive UI turns off in a non-TTY terminal.

| Flag                                | Type   | Default            | Description                                                                               |
| ----------------------------------- | ------ | ------------------ | ----------------------------------------------------------------------------------------- |
| `--host <host>`                     | string | all interfaces     | Host interface to bind                                                                    |
| `--port <port>`                     | number | `$PORT`, then 3000 | Port to listen on                                                                         |
| `-u, --url <url>`                   | string | none               | Connect to an existing server URL instead of starting one                                 |
| `--no-ui`                           | flag   | UI on              | Start the server without an interactive UI                                                |
| `--name <name>`                     | string | app folder name    | Title shown in the terminal UI                                                            |
| `--input <text>`                    | string | none               | Pre-fill the prompt input after launching the UI (editable, not auto-submitted)           |
| `--tools <mode>`                    | enum   | `auto-collapsed`   | Tool-call rendering: `full` \| `collapsed` \| `auto-collapsed` \| `hidden`                |
| `--reasoning <mode>`                | enum   | `full`             | Reasoning rendering: `full` \| `collapsed` \| `auto-collapsed` \| `hidden`                |
| `--subagents <mode>`                | enum   | `auto-collapsed`   | Subagent-section rendering: `full` \| `collapsed` \| `auto-collapsed` \| `hidden`         |
| `--connection-auth <mode>`          | enum   | `full`             | Connection-authorization rendering: `full` \| `collapsed` \| `auto-collapsed` \| `hidden` |
| `--assistant-response-stats <mode>` | enum   | `tokensPerSecond`  | Assistant header statistic: `tokens` \| `tokensPerSecond`                                 |
| `--context-size <tokens>`           | number | none               | Model context window size, shown as a usage percentage                                    |
| `--logs <mode>`                     | enum   | `stderr`           | Server/agent logs to show: `all` \| `stderr` \| `sandbox` \| `none`                       |

Local dev writes the active server process ID to `.eve/dev-process.pid`. If another `eve dev` starts for the same agent while that process is still running, eve exits with a message that includes the command to stop the existing server.

Local dev keeps immutable runtime source snapshots under `.eve/dev-runtime/snapshots/` so in-flight sessions hold a consistent code revision while new prompts pick up rebuilds. On startup, `eve dev` prunes stale runtime snapshots and old local sandbox templates in the background. For manual cleanup, stop `eve dev` and delete `.eve/dev-runtime/snapshots/` or `.eve/sandbox-cache/local/templates/`.

## `eve link`

```bash
eve link
```

Links the current directory to an existing Vercel project. You select a team and then a project, and eve pulls the project's environment so an AI Gateway credential (`VERCEL_OIDC_TOKEN` or `AI_GATEWAY_API_KEY`) lands in `.env.local`, then verifies one actually did. Running it again re-links: the pickers always run, and the new choice wins. The command is interactive only; in CI, use `vercel link --project <name> --yes` instead. A running `eve dev` reloads env files automatically, so you don't need to restart after the pull.

## `eve deploy`

```bash
eve deploy
```

Deploys the agent to Vercel production (`vercel deploy --prod`), installing dependencies first and pulling environment variables after. An already-linked project deploys with or without a TTY (non-interactive runs pass the non-interactive `vercel` flags). An unlinked directory walks the `eve link` pickers when a terminal is present, and exits with guidance otherwise.

## `eve eval`

```bash
eve eval [evalId...] [--url <url>] [options]
```

Runs all discovered evals when no eval ids are given. Ids match exactly or by directory prefix (`eve eval weather` runs everything under `evals/weather/`). Exits `0` when every eval passes its checks, `1` when any eval fails (a failed check, an execution error, or a `--strict` threshold miss), and `2` on configuration errors.

| Flag                    | Type   | Default | Description                                    |
| ----------------------- | ------ | ------- | ---------------------------------------------- |
| `--url <url>`           | string | none    | Remote agent URL (skip local host startup)     |
| `--tag <tag...>`        | string | none    | Run only evals carrying a tag                  |
| `--strict`              | flag   | off     | Below-threshold scores also fail the exit code |
| `--list`                | flag   | off     | Print discovered evals without running them    |
| `--timeout <ms>`        | number | none    | Per-eval timeout in milliseconds               |
| `--max-concurrency <n>` | number | 8       | Max concurrent eval executions                 |
| `--json`                | flag   | off     | Output results as JSON                         |
| `--junit <path>`        | string | none    | Write JUnit XML results to a file              |
| `--skip-report`         | flag   | off     | Skip eval-defined reporters (e.g. Braintrust)  |
| `--verbose`             | flag   | off     | Stream per-eval `t.log` lines to stdout        |
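
In a CI wrapper it helps to distinguish a failing eval from a misconfigured run. The sketch below maps the documented exit codes to labels; `classifyEvalExit` and the label strings are hypothetical names chosen for illustration.

```typescript
type EvalOutcome = "passed" | "failed" | "config-error" | "unknown";

// Map the documented `eve eval` exit codes to a label a CI script can branch on.
function classifyEvalExit(code: number): EvalOutcome {
  switch (code) {
    case 0:
      return "passed"; // every eval passed its checks
    case 1:
      return "failed"; // failed check, execution error, or --strict threshold miss
    case 2:
      return "config-error"; // configuration problem, not an eval result
    default:
      return "unknown"; // not documented above
  }
}
```

A wrapper might retry or page someone on `"config-error"` while treating `"failed"` as an ordinary red build.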

See [Evals](../evals/overview) for authoring evals.

## `eve channels add`

```bash
eve channels add [kind] [-f] [-y]
```

Scaffolds a channel into `agent/channels/`. With no `kind` it prompts interactively; pass a `kind` (`slack` | `web`) to scaffold one directly.

| Flag          | Type | Default | Description                                               |
| ------------- | ---- | ------- | --------------------------------------------------------- |
| `-f, --force` | flag | off     | Overwrite existing channel files                          |
| `-y, --yes`   | flag | off     | Assume yes for confirmations; requires an explicit `kind` |

## `eve channels list`

```bash
eve channels list [--json]
```

Lists the user-authored channels in the current project.

| Flag     | Type | Default | Description    |
| -------- | ---- | ------- | -------------- |
| `--json` | flag | off     | Output as JSON |

## Recommended loop

1. Edit files under `agent/`.
2. `eve info` to confirm discovery or read diagnostics.
3. `eve dev` while iterating locally.
4. `eve build` before shipping.
5. `eve start` to smoke-test the built output locally.

Related: [Project layout](./project-layout) · [instrumentation.ts](../guides/instrumentation).

## What to read next

* [Project layout](./project-layout): what `eve info` discovers
* [instrumentation.ts](../guides/instrumentation): tracing and the error catalog
* [Deployment](../guides/deployment): `eve build` and `eve start` in production



---
title: Project Layout
description: Authored slots under agent/ and the path-derived naming rule.
---

# Project Layout



eve builds an agent by walking the filesystem under `agent/`. Each directory is an authored slot, and the slot a file lands in determines how eve loads it.

## Naming rule

Identity comes from the path. You never write a `name` or `id` field on a `define*` call.

| Path                                  | Resolves to           |
| ------------------------------------- | --------------------- |
| `agent/tools/get_weather.ts`          | tool `get_weather`    |
| `agent/connections/linear.ts`         | connection `linear`   |
| `agent/skills/summarize.md`           | skill `summarize`     |
| `agent/subagents/researcher/agent.ts` | subagent `researcher` |

The root agent takes its name from the enclosing `package.json` `name`, falling back to the app-root directory name when `package.json` has no `name`. A subagent takes its name from its directory.
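
The path-to-identity rule in the table can be sketched as a pure function. This is an illustration of the naming rule only, not eve's discovery code: `resolveIdentity` is a hypothetical helper, it covers just the four slots shown in the table, and it ignores validation that real discovery presumably performs.

```typescript
type Identity = { kind: "tool" | "connection" | "skill" | "subagent"; name: string };

// Slot directory -> kind, limited to the slots shown in the table above.
const SLOT_KINDS = new Map<string, Identity["kind"]>([
  ["tools", "tool"],
  ["connections", "connection"],
  ["skills", "skill"],
  ["subagents", "subagent"],
]);

function resolveIdentity(path: string): Identity | undefined {
  const [root, slot, entry] = path.split("/");
  if (root !== "agent" || !slot || !entry) return undefined;
  const kind = SLOT_KINDS.get(slot);
  if (!kind) return undefined;
  // The name is the first segment under the slot, minus any file extension.
  // A directory entry such as `researcher` has no extension to strip.
  return { kind, name: entry.replace(/\.[^.]+$/, "") };
}

const toolIdentity = resolveIdentity("agent/tools/get_weather.ts");
const subagentIdentity = resolveIdentity("agent/subagents/researcher/agent.ts");
```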

## Recommended layout

```text
my-agent/
├── package.json
├── tsconfig.json
├── agent/
│   ├── agent.ts
│   ├── instructions.md
│   ├── instrumentation.ts
│   ├── channels/
│   ├── connections/
│   ├── hooks/
│   ├── skills/
│   ├── lib/
│   ├── sandbox/
│   ├── tools/
│   ├── schedules/
│   └── subagents/
└── evals/
```

Evals live in `evals/` at the app root, a sibling of `agent/`, not inside it. See [Evals](../evals/overview).

## Slot table

The Subagents column states whether a local subagent (`subagents/<id>/`) can author the slot. A declared subagent inherits nothing from the root; it discovers its own slots. See [Subagents](../subagents).

| Path                                                    | Description                                 | Subagents | Notes                                                                                                                                                                                                                 |
| ------------------------------------------------------- | ------------------------------------------- | --------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `agent.ts`                                              | Runtime config                              | Yes       | Model, modelOptions, compaction, build, experimental. See [Agent config](../agent-config).                                                                                                                            |
| `instructions.md` / `instructions.ts` / `instructions/` | Base system prompt                          | Optional  | A flat file, or a directory of `.md` and `.ts` files. Static sources compose at build time. Dynamic sources (`defineDynamic` + `defineInstructions`) resolve at runtime. Required on the root, optional on subagents. |
| `instrumentation.ts`                                    | Telemetry config                            | No        | OTel exporter and AI SDK span settings, auto-discovered and run before agent code. Root-only.                                                                                                                         |
| `channels/`                                             | HTTP / messaging entrypoints                | No        | Root-only.                                                                                                                                                                                                            |
| `connections/`                                          | External service connections (MCP, OpenAPI) | Yes       | One connection per file; name derived from filename.                                                                                                                                                                  |
| `hooks/`                                                | Lifecycle and stream-event subscribers      | Yes       | Module-backed only. Recursive directories supported.                                                                                                                                                                  |
| `skills/`                                               | On-demand procedures and capability packs   | Yes       | Flat markdown, module-backed skills, or packaged skills. Seeded into `/workspace/skills/...`.                                                                                                                         |
| `lib/`                                                  | Shared authored helper code                 | Yes       | Import-only; not mounted into the workspace.                                                                                                                                                                          |
| `sandbox.ts` or `sandbox/sandbox.ts`                    | The agent's single sandbox                  | Yes       | Use top-level `sandbox.ts` for a definition-only override; use `sandbox/sandbox.ts` + `sandbox/workspace/**` to also seed files. Framework default applies when neither is authored.                                  |
| `sandbox/workspace/**`                                  | Files seeded into the sandbox               | Yes       | Mirrored into `/workspace/...` at session bootstrap.                                                                                                                                                                  |
| `tools/`                                                | Typed executable integrations               | Yes       | Module-backed only.                                                                                                                                                                                                   |
| `schedules/`                                            | Recurring jobs                              | No        | Each schedule is `<name>.ts` (default-exported `defineSchedule`) or `<name>.md` (frontmatter `cron:` + prompt body). Recursive nesting supported. Root-only.                                                          |
| `subagents/`                                            | Specialist child agents                     | Yes       | Each child is its own local package under `subagents/<id>/`. Nested subagents are supported.                                                                                                                          |

## What reaches the runtime workspace

eve does not mount the whole tree. Only two sources land in the sandbox workspace:

* `skills/` files → `/workspace/skills/...`
* `sandbox/workspace/**` → `/workspace/...` at session bootstrap

Everything in `lib/` stays import-only source code and never reaches the workspace.

## Local subagent layout

A local subagent lives under `subagents/<id>/` and uses the same `agent.ts` shape as the root.

```text
agent/subagents/researcher/
├── agent.ts
├── instructions.md
├── connections/
├── hooks/
├── skills/
├── lib/
├── sandbox/
├── tools/
└── subagents/
```

Rules:

* `agent.ts` is required, and must declare a `description`. eve exposes the subagent to its parent as a tool, and the parent reads the description on that tool to decide when to delegate.
* `instructions.md` / `instructions.ts` is optional (unlike the root agent, where it is required).
* `connections/`, `hooks/`, `skills/`, `lib/`, `sandbox/`, and `tools/` are all supported, discovered from the subagent's own directory.
* `channels/` and `schedules/` are not supported inside local subagents.
* Nested subagents are supported.

## Flat layout

Supported when the app root is also the agent root:

```text
my-agent/
├── package.json
├── agent.ts
├── instructions.md
├── tools/
└── skills/
```

Prefer the nested layout. It keeps the app root separate from the authored surface.

## Why didn't eve discover my file?

Run `eve info`. It lists the discovered surface and prints discovery diagnostics. From there, check that the file sits in the right authored slot (per the slot table above) and that the root-vs-subagent boundary is valid. eve also writes inspectable artifacts under `.eve/`. See the debugging artifacts in [instrumentation.ts](../guides/instrumentation) and the [CLI](./cli) reference.

## What to read next

* [`agent.ts`](../agent-config): the runtime config at the root
* [Tools](../tools): the most common authored slot
* [TypeScript API](./typescript-api): the define\* helpers and where they import from



---
title: TypeScript API
description: The define* helpers, the runtime ctx, and where each one is imported from.
---

# TypeScript API



This is the public surface of the `eve` package: the `define*` helpers you author with, the `ctx` they receive at runtime, and the import path for each. The full contract lives in `packages/eve/src/public/index.ts`; anything not exported there is a framework internal.

Identity comes from the filesystem, not a field you set. A tool at `agent/tools/get_weather.ts` is `get_weather`, and a connection at `agent/connections/linear.ts` is `linear`, so no definition carries a `name` or `id`.

Most files look the same: import a helper, default-export the result.

```ts title="agent/agent.ts"
import { defineAgent } from "eve";

export default defineAgent({ model: "anthropic/claude-opus-4.8" });
```

```ts title="agent/tools/get_weather.ts"
import { defineTool } from "eve/tools";
import { z } from "zod";

export default defineTool({
  description: "Get the weather for a city.",
  inputSchema: z.object({ city: z.string() }),
  async execute({ city }, ctx) {
    return { city, condition: "Sunny" };
  },
});
```

## The define\* helpers

| Helper                                                 | Import from                                   | Authored at                          | Guide                                                  |
| ------------------------------------------------------ | --------------------------------------------- | ------------------------------------ | ------------------------------------------------------ |
| `defineAgent`                                          | `eve`                                         | `agent/agent.ts`                     | [agent.ts](../agent-config)                            |
| `defineTool`                                           | `eve/tools`                                   | `agent/tools/<name>.ts`              | [Tools](../tools)                                      |
| `defineDynamic`                                        | `eve/tools`, `eve/skills`, `eve/instructions` | `agent/{tools,skills,instructions}/` | [Dynamic capabilities](../guides/dynamic-capabilities) |
| `defineMcpClientConnection`, `defineOpenAPIConnection` | `eve/connections`                             | `agent/connections/<name>.ts`        | [Connections](../connections)                          |
| `defineChannel`                                        | `eve/channels`                                | `agent/channels/<name>.ts`           | [Custom channels](../channels/custom)                  |
| `eveChannel`, `slackChannel`, and the other platforms  | `eve/channels/<platform>`                     | `agent/channels/<platform>.ts`       | [Channels](../channels/overview)                       |
| `defineSkill`                                          | `eve/skills`                                  | `agent/skills/<name>.ts`             | [Skills](../skills)                                    |
| `defineInstructions`                                   | `eve/instructions`                            | `agent/instructions.ts`              | [Instructions](../instructions)                        |
| `defineHook`                                           | `eve/hooks`                                   | `agent/hooks/<slug>.ts`              | [Hooks](../guides/hooks)                               |
| `defineSchedule`                                       | `eve/schedules`                               | `agent/schedules/<name>.ts`          | [Schedules](../schedules)                              |
| `defineState`                                          | `eve/context`                                 | tools, hooks, lifecycle              | [Session context](../guides/session-context)           |
| `defineSandbox`                                        | `eve/sandbox`                                 | `agent/sandbox.ts`                   | [Sandbox](../sandbox)                                  |
| `defineInstrumentation`                                | `eve/instrumentation`                         | `agent/instrumentation.ts`           | [instrumentation.ts](../guides/instrumentation)        |
| `defineRemoteAgent`                                    | `eve`                                         | `agent/subagents/<id>/agent.ts`      | [Remote agents](../guides/remote-agents)               |
| `defineEval`                                           | `eve/evals`                                   | `evals/*.eval.ts`                    | [Evals](../evals/overview)                             |
| `defineEvalConfig`                                     | `eve/evals`                                   | `evals/evals.config.ts`              | [Evals](../evals/overview)                             |
| `useEveAgent`                                          | `eve/react`, `eve/vue`, `eve/svelte`          | frontend                             | [Frontend](../guides/frontend/overview)                |

A few non-`define*` helpers round out the set:

* `disableTool` and `ExperimentalWorkflow` from `eve/tools` (see [Default harness](../concepts/default-harness)).
* The route verbs `GET`/`POST`/`PUT`/`PATCH`/`DELETE`/`WS` from `eve/channels`.
* The approval predicates `always`/`once`/`never` from `eve/tools/approval`.
* The channel auth helpers `localDev`/`vercelOidc`/`placeholderAuth` from `eve/channels/auth`.

To wrap a built-in tool, import its default value from `eve/tools/defaults` (`bash`, `readFile`, `writeFile`, `glob`, `grep`, `webFetch`, `webSearch`, `todo`, `loadSkill`). `AgentWorkflowDefinition` and `AgentWorkflowWorldDefinition` are exported from `eve` for the `defineAgent({ experimental: { workflow } })` config shape.

## Runtime context (`ctx`)

`ctx` is passed to your tool `execute`, hook handlers, and channel event handlers. It is live only while authored code is running, so reaching for it at module top level throws. See [Session context](../guides/session-context) for the full model.

| Member                     | Use                                                                           |
| -------------------------- | ----------------------------------------------------------------------------- |
| `ctx.session`              | Current session, turn, auth, and optional parent lineage (read-only)          |
| `ctx.getSandbox()`         | Live sandbox handle for the current agent                                     |
| `ctx.getSkill(identifier)` | Handle for a named skill visible to the current agent                         |
| `ctx.getToken()`           | Resolve the bearer token for a tool's declared `auth` (throws without `auth`) |
| `ctx.requireAuth()`        | Force the tool's authorization flow before proceeding                         |

## Imports at a glance

| Import                                                      | Holds                                                                |
| ----------------------------------------------------------- | -------------------------------------------------------------------- |
| `eve`                                                       | `defineAgent`, `defineRemoteAgent`, agent config types               |
| `eve/tools`                                                 | `defineTool`, `defineDynamic`, `disableTool`, `ExperimentalWorkflow` |
| `eve/tools/defaults`                                        | the built-in tools as plain values                                   |
| `eve/tools/approval`                                        | `always`, `once`, `never`                                            |
| `eve/connections`                                           | `defineMcpClientConnection`, `defineOpenAPIConnection`               |
| `eve/channels`                                              | `defineChannel`, route verbs                                         |
| `eve/channels/eve`                                          | `eveChannel`                                                         |
| `eve/channels/auth`                                         | `localDev`, `vercelOidc`, `placeholderAuth`                          |
| `eve/channels/{slack,discord,teams,telegram,twilio,github}` | platform channel factories                                           |
| `eve/hooks`                                                 | `defineHook`                                                         |
| `eve/schedules`                                             | `defineSchedule`                                                     |
| `eve/skills`                                                | `defineSkill`, `defineDynamic`                                       |
| `eve/instructions`                                          | `defineInstructions`, `defineDynamic`                                |
| `eve/context`                                               | `defineState`, session and state types                               |
| `eve/sandbox`                                               | `defineSandbox`, backends                                            |
| `eve/instrumentation`                                       | `defineInstrumentation`, `isChannel`                                 |
| `eve/evals`                                                 | `defineEval`, `defineEvalConfig`, eval types                         |
| `eve/evals/expect`                                          | `includes`, `equals`, `matches`, `similarity`                        |
| `eve/evals/reporters`                                       | `Braintrust`, `JUnit`, `EvalReporter`                                |
| `eve/evals/loaders`                                         | `loadJson`, `loadYaml`                                               |
| `eve/react`, `eve/vue`, `eve/svelte`                        | `useEveAgent`                                                        |
| `eve/next`, `eve/nuxt`, `eve/sveltekit`                     | framework bundler plugins                                            |
| [`eve/client`](../guides/client/overview)                   | `Client`, `ClientSession`                                            |

Exported types ship from the same entrypoint as the helper they describe (for example `ToolDefinition` and `ToolContext` from `eve/tools`). For the exhaustive list, read `packages/eve/src/public/index.ts`.

## What to read next

* [`agent.ts`](../agent-config): the agent config these helpers configure
* [Tools](../tools): `defineTool`, the most-used helper
* [Project layout](./project-layout): where each define\* lives on disk



---
title: Human-in-the-loop
description: Pause a run for a person — gate a tool on approval or have the agent ask a question — and resume durably when they answer.
---

# Human-in-the-loop



Human-in-the-loop (HITL) is any point where the agent durably pauses and waits for a person. Two things trigger it, and both ride the same pause-and-resume protocol:

* **Approvals**: a tool requires a person to sign off before it runs. The agent decides to call the tool; a human decides whether the call goes ahead.
* **Questions**: the agent itself asks the user a clarifying question or offers a choice mid-turn, and parks until they answer.

Either way the run parks at `session.waiting`, durably, for as long as it takes — seconds or days — and picks back up exactly where it left off once the answer arrives. Channels render the request for you.

## Approvals

Approval is a property of a [tool](/docs/tools): a gated tool pauses for a person before it runs. Gate a tool with `needsApproval` and the helpers from `eve/tools/approval`:

```ts title="agent/tools/refund_charge.ts"
import { defineTool } from "eve/tools";
import { always } from "eve/tools/approval";
import { z } from "zod";
import { refund } from "../lib/payments"; // hypothetical helper under agent/lib/

export default defineTool({
  description: "Refund a charge.",
  inputSchema: z.object({ chargeId: z.string(), amount: z.number() }),
  needsApproval: always(), // or once() / never() / a predicate
  async execute(input) {
    return refund(input);
  },
});
```

| Helper     | Behavior                                                                           |
| ---------- | ---------------------------------------------------------------------------------- |
| `never()`  | Never require approval (the default when omitted).                                 |
| `once()`   | Require approval only the first time the tool runs in a session; auto-allow after. |
| `always()` | Require approval before every call.                                                |

When `needsApproval` is omitted, the tool behaves as if it were set to `never()`, so its calls can execute without human approval. Require human approval or other safeguards for sensitive, irreversible, regulated, financial, healthcare, employment, housing, legal, safety-impacting, user-impacting, or external side-effecting actions.

When the decision depends on the input, pass your own predicate instead of a helper. It receives `{ toolName, toolInput, approvedTools }` and returns a boolean. `toolInput` can be undefined, so guard the access. To require approval only when an amount crosses a threshold:

```ts
needsApproval: ({ toolInput }) => (toolInput?.amount ?? 0) > 1000,
```

Gating a side effect on approval is also how you make non-idempotent work safe across replays: a charge or email that sits behind `always()` can't fire from a re-run step without a fresh human decision.
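To make the predicate contract concrete, here is a self-contained sketch that models it without importing eve. The `ApprovalArgs` type, the array shape of `approvedTools`, and the names `firstUseOnly`, `aboveAmount`, and `either` are assumptions for illustration; the real helpers and types come from `eve/tools/approval` and may differ.

```ts
// Illustrative model of a needsApproval predicate; not eve's actual types.
type ApprovalArgs = {
  toolName: string;
  toolInput?: Record<string, unknown>; // can be undefined, so guard the access
  approvedTools: readonly string[]; // assumed shape: tools already approved this session
};
type NeedsApproval = (args: ApprovalArgs) => boolean;

// Behaves like the documented once(): ask the first time, auto-allow after.
const firstUseOnly: NeedsApproval = ({ toolName, approvedTools }) =>
  !approvedTools.includes(toolName);

// Input-dependent: require approval only when `amount` exceeds a limit.
const aboveAmount = (limit: number): NeedsApproval => ({ toolInput }) => {
  const amount = toolInput?.amount;
  return typeof amount === "number" && amount > limit;
};

// Combine predicates: approval is required if either one requires it.
const either = (a: NeedsApproval, b: NeedsApproval): NeedsApproval => (args) =>
  a(args) || b(args);
```

A predicate is a plain function, so you can unit-test an approval policy without starting a session.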

## Questions

The built-in `ask_question` tool lets the model pause and ask the user, rather than guessing. It has no `execute` — the model calls it with `{ prompt, options?, allowFreeform? }`:

* `prompt`: the question to put to the user.
* `options`: an optional list of choices to offer. Channels render these as buttons or a select menu.
* `allowFreeform`: whether the user may answer with free text instead of picking an option.

`ask_question` is part of the [default harness](/docs/concepts/default-harness), so it is available without you defining anything. It produces the same `input.requested` pause as an approval, and resumes the same way.

## How pause and resume works

Approvals and questions share one protocol:

1. The model requests input (an approval, or an `ask_question`).
2. eve emits an `input.requested` stream event carrying the pending requests.
3. The turn parks at `session.waiting`, durably, for as long as it takes.
4. The client answers with `inputResponses` (structured, keyed by `requestId`) or a normal follow-up `message`. A follow-up whose text matches an option label (case-insensitive) resolves automatically.

The run picks back up exactly where it parked. Because the pause is durable, nothing is held in memory while it waits — the process can restart and the parked turn survives.
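The option-label matching in step 4 can be sketched as a pure function. This is a minimal model under stated assumptions: `PendingRequest`, `InputResponse`, and `resolveFollowUp` are hypothetical names, the `value` field is an invented stand-in for the real `inputResponses` payload, and trimming whitespace before comparing is a choice made here, not documented behavior.

```ts
// Hypothetical shapes; the real request and response types are defined by eve.
type PendingRequest = { requestId: string; options?: { label: string }[] };
type InputResponse = { requestId: string; value: string };

// Resolve a plain follow-up message against pending requests by
// case-insensitive option label. Returns null when nothing matches.
function resolveFollowUp(pending: PendingRequest[], message: string): InputResponse | null {
  const text = message.trim().toLowerCase();
  for (const request of pending) {
    const match = request.options?.find((option) => option.label.toLowerCase() === text);
    if (match) return { requestId: request.requestId, value: match.label };
  }
  return null;
}
```

A message that matches no label returns `null` in this sketch, leaving the caller to decide how to handle it.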

See [Sessions, runs & streaming](/docs/concepts/sessions-runs-and-streaming) for the full event and resume contract that this builds on.

## Answering from a client or channel

Channels turn requests into native UI: the Slack adapter renders approvals as buttons and questions as select menus, and writes the user's choice back as the answer. You get this for free on every [channel](/docs/channels).

From your own frontend, read the pending request off the latest message and answer through the same session — see [Building a frontend](/docs/guides/frontend/overview#human-in-the-loop-prompts) for the client-side reducer and `inputResponses` shape.

## What to read next

* [Tools](/docs/tools): define the typed actions an approval gates
* [Default harness](/docs/concepts/default-harness): the built-in tools, including `ask_question`
* [Sessions, runs & streaming](/docs/concepts/sessions-runs-and-streaming): the event and resume contract behind the pause
* [Building a frontend](/docs/guides/frontend/overview): render and answer requests from your own UI



---
title: Tools
description: Define typed actions the agent can call, and gate sensitive ones on human approval.
---

# Tools



A tool is a typed action the agent can call, such as hitting an API, running a query, or writing a file. The action stays in code you control. Tools run in your app runtime with full access to `process.env`, not in the [sandbox](/docs/sandbox).

## Define a tool

The filename is the tool name the model sees. A file at `agent/tools/get_weather.ts` is exposed as `get_weather`.

```ts title="agent/tools/get_weather.ts"
import { defineTool } from "eve/tools";
import { z } from "zod";

export default defineTool({
  description: "Get the current weather for a city.",
  inputSchema: z.object({ city: z.string().min(1) }),
  async execute({ city }, ctx) {
    return { city, condition: "Sunny", temperatureF: 72 };
  },
});
```

A tool definition needs:

* a filename slug under `agent/tools/`, the model-facing name.
* a `description`: what the tool does, written for the model.
* an `inputSchema`: a Zod schema (or any Standard Schema, or a plain JSON Schema object). Required. For no input, pass `z.object({})`. Zod and Standard Schema infer the `input` type in `execute`. Plain JSON Schema types it as `Record<string, unknown>`.
* an `execute(input, ctx)`: the implementation. May be sync or async.

When a tool returns structured data, add an optional `outputSchema`. With Zod or Standard Schema it also types the `execute` return.

### The `ctx` parameter

`execute` gets a `ctx` carrying the runtime accessors:

* `ctx.session`: session metadata, turn, auth, parent lineage.
* `ctx.getSandbox()`: the live [sandbox](/docs/sandbox) handle.
* `ctx.getSkill(id)`: read a packaged [skill](/docs/skills)'s metadata and files.

Running in the app runtime is what lets a tool import shared code from `lib/`, read `process.env`, and take part in eve's durable pause/resume model.

eve never runs authored tools during discovery. The model sees descriptors first, and only what it actually calls gets executed. Completed steps never re-run; eve replays the recorded result. A step interrupted mid-execution re-runs, so make non-idempotent side effects like charges or emails idempotent, or gate them with approval.
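One common way to make a side effect safe to re-run is an idempotency key: derive a stable key from the input and skip the call when that key has already completed. The sketch below is a generic illustration, not an eve API. `refundOnce`, `callProvider`, and the in-memory `completed` map are hypothetical, and the provider call is kept synchronous for brevity.

```ts
// In-memory stand-in for a durable store. A real tool would persist this,
// or pass the key to a provider that supports idempotency keys itself.
const completed = new Map<string, { refundId: string }>();
let providerCalls = 0;

// Stand-in for the non-idempotent provider call.
function callProvider(chargeId: string, amount: number): { refundId: string } {
  providerCalls += 1;
  return { refundId: `re_${chargeId}_${amount}` };
}

// Re-running with the same input returns the recorded result
// instead of issuing a second refund.
function refundOnce(chargeId: string, amount: number): { refundId: string } {
  const key = `refund:${chargeId}:${amount}`;
  const prior = completed.get(key);
  if (prior) return prior;

  const result = callProvider(chargeId, amount);
  completed.set(key, result);
  return result;
}
```

A local record like this still leaves a gap if the process stops between the provider call and the write, so prefer a provider-side idempotency key where one exists.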

## Gate a tool on human approval

A tool can require a person to sign off before it runs. Set `needsApproval` with the helpers from `eve/tools/approval`:

```ts title="agent/tools/refund_charge.ts"
import { defineTool } from "eve/tools";
import { always } from "eve/tools/approval";
import { z } from "zod";
import { refund } from "../lib/payments"; // hypothetical helper under agent/lib/

export default defineTool({
  description: "Refund a charge.",
  inputSchema: z.object({ chargeId: z.string(), amount: z.number() }),
  needsApproval: always(), // or once() / never() / a predicate
  async execute(input) {
    return refund(input);
  },
});
```

Approval is one half of eve's [human-in-the-loop](./human-in-the-loop) model. That page covers the `always()`/`once()`/`never()` helpers, input-dependent predicates, and how a gated call pauses and resumes durably.

## Shape what the model sees with `toModelOutput`

By default the model sees the full `execute` return. When a tool returns rich data a channel needs for rendering but the model only needs the gist, project it down with `toModelOutput`:

```ts
toModelOutput(output) {
  return { type: "text", value: `Report for ${output.domain}: score ${output.score}.` };
},
```

`toModelOutput` receives the full, typed `execute` return and only affects the model. Channel event handlers and hooks still get the full output on `action.result`, so a channel can render rich platform output (Slack Block Kit, say) the model never sees. Return `{ type: "text", value }` for a summary, or `{ type: "json", value }` for a smaller object.
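For the `json` variant, a projection might look like the following standalone sketch. The `Report` type and its `findings` field are hypothetical, and `ModelOutput` is a local approximation of the two return shapes named above rather than a type imported from eve.

```ts
// Hypothetical rich output that a channel would render in full.
type Report = {
  domain: string;
  score: number;
  findings: { rule: string; detail: string }[];
};

// Local approximation of the two documented return shapes.
type ModelOutput =
  | { type: "text"; value: string }
  | { type: "json"; value: Record<string, unknown> };

// Keep the fields the model needs and replace the findings with a count.
function toModelOutput(output: Report): ModelOutput {
  return {
    type: "json",
    value: {
      domain: output.domain,
      score: output.score,
      findingCount: output.findings.length,
    },
  };
}
```

The channel can still render every finding from the full output, while the model sees only the summary fields.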

Do not return secrets, credentials, unnecessary personal data, or unbounded sensitive content from tools. Filter, minimize, and redact tool outputs before returning them.

## What to read next

* [Human-in-the-loop](./human-in-the-loop): gate a tool on approval, or have the agent ask a question
* [Skills](/docs/skills): on-demand procedures the model loads when relevant
* [Default harness](/docs/concepts/default-harness): the built-in tools and how to override or disable them
* [Dynamic capabilities](/docs/guides/dynamic-capabilities): tools whose set is resolved per session with `defineDynamic`
* [Auth & route protection](/docs/guides/auth-and-route-protection): authenticate a tool to an external service



---
title: Connect a Warehouse
description: Part 4 of the Build an Agent tutorial. Let each user connect their own warehouse over an OAuth MCP via Vercel Connect.
---

# Connect a Warehouse



The sample dataset got the analytics assistant running, but it's a stand-in. Now point the agent at a real warehouse and let each user connect their own by signing in through their browser. That's what a connection is for: here, an MCP server the model reaches through tools, with auth that eve drives for you.

This step depends on Vercel Connect, which is in private beta. No Connect access? Keep the Step 3 sample dataset and read this step for the connection model. Steps 5 through 9 work against the sample dataset, so you can complete the tutorial without a warehouse.

The filename sets the runtime name. Put the file at `agent/connections/warehouse.ts` and it registers as `"warehouse"`, with its tools surfaced as `connection__warehouse__<tool>`.
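The naming pattern can be written out as a one-line helper. This is only an illustration of the `connection__<name>__<tool>` format described above: `connectionToolName` is a hypothetical function, and `run_query` is an invented tool name, since the tools a warehouse MCP server exposes depend on the server.

```ts
// Hypothetical helper: derive the model-facing tool name from the
// connection file path and a tool exposed by the MCP server.
function connectionToolName(connectionFile: string, tool: string): string {
  const file = connectionFile.split("/").pop() ?? connectionFile;
  const connection = file.replace(/\.[^.]+$/, ""); // drop the extension
  return `connection__${connection}__${tool}`;
}

// "run_query" is an invented example tool name.
const exampleName = connectionToolName("agent/connections/warehouse.ts", "run_query");
```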

## Declare the connection

The warehouse exposes a generic SQL MCP behind OAuth. Pass `connect()` from `@vercel/connect/eve` as the auth, and Vercel Connect handles the OAuth flow, stores the tokens, and refreshes them for you:

```ts title="agent/connections/warehouse.ts"
import { connect } from "@vercel/connect/eve";
import { defineMcpClientConnection } from "eve/connections";

export default defineMcpClientConnection({
  url: "https://mcp.your-warehouse.example/sse",
  description: "The team's data warehouse: run read-only SQL and list tables and columns.",
  auth: connect("warehouse"),
});
```

`"warehouse"` is the UID you chose when registering the Connect client. By default the OAuth grant is user-scoped: each end user authorizes in their own browser, and eve resolves that user's token before every tool call.

Once Connect is enabled on your account, wire it up:

1. Install the package: `npm install @vercel/connect`.
2. Create the Connect client: `vercel connect create <type> --name warehouse`.
3. Link the client to your project.
4. Run `vercel link` and `vercel env pull` so `VERCEL_OIDC_TOKEN` is available locally.

For the full reference, see [Connections](../connections).

## What the user sees

Ask a question that needs the warehouse:

```text
How many enterprise customers signed up last month?
```

The first time, the model picks a warehouse tool but there's no token yet, so the turn parks and the channel shows a "Sign in" affordance. You authorize in the browser, and once the OAuth callback completes, the turn resumes from exactly that step (the durable parking from [Step 2](./how-it-runs)) and the query runs. Later calls in the session reuse the cached per-user token, so there's no prompt.

## The token never reaches the model

Right before each request to the MCP server, eve resolves the bearer token and sends it as `Authorization: Bearer <token>`. The model only ever sees tool names, descriptions, and results. The credential stays out of its reach.
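The separation can be pictured with a small sketch: the token is attached to the outgoing request, while the value handed back to the model is built only from the tool name and result. Everything here is a simplified assumption. `buildRequest`, `modelView`, and the request body shape are hypothetical and do not reflect the MCP wire format or eve's internals.

```ts
type ToolCall = { tool: string; args: Record<string, unknown> };

// The token goes into the Authorization header of the outgoing request only.
function buildRequest(call: ToolCall, token: string) {
  return {
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(call), // simplified stand-in body: tool name and arguments
  };
}

// What the model is shown: the tool name and its result, nothing from the headers.
function modelView(call: ToolCall, result: unknown) {
  return { tool: call.tool, result };
}
```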

If you want more control, gate the connection behind approval (`approval: once()`) or narrow which tools the model sees (`tools.allow`). See [Connections](../connections).

→ Next: [Run analysis](./run-analysis)

Learn more: [Connections](../connections) · [Auth and route protection](../guides/auth-and-route-protection)


---

For a semantic overview of all documentation, see [/sitemap.md](/sitemap.md)

For an index of all available documentation, see [/llms.txt](/llms.txt)

For agent-facing discovery, including API and MCP surfaces, see [/agents.md](/agents.md)

---
title: Your First Agent
description: Part 1 of the Build an Agent tutorial. Scaffold the analytics assistant, give it an analyst persona, run it, and ask a question.
---

# Your First Agent



The Build an Agent tutorial constructs one app end to end, a data analytics assistant. You ask in natural language, and over the tutorial's nine steps it learns to query a warehouse, run analysis in a sandbox, remember your team's metric definitions, and refuse to exceed your query budget without asking.

Step 1 gets it talking. The scaffold bundles a small sample dataset, so your first question works with zero setup.

## Prerequisites

* Node 24 or newer and npm.
* A model credential. The scaffold's default model goes through the [Vercel AI Gateway](../getting-started), so you need `AI_GATEWAY_API_KEY` (or `VERCEL_OIDC_TOKEN` pulled via `vercel link`). A direct provider model like `anthropic("claude-opus-4.8")` instead needs that provider's AI SDK package and key, here `@ai-sdk/anthropic` and `ANTHROPIC_API_KEY`.

If you have not run eve before, complete [Getting Started](../getting-started) first. Without a credential, "Run the agent" below fails when the runtime tries to reach the model; the dev TUI's `/model` flow walks you through pasting a key or linking a project.

## Scaffold the agent

```bash
npx eve@latest init analytics-assistant
cd analytics-assistant
```

The command writes the starter agent with eve's default model and built-in HTTP API
channel (`agent/channels/eve.ts`), installs dependencies, initializes Git, and
starts the development server. Stop the server before continuing with the edits
below. It does not create a Vercel project or deploy. `init` creates the
`analytics-assistant/` directory, so `cd` into it before running further
commands.

## Set the model

`agent/agent.ts` holds the model and config. Use a capable model for analysis work:

```ts
import { defineAgent } from "eve";

export default defineAgent({
  model: "anthropic/claude-opus-4.8",
});
```

## Give it an analyst persona

`agent/instructions.md` is the always-on system prompt. Replace the starter text with a standing identity for a data analyst:

```md
You are a senior data analyst. You answer questions about the team's data.

- Prefer exact numbers to hand-waving. If you can compute it, compute it.
- State the assumptions behind any number you report (date range, filters, grain).
- Use the tools available to you rather than guessing. If you cannot answer from
  the data, say so plainly.
```

Instructions are identity and standing rules. On-demand procedures belong in skills (Step 7), and actions belong in tools (Step 3). See [Instructions](../instructions).

## Run the agent

```bash
npm run dev
```

The `init` scaffold writes a `dev` script that runs the `eve dev` binary from the project's `node_modules`. The local runtime boots and the dev TUI opens. Ask it something it can answer from general knowledge first:

```text
What's a good way to measure week-over-week retention?
```

You get a reply that follows the analyst persona. It can't see your data yet (that comes in Step 3). First, a look at what happened under the hood.

→ Next: [How it runs](./how-it-runs)

Learn more: [Getting Started](../getting-started) · [Instructions](../instructions)



---
title: Guard the Spend
description: Part 8 of the Build an Agent tutorial. Gate expensive queries with cost-based approval. The agent pauses, asks, and resumes.
---

# Guard the Spend



A single warehouse query can scan terabytes and run up the bill. So before the analytics assistant fires off an expensive scan, make it stop and check with you. The agent pauses, asks you, and resumes with your answer. That's human-in-the-loop, and you wire it up with one field on the tool.

`needsApproval` runs before `execute`. Return `true` and the turn parks on an approval request; you answer, and the run picks up from that exact step. The function gets the tool input, so you can make the decision cost-based.

## Estimate, then gate

This step keeps `run_sql` on the Step 3 sample dataset so you can demo the gate locally. With a real warehouse you'd gate the warehouse connection tool from Step 4 the same way, on a dry-run byte estimate instead of the toy heuristic below.

Add a cheap estimator and gate `run_sql` on it:

```ts title="agent/lib/cost.ts"
// Illustrative: a real warehouse exposes a dry-run byte estimate.
export function estimateScanGb(sql: string): number {
  return /\bwhere\b/i.test(sql) ? 1 : 200; // unfiltered scans are the expensive ones
}
```

```ts title="agent/tools/run_sql.ts"
import { defineTool } from "eve/tools";
import { z } from "zod";
import { runReadOnlySql } from "../lib/sample-db.js";
import { estimateScanGb } from "../lib/cost.js";

const THRESHOLD_GB = 50;

export default defineTool({
  description: "Run a read-only SQL query against the analytics tables.",
  inputSchema: z.object({ sql: z.string() }),
  // Cost-based gate: only the expensive queries need a human yes.
  needsApproval: ({ toolInput }) => estimateScanGb(toolInput?.sql ?? "") > THRESHOLD_GB,
  async execute({ sql }) {
    const { columns, rows } = await runReadOnlySql(sql);
    return { columns, rows: rows.slice(0, 500), truncated: rows.length > 500 };
  },
});
```

Cheap queries run straight through. A query estimated above the threshold trips the gate.
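
To make the gate concrete, here is the same toy heuristic applied to two queries. The snippet repeats `estimateScanGb` so it runs on its own. `needsHumanApproval` is a hypothetical wrapper used only for this illustration, not part of eve.

```ts
// Same toy heuristic as agent/lib/cost.ts, repeated so the snippet is self-contained.
function estimateScanGb(sql: string): number {
  return /\bwhere\b/i.test(sql) ? 1 : 200;
}

const THRESHOLD_GB = 50;

// Hypothetical helper: true means "ask a human first".
function needsHumanApproval(sql: string): boolean {
  return estimateScanGb(sql) > THRESHOLD_GB;
}

const filtered = "SELECT count(*) FROM orders WHERE created_at >= '2026-05-01'";
const unfiltered = "SELECT created_at, sum(amount_cents) FROM orders GROUP BY created_at";

console.log(needsHumanApproval(filtered)); // false (estimated 1 GB)
console.log(needsHumanApproval(unfiltered)); // true (estimated 200 GB)
```

The heuristic is easy to fool: a `WHERE` inside a subquery or a string literal also counts as a filter. That is acceptable for a local demo, and it is the reason a real deployment would gate on the warehouse's dry-run estimate instead.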

## Pause, ask, resume

Ask for something that forces a large unfiltered scan:

```text
Total revenue across all customers, all time, broken out by day.
```

The model proposes the query, `needsApproval` returns `true`, and the turn parks. The stream emits `input.requested`, then `session.waiting`. How the prompt looks depends on the channel: buttons in the TUI, Block Kit in Slack, or a UI control on the web. Approve it and the run resumes from exactly that step, then the query runs. Deny it and the tool is skipped, and the model is told why.

Each session has exactly one active continuation. Answer an approval against a stale handle and it's rejected, so there's no way to double-resume the same parked turn.
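
One way to picture the "exactly one active continuation" rule is a registry that issues a fresh handle each time a turn parks and accepts only the current one. This is a hypothetical model for intuition, not eve's implementation. `ContinuationRegistry` and its methods are names invented for this sketch.

```ts
// Hypothetical model of "one active continuation per session".
class ContinuationRegistry {
  private active = new Map<string, string>(); // sessionId -> current handle
  private counter = 0;

  // Parking a turn issues a new handle and invalidates any older one.
  park(sessionId: string): string {
    this.counter += 1;
    const handle = `${sessionId}:${this.counter}`;
    this.active.set(sessionId, handle);
    return handle;
  }

  // Resuming succeeds only for the current handle, and consumes it.
  resume(sessionId: string, handle: string): boolean {
    if (this.active.get(sessionId) !== handle) return false;
    this.active.delete(sessionId);
    return true;
  }
}

const registry = new ContinuationRegistry();
const first = registry.park("s1");
const second = registry.park("s1");

console.log(registry.resume("s1", first)); // false: stale handle
console.log(registry.resume("s1", second)); // true: current handle
console.log(registry.resume("s1", second)); // false: already consumed
```

Because a successful resume deletes the handle, a second answer to the same prompt has nothing to match against, so a double-click on "Approve" cannot run the query twice.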

The same machinery backs the built-in `ask_question` tool, where the model asks you mid-turn, and per-connection approval via `approval: once()`. See [Tools and human-in-the-loop](../tools).

→ Next: [Ship it](./ship-it)

Learn more: [Tools and human-in-the-loop](../tools)



---
title: How It Runs
description: Part 2 of the Build an Agent tutorial. Session, turn, and durable steps, and why a turn survives a crash.
---

# How It Runs



You sent the analytics assistant one message and got one answer. Three terms describe the execution model behind that exchange.

| Term        | Meaning                                           |
| ----------- | ------------------------------------------------- |
| **session** | Your whole conversation (durable, can span days). |
| **turn**    | One message you send and the work it triggers.    |
| **step**    | A durable checkpoint within the turn.             |

Each turn runs as a durable workflow, and eve saves progress at every step. Completed steps never re-run; eve replays the recorded result. A step interrupted mid-execution does re-run, so make side effects like charges or emails idempotent, or gate them with approval. A turn that's waiting on you (an approval, a question) resumes whenever you answer, even if that's much later.
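
The replay rule can be sketched with a tiny in-memory journal. This is a simplified, hypothetical model (eve's real durability layer is not shown in this tutorial). `Journal` and `runStep` are names made up for the illustration.

```ts
// Hypothetical step journal: stepId -> recorded result.
type Journal = Map<string, unknown>;

// Run a step once. If the journal already holds its result, replay that instead.
function runStep<T>(journal: Journal, stepId: string, fn: () => T): T {
  if (journal.has(stepId)) return journal.get(stepId) as T;
  const result = fn(); // a crash here leaves no record, so the step re-runs later
  journal.set(stepId, result);
  return result;
}

let charges = 0;
const journal: Journal = new Map();

const charge = () => {
  charges += 1;
  return `receipt-${charges}`;
};

const firstRun = runStep(journal, "charge-card", charge);
const replayed = runStep(journal, "charge-card", charge); // replayed, not re-executed

console.log(firstRun, replayed, charges); // receipt-1 receipt-1 1
```

The window between `fn()` finishing and the journal write is where an interruption causes a re-run. That window is why a side effect inside a step should tolerate being executed twice, for example by sending an idempotency key with the charge.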

That's why the features in the rest of this tutorial work the way they do:

* The warehouse sign-in in Step 4 parks the turn until you authorize in the browser. A few minutes is fine.
* The metric glossary in Step 6 survives across turns. State is checkpointed at step boundaries, so it sticks.
* The spend approval in Step 8 pauses the turn on your yes/no, then picks up exactly where it left off.

You author capabilities, including tools, instructions, channels, and skills. eve drives the model-to-tool loop and decides when a turn continues, waits, or ends. You never write that loop yourself.

→ Next: [Step 3: Query sample data](./query-sample-data)

Learn more: [Execution model & durability](../concepts/execution-model-and-durability)



---
title: Query Sample Data
description: Part 3 of the Build an Agent tutorial. Add a run_sql tool over the bundled sample dataset and watch the tool loop.
---

# Query Sample Data



The analytics assistant can hold a conversation, but it can't see a single row of data. Give it a tool. A tool is the action primitive. Typed input goes in, your code runs, structured output comes back. The name the model sees is the filename, so `agent/tools/run_sql.ts` becomes the tool `run_sql`.

## A tiny sample dataset

To make the first query work without setup, bundle a small in-memory dataset under `agent/lib/`. Keep it tiny. This is throwaway scaffolding, not the real warehouse (that comes in Step 4).

```ts title="agent/lib/sample-db.ts"
// A toy SQLite-in-memory stand-in. Swap for your real warehouse in Step 4.
import initSqlJs from "sql.js";

const SEED = `
  CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount_cents INTEGER, created_at TEXT);
  INSERT INTO orders VALUES
    (1, 10, 4200, '2026-05-01'), (2, 10, 1500, '2026-05-03'),
    (3, 11, 9900, '2026-05-04'), (4, 12,  800, '2026-05-06');
  CREATE TABLE customers (id INTEGER, name TEXT, plan TEXT);
  INSERT INTO customers VALUES
    (10, 'Acme', 'pro'), (11, 'Globex', 'enterprise'), (12, 'Initech', 'free');
`;

let dbPromise: Promise<import("sql.js").Database> | null = null;

async function db() {
  dbPromise ??= initSqlJs().then((SQL) => {
    const database = new SQL.Database();
    database.run(SEED);
    return database;
  });
  return dbPromise;
}

export async function runReadOnlySql(sql: string) {
  const database = await db();
  const [result] = database.exec(sql);
  if (!result) return { columns: [], rows: [] as unknown[][] };
  return { columns: result.columns, rows: result.values };
}
```
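
Note that `runReadOnlySql` hands whatever string it receives to `database.exec`, so the "read-only" in its name is a convention rather than something the code enforces. A minimal guard, assuming a keyword check is good enough for a throwaway sample dataset, might look like this. `assertSingleSelect` and `isAllowed` are hypothetical helpers, and this is a heuristic rather than a SQL parser.

```ts
// Heuristic guard for the sample dataset only; not a SQL parser.
function assertSingleSelect(sql: string): string {
  const trimmed = sql.trim().replace(/;\s*$/, ""); // allow one trailing semicolon
  if (trimmed.includes(";")) {
    throw new Error("Only a single statement is allowed.");
  }
  if (!/^select\b/i.test(trimmed)) {
    throw new Error("Only SELECT statements are allowed.");
  }
  return trimmed;
}

function isAllowed(sql: string): boolean {
  try {
    assertSingleSelect(sql);
    return true;
  } catch {
    return false;
  }
}

console.log(isAllowed("SELECT * FROM orders;")); // true
console.log(isAllowed("DELETE FROM orders")); // false
console.log(isAllowed("SELECT 1; DROP TABLE orders")); // false
```

The check is deliberately conservative: it also rejects a legitimate query whose string literal contains a semicolon. For a real warehouse, rely on a read-only database role instead of string inspection.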

## Define the run\_sql tool

```ts title="agent/tools/run_sql.ts"
import { defineTool } from "eve/tools";
import { z } from "zod";
import { runReadOnlySql } from "../lib/sample-db.js";

export default defineTool({
  description:
    "Run a read-only SQL query against the analytics tables (orders, customers) " +
    "and return the columns and rows.",
  inputSchema: z.object({
    sql: z.string().describe("A single read-only SELECT statement."),
  }),
  async execute({ sql }) {
    const { columns, rows } = await runReadOnlySql(sql);
    // Bound the output so a wide query can't flood the model's context.
    return { columns, rows: rows.slice(0, 500), truncated: rows.length > 500 };
  },
});
```

Tools run in your app runtime with full `process.env`, not in the sandbox. The `inputSchema` both validates the call and types the `input` you get inside `execute`. For output bounding, `toModelOutput`, and authorization, see [Tools](../tools).

## Watch the tool loop

Restart the dev server with `npm run dev` and ask:

```text
Which customer has spent the most, and how much?
```

Watch the loop play out in the TUI. The model emits a `run_sql` call, eve runs your `execute`, and the rows come back as a tool result. The model reads them and answers with a real number. eve drove the whole loop. All you supplied was the tool.

→ Next: [Connect a warehouse](./connect-a-warehouse)

Learn more: [Tools](../tools)



---
title: Remember Definitions
description: Part 6 of the Build an Agent tutorial. Use defineState to remember the team's metric glossary across turns.
---

# Remember Definitions



Every team has house definitions the analytics assistant needs to know. "Active" means a purchase in the last 30 days, revenue is net of refunds, a "week" starts Monday. Re-explaining all of that on every turn is a waste. State gives the agent a place to keep them.

`defineState(name, initial)` creates a typed, named slot that survives across step and turn boundaries within a session. You read it with `get()` and change it with `update()`.

## Define the glossary slot

```ts title="agent/lib/glossary.ts"
import { defineState } from "eve/context";

export interface Glossary {
  readonly terms: Readonly<Record<string, string>>;
}

export const glossary = defineState<Glossary>("analytics.glossary", () => ({
  terms: {},
}));
```

## Tools to read and write it

```ts title="agent/tools/define_metric.ts"
import { defineTool } from "eve/tools";
import { z } from "zod";
import { glossary } from "../lib/glossary.js";

export default defineTool({
  description: "Record the team's definition of a metric so it persists across turns.",
  inputSchema: z.object({ term: z.string(), meaning: z.string() }),
  async execute({ term, meaning }) {
    glossary.update((g) => ({ terms: { ...g.terms, [term]: meaning } }));
    return glossary.get();
  },
});
```

```ts title="agent/tools/recall_metrics.ts"
import { defineTool } from "eve/tools";
import { z } from "zod";
import { glossary } from "../lib/glossary.js";

export default defineTool({
  description: "Read the team's recorded metric definitions.",
  inputSchema: z.object({}),
  async execute() {
    return glossary.get();
  },
});
```
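
The updater that `define_metric` passes to `update()` is a pure function from the old glossary to a new one. Pulled out on its own, with a hypothetical helper name (`withTerm`), it behaves like this:

```ts
interface Glossary {
  readonly terms: Readonly<Record<string, string>>;
}

// Pure updater: returns a new glossary and leaves the previous one untouched.
function withTerm(g: Glossary, term: string, meaning: string): Glossary {
  return { terms: { ...g.terms, [term]: meaning } };
}

const empty: Glossary = { terms: {} };
const one = withTerm(empty, "active customer", "purchase in the last 30 days");
const two = withTerm(one, "revenue", "net of refunds");

console.log(Object.keys(empty.terms).length); // 0
console.log(Object.keys(two.terms)); // [ 'active customer', 'revenue' ]
```

Recording the same term again overwrites its earlier meaning, which is usually what you want when a definition changes. Terms are matched by exact string, so "Active customer" and "active customer" would be stored as two separate entries unless you normalize the key first.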

## See it persist

```text
> For us, an active customer is one with a purchase in the last 30 days.
  Remember that.
  → calls define_metric("active customer", "purchase in the last 30 days")

> How many active customers do we have?
  → recalls the definition, writes the matching SQL, answers
```

The second question runs as a separate turn in the same session, yet the definition is still there. State checkpoints at step boundaries, so it's the same durability from [Step 2](./how-it-runs), now applied to your own data.

State is scoped to a session and isolated per agent, so a subagent starts with fresh state and never sees the parent's. Need to reset something each turn? Call `update(() => fresh)` in a lifecycle hook. More in [State](../guides/state).

→ Next: [Team playbooks](./team-playbooks)

Learn more: [State](../guides/state)



---
title: Run Analysis
description: Part 5 of the Build an Agent tutorial. Seed the warehouse schema into the sandbox workspace, then compute and chart beyond SQL.
---

# Run Analysis



SQL tells the analytics assistant the numbers, but a cohort curve, a forecast, or a chart needs real computation. That's what the sandbox is for. It's an isolated bash environment with a `/workspace` filesystem, and every agent gets exactly one.

This takes two pieces. First seed reference files the model can read, then compute against them.

## Seed the schema into the workspace

Mount the warehouse schema into the sandbox so the model isn't guessing at table shapes. Seeding uses the folder sandbox layout, where anything under `agent/sandbox/workspace/` lands in the live `/workspace` cwd at session bootstrap.

```text
agent/sandbox/
  workspace/
    schema.sql        ← lands at /workspace/schema.sql
    notes/grain.md    ← lands at /workspace/notes/grain.md
```

```sql
-- agent/sandbox/workspace/schema.sql
-- Reference only: table shapes the analyst can read before writing queries.
CREATE TABLE orders     (id INT, customer_id INT, amount_cents INT, created_at DATE);
CREATE TABLE customers  (id INT, name TEXT, plan TEXT, signed_up_at DATE);
```

Top-level workspace entries get advertised to the model automatically, so it knows `schema.sql` is there to read. No `agent/sandbox/sandbox.ts` required. A `workspace/` folder keeps the default sandbox and seeds your files into it.
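
The seeding layout described above amounts to a simple path mapping. The helper below is a hypothetical illustration of that mapping (`sandboxPathFor` is not an eve API), assuming only what this page states: files under `agent/sandbox/workspace/` land under `/workspace`.

```ts
// Assumed layout from this page: agent/sandbox/workspace/<path> -> /workspace/<path>.
const SEED_ROOT = "agent/sandbox/workspace/";

// Hypothetical helper: returns the sandbox path for a seed file, or null if
// the file is outside the seed folder.
function sandboxPathFor(repoPath: string): string | null {
  if (!repoPath.startsWith(SEED_ROOT)) return null;
  return `/workspace/${repoPath.slice(SEED_ROOT.length)}`;
}

console.log(sandboxPathFor("agent/sandbox/workspace/schema.sql")); // /workspace/schema.sql
console.log(sandboxPathFor("agent/sandbox/workspace/notes/grain.md")); // /workspace/notes/grain.md
console.log(sandboxPathFor("agent/tools/run_sql.ts")); // null
```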

## Compute and chart in the sandbox

The built-in `bash`, `read_file`, and `write_file` tools already target the sandbox. When you write your own analysis steps, grab a live handle with `ctx.getSandbox()`:

```ts title="agent/tools/chart_series.ts"
import { defineTool } from "eve/tools";
import { z } from "zod";

export default defineTool({
  description:
    "Plot a time series to a PNG in the workspace. Pass {date, value} points; " +
    "returns the chart path.",
  inputSchema: z.object({
    title: z.string(),
    points: z.array(z.object({ date: z.string(), value: z.number() })),
  }),
  async execute({ title, points }, ctx) {
    const sandbox = await ctx.getSandbox();
    await sandbox.writeTextFile({
      path: "analysis/series.json",
      content: JSON.stringify({ title, points }),
    });
    await sandbox.writeTextFile({
      path: "analysis/plot.py",
      content: [
        "import json, matplotlib",
        "matplotlib.use('Agg')",
        "import matplotlib.pyplot as plt",
        "d = json.load(open('series.json'))",
        "plt.plot([p['date'] for p in d['points']], [p['value'] for p in d['points']])",
        "plt.title(d['title']); plt.savefig('chart.png')",
      ].join("\n"),
    });
    const root = sandbox.resolvePath("analysis");
    await sandbox.run({ command: `cd ${JSON.stringify(root)} && python plot.py` });
    return { chart: `${root}/chart.png` };
  },
});
```

This tool shells out to `python` with matplotlib, which the sandbox base image does not preinstall. Install the runtime in sandbox bootstrap (or bake it into a custom image) so `python plot.py` resolves. See [Sandbox](../sandbox) for where bootstrap runs.
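
One detail in `chart_series` worth a second look is the quoting. `JSON.stringify(root)` produces a double-quoted string, and POSIX shells still expand `$` and backticks inside double quotes, so it holds up for simple paths only. If a path could ever contain those characters, single-quote escaping is the safer pattern. `shellQuote` below is a hypothetical helper, not part of eve:

```ts
// Wrap a value in single quotes for a POSIX shell. The only character that
// needs handling inside single quotes is the single quote itself.
function shellQuote(value: string): string {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

console.log(shellQuote("/workspace/analysis")); // '/workspace/analysis'
console.log(shellQuote("it's $HOME")); // 'it'\''s $HOME'
```

With this helper, the command in the tool would be built as `cd ${shellQuote(root)} && python plot.py`.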

Now ask for something past plain SQL. If you skipped Step 4, this still works against the Step 3 sample dataset:

```text
Plot total order revenue per day.
```

The model queries for the numbers (the warehouse from Step 4, or the sample dataset if you skipped it), checks `schema.sql` to get the grain right, then calls `chart_series` to render the PNG in `/workspace`.

## Secrets stay out of the sandbox

The sandbox has no `process.env` and no access to your app's secrets. Your warehouse token lives in the app runtime, and firewall brokering is the only path it takes to the warehouse host. It never enters the sandbox process.

The local backend runs the sandbox on your laptop during `eve dev`; on Vercel it runs on Vercel Sandbox. Lifecycle, backends, and network policy are in [Sandbox](../sandbox).

→ Next: [Remember definitions](./remember-definitions)

Learn more: [Sandbox](../sandbox)



---
title: Ship It
description: Part 9 of the Build an Agent tutorial. Put a web dashboard on the agent with useEveAgent, replace placeholderAuth, and deploy to Vercel.
---

# Ship It



The analytics assistant runs fine in the TUI. Now ship it for real, as a web dashboard your team logs into, behind actual auth, deployed on Vercel. There are three pieces to wire up: a React UI, the channel's auth, and the deploy itself.

## Add the Web Chat app

Step 1 scaffolded the agent without a web frontend. Add one now with `eve channels add`, run from the `analytics-assistant/` directory:

```bash
npx eve channels add web
```

This adds a Next.js app (`next.config.ts`, `app/page.tsx`, `app/_components/`) wired to the existing eve channel, plus the chat UI components and their dependencies. Run `npm install` afterward to install the added packages. The generated `next.config.ts` wraps your config with `withEve`, which wires the eve routes automatically:

```ts title="next.config.ts"
import type { NextConfig } from "next";
import { withEve } from "eve/next";

const nextConfig: NextConfig = {};

export default withEve(nextConfig);
```

## A dashboard with `useEveAgent`

The dashboard talks to the built-in eve HTTP channel (`agent/channels/eve.ts`). On the browser side, `useEveAgent` handles session creation, streaming, and HITL. The scaffold renders its chat from `app/_components/agent-chat.tsx`, mounted by `app/page.tsx`. That component is fuller than you need to start, so replace its contents with this minimal version:

```tsx title="app/_components/agent-chat.tsx"
"use client";

import { useEveAgent } from "eve/react";

export function AgentChat() {
  const agent = useEveAgent();
  const isBusy = agent.status === "submitted" || agent.status === "streaming";

  return (
    <form
      onSubmit={(event) => {
        event.preventDefault();
        const data = new FormData(event.currentTarget);
        const message = String(data.get("q") ?? "").trim();
        if (message) void agent.send({ message });
      }}
    >
      {agent.data.messages.map((message) => (
        <article key={message.id}>
          <header>{message.role}</header>
          {message.parts.map((part, index) =>
            part.type === "text" ? <p key={index}>{part.text}</p> : null,
          )}
        </article>
      ))}
      <input name="q" disabled={isBusy} placeholder="Ask about the data…" />
      <button type="submit" disabled={isBusy}>
        Ask
      </button>
    </form>
  );
}
```

The generated `app/page.tsx` already imports and renders this `AgentChat` export, so no other wiring is needed:

```tsx title="app/page.tsx"
import { AgentChat } from "@/app/_components/agent-chat";

export default function Page() {
  return <AgentChat />;
}
```

`agent.data.messages` and `agent.status` cover most chat UIs. The hook also surfaces HITL prompts (the spend approval from [Step 8](./guard-the-spend)), so the dashboard can render approve/deny controls. For the full API, see [Frontend](../guides/frontend/overview).

## Replace `placeholderAuth`

The scaffold's channel ships with `placeholderAuth()`, which fails closed. It rejects production traffic so an unauthenticated app can't go live by accident. Swap it for your app's real auth before you deploy.

Your auth lives in one module that turns a request into a user. Create `agent/lib/auth.ts` and wire your real provider (a cookie session, Auth.js, Clerk) in here. The stub below returns a fixed user so the page compiles and runs end to end:

```ts title="agent/lib/auth.ts"
export interface AppUser {
  id: string;
  team: string;
}

// Replace with your real session/provider lookup.
export async function authenticate(_request: Request): Promise<AppUser | null> {
  return { id: "demo-user", team: "growth" };
}
```

Now point the channel at it. Replace the contents of `agent/channels/eve.ts`, which Step 7 left with a dev-only `devTeam` entry and `placeholderAuth()`. List your app auth first, ahead of the catch-all helpers, so any entry that doesn't recognize the caller falls through to the next one:

```ts title="agent/channels/eve.ts"
import { eveChannel } from "eve/channels/eve";
import { localDev, vercelOidc, type AuthFn } from "eve/channels/auth";
import { authenticate } from "../lib/auth.js";

const appAuth: AuthFn<Request> = async (request) => {
  const user = await authenticate(request); // your cookie/session/provider
  if (!user) return null;
  return {
    attributes: { team: user.team }, // the claim Step 7's playbook reads
    principalType: "user",
    principalId: user.id,
    authenticator: "app",
    issuer: "analytics-dashboard",
  };
};

export default eveChannel({
  auth: [appAuth, localDev(), vercelOidc()],
});
```

That `team` attribute is exactly what the dynamic playbook in [Step 7](./team-playbooks) reads from `ctx.session.auth`. Identity is set in this one place and flows out to every capability from there.
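
The fall-through order of the `auth` array can be modeled as "first non-null result wins". The sketch below uses simplified, synchronous stand-ins. `Principal`, `AuthEntry`, `resolvePrincipal`, and the `x-user` header are assumptions made for illustration, not eve's actual types or behavior.

```ts
// Hypothetical, synchronous stand-ins for the channel's auth entries.
interface Principal {
  principalId: string;
  authenticator: string;
}

type AuthEntry = (headers: Record<string, string>) => Principal | null;

// Try each entry in order; the first non-null result wins.
function resolvePrincipal(
  entries: AuthEntry[],
  headers: Record<string, string>,
): Principal | null {
  for (const entry of entries) {
    const principal = entry(headers);
    if (principal) return principal;
  }
  return null;
}

// Illustration only: never trust a plain header like this in real auth.
const appAuth: AuthEntry = (h) =>
  h["x-user"] ? { principalId: h["x-user"], authenticator: "app" } : null;
const fallback: AuthEntry = () => ({ principalId: "local", authenticator: "local-dev" });

console.log(resolvePrincipal([appAuth, fallback], { "x-user": "ada" })?.authenticator); // app
console.log(resolvePrincipal([appAuth, fallback], {})?.authenticator); // local-dev
```

Order matters here: if the catch-all came first, it would claim every request and `appAuth` would never run. That is why the app's own auth sits at the front of the array.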

## Deploy to Vercel

```bash
vercel deploy
```

On Vercel, the web app stays public and the eve runtime sits behind it on the same origin, with the sandbox running on Vercel Sandbox. You can smoke-test the deployment without leaving the CLI:

```bash
npx eve dev https://your-analytics-app.vercel.app
```

That's the full assistant, deployed and authed. It queries the warehouse, runs analysis in a sandbox, charts the results, remembers your team's definitions, loads the right playbook per team, and asks before it spends.

## What you learned

Across the nine steps you built and shipped one agent, and along the way you used:

* **Tools** to give the model typed actions (`run_sql`, `chart_series`, `define_metric`).
* **Connections** to reach a warehouse over an OAuth MCP, with per-user tokens eve resolves for you.
* **The sandbox** to compute and chart beyond SQL in an isolated `/workspace`.
* **State** (`defineState`) to remember the team's glossary across turns.
* **Dynamic skills** (`defineDynamic`) to load the right team playbook per caller.
* **Human-in-the-loop** approval (`needsApproval`) to gate expensive queries.
* **Channel auth** to turn a request into an authenticated principal.
* **Deployment** to Vercel, with the runtime behind your web app.

## Next steps

* [Connections](../connections) for tool allowlists and per-connection approval.
* [Sandbox](../sandbox) for backends, lifecycle, and network policy.
* [Dynamic capabilities](../guides/dynamic-capabilities) for schema-derived dynamic tools, a read-only analyst subagent, and model-authored report workflows on this same example.
* [Auth and route protection](../guides/auth-and-route-protection) for production auth patterns.

Learn more: [Frontend](../guides/frontend/overview) · [Auth and route protection](../guides/auth-and-route-protection) · [Deployment](../guides/deployment)



---
title: Team Playbooks
description: Part 7 of the Build an Agent tutorial. Load the caller's team playbook with a dynamic skill keyed on the principal.
---

# Team Playbooks



The glossary from [Step 6](./remember-definitions) is per-session. But your teams have standing analysis conventions for the analytics assistant (Growth runs cohort retention a particular way, Finance has its own revenue-recognition rules), and those shouldn't bleed across tenants. Load the right team's playbook for whoever is asking.

A skill is an on-demand procedure. The model pulls it in with `load_skill` only when a turn needs it. Make it dynamic and the skill gets decided at runtime instead of baked in. A `defineDynamic` resolver reads the session and returns a `defineSkill` (or nothing). Here you key that decision on the caller's identity in `ctx.session.auth`.

## A playbook per principal

`ctx.session.auth.current` holds the most recent caller, or `null` if there isn't one. Its `attributes` are the claims your auth layer stamped on, including the team. Read the team, look up that team's playbook, and emit a skill for it:

```ts title="agent/skills/team-playbook.ts"
import { defineDynamic, defineSkill } from "eve/skills";

const PLAYBOOKS: Record<string, { title: string; markdown: string }> = {
  growth: {
    title: "Growth analysis playbook",
    markdown:
      "When analyzing retention, use weekly cohorts anchored on signup week, " +
      "report curves not point estimates, and exclude trial accounts.",
  },
  finance: {
    title: "Finance analysis playbook",
    markdown:
      "Report revenue net of refunds and recognized over the subscription term. " +
      "Always reconcile against the close-of-month snapshot.",
  },
};

export default defineDynamic({
  events: {
    "session.started": async (_event, ctx) => {
      const team = ctx.session.auth.current?.attributes.team;
      const key = Array.isArray(team) ? team[0] : team;
      const playbook = key ? PLAYBOOKS[key] : undefined;
      if (!playbook) return null;

      return defineSkill({
        description:
          `Use when answering analysis questions for the ${key} team. ` +
          `Contains that team's standing conventions.`,
        markdown: `# ${playbook.title}\n\n${playbook.markdown}`,
      });
    },
  },
});
```

`session.started` fires once per session. The resolver reads the team once, and the resulting skill stays available for every turn that follows. Returning `null` produces no skill, so a caller with no team gets no playbook.
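
The claim handling inside the resolver is easy to test in isolation. Here it is as a standalone function, with hypothetical names (`Claim`, `PLAYBOOK_KEYS`, `playbookKeyFor`) and a `Set` standing in for the `PLAYBOOKS` record:

```ts
// Claims may arrive as a single string or a list; this mirrors the resolver's handling.
type Claim = string | string[] | undefined;

const PLAYBOOK_KEYS = new Set(["growth", "finance"]);

// Hypothetical helper: returns the playbook key, or null when nothing matches.
function playbookKeyFor(team: Claim): string | null {
  const key = Array.isArray(team) ? team[0] : team;
  return key !== undefined && PLAYBOOK_KEYS.has(key) ? key : null;
}

console.log(playbookKeyFor("growth")); // growth
console.log(playbookKeyFor(["finance", "growth"])); // finance
console.log(playbookKeyFor("sales")); // null
console.log(playbookKeyFor(undefined)); // null
```

The `Set` is a deliberate choice. Indexing a plain object with an arbitrary claim can match inherited properties such as `constructor`, so if you keep a record like `PLAYBOOKS`, checking `Object.hasOwn(PLAYBOOKS, key)` before the lookup avoids that edge case.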

## See it route

The team comes from authenticated claims, which the auth layer stamps on in [Step 9](./ship-it). Until then `ctx.session.auth.current` has no `team`, so the resolver returns `null` and no playbook loads. To verify routing now, stamp a team in local dev. Add a dev-only entry to `agent/channels/eve.ts` ahead of `localDev()`, and remove it before Step 9 wires real auth:

```ts title="agent/channels/eve.ts"
import { eveChannel } from "eve/channels/eve";
import { localDev, placeholderAuth, vercelOidc, type AuthFn } from "eve/channels/auth";

// Dev-only: stamp a team so Step 7's playbook resolver has something to read.
// Remove before Step 9.
const devTeam: AuthFn<Request> = () =>
  process.env.NODE_ENV === "production"
    ? null
    : {
        attributes: { team: "growth" },
        authenticator: "dev-team",
        principalId: "dev",
        principalType: "user",
      };

export default eveChannel({
  auth: [devTeam, localDev(), vercelOidc(), placeholderAuth()],
});
```

Restart with `npm run dev` and ask "what's our 8-week retention?" The model sees the Growth playbook fits, calls `load_skill`, and applies the Growth conventions to that turn (weekly cohorts, no trial accounts). Switch `team` to `"finance"`, restart, and the same question routes to Finance's playbook instead.

Because the team comes from authenticated claims, not from the message, one tenant can't borrow another's playbook through the message content.

The same `defineDynamic` resolver drives dynamic tools and instructions too. For the full mechanism, see [Dynamic capabilities](../guides/dynamic-capabilities).

→ Next: [Guard the spend](./guard-the-spend)

Learn more: [Skills](../skills) · [Dynamic capabilities](../guides/dynamic-capabilities)



---
title: Continuations
description: Persist and resume eve client sessions with continuation tokens, session IDs, and stream cursors.
---

# Continuations



Every eve client turn returns two handles, and mixing them up is a common mistake. The TypeScript client tracks both for you:

* `continuationToken`: the resume handle. Use it to send the next user turn.
* `sessionId`: the stream-and-inspect handle. Use it to attach to event history.

`ClientSession` also tracks `streamIndex`, the count of events already consumed. Together these three fields make a `SessionState`.

## Read and persist state

After a streamed turn finishes, read `session.state`:

```ts
const session = client.session();

const response = await session.send("Create a launch checklist.");
await response.result();

await saveSessionState(session.state);
```

Store the full state object:

```ts
interface SessionState {
  continuationToken?: string;
  sessionId?: string;
  streamIndex: number;
}
```

The continuation token resumes the conversation. The session ID and stream index let the client reconnect to the right stream position without replaying events it already consumed.
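
`saveSessionState` and `loadSessionState` in these examples are placeholders you supply. A minimal sketch of the serialization half, assuming hypothetical `serializeSessionState` and `parseSessionState` helpers and a store that holds strings:

```ts
// Mirrors the SessionState shape shown above.
interface SessionState {
  continuationToken?: string;
  sessionId?: string;
  streamIndex: number;
}

// Hypothetical helper: turn the state into a string for any key-value store.
function serializeSessionState(state: SessionState): string {
  return JSON.stringify(state);
}

// Hypothetical helper: parse stored JSON and reject values that cannot be a usable state.
function parseSessionState(json: string): SessionState {
  const value: unknown = JSON.parse(json);
  if (typeof value !== "object" || value === null) {
    throw new Error("Stored session state is not an object");
  }
  const record = value as Record<string, unknown>;
  const streamIndex = record.streamIndex;
  if (typeof streamIndex !== "number" || !Number.isInteger(streamIndex) || streamIndex < 0) {
    throw new Error("streamIndex must be a non-negative integer");
  }
  // Copy the optional handles only when they are strings.
  const state: SessionState = { streamIndex };
  if (typeof record.continuationToken === "string") {
    state.continuationToken = record.continuationToken;
  }
  if (typeof record.sessionId === "string") {
    state.sessionId = record.sessionId;
  }
  return state;
}
```

Validating on load keeps a corrupted or truncated record from reaching `client.session()`.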

## Resume a saved session

Pass the saved state back into `client.session()`:

```ts
import type { SessionState } from "eve/client";

const saved = (await loadSessionState()) as SessionState;
const session = client.session(saved);

const response = await session.send("Now shorten it.");
const result = await response.result();
console.log(result.message);
```

If all you have is a continuation token, pass it as shorthand:

```ts
const session = client.session(continuationToken);
const response = await session.send("Continue where we left off.");
await response.result();
```

The shorthand can send a follow-up, but it doesn't know the previous stream cursor. Prefer full `SessionState` when you control persistence.

## Waiting, completed, and failed sessions

When a turn ends with `session.waiting`, the client preserves the state so the next send continues the conversation.

When a turn ends with `session.completed` or `session.failed`, the client resets its local state. The next send starts a fresh durable session:

```ts
const response = await session.send("Do this one-shot task.");
const result = await response.result();

if (result.status === "completed") {
  // session.state is now a fresh cursor: { streamIndex: 0 }
}
```

This matches the runtime contract, where only waiting sessions can accept the next user input.
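
The rule can be written as a small pure function. This is a model of the documented behavior under a hypothetical name, `stateAfterTurn`, not the client's internal code:

```ts
// Mirrors the SessionState shape from this page.
interface SessionState {
  continuationToken?: string;
  sessionId?: string;
  streamIndex: number;
}

type TurnStatus = "waiting" | "completed" | "failed";

// Models the documented rule only; the real client applies it internally.
function stateAfterTurn(current: SessionState, status: TurnStatus): SessionState {
  // Only a waiting session keeps its handles; terminal turns start over.
  return status === "waiting" ? current : { streamIndex: 0 };
}

const live: SessionState = { continuationToken: "token", sessionId: "session", streamIndex: 12 };
console.log(stateAfterTurn(live, "waiting").streamIndex); // 12
console.log(stateAfterTurn(live, "completed").streamIndex); // 0
```

A persistence layer can follow the same rule: keep the stored state while the status is `"waiting"`, and delete it once the session completes or fails.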

## Multiple sessions

Create a separate `ClientSession` per conversation:

```ts
const research = client.session();
const support = client.session();

const researchResponse = await research.send("Research competitors.");
await researchResponse.result();

const supportResponse = await support.send("Draft a support reply.");
await supportResponse.result();

await save("research", research.state);
await save("support", support.state);
```

The shared `Client` only owns host, auth, headers, and reconnect settings. Conversation state lives on each `ClientSession`.
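
The `save` helper above is left to the application. One possible shape, sketched as a hypothetical in-memory `SessionStore` keyed by conversation name (swap the `Map` for a database or key-value service in practice):

```ts
// Mirrors the SessionState shape from this page.
interface SessionState {
  continuationToken?: string;
  sessionId?: string;
  streamIndex: number;
}

// Hypothetical store: one saved state per conversation key.
class SessionStore {
  private readonly states = new Map<string, SessionState>();

  save(conversationId: string, state: SessionState): void {
    // Copy so later changes to the caller's object don't leak into the store.
    this.states.set(conversationId, { ...state });
  }

  load(conversationId: string): SessionState | undefined {
    const state = this.states.get(conversationId);
    return state === undefined ? undefined : { ...state };
  }
}

const store = new SessionStore();
store.save("research", { sessionId: "session-a", streamIndex: 7 });
store.save("support", { sessionId: "session-b", streamIndex: 3 });
console.log(store.load("research")?.streamIndex, store.load("support")?.streamIndex);
```

Keeping one entry per conversation mirrors the client's own split: shared transport settings on `Client`, cursor state on each `ClientSession`.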

## Reconnect an existing stream

When a session already has a `sessionId`, `session.stream()` reattaches to its stream from the saved cursor. Resuming a saved `SessionState` after a restart is the common reason to do this:

```ts
const session = client.session(savedState);

for await (const event of session.stream()) {
  console.log(event.type);
}
```

`stream()` attaches to an existing run; to send new user input, use `send()`. To override the cursor with `startIndex`, or to read the full reconnection model, see [Streaming](./streaming#open-a-stream-manually).

## What to read next

* [Streaming](./streaming): stream events and reconnect by index
* [Sessions, runs & streaming](../../concepts/sessions-runs-and-streaming): the raw HTTP contract
* [eve channel](../../channels/eve): where continuation tokens come from



---
title: Messages
description: Send text, full turn payloads, client context, attachments, and HITL responses with eve/client.
---

# Messages



`ClientSession` sends one turn at a time. A fresh session starts on the first send; later sends continue the same conversation as long as the previous turn left the session waiting.

## Send text

Pass a string to `send()` for plain text:

```ts
import { Client } from "eve/client";

const client = new Client({ host: "http://127.0.0.1:3000" });
const session = client.session();

const response = await session.send("What is the weather in Brooklyn?");

// Metadata is available as soon as the POST succeeds.
console.log(response.sessionId, response.continuationToken);

const result = await response.result();
console.log(result.status, result.message);
```

`response.result()` consumes the event stream and returns a `MessageResult`:

| Field       | Meaning                                                                        |
| ----------- | ------------------------------------------------------------------------------ |
| `message`   | Final assistant text for the turn, when one completed.                         |
| `status`    | `"waiting"`, `"completed"`, or `"failed"`.                                     |
| `events`    | All stream events observed during the turn.                                    |
| `sessionId` | Session ID for streaming and inspection.                                       |
| `data`      | Structured output when the turn requested an [output schema](./output-schema). |

When the stream includes `session.failed`, the turn returns `status: "failed"` rather than throwing. Transport and route errors throw `ClientError`.
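
Because a failed turn is a value rather than an exception, callers branch on `status`. A minimal sketch, assuming a local `MessageResultLike` stand-in for the two fields it reads and a hypothetical `summarizeResult` helper:

```ts
// Local stand-in for the fields used here; the real type comes from eve/client.
interface MessageResultLike {
  status: "waiting" | "completed" | "failed";
  message?: string;
}

// Hypothetical helper: turn a settled turn into a one-line log entry.
function summarizeResult(result: MessageResultLike): string {
  switch (result.status) {
    case "waiting":
      return `reply: ${result.message ?? ""} (session accepts another turn)`;
    case "completed":
      return `done: ${result.message ?? ""}`;
    case "failed":
      return "turn failed; inspect the collected events for session.failed";
  }
}

console.log(summarizeResult({ status: "waiting", message: "Sunny, 72F" }));
console.log(summarizeResult({ status: "failed" }));
```

Wrap the `send()` and `result()` calls in `try`/`catch` separately if you also need to handle `ClientError` from transport or route failures.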

## Send a full turn payload

Pass an object to `send()` when you need more than plain text:

```ts
const response = await session.send({
  message: "What should I do on this screen?",
  clientContext: {
    route: "/billing",
    plan: "pro",
    seatsUsed: 4,
  },
});

await response.result();
```

`clientContext` is one-turn context for the next model call. Strings become user-role context messages, arrays of strings become multiple context messages, and objects are JSON-serialized into one context message. It isn't persisted to durable session history and doesn't dispatch a turn by itself.
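
The three conversion rules can be pictured with a small function. This is an illustration of the mapping described above, not eve's implementation; `contextToMessages` and the `ContextMessage` shape are assumptions:

```ts
type ClientContext = string | string[] | Record<string, unknown>;

// Assumed message shape for illustration only.
interface ContextMessage {
  role: "user";
  content: string;
}

function contextToMessages(context: ClientContext): ContextMessage[] {
  if (typeof context === "string") {
    // A string becomes one user-role context message.
    return [{ role: "user", content: context }];
  }
  if (Array.isArray(context)) {
    // Each string in an array becomes its own context message.
    return context.map((content): ContextMessage => ({ role: "user", content }));
  }
  // An object is JSON-serialized into a single context message.
  return [{ role: "user", content: JSON.stringify(context) }];
}

console.log(contextToMessages({ route: "/billing", plan: "pro" }));
```

Under this model, the `clientContext` object in the example above reaches the model as one JSON string alongside the user's message.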

## Send attachments

`send()` accepts AI SDK `UserContent`, so a message can mix text and file parts:

```ts
const response = await session.send({
  message: [
    { type: "text", text: "Summarize this report." },
    {
      type: "file",
      data: reportDataUrl,
      mediaType: "application/pdf",
      filename: "report.pdf",
    },
  ],
});

await response.result();
```

For local files, read the file and send a base64 `data:` URL:

```ts
import { readFile } from "node:fs/promises";

const bytes = await readFile("report.pdf");
const reportDataUrl = `data:application/pdf;base64,${bytes.toString("base64")}`;

const response = await session.send({
  message: [
    { type: "text", text: "Summarize this report." },
    {
      type: "file",
      data: reportDataUrl,
      mediaType: "application/pdf",
      filename: "report.pdf",
    },
  ],
});

await response.result();
```

## Answer human input requests

Tools can pause for approval or ask the user a question. The stream emits `input.requested` with one or more requests. Reply through the same session with `inputResponses`:

```ts
import type { InputRequest } from "eve/client";

let pendingRequests: readonly InputRequest[] = [];

const response = await session.send("Run the deployment checks.");

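// Drain the stream and remember any approval or question requests it raised.
for await (const event of response) {
  if (event.type === "input.requested") {
    pendingRequests = event.data.requests;
  }
}

// Reply on the same session, choosing the "approve" option for every request.
const resumed = await session.send({
  inputResponses: pendingRequests.map((request) => ({
    requestId: request.requestId,
    optionId: "approve",
  })),
});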

await resumed.result();
```

You can send `message`, `inputResponses`, and `clientContext` together when the resumed turn needs both a human answer and follow-up text.
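
When requests need different answers, the mapping step can take a decision callback. A minimal sketch, assuming a hypothetical `answerRequests` helper, a local `InputRequestLike` type that carries only `requestId`, and a `"deny"` option id that your tool may or may not offer:

```ts
// Local stand-ins; the real InputRequest type comes from eve/client.
interface InputRequestLike {
  requestId: string;
}

interface InputResponse {
  requestId: string;
  optionId: string;
}

// Hypothetical helper: build one response per pending request.
function answerRequests(
  requests: readonly InputRequestLike[],
  decide: (request: InputRequestLike) => string,
): InputResponse[] {
  return requests.map((request) => ({
    requestId: request.requestId,
    optionId: decide(request),
  }));
}

const responses = answerRequests(
  [{ requestId: "req_1" }, { requestId: "req_2" }],
  // "deny" is an assumed option id; use the ids your requests actually list.
  (request) => (request.requestId === "req_1" ? "approve" : "deny"),
);
console.log(responses);
```

The returned array has the shape passed as `inputResponses` in the example above.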

## Single-use responses

`MessageResponse` is single-use. Either aggregate it:

```ts
const result = await response.result();
```

Or stream it:

```ts
for await (const event of response) {
  console.log(event.type);
}
```

Don't do both on the same response. Once the stream is consumed, the `ClientSession` advances its cursor for the next turn.

## What to read next

* [Continuations](./continuations): how the session cursor advances
* [Streaming](./streaming): handle events live instead of using `result()`
* [Tools](../../tools): configure approvals and question prompts



---
title: Output Schema
description: Request structured results from eve client turns and read typed data from MessageResult.
---

# Output Schema



Pass `outputSchema` on a client turn when the caller needs structured data instead of only assistant text. The runtime makes the model satisfy the schema before the turn settles, then emits the final payload as `result.completed`.

## JSON Schema

Raw JSON Schema objects work directly:

```ts
import { Client } from "eve/client";

interface Summary {
  title: string;
  count: number;
}

const outputSchema = {
  type: "object",
  properties: {
    title: { type: "string" },
    count: { type: "integer" },
  },
  required: ["title", "count"],
} as const;

const client = new Client({ host: "http://127.0.0.1:3000" });
const session = client.session();

const response = await session.send<Summary>({
  message: "Summarize this turn.",
  outputSchema,
});

const result = await response.result();

console.log(result.data?.title);
console.log(result.data?.count);
```

`result.data` is `undefined` when the turn did not produce a structured result.

## Standard Schema

The client also accepts Standard Schema implementations such as Zod, Valibot, and ArkType. The schema is lowered to JSON Schema before the request is sent:

```ts
import { z } from "zod";

const summarySchema = z.object({
  title: z.string(),
  count: z.number().int(),
});

type Summary = z.infer<typeof summarySchema>;

const response = await session.send<Summary>({
  message: "Summarize this turn.",
  outputSchema: summarySchema,
});

const { data } = await response.result();
```

The server is authoritative for validation. The client types `MessageResult.data` from your generic and schema, but it doesn't revalidate the streamed payload client-side.
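
If a caller wants a second check before trusting the payload, a hand-written type guard is one option. A minimal sketch for the `Summary` shape used on this page; `isSummary` is a hypothetical helper, not part of the client:

```ts
interface Summary {
  title: string;
  count: number;
}

// Hypothetical guard: true only when the value has a string title and an integer count.
function isSummary(value: unknown): value is Summary {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return (
    typeof record.title === "string" &&
    typeof record.count === "number" &&
    Number.isInteger(record.count)
  );
}

const payload: unknown = { title: "Launch review", count: 3 };
if (isSummary(payload)) {
  console.log(payload.title, payload.count);
}
```

With a Zod schema, `summarySchema.safeParse(result.data)` gives the same kind of check without a hand-written guard.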

## Stream the result event

If you consume events manually, read `result.completed`:

```ts
const response = await session.send<Summary>({
  message: "Summarize this turn.",
  outputSchema,
});

for await (const event of response) {
  if (event.type === "result.completed") {
    const summary = event.data.result as Summary;
    console.log(summary.title);
  }
}
```

If more than one `result.completed` appears in the consumed event list, `result()` returns the most recent one as `data`.
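
The same "last one wins" rule is easy to reproduce over a stored event list. A minimal sketch, assuming a hypothetical `latestResult` helper and a local `StreamEventLike` type with only the fields it reads:

```ts
// Local stand-in for stream events; only `type` and `data.result` are used.
interface StreamEventLike {
  type: string;
  data?: { result?: unknown };
}

// Hypothetical helper: return the payload of the most recent result.completed event.
function latestResult(events: Iterable<StreamEventLike>): unknown {
  let latest: unknown = undefined;
  for (const event of events) {
    if (event.type === "result.completed") {
      latest = event.data?.result;
    }
  }
  return latest;
}

const events: StreamEventLike[] = [
  { type: "result.completed", data: { result: { title: "Draft", count: 1 } } },
  { type: "message.completed" },
  { type: "result.completed", data: { result: { title: "Final", count: 2 } } },
];
console.log(latestResult(events));
```

Here the second `result.completed` payload is the one returned.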

## Send payloads with output schema

`outputSchema` works with string shorthand and object-form sends. Use object form when you need schema, headers, signal, context, attachments, or HITL responses:

```ts
const response = await session.send<Summary>({
  message: "Summarize this PDF.",
  clientContext: { reportId: "rpt_123" },
  outputSchema,
});

const result = await response.result();
```

It also works on follow-up turns and HITL response turns:

```ts
const response = await session.send({
  inputResponses: [{ requestId, optionId: "approve" }],
  message: "Return the approved action as structured output.",
  outputSchema,
});

const result = await response.result();
```

## Per-turn scope

Client `outputSchema` is scoped to the turn that sends it. It doesn't become a permanent setting for the conversation:

```ts
const response = await session.send({ message: "Return a structured summary.", outputSchema });
await response.result();

const followUpResponse = await session.send("Now answer normally.");
const followUp = await followUpResponse.result();

console.log(followUp.data); // undefined unless this turn also requested a schema
```

For task-mode output that belongs to the agent or subagent definition itself, see [`agent.ts`](../../agent-config#outputschema) and [Subagents](../../subagents).

## What to read next

* [Messages](./messages): send turns with `send()`
* [Streaming](./streaming): handle `result.completed` live
* [`agent.ts`](../../agent-config#outputschema): configured task-mode output



---
title: TypeScript SDK Overview
description: Call an eve agent from TypeScript with Client, sessions, auth, and health checks.
---

# TypeScript SDK Overview



The `eve/client` entrypoint is the typed client for eve's default HTTP API. Use it from scripts, server-to-server integrations, tests, evals, backend jobs, or custom UIs that want the session protocol without hand-writing the POST and NDJSON (newline-delimited JSON) stream loop.

For browser chat UIs, start with [`useEveAgent`](../frontend/overview). For wire-level details, read [Sessions, runs & streaming](../../concepts/sessions-runs-and-streaming). The client sits between those two: lower level than the frontend hooks, higher level than raw HTTP.

## Create a client

A `Client` binds one host, auth policy, header policy, and stream reconnection budget:

```ts
import { Client } from "eve/client";

const client = new Client({
  host: "http://127.0.0.1:3000",
});
```

`host` is the origin where the eve routes are mounted. In a same-origin browser integration this is often `""`; scripts and backend services usually name the full URL.

## Check health

Use `health()` when a script needs to fail early before creating a session:

```ts
const health = await client.health();
console.log(health.status, health.workflowId);
```

Non-2xx responses throw `ClientError`, which carries the HTTP `status` and response `body`.

## Authentication

Pass `auth` when the [eve channel](../../channels/eve) route requires credentials:

```ts
const client = new Client({
  host: "https://agent.example.com",
  auth: {
    bearer: async () => await getAccessToken(),
  },
});
```

Bearer values and Basic auth passwords can be strings or functions. Functions run before every HTTP call, including stream reconnects:

```ts
const client = new Client({
  host: "https://agent.example.com",
  auth: {
    basic: {
      username: "agent-client",
      password: async () => await getRotatingSecret(),
    },
  },
});
```

Use `headers` for route-specific credentials such as bypass tokens or tenant hints. Like `auth`, it can be static or dynamic:

```ts
const client = new Client({
  host: "https://agent.example.com",
  headers: async () => ({
    "x-vercel-protection-bypass": await getBypassToken(),
  }),
});
```

Per-request headers can be attached to an individual turn:

```ts
const response = await session.send({
  message: "Run the check.",
  headers: { "x-request-id": requestId },
});

await response.result();
```

## Sessions

Create a `ClientSession` for each conversation:

```ts
const session = client.session();
```

A client can own many sessions at once. Each session tracks its own `sessionId`, `continuationToken`, and stream cursor:

```ts
const alice = client.session();
const bob = client.session();

const aliceResponse = await alice.send("Summarize account A.");
await aliceResponse.result();

const bobResponse = await bob.send("Summarize account B.");
await bobResponse.result();
```

The next pages cover the session lifecycle:

* [Messages](./messages): send turns and collect results
* [Continuations](./continuations): persist and resume sessions
* [Streaming](./streaming): render events as they arrive
* [Output schema](./output-schema): request structured results

## What to read next

* [eve channel](../../channels/eve): the HTTP API this client calls
* [Sessions, runs & streaming](../../concepts/sessions-runs-and-streaming): the raw HTTP contract
* [Frontend](../frontend/overview): browser UI with `useEveAgent`



---
title: Streaming
description: Consume eve client stream events live, reconnect by event index, and aggregate turn results.
---

# Streaming



Every `ClientSession.send()` call posts the turn, then reads the session's NDJSON (newline-delimited JSON) event stream. `MessageResponse` gives you two ways to consume that stream: aggregate it with `result()`, or iterate it live.

## Aggregate a turn

Use `result()` when you only need the final turn summary:

```ts
const response = await session.send("Summarize the latest forecast.");
const result = await response.result();

console.log(result.status);
console.log(result.message);
console.log(result.events.length);
```

This consumes the stream until the current turn boundary:

* `session.waiting`
* `session.completed`
* `session.failed`
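
To make the boundary concrete, here is a minimal sketch of the same stop condition over an already-collected event list. `isTurnBoundaryType` and `takeTurn` are hypothetical local helpers; in client code, the `isCurrentTurnBoundaryEvent` export described under "Handle event types" is the helper to reach for:

```ts
const TURN_BOUNDARIES = ["session.waiting", "session.completed", "session.failed"] as const;
type TurnBoundary = (typeof TURN_BOUNDARIES)[number];

// Hypothetical predicate over the three boundary event types listed above.
function isTurnBoundaryType(type: string): type is TurnBoundary {
  return (TURN_BOUNDARIES as readonly string[]).includes(type);
}

// Collect events up to and including the first boundary, then stop.
function takeTurn<T extends { type: string }>(events: Iterable<T>): T[] {
  const turn: T[] = [];
  for (const event of events) {
    turn.push(event);
    if (isTurnBoundaryType(event.type)) {
      break;
    }
  }
  return turn;
}

const sample = [
  { type: "message.received" },
  { type: "message.appended" },
  { type: "session.waiting" },
  { type: "message.received" },
];
console.log(takeTurn(sample).map((event) => event.type));
```

The fourth event belongs to the next turn, so it is left unread.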

## Stream events live

Use `for await...of` when you want to render progress:

```ts
const response = await session.send("Draft a plan and show your work.");

for await (const event of response) {
  if (event.type === "message.appended") {
    process.stdout.write(event.data.messageDelta);
  }

  if (event.type === "message.completed" && event.data.finishReason !== "tool-calls") {
    console.log("\nfinal:", event.data.message);
  }
}
```

`message.appended` and `reasoning.appended` are incremental delta events. Their completed forms, `message.completed` and `reasoning.completed`, are the compatibility path for clients that don't render deltas.
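
A minimal sketch of the delta path, assuming a hypothetical `textFromDeltas` helper and a local `DeltaEventLike` type that models only `data.messageDelta`. It joins every `message.appended` delta in the list, including any text emitted before tool calls:

```ts
// Local stand-in for stream events; only the delta field is modeled.
interface DeltaEventLike {
  type: string;
  data?: { messageDelta?: string };
}

// Hypothetical helper: rebuild assistant text from incremental deltas.
function textFromDeltas(events: Iterable<DeltaEventLike>): string {
  let text = "";
  for (const event of events) {
    if (event.type === "message.appended") {
      text += event.data?.messageDelta ?? "";
    }
  }
  return text;
}

const streamed: DeltaEventLike[] = [
  { type: "message.received" },
  { type: "message.appended", data: { messageDelta: "Step 1: " } },
  { type: "message.appended", data: { messageDelta: "gather data." } },
  { type: "session.waiting" },
];
console.log(textFromDeltas(streamed)); // "Step 1: gather data."
```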

## Handle event types

Import event types from `eve/client` when you want exhaustiveness or helpers:

```ts
import type { HandleMessageStreamEvent } from "eve/client";
import { isCurrentTurnBoundaryEvent } from "eve/client";

function handleEvent(event: HandleMessageStreamEvent) {
  if (isCurrentTurnBoundaryEvent(event)) {
    console.log("turn settled:", event.type);
  }
}
```

The most common UI events are:

| Event                | Use                                                              |
| -------------------- | ---------------------------------------------------------------- |
| `message.received`   | Confirm the user message landed.                                 |
| `reasoning.appended` | Render reasoning deltas when the model provides them.            |
| `message.appended`   | Render assistant text deltas.                                    |
| `actions.requested`  | Show tool calls requested by the model.                          |
| `action.result`      | Show tool call results.                                          |
| `input.requested`    | Pause the UI for approval or a question answer.                  |
| `result.completed`   | Read structured output from an [output schema](./output-schema). |
| `session.waiting`    | Enable the composer for the next turn.                           |
| `session.completed`  | Mark the conversation terminal.                                  |
| `session.failed`     | Mark the conversation failed.                                    |

For the complete event table, see [Sessions, runs & streaming](../../concepts/sessions-runs-and-streaming).

## Reconnection

The client reconnects after transient stream disconnects. It resumes from the number of events already consumed in the current session:

```ts
const client = new Client({
  host: "https://agent.example.com",
  maxReconnectAttempts: 5,
});
```

`maxReconnectAttempts` is per turn. The default is `3`.
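
The resume-by-index idea can be sketched independently of the client. The generator below is an illustration, not the client's code: `resumeByIndex` and `OpenStream` are hypothetical names, every error is treated as transient, and there is no backoff between attempts:

```ts
// Hypothetical: opens a stream that starts at the given event index.
type OpenStream<T> = (startIndex: number) => AsyncIterable<T>;

async function* resumeByIndex<T>(
  open: OpenStream<T>,
  maxReconnectAttempts = 3,
): AsyncGenerator<T> {
  let consumed = 0; // events already delivered to the caller
  let attempts = 0; // reconnects used so far in this turn

  for (;;) {
    try {
      for await (const event of open(consumed)) {
        consumed += 1;
        yield event;
      }
      return; // stream ended normally
    } catch (error) {
      attempts += 1;
      if (attempts > maxReconnectAttempts) {
        throw error; // budget exhausted: surface the failure
      }
      // Otherwise loop and reopen from `consumed`, so nothing is replayed.
    }
  }
}

// Demo source: five numbered events, with one disconnect on the first connection.
let opens = 0;
async function* flaky(start: number): AsyncGenerator<number> {
  opens += 1;
  for (let i = start; i < 5; i++) {
    if (opens === 1 && i === 2) {
      throw new Error("disconnect");
    }
    yield i;
  }
}
```

Because the counter of consumed events survives the reconnect, the second connection starts at index 2 and the caller sees each event exactly once.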

## Open a stream manually

Use `session.stream()` when you already have a session cursor and only need to attach to the existing stream:

```ts
const session = client.session({
  continuationToken: "eve:6c8b1f2e-3d4a-4b9c-8e21-9f0a1b2c3d4e",
  sessionId: "wrun_01ARYZ6S41TSV4RRFFQ69G5FAV",
  streamIndex: 10,
});

for await (const event of session.stream()) {
  console.log(event.type);
}
```

Pass `startIndex` to override the stored cursor:

```ts
for await (const event of session.stream({ startIndex: 0 })) {
  console.log(event.type);
}
```

`stream()` throws if the session has no `sessionId`, because there's no stream to attach to before the first send.

## Abort a request

Pass an `AbortSignal` to cancel the POST or stream. Arm the timeout before awaiting `send()` so it covers the POST as well as the stream:

```ts
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 10_000);

try {
  const response = await session.send({
    message: "Run a long analysis.",
    signal: controller.signal,
  });

  for await (const event of response) {
    console.log(event.type);
  }
} finally {
  // Clear the timer whether the turn finished, failed, or was aborted.
  clearTimeout(timeout);
}
```

Once a response is aborted, create a new send for the next turn. Don't reuse the same `MessageResponse`.
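
If the same request should also stop when the caller cancels, the timeout can be combined with a second signal. A minimal sketch using only the standard `AbortController`; `timeoutSignal` is a hypothetical helper, not part of `eve/client`:

```ts
// Hypothetical helper: a signal that aborts after `ms` or when `parent` aborts.
function timeoutSignal(
  ms: number,
  parent?: AbortSignal,
): { signal: AbortSignal; cancel: () => void } {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  const onParentAbort = () => controller.abort();

  if (parent?.aborted) {
    controller.abort();
  } else {
    parent?.addEventListener("abort", onParentAbort, { once: true });
  }

  return {
    signal: controller.signal,
    // Call when the turn settles so the timer and listener don't linger.
    cancel: () => {
      clearTimeout(timer);
      parent?.removeEventListener("abort", onParentAbort);
    },
  };
}

const userStop = new AbortController();
const deadline = timeoutSignal(10_000, userStop.signal);
console.log(deadline.signal.aborted); // false
deadline.cancel();
```

Pass `deadline.signal` as `signal` on the send, and call `deadline.cancel()` in a `finally` block once the stream is done.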

## What to read next

* [Messages](./messages): the send APIs that create streams
* [Continuations](./continuations): how stream cursors are persisted
* [Output schema](./output-schema): consume `result.completed`



---
title: Next.js
description: Run an eve agent and a Next.js app as one project with withEve.
---

# Next.js



`eve/next` ships a Next.js frontend and an eve agent as a single project. Wrap your config with `withEve()` to run both from one dev server and one Vercel deploy. [`useEveAgent`](./overview) finds the mounted routes on its own, so there's no CORS to configure and no URL env vars to keep in sync.

## Prerequisites

* The `eve` package installed in your project (`npm install eve@latest`).
* An existing eve agent directory. If you don't have one, start from [Getting started](../../getting-started).
* A Next.js app to mount the agent in.

## Wrap the Next.js config

```ts title="next.config.ts"
import type { NextConfig } from "next";
import { withEve } from "eve/next";

const nextConfig: NextConfig = {};

export default withEve(nextConfig);
```

By default `withEve()` looks for an `agent/` folder inside your Next.js project root. If the agent lives somewhere else, point at it with `eveRoot`:

```ts
export default withEve(nextConfig, {
  eveRoot: "../my-agent",
});
```

### `withEve` options

All fields are optional.

| Option               | Type     | Default                | Purpose                                                                                                                                             |
| -------------------- | -------- | ---------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------- |
| `eveRoot`            | `string` | Next.js app root       | Path to the eve app root, relative to `process.cwd()` unless absolute. Set it when the agent lives outside the Next.js project.                     |
| `eveBuildCommand`    | `string` | `"eve build"`          | Build command for the generated eve Vercel service. Use it when the eve service needs project-specific prework, without changing the Next.js build. |
| `servicePrefix`      | `string` | `"/_eve_internal/eve"` | Private Vercel route namespace for the eve service. Must match the eve service's mount in your Vercel Build Output config when you set it manually. |
| `devServerTimeoutMs` | `number` | `180000`               | Maximum time to wait for the eve development server to become available.                                                                            |

For slow cold starts, increase the development timeout:

```ts
export default withEve(nextConfig, {
  devServerTimeoutMs: 300_000,
});
```

## Call the hook

With `withEve()` in `next.config.ts`, the eve routes are same-origin, so client code can call [`useEveAgent`](./overview) without naming a host. Cookie-based auth (Auth.js or any session cookie) needs no extra wiring, since the browser already sends those cookies on every eve request. For non-cookie schemes, attach the credentials yourself:

```tsx
const agent = useEveAgent({
  headers: async () => ({
    authorization: `Bearer ${await getAccessToken()}`,
  }),
});
```

The default eve channel is fail-closed. With no `agent/channels/eve.ts` authored, eve registers `eveChannel({ auth: [localDev(), vercelOidc()] })`: `localDev()` opens the routes on localhost, `vercelOidc()` admits Vercel OIDC callers in production, and everything else gets a `401`. To run your app's own auth policy, add `agent/channels/eve.ts`:

```ts title="agent/channels/eve.ts"
import { eveChannel } from "eve/channels/eve";
import { localDev, vercelOidc } from "eve/channels/auth";

export default eveChannel({ auth: [localDev(), vercelOidc()] });
```

For a public demo, use `none()` (also from `eve/channels/auth`) to skip authentication. See [Channels](../../channels/overview) and [Auth & route protection](../auth-and-route-protection).

## Dev vs deploy topology

* **Local dev.** `npm run dev` boots the eve dev server next to `next dev` and rewrites the eve routes over to it. The browser only ever talks to the Next.js origin.

* **Vercel.** The web app and the eve runtime deploy as a single project. The web app stays public; the eve runtime sits behind it on the same site origin. When the agent needs its own build step, set `eveBuildCommand`:

  ```ts
  export default withEve(nextConfig, {
    eveBuildCommand: "npm run build:eve",
  });
  ```

* **Local production build.** `next build && next start` serves the eve runtime from its built `.output/server/index.mjs` on a stable local port (`4274`) and proxies the eve routes to it. Run `eve build` first so that output exists. Change the port with `EVE_NEXT_PRODUCTION_PORT`:

  ```bash
  EVE_NEXT_PRODUCTION_PORT=5000 npm run build && npm start
  ```

* **Non-Vercel hosts.** When the eve service lives on a separate origin, tell Next.js where to find it with `EVE_NEXT_PRODUCTION_ORIGIN`:

  ```bash
  EVE_NEXT_PRODUCTION_ORIGIN=https://agent.example.com npm run build
  ```

## What to read next

* [Frontend overview](./overview): the `useEveAgent` API
* [Auth & route protection](../auth-and-route-protection)
* [Deployment](../deployment)



---
title: Nuxt
description: Run an eve agent and a Nuxt app as one project with the eve/nuxt module.
---

# Nuxt



The `eve/nuxt` module runs a Nuxt frontend and an eve agent as a single project from one dev server and one Vercel deploy. The auto-imported [`useEveAgent`](./use-eve-agent-vue) composable finds the mounted routes on its own, so there's no CORS to configure and no URL env vars to keep in sync.

## Prerequisites

* The `eve` package installed in your project (`npm install eve@latest`).
* An existing eve agent directory. If you don't have one, start from [Getting started](../../getting-started).
* A Nuxt app to mount the agent in.

## Register the module

```ts title="nuxt.config.ts"
export default defineNuxtConfig({
  modules: ["eve/nuxt"],
});
```

The module looks for an `agent/` folder in the Nuxt project root. Pass `eveRoot` when the agent lives elsewhere:

```ts
export default defineNuxtConfig({
  modules: ["eve/nuxt"],
  eve: {
    eveRoot: "../my-agent",
  },
});
```

The `eve` key accepts only two options, `eveRoot` and `eveBuildCommand`.

## Call the composable

`useEveAgent` (`eve/vue`) is auto-imported, so a component calls it without an explicit import and without naming a host:

```vue
<script setup lang="ts">
const { status, send } = useEveAgent();

const isBusy = computed(() => status.value === "submitted" || status.value === "streaming");

const message = ref("");

async function handleSubmit() {
  const text = message.value.trim();
  if (!text || isBusy.value) return;
  message.value = "";
  await send({ message: text });
}
</script>

<template>
  <form @submit.prevent="handleSubmit">
    <input v-model="message" :disabled="isBusy" />
    <button type="submit" :disabled="isBusy">Send</button>
  </form>
</template>
```

The default eve channel is fail-closed. With no `agent/channels/eve.ts` authored, eve registers `eveChannel({ auth: [localDev(), vercelOidc()] })`: `localDev()` opens the routes on localhost, `vercelOidc()` admits Vercel OIDC callers in production, and everything else gets a `401`. To run your own auth policy, add `agent/channels/eve.ts`:

```ts title="agent/channels/eve.ts"
import { eveChannel } from "eve/channels/eve";
import { localDev, vercelOidc } from "eve/channels/auth";

export default eveChannel({ auth: [localDev(), vercelOidc()] });
```

For a public demo, use `none()` (also from `eve/channels/auth`) to skip authentication. See [Channels](../../channels/overview) and [Auth & route protection](../auth-and-route-protection).

## Dev vs deploy topology

* **Local dev.** `npm run dev` starts the eve dev server next to `nuxt dev` and proxies the eve routes through it. As far as the browser knows, everything is the Nuxt origin.

* **Vercel.** A single Vercel project carries both the Nuxt app and the eve runtime. The web app stays public; the runtime sits behind it on the same origin. Set `eveBuildCommand` when the agent needs its own build step:

  ```ts
  export default defineNuxtConfig({
    modules: ["eve/nuxt"],
    eve: {
      eveBuildCommand: "npm run build:eve",
    },
  });
  ```

* **Non-Vercel hosts.** Point Nuxt at a separate eve origin with `EVE_NUXT_PRODUCTION_ORIGIN`. To override the local port (default `4274`), use `EVE_NUXT_PRODUCTION_PORT`:

  ```bash
  EVE_NUXT_PRODUCTION_ORIGIN=https://agent.example.com npm run build
  EVE_NUXT_PRODUCTION_PORT=5000 npm run build && npm run preview
  ```

## What to read next

* [`useEveAgent` (Vue)](./use-eve-agent-vue): the composable API
* [Auth & route protection](../auth-and-route-protection)
* [Deployment](../deployment)



---
title: Overview
description: Put an eve agent behind a browser chat UI with useEveAgent.
---

# Overview



The frontend helpers put a browser chat or agent UI on top of an eve agent. `useEveAgent()` opens a durable session, sends turns, streams the reply back, and turns the raw event stream into render-ready state. React is the reference implementation; [Vue](./use-eve-agent-vue) and [Svelte](./use-eve-agent-svelte) ship the same surface.

## The integration model

A browser UI is a client of the agent's HTTP routes (the [eve channel](../../channels/overview)). Two layers wire it up:

* **The framework integration** mounts the eve routes on your app's origin, so the browser never crosses a CORS boundary or reads an env var to find the agent. Pick yours: [Next.js](./nextjs) (`withEve`), [Nuxt](./nuxt) (the `eve/nuxt` module), or [SvelteKit](./sveltekit) (the `eveSvelteKit` Vite plugin). On any other stack the hook talks to same-origin `/eve/v1/*` routes directly, or you pass an explicit `host`.
* **The hook** (`useEveAgent`) holds the session state, streaming, errors, and composer status. It defaults to same-origin eve routes such as `/eve/v1/session`.

The per-framework pages below walk through the wiring step by step: [Next.js](./nextjs), [Nuxt](./nuxt), and [SvelteKit](./sveltekit).

For scripts, server-to-server calls, evals, tests, or custom clients that do not need framework UI state, use the [TypeScript SDK](../client/overview) directly.

## Basic chat (React)

The hook lives in `eve/react`. Render `data.messages`, gate the composer on `status`, and send text with `send`:

```tsx
"use client";

import { useEveAgent } from "eve/react";

export function Chat() {
  const agent = useEveAgent();
  const isBusy = agent.status === "submitted" || agent.status === "streaming";

  return (
    <form
      onSubmit={(event) => {
        event.preventDefault();
        const form = new FormData(event.currentTarget);
        const message = String(form.get("message") ?? "").trim();
        if (message.length > 0) {
          void agent.send({ message });
        }
      }}
    >
      {agent.data.messages.map((message) => (
        <article key={message.id}>
          <header>{message.role}</header>
          {message.parts.map((part, index) =>
            part.type === "text" ? <p key={index}>{part.text}</p> : null,
          )}
        </article>
      ))}
      <input name="message" disabled={isBusy} />
      <button disabled={isBusy} type="submit">
        Send
      </button>
    </form>
  );
}
```

## Returned state

`useEveAgent()` returns the current UI state plus commands:

| Field     | What it is                                                                     |
| --------- | ------------------------------------------------------------------------------ |
| `data`    | Projected UI state from the reducer. Defaults to `{ messages }`.               |
| `status`  | `"ready"`, `"submitted"`, `"streaming"`, or `"error"`. Drives the composer.    |
| `error`   | The last `Error` thrown, if any.                                               |
| `events`  | Raw eve stream events for this session.                                        |
| `session` | Serializable session cursor (`sessionId`, `continuationToken`, `streamIndex`). |
| `send`    | Send text or the full turn payload (multi-part messages, HITL responses).      |
| `stop`    | Abort the active request.                                                      |
| `reset`   | Clear local events, data, errors, and the local session cursor.                |

Most chat UIs only need `data.messages` and `status`. Drop down to `events` to render lower-level activity such as tool calls and reasoning deltas that the default reducer doesn't surface.

`data.messages` are `EveMessage[]` following the AI SDK `UIMessage` convention, so they drop straight into any AI SDK UI primitive that accepts a `UIMessage[]`. Parts include user text, assistant text, reasoning, tool calls, tool results, and input requests.
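
As a rough illustration of walking `parts`, the sketch below joins each message's text parts into a plain-text transcript. `MessageLike`, `PartLike`, `messageText`, and `transcript` are hypothetical local names; the types assume only the `id`, `role`, `type`, and `text` fields used in the chat example above and are not the library's `EveMessage` type.

```ts
// Local stand-ins (assumptions): only the fields used in the chat example above.
interface PartLike {
  readonly type: string;
  readonly text?: unknown;
}

interface MessageLike {
  readonly id: string;
  readonly role: string;
  readonly parts: readonly PartLike[];
}

/** Concatenates the text parts of one message, skipping every other part type. */
function messageText(message: MessageLike): string {
  return message.parts
    .filter((part) => part.type === "text" && typeof part.text === "string")
    .map((part) => part.text as string)
    .join("");
}

/** Renders one "role: text" line per message. */
function transcript(messages: readonly MessageLike[]): string {
  return messages.map((message) => `${message.role}: ${messageText(message)}`).join("\n");
}
```

Parts that are not plain text (reasoning, tool calls, input requests) are dropped here on purpose; a real UI would likely render them separately.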

## Sending and streaming

Pass an object to `send()` for text, multi-part messages, attachments, HITL responses, and per-turn context:

```tsx
await agent.send({ message: "Summarize this session." });

await agent.send({
  message: [
    { type: "text", text: "What is in this file?" },
    {
      type: "file",
      data: fileDataUrl, // base64 data URL
      mediaType: "application/pdf",
      filename: "report.pdf",
    },
  ],
});
```

Assistant text, reasoning, tool calls, and tool results stream into `data` as they arrive, and `status` moves from `ready` to `submitted` to `streaming` and back. Call `stop()` to abort the active request, and `reset()` to clear local state so the next send starts a fresh durable session.
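
The `fileDataUrl` value in the example above is expected to be a base64 `data:` URL. One way to build it from raw bytes is sketched below; `toDataUrl` is a hypothetical helper, not part of eve, and it assumes a runtime that provides the global `btoa`.

```ts
// Hypothetical helper (not part of eve): encode raw bytes as a base64 data URL.
function toDataUrl(bytes: Uint8Array, mediaType: string): string {
  const chunkSize = 0x8000;
  let binary = "";
  // Convert in chunks so a large file never becomes one huge argument list.
  for (let offset = 0; offset < bytes.length; offset += chunkSize) {
    binary += String.fromCharCode(...bytes.subarray(offset, offset + chunkSize));
  }
  return `data:${mediaType};base64,${btoa(binary)}`;
}
```

In a browser this could be called as `toDataUrl(new Uint8Array(await file.arrayBuffer()), file.type)` for a `File` picked by the user.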

## Human-in-the-loop prompts

Tools opt into approval with `needsApproval`, and the model can also ask a question with `ask_question`. See [Human-in-the-loop](/docs/human-in-the-loop) for the server-side model. Either way the stream emits an `input.requested` event, and the pending request rides on a `dynamic-tool` part of the latest message at `part.toolMetadata?.eve?.inputRequest`. Read it, then answer through the same session with `send()`:

```tsx
const request = agent.data.messages
  .at(-1)
  ?.parts.find((part) => part.type === "dynamic-tool" && part.toolMetadata?.eve?.inputRequest)
  ?.toolMetadata?.eve?.inputRequest;

if (request) {
  await agent.send({
    inputResponses: [{ requestId: request.requestId, optionId: "approve" }],
  });
}
```

`request.prompt` and `request.options` give you what you need to render the approve and deny UI. The default reducer marks the part as responded immediately, then updates it again once eve streams the resumed result.
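
The lookup and the reply can be pulled into two small pure functions, which keeps the component code short and makes the logic easy to test. This is a sketch under assumptions: `findPendingRequest` and `answerRequest` are hypothetical names, and the `*Like` types model only the fields mentioned above rather than the library's exported types.

```ts
// Local stand-ins (assumptions): only the fields described in this section.
interface InputRequestLike {
  readonly requestId: string;
  readonly prompt?: string;
}

interface PartLike {
  readonly type: string;
  readonly toolMetadata?: { readonly eve?: { readonly inputRequest?: InputRequestLike } };
}

interface MessageLike {
  readonly parts: readonly PartLike[];
}

/** Returns the input request carried by the latest message, if any. */
function findPendingRequest(messages: readonly MessageLike[]): InputRequestLike | undefined {
  const latest = messages[messages.length - 1];
  if (!latest) return undefined;
  for (const part of latest.parts) {
    const request =
      part.type === "dynamic-tool" ? part.toolMetadata?.eve?.inputRequest : undefined;
    if (request) return request;
  }
  return undefined;
}

/** Builds the `send()` payload that answers a request with the chosen option. */
function answerRequest(request: InputRequestLike, optionId: string) {
  return { inputResponses: [{ requestId: request.requestId, optionId }] };
}
```

A component would then call something like `agent.send(answerRequest(request, "approve"))`. The helper does not check whether the request was already answered.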

## Attach page context per turn

`clientContext` adds ephemeral context for the next model call only. Strings (or an array of strings) become user-role context messages; an object is JSON-serialized into one. It rides along with a message or HITL response, so it never dispatches a turn on its own and never lands in durable session history. Pass it directly to `send()`:

```tsx
await agent.send({
  message: "What should I do on this screen?",
  clientContext: { route: "/billing", plan: "pro", seatsUsed: 4 },
});
```

To attach the same context to every turn without threading it through each call site, use `prepareSend`. It runs right before each send and returns the (possibly augmented) turn:

```tsx
const agent = useEveAgent({
  prepareSend: (input) => ({
    ...input,
    clientContext: { route: location.pathname },
  }),
});
```
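
Note that the `prepareSend` example above replaces any `clientContext` a call site already set. If both should survive, a merge along these lines is one option. This is a sketch, not library behavior: `withPageContext`, `TurnLike`, and `ClientContext` are hypothetical names, and the context type only mirrors the three forms described above (string, array of strings, object).

```ts
// Local stand-ins (assumptions) for the turn payload and its context field.
type ClientContext = string | readonly string[] | Record<string, unknown>;

interface TurnLike {
  readonly message?: unknown;
  readonly clientContext?: ClientContext;
}

/** Adds page-level context without discarding context set at the call site. */
function withPageContext<T extends TurnLike>(input: T, page: Record<string, unknown>): T {
  const existing = input.clientContext;
  if (existing === undefined) {
    return { ...input, clientContext: page };
  }
  if (typeof existing === "string") {
    // Keep the string and add the page context as a second, serialized entry.
    return { ...input, clientContext: [existing, JSON.stringify(page)] };
  }
  if (Array.isArray(existing)) {
    return { ...input, clientContext: [...existing, JSON.stringify(page)] };
  }
  // Both are objects: fields set at the call site win over page-level defaults.
  return { ...input, clientContext: { ...page, ...existing } };
}
```

It would plug in as `prepareSend: (input) => withPageContext(input, { route: location.pathname })`.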

## Lifecycle callbacks

Besides `onSessionChange` (covered under [Resumable sessions](#resumable-sessions)), the hook takes a few per-turn callbacks:

* `onEvent(event)`: fires for each eve stream event as it arrives.
* `onError(error)`: fires with the last `Error` when a turn fails.
* `onFinish(snapshot)`: fires with the final `{ data, status, session, ... }` snapshot once a turn settles.

```tsx
const agent = useEveAgent({
  onEvent: (event) => console.debug(event.type),
  onError: (error) => toast.error(error.message),
  onFinish: (snapshot) => console.log(snapshot.status),
});
```
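
Because `onEvent` sees every stream event, it can also feed lightweight telemetry. The sketch below tallies events by `type`; `createEventTally` is a hypothetical helper, and it assumes nothing about an event beyond its `type` field.

```ts
// Local stand-in (assumption): only the `type` field of a stream event is used.
interface EventLike {
  readonly type: string;
}

/** Creates an `onEvent`-compatible callback that counts events per type. */
function createEventTally() {
  const counts: Record<string, number> = {};
  return {
    onEvent(event: EventLike): void {
      counts[event.type] = (counts[event.type] ?? 0) + 1;
    },
    /** Returns a copy so callers cannot mutate the running totals. */
    snapshot(): Record<string, number> {
      return { ...counts };
    },
  };
}
```

The callback closes over `counts` rather than relying on `this`, so it can be passed on its own, for example `useEveAgent({ onEvent: tally.onEvent })`.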

Two more options tune turn behavior:

* `optimistic` (default `true`): projects submitted user messages into `data` before eve confirms them with a `message.received` event. The optimistic entries come from client projection events that only the reducer sees; `events` stays the authoritative eve stream.
* `maxReconnectAttempts` (default `3`): stream reconnection budget per turn.

## Custom reducer

The default reducer projects events into `{ messages }` (`EveMessageData`). When you want `data` shaped differently, pass a `reducer` implementing `EveAgentReducer<TData>`:

```tsx
import { useEveAgent } from "eve/react";
import type { EveAgentReducer } from "eve/react";

interface ToolLog {
  readonly toolCalls: number;
}

const toolCounter: EveAgentReducer<ToolLog> = {
  initial: () => ({ toolCalls: 0 }),
  reduce: (data, event) =>
    event.type === "actions.requested" ? { toolCalls: data.toolCalls + 1 } : data,
};

const agent = useEveAgent({ reducer: toolCounter });
// agent.data is ToolLog
```

`reduce(data, event)` receives both authoritative eve stream events and client projection events (`client.message.submitted`, `client.message.failed`, `client.input.responded`). Handle the client events too if you want optimistic and HITL state in your projection. Otherwise, return `data` unchanged for them.
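
A reducer that handles the client events as well might look like the sketch below. It tracks optimistic submissions next to the tool-call count. The `ReducerLike` and `EventLike` types are local stand-ins for `EveAgentReducer` and the event union, using only the `type` strings named on this page, and the sketch assumes each `message.received` confirms exactly one submitted message.

```ts
// Local stand-ins (assumptions) for the reducer contract and event shape.
interface EventLike {
  readonly type: string;
}

interface ReducerLike<TData> {
  initial: () => TData;
  reduce: (data: TData, event: EventLike) => TData;
}

interface TurnStats {
  readonly toolCalls: number;
  readonly pendingSubmits: number; // optimistic messages not yet confirmed
  readonly failedSubmits: number;
}

const turnStats: ReducerLike<TurnStats> = {
  initial: () => ({ toolCalls: 0, pendingSubmits: 0, failedSubmits: 0 }),
  reduce: (data, event) => {
    switch (event.type) {
      case "actions.requested":
        return { ...data, toolCalls: data.toolCalls + 1 };
      case "client.message.submitted":
        return { ...data, pendingSubmits: data.pendingSubmits + 1 };
      case "message.received":
        // Clamp at zero: with `optimistic: false` no submit event precedes this.
        return { ...data, pendingSubmits: Math.max(0, data.pendingSubmits - 1) };
      case "client.message.failed":
        return {
          ...data,
          pendingSubmits: Math.max(0, data.pendingSubmits - 1),
          failedSubmits: data.failedSubmits + 1,
        };
      default:
        // Unknown events leave the projection unchanged.
        return data;
    }
  },
};
```

Returning the same `data` reference for unhandled events keeps the projection stable when nothing relevant happened.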

## Resumable sessions

The browser conversation lives durably on the server. Persist the `session` cursor to pick it back up after a reload:

```tsx
const [initialSession] = useState(() => {
  const raw = localStorage.getItem("eve-session");
  return raw ? JSON.parse(raw) : undefined;
});

const agent = useEveAgent({
  initialSession,
  onSessionChange(session) {
    localStorage.setItem("eve-session", JSON.stringify(session));
  },
});
```

Store the full `session` object (`sessionId`, `continuationToken`, `streamIndex`), not a single field.
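
Stored values can be stale or corrupted, and `JSON.parse` throws on malformed input, so it can help to validate before handing the cursor to `initialSession`. The sketch below is one way to do that; `parseStoredSession` and `StoredSession` are hypothetical names, and the field types (two strings and a number) are assumptions based on the field names above.

```ts
// Assumed cursor shape: the three fields named above, with guessed types.
interface StoredSession {
  readonly sessionId: string;
  readonly continuationToken: string;
  readonly streamIndex: number;
}

/** Parses a stored cursor, returning undefined for missing or malformed values. */
function parseStoredSession(raw: string | null): StoredSession | undefined {
  if (!raw) return undefined;
  try {
    const value: unknown = JSON.parse(raw);
    if (typeof value !== "object" || value === null) return undefined;
    const { sessionId, continuationToken, streamIndex } = value as Record<string, unknown>;
    const valid =
      typeof sessionId === "string" &&
      typeof continuationToken === "string" &&
      typeof streamIndex === "number";
    // Return the whole parsed object so any extra fields are preserved.
    return valid ? (value as StoredSession) : undefined;
  } catch {
    // Malformed JSON: behave as if nothing was stored.
    return undefined;
  }
}
```

With this in place the initializer becomes `useState(() => parseStoredSession(localStorage.getItem("eve-session")))`, and a bad value simply starts a fresh session.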

## Custom hosts and headers

Pass `host` when the eve server isn't same-origin, and pass `auth` or `headers` when the channel needs credentials. Function values are re-resolved before every HTTP request, reconnects included:

```tsx
const agent = useEveAgent({
  host: "https://agent.example.com",
  auth: {
    bearer: async () => await getAccessToken(),
  },
});
```

## Per-framework integration

| Framework | Integration                          | Hook                                             |
| --------- | ------------------------------------ | ------------------------------------------------ |
| Next.js   | [`withEve`](./nextjs)                | [`useEveAgent` (React)](#basic-chat-react)       |
| Nuxt      | [`eve/nuxt` module](./nuxt)          | [`useEveAgent` (Vue)](./use-eve-agent-vue)       |
| SvelteKit | [`eveSvelteKit` plugin](./sveltekit) | [`useEveAgent` (Svelte)](./use-eve-agent-svelte) |
| Any React | same-origin or `host`                | [`useEveAgent` (React)](#basic-chat-react)       |

## What to read next

* [Sessions, runs & streaming](../../concepts/sessions-runs-and-streaming): the event stream and session cursor
* [Channels](../../channels/overview): the HTTP routes the hook talks to
* [TypeScript SDK](../client/overview): the lower-level client underneath the frontend hooks
* [Next.js](./nextjs): step-by-step setup for wiring eve into a Next.js app



---
title: SvelteKit
description: Run an eve agent and a SvelteKit app as one project with the eveSvelteKit Vite plugin.
---

# SvelteKit



`eve/sveltekit` runs a SvelteKit frontend and an eve agent as one project instead of two services. The `eveSvelteKit()` Vite plugin puts both on one dev server and one Vercel deploy, and [`useEveAgent`](./use-eve-agent-svelte) finds the mounted routes on its own. There's no CORS to configure and no URL env vars to keep in sync.

## Prerequisites

* The `eve` package installed in your project (`npm install eve@latest`).
* An existing eve agent directory. If you don't have one, start from [Getting started](../../getting-started).
* A SvelteKit app to mount the agent in.

## Register the Vite plugin

Add `eveSvelteKit()` before `sveltekit()`:

```ts title="vite.config.ts"
import { sveltekit } from "@sveltejs/kit/vite";
import { eveSvelteKit } from "eve/sveltekit";
import { defineConfig } from "vite";

export default defineConfig({
  plugins: [eveSvelteKit(), sveltekit()],
});
```

The plugin looks for an `agent/` folder in the SvelteKit project root. Pass `eveRoot` when the agent lives elsewhere:

```ts
export default defineConfig({
  plugins: [
    eveSvelteKit({
      eveRoot: "../my-agent",
    }),
    sveltekit(),
  ],
});
```

The plugin accepts only two options, `eveRoot` and `eveBuildCommand`.

## Call the binding

With the plugin in `vite.config.ts`, components call [`useEveAgent`](./use-eve-agent-svelte) from `eve/svelte` and don't pass a host:

```svelte
<script lang="ts">
  import { useEveAgent } from "eve/svelte";

  const agent = useEveAgent();
  let message = $state("");
  let isBusy = $derived(agent.status === "submitted" || agent.status === "streaming");

  async function handleSubmit() {
    const text = message.trim();
    if (!text || isBusy) return;
    message = "";
    await agent.send({ message: text });
  }
</script>

<form onsubmit={(event) => {
  event.preventDefault();
  void handleSubmit();
}}>
  <input bind:value={message} disabled={isBusy} />
  <button type="submit" disabled={isBusy}>Send</button>
</form>
```

The default eve channel is fail-closed. With no `agent/channels/eve.ts` authored, eve registers `eveChannel({ auth: [localDev(), vercelOidc()] })`: `localDev()` opens the routes on localhost, `vercelOidc()` admits Vercel OIDC callers in production, and everything else gets a `401`. To set your own auth policy, add `agent/channels/eve.ts`:

```ts title="agent/channels/eve.ts"
import { eveChannel } from "eve/channels/eve";
import { localDev, vercelOidc } from "eve/channels/auth";

export default eveChannel({ auth: [localDev(), vercelOidc()] });
```

For a public demo, use `none()` (also from `eve/channels/auth`) to skip authentication. See [Channels](../../channels/overview) and [Auth & route protection](../auth-and-route-protection).

## Dev vs deploy topology

* **Local dev.** `npm run dev` boots the eve dev server next to SvelteKit and proxies the eve routes to it, so the browser only ever hits the SvelteKit origin. `npm run build && npm run preview` behaves the same way: the preview server gets its own eve route proxy and either reuses the shared eve server or starts one.

* **Vercel.** The SvelteKit app and the eve runtime deploy as a single project. The web app is public; the eve runtime sits behind it on the same origin. Use `eveBuildCommand` for a project-specific agent build:

  ```ts
  export default defineConfig({
    plugins: [
      eveSvelteKit({
        eveBuildCommand: "npm run build:eve",
      }),
      sveltekit(),
    ],
  });
  ```

* **Non-Vercel hosts.** When the eve service runs on a separate origin, pass `host` directly to `useEveAgent`:

  ```ts
  const agent = useEveAgent({
    host: "https://agent.example.com",
  });
  ```

## What to read next

* [`useEveAgent` (Svelte)](./use-eve-agent-svelte): the binding API
* [Auth & route protection](../auth-and-route-protection)
* [Deployment](../deployment)



---
title: useEveAgent (Svelte)
description: Svelte 5 binding that drives an eve agent session from the browser.
---

# useEveAgent (Svelte)



`useEveAgent()` from `eve/svelte` is the browser side of an eve session in a Svelte 5 app. Call it once for a long-lived session you can send turns to, with every stream event projected into rune-friendly reactive data. On SvelteKit, the [Vite plugin](./sveltekit) wires up the routes. The [frontend overview](./overview) covers the model shared across frameworks.

## Basic usage

Import from `eve/svelte` and read the reactive getters directly, with no `$` prefix:

```svelte
<script lang="ts">
  import { useEveAgent } from "eve/svelte";

  const agent = useEveAgent();
</script>

{#each agent.data.messages as message}
  <p>{message.role}: {JSON.stringify(message.parts)}</p>
{/each}
```

## What it returns

| Property  | Type                                        | Description                                                               |
| --------- | ------------------------------------------- | ------------------------------------------------------------------------- |
| `data`    | `TData`                                     | Projected state. With the default reducer, `EveMessageData` (`messages`). |
| `status`  | `UseEveAgentStatus`                         | `"ready"`, `"submitted"`, `"streaming"`, or `"error"`.                    |
| `error`   | `Error \| undefined`                        | Last transport-level error.                                               |
| `events`  | `readonly HandleMessageStreamEvent[]`       | Raw server events for this session.                                       |
| `session` | `SessionState`                              | Snapshot of session state.                                                |
| `send`    | `(input: SendTurnPayload) => Promise<void>` | Send text or a full turn (multi-part, attachments, HITL responses).       |
| `stop`    | `() => void`                                | Abort the in-flight request.                                              |
| `reset`   | `() => void`                                | Clear state and start a new session.                                      |

These state fields are reactive getters, so read them straight from templates, `$derived`, or `$effect`. They are not stores, so don't prefix them with `$`.

## Send a message

```svelte
<script lang="ts">
  import { useEveAgent } from "eve/svelte";

  const agent = useEveAgent();
  let message = $state("");

  async function handleSubmit() {
    const text = message.trim();
    if (!text) return;
    message = "";
    await agent.send({ message: text });
  }
</script>

<form onsubmit={(event) => {
  event.preventDefault();
  void handleSubmit();
}}>
  <input bind:value={message} placeholder="Type a message..." />
  <button type="submit">Send</button>
</form>
```

`send()` also accepts a multi-part message when a turn is more than plain text. Attachments follow the AI SDK `UserContent` format. Send file data as a base64 `data:` URL so it survives the JSON transport:

```ts
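const bytes = new Uint8Array(await file.arrayBuffer());
// Build the binary string in chunks: spreading a large file into a single
// call can exceed the engine's argument limit.
let binary = "";
for (let i = 0; i < bytes.length; i += 0x8000) {
  binary += String.fromCharCode(...bytes.subarray(i, i + 0x8000));
}
const base64 = btoa(binary);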

await agent.send({
  message: [
    { type: "text", text: "Describe this image." },
    { type: "file", data: `data:${file.type};base64,${base64}`, mediaType: file.type },
  ],
});
```

## Human-in-the-loop prompts

A tool opts into approval with `needsApproval` ([Tools](../../tools)). When one fires, the pending request rides along on a `dynamic-tool` part of the latest message at `part.toolMetadata?.eve?.inputRequest`. Read it, then answer through the same session with `agent.send({ inputResponses })`:

```ts
import type { EveDynamicToolPart, EveMessagePart } from "eve/svelte";

const isDynamicToolPart = (part: EveMessagePart): part is EveDynamicToolPart =>
  part.type === "dynamic-tool";

const request = agent.data.messages
  .at(-1)
  ?.parts.filter(isDynamicToolPart)
  .map((part) => part.toolMetadata?.eve?.inputRequest)
  .find((value) => value !== undefined);

if (request) {
  await agent.send({
    inputResponses: [{ requestId: request.requestId, optionId: "approve" }],
  });
}
```

The find-and-answer flow is identical across frameworks. The [React hook reference](./overview) covers the longer walkthrough.

## Stop, reset, and resume

`stop()` aborts the in-flight stream. `reset()` wipes state and starts a fresh session. To resume across reloads, hand `initialSession` the state you saved earlier and use `onSessionChange` to persist the cursor as it advances:

```ts
const agent = useEveAgent({
  initialSession: savedSessionState,
  initialEvents: savedEvents,
  onSessionChange: (session) => {
    localStorage.setItem("eve-session", JSON.stringify(session));
  },
});
```

## Custom host and credentials

Point `host` at an eve server on a different origin, and pass credentials through `auth` or `headers`. When you supply a function, it re-resolves before every request:

```ts
const agent = useEveAgent({
  host: "https://agent.example.com",
  headers: async () => ({
    authorization: `Bearer ${await getAccessToken()}`,
  }),
});
```

## Attach page context per turn

`clientContext` adds ephemeral context for the next model call and nothing more. Strings (or an array of strings) become user-role context messages; an object is JSON-serialized into one. It rides along with a message or HITL response, never dispatches a turn on its own, and never lands in durable session history. Pass it to `send()`:

```ts
await agent.send({
  message: "What should I do on this screen?",
  clientContext: { route: "/billing", plan: "pro", seatsUsed: 4 },
});
```

To attach the same context to every turn without threading it through each call site, pass `prepareSend`. It runs right before each send and returns the (possibly augmented) turn:

```ts
const agent = useEveAgent({
  prepareSend: (input) => ({
    ...input,
    clientContext: { route: location.pathname },
  }),
});
```

## Lifecycle callbacks

The binding takes a few per-turn callbacks:

* `onEvent(event)`: fires for each eve stream event as it arrives.
* `onError(error)`: fires with the last `Error` when a turn fails.
* `onFinish(snapshot)`: fires with the final snapshot once a turn settles.
* `onSessionChange(session)`: fires when the session cursor advances. Persist it to resume across reloads.

```ts
const agent = useEveAgent({
  onEvent: (event) => console.debug(event.type),
  onError: (error) => console.error(error.message),
  onFinish: (snapshot) => console.log(snapshot.status),
});
```

Two more options tune turn behavior:

* `optimistic` (default `true`): projects submitted user messages into `data` before eve confirms them with a `message.received` event. The optimistic entries come from client projection events that only the reducer sees; `events` stays the authoritative eve stream.
* `maxReconnectAttempts` (default `3`): stream reconnection budget per turn.

## Custom reducer

The default reducer projects events into `{ messages }` (`EveMessageData`). To shape `data` differently, pass a `reducer` implementing `EveAgentReducer<TData>`:

```ts
import { useEveAgent } from "eve/svelte";
import type { EveAgentReducer } from "eve/svelte";

interface ToolLog {
  readonly toolCalls: number;
}

const toolCounter: EveAgentReducer<ToolLog> = {
  initial: () => ({ toolCalls: 0 }),
  reduce: (data, event) =>
    event.type === "actions.requested" ? { toolCalls: data.toolCalls + 1 } : data,
};

const agent = useEveAgent({ reducer: toolCounter });
// agent.data is ToolLog
```

`reduce(data, event)` receives both authoritative eve stream events and client projection events (`client.message.submitted`, `client.message.failed`, `client.input.responded`). Handle the client events too if you want optimistic and HITL state in your projection. Otherwise, return `data` unchanged for them.

## What to read next

* [SvelteKit](./sveltekit): Vite plugin setup
* [Frontend overview](./overview): the shared integration model
* [Sessions, runs & streaming](../../concepts/sessions-runs-and-streaming)



---
title: useEveAgent (Vue)
description: Vue composable that drives an eve agent session from the browser.
---

# useEveAgent (Vue)



`useEveAgent()` from `eve/vue` is how a Vue app talks to an eve session. It opens a long-lived session, sends turns, and folds every stream event into reactive data you can bind in a template. Nuxt auto-imports it through the [module](./nuxt), and the [frontend overview](./overview) covers the shared model.

## Basic usage

Import the composable from `eve/vue`. Its state is exposed as `ComputedRef`s, so you read it unwrapped in templates:

```vue
<script setup lang="ts">
import { useEveAgent } from "eve/vue";

const { data } = useEveAgent();
</script>

<template>
  <div v-for="message in data.messages" :key="message.id">
    <p>{{ message.role }}: {{ message.parts }}</p>
  </div>
</template>
```

## What it returns

| Property  | Type                                               | Description                                                               |
| --------- | -------------------------------------------------- | ------------------------------------------------------------------------- |
| `data`    | `ComputedRef<TData>`                               | Projected state. With the default reducer, `EveMessageData` (`messages`). |
| `status`  | `ComputedRef<UseEveAgentStatus>`                   | `"ready"`, `"submitted"`, `"streaming"`, or `"error"`.                    |
| `error`   | `ComputedRef<Error \| undefined>`                  | Last transport-level error.                                               |
| `events`  | `ComputedRef<readonly HandleMessageStreamEvent[]>` | Raw server events for this session.                                       |
| `session` | `ComputedRef<SessionState>`                        | Snapshot of session state.                                                |
| `send`    | `(input: SendTurnPayload) => Promise<void>`        | Send text or a full turn (multi-part, attachments, HITL responses).       |
| `stop`    | `() => void`                                       | Abort the in-flight request.                                              |
| `reset`   | `() => void`                                       | Clear state and start a new session.                                      |

The first five are `ComputedRef`s; the rest are methods. Destructure whatever you need, since refs keep their reactivity through destructuring. Read them with `.value` in `<script>`, and unwrapped in `<template>`.

## Send a message

```vue
<script setup lang="ts">
import { ref } from "vue";
import { useEveAgent } from "eve/vue";

const { send } = useEveAgent();
const message = ref("");

async function handleSubmit() {
  const text = message.value.trim();
  if (!text) return;
  message.value = "";
  await send({ message: text });
}
</script>

<template>
  <form @submit.prevent="handleSubmit">
    <input v-model="message" placeholder="Type a message..." />
    <button type="submit">Send</button>
  </form>
</template>
```

`send()` also accepts a multi-part message for anything beyond plain text. Attachments follow the AI SDK `UserContent` format. Send file data as a base64 `data:` URL so it survives the JSON transport:

```vue
<script setup lang="ts">
import { useEveAgent } from "eve/vue";

const { send } = useEveAgent();

async function onFileChange(event: Event) {
  const file = (event.target as HTMLInputElement).files?.[0];
  if (!file) return;
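  const bytes = new Uint8Array(await file.arrayBuffer());
  // Build the binary string in chunks: spreading a large file into a single
  // call can exceed the engine's argument limit.
  let binary = "";
  for (let i = 0; i < bytes.length; i += 0x8000) {
    binary += String.fromCharCode(...bytes.subarray(i, i + 0x8000));
  }
  const base64 = btoa(binary);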
  await send({
    message: [
      { type: "text", text: "Describe this image." },
      { type: "file", data: `data:${file.type};base64,${base64}`, mediaType: file.type },
    ],
  });
}
</script>
```

## Human-in-the-loop prompts

A tool opts into approval with `needsApproval` ([Tools](../../tools)). When it triggers, the pending request shows up as a `dynamic-tool` part on the latest message at `part.toolMetadata?.eve?.inputRequest`. Read it, then answer through the same session with `send({ inputResponses })`:

```ts
import type { EveDynamicToolPart, EveMessagePart } from "eve/vue";

const { data, send } = useEveAgent();

const isDynamicToolPart = (part: EveMessagePart): part is EveDynamicToolPart =>
  part.type === "dynamic-tool";

const request = data.value.messages
  .at(-1)
  ?.parts.filter(isDynamicToolPart)
  .map((part) => part.toolMetadata?.eve?.inputRequest)
  .find((value) => value !== undefined);

if (request) {
  await send({
    inputResponses: [{ requestId: request.requestId, optionId: "approve" }],
  });
}
```

The find-and-answer flow is the same across every framework. The [React hook reference](./overview) covers the longer walkthrough.

## Stop, reset, and resume

`stop()` aborts the in-flight stream. `reset()` wipes state and starts a fresh session. To survive a reload, pass `initialSession` to restore a saved session and `onSessionChange` to persist the cursor as it moves:

```ts
const agent = useEveAgent({
  initialSession: savedSessionState,
  initialEvents: savedEvents,
  onSessionChange: (session) => {
    localStorage.setItem("eve-session", JSON.stringify(session));
  },
});
```

## Custom host and credentials

When your eve server lives somewhere other than the same origin, point at it with `host` and attach credentials through `auth` or `headers`. Pass a function and it re-resolves before each request:

```ts
const agent = useEveAgent({
  host: "https://agent.example.com",
  headers: async () => ({
    authorization: `Bearer ${await getAccessToken()}`,
  }),
});
```

## Attach page context per turn

`clientContext` adds ephemeral context for the next model call and nothing more. Strings (or an array of strings) become user-role context messages; an object is JSON-serialized into one. It rides along with a message or HITL response, never dispatches a turn on its own, and never lands in durable session history. Pass it to `send()`:

```ts
await send({
  message: "What should I do on this screen?",
  clientContext: { route: "/billing", plan: "pro", seatsUsed: 4 },
});
```

To attach the same context to every turn without threading it through each call site, pass `prepareSend`. It runs right before each send and returns the (possibly augmented) turn:

```ts
const agent = useEveAgent({
  prepareSend: (input) => ({
    ...input,
    clientContext: { route: location.pathname },
  }),
});
```

## Lifecycle callbacks

The composable takes a few per-turn callbacks:

* `onEvent(event)`: fires for each eve stream event as it arrives.
* `onError(error)`: fires with the last `Error` when a turn fails.
* `onFinish(snapshot)`: fires with the final snapshot once a turn settles.
* `onSessionChange(session)`: fires when the session cursor advances. Persist it to resume across reloads.

```ts
const agent = useEveAgent({
  onEvent: (event) => console.debug(event.type),
  onError: (error) => console.error(error.message),
  onFinish: (snapshot) => console.log(snapshot.status),
});
```

Two more options tune turn behavior:

* `optimistic` (default `true`): projects submitted user messages into `data` before eve confirms them with a `message.received` event. The optimistic entries come from client projection events that only the reducer sees; `events` stays the authoritative eve stream.
* `maxReconnectAttempts` (default `3`): stream reconnection budget per turn.

## Custom reducer

The default reducer projects events into `{ messages }` (`EveMessageData`). To shape `data` differently, pass a `reducer` implementing `EveAgentReducer<TData>`:

```ts
import { useEveAgent } from "eve/vue";
import type { EveAgentReducer } from "eve/vue";

interface ToolLog {
  readonly toolCalls: number;
}

const toolCounter: EveAgentReducer<ToolLog> = {
  initial: () => ({ toolCalls: 0 }),
  reduce: (data, event) =>
    event.type === "actions.requested" ? { toolCalls: data.toolCalls + 1 } : data,
};

const agent = useEveAgent({ reducer: toolCounter });
// agent.data.value is ToolLog
```

`reduce(data, event)` receives both authoritative eve stream events and client projection events (`client.message.submitted`, `client.message.failed`, `client.input.responded`). Handle the client events too if you want optimistic and HITL state in your projection. Otherwise, return `data` unchanged for them.

## What to read next

* [Nuxt](./nuxt): module setup
* [Frontend overview](./overview): the shared integration model
* [Sessions, runs & streaming](../../concepts/sessions-runs-and-streaming)

