Model execution

Call the relay and report audit results.

Allowed executions receive OpenAI-compatible relay settings plus idempotent report helpers for completion, failure, and streaming callbacks.

Relay calls

Once a prepared execution is allowed, the SDK exposes OpenAI-compatible fields directly on the allowed result. AI SDK routes can use @vela/ai-sdk to prepare the execution and build a streamText-ready input with prepared.createStreamTextInput(), alongside a shared prepared.reporter. The generated callbacks include onAbort, onError, and onFinish. Other clients can use the direct OpenAI-compatible fields or raw fetch.

OpenAI-compatible client

AI SDK route

import { prepareVelaStreamTextWithResponse } from "@vela/ai-sdk";
import { streamText } from "ai";

const prepared = await prepareVelaStreamTextWithResponse(vela, {
  estimatedCostUsd: 0.05,
}, {
  onReportError(error) {
    console.error("Failed to record Vela execution result", error);
  },
});

if (prepared.outcome !== "allow") {
  return prepared.response;
}

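try {
  // createStreamTextInput() supplies the generated onAbort, onError, and onFinish callbacks.
  const result = streamText(prepared.createStreamTextInput({
    messages,
  }));

  return result;
} catch (error) {
  // Report failures thrown while setting up the stream, then rethrow.
  await prepared.reporter.tryReportFailedWithError(error);
  throw error;
}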

For route tests and setup responses that should avoid loading the AI SDK runtime on non-allow paths, prepare with the core SDK first and import @vela/ai-sdk and ai only after an allow outcome. If route setup can fail after allow, use prepared.reporter for the shortest route path, or share one explicit idempotent reporter between the setup catch block and the AI SDK stream callbacks.

allow-only AI runtime

import {
  createVelaClientFromEnvironment,
  createVelaEnvironmentDiagnosticsResponse,
  createVelaIntegration,
  getVelaEnvironmentDiagnostics,
} from "@vela/sdk";

const environment = getVelaEnvironmentDiagnostics({ env: process.env });

if (!environment.ok) {
  return createVelaEnvironmentDiagnosticsResponse(environment);
}

const client = createVelaClientFromEnvironment({ env: process.env });
const scopedVela = createVelaIntegration({ client, executionScope });
const prepared = await scopedVela.prepareExecutionWithResponse({
  estimatedCostUsd: 0.05,
});

if (prepared.outcome !== "allow") {
  return prepared.response;
}

const [{ createVelaStreamTextInput }, { streamText }] = await Promise.all([
  import("@vela/ai-sdk"),
  import("ai"),
]);
const reporter = prepared.createResultReporter({
  onReportError(error) {
    console.error("Failed to record Vela execution result", error);
  },
});

const result = streamText(createVelaStreamTextInput(
  prepared,
  {
    messages,
  },
  { reporter },
));

Raw fetch

chat completions

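import { createOpenAiCompatibleUsageReportInput } from "@vela/sdk";

const reporter = prepared.createResultReporter();

const response = await fetch(prepared.openAiCompatibleChatCompletionsUrl, {
  method: "POST",
  headers: prepared.openAiCompatibleRequestHeaders,
  body: JSON.stringify({
    model: prepared.openAiCompatibleModel,
    messages,
  }),
});

// A non-2xx relay response is a failed execution, not a completion with usage.
if (!response.ok) {
  const error = new Error(`Relay request failed with status ${response.status}`);
  await reporter.tryReportFailedWithError(error);
  throw error;
}

const payload = await response.json();

await reporter.tryReportCompletedWithUsage(
  createOpenAiCompatibleUsageReportInput(payload.usage),
);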

Audit reporting

Audit reporting should not make the model response unreliable. For streaming and callback-heavy integrations, create an idempotent reporter and use the non-throwing tryReport... methods. The adapter's onReportError hook also observes { ok: false } outcomes and thrown errors from a supplied reporter before app callbacks continue. AI SDK stream aborts are reported as failed executions before the app's onAbort callback continues.

Report once, keep model result primary

const reporter = prepared.createResultReporter({
  onReportError(error) {
    console.error("Failed to record Vela execution result", error);
  },
});

try {
  const result = await callModel({
    model: prepared.openAiCompatibleModel,
    ...prepared.openAiCompatibleClientOptions,
  });

  await reporter.tryReportCompletedWithUsage(extractUsage(result));
  return result;
} catch (error) {
  await reporter.tryReportFailedWithError(error);
  throw error;
}
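
The non-throwing contract behind the tryReport... methods can be pictured with a small wrapper. This is a minimal sketch, not the SDK's implementation: tryReport, ReportOutcome, and the send callback are hypothetical names chosen for illustration. Only the { ok: false } outcome shape and the onReportError hook come from the text above.

```typescript
// Hypothetical outcome type; the SDK's real result shape may carry more fields.
type ReportOutcome = { ok: true } | { ok: false; error: unknown };

// Runs a report attempt and converts any thrown error into an outcome value,
// so a failed audit write never replaces the model result or its original error.
async function tryReport(
  send: () => Promise<void>,
  onReportError?: (error: unknown) => void,
): Promise<ReportOutcome> {
  try {
    await send();
    return { ok: true };
  } catch (error) {
    try {
      // Let the app observe the reporting failure (for example, to log it).
      onReportError?.(error);
    } catch {
      // An observer that throws must not break the caller either.
    }
    return { ok: false, error };
  }
}
```

Because the wrapper always resolves, a caller can await it inside a catch block and still rethrow the original model error afterwards.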

Use usage when available

reportCompletedWithUsage() normalizes token counts and uses the prepare-time estimated cost when no final cost is supplied.
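
As an illustration of what normalizing token counts with a cost fallback can look like, here is a minimal sketch. The names normalizeUsage, toTokenCount, RawUsage, and NormalizedUsage are hypothetical and the output shape is an assumption. Only the OpenAI-style prompt_tokens, completion_tokens, and total_tokens input fields and the fallback to the prepare-time estimate are grounded.

```typescript
// OpenAI-style usage payload; every field may be missing or malformed.
interface RawUsage {
  prompt_tokens?: unknown;
  completion_tokens?: unknown;
  total_tokens?: unknown;
}

// Hypothetical normalized shape, not the SDK's report input type.
interface NormalizedUsage {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  costUsd: number;
}

// Coerce a possibly missing or malformed count to a non-negative integer.
function toTokenCount(value: unknown): number {
  return typeof value === "number" && Number.isFinite(value) && value > 0
    ? Math.floor(value)
    : 0;
}

function normalizeUsage(
  usage: RawUsage | null | undefined,
  estimatedCostUsd: number,
  finalCostUsd?: number,
): NormalizedUsage {
  const inputTokens = toTokenCount(usage?.prompt_tokens);
  const outputTokens = toTokenCount(usage?.completion_tokens);
  // Use the reported total when it is a positive number; otherwise derive it.
  const totalTokens =
    toTokenCount(usage?.total_tokens) || inputTokens + outputTokens;

  return {
    inputTokens,
    outputTokens,
    totalTokens,
    // Fall back to the prepare-time estimate when no final cost is supplied.
    costUsd: finalCostUsd ?? estimatedCostUsd,
  };
}
```

In this sketch a missing usage object yields zero counts instead of an error, so a provider that omits usage still produces a report.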

Normalize unknown errors

reportFailedWithError() and getErrorMessage() keep thrown strings, empty errors, and unknown values predictable.
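
To show what "predictable" can mean for thrown strings, empty errors, and unknown values, the sketch below normalizes any caught value to a non-empty message. describeError and FALLBACK_MESSAGE are hypothetical names, and the fallback text is an assumption; the SDK's getErrorMessage() may behave differently.

```typescript
// Assumed fallback text; the SDK may use a different default.
const FALLBACK_MESSAGE = "Unknown error";

// Turn any caught value into a non-empty, loggable message.
function describeError(error: unknown): string {
  if (typeof error === "string") {
    // `throw "boom"` is legal JavaScript, so accept plain strings.
    return error.trim() || FALLBACK_MESSAGE;
  }
  if (error instanceof Error) {
    // `new Error()` has an empty message; fall back to the error name.
    return error.message.trim() || error.name || FALLBACK_MESSAGE;
  }
  if (typeof error === "object" && error !== null && "message" in error) {
    // Error-like objects from other realms or libraries.
    const message = (error as { message: unknown }).message;
    if (typeof message === "string" && message.trim()) {
      return message.trim();
    }
  }
  return FALLBACK_MESSAGE;
}
```

A catch block can then pass describeError(error) to logging or reporting code without first checking what kind of value was thrown.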

Keep reports idempotent

ExecutionResultReporter settles on the first report, so callback paths do not need a separate hasReported guard.
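
The first-report-wins behavior can be sketched as follows. OnceReporter, Report, and the send callback are hypothetical stand-ins used to illustrate the idea; the real ExecutionResultReporter may differ in shape and return values.

```typescript
// Hypothetical report payload; the SDK's real payload carries more detail.
type Report =
  | { status: "completed" }
  | { status: "failed"; message: string };

class OnceReporter {
  private readonly send: (report: Report) => Promise<void>;
  private settled: Promise<void> | undefined;

  constructor(send: (report: Report) => Promise<void>) {
    this.send = send;
  }

  report(report: Report): Promise<void> {
    // The first report wins; later calls reuse its promise and send nothing.
    if (this.settled === undefined) {
      this.settled = this.send(report);
    }
    return this.settled;
  }
}

// Records what would be sent to the audit endpoint.
const sent: Report[] = [];
const reporter = new OnceReporter(async (report) => {
  sent.push(report);
});

// Two callback paths fire for the same execution; only the first is sent.
void reporter.report({ status: "completed" });
void reporter.report({ status: "failed", message: "stream aborted" });
```

After both calls, sent holds only the completed report, which is why onFinish, onError, and onAbort handlers can each call the reporter without coordinating with one another.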