Tool Authoring

This is the complete guide to building tools for ToolRouter. Everything you need — from scaffolding to publishing — is here.

What should I build?

The best new tools do more than wrap a single API. They unlock a workflow agents cannot do natively, sit on top of structured live data or real-world actions, and combine enough official sources that the tool becomes more valuable than generic search.

One strong current example is contract-opportunities: a procurement intelligence tool that searches public tenders and award history across official sources like SAM.gov, USAspending, TED, Contracts Finder, Public Contracts Scotland, and World Bank procurement notices. That is a better category than another broad search tool because it exposes live revenue opportunities, buyer history, award evidence, set-asides, and repeatable watchlist workflows.

When evaluating similar ideas, prefer source sets with stable identifiers and reusable structure: ocid, notice ids, solicitation numbers, NAICS, PSC, CPV, buyer ids, award ids, and document URLs. Those are the fields that let you build dedupe, cross-source joins, fit scoring, and automations instead of just another list of links.
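
For instance, a normalized record keyed on those identifiers might look like the following sketch. The field names are illustrative, not a prescribed schema — the point is that stable ids give you a dedupe key and join points across sources:

```typescript
// Hypothetical normalized opportunity record built from stable identifiers.
interface Opportunity {
  ocid?: string;          // Open Contracting ID, when the source provides one
  noticeId: string;       // source-native notice or solicitation number
  source: string;         // e.g. 'sam.gov', 'ted'
  buyerId?: string;
  naics?: string;
  cpv?: string;
  documentUrls: string[];
}

// Dedupe key: prefer the cross-source ocid, fall back to a source-scoped id.
function dedupeKey(o: Opportunity): string {
  return o.ocid ?? `${o.source}:${o.noticeId}`;
}
```

The same key works for cross-source joins: two records with the same ocid are the same tender even when their notice ids differ.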

Capture the detailed execution plan for any serious new tool in a dated file under docs/plans/ before you start implementing.

How do I scaffold a new tool?

Run toolrouter init my-tool my_skill for a basic tool, --knowledge for RAG support, or --plugin for a standalone npm package. This creates the file structure with a working manifest and TODO placeholders.

Scaffold a new tool:

bash
toolrouter init my-tool my_skill           # Basic tool with one skill
toolrouter init my-tool --knowledge        # Tool with RAG knowledge base
toolrouter init my-tool --plugin           # Standalone npm plugin package

What file structure should a tool use?

Every tool lives in src/tools/<tool-name>/ with an index.ts (defineTool manifest), a skills/ directory (one handler file per skill), and an optional knowledge/ directory for RAG markdown files.

Every built-in tool lives in src/tools/<tool-name>/:

src/tools/my-tool/
├── index.ts              # defineTool() manifest + skill wiring
├── skills/
│   ├── my-skill.ts       # Handler for my_skill
│   └── other-skill.ts    # Handler for other_skill
└── knowledge/            # Optional — RAG markdown files
    └── domain-guide.md

How do I define a tool?

Call defineTool() with the tool's name, displayName, subtitle, description, categories, and skills array. Each skill needs a name, description, input JSON Schema, at least one example, and a handler function. Export the register function and manifest.

The entire tool is defined in a single defineTool() call. This is the source of truth for metadata, skills, schemas, examples, requirements, and handlers. Because this metadata appears in discovery, docs, and agent UIs, it should read like product copy for the top-level fields and precise operational copy for schema fields.

Metadata tone standard

Treat displayName, subtitle, description, examples[].description, and returns as user-facing product copy. Treat instructions as agent-facing operational copy. Treat input property, output property, and requirement descriptions as operational copy. They serve different jobs and should sound different.

Every tool has four text layers, like an App Store listing. These are hard caps enforced by toolrouter validate --strict — every character must earn its place.

| Field | Audience | Min | Max | Purpose |
| --- | --- | --- | --- | --- |
| displayName | Everyone | 3 | 50 | The tool's name |
| subtitle | Everyone | 10 | 80 | Short tagline for search results and tool cards |
| description | Users | 50 | 300 | What the tool does, who it's for, why it's useful |
| instructions | Agents | 50 | 600 | Which skills to use when, workflow order, key tips |

Skill-level caps:

| Field | Min | Max | Purpose |
| --- | --- | --- | --- |
| Skill description | 20 | 300 | What this skill does and when to use it |
| Skill returns | 10 | 200 | Brief summary of what the skill returns |
| Input param description | | 150 | What the param is and valid values |
| Example description | | 100 | Realistic one-line request |

Why hard caps? Every token in a discover response costs the agent context window. A 300-char description is ~75 tokens. A 1500-char description is ~375 tokens — 5x the cost for diminishing returns. Agents skim; they don't read essays. Make every word count.
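
As a sketch of what those caps mean in practice — the rule values mirror the tool-level table, but the function itself is illustrative, not the real implementation behind toolrouter validate --strict:

```typescript
// Illustrative length checks for the tool-level caps above.
interface CapRule { min: number; max: number }

const TOOL_CAPS: Record<string, CapRule> = {
  displayName: { min: 3, max: 50 },
  subtitle: { min: 10, max: 80 },
  description: { min: 50, max: 300 },
  instructions: { min: 50, max: 600 },
};

function checkCaps(fields: Record<string, string>): string[] {
  const errors: string[] = [];
  for (const [field, rule] of Object.entries(TOOL_CAPS)) {
    const len = (fields[field] ?? '').length;
    if (len < rule.min) errors.push(`${field}: ${len} chars is under the ${rule.min}-char minimum`);
    if (len > rule.max) errors.push(`${field}: ${len} chars exceeds the ${rule.max}-char cap`);
  }
  return errors;
}
```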

#### Subtitles

The subtitle is the first thing users and agents see in search results. Think App Store subtitle:

  • One line that captures what the tool does
  • Focus on the primary capability or value
  • No periods, no filler words
  • Examples: "Live crypto prices, market data, and trends", "Audit pages and sites for ranking issues"

#### Tool descriptions

The description is the full pitch on the tool detail page:

  • 2-4 sentences explaining what the tool does, who it's for, and what value it provides
  • Lead with the main value or job to be done
  • Mention the most common use cases in plain English
  • Prefer what the tool helps someone achieve over how it is implemented
  • Do not embed workflow steps, orchestration rules, or long feature inventories
  • Do not mention underlying API providers (e.g. don't say "powered by Serper" or "uses ElevenLabs")
  • Do not mention pricing, cost, or API key details (e.g. don't say "all skills are free", "no API key needed", "no authentication required", "no credits charged"). The platform handles pricing and auth — descriptions should focus purely on what the tool does and why it's useful.

Good formula: Help users do X so they can Y. Covers A, B, and C use cases.

#### Instructions

Instructions are agent-facing only — they don't appear on the website. They help agents get the most out of the tool:

  • Which skills to use for which tasks
  • Recommended workflows and skill chaining order
  • Important parameters to set and when
  • Tips for getting the best results
  • Caveats or limitations the agent should know about
  • Can be substantial (multiple paragraphs) — agents benefit from detail
  • Do not mention pricing or API keys here either — no "all skills are free", "no key needed", etc.

Good formula: Start with <skill> for <use case>. Chain <skill_a> → <skill_b> for <workflow>. Tips: ...

#### Skill descriptions

Skill descriptions should be more action-oriented than tool descriptions:

  • Describe the specific action and when to use it.
  • Stay user-facing and readable, but be slightly more precise than the tool summary.
  • Avoid raw implementation detail like "returns an array of objects" unless it changes how the skill should be used.

#### Other description fields

  • returns: summarize the outcome in human terms, not every key in the output schema.
  • examples[].description: write like a realistic request a user or agent would make.
  • Input property descriptions, output property descriptions, and requirement descriptions should stay precise and technical enough to remove ambiguity.

#### Quick examples

| Field | Avoid | Prefer |
| --- | --- | --- |
| Subtitle | "Uses AI to generate images from text" | "AI image generation and editing across 20+ models" |
| Tool description | "Generate and edit images using 50+ AI models..." | "Create or edit images with AI for ads, concepts, product shots, brand visuals, and creative experiments. Supports text-to-image, image-to-image, inpainting, and background removal across multiple styles." |
| Instructions | "Call generate_image with a prompt" | "Start with generate_image for text-to-image. Use edit_image for modifications to existing images. For best results, use descriptive prompts with style keywords. Chain with image-ops for resizing." |
| Skill description | "Search Google News and return an array of news articles..." | "Find recent news stories on a topic when you need timely coverage and source links." |
| returns | "Array of image results with title, imageUrl..." | "Image results with titles, links, thumbnails, and source information." |

Minimal example

typescript
import { defineTool } from '../../core/define-tool.js';
import { mySkillHandler } from './skills/my-skill.js';

const { register, manifest } = defineTool({
  name: 'my-tool',
  displayName: 'My Tool',
  subtitle: 'Track brand mentions across the web in real time',
  description:
    'Monitor what people are saying about any brand, product, or topic across news, social media, and forums. ' +
    'Ideal for PR teams, marketers, and founders who need to stay on top of coverage and sentiment.',
  instructions:
    'Start with my_skill to get recent mentions for a brand name. ' +
    'Use broad terms for discovery, exact brand names for monitoring. ' +
    'Chain with sentiment analysis tools for deeper insight.',
  categories: ['analytics'],
  changelog: [{ date: '2026-03-22', changes: ['Initial release'] }],
  skills: [{
    name: 'my_skill',
    displayName: 'My Skill',
    description: 'List recent brand mentions for a search term so you can review what people are saying.',
    input: {
      type: 'object',
      properties: {
        param: { type: 'string', description: 'Brand name or search term to look up' },
      },
      required: ['param'],
    },
    examples: [
      { description: 'Find recent mentions for ToolRouter', input: { param: 'ToolRouter' } },
    ],
    returns: 'Recent mentions with the key details needed for review',
    handler: mySkillHandler,
  }],
});

export { register as registerMyTool, manifest };

Full example (real tool)

Here's a condensed version of magic-screenshots showing all features:

typescript
import { defineTool } from '../../core/define-tool.js';
import { captureHandler } from './skills/capture.js';
import { annotateHandler } from './skills/annotate.js';

const { register, manifest } = defineTool({
  name: 'magic-screenshots',
  displayName: 'Magic Screenshots',
  subtitle: 'Capture and annotate polished app store screenshots',
  description:
    'Capture, annotate, and export polished app store screenshots for launches, ASO refreshes, and creative reviews. ' +
    'Supports iPhone, iPad, and desktop viewports with full-page capture and annotation overlays.',
  instructions:
    'Start with capture to take a screenshot of any URL at a specific device size. ' +
    'Then use annotate to add callouts, highlights, or device frames. ' +
    'For App Store submissions, use iphone-15-pro device preset and avoid full_page mode.',
  categories: ['media', 'marketing'],
  changelog: [{ date: '2026-03-22', changes: ['Initial release'] }],

  // Suggested execution order for agents
  workflow: ['capture', 'annotate'],

  // Secrets/credentials this tool needs
  requirements: [
    {
      name: 'fal',
      type: 'secret',
      displayName: 'fal.ai API Key',
      description: 'API key for fal.ai model inference',
      required: true,
      acquireUrl: 'https://fal.ai/dashboard/keys',
      envFallback: 'FAL_KEY',
    },
  ],

  // Optional metadata
  author: { name: 'Humanleap', url: 'https://humanleap.com' },
  homepage: 'https://toolrouter.com/tools/magic-screenshots',

  skills: [
    {
      name: 'capture',
      displayName: 'Capture Screenshot',
      description:
        'Capture a page at a specific device size so you can create store-ready screenshots quickly.',
      contentType: 'image',
      input: {
        type: 'object',
        properties: {
          url: { type: 'string', description: 'URL to capture' },
          device: {
            type: 'string',
            description: 'Device viewport preset',
            enum: ['iphone-15-pro', 'ipad-pro', 'desktop'],
            default: 'iphone-15-pro',
          },
          full_page: {
            type: 'boolean',
            description: 'Capture full scrollable page',
            default: false,
          },
        },
        required: ['url'],
      },
      output: {
        type: 'object',
        properties: {
          image_path: { type: 'string', description: 'Path to saved PNG' },
          device: { type: 'string', description: 'Device preset used' },
        },
      },
      returns: 'Captured screenshot path plus the device metadata used for the export',
      annotations: {
        readOnlyHint: true,
        destructiveHint: false,
        idempotentHint: true,
        openWorldHint: true,
      },
      examples: [
        {
          description: 'Capture an iPhone screenshot of a landing page',
          input: { url: 'https://example.com', device: 'iphone-15-pro' },
        },
        {
          description: 'Capture a full-page desktop screenshot',
          input: { url: 'https://example.com', device: 'desktop', full_page: true },
        },
      ],
      handler: captureHandler,
    },
    // ... more skills
  ],
});

export { register as registerMagicScreenshots, manifest };

defineTool() reference

Tool-level fields

| Field | Type | Required | Rules |
| --- | --- | --- | --- |
| name | string | Yes | kebab-case, e.g. magic-screenshots |
| displayName | string | Yes | Min 3 chars. The tool's title |
| subtitle | string | Yes | 10-35 chars. Short tagline for search results and cards |
| description | string | Yes | Min 50 chars. Detailed user-facing description for the tool page |
| instructions | string | Yes | Min 50 chars. Agent-facing guide — best practices, skill selection, chaining tips |
| categories | string[] | Yes | Min 1, from allowed list (see below) |
| changelog | ChangelogEntry[] | Yes | Min 1 entry. Version auto-computed from length |
| skills | SkillConfig[] | Yes | Min 1 skill |
| workflow | string[] | No | Suggested skill execution order |
| requirements | ToolRequirement[] | No | Secrets and credentials |
| author | object | No | Author info (name, url, email) |
| homepage | string | No | Valid URL |
| icon | string | No | Valid URL to square image |
| knowledge | KnowledgeToolConfig | No | RAG config |
| currency | string | No | 3-char code, defaults to USD |

Allowed categories

data, media, search, marketing, development, communication,
analytics, productivity, ai, finance, security, infrastructure

Skill-level fields

| Field | Type | Required | Rules |
| --- | --- | --- | --- |
| name | string | Yes | snake_case, e.g. analyze_page |
| displayName | string | Yes | Min 3 chars |
| description | string | Yes | Min 30 chars. Action-oriented summary of what the skill helps do |
| input | JsonSchema | Yes | Every property must have description |
| output | JsonSchema | No | Recommended for publishability |
| handler | SkillHandler | Yes | Async function |
| examples | SkillExample[] | Yes | Min 1 (2+ recommended) |
| returns | string | No | Min 10 chars if provided. Describe the outcome, not every field |
| contentType | 'json' \| 'image' | No | Defaults to json |
| annotations | ToolAnnotations | No | Defaults applied automatically (see below) |
| execution | ExecutionProfile | No | Shortcut — placed inside annotations.execution |

Annotations

Annotations tell agents how to treat the tool:

| Annotation | Default | Meaning |
| --- | --- | --- |
| readOnlyHint | false | Tool does not modify anything |
| destructiveHint | false | Tool may delete/overwrite data |
| idempotentHint | true | Same input → same effect (safe to retry) |
| openWorldHint | false | Tool interacts with external services |

Set readOnlyHint: true for read-only tools (search, lookup). Set openWorldHint: true for tools that call external APIs or browse the web.

Examples

Examples are not decorative. They are:

  • Executed by toolrouter test as integration tests
  • Shown in docs, discovery, and agent-facing pages
  • Used by agents to understand input format
  • Validated against the input schema at registration

Each example must:

  • Have a description (min 5 chars)
  • Only use properties declared in the input schema
  • Provide all required properties
typescript
examples: [
  {
    description: 'Analyze SEO for a landing page',
    input: { url: 'https://example.com', depth: 2 },
  },
],

Pricing model

Every skill call costs max($0.005, raw_cost):

| Scenario | raw_cost | User pays | Example |
| --- | --- | --- | --- |
| Our infra (Playwright, Remotion, RapidAPI) | 0 | $0.005 | SEO scans, screenshots, video renders |
| Provider API (fal.ai, OpenRouter, ElevenLabs, etc.) | actual cost | raw_cost | $2.80 for a Kling video, $0.04 for a fal.ai image |

  • If your skill calls a paid provider API, return usage: { raw_cost } with the actual cost from the provider.
  • If your skill runs on our infrastructure only (no paid API), return raw_cost: 0 — the $0.005 minimum kicks in.
  • Use context.getRate(provider, metric) for dynamic rates — never hardcode dollar amounts.
  • There is no markup on provider costs. Revenue comes from the 5.5% purchase fee on credit top-ups.
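
As a sketch of the dynamic-rate rule, assuming a hypothetical 'openrouter' provider and per-token metric name (use whatever identifiers your deployment actually defines):

```typescript
// Hypothetical metering helper. The provider name 'openrouter' and the
// metric 'gpt-4o-output-tokens' are illustrative, not real registry entries.
type GetRate = (provider: string, metric: string) => number;

function meterUsage(tokensUsed: number, getRate: GetRate): { raw_cost: number } {
  // Look the rate up at call time instead of hardcoding a dollar amount
  const perToken = getRate('openrouter', 'gpt-4o-output-tokens');
  return { raw_cost: tokensUsed * perToken };
}
```

In a handler you would pass context.getRate and return the result as the usage field of your HandlerResult.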

How do I write a skill handler?

Handlers are async functions receiving (input, context). Access input values with input.param as string, log progress with context.log(), get credentials with context.get('openai'), and return either a plain object or { output, usage: { raw_cost } } for metered calls.

Handlers are async functions that receive input and a context object:

typescript
import type { SkillHandler } from '../../../core/types.js';
import { resolveKey, requireKeyOrUnavailable } from '../../../core/resolve-key.js';

export const mySkillHandler: SkillHandler = async (input, context) => {
  const url = input.url as string;

  // Check for required API key — returns clean "unavailable" if missing
  const unavailable = requireKeyOrUnavailable(context, 'openai', 'OPENAI_API_KEY');
  if (unavailable) return unavailable;

  const apiKey = resolveKey(context, 'openai', 'OPENAI_API_KEY')!;

  // Log progress (goes to stderr in stdio mode)
  context.log(`Processing ${url}`);

  // Do work...
  const result = await fetchSomething(url, apiKey);

  // Return output — plain object or HandlerResult with usage
  return {
    output: { data: result },
    usage: { raw_cost: 0.002 },
  };
};

Handler return values

A handler can return either:

Plain object — when there's no AI cost:

typescript
return { title: 'Example', score: 95 };

HandlerResult — when there's a cost to report:

typescript
return {
  output: { title: 'Example', score: 95 },
  usage: {
    raw_cost: 0.003,  // AI provider cost in tool's currency (USD)
    details: { model: 'gpt-4o', tokens: 1500 },  // Optional metadata
  },
};

Pricing model: The caller pays max($0.005, raw_cost). If your handler returns raw_cost: 0 (our infra, no provider API), the user is charged a flat $0.005. If your handler returns a provider cost (e.g. raw_cost: 2.80 for a fal.ai video), the user pays that amount. There is no markup — revenue comes from the 5.5% purchase fee on credit top-ups.
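
The billing rule above fits in one line; a minimal sketch:

```typescript
// Caller pays max($0.005, raw_cost): a flat minimum on our infra,
// pass-through cost when a paid provider is involved. No markup either way.
const MINIMUM_CHARGE_USD = 0.005;

function effectiveCharge(rawCost: number): number {
  return Math.max(MINIMUM_CHARGE_USD, rawCost);
}
```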

Handler context API

The context object provides:

| Property | Type | Description |
| --- | --- | --- |
| context.toolName | string | Current tool name |
| context.skillName | string | Current skill name |
| context.callId | string | Unique call ID (use for temp filenames) |
| context.log(msg) | function | Operational logging (stderr in stdio mode) |
| context.get?.(name) | function or undefined | Get a saved secret/credential by name (use ?. to guard) |
| context.getRate(provider, metric) | function | Get live pricing rate (Convex override > static default > 0) |
| context.getProviderKey(name) | function | *Deprecated* — use get() instead |
| context.callTool(ref, skill, input) | function | Call another tool (composition, max 6 levels deep) |
| context.knowledge | object or undefined | RAG search if knowledge/ exists |
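
To illustrate composition via context.callTool, here is a sketch with locally defined stand-in types; the 'image-ops' tool and its resize skill are hypothetical:

```typescript
// Stand-in types for the parts of the handler context this sketch uses.
type CallTool = (
  ref: string,
  skill: string,
  input: Record<string, unknown>,
) => Promise<Record<string, unknown>>;

interface Ctx {
  log: (msg: string) => void;
  callTool: CallTool;
}

// Capture a screenshot, then hand the file to a second (hypothetical)
// tool for resizing. Composition nesting is capped at 6 levels.
async function captureAndResize(url: string, context: Ctx): Promise<Record<string, unknown>> {
  const shot = await context.callTool('magic-screenshots', 'capture', { url });
  context.log(`Captured ${String(shot.image_path)}, resizing`);
  return context.callTool('image-ops', 'resize', {
    image_path: shot.image_path,
    width: 1280,
  });
}
```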

Error handling

Missing API keys — skip, don't throw. If a key is not configured, the tool should gracefully skip that data source or return an "unavailable" response. Never throw errors about missing keys — agents should not see credential setup instructions.

typescript
import { resolveKey, requireKeyOrUnavailable } from '../../../core/resolve-key.js';

// Single-source handler — entire skill needs one key:
const unavailable = requireKeyOrUnavailable(context, 'openai', 'OPENAI_API_KEY');
if (unavailable) return unavailable;  // Clean { available: false } response

// Multi-source handler — skip unavailable sources, use what's available:
const rapidapiKey = resolveKey(context, 'rapidapi', 'RAPIDAPI_KEY');
if (rapidapiKey) {
  results.push(await fetchFromRapidApi(rapidapiKey, ...));
}

For external service failures (not missing keys), throw with context:

typescript
try {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return await res.json();
} catch (err) {
  const message = err instanceof Error ? err.message : String(err);
  throw new Error(`Failed to fetch "${url}": ${message}`);
}

Asset delivery

If your handler produces files (screenshots, exports, generated images), return output keys ending in _path. Files must be under tmpdir() or ~/.toolrouter/assets/ — paths outside these directories are silently ignored.

typescript
import { writeFile } from 'fs/promises';
import { tmpdir } from 'os';
import { join } from 'path';

// Write to allowed directory, use callId for unique names
const filePath = join(tmpdir(), `screenshot-${context.callId}.png`);
await writeFile(filePath, buffer);

return {
  output: {
    image_path: filePath,   // _path suffix triggers auto-upload
    format: 'png',
  },
  usage: { raw_cost: 0.01 },
};

The gateway automatically:

  1. Uploads the file to the asset store
  2. Adds image_url and image_asset to the response
  3. For MCP clients: inlines images < 1MB as base64

File Outputs

Handlers produce files by downloading them locally and returning *_path keys. The gateway picks up any output key ending in _path, uploads it to the asset store, and adds *_url, *_asset, and *_page keys to the response automatically. Agents get a stable, shareable URL — no S3 SDK or upload code needed in your handler.

Basic pattern

Download the file → write to tmpdir() → return the path:

typescript
import { writeFile } from 'fs/promises';
import { tmpdir } from 'os';
import { join } from 'path';

export const exportHandler: SkillHandler = async (input, context) => {
  const buffer = await generateDocument(input.content as string);

  const filePath = join(tmpdir(), `report-${context.callId}.pdf`);
  await writeFile(filePath, buffer);

  return {
    output: {
      document_path: filePath,       // _path → triggers auto-upload
      document_url_direct: someUrl,  // ephemeral provider CDN URL (capture fallback)
      page_count: 12,
    },
    usage: { raw_cost: 0 },
  };
};

After the gateway processes this, the agent receives:

json
{
  "document_url": "https://api.toolrouter.com/v1/assets/ast_abc123",
  "document_asset": "ast_abc123",
  "document_page": "https://toolrouter.com/a/ast_abc123",
  "document_url_direct": "https://cdn.provider.com/...",
  "page_count": 12
}

What happens automatically

| Output key | Gateway adds |
| --- | --- |
| image_path | image_url, image_asset, image_page |
| document_path | document_url, document_asset, document_page |
| video_path | video_url, video_asset, video_page |
| audio_path | audio_url, audio_asset, audio_page |
| *_path (any prefix) | *_url, *_asset, *_page |

Files must live under tmpdir() or ~/.toolrouter/assets/ — paths outside these directories are ignored. Use context.callId in filenames to avoid collisions across concurrent calls.

The `*_url_direct` convention

document_url_direct (or image_url_direct, video_url_direct, etc.) is the ephemeral provider CDN URL — included as a capture fallback in case the local download fails before upload. It is not a permanent render URL. The gateway uploads the local file and produces the stable *_url; the *_url_direct is just a safety net for debugging.

Never use *_url_direct for permanent on-demand render URLs (e.g. a QR code endpoint that generates on request). For those, omit _path and return only the URL — or use the capture opt-out below.

Multi-file outputs

Return numbered suffixes for multiple files:

typescript
return {
  output: {
    image_path: paths[0],
    image_2_path: paths[1],
    image_3_path: paths[2],
    // Gateway produces: image_url, image_2_url, image_3_url, etc.
  },
  usage: { raw_cost: 0.03 },
};

Capture opt-out

If a URL is a permanent render endpoint that does not need uploading, add a {prefix}_capture: false flag alongside it:

typescript
return {
  output: {
    qr_url: 'https://api.qrserver.com/v1/create-qr-code/?data=...',
    qr_capture: false,  // tells gateway: don't download+upload this URL
  },
};

Without the opt-out, the gateway would attempt to upload everything it finds, which is wasteful for stable render-on-demand URLs.

Supported file extensions

The gateway recognises these extensions for auto-upload:

| Type | Extensions |
| --- | --- |
| Images | .png, .jpg, .jpeg, .gif, .webp, .svg, .avif |
| Video | .mp4, .webm, .mov |
| Audio | .mp3, .wav, .ogg, .flac |
| Documents | .pdf, .zip, .pptx, .docx, .xlsx, .csv |

Files with unrecognised extensions are still uploaded; content type is inferred from the extension or falls back to application/octet-stream.
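
That fallback rule can be sketched like this. The mapping below is partial and illustrative; the gateway's real table covers every extension listed above:

```typescript
// Partial, illustrative extension → content-type mapping.
const CONTENT_TYPES: Record<string, string> = {
  '.png': 'image/png',
  '.jpg': 'image/jpeg',
  '.mp4': 'video/mp4',
  '.mp3': 'audio/mpeg',
  '.pdf': 'application/pdf',
};

function inferContentType(filename: string): string {
  const dot = filename.lastIndexOf('.');
  const ext = dot === -1 ? '' : filename.slice(dot).toLowerCase();
  // Unknown extensions still upload — they just get the generic type
  return CONTENT_TYPES[ext] ?? 'application/octet-stream';
}
```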

Interactive Outputs (MCP Apps)

When your tool returns data to an MCP client like Claude, ChatGPT, or VS Code, it normally appears as plain text. MCP Apps let you render that data as an interactive UI — sortable tables, image galleries, charts, maps — directly inside the conversation. The user sees a card they can interact with instead of a wall of JSON.

How it works

  1. Your handler returns output with a format key declaring which UI component to use
  2. The gateway detects the format and attaches _meta.ui.resourceUri to the MCP response
  3. The host (Claude, ChatGPT, VS Code) fetches the HTML bundle and renders it in a sandboxed iframe
  4. The iframe receives your tool's output data and renders it interactively

Your handler code barely changes — you add format and normalize data into format_data:

typescript
// Before — plain JSON output
return {
  output: { items: results, total: results.length },
  usage: { raw_cost: 0.001 },
};

// After — same data, now renders as an interactive table
return {
  output: {
    format: 'table',
    format_data: {
      items: results.map(r => ({ title: r.name, url: r.link, snippet: r.description })),
      total: totalCount,
      page: 1,
    },
    // Keep original fields too — agents still read them
    items: results,
    total: results.length,
  },
  usage: { raw_cost: 0.001 },
};

The format_data key is the canonical data envelope — it matches the exact interface the format renderer expects. The original output fields stay alongside it for agents that read JSON directly.

Available formats

| Format | Renders as | Use when your output has |
| --- | --- | --- |
| table | Sortable, filterable rows with pagination | items[] — array of objects with consistent keys |
| gallery | Image grid with lightbox and download | images[] — array of { url, title? } |
| report | Scored sections with pass/warn/fail pills | score?, sections[] with items[] |
| detail | Key-value pairs with nested data | Flat or nested object with named fields |
| compare | Side-by-side columns | items[] — 2-4 objects to compare |
| timeline | Vertical event stream with status dots | events[] — { time, title, status? } |
| chart | Line, bar, or pie chart with tooltips | type, labels[], datasets[] |
| map | Map with pin markers and info cards | pins[] — { lat, lng, title } |
| media | Audio or video player with download | url, type: 'audio' \| 'video' |
| document | File card with preview and download | url, filename, type: 'pdf' \| 'docx' \| ... |
| list | Numbered items with title, description, link | items[] — { title, description?, url? } |

Data shapes per format

Each format has a strict format_data interface defined in src/core/format-types.ts. The renderer only reads format_data — never tool-specific output keys.

Table — the most common format:

typescript
return {
  output: {
    format: 'table',
    format_data: {
      items: [
        { title: 'Result 1', url: 'https://...', score: 95 },
        { title: 'Result 2', url: 'https://...', score: 87 },
      ],
      total: 142,
      page: 1,
      per_page: 20,
    },
  },
};

Gallery — for image-producing tools:

typescript
return {
  output: {
    format: 'gallery',
    format_data: {
      images: [
        { url: output.image_url, title: 'Generated image', page: output.image_page },
      ],
    },
  },
};

Report — for audit and analysis tools:

typescript
return {
  output: {
    format: 'report',
    format_data: {
      score: 72,
      sections: [
        {
          title: 'Meta Tags',
          status: 'pass',
          items: [
            { label: 'Title tag', value: 'Present', status: 'pass' },
            { label: 'Meta description', value: 'Missing', status: 'fail' },
          ],
        },
      ],
    },
  },
};

Chart — for time series and metrics:

typescript
return {
  output: {
    format: 'chart',
    format_data: {
      type: 'line',
      labels: ['Jan', 'Feb', 'Mar', 'Apr'],
      datasets: [{ label: 'Revenue', data: [120, 145, 162, 190] }],
    },
  },
};

List — for search results, ranked items, numbered lists with rich per-item content:

typescript
return {
  output: {
    format: 'list',
    format_data: {
      items: [
        { title: 'Result One', description: 'Brief summary of the first result', url: 'https://...' },
        { title: 'Result Two', description: 'Brief summary of the second result', detail: 'Extra context' },
      ],
    },
  },
};

Handling large datasets

Never dump hundreds of items into one output. Use pagination — return one page at a time:

typescript
export const searchHandler: SkillHandler = async (input, context) => {
  const page = (input.page as number) ?? 1;
  const perPage = 20;

  const allResults = await fetchResults(input.query as string);
  const pageResults = allResults.slice((page - 1) * perPage, page * perPage);

  return {
    output: {
      format: 'table',
      format_data: {
        items: pageResults,
        total: allResults.length,
        page,
        per_page: perPage,
      },
    },
  };
};

The MCP App UI shows "page 1 of 8" and fetches the next page by calling app.callServerTool() when the user clicks forward — no extra handler code needed. Just accept a page parameter.

Interactive features

The MCP App cards have these built-in interactions (no handler code needed):

  • Expand to fullscreen: every card has an expand button (top-right corner) that calls app.requestDisplayMode('fullscreen')
  • Clickable URLs: detail views auto-detect URL values and render them as tappable links via app.openLink()
  • Footer links: if you include a footer_link URL in your format_data, a "View" link appears in the card footer
  • Map pin clicks: clicking a pin updates the model's context with the selected location
  • Human-readable labels: all snake_case keys are automatically converted to "Title Case" labels

The UI deliberately avoids non-functional buttons. No "Download" or "Variations" buttons appear unless they're wired to real server-side actions. If your tool produces a shareable asset URL (via the asset pipeline), include it as a page field in your format_data and it becomes the footer link.

How MCP Apps rendering works (protocol level)

The interactive UI is powered by the MCP Apps extension (io.modelcontextprotocol/ui). Here's what happens under the hood:

  1. The gateway's tools/list response includes _meta.ui: { resourceUri: 'ui://toolrouter/app', visibility: ['model', 'app'] } on use_tool and get_job_result
  2. The gateway's resources/read response serves a single-file HTML bundle (FORMAT_BUNDLE_HTML) with MIME type text/html;profile=mcp-app and CSP metadata
  3. When a host like Claude calls use_tool, it detects the _meta.ui and fetches the HTML resource
  4. The host renders the HTML in a sandboxed iframe and forwards the tool result via ui/notifications/tool-result
  5. When a tool's output contains a format key, the gateway includes the full output as structuredContent on the CallToolResult — the host passes this directly to the iframe without it entering model context
  6. The HTML app (built with @modelcontextprotocol/ext-apps App class) checks structuredContent first, falls back to parsing JSON from text content blocks, then dispatches to the matching renderer

The HTML bundle includes:

  • 11 format renderers — table, gallery, report, detail, compare, timeline, chart, map, media, document, list
  • Host theme integration — applyDocumentTheme(), applyHostStyleVariables(), applyHostFonts() from the ext-apps SDK
  • Auto-resize — the App class uses ResizeObserver + sendSizeChanged() to tell the host the exact content height
  • CSP metadata — resourceDomains allows images/media from provider CDNs (fal.media, replicate, S3, CloudFront, etc.)
  • Loading states — ontoolinput shows "Working on [tool]..." while the tool executes
  • Cancellation — ontoolcancelled shows a clean "Cancelled" card
  • Dark mode — uses spec-defined --color-* CSS variables that automatically resolve differently in [data-theme="dark"] (not @media prefers-color-scheme which doesn't work in sandboxed iframes)
  • Spec-compliant CSS tokens — all styles use the MCP Apps spec's McpUiStyleVariableKey names (--color-background-primary, --font-sans, --border-radius-md, etc.) so host-injected themes apply automatically via applyHostStyleVariables()

The HTML bundle source lives in packages/mcp-apps/src/. To rebuild after changes: cd packages/mcp-apps && npm run build, then node scripts/embed-format-bundle.ts to embed it into the gateway.

Design principles

The MCP App UI is designed for non-technical users. Key rules:

  • No JSON ever — if data can't be rendered cleanly, it doesn't show
  • No dead buttons — buttons only appear when they're wired to real actions
  • Human-readable labels — snake_case keys become "Title Case" automatically
  • Clean error states — errors show plain text messages, not {"error":{"code":"..."}}
  • Null/empty values hidden — fields with null, undefined, or empty values are silently skipped
  • Gateway metadata stripped — internal keys like review_hint, request_id, usage, _meta never appear in the UI
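Two of these rules — label humanization and null/empty hiding — can be sketched in a few lines. The helper names `humanLabel` and `shouldRender` are illustrative, not the bundle's real functions:

```typescript
// snake_case key -> "Title Case" label, per the rule above
function humanLabel(key: string): string {
  return key
    .split('_')
    .filter(Boolean)
    .map(part => part.charAt(0).toUpperCase() + part.slice(1))
    .join(' ');
}

// Null/undefined/empty values are silently skipped; real values (including 0) render
function shouldRender(value: unknown): boolean {
  if (value === null || value === undefined) return false;
  if (typeof value === 'string' && value.trim() === '') return false;
  if (Array.isArray(value) && value.length === 0) return false;
  return true;
}
```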

Backward compatibility

Adding format is optional and non-breaking. Tools without a format key work exactly as before — the gateway returns plain JSON in a text content block. The format system is purely additive.

Clients that don't support MCP Apps (e.g. older MCP clients, stdio-only setups) ignore the _meta.ui field and get the same JSON output they always did. The data is always there in the text content block regardless of whether the UI renders.

Lazy loading heavy dependencies

Load expensive dependencies on first use, not at import time. This keeps startup fast:

typescript
// Good — lazy load
export const captureHandler: SkillHandler = async (input, context) => {
  const { chromium } = await import('playwright');
  // ...
};

// Bad — loaded at import time
import { chromium } from 'playwright';

How do I declare API key requirements?

Add a requirements array to defineTool() with entries specifying name (globally unique), type ('secret' or 'credential'), displayName, description, acquireUrl, and envFallback. Handlers access them via context.get('name').

If your tool needs API keys or identifiers, declare them in requirements:

typescript
requirements: [
  {
    name: 'openai',                          // Globally unique key
    type: 'secret',                          // 'secret' = masked, 'credential' = plain text
    displayName: 'OpenAI API Key',
    description: 'API key for GPT model access (min 10 chars)',
    required: true,
    acquireUrl: 'https://platform.openai.com/api-keys',
    envFallback: 'OPENAI_API_KEY',
  },
],

Resolution order (handler calls context.get('openai')):

  1. Per-request header: X-Provider-Key-OpenAI: sk-...
  2. User's saved value (Convex cloud)
  3. Config file (~/.toolrouter/config.json)
  4. Environment variable (process.env.OPENAI_API_KEY)
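The four-step precedence can be expressed as a one-line fallback chain. This is a sketch — the real gateway pulls these values from the request, Convex, the config file, and process.env rather than taking them as arguments:

```typescript
// First defined value wins, in the documented order
function resolveProviderKey(
  headerValue?: string,   // 1. X-Provider-Key-* request header
  savedValue?: string,    // 2. user's saved value (Convex cloud)
  configValue?: string,   // 3. ~/.toolrouter/config.json
  envValue?: string,      // 4. environment variable fallback
): string | undefined {
  return headerValue ?? savedValue ?? configValue ?? envValue;
}
```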

Important: requirement names are globally unique. If two tools both declare name: 'openai', they must have identical displayName, description, acquireUrl, and envFallback. The validator checks this.
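A sketch of the check the validator performs — two requirements sharing a name must agree on every user-facing field. The `requirementsCompatible` helper is hypothetical, not the validator's actual API:

```typescript
interface Requirement {
  name: string;
  displayName: string;
  description: string;
  acquireUrl: string;
  envFallback: string;
}

// Different names never conflict; same name requires identical metadata
function requirementsCompatible(a: Requirement, b: Requirement): boolean {
  if (a.name !== b.name) return true;
  return (
    a.displayName === b.displayName &&
    a.description === b.description &&
    a.acquireUrl === b.acquireUrl &&
    a.envFallback === b.envFallback
  );
}
```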

How do I add knowledge to a tool?

Create a knowledge/ directory with .md files and enable it in defineTool() with knowledge: {}. Handlers search it with context.knowledge?.search(query, topK). Knowledge is indexed lazily using local embeddings — no API key needed.

Add a knowledge/ directory with .md files to give your handler searchable domain knowledge:

src/tools/my-tool/
├── knowledge/
│   ├── best-practices.md
│   └── common-patterns.md
├── index.ts
└── skills/

Enable in defineTool():

typescript
knowledge: {
  dir: 'knowledge',       // Default: 'knowledge'
  maxChunkSize: 1500,     // Default: 1500 chars per chunk
  overlap: 200,           // Default: 200 chars overlap
  defaultTopK: 5,         // Default: 5 results
},
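To make the maxChunkSize/overlap settings concrete, here is an illustrative fixed-window chunker. The real indexer is also heading-aware; this sketch only shows how the overlap stitches adjacent windows together:

```typescript
// Split text into overlapping windows: each chunk is up to maxChunkSize chars,
// and consecutive chunks share `overlap` chars of context.
function chunkText(text: string, maxChunkSize = 1500, overlap = 200): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + maxChunkSize));
    if (start + maxChunkSize >= text.length) break;
    start += maxChunkSize - overlap;  // step forward, keeping `overlap` chars of context
  }
  return chunks;
}
```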

Use in handlers:

typescript
export const analyzeHandler: SkillHandler = async (input, context) => {
  // Search knowledge base
  const docs = await context.knowledge?.search('meta tag best practices', 3);

  // docs = [{ content, heading, score, source }]
  const guidance = docs?.map(d => d.content).join('\n') ?? '';

  // Use guidance in your analysis...
};

Knowledge is indexed lazily on first search using local embeddings (HuggingFace, no API key needed).

For built-in tools, set __knowledgePath to a pre-resolved absolute path (tsup bundles can't resolve relative paths at runtime):

typescript
import { resolve, dirname } from 'path';
import { fileURLToPath } from 'url';

const __knowledgePath = resolve(
  dirname(fileURLToPath(import.meta.url)),
  '../../src/tools/my-tool/knowledge'
);

How do I call other tools from a handler?

Use context.callTool(toolRef, skill, input) to call another tool. Billing and provider keys are inherited from the parent call. Max composition depth is 6 levels (depth 0–5).

A handler can call other tools using context.callTool(). Billing and provider keys are inherited from the parent call:

typescript
export const fullAuditHandler: SkillHandler = async (input, context) => {
  const url = input.url as string;

  // Call another tool's skill
  const seoResult = await context.callTool(
    'seo',
    'analyze_page',
    { url },
  );

  const screenshotResult = await context.callTool(
    'magic-screenshots',
    'capture',
    { url, device: 'desktop' },
  );

  return {
    seo: seoResult.output,
    screenshot: screenshotResult.output,
  };
};

Max composition depth: 6 levels (depth 0–5). Deeper calls are rejected.
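The depth rule can be sketched as a guard: depth 0 is the top-level call and each nested `context.callTool()` runs one level deeper. The function below is illustrative, not the gateway's actual implementation:

```typescript
const MAX_COMPOSITION_DEPTH = 6; // depths 0–5 allowed

// Compute the depth for a nested call, rejecting anything past the limit
function nextDepth(currentDepth: number): number {
  const depth = currentDepth + 1;
  if (depth >= MAX_COMPOSITION_DEPTH) {
    throw new Error(`Max composition depth (${MAX_COMPOSITION_DEPTH}) exceeded`);
  }
  return depth;
}
```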

Naming conventions

| What | Convention | Example |
|---|---|---|
| Tool name | kebab-case | magic-screenshots |
| Skill name | snake_case | analyze_page |
| Skill filename | kebab-case | skills/analyze-page.ts |
| Handler variable | camelCase + Handler | analyzePageHandler |
| Register export | register + PascalCase | registerMagicScreenshots |
| MCP tool name | tool__skill | seo__analyze_page |
| Tool directory | kebab-case | src/tools/magic-screenshots/ |
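The casing rules reduce to two regexes plus the `tool__skill` join. These patterns are a sketch of what the validator checks, not its exact source:

```typescript
const KEBAB_CASE = /^[a-z0-9]+(-[a-z0-9]+)*$/;  // tool names, directories, skill filenames
const SNAKE_CASE = /^[a-z0-9]+(_[a-z0-9]+)*$/;  // skill names

// MCP tool name joins tool and skill with a double underscore
function mcpToolName(tool: string, skill: string): string {
  return `${tool}__${skill}`;
}
```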

How do I register a tool?

Import the register function and call it inside createRegistry() in src/tools/index.ts. The registry validates the manifest and stores the handlers at startup.

Add your tool to src/tools/index.ts:

typescript
import { registerMyTool } from './my-tool/index.js';

export async function createRegistry(): Promise<ToolRegistry> {
  const registry = new ToolRegistry();

  registerMyTool(registry);
  // ... other tools

  return registry;
}

How do I validate and test a tool?

Run toolrouter validate for manifest checks (naming, schemas, examples) or toolrouter validate --strict to treat warnings as errors. Run toolrouter test my-tool to execute every example as an integration test with pass/fail reporting.

The validator enforces all the rules in this guide. Run it often:

bash
toolrouter validate                # Warnings only
toolrouter validate --strict       # Warnings become errors

What `validate` checks

Errors (always fail):

  • Tool name not kebab-case
  • Skill name not snake_case
  • Missing or invalid subtitle (must be 10-80 chars)
  • description too short (tool: 50 chars, skill: 30 chars)
  • Missing or too-short instructions (min 50 chars, aim for ~600)
  • Missing changelog (min 1 entry required)
  • Invalid category
  • No skills defined
  • No examples for a skill
  • Input property missing description
  • Example uses undeclared property
  • Example missing required property
  • Duplicate skill names
  • Workflow references non-existent skill
  • Requirement description too short (min 10 chars)
  • Requirement name conflicts across tools

Warnings (--strict makes these errors):

  • Missing author
  • Skills with fewer than 2 examples
  • Skills without outputSchema
  • Knowledge directory missing or empty

What `test` checks

bash
toolrouter test my-tool              # Test all skills
toolrouter test my-tool --timeout 60 # Custom timeout (seconds)

The test runner:

  1. Executes every example in every skill
  2. Validates output against outputSchema (in strict mode)
  3. Reports pass/fail and latency per example
  4. Default timeout: 30 seconds per example

Input schema best practices

Every property needs a description

This is enforced by the validator. Agents use descriptions to form correct inputs:

typescript
// Good
properties: {
  url: { type: 'string', description: 'Full URL to analyze including protocol' },
  depth: { type: 'integer', description: 'Crawl depth (1 = single page, 2+ = follow links)', default: 1 },
}

// Bad — will fail validation
properties: {
  url: { type: 'string' },
  depth: { type: 'integer' },
}

Use enums for constrained values

typescript
device: {
  type: 'string',
  description: 'Device viewport preset',
  enum: ['iphone-15-pro', 'ipad-pro', 'desktop'],
  default: 'iphone-15-pro',
}

Declare defaults

typescript
format: {
  type: 'string',
  description: 'Output format',
  enum: ['png', 'jpg', 'webp'],
  default: 'png',
}

Keep required properties minimal

Only mark properties as required if the skill genuinely cannot function without them. More optional properties = more flexible for agents.

Execution profiles (async skills)

For skills that take more than a few seconds, declare an execution profile. The gateway uses estimatedSeconds to decide sync vs async routing (>= 30s routes async):

typescript
skills: [{
  name: 'crawl_site',
  displayName: 'Crawl Site',
  description: 'Crawl an entire website and return structured data for all pages found',
  // ... input, handler, examples ...
  execution: {
    estimatedSeconds: 60,     // Gateway routes async if >= 30
    timeoutSeconds: 300,      // Max wait. Defaults to 2x estimatedSeconds
    mode: 'io',               // 'io' = network polling, 'cpu' = Playwright/ffmpeg
  },
}],

The execution field is a shortcut — it gets placed inside annotations.execution automatically. You do not need to nest it manually.
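The routing decision described above — async at 30 seconds or more, timeout defaulting to twice the estimate — can be sketched as:

```typescript
interface ExecutionProfile {
  estimatedSeconds: number;
  timeoutSeconds?: number;
  mode?: 'io' | 'cpu';
}

// Illustrative version of the gateway's sync/async routing rule
function routeExecution(profile: ExecutionProfile) {
  return {
    async: profile.estimatedSeconds >= 30,
    timeoutSeconds: profile.timeoutSeconds ?? profile.estimatedSeconds * 2,
  };
}
```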

Dynamic pricing with context.getRate()

Never hardcode provider costs. Use context.getRate() to resolve live pricing that can be updated at runtime via Convex without redeployment:

typescript
import { resolveKey, requireKeyOrUnavailable } from '../../../core/resolve-key.js';

export const scrapeHandler: SkillHandler = async (input, context) => {
  const unavailable = requireKeyOrUnavailable(context, 'firecrawl', 'FIRECRAWL_API_KEY');
  if (unavailable) return unavailable;

  const apiKey = resolveKey(context, 'firecrawl', 'FIRECRAWL_API_KEY')!;
  const result = await firecrawlFetch(input.url as string, apiKey);

  // Resolve rate dynamically — Convex override > static default > 0
  const ratePerCredit = context.getRate('firecrawl', 'credit');
  const rawCost = (result.creditsUsed ?? 1) * ratePerCredit;

  return {
    output: result.data,
    usage: { raw_cost: rawCost },
  };
};

Static rates are defined in src/core/provider-pricing.ts as fallbacks. If your tool uses a new provider, add a static rate there.
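The precedence behind context.getRate() — live Convex override, then static default, then 0 — can be sketched as a lookup. The argument shape here is illustrative; the real context method takes only provider and unit:

```typescript
// Override from Convex wins, then the static table, then zero
function getRate(
  overrides: Record<string, number>,   // live values synced from Convex
  staticRates: Record<string, number>, // src/core/provider-pricing.ts fallbacks
  provider: string,
  unit: string,
): number {
  const key = `${provider}:${unit}`;
  return overrides[key] ?? staticRates[key] ?? 0;
}
```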

Shared clients and handler factories

Shared API client

When multiple tools call the same API, extract a shared client into src/tools/_shared/:

typescript
// src/tools/_shared/my-api-client.ts
import type { SkillContext } from '../../core/types.js';

export const MY_API_CREDENTIAL = {
  name: 'my_api',
  type: 'secret' as const,
  displayName: 'My API Key',
  description: 'API key for My Service (get it at my-api.com/keys)',
  required: true,
  acquireUrl: 'https://my-api.com/keys',
  envFallback: 'MY_API_KEY',
};

export async function myApiFetch(
  path: string,
  params: Record<string, unknown>,
  apiKey: string,
) {
  const res = await fetch(`https://api.my-service.com${path}`, {
    method: 'POST',
    headers: { 'Authorization': `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
    body: JSON.stringify(params),
  });
  if (!res.ok) throw new Error(`My API error: HTTP ${res.status}`);

  return { data: await res.json(), raw_cost: 0.001 };
}
// Handler does the key check before calling the client:
// const unavailable = requireKeyOrUnavailable(context, 'my_api', 'MY_API_KEY');
// if (unavailable) return unavailable;
// const key = resolveKey(context, 'my_api', 'MY_API_KEY')!;
// const result = await myApiFetch('/search', params, key);

Then each tool imports the shared client and credential definition.

Adding an upstream provider? If your client needs circuit breakers, /tmp asset downloads, model registry entries, and dispatch routing (e.g. image/video providers), see Adding Providers for the full pattern.

Handler factory

When a tool has many skills that differ only by an API endpoint, use a factory:

typescript
// All 11 web-search skills generated from a single factory
function createSearchHandler(endpoint: string): SkillHandler {
  return async (input, context) => {
    const result = await serperSearch(endpoint, input, context);
    return { output: result.data, usage: { raw_cost: result.raw_cost } };
  };
}

skills: [
  { name: 'search', handler: createSearchHandler('/search'), ... },
  { name: 'news_search', handler: createSearchHandler('/news'), ... },
  { name: 'image_search', handler: createSearchHandler('/images'), ... },
],

Stateful session pattern

For multi-step workflows where skills share state (e.g., pentest), return a session_id from the first skill and accept it in subsequent skills:

typescript
// skills/recon.ts — creates the session
export const reconHandler: SkillHandler = async (input, context) => {
  const session = createSession(input.target as string);
  return { session_id: session.id, findings: session.recon };
};

// skills/scan.ts — continues the session
export const scanHandler: SkillHandler = async (input, context) => {
  const session = getSession(input.session_id as string);
  if (!session) throw new Error('Invalid session_id — run recon first');
  // ... scan using session state
};

Use the workflow array to hint the correct order: workflow: ['recon', 'scan_vulnerabilities', 'generate_report'].
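A minimal in-memory store behind `createSession`/`getSession` might look like the sketch below. This is illustrative only — a real tool would add expiry and cleanup:

```typescript
import { randomUUID } from 'crypto';

interface Session { id: string; target: string; state: Record<string, unknown> }

// Module-level map shared by all skills in the tool
const sessions = new Map<string, Session>();

function createSession(target: string): Session {
  const session: Session = { id: randomUUID(), target, state: {} };
  sessions.set(session.id, session);
  return session;
}

function getSession(id: string): Session | undefined {
  return sessions.get(id);
}
```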

Workflow hints

The workflow array suggests an execution order to agents:

typescript
workflow: ['capture', 'restyle', 'annotate', 'export'],

This is a hint, not enforced. Agents can call skills in any order.

How do I add my tool to the E2E test suite?

After registering your tool, add it to the test infrastructure so all three tiers (manifests, billing, handler execution) cover it automatically.

Step 1: Register in test registry

Add your register function to tests/helpers/registry-factory.ts:

typescript
import { registerMyTool } from '../../src/tools/my-tool/index.js';

const ALL_REGISTERS = [
  // ... existing registers
  registerMyTool,
];

This gives you Tier 1 (manifest validation) and Tier 2 (billing pipeline) for free.

Step 2: Add MSW handlers

If your tool calls external APIs, add mock handlers in tests/helpers/msw-handlers.ts:

typescript
// Mock My API responses
http.get('https://api.my-service.com/*', ({ request }) => {
  const url = new URL(request.url);
  if (url.pathname.includes('/search')) {
    return HttpResponse.json({ results: [{ title: 'Mock result' }] });
  }
  return HttpResponse.json({ status: 'ok' });
}),

Place handlers before the catch-all at the bottom. For multi-step flows (submit + poll), create separate handlers.

Step 3: Add mock provider keys

If your tool declares requirements, add entries to MOCK_PROVIDER_KEYS in tests/e2e/tools.test.ts:

typescript
const MOCK_PROVIDER_KEYS: Record<string, string> = {
  // ... existing keys
  my_api: 'test-my-api-key',
};

Step 4: Skip if needed

If your tool depends on Playwright, CLI binaries, system DNS, or requires real files, add it to SKIP_EXECUTION:

typescript
const SKIP_EXECUTION = new Set([
  // ... existing skips
  'my-tool',
]);

Step 5: Run the suite

bash
npm run test:e2e

Your tool is now covered by all three tiers.

How do I create a plugin?

Run toolrouter init my-tool --plugin to create a standalone npm package. The package exports a register function. Users install plugins with toolrouter plugins add my-tool-plugin. Plugin tools get the same registry, billing, assets, and composition as built-in tools.

For tools distributed as separate npm packages:

bash
toolrouter init my-tool --plugin

This creates a standalone package with its own package.json. The package must export a register function:

typescript
export { register } from './index.js';

Users install plugins:

bash
toolrouter plugins add my-tool-plugin

Plugin tools get the same registry, billing, assets, and composition as built-in tools.

Checklist

Before shipping a tool:

Manifest & schema:

  • [ ] toolrouter validate --strict passes
  • [ ] toolrouter test <tool> passes
  • [ ] subtitle is 10-80 chars — punchy App Store tagline
  • [ ] description is 50+ chars — detailed user-facing pitch (2-4 sentences)
  • [ ] instructions is 50+ chars (aim for ~600) — agent guide with best practices and skill tips
  • [ ] changelog has at least 1 entry
  • [ ] Every skill has at least 2 examples
  • [ ] Every input property has a description
  • [ ] outputSchema defined for all skills
  • [ ] returns field describes what the skill gives back
  • [ ] requirements declared for any external API keys

Handler quality:

  • [ ] Heavy dependencies lazy-loaded (not imported at top level)
  • [ ] Missing keys use requireKeyOrUnavailable() / resolveKey() — never throw on missing keys
  • [ ] File outputs use _path suffix for auto-upload
  • [ ] Cost reported via usage: { raw_cost } for paid APIs
  • [ ] Pricing uses context.getRate() instead of hardcoded values
  • [ ] execution profile set for skills taking > 5 seconds
  • [ ] Output includes format + format_data for interactive rendering (see format interfaces in src/core/format-types.ts)
  • [ ] format_data matches the canonical interface for the declared format (TableData, GalleryData, etc.)
  • [ ] Large result sets paginated via page parameter (max ~20 items per response)

Test suite wired:

  • [ ] Registered in tests/helpers/registry-factory.ts
  • [ ] MSW handlers added for external APIs in tests/helpers/msw-handlers.ts
  • [ ] Mock provider keys added in tests/e2e/tools.test.ts (if tool has requirements)
  • [ ] npm run test:e2e passes

Registry:

  • [ ] Registered in src/tools/index.ts
  • [ ] npm run build succeeds

App icon:

  • [ ] Icon definition added to scripts/generate-icons.mjs (subject, material, gradient, lighting)
  • [ ] Icon generated: FAL_KEY=<key> node scripts/generate-icons.mjs --tool <tool-name>
  • [ ] PNG (1024x1024) and WebP (512x512) exist in web/public/icons/

Going live:

  • [ ] Committed and pushed to main (triggers Railway + Vercel deploy)
  • [ ] Tool approved in production Convex (see "How do I push a tool live?" below)
  • [ ] Verified on https://api.toolrouter.com/v1/tools/<tool-name>
  • [ ] Verified on https://toolrouter.com/tools/<tool-name> (may take up to 60s for ISR)

How do I push a tool live?

Registering a tool in code and pushing to main is not enough. ToolRouter has a tool approval system — new tools are hidden from the public API until explicitly approved in production Convex. This prevents unfinished tools from appearing on the website or being discovered by agents.

Step 1: Generate an app icon

Every tool needs an icon before going live. Add a definition to scripts/generate-icons.mjs and generate:

bash
# Get the FAL_KEY from Railway
export FAL_KEY=$(railway variables --json | node -e "const d=require('fs').readFileSync('/dev/stdin','utf8');console.log(JSON.parse(d).FAL_KEY)")

# Generate icon (creates PNG 1024x1024 + WebP 512x512 in web/public/icons/)
node scripts/generate-icons.mjs --tool my-tool

See the generate-tool-icons skill for icon design rules (single 3D object, real materials, gradient background, no text/clutter).

Step 2: Push to main

Commit your changes (including the icon files) and push. This triggers auto-deploys on both Railway (API) and Vercel (website):

bash
git push origin main

Railway deploys the API gateway with the new tool registered internally. Vercel rebuilds the website. Both take 1–3 minutes.

Step 3: Approve the tool in production Convex

New tools must be approved before they appear in GET /v1/tools, the website, or MCP discovery.

Using toolrouter approve CLI (recommended):

The CLI needs CONVEX_URL and TOOLROUTER_CONVEX_SERVER_SECRET pointing at production. These are on the Railway ToolRouter service. Pull them and run:

bash
# Get production credentials from Railway (ToolRouter service must be linked)
railway service link ToolRouter

# Approve a single tool
CONVEX_URL="https://jovial-pika-231.eu-west-1.convex.cloud" \
TOOLROUTER_CONVEX_SERVER_SECRET="$(railway variables --json | node -e "console.log(JSON.parse(require('fs').readFileSync('/dev/stdin','utf8')).TOOLROUTER_CONVEX_SERVER_SECRET)")" \
node dist/bin/toolrouter.js approve my-tool

# Approve ALL tools at once
CONVEX_URL="https://jovial-pika-231.eu-west-1.convex.cloud" \
TOOLROUTER_CONVEX_SERVER_SECRET="$(railway variables --json | node -e "console.log(JSON.parse(require('fs').readFileSync('/dev/stdin','utf8')).TOOLROUTER_CONVEX_SERVER_SECRET)")" \
node dist/bin/toolrouter.js approve --all

Alternative: Direct Convex function call:

bash
# Approve a single tool
CONVEX_DEPLOYMENT=prod:jovial-pika-231 npx convex run --no-push \
  toolApprovals:setApproval '{"toolName": "my-tool", "approved": true, "updatedBy": "yourname"}'

# Approve multiple tools at once
CONVEX_DEPLOYMENT=prod:jovial-pika-231 npx convex run --no-push \
  toolApprovals:bulkSetApproval '{"toolNames": ["tool-a", "tool-b"], "approved": true, "updatedBy": "yourname"}'

Important: .env.local points to the dev Convex instance (acoustic-antelope-876), NOT production. The toolrouter approve CLI and npx convex run both need explicit production targeting — never rely on .env.local for production operations.

Step 4: Wait for the approval refresh

The API gateway refreshes its approved tool list from Convex every 60 seconds. After approving, wait up to 60 seconds for the tool to appear in the public API.

Step 5: Verify

bash
# Check the API serves the tool
curl -s https://api.toolrouter.com/v1/tools/my-tool | python3 -m json.tool

# Check the website (ISR revalidates every 60s)
curl -sI https://toolrouter.com/tools/my-tool | head -1
# Should return: HTTP/2 200

Step 6: Test via ToolRouter MCP

The final verification is calling the tool through the MCP package — the same way real agents use it:

bash
# If not already configured, add ToolRouter MCP:
claude mcp add toolrouter -- npx -y toolrouter-mcp

# Then in Claude Code, use the discover and use_tool MCP tools to:
# 1. discover("my-tool") — verify it appears in results
# 2. use_tool({ tool: "my-tool", skill: "my_skill", input: { ... } }) — verify it executes

This confirms the full pipeline: agent → MCP → gateway → handler → response.

How approval works internally

  • Tools register into the ToolRegistry at startup regardless of approval status
  • registry.listTools() filters by the approved set (loaded from Convex tool_approvals table)
  • registry.listToolsUnfiltered() returns all tools (used by admin endpoints)
  • The gateway calls refreshApprovals() every 60 seconds to sync the in-memory set
  • Health endpoint (/health) reports the approved tool count
  • If the approval loader is not configured (e.g. local dev with no Convex), all tools are visible
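The behavior above can be sketched as a tiny registry: registration is unconditional, visibility is filtered, and a missing approval loader means everything is visible. The class name and method bodies are illustrative, not the actual ToolRegistry source:

```typescript
class ApprovalRegistry {
  private tools: string[] = [];
  private approved: Set<string> | null = null; // null = no approval loader configured

  register(name: string) { this.tools.push(name); }

  // Called every 60s by the gateway to sync the in-memory set
  refreshApprovals(approvedNames: string[]) { this.approved = new Set(approvedNames); }

  listTools(): string[] {
    if (this.approved === null) return this.tools; // local dev: all tools visible
    return this.tools.filter(t => this.approved!.has(t));
  }

  listToolsUnfiltered(): string[] { return this.tools; } // admin endpoints
}
```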

Checking current approvals

bash
# List all currently approved tools
CONVEX_DEPLOYMENT=prod:jovial-pika-231 npx convex run --no-push toolApprovals:listApproved

# Revoke approval (hides tool from public API)
CONVEX_DEPLOYMENT=prod:jovial-pika-231 npx convex run --no-push \
  toolApprovals:setApproval '{"toolName": "my-tool", "approved": false, "updatedBy": "yourname"}'

How do I add an icon for my tool?

Every tool needs an icon. Icons are 3D photorealistic objects on gradient backgrounds, generated with Flux 2 Pro via fal.ai ($0.03/image).

Icon config format

Add your tool to TOOL_ICONS in scripts/generate-icons.mjs:

javascript
'my-tool': {
  subject: {
    object: '3D [single object description]',
    material: '[what it is made of — ceramic, metal, glass, etc.]',
  },
  background: {
    gradient: ['[top color]', '[bottom color]'],
  },
  lighting: '[soft top-down | dramatic rim | warm golden | cool ambient]',
},

Design rules

  1. ONE object only — not "magnifying glass with globe and documents." Just the magnifying glass.
  2. 3D with real materials — describe what it's made of: ceramic, chrome, leather, glass, wood.
  3. Unique gradient — check existing tools in your category. Don't duplicate colors.
  4. No accessories — no floating secondary elements, sparkles, or decorative extras.

Category color guidelines

| Category | Gradient Range |
|---|---|
| Search/Web | Blues, cyans, teals |
| Security | Dark reds, crimsons, blacks |
| Finance | Teals, golds, navies |
| Media/Audio | Purples, violets, magentas |
| Data/Reference | Greens, limes, soft blues |
| Social | Pinks, oranges, corals |
| Marketing | Electric blues, purples |
| Weather/Nature | Sky blues, golds, oranges |
| Utility | Charcoals, slates, dark grays |

Generate the icon

bash
FAL_KEY=<key> node scripts/generate-icons.mjs --tool my-tool

Output: web/public/icons/my-tool.png (1024x1024) + my-tool.webp (512x512)

How metadata is surfaced

Your tool's metadata appears in different contexts with different levels of detail:

| Context | What's shown |
|---|---|
| Search results (6+ matches) | name + subtitle + skill names |
| Compact results (2-5 matches) | name + subtitle + skills with input params |
| Exact match / tool detail | subtitle + description + instructions + full skills with schemas and examples |
| Website tool card | Icon + displayName + subtitle + star rating |
| Website tool page | Icon + displayName + subtitle + description + all skills |
| REST API GET /v1/tools/:tool | Full manifest including all fields |

The subtitle is the most-seen field — it appears in every context. Write it as if it's the only thing agents and users will read before deciding to click.

The instructions field is agent-only — it never appears on the website. It's shown when an agent queries a specific tool by name via discover. This is your chance to give agents a detailed guide.

See also

  • Adding Providers for integrating a new upstream API provider (client, models, dispatch, billing)
  • Architecture for the runtime execution path
  • CLI for validation, testing, and plugin workflows
  • Integration for how tools are exposed via MCP and REST