Authoring an Action

This guide walks you from an empty file to a working, installable action. By the end you will have a single Markdown file the LLM can discover, the runtime can execute deterministically, and any user can install with one CLI command.

If you have not read it yet, start with Actions for the model. This guide is the how; that page is the what and why. The companion guide is Authoring a Connector — actions exist to call connectors, so the two surfaces are designed in parallel.

The reference templates throughout this guide are the actions/*/action.md files in aileron-connector-google — six real action templates spanning idempotent reads, non-idempotent writes, and per-call user approval.

What you are building

An action is a single Markdown file. There is no compiled artifact, no separate manifest, no bundled binary. The whole contract — what the action does, which connectors it touches, which capabilities it exercises, what arguments it accepts — lives in TOML frontmatter at the top of the file. The Markdown body is documentation that doubles as the description the LLM reads when deciding whether to invoke.

Once installed, the file lives at ~/.aileron/actions/<name>.md and the user owns it. They can edit the trigger phrases, swap a connector version, retitle the action, rewrite the body — without anyone’s permission. You are not shipping a runtime dependency; you are shipping a starting-point template that gets copied into the user’s home directory and forked from there.

This shapes how you author. The file is the artifact. There is nothing else.

Project skeleton

For an action template you are publishing, the file lives inside a connector repo per the layout in Publishing a Connector:

aileron-connector-google/
└── actions/
    └── list-recent-emails/
        └── action.md

The file is named action.md; the directory name becomes the action handle (list-recent-emails here). When a user runs aileron action add github://ALRubinger/aileron-connector-google/actions/list-recent-emails@<version>, the runtime fetches the tarball, verifies it, and writes the action file to ~/.aileron/actions/list-recent-emails.md.

For an action you are writing for yourself — no template, no publishing — just create the file directly under ~/.aileron/actions/. Same shape, same rules.

The skeleton

A complete, valid minimum, modeled on list-recent-emails/action.md from the reference repo:

+++
name = "list-recent-emails"
version = "0.0.0-dev"
source = "github://ALRubinger/aileron-connector-google/actions/list-recent-emails@0.0.0-dev"

[[requires.connectors]]
name = "github://ALRubinger/aileron-connector-google"
version = "0.0.0-dev"
hash = "sha256:bound-at-release"
capabilities = ["list_recent_emails"]

[match]
intent = "list my recent Gmail messages"

[[execute]]
id = "list"
connector = "github://ALRubinger/aileron-connector-google"
op = "list_recent_emails"
idempotent = true

[[inputs]]
name = "query"
type = "string"
description = "Optional Gmail search query, e.g. \"is:unread\" or \"from:someone@example.com\"."
required = false

[[inputs]]
name = "max_results"
type = "integer"
description = "Maximum number of messages to return. Defaults to 10; capped at 100."
required = false
+++

# List Recent Gmail Messages

Fetches a list of recent Gmail messages for the authenticated user. Returns
the raw `users.messages.list` response (a paginated list of `{id, threadId}`
pairs); the agent or a downstream action resolves message bodies as needed.

When it fires:
- "summarize my unread emails from this week"
- "show me messages from someone@example.com"
- "what's in my inbox right now"

That is everything. Open a +++ block at the top, close it, and write the body underneath.

The 0.0.0-dev and sha256:bound-at-release placeholders are intentional in source. The release workflow substitutes the real values at tag-push time. That pattern is covered in the Identity and Connector dependencies sections below, and in detail in Publishing a Connector.

Frontmatter, field by field

Every field in the frontmatter is enforced. There is no “ignored if empty” — unknown keys are a parse error and missing required keys are a validation error. The schema is closed in v1; better one specific message than silent acceptance of a typo.

Identity

name    = "list-recent-emails"
version = "0.0.0-dev"
source  = "github://ALRubinger/aileron-connector-google/actions/list-recent-emails@0.0.0-dev"

name is the bare local handle. Lowercase, starts with a letter or digit, dashes/dots/underscores allowed (^[a-z0-9][a-z0-9._-]*$). It becomes the filename and the tool name the LLM sees.
version is strict SemVer 2.0 — no v prefix. Pre-MVP convention is to stay at 0.x.y until the action’s surface is stable.
source is the FQN+version the file was copied from. It is provenance only: the runtime never re-fetches it when loading or running the action. If the user edits the file post-install, source still points at the upstream origin, so an explicit aileron action update knows where to fetch the new template.

The recognized FQN schemes are github://, gitlab://, and hub://. Anything else is a validation error.
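
As a rough pre-flight check, the three identity rules can be expressed as follows. This is a sketch under assumptions, not the validator Aileron ships: identity_errors is a hypothetical helper, the name pattern is quoted from this guide, and the SemVer pattern is adapted from the regular expression suggested on semver.org.

```python
import re

# Pattern quoted from this guide for action names.
NAME_RE = re.compile(r"[a-z0-9][a-z0-9._-]*")

# SemVer 2.0.0 pattern adapted from the suggested regex on semver.org.
SEMVER_RE = re.compile(
    r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?"
)

# Schemes listed in this guide.
FQN_SCHEMES = ("github://", "gitlab://", "hub://")


def identity_errors(name: str, version: str, source: str) -> list[str]:
    """Return one message per identity field that breaks the documented rules."""
    errors = []
    if not NAME_RE.fullmatch(name):
        errors.append(f"name {name!r} must match ^[a-z0-9][a-z0-9._-]*$")
    if not SEMVER_RE.fullmatch(version):
        errors.append(f"version {version!r} is not strict SemVer 2.0 (no v prefix)")
    if not source.startswith(FQN_SCHEMES):
        errors.append(f"source {source!r} must start with one of {FQN_SCHEMES}")
    return errors


print(identity_errors(
    "list-recent-emails",
    "0.0.0-dev",
    "github://ALRubinger/aileron-connector-google/actions/list-recent-emails@0.0.0-dev",
))  # []
print(len(identity_errors("List Emails", "v1.0.0", "https://example.com/action")))  # 3
```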

Source-template convention: in a published action template, both version and the version suffix in source are typically 0.0.0-dev. The release workflow substitutes the real version (extracted from the pushed tag) into a build copy of the file before signing. The committed source intentionally stays a template, so the publisher pushes a tag and never edits version fields by hand. The shipped tarball — the file the user actually installs — carries the real values.

Connector dependencies

[[requires.connectors]]
name         = "github://ALRubinger/aileron-connector-google"
version      = "0.0.0-dev"
hash         = "sha256:bound-at-release"
capabilities = ["list_recent_emails"]

This block does several jobs at once:

Pins the connector. name + version + hash together identify exactly one binary. The runtime refuses to execute an action whose pinned hash does not match what is in the content-addressed store. A connector publisher cannot ship a malicious update to your installed action — the action references a specific content hash, period.
Declares the capability subset. capabilities is what the action actually uses, not what the connector is capable of. The Google connector exposes list_recent_emails, get_email, send_email, create_calendar_event, etc.; an action that only lists messages declares ["list_recent_emails"] and that is the entire surface area. The capability strings are typically the connector’s op names, since each op corresponds to one external operation.
Drives the audit story. A reader of the file can tell at a glance what surface the action touches. If a future template update adds a new capability, it shows up as a TOML diff against the user’s local copy and they can accept or reject it.

You can declare multiple connectors and compose them in your execute chain: one step pulls a list of messages from Gmail, the next creates a calendar event in Calendar.

Placeholder convention: as with version, the 0.0.0-dev and sha256:bound-at-release placeholders are how a published template stays template-shaped. The release workflow substitutes both at tag-push time:

0.0.0-dev → the real version (taken from the pushed tag).
sha256:bound-at-release → the real connector content hash (computed after the connector tarball is built in the same workflow run).

The committed source carries the placeholders; the published tarball carries the real values. This means every action in a connector repo automatically pins to the same connector hash, in the same release cohort, with no per-action commits per release.
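
The substitution itself is simple enough to sketch. The real release workflow is described in Publishing a Connector; the function below is only an illustration of the idea, with a hypothetical name (bind_release_values) and made-up release values.

```python
VERSION_PLACEHOLDER = "0.0.0-dev"
HASH_PLACEHOLDER = "sha256:bound-at-release"


def bind_release_values(template: str, version: str, connector_hash: str) -> str:
    """Return a build copy of the template with both placeholders replaced."""
    if not connector_hash.startswith("sha256:"):
        raise ValueError("connector hash must look like sha256:<hex>")
    bound = template.replace(VERSION_PLACEHOLDER, version)
    return bound.replace(HASH_PLACEHOLDER, connector_hash)


TEMPLATE = '''name = "list-recent-emails"
version = "0.0.0-dev"

[[requires.connectors]]
name = "github://ALRubinger/aileron-connector-google"
version = "0.0.0-dev"
hash = "sha256:bound-at-release"
'''

# Hypothetical release values, for illustration only.
bound = bind_release_values(TEMPLATE, "1.2.3", "sha256:" + "ab" * 32)
print(bound)
```

A plain string replace also rewrites any placeholder text that appears in the Markdown body, so a real workflow may prefer to edit only the frontmatter.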

Intent matching

[match]
intent = "list my recent Gmail messages"

intent is a canonical natural-language phrase the runtime uses to surface this action to the agent. The shape will grow as ADR-0008 matures; in v1 it is one required string.

The intent string is not the only signal the LLM sees. The Markdown body’s “When it fires” section (or whatever shape your prose takes) is what actually drives selection. Treat intent as the canonical short form and the body as the elaboration.

Inputs

[[inputs]]
name        = "query"
type        = "string"
description = "Optional Gmail search query, e.g. \"is:unread\" or \"from:someone@example.com\"."
required    = false  # default is true; set false to mark optional

[[inputs]]
name        = "max_results"
type        = "integer"
description = "Maximum number of messages to return. Defaults to 10; capped at 100."
required    = false

Inputs become the JSON Schema parameters object the LLM sees when Aileron exposes the action as a tool. The LLM uses description to decide what to pass. Write descriptions for the LLM — terse, concrete, with one example if format matters.

name is ^[a-z][a-z0-9_]*$ (snake_case, starts with a letter). It maps to a JSON Schema property name.
type is one of string, integer, number, boolean. Object and array types are post-MVP.
required defaults to true. Set required = false for optional inputs.
description is required. This is what the LLM reads — write it accordingly.
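
Under the rules above, the mapping from [[inputs]] to a tool schema might look like the following sketch. The exact object Aileron emits is not shown in this guide, so the output shape (a plain JSON Schema object with properties and required) is an assumption, and inputs_to_json_schema is a hypothetical helper.

```python
import json


def inputs_to_json_schema(inputs: list[dict]) -> dict:
    """Build a JSON Schema "parameters" object from parsed [[inputs]] entries."""
    properties = {}
    required = []
    for spec in inputs:
        properties[spec["name"]] = {
            "type": spec["type"],  # string, integer, number, or boolean
            "description": spec["description"],
        }
        # The guide states that `required` defaults to true when omitted.
        if spec.get("required", True):
            required.append(spec["name"])
    return {"type": "object", "properties": properties, "required": required}


INPUTS = [
    {
        "name": "query",
        "type": "string",
        "description": "Optional Gmail search query.",
        "required": False,
    },
    {
        "name": "max_results",
        "type": "integer",
        "description": "Maximum number of messages to return.",
        "required": False,
    },
]

schema = inputs_to_json_schema(INPUTS)
print(json.dumps(schema, indent=2))
```

Because both inputs set required = false, the resulting required list is empty; dropping the key from either entry would move that input into it.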

Execution chain

[[execute]]
id         = "send"
connector  = "github://ALRubinger/aileron-connector-google"
op         = "send_email"
idempotent = false

[execute.inputs]
to      = "${args.to}"
subject = "${args.subject}"
body    = "${args.body}"

Steps run in declared order. Each step calls one connector op with one set of inputs. The runtime’s contract:

First failure terminates. If step 2 fails, step 3 does not run. Per ADR-0010, the action returns a Result whose Failure is the failing step’s structured error.
No auto-rollback. Successful prior steps are not undone. If you need compensating actions, the agent composes them in conversation, not the action file.
Each step ID is unique. Step IDs are how prior outputs will be referenced once interpolation lands (see below).
Each connector must appear in [[requires.connectors]]. You cannot reference a connector you did not declare — validation fails before the action is installable.

[execute.inputs] is a keyed map of arguments the runtime passes into the connector’s op. Values can be literals or ${args.<name>} interpolations referencing the call-time inputs you declared in [[inputs]]. The validator confirms every ${args.X} reference matches a declared input.
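
The checks described so far (unique step IDs, declared connectors, and matching ${args.X} references) can be approximated as below. This is a sketch over the parsed frontmatter as a plain dict, with a hypothetical execute_errors helper; the real validator's error wording and ordering are not documented here.

```python
import re

# Matches ${args.<name>} where <name> follows the documented input-name pattern.
ARGS_REF_RE = re.compile(r"\$\{args\.([a-z][a-z0-9_]*)\}")


def execute_errors(frontmatter: dict) -> list[str]:
    """Check the execute chain against declared inputs and connectors."""
    declared_inputs = {spec["name"] for spec in frontmatter.get("inputs", [])}
    declared_connectors = {
        conn["name"]
        for conn in frontmatter.get("requires", {}).get("connectors", [])
    }
    errors: list[str] = []
    seen_ids: set[str] = set()
    for step in frontmatter.get("execute", []):
        step_id = step["id"]
        if step_id in seen_ids:
            errors.append(f"duplicate step id {step_id!r}")
        seen_ids.add(step_id)
        if step["connector"] not in declared_connectors:
            errors.append(f"step {step_id!r}: connector is not in requires.connectors")
        for key, value in step.get("inputs", {}).items():
            if not isinstance(value, str):
                continue  # non-string literals cannot hold a reference
            for ref in ARGS_REF_RE.findall(value):
                if ref not in declared_inputs:
                    errors.append(
                        f"step {step_id!r}, key {key!r}: "
                        f"${{args.{ref}}} has no matching input"
                    )
    return errors


CONNECTOR = "github://ALRubinger/aileron-connector-google"
ACTION = {
    "requires": {"connectors": [{"name": CONNECTOR}]},
    "inputs": [{"name": "to"}, {"name": "body"}],  # "subject" is not declared
    "execute": [{
        "id": "send",
        "connector": CONNECTOR,
        "inputs": {
            "to": "${args.to}",
            "subject": "${args.subject}",
            "body": "${args.body}",
        },
    }],
}

for message in execute_errors(ACTION):
    print(message)
# step 'send', key 'subject': ${args.subject} has no matching input
```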

idempotent declares whether the gateway’s retry layer (ADR-0010) is allowed to re-invoke this step on transient failure. Read ops (GETs) are typically true. Write ops that mutate the world (sending an email, creating a calendar event) MUST be false — repeating them produces duplicates. The connector’s documentation tells you which ops are repeatable; the action manifest is where you record that decision.
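
To make the consequence of that flag concrete, here is a minimal model of a retry layer that honors it. The gateway's actual retry policy belongs to ADR-0010; call_with_retry, TransientError, make_flaky, and the attempt limit of 3 are all hypothetical.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Stand-in for a failure the retry layer treats as transient."""


def call_with_retry(call: Callable[[], T], idempotent: bool, max_attempts: int = 3) -> T:
    """Re-invoke `call` on transient failure only when the step is idempotent."""
    attempts = max_attempts if idempotent else 1
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except TransientError:
            if attempt == attempts:
                raise  # out of attempts: surface the failure to the caller
    raise AssertionError("unreachable when attempts >= 1")


def make_flaky(failures: int):
    """Build a call that fails `failures` times, then succeeds, counting calls."""
    state = {"calls": 0}

    def call() -> str:
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientError("temporary upstream error")
        return "ok"

    return call, state


read_call, read_state = make_flaky(2)
print(call_with_retry(read_call, idempotent=True), read_state["calls"])  # ok 3

send_call, send_state = make_flaky(1)
try:
    call_with_retry(send_call, idempotent=False)
except TransientError:
    print("send failed after", send_state["calls"], "attempt")  # send failed after 1 attempt
```

Marking a write op idempotent = true would let the same layer call it again after a transient failure, which is how duplicate emails or events would arise.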

v1 caveat on interpolation: ${args.X} works. ${step_id.field} (referencing a prior step’s output) is post-MVP. If your action genuinely needs to feed step 1’s output into step 2’s input, factor it into the connector for now — write a connector op that does both halves — rather than trying to chain steps in the action file.

Approval (optional)

[approval]
required = true

When required = true, the action does not run until the user explicitly approves the invocation in the Aileron webapp. The HTTP response from POST /v1/actions/<name>/run holds open while the approval is queued; on approve, the action runs and returns its normal result; on deny, the runtime returns a structured error with class approval_denied.

Use approval for actions that touch the world in ways the user wants to confirm. The reference repo’s three write actions illustrate the gating axis:

draft-email — un-gated. Drafts land in Gmail’s Drafts folder and are fully reversible. The user already has a built-in human-in-the-loop step (clicking Send in Gmail), so a runtime prompt would duplicate that review without adding safety.
send-email — gated. Dispatched mail is not reversible. The approval step moves from Gmail’s UI to Aileron’s prompt; it does not disappear.
create-calendar-event — gated. Calendar’s events.insert dispatches invitation emails to attendees as a side effect, and those notifications don’t retract cleanly when the event is later deleted.

Reversibility is the test. If the action’s effect is recoverable by a normal user action (delete a draft, edit a doc), skip approval. If it isn’t (mail dispatched, money moved, irreversible API call), gate it.

The block is optional. Absent block ⇒ no approval needed.

The Markdown body

Everything after the closing +++ is the body. Four readers consume it:

The developer browsing the file or its source repo.
The Hub rendering it for catalog listings.
The LLM deciding whether to invoke the action when the user types something.
The user reading the audit log after the fact.

You write one piece of prose. There is no separate “LLM hint” field to keep in sync. This is structurally important — the documentation and the LLM’s understanding of the action cannot drift apart, because they are the same string.

A workable shape:

# Action Title

One-paragraph description of what the action does. Lead with the verb and
the affected system. The LLM uses this as the function description.

## When it fires

- Trigger phrase one.
- Trigger phrase two.
- Trigger phrase three.

## What it does

Brief prose explanation of the steps. Mention any non-obvious behavior
(e.g. "posts to the first channel matching the input, case-insensitive").

## Inputs

- `query` — Optional Gmail search query.
- `max_results` — Page size cap (default 10).

Write the body for the LLM first. Trigger phrases and a tight one-paragraph description matter more than ASCII art. Avoid hedging language (“might”, “sometimes”, “depending on”) — the LLM uses prose to predict the function’s behavior, and uncertain prose makes it less likely to invoke when it should.

The execute model in practice

A few non-obvious behaviors worth knowing:

Inputs from [[inputs]] are merged with [execute.inputs]. v1 passes both through to the connector op. The connector receives the call-time args plus whatever the step declared.
Connector ops get the JSON envelope shape from Authoring a Connector — {op: "list_recent_emails", args: {query: "is:unread", max_results: 10}}. The action’s job is to produce that shape; the connector’s job is to dispatch on it.
Step output becomes the action result. When all steps succeed, the runtime aggregates them into the final result. v1 returns the last successful step’s output as the primary content.
Capability denial is sticky. If a step tries a network call to a host outside the connector’s manifest, or outside the action’s declared capability subset, the call is denied at the sandbox boundary and the step fails. The runtime does not silently retry or substitute.

Defense in depth: the two capability gates

This is worth understanding because it shapes what capabilities should contain.

When a step runs:

Connector boundary. The runtime checks the call against the connector’s manifest. If the connector declared gmail.googleapis.com:443 in [capabilities.network].hosts and the call dials that host, fine. If the connector tries to dial anywhere else, the call is denied.
Action boundary. The runtime also checks the call against the action’s declared capability subset. If the action declared ["list_recent_emails"] and the step tries to invoke send_email instead, the call is denied — even though the connector itself permits it.

Both gates fire on every call. If either one denies the call, the step fails. The action’s capabilities list is the second gate; declare the minimum subset you need so that adding a capability later is a visible TOML diff for the user, not a silent change in behavior.
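
The two gates can be modeled as two independent membership checks. This is a simplified sketch, not the sandbox implementation: Call and gate_denials are hypothetical names, and a call is reduced to the op being invoked plus the host:port it dials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    op: str    # connector op the step invokes
    host: str  # "host:port" the connector dials for this call


def gate_denials(call: Call, connector_hosts: set[str], action_capabilities: set[str]) -> list[str]:
    """Return the reasons a call is denied; an empty list means both gates pass."""
    denials = []
    if call.host not in connector_hosts:
        denials.append(f"connector boundary: {call.host} is not in the manifest")
    if call.op not in action_capabilities:
        denials.append(f"action boundary: {call.op} is not a declared capability")
    return denials


HOSTS = {"gmail.googleapis.com:443"}
CAPABILITIES = {"list_recent_emails"}

print(gate_denials(Call("list_recent_emails", "gmail.googleapis.com:443"), HOSTS, CAPABILITIES))
# []
print(gate_denials(Call("send_email", "gmail.googleapis.com:443"), HOSTS, CAPABILITIES))
# ['action boundary: send_email is not a declared capability']
```

Both checks run regardless of the other's outcome, so a call that fails both gates reports both reasons.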

Iterating on an action

The fastest loop is to install a real template, edit the local file, and run it through the agent.

# Install a template you'll modify (replace <version> with a real tag)
$ aileron action add github://ALRubinger/aileron-connector-google/actions/list-recent-emails@<version>
✓ Action file written to ~/.aileron/actions/list-recent-emails.md

# Edit the file
$ $EDITOR ~/.aileron/actions/list-recent-emails.md

# Validation runs on every load — invalid frontmatter surfaces immediately
$ aileron action list
list-recent-emails  0.0.1  github://ALRubinger/aileron-connector-google/actions/list-recent-emails@0.0.1

When you have written something the runtime does not like — missing field, malformed FQN, unknown frontmatter key — aileron action list and aileron action show list-recent-emails surface the error with the file and (when known) the line. Fix it in place and try again.

To exercise the action end-to-end, drive it through aileron launch — the agent picks it up automatically once it is in the actions directory.

Common authoring mistakes

Forgetting hash. The validator requires sha256:<hex> on every [[requires.connectors]] entry. In a published template, the placeholder sha256:bound-at-release is what the source carries; CI substitutes the real hash at release time. If you are hand-writing an action against a connector you’ve already installed, get the hash from aileron connector show <FQN>.
Listing more capabilities than you use. The action boundary enforces the subset you declare; declaring more weakens the audit story. Strip capabilities aggressively.
A ${args.X} reference that does not match an input. Validation catches this — the error names the offending step, key, and missing input.
Multi-paragraph descriptions on inputs. The LLM reads description as a single line. Keep it tight.
Putting business logic in the body. The body is documentation; the contract is in the frontmatter. If the body says “posts only to channels starting with #eng-” but the execute step does no filtering, the runtime will not enforce it. Either filter in a connector op or do not promise the behavior.
Reusing an existing id across steps. Step IDs must be unique within the action.

Where to go next

Authoring a Connector — the connector your action depends on. Action capability subsets are meaningful only against a connector’s declared capability set.
Publishing a Connector: once you have an action template you want others to install, the same repo and signing flow applies. Action tarballs ship alongside the connector.
ADR-0001: Manifest Format — why TOML, why +++, why the frontmatter shape.
ADR-0003: Action Model — the design constraints behind everything in this guide.
ADR-0010: Failure Handling — first-failure-terminates, no auto-rollback, structured error envelopes.