Workflow Schema Reference
Complete YAML schema reference for Ploston workflows.
Canonical Example
This example shows every field. Copy it as a starting point for your workflows.
# ─────────────────────────────────────────────────────────────────
# METADATA (required)
# ─────────────────────────────────────────────────────────────────
name: data-pipeline # Required: Workflow identifier
version: "1.0.0" # Required: Semantic version
description: "Fetch, transform, and validate data" # Optional
# ─────────────────────────────────────────────────────────────────
# PACKAGES (optional)
# ─────────────────────────────────────────────────────────────────
packages:
  profile: standard  # minimal | standard | data_science
  additional:  # Extra packages to allow
    - requests
# ─────────────────────────────────────────────────────────────────
# DEFAULTS (optional)
# ─────────────────────────────────────────────────────────────────
defaults:
  timeout: 30  # Default step timeout (seconds)
  on_error: fail  # fail | continue | retry
  retry:  # Retry config (when on_error: retry)
    max_attempts: 3
    initial_delay: 1.0
    max_delay: 30.0
    backoff_multiplier: 2.0
# ─────────────────────────────────────────────────────────────────
# INPUTS (optional, but usually needed)
# Format: Array of input definitions
# ─────────────────────────────────────────────────────────────────
inputs:
  # Simple syntax: just the name (required, type: string)
  - url
  # With default value (makes it optional)
  - format: "json"
  # Full definition with all options
  - count:
      type: integer  # string | integer | number | boolean | array | object
      required: false  # Default: true
      default: 10  # Default value
      description: "Number of items"  # For documentation
      minimum: 1  # Validation: minimum value
      maximum: 100  # Validation: maximum value
  # Enum constraint
  - output_format:
      type: string
      enum: ["json", "csv", "xml"]  # Allowed values
      default: "json"
  # Pattern constraint
  - email:
      type: string
      pattern: "^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+$"
# ─────────────────────────────────────────────────────────────────
# STEPS (required, at least one)
# ─────────────────────────────────────────────────────────────────
steps:
  # Tool step: calls an MCP tool
  - id: fetch  # Required: unique step identifier
    tool: http_get  # MCP tool name
    params:  # Tool parameters (templates allowed)
      url: "{{ inputs.url }}"
      headers:
        Accept: "application/json"
    timeout: 60  # Override default timeout
    on_error: retry  # Override default error handling
    retry:
      max_attempts: 3
      initial_delay: 2.0
  # Code step: runs Python in sandbox
  - id: transform
    code: |
      import json
      # Access previous step output
      data = context.steps['fetch'].output
      # Access inputs
      limit = context.inputs.get('count', 10)
      # Process data
      items = data.get('items', [])[:limit]
      # Return result (available as steps.transform.output)
      return {"items": items, "count": len(items)}
  # Step with dependency
  - id: validate
    depends_on: [transform]  # Wait for these steps first
    code: |
      data = context.steps['transform'].output
      if data['count'] == 0:
          raise ValueError("No items found")
      return {"valid": True, "count": data['count']}
# ─────────────────────────────────────────────────────────────────
# OUTPUTS (optional)
# ─────────────────────────────────────────────────────────────────
# Option 1: Single output (simple)
output: "{{ steps.validate.output }}"
# Option 2: Multiple named outputs (use this OR output, not both)
# outputs:
#   - name: result
#     from_path: steps.validate.output
#     description: "Validation result"
#   - name: item_count
#     value: "{{ steps.transform.output.count }}"
#     description: "Number of items processed"
Top-Level Structure
# Required
name: string # Workflow identifier (alphanumeric, hyphens)
version: string # Semantic version (e.g., "1.0", "2.1.3")
# Optional
description: string # Human-readable description
packages: object # Python package configuration
defaults: object # Default step settings
# Schema
inputs: array # Input definitions (array format)
steps: array # Step definitions (required, at least one)
outputs: array # Output definitions (optional)
output: string # Single output expression (alternative to outputs)
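As a rough illustration of how these top-level rules combine, the sketch below checks an already-parsed workflow (a plain dict, as a YAML loader would produce) for the required keys, the "at least one step" rule, and the output/outputs exclusivity. `check_top_level` is a hypothetical helper written for this page, not part of Ploston; the real validator may apply more rules and report errors differently.

```python
def check_top_level(workflow: dict) -> list:
    """Return a list of problems found in a parsed workflow mapping.

    Hypothetical helper: it only mirrors the rules stated in this section.
    """
    problems = []
    # name, version, and steps are the required top-level keys.
    for key in ("name", "version", "steps"):
        if key not in workflow:
            problems.append(f"missing required key: {key}")
    # steps must be a non-empty list.
    steps = workflow.get("steps")
    if steps is not None and (not isinstance(steps, list) or not steps):
        problems.append("steps must be a list with at least one step")
    # output and outputs are alternatives.
    if "output" in workflow and "outputs" in workflow:
        problems.append("use either output or outputs, not both")
    return problems


valid = {
    "name": "data-pipeline",
    "version": "1.0.0",
    "steps": [{"id": "fetch", "tool": "http_get"}],
}
invalid = {"name": "x", "steps": [], "output": "a", "outputs": []}

print(check_top_level(valid))
for problem in check_top_level(invalid):
    print(problem)
```

Returning a list rather than raising on the first problem lets a caller show every issue in one pass.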
Metadata
name (required)
Unique workflow identifier.
- Type: `string`
- Pattern: `^[a-zA-Z][a-zA-Z0-9-]*$`
- Example: `data-transform`, `hello-world`
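To see what the pattern accepts and rejects, here is a minimal check using Python's `re` module. `is_valid_name` is a hypothetical helper for illustration; whether Ploston applies the pattern with exactly these matching semantics is an assumption.

```python
import re

# The name pattern documented above: a letter, then letters, digits, or hyphens.
NAME_PATTERN = re.compile(r"^[a-zA-Z][a-zA-Z0-9-]*$")


def is_valid_name(name: str) -> bool:
    """Return True if name matches the documented workflow name pattern."""
    # fullmatch avoids the case where `$` would accept a trailing newline.
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ("data-transform", "hello-world", "1pipeline", "data_pipeline"):
    print(candidate, is_valid_name(candidate))
```

Names that start with a digit or contain underscores do not match the pattern.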
version (required)
Semantic version string.
- Type: `string`
- Example: `"1.0"`, `"2.1.3"`
description (optional)
Human-readable description.
- Type: `string`
- Example: `"Transform and validate JSON data"`
Packages Configuration
packages:
  profile: string  # Package profile: minimal | standard | data_science
  additional: array  # Additional packages to install
Profiles
| Profile | Packages |
|---|---|
| `minimal` | json, re, datetime, math |
| `standard` | minimal + collections, itertools, functools, hashlib, uuid |
| `data_science` | standard + numpy, pandas (if available) |
Defaults
defaults:
  timeout: integer  # Default step timeout (seconds)
  on_error: string  # Error handling: fail | continue | retry
  retry: object  # Retry configuration
Retry Configuration
defaults:
  retry:
    max_attempts: 3  # Maximum retry attempts
    initial_delay: 1.0  # Initial delay (seconds)
    max_delay: 30.0  # Maximum delay (seconds)
    backoff_multiplier: 2.0  # Exponential backoff multiplier
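The sketch below shows how the three delay settings could interact under a conventional capped exponential backoff. The formula `min(initial_delay * backoff_multiplier ** (n - 1), max_delay)` is an assumption based on the field names; this page does not state the exact formula Ploston uses, nor whether `max_attempts` counts the first attempt. `delay_before_retry` is a hypothetical helper.

```python
def delay_before_retry(
    n: int,
    initial_delay: float = 1.0,
    max_delay: float = 30.0,
    backoff_multiplier: float = 2.0,
) -> float:
    """Delay in seconds before the n-th retry (n starts at 1).

    Assumed formula: capped exponential backoff. Not confirmed by the docs.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    return min(initial_delay * backoff_multiplier ** (n - 1), max_delay)


# With the defaults shown above, the delay doubles until it reaches max_delay.
print([delay_before_retry(n) for n in range(1, 7)])
```

Under this assumption, `max_delay` only takes effect from the sixth retry with the default values, so it matters mainly when `max_attempts` is raised.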
Inputs
Format: inputs is an array (list) of input definitions.
Ploston supports three syntaxes for input definitions:
Syntax 1: Simple String (Required Input)
inputs:
  - url
Syntax 2: Name with Default (Optional Input)
inputs:
  - format: "json"
Syntax 3: Full Definition (All Options)
inputs:
  - url:
      type: string  # Required: string | integer | number | boolean | array | object
      required: true  # Optional: default is true
      default: null  # Optional: default value (makes input optional)
      description: "URL"  # Optional: human-readable description
      enum: [...]  # Optional: allowed values
      pattern: "^https?"  # Optional: regex pattern (strings only)
      minimum: 1  # Optional: minimum value (numbers only)
      maximum: 100  # Optional: maximum value (numbers only)
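Since the three syntaxes describe the same kind of object, a loader could reduce each entry to a `(name, definition)` pair. The sketch below is one way to do that on parsed YAML; `normalize_input` is a hypothetical helper, and treating any non-mapping value as a default (Syntax 2) is an assumption about how Ploston tells Syntax 2 from Syntax 3.

```python
def normalize_input(entry):
    """Expand one inputs entry into a (name, definition) pair.

    Hypothetical helper; the disambiguation rule below is an assumption.
    """
    # Syntax 1: a bare string is a required string input.
    if isinstance(entry, str):
        return entry, {"type": "string", "required": True}
    # Syntax 2 and 3 are single-key mappings: {name: value}.
    if isinstance(entry, dict) and len(entry) == 1:
        name, value = next(iter(entry.items()))
        if isinstance(value, dict):
            # Syntax 3: the value is the full definition.
            return name, dict(value)
        # Syntax 2: the value is the default, which makes the input optional.
        return name, {"default": value, "required": False}
    raise ValueError(f"unsupported input entry: {entry!r}")


entries = ["url", {"format": "json"}, {"count": {"type": "integer", "default": 10}}]
for entry in entries:
    print(normalize_input(entry))
```

One consequence of this rule is that a mapping-valued default cannot be written with Syntax 2; it would need the full definition with an explicit `default` key.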
Input Types
| Type | JSON Type | Example | Notes |
|---|---|---|---|
| `string` | string | `"hello"` | Default type if not specified |
| `integer` | number | `42` | Whole numbers only |
| `number` | number | `3.14` | Any numeric value |
| `boolean` | boolean | `true` | `true` or `false` |
| `array` | array | `[1, 2, 3]` | JSON array |
| `object` | object | `{"key": "value"}` | JSON object |
Complete Input Examples
inputs:
  # Simple required inputs
  - url
  - topic
  # With default values
  - format: "json"
  - retries: 3
  # Full definitions
  - count:
      type: integer
      required: false
      default: 10
      description: "Number of items to fetch"
      minimum: 1
      maximum: 100
  - output_format:
      type: string
      enum: ["json", "csv", "xml"]
      default: "json"
      description: "Output format"
  - email:
      type: string
      required: true
      description: "Contact email"
      pattern: "^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+$"
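To make the constraint fields concrete, the following sketch validates a single value against a full definition: type first, then `enum`, `pattern`, `minimum`, and `maximum`. It is an illustration only. `validate_value` is a hypothetical helper, and using `re.search` for `pattern` (rather than a full match) is an assumption.

```python
import re

# Python-side checks for each schema type name. bool is excluded from the
# numeric types because bool is a subclass of int in Python.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "array": lambda v: isinstance(v, list),
    "object": lambda v: isinstance(v, dict),
}


def validate_value(name, value, spec):
    """Return a list of error strings; an empty list means the value passed."""
    type_name = spec.get("type", "string")  # string is the documented default
    if not TYPE_CHECKS[type_name](value):
        return [f"{name}: expected {type_name}"]
    errors = []
    if "enum" in spec and value not in spec["enum"]:
        errors.append(f"{name}: {value!r} is not one of {spec['enum']}")
    # pattern applies to strings only.
    if "pattern" in spec and isinstance(value, str):
        if re.search(spec["pattern"], value) is None:
            errors.append(f"{name}: does not match {spec['pattern']}")
    # minimum and maximum apply to numbers only.
    if type_name in ("integer", "number"):
        if "minimum" in spec and value < spec["minimum"]:
            errors.append(f"{name}: below minimum {spec['minimum']}")
        if "maximum" in spec and value > spec["maximum"]:
            errors.append(f"{name}: above maximum {spec['maximum']}")
    return errors


count_spec = {"type": "integer", "minimum": 1, "maximum": 100}
print(validate_value("count", 10, count_spec))
print(validate_value("count", 0, count_spec))
```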
Required vs Optional
| Condition | Required? |
|---|---|
| Simple string syntax (`- url`) | ✅ Required |
| Has default value | ❌ Optional |
| `required: true` (explicit) | ✅ Required |
| `required: false` (explicit) | ❌ Optional |
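The table can be read as a small decision function over a full definition. The sketch below encodes it. Note that the table does not say what happens when a definition has both `required: true` and a `default`; letting the explicit `required` flag win is an assumption, and `is_required` is a hypothetical helper.

```python
def is_required(spec: dict) -> bool:
    """Decide whether an input is required, following the table above.

    Assumption: an explicit `required` flag takes precedence over `default`.
    """
    if "required" in spec:
        return bool(spec["required"])
    if "default" in spec:
        return False  # a default value makes the input optional
    return True  # nothing specified: required, as with the simple syntax


print(is_required({}))
print(is_required({"default": "json"}))
print(is_required({"required": False, "default": 10}))
```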
Steps
steps:
  - id: string  # Step identifier (required)
    # Type (exactly one required)
    tool: string  # MCP tool name
    code: string  # Python code block
    # Tool parameters (tool steps only)
    params: object  # Tool parameters
    # Dependencies
    depends_on: array  # List of step IDs to wait for
    # Error handling
    timeout: integer  # Step timeout (seconds)
    on_error: string  # Error handling: fail | continue | retry
    retry: object  # Retry configuration
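The "exactly one of `tool` or `code`" rule and the "`params` on tool steps only" rule can be checked mechanically. The sketch below does so on a parsed step mapping; `step_kind` is a hypothetical helper, and raising `ValueError` is just one possible way to report a violation.

```python
def step_kind(step: dict) -> str:
    """Return "tool" or "code" for a parsed step, or raise ValueError.

    Hypothetical helper mirroring the rules in the schema above.
    """
    if "id" not in step:
        raise ValueError("step is missing required field: id")
    has_tool = "tool" in step
    has_code = "code" in step
    # Exactly one of tool/code must be present.
    if has_tool == has_code:
        raise ValueError(f"step {step['id']!r} needs exactly one of tool or code")
    # params only makes sense on tool steps.
    if has_code and "params" in step:
        raise ValueError(f"step {step['id']!r}: params is for tool steps only")
    return "tool" if has_tool else "code"


print(step_kind({"id": "fetch", "tool": "http_get", "params": {"url": "x"}}))
print(step_kind({"id": "transform", "code": "return {}"}))
```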
Tool Step
steps:
  - id: fetch
    tool: http_get
    params:
      url: "{{ inputs.url }}"
      headers:
        Authorization: "Bearer {{ inputs.token }}"
Code Step
steps:
  - id: process
    code: |
      import json
      data = json.loads('{{ inputs.data }}')
      result = {"processed": data}
Dependencies
steps:
  - id: step1
    code: |
      result = "first"
  - id: step2
    depends_on: [step1]
    code: |
      result = "second"
  - id: step3
    depends_on: [step1, step2]
    code: |
      result = "third"
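`depends_on` defines a dependency graph, so a valid execution order is any topological ordering of it. The sketch below derives one with the standard library's `graphlib` (Python 3.9+). It illustrates the concept only: how Ploston actually schedules steps (for example, whether independent steps run in parallel) is not stated here, and `execution_order` is a hypothetical helper.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(steps: list) -> list:
    """Return step IDs in an order that respects depends_on."""
    graph = {step["id"]: set(step.get("depends_on", [])) for step in steps}
    # Reject references to steps that do not exist.
    for step_id, deps in graph.items():
        unknown = deps - graph.keys()
        if unknown:
            raise ValueError(f"{step_id} depends on unknown steps: {sorted(unknown)}")
    # static_order() raises CycleError if the dependencies form a cycle.
    return list(TopologicalSorter(graph).static_order())


steps = [
    {"id": "step1"},
    {"id": "step2", "depends_on": ["step1"]},
    {"id": "step3", "depends_on": ["step1", "step2"]},
]
print(execution_order(steps))  # ['step1', 'step2', 'step3']
```

For the three steps above the order is unique, because each step depends on all earlier ones.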
Outputs
Single Output
output: "{{ steps.validate.output }}"
Multiple Outputs
outputs:
  - name: string  # Output name
    from_path: string  # Path to value (e.g., "steps.process.output.data")
    value: string  # Template expression (alternative to from_path)
    description: string  # Human-readable description
Output Examples
outputs:
  - name: result
    from_path: steps.transform.output
    description: Transformed data
  - name: count
    value: "{{ steps.count.output }}"
    description: Number of items processed
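A `from_path` value such as `steps.transform.output` reads as a dotted walk through the run's results. The sketch below resolves such a path against a nested dict. The shape of the `results` structure and the helper name `resolve_path` are assumptions made for illustration; Ploston's internal representation may differ.

```python
def resolve_path(results: dict, path: str):
    """Follow a dotted path (e.g. "steps.transform.output") through nested dicts."""
    value = results
    for part in path.split("."):
        if not isinstance(value, dict) or part not in value:
            raise KeyError(f"cannot resolve {part!r} in path {path!r}")
        value = value[part]
    return value


# Hypothetical run results, shaped to match the paths used on this page.
results = {
    "steps": {
        "transform": {"output": {"items": [1, 2], "count": 2}},
    }
}

print(resolve_path(results, "steps.transform.output"))
print(resolve_path(results, "steps.transform.output.count"))
```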
Template Expressions
Use Jinja2 templates to reference values:
| Expression | Description |
|---|---|
| `{{ inputs.name }}` | Input value |
| `{{ steps.id.output }}` | Step output |
| `{{ steps.id.output.field }}` | Nested field |
| `{{ value \| tojson }}` | JSON encode |
| `{{ value \| default('x') }}` | Default value |
Complete Example
name: data-pipeline
version: "1.0"
description: Fetch, transform, and validate data
packages:
  profile: standard
defaults:
  timeout: 30
  on_error: fail
inputs:
  - url:
      type: string
      description: API endpoint URL
  - format:
      type: string
      enum: ["json", "csv"]
      default: "json"
steps:
  - id: fetch
    tool: http_get
    params:
      url: "{{ inputs.url }}"
    timeout: 60
  - id: transform
    depends_on: [fetch]
    code: |
      data = {{ steps.fetch.output }}
      result = [item for item in data if item.get("active")]
  - id: format
    depends_on: [transform]
    code: |
      import json
      data = {{ steps.transform.output }}
      result = json.dumps(data, indent=2)
output: "{{ steps.format.output }}"