Toolchain Integration Framework Design

Status: Draft
Authors: Morphir Team
Created: 2026-01-08
Epic: morphir-pfi

Overview

This document describes the design of Morphir's Toolchain Integration Framework, which enables orchestration of both external tools (like morphir-elm) and native Go implementations (like WIT bindings) through a flexible, composable abstraction.

Goals

  1. Polyglot Integration: Support tools from any ecosystem (npm, dotnet, native binaries)
  2. Native & External: Support both in-process Go toolchains and external process-based tools
  3. Unified CLI: Present a consistent interface regardless of underlying toolchain
  4. Composable Workflows: Enable complex build pipelines from simple building blocks
  5. Inspectability: Users can understand and debug what will happen before execution
  6. Extensibility: Third-party toolchains can integrate seamlessly

Non-Goals (for initial version)

  • LSP/MCP/BSP/gRPC communication protocols (future)
  • Distributed/remote execution (future)
  • Fine-grained incremental compilation (future)

Reference Implementation

The WIT bindings (pkg/bindings/wit/pipeline) serve as the reference implementation for native toolchains:

  • morphir wit make → WIT source → Morphir IR (frontend)
  • morphir wit gen → Morphir IR → WIT source (backend)
  • morphir wit build → Full pipeline with round-trip validation
  • JSONL batch processing for streaming workflows

This implementation demonstrates the patterns that the toolchain framework will generalize.

Core Concepts

Conceptual Hierarchy

┌─────────────────────────────────────────────────────────────────┐
│ CLI Commands                                                    │
│   morphir make | morphir build | morphir gen:scala              │
├─────────────────────────────────────────────────────────────────┤
│ Workflows (named orchestrations)                                │
│   build, ci, release, custom workflows...                       │
├─────────────────────────────────────────────────────────────────┤
│ Targets (capabilities)                                          │
│   make, gen, test, validate, format...                          │
├─────────────────────────────────────────────────────────────────┤
│ Tasks (toolchain implementations)                               │
│   morphir-elm/make, morphir-elm/gen, morphir-native/validate    │
├─────────────────────────────────────────────────────────────────┤
│ Toolchains (provide tasks)                                      │
│   morphir-elm, morphir-native, custom toolchains                │
└─────────────────────────────────────────────────────────────────┘

Toolchains

A toolchain is both a tool adapter AND a capability provider. Toolchains come in two flavors:

External Toolchains (Process-Based)

External toolchains invoke tools via process execution. They:

  • Declare how to acquire and invoke external tools (npm, mise, path)
  • Communicate via file-based artifacts (JSONC) and diagnostics (JSONL)
  • Run in separate processes with configurable environment

Example: morphir-elm

[toolchain.morphir-elm]
name = "morphir-elm"
version = "2.90.0"

# Acquisition
acquire.backend = "npx" # or "npm", "mise", "dotnet-tool", "path"
acquire.package = "morphir-elm"
acquire.version = "^2.90.0"

# Environment
env.NODE_OPTIONS = "--max-old-space-size=4096"
working_dir = "."
timeout = "5m"

# Tasks provided
[toolchain.morphir-elm.tasks.make]
exec = "morphir-elm"
args = ["make", "-o", "{outputs.ir}"]
inputs = ["elm.json", "src/**/*.elm"]
outputs = { ir = { path = "morphir-ir.json", type = "morphir-ir" } }
fulfills = ["make"]

[toolchain.morphir-elm.tasks.gen]
exec = "morphir-elm"
args = ["gen", "-i", "{inputs.ir}", "-o", "{outputs.dir}", "-t", "{variant}"]
inputs = { ir = "@morphir-elm/make:ir" }
outputs = { dir = "dist/{variant}/**/*" }
fulfills = ["gen"]
variants = ["Scala", "JsonSchema", "TypeScript"]
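The placeholders in args ({inputs.ir}, {outputs.dir}, {variant}) imply a substitution step before the process is spawned. A minimal Go sketch of that step follows; the TaskVars type and ExpandArgs function are hypothetical names introduced for illustration, and leaving unknown placeholders untouched is an assumption rather than documented behavior.

```go
package main

import "strings"

// TaskVars holds the values substituted into a task's args.
// Hypothetical type: the design only shows the placeholder syntax.
type TaskVars struct {
	Inputs  map[string]string // resolved input paths, keyed by input name
	Outputs map[string]string // resolved output paths, keyed by output name
	Variant string            // selected variant, e.g. "Scala"
}

// ExpandArgs replaces {inputs.NAME}, {outputs.NAME} and {variant}
// placeholders in each argument. Placeholders with no matching value
// are left as-is (an assumption of this sketch).
func ExpandArgs(args []string, vars TaskVars) []string {
	pairs := []string{"{variant}", vars.Variant}
	for name, path := range vars.Inputs {
		pairs = append(pairs, "{inputs."+name+"}", path)
	}
	for name, path := range vars.Outputs {
		pairs = append(pairs, "{outputs."+name+"}", path)
	}
	replacer := strings.NewReplacer(pairs...)

	expanded := make([]string, len(args))
	for i, arg := range args {
		expanded[i] = replacer.Replace(arg)
	}
	return expanded
}
```

With Inputs {"ir": ".morphir/out/morphir-elm/make/ir.json"}, Outputs {"dir": "dist/Scala"} and Variant "Scala", the gen task's args would expand to a concrete morphir-elm gen invocation.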

Native Toolchains (In-Process)

Native toolchains are implemented in Go and run in-process. They:

  • Register Go functions as task handlers
  • Share memory with the Morphir runtime (zero-copy artifact passing)
  • Support streaming via Go channels

Example: WIT bindings (existing implementation)

The WIT bindings in pkg/bindings/wit/pipeline demonstrate native toolchains:

// Native toolchain registration (conceptual)
type WITToolchain struct{}

func (t *WITToolchain) Name() string { return "wit" }

// Tasks lists the tasks this toolchain provides and the targets each fulfills.
func (t *WITToolchain) Tasks() []Task {
	return []Task{
		{Name: "make", Handler: t.Make, Fulfills: []string{"make"}},
		{Name: "gen", Handler: t.Gen, Fulfills: []string{"gen"}},
		{Name: "build", Handler: t.Build, Fulfills: []string{"build"}},
	}
}

// Make compiles WIT source to Morphir IR. Gen and Build follow the same
// shape and are omitted here.
func (t *WITToolchain) Make(ctx Context, input MakeInput) (MakeOutput, error) {
	// Direct Go implementation - no process spawning
	return witpipeline.Make(input)
}

Current WIT CLI (cmd/morphir/cmd/wit.go):

  • morphir wit make - Compile WIT → Morphir IR
  • morphir wit gen - Generate Morphir IR → WIT
  • morphir wit build - Full pipeline with round-trip validation
  • Supports --jsonl output and --jsonl-input for batch processing

Toolchain Registration

Both external and native toolchains register tasks that fulfill targets:

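| Toolchain      | Type     | Tasks            | Targets Fulfilled |
|----------------|----------|------------------|-------------------|
| morphir-elm    | External | make, gen        | make, gen         |
| wit            | Native   | make, gen, build | make, gen, build  |
| morphir-native | Native   | validate, format | validate, format  |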

Targets

A target is a CLI-facing capability that tasks fulfill. Targets:

  • Have well-known names that map to CLI commands
  • Declare artifact contracts (what they produce/require)
  • Support variants (e.g., gen:scala, gen:typescript)

[targets.make]
description = "Compile sources to Morphir IR"
produces = ["morphir-ir"]

[targets.gen]
description = "Generate code from IR"
requires = ["morphir-ir"]
produces = ["generated-code"]
variants = ["scala", "json-schema", "typescript"]

[targets.validate]
description = "Validate IR structure"
requires = ["morphir-ir"]
produces = ["diagnostics"]

Target Resolution:

  • morphir make → find task(s) fulfilling "make" target
  • morphir gen:scala → find task(s) fulfilling "gen" with variant "scala"
  • Multiple providers → morphir doctor advises, project config can pin
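The resolution rules above can be sketched in Go as a lookup from target name to providers, filtered by variant. Provider, ParseTargetSpec, and Resolve are hypothetical names, and the case-insensitive variant comparison is an assumption made so that "scala" matches a task declaring "Scala"; the design does not say how the two spellings are reconciled.

```go
package main

import (
	"fmt"
	"strings"
)

// Provider describes one task that fulfills a target (hypothetical type).
type Provider struct {
	Toolchain string
	Task      string
	Variants  []string // empty when the task declares no variants
}

// ParseTargetSpec splits "gen:scala" into ("gen", "scala").
// A spec without a colon yields an empty variant.
func ParseTargetSpec(spec string) (target, variant string) {
	target, variant, _ = strings.Cut(spec, ":")
	return target, variant
}

// Resolve returns every provider of the target that supports the
// requested variant. More than one result means the caller has to
// disambiguate, for example through a pin in project configuration.
func Resolve(registry map[string][]Provider, spec string) ([]Provider, error) {
	target, variant := ParseTargetSpec(spec)
	var matches []Provider
	for _, p := range registry[target] {
		if variant == "" || hasVariant(p.Variants, variant) {
			matches = append(matches, p)
		}
	}
	if len(matches) == 0 {
		return nil, fmt.Errorf("no task fulfills %q", spec)
	}
	return matches, nil
}

// hasVariant compares variant names case-insensitively (an assumption).
func hasVariant(variants []string, want string) bool {
	for _, v := range variants {
		if strings.EqualFold(v, want) {
			return true
		}
	}
	return false
}
```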

Tasks

A task is a concrete implementation provided by a toolchain. Tasks:

  • Execute external processes or native Go code
  • Declare inputs (files, artifact references)
  • Produce outputs to .morphir/out/{toolchain}/{task}/
  • Can be cached based on input hashes
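One way to derive a cache key from inputs is to hash the sorted paths together with per-file content digests. The Go sketch below assumes SHA-256 and a hypothetical HashInputs function; the design does not specify the hash algorithm or what else (tool version, arguments, environment) contributes to the key.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"sort"
)

// HashInputs computes a cache key from input file contents.
// files maps a project-relative path to that file's bytes; taking the
// contents as a map keeps the sketch free of filesystem access.
func HashInputs(files map[string][]byte) string {
	paths := make([]string, 0, len(files))
	for p := range files {
		paths = append(paths, p)
	}
	sort.Strings(paths) // fixed order so the key is independent of map iteration

	h := sha256.New()
	for _, p := range paths {
		// Digest each file first so the path and content boundaries
		// cannot be confused with one another in the combined stream.
		sum := sha256.Sum256(files[p])
		h.Write([]byte(p))
		h.Write([]byte{0})
		h.Write(sum[:])
	}
	return hex.EncodeToString(h.Sum(nil))
}
```

A task would be considered a cache hit when the key computed at plan time equals the one recorded for its previous run.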

Input References:

# File glob patterns
inputs = ["src/**/*.elm", "elm.json"]

# Task output references (logical)
inputs = { ir = "@morphir-elm/make:ir" }

# Mixed
inputs = { sources = "src/**/*.elm", ir = "@morphir-elm/make:ir" }

Workflows

A workflow composes targets into named, staged orchestrations. Workflows:

  • Define explicit stages with names
  • Can run targets in parallel within stages
  • Support conditions for conditional execution
  • Can extend other workflows (inheritance)

[workflows.build]
description = "Standard build workflow"
stages = [
{ name = "frontend", targets = ["make"] },
{ name = "backend", targets = ["gen:scala"] },
]

[workflows.ci]
description = "CI pipeline with validation"
stages = [
{ name = "compile", targets = ["make"] },
{ name = "validate", targets = ["validate"], parallel = true },
{ name = "generate", targets = ["gen:scala"] },
{ name = "test", targets = ["test"] },
]

[workflows.release]
description = "Full release workflow"
extends = "ci"
stages = [
{ name = "package", targets = ["package"] },
{ name = "publish", targets = ["publish"], condition = "branch == 'main'" },
]

Workflow Inheritance:

┌─────────────────────────────────────────────────────┐
│ Project workflows (morphir.toml)                    │
│   extends = "@morphir-elm/elm-standard"             │
├─────────────────────────────────────────────────────┤
│ Toolchain workflows (morphir-elm)                   │
│   extends = "@morphir/default-build"                │
├─────────────────────────────────────────────────────┤
│ Built-in defaults (morphir core)                    │
│   build, test, check, clean, ...                    │
└─────────────────────────────────────────────────────┘
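Resolving an extends chain can be sketched as walking from the workflow to its ancestors and then concatenating stages from the root down. The Go sketch below assumes that a child's stages are appended after the inherited ones (so release would run the ci stages, then package and publish); the design does not state the merge rule, and the Stage, Workflow, and FlattenStages names are hypothetical.

```go
package main

import "fmt"

// Stage is one named step of a workflow (hypothetical shape).
type Stage struct {
	Name    string
	Targets []string
}

// Workflow is a list of stages plus an optional parent.
type Workflow struct {
	Extends string // name of the parent workflow, empty if none
	Stages  []Stage
}

// FlattenStages returns the stages of the named workflow with
// inherited stages first. Appending child stages after parent stages
// is an assumption of this sketch.
func FlattenStages(workflows map[string]Workflow, name string) ([]Stage, error) {
	var chain []Workflow
	seen := map[string]bool{}
	for current := name; current != ""; {
		if seen[current] {
			return nil, fmt.Errorf("workflow %q: cyclic extends chain", name)
		}
		seen[current] = true
		wf, ok := workflows[current]
		if !ok {
			return nil, fmt.Errorf("workflow %q not found", current)
		}
		chain = append(chain, wf)
		current = wf.Extends
	}

	// chain is ordered child first; emit stages starting from the root.
	var stages []Stage
	for i := len(chain) - 1; i >= 0; i-- {
		stages = append(stages, chain[i].Stages...)
	}
	return stages, nil
}
```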

Execution Model

Execution Plan

The system computes an execution plan by merging workflow stages with target dependencies:

┌─────────────────┐     ┌─────────────────┐
│ Workflow Order  │     │   Target Deps   │
│    (stages)     │  +  │   (requires/    │
│                 │     │    produces)    │
└────────┬────────┘     └────────┬────────┘
         │                       │
         └───────────┬───────────┘
                     ▼
         ┌───────────────────────┐
         │    Execution Plan     │
         │  - Validated          │
         │  - Optimized          │
         │  - Inspectable        │
         └───────────────────────┘

Plan Features:

  • Validation: Catches workflow/dependency conflicts before execution
  • Optimization: Identifies parallelization opportunities
  • Caching: Skips tasks with unchanged inputs
  • Persistence: Cached to .morphir/out/plan.json, optionally committable
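The validation step can be illustrated by checking each stage's targets against the requires/produces contracts from the target definitions. The Go sketch below is one possible check under stated assumptions: artifacts produced in a stage become available only to later stages, variant suffixes such as ":scala" are stripped before lookup, and the TargetSpec, PlanStage, and ValidatePlan names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// TargetSpec mirrors the requires/produces contract of a target.
type TargetSpec struct {
	Requires []string
	Produces []string
}

// PlanStage is a workflow stage as seen by the planner.
type PlanStage struct {
	Name    string
	Targets []string
}

// ValidatePlan reports undefined targets and targets whose required
// artifact types are not produced by any earlier stage.
func ValidatePlan(targets map[string]TargetSpec, stages []PlanStage) []error {
	var problems []error
	available := map[string]bool{}
	for _, stage := range stages {
		var produced []string
		for _, name := range stage.Targets {
			base, _, _ := strings.Cut(name, ":") // drop a variant suffix
			spec, ok := targets[base]
			if !ok {
				problems = append(problems,
					fmt.Errorf("stage %q: target %q is not defined", stage.Name, name))
				continue
			}
			for _, req := range spec.Requires {
				if !available[req] {
					problems = append(problems,
						fmt.Errorf("stage %q: target %q requires %q, which no earlier stage produces",
							stage.Name, name, req))
				}
			}
			produced = append(produced, spec.Produces...)
		}
		// Targets in one stage may run in parallel, so their outputs
		// only count as available once the whole stage has finished.
		for _, p := range produced {
			available[p] = true
		}
	}
	return problems
}
```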

Plan Commands

# Show execution plan
morphir plan build

# Show optimized plan with parallelization
morphir plan ci --optimize

# Explain why a specific task runs
morphir plan ci --explain gen:scala

# Export plan as JSON
morphir plan ci --output plan.json

Example Output:

$ morphir plan ci

Execution Plan for workflow "ci":
┌─────────────────────────────────────────────────────────────────┐
│ Stage: compile                                                  │
│ └── morphir-elm/make                                            │
│     ├── inputs: src/**/*.elm, elm.json                          │
│     └── outputs: .morphir/out/morphir-elm/make/ir.json          │
├─────────────────────────────────────────────────────────────────┤
│ Stage: validate (parallel)                                      │
│ └── morphir-native/validate                                     │
│     └── inputs: @morphir-elm/make:ir                            │
├─────────────────────────────────────────────────────────────────┤
│ Stage: generate                                                 │
│ └── morphir-elm/gen [variant: scala]                            │
│     └── inputs: @morphir-elm/make:ir                            │
└─────────────────────────────────────────────────────────────────┘

Cache status: make (cached), validate (stale), gen (pending)

Task Execution Lifecycle

Each task executes through a pipeline with hook points:

┌─────────────────────────────────────────────────────────────────┐
│                     Task Execution Pipeline                     │
├─────────────────────────────────────────────────────────────────┤
│  RESOLVE → CACHE → PREPARE → EXECUTE → COLLECT → REPORT         │
│     ↑        ↑        ↑         ↑         ↑         ↑           │
│     │        │        │         │         │         │           │
│  ┌──┴──┐  ┌──┴──┐  ┌──┴──┐   ┌──┴──┐   ┌──┴──┐   ┌──┴──┐        │
│  │hook │  │hook │  │hook │   │hook │   │hook │   │hook │        │
│  │chain│  │chain│  │chain│   │chain│   │chain│   │chain│        │
│  └─────┘  └─────┘  └─────┘   └─────┘   └─────┘   └─────┘        │
└─────────────────────────────────────────────────────────────────┘

Stages:

  1. RESOLVE: Find toolchain, check tool acquired, resolve input artifacts
  2. CACHE: Hash inputs, check for cache hit in .morphir/out/
  3. PREPARE: Run pre-task hooks, create output directory, set up environment
  4. EXECUTE: Spawn the process (external toolchains) or call the Go handler (native toolchains), capture output, stream diagnostics
  5. COLLECT: Gather artifacts, write meta.json, run post-task hooks
  6. REPORT: Report success/failure, aggregate diagnostics

Middleware Pattern: Toolchains can inject handlers at any stage to modify context, add behavior, or short-circuit execution.
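The middleware pattern described here is commonly expressed in Go as functions that wrap a handler. The sketch below shows one such shape, with a hook that short-circuits on a cache hit; TaskContext, Handler, Middleware, and Chain are hypothetical names and not the framework's actual types.

```go
package main

// TaskContext carries state through the pipeline (hypothetical shape).
type TaskContext struct {
	Task     string
	CacheHit bool
	Log      []string
}

// Handler runs one step of a pipeline stage.
type Handler func(ctx *TaskContext) error

// Middleware wraps a handler; returning without calling next
// short-circuits the rest of the chain.
type Middleware func(next Handler) Handler

// Chain composes middleware so that the first one listed runs outermost.
func Chain(final Handler, mws ...Middleware) Handler {
	h := final
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// Logging records entry and exit around the wrapped handler.
func Logging(label string) Middleware {
	return func(next Handler) Handler {
		return func(ctx *TaskContext) error {
			ctx.Log = append(ctx.Log, "enter "+label)
			err := next(ctx)
			ctx.Log = append(ctx.Log, "exit "+label)
			return err
		}
	}
}

// SkipOnCacheHit short-circuits when an earlier stage found a cache hit.
func SkipOnCacheHit(next Handler) Handler {
	return func(ctx *TaskContext) error {
		if ctx.CacheHit {
			ctx.Log = append(ctx.Log, "skipped "+ctx.Task)
			return nil
		}
		return next(ctx)
	}
}
```

A stage such as EXECUTE could then be built as Chain(execute, Logging("execute"), SkipOnCacheHit), where execute is the handler that actually runs the task.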

Artifact Model

Output Structure

All task outputs go to a structured directory:

.morphir/
├── cache/                          # Download cache (tools, dependencies)
└── out/                            # Task outputs (namespaced)
    ├── morphir-elm/
    │   ├── make/
    │   │   ├── meta.json           # Task metadata
    │   │   ├── ir.json             # Actual output (JSONC)
    │   │   └── diagnostics.jsonl   # Warnings/errors (JSONL)
    │   └── gen/
    │       ├── scala/
    │       │   ├── meta.json
    │       │   └── output/         # Generated files
    │       └── json-schema/
    │           └── ...
    └── morphir-native/
        └── validate/
            ├── meta.json
            └── diagnostics.jsonl

Artifact Formats

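| Type         | Format       | Use                                        |
|--------------|--------------|--------------------------------------------|
| Task outputs | JSONC        | Human-readable, supports comments          |
| Diagnostics  | JSONL/NDJSON | Streaming errors, warnings, progress       |
| Metadata     | JSON         | meta.json with inputs_hash, duration, etc. |
| Plan         | JSON         | plan.json for caching and export           |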

Artifact References

Tasks reference artifacts using logical paths that the VFS resolves:

# Reference another task's output
inputs = { ir = "@morphir-elm/make:ir" }

# System resolves to: .morphir/out/morphir-elm/make/ir.json
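The mapping from a logical reference to a path can be sketched as a small parser for the "@toolchain/task:output" form. In the Go sketch below, ArtifactRef and ParseArtifactRef are hypothetical names, and the rule that an output named ir is stored as ir.json is inferred from the single example above, not from a stated convention.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// ArtifactRef is the parsed form of "@toolchain/task:output".
type ArtifactRef struct {
	Toolchain string
	Task      string
	Output    string
}

// ParseArtifactRef parses a logical artifact reference.
func ParseArtifactRef(ref string) (ArtifactRef, error) {
	if !strings.HasPrefix(ref, "@") {
		return ArtifactRef{}, fmt.Errorf("artifact reference %q must start with @", ref)
	}
	toolchain, rest, ok := strings.Cut(ref[1:], "/")
	if !ok || toolchain == "" {
		return ArtifactRef{}, fmt.Errorf("artifact reference %q has no toolchain", ref)
	}
	task, output, ok := strings.Cut(rest, ":")
	if !ok || task == "" || output == "" {
		return ArtifactRef{}, fmt.Errorf("artifact reference %q must end in task:output", ref)
	}
	return ArtifactRef{Toolchain: toolchain, Task: task, Output: output}, nil
}

// Path returns the slash-separated location under .morphir/out.
// Mapping output "ir" to "ir.json" is an assumption of this sketch.
func (r ArtifactRef) Path() string {
	return path.Join(".morphir", "out", r.Toolchain, r.Task, r.Output+".json")
}
```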

Artifact Typing:

outputs = { ir = { path = "morphir-ir.json", type = "morphir-ir/v3" } }

inputs = { ir = { ref = "@morphir-elm/make:ir", type = "morphir-ir/v3" } }

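Type compatibility between a producing task's output and a consuming task's input is validated at plan time, before anything executes. Auto-detection of artifact types is also supported.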

Tool Acquisition

Acquisition Backends

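| Backend     | Description                                  | Priority  |
|-------------|----------------------------------------------|-----------|
| path        | Tool already on PATH                         | Immediate |
| npx         | Run via npx (avoids global install conflicts) | Near-term |
| npm         | Install via npm                              | Near-term |
| mise        | Manage via mise                              | Near-term |
| dotnet-tool | Install via dotnet tool                      | Future    |
| binary      | Download pre-built binary                    | Future    |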

Acquisition Configuration

[toolchain.morphir-elm.acquire]
backend = "npx"
package = "morphir-elm"
version = "^2.90.0"

# Or for path-based
[toolchain.custom-tool.acquire]
backend = "path"
executable = "my-custom-tool"

Environment Configuration

[toolchain.morphir-elm]
# Additional PATH entries
path = ["./node_modules/.bin"]

# Environment variables
env.NODE_OPTIONS = "--max-old-space-size=4096"

# Working directory (relative to project root)
working_dir = "."

# Resource limits
timeout = "5m"

CLI Integration

Command Mapping

# Run targets directly
morphir make # Run "make" target
morphir gen:scala # Run "gen" target with variant "scala"
morphir validate # Run "validate" target

# Run workflows
morphir build # Run "build" workflow
morphir ci # Run "ci" workflow

# Plan commands
morphir plan build # Show execution plan
morphir plan ci --explain X # Explain why task X runs in workflow "ci"

# Doctor
morphir doctor # Check for issues, ambiguities

Target Variants

Variants use colon syntax:

morphir gen:scala
morphir gen:typescript
morphir gen:json-schema

Disambiguation

When multiple toolchains provide a target:

$ morphir make
WARNING: Multiple toolchains fulfill "make": morphir-elm, morphir-haskell
Run `morphir doctor` for advice, or set project.targets.make in morphir.toml

Configuration

Toolchain Definition Locations

  1. Built-in (lowest precedence): Embedded in Morphir binary
  2. Toolchain packages: Distributed with toolchains
  3. User global: ~/.config/morphir/morphir.toml
  4. Project: morphir.toml (highest precedence)
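The precedence order above amounts to applying layers from lowest to highest so that later values win. The Go sketch below assumes merging happens per key within each toolchain table, which the design does not specify (a higher layer could instead replace a toolchain definition wholesale); Layer and MergeLayers are hypothetical names.

```go
package main

// Layer is one source of toolchain settings, keyed by toolchain name
// and then by setting name (hypothetical, simplified to strings).
type Layer map[string]map[string]string

// MergeLayers combines layers given in order of increasing precedence:
// built-in, toolchain package, user global, project. Later layers
// override earlier ones key by key (an assumption of this sketch).
func MergeLayers(layers ...Layer) Layer {
	merged := Layer{}
	for _, layer := range layers {
		for toolchain, settings := range layer {
			if merged[toolchain] == nil {
				merged[toolchain] = map[string]string{}
			}
			for key, value := range settings {
				merged[toolchain][key] = value
			}
		}
	}
	return merged
}
```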

Example Project Configuration

[project]
name = "my-morphir-project"

# Pin target implementations
[project.targets]
make = "@morphir-elm/make"
gen = "@morphir-elm/gen"

# Toolchain configuration
[toolchain.morphir-elm]
version = "2.90.0"
acquire.backend = "npx"

# Custom workflow
[workflows.deploy]
extends = "build"
stages = [
{ name = "upload", targets = ["@my-toolchain/deploy-s3"] },
]

# Custom target
[targets.deploy-s3]
description = "Deploy to S3"
requires = ["generated-code"]

Diagnostics & Error Handling

Error Ownership

The task system owns error reporting. Toolchains contribute diagnostics in a structured format.

Diagnostic Format (JSONL)

{"level": "error", "file": "src/Foo.elm", "line": 10, "col": 5, "message": "Type mismatch", "code": "E001"}
{"level": "warning", "file": "src/Bar.elm", "line": 20, "message": "Unused import"}
{"level": "info", "message": "Compiled 15 modules"}

Diagnostic Sources

  • stderr: Tool writes JSONL to stderr (preferred)
  • file: Tool writes to diagnostics.jsonl (fallback)
  • stdout: Captured and wrapped if unstructured

Diagnostics are tee'd by default (both displayed and saved to file), configurable via settings.

Doctor Command

$ morphir doctor

Checking toolchain configuration...
✓ morphir-elm: version 2.90.0 (via npx)
✓ morphir-native: built-in

Checking target resolution...
⚠ Target "gen" has multiple providers:
- morphir-elm/gen (variants: scala, json-schema, typescript)
- custom-toolchain/gen (variants: spark)
Suggestion: Pin in morphir.toml: project.targets.gen = "@morphir-elm/gen"

Checking workflows...
✓ build: valid
✓ ci: valid
✗ release: invalid
- Stage "publish" depends on target "package" which is not defined

Suggestions:
1. Define target "package" or remove stage "publish"
2. Run `morphir plan release` for detailed dependency analysis

Implementation Phases

Phase 1: Foundation

  • Core types: Toolchain, Target, Task, Workflow
  • Configuration loading for toolchains
  • Toolchain registry for native and external toolchains
  • Output directory structure (.morphir/out/)

Phase 2: WIT Toolchain Adapter

  • Adapt existing WIT pipeline to toolchain abstraction
  • Register WIT as native toolchain (make, gen, build targets)
  • Unify morphir wit * commands with toolchain CLI
  • JSONL diagnostic integration with task system
  • Validate native toolchain patterns for reuse

Phase 3: morphir-elm Integration

  • NPX acquisition backend
  • morphir-elm toolchain definition (external)
  • make and gen task implementations
  • File-based artifact passing between processes

Phase 4: Workflows & Planning

  • Workflow definition and parsing
  • Execution plan computation (merge workflow + deps)
  • Plan validation and optimization
  • morphir plan command

Phase 5: Caching & Performance

  • Input hashing for cache keys
  • Cache hit/miss detection
  • Plan caching to .morphir/out/plan.json
  • Parallel execution within stages

Phase 6: Polish & Ecosystem

  • morphir doctor command
  • Additional acquisition backends (mise, npm)
  • Workflow inheritance (extends)
  • Diagnostic aggregation

Open Questions

  1. Plan lock file format: Should morphir.lock include resolved tool versions?
  2. Remote caching: Should we support shared caches (like Bazel/Turborepo)?
  3. Plugin distribution: How should third-party toolchains be distributed?
  4. Streaming large artifacts: How to handle artifacts too large for JSONC?

Appendix: Comparison with Other Tools

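| Feature              | Morphir | Mill | Bazel | Make | Turborepo |
|----------------------|---------|------|-------|------|-----------|
| Task caching         | Yes     | Yes  | Yes   | No   | Yes       |
| Execution plan       | Yes     | Yes  | Yes   | No   | No        |
| Polyglot             | Yes     | JVM  | Yes   | Yes  | JS        |
| Artifact typing      | Yes     | Yes  | Yes   | No   | No        |
| Workflow inheritance | Yes     | No   | No    | No   | No        |
| Tool acquisition     | Yes     | No   | Yes   | No   | No        |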