Proposal: Data-Flow Support in Modeler API

Draft

This is a working draft and not yet shared with the Modeler API maintainer. Content, API surface, and naming are provisional and subject to change.

Date: 2026-05-27 • Status: Draft (for maintainer review) • Author: Shibin Das (d34dman)

Companion to Experiments with Modeler API. That document recorded why the 2024 proof-of-concept stalled. This document proposes a concrete, additive path to remove that ceiling so Modeler API can serve data-flow workflows (AI agents, n8n/LangFlow style) as a first-class citizen — not just BPMN/ECA-style process flows.

TL;DR

Modeler API today models control flow: a graph of components connected by sequence flows ("after A, do B, optionally gated by a condition"). FlowDrop — and AI agent orchestration in general — needs data flow: typed output ports wired to typed input ports, with execution order derived from data dependencies rather than authored.

The gap is small and well-localized at the data-model layer. We propose making the connection carry optional typed ports, letting the model owner declare ports per component, and adding type-aware validation — all additive and backward compatible. The change only delivers visible value when paired with a data-flow-capable modeler, for which we offer FlowDrop's xyflow UI as the reference implementation.

Problem: control flow vs data flow

The two paradigms encode fundamentally different graph semantics.

| Aspect | Control flow (BPMN / ECA) | Data flow (FlowDrop / n8n / LangFlow) |
|---|---|---|
| Edge means | "execute B after A" | "A's output port p feeds B's input port q" |
| Edge carries | target + optional condition | source port, target port, data type |
| Node shape | single entry/exit, gateways branch a token | many typed input/output ports |
| Execution order | authored by the sequence flows | derived (topological sort of data deps) |
| Branching | gateway routes one control token | every connection is an independent data channel |

The smoking gun is the connection primitive. The entire ComponentSuccessor (web/modules/contrib/modeler_api/src/ComponentSuccessor.php:13-16) is:

public function __construct(
  protected string $id,          // target component
  protected string $conditionId, // optional gating condition
) {}

No source port, no target port, no data type. The component type vocabulary (src/Api.php:57-63: START, SUBPROCESS, SWIMLANE, ELEMENT, LINK, GATEWAY, ANNOTATION) is the BPMN palette, and the cardinality constraints (src/Api.php:516-642) are expressed as successor counts per component, which is a control-flow notion. There is no concept of a port or a typed socket anywhere in the model.

By contrast, a FlowDrop edge connects typed ports and carries a data type:

sourceHandle: "{nodeId}-output-{portName}"
targetHandle: "{nodeId}-input-{portName}"
data.metadata.sourcePortDataType: "string" | "tool" | ...

and FlowDrop derives execution order via its compiler (DependencyGraph → ExecutionGraph), rather than honoring an authored sequence.
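
To make "derived" concrete, here is a minimal sketch of ordering nodes by their data dependencies with Kahn's algorithm. It is not FlowDrop's compiler: the function name `deriveExecutionOrder()` and the `source` / `target` edge shape are assumptions made for illustration only.

```php
<?php
/**
 * Derives an execution order from data-flow edges (Kahn's algorithm).
 *
 * @param string[] $nodeIds
 *   All node IDs in the graph.
 * @param array<int, array{source: string, target: string}> $edges
 *   Hypothetical edge shape: one entry per port-to-port connection.
 *
 * @return string[]
 *   Node IDs ordered so that every producer precedes its consumers.
 */
function deriveExecutionOrder(array $nodeIds, array $edges): array {
  $inDegree = array_fill_keys($nodeIds, 0);
  $consumers = array_fill_keys($nodeIds, []);
  foreach ($edges as $edge) {
    $consumers[$edge['source']][] = $edge['target'];
    $inDegree[$edge['target']]++;
  }

  // Start with the nodes that consume no other node's output.
  $ready = array_keys(array_filter($inDegree, static fn (int $d): bool => $d === 0));
  $order = [];
  while ($ready !== []) {
    $nodeId = array_shift($ready);
    $order[] = $nodeId;
    // Each satisfied edge brings its consumer one step closer to runnable.
    foreach ($consumers[$nodeId] as $consumer) {
      if (--$inDegree[$consumer] === 0) {
        $ready[] = $consumer;
      }
    }
  }

  // Nodes left unordered can only be part of a dependency cycle.
  if (count($order) !== count($nodeIds)) {
    throw new \LogicException('Data-dependency cycle detected.');
  }
  return $order;
}
```

Nothing in the input states an order; it falls out of which node consumes which output. This is the information a sequence-flow model cannot supply.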

Why the 2024 POC stalled

Both limitations recorded in the experiments doc trace directly to this single gap:

  • "becomes a themed variant of BPMN.io" — the shared interchange (Component + successor) is control-flow, so FlowDrop could only surface what BPMN can express; its typed-port richness had nowhere to live.
  • "in-workflow configuration editing not available" — the port wiring FlowDrop edits inline is not part of the Component contract; it is buried in owner-internal plugin configuration, opaque to the modeler.

Conversion asymmetry (why a clean model matters)

  • Data flow → control flow is lossy but mechanical: topologically sort, emit sequence flows. (This is exactly the "processors to massage data + auto-layout" written for the POC.) Fine for a one-way export / visualization.
  • Control flow → data flow is under-determined: sequence flows do not say what data each step consumes or produces, so ports cannot be recovered. This is the wall the ECA→FlowDrop direction hit.

A first-class data-flow model removes the lossy round-trip and lets both paradigms coexist behind one API.

Design principle: additive, opt-in, backward compatible

The guiding constraint: a control-flow owner (ECA) and the BPMN.io modeler must keep working with zero changes. Every new field is optional and defaults to today's behavior; every new interface method gets a default implementation on the base classes. Data flow is a capability a model owner and a modeler opt into, not a new requirement.

1. Generalize the connection (typed ComponentConnection)

Extend the successor with optional port + type metadata. When the new fields are null, the connection behaves exactly as today's sequence flow.

readonly class ComponentSuccessor {
  public function __construct(
    protected string $id,
    protected string $conditionId = '',
    // New, optional — present only for data-flow connections:
    protected ?string $sourcePort = NULL,
    protected ?string $targetPort = NULL,
    protected ?string $dataType   = NULL,
  ) {}
  // + getters for the new fields.
}
  • Control-flow owners: never set the new fields → identical to current behavior.
  • Data-flow owners: set sourcePort / targetPort / dataType → a typed data edge.

(Naming is open — see questions. A superset ComponentConnection type with ComponentSuccessor kept as the control-flow specialization is an alternative if in-place extension is undesirable.)
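
The opt-in behavior can be sketched with a standalone stand-in class. `TypedSuccessorSketch`, its getter names, and the `isDataConnection()` helper are hypothetical and only illustrate the idea; they are not the shipped ComponentSuccessor API.

```php
<?php
// Illustrative stand-in for the proposed class, not the shipped one.
final class TypedSuccessorSketch {

  public function __construct(
    private readonly string $id,
    private readonly string $conditionId = '',
    // Optional: set only for data-flow connections.
    private readonly ?string $sourcePort = NULL,
    private readonly ?string $targetPort = NULL,
    private readonly ?string $dataType = NULL,
  ) {}

  public function getId(): string { return $this->id; }
  public function getConditionId(): string { return $this->conditionId; }
  public function getSourcePort(): ?string { return $this->sourcePort; }
  public function getTargetPort(): ?string { return $this->targetPort; }
  public function getDataType(): ?string { return $this->dataType; }

  // Hypothetical helper: treat the connection as a data edge only when
  // both ports are present; otherwise it is a plain sequence flow.
  public function isDataConnection(): bool {
    return $this->sourcePort !== NULL && $this->targetPort !== NULL;
  }

}

// A control-flow owner keeps calling the constructor exactly as today.
$sequenceFlow = new TypedSuccessorSketch('send_mail', 'is_published');

// A data-flow owner adds the port and type metadata via named arguments.
$dataEdge = new TypedSuccessorSketch(
  id: 'llm',
  sourcePort: 'text',
  targetPort: 'prompt',
  dataType: 'string',
);
```

Because the new parameters trail the existing ones and default to NULL, two-argument call sites need no change.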

2. Owner-declared ports

A Component today is plugin + flat configuration with no port concept. Add an optional method to ModelOwnerInterface (default: empty → control-flow owner) letting the owner declare input/output ports per plugin:

/**
 * @return \Drupal\modeler_api\PortDefinition[]
 *   Declared input/output ports for the given component plugin. Empty for
 *   control-flow owners.
 */
public function componentPorts(int $type, string $pluginId): array;

PortDefinition shape (new value object):

id / name, direction (input|output), dataType,
cardinality (single|multiple), required (bool)

For FlowDrop this is a trivial projection of the existing metadata.inputs / metadata.outputs. For ECA it returns [] and nothing changes.
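
As a rough sketch of that projection, the value object and mapping could look like the following. The class, enum, and function names are placeholders, and the metadata array shape (`name`, `dataType`, `multiple`, `required` keys) is an assumption that has not been checked against FlowDrop's actual schema.

```php
<?php
enum PortDirection: string {
  case Input = 'input';
  case Output = 'output';
}

// Hypothetical value object mirroring the PortDefinition shape above.
final class PortDefinitionSketch {

  public function __construct(
    public readonly string $id,
    public readonly PortDirection $direction,
    public readonly string $dataType,
    public readonly string $cardinality = 'single',
    public readonly bool $required = FALSE,
  ) {}

}

/**
 * Projects node metadata onto port definitions.
 *
 * Assumed (unverified) metadata shape:
 * ['inputs' => [['name' => ..., 'dataType' => ...], ...], 'outputs' => [...]].
 *
 * @return PortDefinitionSketch[]
 */
function portsFromMetadata(array $metadata): array {
  $directions = ['inputs' => PortDirection::Input, 'outputs' => PortDirection::Output];
  $ports = [];
  foreach ($directions as $key => $direction) {
    foreach ($metadata[$key] ?? [] as $port) {
      $ports[] = new PortDefinitionSketch(
        $port['name'],
        $direction,
        $port['dataType'] ?? 'mixed',
        !empty($port['multiple']) ? 'multiple' : 'single',
        !empty($port['required']),
      );
    }
  }
  return $ports;
}
```

An owner with no port metadata gets an empty array back, which matches the control-flow default described above.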

3. Declare graph semantics

So the API knows whether to honor an authored order or treat the graph as data dependencies, the owner (and modeler) declare a capability:

public function graphSemantics(): string; // 'control_flow' (default) | 'data_flow'
  • control_flow → successors define execution order (today).
  • data_flow → Modeler API persists the typed connection graph and does not impose order; the owner derives execution order itself (e.g. FlowDrop's compiler). Modeler API stays an editor + storage + validation layer, not an executor — unchanged from its current role.
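
The opt-in mechanics might look roughly like this. All type names below are stand-ins rather than the real ModelOwnerInterface / ModelOwnerBase, and `usesAuthoredOrder()` is a hypothetical helper showing where the API would branch.

```php
<?php
// Stand-in for the two proposed additions to the owner interface.
interface GraphSemanticsAwareSketch {

  public function graphSemantics(): string;

  public function componentPorts(int $type, string $pluginId): array;

}

// Stand-in for the base class: defaults reproduce today's behavior.
abstract class OwnerBaseSketch implements GraphSemanticsAwareSketch {

  public function graphSemantics(): string {
    return 'control_flow';
  }

  public function componentPorts(int $type, string $pluginId): array {
    return [];
  }

}

// An existing owner inherits the defaults without touching its code.
final class ControlFlowOwnerSketch extends OwnerBaseSketch {}

// A data-flow owner opts in by overriding a single method.
final class DataFlowOwnerSketch extends OwnerBaseSketch {

  public function graphSemantics(): string {
    return 'data_flow';
  }

}

// Hypothetical branch point: honor authored order only for control flow.
function usesAuthoredOrder(GraphSemanticsAwareSketch $owner): bool {
  return $owner->graphSemantics() === 'control_flow';
}
```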

4. Type-aware validation

Today's validation is successor-count per component type (src/Api.php:516-642). Add a parallel path, gated behind graphSemantics() === 'data_flow' so control-flow owners are untouched:

  • Type compatibility — a connection is valid only if sourcePort.dataType is assignable to targetPort.dataType (exact match, plus an owner-supplied compatibility hook for subtyping).
  • Per-port cardinality — enforce single vs multiple and required at the port level, rather than counting successors on the whole component.
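
A minimal sketch of both checks follows, assuming ports and connections are passed as plain arrays. The function names, array shapes, and violation messages are illustrative, and the owner hook is modeled as an optional callable rather than any confirmed API.

```php
<?php
/**
 * Exact match, or whatever the owner-supplied hook allows.
 */
function isAssignable(string $sourceType, string $targetType, ?callable $ownerHook = NULL): bool {
  if ($sourceType === $targetType) {
    return TRUE;
  }
  return $ownerHook !== NULL && $ownerHook($sourceType, $targetType) === TRUE;
}

/**
 * Validates the connections arriving at one component's input ports.
 *
 * @param array<string, array{dataType: string, cardinality: string, required: bool}> $inputPorts
 *   Declared input ports keyed by port ID (hypothetical shape).
 * @param array<int, array{targetPort: string, dataType: string}> $incoming
 *   Incoming connections with the source port's data type.
 *
 * @return string[]
 *   Human-readable violations; empty when the wiring is valid.
 */
function validateInputPorts(array $inputPorts, array $incoming, ?callable $ownerHook = NULL): array {
  $violations = [];
  $counts = array_fill_keys(array_keys($inputPorts), 0);
  foreach ($incoming as $connection) {
    $portId = $connection['targetPort'];
    if (!isset($inputPorts[$portId])) {
      $violations[] = "Unknown input port '$portId'.";
      continue;
    }
    $counts[$portId]++;
    if (!isAssignable($connection['dataType'], $inputPorts[$portId]['dataType'], $ownerHook)) {
      $violations[] = "Type '{$connection['dataType']}' is not assignable to port '$portId'.";
    }
  }
  // Cardinality is checked per port, not per component.
  foreach ($inputPorts as $portId => $port) {
    if ($port['required'] && $counts[$portId] === 0) {
      $violations[] = "Required port '$portId' is not connected.";
    }
    if ($port['cardinality'] === 'single' && $counts[$portId] > 1) {
      $violations[] = "Port '$portId' accepts a single connection.";
    }
  }
  return $violations;
}
```

Keeping the subtype rule in a callable leaves the type lattice with the owner, so Modeler API only has to know about exact matches.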

5. FlowDrop xyflow as the reference data-flow modeler

The catch: the only shipped modeler is bpmn_io, and BPMN.io is a sequence-flow renderer by specification — it cannot draw typed multi-port nodes. So the data-model work delivers no visible value until paired with a data-flow-capable modeler. We offer FlowDrop's xyflow UI as that reference modeler:

  • Modeler plugins gain a capability flag (e.g. supportsDataFlow(): bool); bpmn_io returns FALSE and is unaffected.
  • The readComponents() / addComponent() handoff now carries port info on connections for data-flow modelers.
  • This also gives the ecosystem (e.g. AI Agents) a shareable data-flow editor, not just FlowDrop.
flowchart LR
    subgraph Modelers
      BPMN["bpmn_io<br/>(control flow)"]
      XY["FlowDrop xyflow<br/>(data flow)"]
    end
    API["Modeler API<br/>typed ComponentConnection<br/>+ port declarations<br/>+ semantics flag"]
    subgraph Owners
      ECA["ECA<br/>(control flow)"]
      FD["FlowDrop workflow<br/>(data flow)"]
      AI["AI Agents<br/>(data flow)"]
    end
    BPMN --- API
    XY --- API
    API --- ECA
    API --- FD
    API --- AI

Backward compatibility & migration

  • New ComponentSuccessor fields: optional, default NULL → existing connections unchanged.
  • New ModelOwnerInterface methods: default implementations on ModelOwnerBase (componentPorts() → [], graphSemantics() → 'control_flow') → existing owners compile and behave identically.
  • New modeler capability: defaults to FALSE → bpmn_io and all existing modelers unaffected.
  • Validation: the type-aware path only runs for data_flow owners.
  • No config migration for existing ECA models — their stored config has no port data and none is introduced.

The net is a strictly additive surface: nothing in the BPMN/ECA path changes shape.

What this is not

  • Not a request for Modeler API to execute data-flow graphs. Order derivation and execution stay with the owner (FlowDrop already does this). Modeler API remains editor + storage + validation.
  • Not a rewrite of the component-type taxonomy. Data-flow connections are orthogonal to whether a node is a START / ELEMENT / GATEWAY.

Open questions for the maintainer

  1. Scope/appetite — Is data flow something Modeler API should absorb, or should it live as a sibling/companion API? (The experiments-doc note that the AI direction "may change as maintainers bring first-class AI support to ECA" suggests the door may be open.)
  2. Connection shape — Extend ComponentSuccessor in place, or introduce a superset ComponentConnection with ComponentSuccessor as the control-flow specialization?
  3. Port source of truth — Owner-declared via componentPorts(), or derived from a richer plugin metadata contract?
  4. Mixed graphs — Should a single model be allowed to contain both control- and data-flow connections, or is graphSemantics() strictly per-model?
  5. Terminology — ports vs. handles vs. sockets; align early to avoid churn.
  6. Reference modeler — Openness to a second, non-BPMN modeler (FlowDrop xyflow) in-tree or as a companion module?

References

  • Experiments with Modeler API
  • web/modules/contrib/modeler_api/src/ComponentSuccessor.php
  • web/modules/contrib/modeler_api/src/Component.php
  • web/modules/contrib/modeler_api/src/Api.php
  • web/modules/contrib/modeler_api/src/Plugin/ModelerApiModelOwner/ModelOwnerInterface.php