Layout Engine

Zindex ships with a built-in layout engine. Agents describe what the diagram contains - nodes, edges, the relationships between them - and the engine figures out where each element should sit, how each edge should route around obstacles, and where each label should land.

The engine is implemented from scratch with zero external layout dependencies. It uses a Sugiyama-style layered layout pipeline (the same family of algorithms behind tools like Graphviz and Mermaid) and runs deterministically: identical input always produces identical output.

Building an agent integration? See How AI Agents Should Use Zindex for the recommended workflow. The auto-layout section there covers when to use auto-layout vs explicit positioning.

Why this exists

Most diagram tools assume the human knows where every box and line goes. That assumption breaks down for agents. An LLM generating an architecture diagram knows the services and the connections, but asking it to also produce pixel-perfect coordinates is asking for a worse diagram and a worse use of tokens.

The layout engine inverts the contract: agents describe the graph, the engine handles the geometry. Agents that want pixel control can still provide explicit layout on individual elements - the engine respects user-supplied positions and only fills in the gaps. This mixed mode is the most common pattern in practice: pin the critical elements, let the engine arrange the rest.

What you get

What auto-layout handles and what it doesn’t

Auto-layout is powerful but has a specific scope. Understanding the boundary prevents frustration:

Auto-layout handles (no coordinates needed):

Auto-layout also handles frames and containers:

What still needs explicit coordinates:

The rule of thumb: omit coordinates by default. The engine auto-positions nodes, edges, frames, activation bars, and fragments. Provide explicit layout only when you need pixel-precise control or a specific spatial arrangement the planner can’t infer from the graph structure.

The pipeline

Every render runs the scene through five phases. Each phase has well-defined inputs and outputs and can be inspected in isolation.

1. Measurement

Every node and every edge label is measured before any positioning happens. Node sizes come from intrinsic measurements (text width, padding, fixed shape constraints). Edge labels are measured to the pixel so the planner can leave room for them between ranks.

2. Layer assignment

Nodes are assigned to ranks (rows for TB, columns for LR) based on the directional flow of the graph. Cycles are detected and broken using a feedback arc set heuristic, so the algorithm always produces a finite layering even on cyclic input.
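The idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the engine's code: cycles are broken by dropping DFS back edges (a simple stand-in for the feedback arc set heuristic mentioned above), ranks come from longest-path layering, and the function name `assign_layers` is hypothetical.

```python
from collections import defaultdict


def assign_layers(nodes, edges):
    """Return {node_id: rank} for a directed graph that may contain cycles."""
    succ = defaultdict(list)
    for src, dst in edges:
        succ[src].append(dst)

    # Step 1: break cycles. A DFS edge that points at a node still on the
    # recursion stack (GREY) closes a cycle, so it is left out of the layering.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {n: WHITE for n in nodes}
    kept = []

    def visit(u):
        colour[u] = GREY
        for v in succ[u]:
            if colour[v] == GREY:
                continue  # back edge: ignored for ranking only
            kept.append((u, v))
            if colour[v] == WHITE:
                visit(v)
        colour[u] = BLACK

    for n in nodes:  # input order makes the traversal deterministic
        if colour[n] == WHITE:
            visit(n)

    # Step 2: longest-path layering over the remaining acyclic edges,
    # processed in topological order (Kahn's algorithm).
    out = defaultdict(list)
    indegree = {n: 0 for n in nodes}
    for u, v in kept:
        out[u].append(v)
        indegree[v] += 1
    order = [n for n in nodes if indegree[n] == 0]
    rank = {n: 0 for n in nodes}
    for u in order:  # the list grows while we iterate, acting as a queue
        for v in out[u]:
            rank[v] = max(rank[v], rank[u] + 1)
            indegree[v] -= 1
            if indegree[v] == 0:
                order.append(v)
    return rank
```

With direction TB the returned rank is the row index; with LR it is the column index. Because nodes are visited in input order, the same input always yields the same layering.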

3. Crossing reduction

Within each layer, nodes are reordered to minimise the number of edges that cross. The algorithm uses the median heuristic: each node’s preferred position is the median index of its neighbours in the adjacent layer. The algorithm alternates two sweeps: the down sweep aligns each layer with its predecessors, and the up sweep aligns each layer with its successors. Twelve iterations are enough to converge on typical graphs.

When two nodes have similar median positions (within 0.5), the algorithm tiebreaks by input order. This preserves natural reading order: if you declare api1 before api2, they’ll appear in that order even when the heuristic could swap them to save a single crossing. Larger median differences still defer to the heuristic, so the tiebreak only affects near-ties: it may forgo a marginal crossing saving, but it prevents counter-intuitive reorderings.
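A single sweep step with the tiebreak might look like the sketch below. It is one possible reading of the rule, not the engine's implementation: the names `reorder_layer`, `neighbours`, and `input_index` are hypothetical, and "within 0.5" is applied by grouping nodes whose medians lie within 0.5 of the first node in a run.

```python
def median(values):
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return float(ordered[mid])
    return (ordered[mid - 1] + ordered[mid]) / 2


def reorder_layer(layer, fixed_layer, neighbours, input_index, tie=0.5):
    """Reorder `layer` by the median slot of each node's neighbours in `fixed_layer`."""
    slot = {n: i for i, n in enumerate(fixed_layer)}
    current = {n: i for i, n in enumerate(layer)}

    def preferred(n):
        hits = [slot[m] for m in neighbours.get(n, ()) if m in slot]
        # Nodes with no neighbours in the fixed layer keep their current slot.
        return median(hits) if hits else float(current[n])

    med = {n: preferred(n) for n in layer}
    ordered = sorted(layer, key=lambda n: (med[n], input_index[n]))

    # Near-ties (medians within `tie` of the run's first node) fall back to
    # declaration order; larger differences follow the heuristic.
    result, group = [], []
    for n in ordered:
        if group and med[n] - med[group[0]] > tie:
            result.extend(sorted(group, key=input_index.get))
            group = []
        group.append(n)
    result.extend(sorted(group, key=input_index.get))
    return result
```

For example, if api1 connects to both nodes of the layer above (median 0.5) and api2 only to the first (median 0.0), the difference is within the threshold, so api1 stays ahead of api2 because it was declared first.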

4. Coordinate assignment

Once layer order is fixed, each node gets an exact position along its layer using iterative barycentric refinement with PAVA projection:

The result: chains of single-parent / single-child nodes end up dead straight (a worker connected only to a job queue sits exactly underneath it), branching points are centered above or below their children, and node spacing always satisfies the configured nodeSpacing.
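The projection step can be illustrated with the pool-adjacent-violators algorithm (PAVA). The sketch below makes simplifying assumptions: every node in the layer has the same width, so a single center-to-center `min_gap` (node width plus nodeSpacing) applies, and `desired` holds the barycentric targets in layer order. The function name is hypothetical.

```python
def project_with_spacing(desired, min_gap):
    """Least-squares closest positions to `desired` with x[i+1] - x[i] >= min_gap."""
    # Substituting y[i] = x[i] - i * min_gap turns the spacing constraint into
    # plain monotonicity (y[i+1] >= y[i]), which is the problem PAVA solves.
    shifted = [d - i * min_gap for i, d in enumerate(desired)]

    blocks = []  # each block is [sum_of_values, count]
    for value in shifted:
        blocks.append([value, 1])
        # Pool adjacent blocks while the earlier mean exceeds the later mean.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count

    result = []
    for total, count in blocks:
        mean = total / count
        for _ in range(count):
            # Undo the substitution for the node at index len(result).
            result.append(mean + len(result) * min_gap)
    return result
```

Targets that already respect the gap come back unchanged, which is why a single-parent / single-child chain stays straight. Targets that are too close are pooled and spread evenly around their shared mean: `project_with_spacing([0, 10, 20], 100)` gives `[-90.0, 10.0, 110.0]`.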

5. Edge routing

After every node has a position, edges are routed. The router walks a ranked list of anchor candidates for each edge:

  1. The visually preferred centered entry: an L-shaped path that leaves the source perpendicular to the line connecting the centers and enters the target through a centered N/S or E/W face. The L’s terminal leg lands on the target’s center column / row.
  2. If the centered L collides with an obstacle, try a 5-point step path that routes through the gap between source and target while still entering the target perpendicular to a centered face.
  3. If both fail, fall back to the same-axis pair (E↔W or N↔S) so the bend hugs the target’s edge - never ideal, but always valid.

Frames (pools, lanes, fragments) and groups are containers, not obstacles - an edge between two children of a swimlane will never be reported as colliding with the lane itself. Frames and groups are also valid edge endpoints: when an edge’s from or to references a frame ID, the router computes anchors against the frame’s bounds and the arrow terminates at the frame’s border, exactly as it would for a node target. This is the platform mechanism behind the edges-to-a-group-as-a-whole authoring convention.

After routing, every path is run through a simplification pass that snaps near-equal coordinates, drops consecutive duplicates, and removes colinear midpoints. This eliminates the sub-pixel wiggles that come from planner-driven width mismatches and keeps arrow markers oriented cleanly.
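The three steps of the simplification pass can be sketched as follows. This is a minimal illustration for orthogonal paths; the tolerance `eps` and the function name are assumptions, and snapping each point to its predecessor is only one way to interpret "near-equal".

```python
def simplify_path(points, eps=0.5):
    """Snap near-equal coordinates, drop duplicates, remove colinear midpoints."""
    if not points:
        return []

    # 1. Snap: a coordinate within eps of the previous point's coordinate
    #    is treated as equal to it.
    snapped = [points[0]]
    for x, y in points[1:]:
        px, py = snapped[-1]
        if abs(x - px) <= eps:
            x = px
        if abs(y - py) <= eps:
            y = py
        snapped.append((x, y))

    # 2. Drop consecutive duplicates.
    deduped = [snapped[0]]
    for p in snapped[1:]:
        if p != deduped[-1]:
            deduped.append(p)

    # 3. Remove midpoints that lie on the line through their neighbours
    #    (zero cross product).
    out = [deduped[0]]
    for i in range(1, len(deduped) - 1):
        a, b, c = out[-1], deduped[i], deduped[i + 1]
        cross = (b[0] - a[0]) * (c[1] - b[1]) - (b[1] - a[1]) * (c[0] - b[0])
        if cross != 0:
            out.append(b)
    if len(deduped) > 1:
        out.append(deduped[-1])
    return out
```

Note that in this sketch snapping can move the final point by up to `eps`; a production router would likely pin the endpoints to their anchors instead.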

A channel routing refinement then detects interior edge segments that overlap or run in parallel within the same corridor (from different edges) and spreads them into distinct tracks. This prevents multiple edges from visually overlapping when they share the same horizontal or vertical segment. Track spacing is constrained by the available gap between nearby nodes.
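Once overlapping segments in a corridor have been detected, spreading them into tracks reduces to choosing an offset for each one. The sketch below covers only that last step; the preferred spacing of 12 pixels and the function name are illustrative assumptions.

```python
def track_offsets(count, available_gap, preferred=12.0):
    """Offsets (relative to the corridor center) for `count` parallel segments."""
    if count <= 1:
        return [0.0] * count
    # Never exceed the room between the neighbouring nodes: count tracks
    # leave count + 1 gaps (including the two outer margins).
    spacing = min(preferred, available_gap / (count + 1))
    start = -spacing * (count - 1) / 2
    return [start + i * spacing for i in range(count)]
```

Three edges sharing a 100-pixel corridor get offsets `[-12.0, 0.0, 12.0]`; in a 20-pixel corridor the same three edges are squeezed to `[-5.0, 0.0, 5.0]`.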

Canvas-aware spread

When scene.canvas is set and the laid-out content uses substantially less space than the canvas allows, the planner scales node positions and rank centers around their centroid so the layout fills the available area. Minimum node and rank spacing remain hard floors - auto-spread only increases gaps, never compresses them. This means agents who set a larger canvas get a layout that uses the space, instead of content clustered into one corner. Spread is capped (3x maximum scale factor) to prevent absurd spacing on tiny graphs that happen to live in a generous canvas.
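A rough sketch of the spread computation, under simplifying assumptions: the content extent is measured between node centers, one uniform scale factor is used for both axes, and the `margin` parameter and function names are hypothetical. Only the 3x cap and the "never compress" floor come from the description above.

```python
def spread_factor(content_w, content_h, canvas_w, canvas_h, margin=40.0, max_scale=3.0):
    """Scale factor in [1.0, max_scale] that lets the content fill the canvas."""
    avail_w = max(canvas_w - 2 * margin, 0.0)
    avail_h = max(canvas_h - 2 * margin, 0.0)
    ratios = []
    if content_w > 0:
        ratios.append(avail_w / content_w)
    if content_h > 0:
        ratios.append(avail_h / content_h)
    if not ratios:
        return 1.0  # a single point: nothing to spread
    # Floor at 1.0 (gaps only grow) and cap at max_scale (no absurd spacing).
    return max(1.0, min(min(ratios), max_scale))


def spread_positions(centers, factor):
    """Scale node centers around their centroid by `factor`."""
    cx = sum(x for x, _ in centers) / len(centers)
    cy = sum(y for _, y in centers) / len(centers)
    return [(cx + (x - cx) * factor, cy + (y - cy) * factor) for x, y in centers]
```

A 200 x 100 layout on a 1000 x 600 canvas could scale by 5x, but the cap holds it at 3x. A layout already larger than the canvas gets a factor of 1.0 and is left untouched.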

Bonus phase: label placement

For each labeled edge, the placer samples positions along every segment of the routed path. Each candidate’s bounding box is tested against a spatial index of nodes and previously-placed labels - if anything intersects, the candidate is rejected. The remaining candidates are scored by combining segment-centeredness (prefer mid-segment) with a longest-segment preference (prefer the most prominent visual run of the edge). The chosen position is then reserved as an obstacle for subsequent placements, so neighbouring labels keep visible breathing room.
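The sample, reject, score, reserve loop can be sketched as below. Several details are assumptions made for illustration: a plain list scan stands in for the spatial index, the two scoring terms are simply added with equal weight, and `samples` and the function names are hypothetical.

```python
import math


def intersects(a, b):
    """Axis-aligned overlap test for rectangles given as (x, y, width, height)."""
    return (a[0] < b[0] + b[2] and b[0] < a[0] + a[2]
            and a[1] < b[1] + b[3] and b[1] < a[1] + a[3])


def place_label(path, label_w, label_h, obstacles, samples=5):
    """Pick a label box along `path`, reserve it in `obstacles`, and return it."""
    segments = list(zip(path, path[1:]))
    lengths = [math.dist(p, q) for p, q in segments]
    longest = max(lengths, default=0.0)
    if longest == 0:
        return None  # degenerate path: no room for a label

    best = None
    for (p, q), length in zip(segments, lengths):
        for k in range(1, samples + 1):
            t = k / (samples + 1)
            cx = p[0] + (q[0] - p[0]) * t
            cy = p[1] + (q[1] - p[1]) * t
            box = (cx - label_w / 2, cy - label_h / 2, label_w, label_h)
            if any(intersects(box, o) for o in obstacles):
                continue  # collides with a node or an earlier label
            centered = 1 - abs(t - 0.5) * 2   # 1.0 at mid-segment
            prominence = length / longest      # 1.0 on the longest segment
            score = centered + prominence
            if best is None or score > best[0]:
                best = (score, box)

    if best is None:
        return None
    obstacles.append(best[1])  # later labels must avoid this one
    return best[1]
```

Because the chosen box is appended to `obstacles`, calling `place_label` again for a neighbouring edge automatically steers its label away from the first one.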

Aesthetic scoring

The engine doesn’t just run the pipeline once and hope for the best. It evaluates 5 layout candidates with different planner parameters and picks the one with the lowest aesthetic penalty score. This happens automatically on every render.

The 5 candidates vary two parameters that have the biggest impact on layout quality: node/rank spacing (how dense or spacious the layout is) and crossing-reduction thoroughness (how many optimisation passes the planner runs). One candidate always uses the scene’s own defaults, so the worst case is identical to a single-pass layout.

Each candidate is scored on:

The winner is cached, so subsequent renders of the same scene return it instantly without re-running the candidates. The entire process is deterministic: same scene always produces the same layout.
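The select-and-cache mechanism might be sketched like this. Everything specific here is an assumption for illustration: the five parameter variants, the `sweeps` field, the cache key, and the `run_pipeline` and `penalty` callables (the actual scoring criteria are not reproduced here).

```python
import hashlib
import json

_cache = {}


def candidate_params(defaults):
    """Five parameter sets; the first is always the scene's own defaults."""
    # (spacing multiplier, crossing-reduction multiplier) - illustrative values.
    variants = [(1.0, 1.0), (0.8, 1.0), (1.25, 1.0), (1.0, 2.0), (1.25, 2.0)]
    return [
        {
            "nodeSpacing": defaults["nodeSpacing"] * s,
            "rankSpacing": defaults["rankSpacing"] * s,
            "sweeps": int(defaults["sweeps"] * t),
        }
        for s, t in variants
    ]


def best_layout(scene, defaults, run_pipeline, penalty):
    """Return the lowest-penalty candidate layout, caching the winner per scene."""
    # Canonical JSON gives a stable key for identical input.
    raw = json.dumps([scene, defaults], sort_keys=True).encode()
    key = hashlib.sha256(raw).hexdigest()
    if key not in _cache:
        layouts = [run_pipeline(scene, p) for p in candidate_params(defaults)]
        # min() returns the first of several equal scores, so ties resolve
        # deterministically in candidate order.
        _cache[key] = min(layouts, key=penalty)
    return _cache[key]
```

Since the defaults candidate is always in the list, the winner can never score worse than a single-pass layout under the same penalty function.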

For agents, this is invisible - you describe the graph, the engine does the rest. The scoring is the mechanism behind the “always produces a clean diagram” guarantee.

Configuration

The layout engine reads scene.layoutStrategy:

{
  "layoutStrategy": {
    "algorithm": "hierarchical",
    "direction": "TB",
    "nodeSpacing": 60,
    "rankSpacing": 100
  }
}

Field        Type                         Default            Description
algorithm    string                       "hierarchical"     Layout algorithm. Currently "hierarchical" is the production-ready choice.
direction    "TB" | "BT" | "LR" | "RL"    depends on family  Primary flow direction.
nodeSpacing  number                       30                 Minimum pixels between nodes in the same rank.
rankSpacing  number                       80                 Minimum pixels between adjacent ranks.

When omitted, sensible defaults are used per diagram family. The hierarchical pipeline is used by architecture, workflow, entityRelationship, uiflow, and the default fallback strategy. Org charts use a tidy-tree planner instead, and network topologies use a force-directed layout (since they have no inherent direction). Sequence diagrams use a dedicated time-ordered resolver.

Mixed mode in practice

The most common production pattern is to fix some nodes and let the engine position the rest:

{
  "schemaVersion": "0.1",
  "scene": { "id": "mixed-arch", "canvas": { "width": 1000, "height": 600 } },
  "layoutStrategy": { "algorithm": "hierarchical", "direction": "LR" },
  "elements": [
    {
      "id": "gateway",
      "kind": "node",
      "nodeType": "service",
      "label": "API Gateway",
      "layout": { "mode": "absolute", "x": 50, "y": 250, "width": 160, "height": 80 }
    },
    { "id": "auth", "kind": "node", "nodeType": "service", "label": "Auth" },
    { "id": "users", "kind": "node", "nodeType": "service", "label": "Users" },
    { "id": "billing", "kind": "node", "nodeType": "service", "label": "Billing" },
    { "id": "db", "kind": "node", "nodeType": "database", "label": "Postgres" },
    { "id": "e1", "kind": "edge", "from": { "elementId": "gateway" }, "to": { "elementId": "auth" } },
    { "id": "e2", "kind": "edge", "from": { "elementId": "gateway" }, "to": { "elementId": "users" } },
    { "id": "e3", "kind": "edge", "from": { "elementId": "gateway" }, "to": { "elementId": "billing" } },
    { "id": "e4", "kind": "edge", "from": { "elementId": "users" }, "to": { "elementId": "db" } },
    { "id": "e5", "kind": "edge", "from": { "elementId": "billing" }, "to": { "elementId": "db" } }
  ]
}

The gateway is pinned where the diagram’s author wants it. Auth, Users, Billing, and Postgres have no layout - the engine positions them automatically and routes the edges around the gateway.

Determinism guarantees

The engine is fully deterministic. Given the same scene document, the same layoutStrategy, and the same set of diagram-family rules, the output is bit-identical across runs. This means:

There is no random initialization, no time-based heuristic, and no external service call. Every algorithm in the pipeline is purely a function of the input.