Protocol Overview

The Diagram Scene Protocol (DSP) is the structured format at the core of Zindex, agent-native diagram state infrastructure that acts as a stateful diagram runtime for AI agents. DSP defines how scenes are structured, how operations mutate scenes, and how validation ensures correctness. Diagrams are durable state that agents can patch, validate, diff, and render, not one-shot output regenerated from scratch on every prompt.

Using the protocol from an AI agent? Read How AI Agents Should Use Zindex for the recommended operational sequence for creating, updating, validating, and rendering scenes.

Why create a persisted scene? Stateless rendering works for one-shot previews, but persisted scenes give you revision history (every edit is an immutable revision), incremental edits (change one node without regenerating the whole scene), visual diff (compare any two revisions, with added elements in green, removed in red, and modified in amber), and an audit trail (“at this date, this is what the diagram looked like”). Always create a persisted scene for production diagrams.
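To make the visual diff idea concrete, here is a minimal sketch of how two revisions could be compared by element id. It is not the Zindex implementation: the function name diff_elements and the assumption that each revision is a plain list of element objects keyed by "id" are illustrative only.

```python
from typing import Any

# Colour convention taken from the paragraph above.
DIFF_COLOURS = {"added": "green", "removed": "red", "modified": "amber"}


def diff_elements(
    old: list[dict[str, Any]], new: list[dict[str, Any]]
) -> dict[str, list[str]]:
    """Classify element ids as added, removed, or modified (hypothetical helper)."""
    old_by_id = {el["id"]: el for el in old}
    new_by_id = {el["id"]: el for el in new}
    shared = old_by_id.keys() & new_by_id.keys()
    return {
        "added": sorted(new_by_id.keys() - old_by_id.keys()),
        "removed": sorted(old_by_id.keys() - new_by_id.keys()),
        "modified": sorted(i for i in shared if old_by_id[i] != new_by_id[i]),
    }


rev1 = [
    {"id": "a", "kind": "node", "label": "Start"},
    {"id": "b", "kind": "node", "label": "End"},
]
rev2 = [
    {"id": "a", "kind": "node", "label": "Begin"},
    {"id": "c", "kind": "node", "label": "Review"},
]
changes = diff_elements(rev1, rev2)
```

A renderer could then look up DIFF_COLOURS for each category when highlighting the changed elements.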

Design principles

  1. Operations, not pixels - Agents work with typed operations (createNode, createEdge, move) rather than pixel coordinates and mouse events.
  2. Validation-first - Every operation and scene state is validated against 40+ rules before execution.
  3. Immutable revisions - Each operation creates a new revision. No destructive mutations.
  4. Format-agnostic - The scene model is independent of rendering format (SVG, PNG).
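The first three principles can be sketched together in a few lines. This is an illustration under stated assumptions, not the engine's code: the operation payload shape ({"op": ..., "id": ...}), the default node size, and the two checks shown are hypothetical stand-ins for the real operation schema and rule set.

```python
import copy
from typing import Any


def apply_operation(scene: dict[str, Any], op: dict[str, Any]) -> dict[str, Any]:
    """Validate an operation, then return a NEW scene; the input is never mutated."""
    ids = {el["id"] for el in scene["elements"]}

    # Validation-first: reject the operation before any state changes.
    if op["op"] == "createNode":
        if op["id"] in ids:
            raise ValueError(f"duplicate element id: {op['id']}")
    elif op["op"] == "move":
        if op["id"] not in ids:
            raise ValueError(f"unknown element id: {op['id']}")
    else:
        raise ValueError(f"unsupported operation in this sketch: {op['op']}")

    # Immutable revisions: work on a deep copy.
    new_scene = copy.deepcopy(scene)
    if op["op"] == "createNode":
        new_scene["elements"].append({
            "id": op["id"], "kind": "node", "shape": "rect", "label": op["label"],
            # Default size is an assumption made for this example.
            "layout": {"mode": "absolute", "x": 0, "y": 0, "width": 120, "height": 60},
        })
    else:  # move
        for el in new_scene["elements"]:
            if el["id"] == op["id"]:
                el["layout"]["x"], el["layout"]["y"] = op["x"], op["y"]
    return new_scene


revisions = [{"schemaVersion": "0.1", "diagramFamily": "workflow", "elements": []}]
for operation in [
    {"op": "createNode", "id": "a", "label": "Start"},
    {"op": "move", "id": "a", "x": 40, "y": 80},
]:
    revisions.append(apply_operation(revisions[-1], operation))
```

Each entry in revisions remains intact after later operations, which is what makes history and diffing possible.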

Diagram families

Always declare scene-level diagramFamily on every scene. The field is technically optional in the JSON schema, but it is effectively load-bearing - it gates family-specific behaviour the engine and downstream tooling rely on. Omitting it triggers a MISSING_DIAGRAM_FAMILY info diagnostic on the render response.

{
  "diagramFamily": "workflow",
  "schemaVersion": "0.1",
  "scene": { ... },
  "elements": [ ... ]
}

Supported families: architecture, workflow, entityRelationship, sequence, network, orgchart, uiflow.

When declared, the family enables:

Scenes without diagramFamily fall back to the generic-flowchart base case and lose all of the above. Always declare it.
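A client can catch a missing family before sending a scene. The sketch below mirrors the behaviour described in this section; the diagnostic object shape ("code", "severity") is an assumption, and how the engine treats an unrecognised family value is not stated here, so the example simply raises.

```python
from typing import Any

# Family names as listed in this section.
SUPPORTED_FAMILIES = {
    "architecture", "workflow", "entityRelationship",
    "sequence", "network", "orgchart", "uiflow",
}


def family_diagnostics(scene: dict[str, Any]) -> list[dict[str, str]]:
    """Return client-side diagnostics for the scene-level diagramFamily field."""
    family = scene.get("diagramFamily")
    if family is None:
        # Same code the render response reports; the dict shape is hypothetical.
        return [{"code": "MISSING_DIAGRAM_FAMILY", "severity": "info"}]
    if family not in SUPPORTED_FAMILIES:
        # Engine behaviour for unknown values is unspecified; fail loudly locally.
        raise ValueError(f"unsupported diagramFamily: {family!r}")
    return []


ok = family_diagnostics({"diagramFamily": "workflow", "schemaVersion": "0.1"})
missing = family_diagnostics({"schemaVersion": "0.1"})
```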

Scene structure

A DSP scene contains:

{
  "schemaVersion": "0.1",
  "diagramFamily": "workflow",
  "layoutStrategy": { "algorithm": "hierarchical", "direction": "TB" },
  "scene": {
    "id": "unique-id",
    "title": "My Diagram",
    "units": "px",
    "canvas": { "width": 1200, "height": 800, "background": "#ffffff" }
  },
  "elements": [],
  "styles": {},
  "constraints": []
}

layoutStrategy is optional but recommended for auto-layout - when set, nodes can omit their layout field and the engine computes positions from the graph structure automatically. diagramFamily is technically optional in the schema but should always be declared (see Diagram families above).
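The relationship between layoutStrategy and per-node layout can be checked before submission. The helper below is a hypothetical pre-flight check, not part of the protocol: it reports nodes that have no layout field in a scene that also has no layoutStrategy, since nothing would then determine their positions.

```python
from typing import Any


def nodes_missing_layout(scene: dict[str, Any]) -> list[str]:
    """List node ids that have no layout when the scene declares no layoutStrategy."""
    if "layoutStrategy" in scene:
        # With auto-layout, nodes may omit layout and the engine computes positions.
        return []
    return [
        el["id"]
        for el in scene.get("elements", [])
        if el.get("kind") == "node" and "layout" not in el
    ]


auto_scene = {
    "schemaVersion": "0.1",
    "diagramFamily": "workflow",
    "layoutStrategy": {"algorithm": "hierarchical", "direction": "TB"},
    "elements": [{"id": "a", "kind": "node", "label": "Start"}],
}
manual_scene = {
    "schemaVersion": "0.1",
    "diagramFamily": "workflow",
    "elements": [
        {"id": "a", "kind": "node", "label": "Start"},
        {"id": "b", "kind": "node", "label": "End",
         "layout": {"mode": "absolute", "x": 0, "y": 0, "width": 120, "height": 60}},
    ],
}
auto_missing = nodes_missing_layout(auto_scene)
manual_missing = nodes_missing_layout(manual_scene)
```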

Element kinds

Kind    Description
node    A shape with a label (rect, roundedRect, ellipse, diamond, cylinder, pill, hexagon, parallelogram, cloud)
edge    A connection between two elements (typically nodes; frame and group IDs are also valid targets)
group   A logical grouping of elements
frame   A visual container with a title bar
text    A standalone text block
image   An image element
guide   A horizontal or vertical alignment guide
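As a rough illustration of how the element kinds constrain one another, the sketch below checks that every element has a known kind and that each edge endpoint refers to a node, frame, or group. The endpoint field names "from" and "to" are assumptions made for this example; the actual edge schema is not shown in this overview.

```python
from typing import Any

ELEMENT_KINDS = {"node", "edge", "group", "frame", "text", "image", "guide"}
EDGE_TARGET_KINDS = {"node", "frame", "group"}


def check_elements(elements: list[dict[str, Any]]) -> list[str]:
    """Return human-readable problems found in a list of elements (hypothetical check)."""
    problems: list[str] = []
    kinds_by_id = {el["id"]: el.get("kind") for el in elements}
    for el in elements:
        kind = el.get("kind")
        if kind not in ELEMENT_KINDS:
            problems.append(f"{el['id']}: unknown kind {kind!r}")
        elif kind == "edge":
            # "from"/"to" are assumed field names, not confirmed by the protocol docs.
            for end in ("from", "to"):
                if kinds_by_id.get(el.get(end)) not in EDGE_TARGET_KINDS:
                    problems.append(
                        f"{el['id']}: '{end}' must reference a node, frame, or group"
                    )
    return problems


elements = [
    {"id": "a", "kind": "node", "shape": "rect", "label": "API"},
    {"id": "f", "kind": "frame"},
    {"id": "t", "kind": "text"},
    {"id": "e1", "kind": "edge", "from": "a", "to": "f"},
    {"id": "e2", "kind": "edge", "from": "a", "to": "t"},
]
problems = check_elements(elements)
```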

Layout modes

Node text wrapping

Node labels support textStyle for controlling text wrapping and alignment:

{
  "id": "card",
  "kind": "node",
  "shape": "roundedRect",
  "label": "A longer description that wraps within the node bounds",
  "layout": { "mode": "absolute", "x": 0, "y": 0, "width": 200, "height": 100, "autoSize": "content" },
  "textStyle": { "wrap": "word", "align": "left", "verticalAlign": "top" }
}

Set autoSize: "content" on the layout to auto-compute node height from the wrapped text content.
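To give a feel for what content-based sizing involves, here is a rough estimate of node height from word-wrapped text. It is only an approximation under loud assumptions: the average character width, line height, and padding values are invented for the example, and a real engine would measure text with actual font metrics.

```python
import textwrap


def estimate_height(
    label: str,
    width: float,
    char_width: float = 7.0,    # assumed average glyph width in px
    line_height: float = 18.0,  # assumed line height in px
    padding: float = 12.0,      # assumed inner padding in px
) -> float:
    """Estimate the height needed to fit a word-wrapped label in a node of given width."""
    chars_per_line = max(1, int((width - 2 * padding) // char_width))
    lines = textwrap.wrap(label, width=chars_per_line) or [""]
    return len(lines) * line_height + 2 * padding


label = "A longer description that wraps within the node bounds"
wide = estimate_height(label, width=200)
narrow = estimate_height(label, width=100)
```

Narrowing the node reduces the characters available per line, so the estimated height grows as the text wraps onto more lines.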

Constraints

10 constraint types enforce spatial relationships between elements:

Rendering

Scenes are rendered to SVG or PNG. Rendering is a projection - the scene is the canonical state, rendered output is a view.
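The projection idea can be shown with a toy renderer that reads a scene and returns SVG text without touching the scene itself. This is a sketch only: it handles just nodes with an absolute layout, draws every shape as a rectangle, and the function name render_svg is hypothetical.

```python
import copy
from typing import Any
from xml.sax.saxutils import escape


def render_svg(scene: dict[str, Any]) -> str:
    """Project a scene to SVG markup; the scene is read, never modified."""
    canvas = scene["scene"]["canvas"]
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{canvas["width"]}" height="{canvas["height"]}">'
    ]
    for el in scene["elements"]:
        if el.get("kind") != "node" or "layout" not in el:
            continue  # this sketch skips everything except positioned nodes
        box = el["layout"]
        parts.append(
            f'<rect x="{box["x"]}" y="{box["y"]}" width="{box["width"]}" '
            f'height="{box["height"]}" fill="none" stroke="black"/>'
        )
        cx = box["x"] + box["width"] / 2
        cy = box["y"] + box["height"] / 2
        parts.append(
            f'<text x="{cx}" y="{cy}" text-anchor="middle">'
            f'{escape(el.get("label", ""))}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


scene = {
    "schemaVersion": "0.1",
    "diagramFamily": "workflow",
    "scene": {"id": "unique-id", "title": "My Diagram", "units": "px",
              "canvas": {"width": 1200, "height": 800, "background": "#ffffff"}},
    "elements": [
        {"id": "card", "kind": "node", "shape": "roundedRect", "label": "Build & test",
         "layout": {"mode": "absolute", "x": 0, "y": 0, "width": 200, "height": 100}},
    ],
}
before = copy.deepcopy(scene)
svg = render_svg(scene)
```

Because the scene is unchanged after rendering, the same revision can be projected to several outputs without any of them becoming the source of truth.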

Pass "theme" on render requests to control aesthetics without modifying the scene: