# PII data-flow diagram with audit history
> Generate and maintain data-flow diagrams that show where sensitive data moves, what each system stores, and how the architecture changed between audit periods. Revision history serves as audit evidence.
## Metadata
- **Diagram family**: `architecture`
- **Source**: Compliance inventory / data catalog / codebase classification
- **Workflow types**: classify, generate, update, validate, render, audit
- **Audience**: security, platform, agent
- **MCP tools**: `dsp_get_scene`, `dsp_apply_ops`, `dsp_validate_scene`, `dsp_render_scene`, `dsp_diff_scene`, `dsp_list_revisions`
- **HTTP endpoints**:
  - `GET /v1/scenes/:id`
  - `POST /v1/scenes/:id/applyOps`
  - `POST /v1/scenes/:id/render`
  - `GET /v1/scenes/:id/revisions`
  - `GET /v1/scenes/:id/diff`
---

## What this example shows

A KYC / customer-onboarding pipeline rendered as a data-flow architecture diagram, with sensitive-data edges visually distinct from regular flows. The diagram captures where personally identifiable information enters the system, every system that stores or processes it, and where it crosses a trust boundary into a third-party service. Each revision of the scene becomes an audit-evidence snapshot - "here is what the architecture looked like on the day of the audit."

## When to use it

Reach for this when you need GDPR / CCPA / SOC 2 evidence, when a vendor is reviewing your data flows, when a security team is assessing PII exposure, or when a customer's procurement team asks for a diagram of where their data lives. The diagram is generated from a classified inventory rather than hand-drawn; agents are good at this kind of comprehensive sweep, and humans are good at reviewing the result.

## What the agent does

The agent ingests a data classification source (compliance inventory, data catalog, or codebase scan tagged with PII annotations), then applies typed operations to the persisted scene: `createNode` for each system that touches PII, `createEdge` with sensitive-data styling for each flow, frame containers for trust boundaries (internal / customer-facing / third-party). Every change creates an immutable revision; the agent stores `dsp_list_revisions` output as part of the audit packet.
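
The ops-batch construction described above can be sketched as a pure function. The `createNode`/`createEdge` op names come from this doc; the exact field names (`elementId`, `nodeType`, `sensitive`, the inventory shape) are illustrative assumptions, not the canonical schema.

```python
# Sketch: turn a classified inventory into an applyOps batch.
# Field names are assumptions - check the real op schema before relying on them.

def build_ops(inventory: list[dict]) -> list[dict]:
    """One createNode per PII-touching system, one createEdge per flow."""
    ops = []
    for system in inventory:
        ops.append({
            "op": "createNode",
            "elementId": system["id"],
            "nodeType": system["nodeType"],  # actor / service / database / externalSystem
            "label": system["label"],
        })
        for flow in system.get("flows", []):
            ops.append({
                "op": "createEdge",
                "from": system["id"],
                "to": flow["to"],
                "label": flow["data"],  # e.g. "full PII (DPA)"
                # sensitive-data styling applies to PII-class flows
                "sensitive": flow["classification"] in ("pii", "kyc", "financial"),
            })
    return ops

inventory = [
    {"id": "webapp", "nodeType": "service", "label": "Web App",
     "flows": [{"to": "vault", "data": "full PII (DPA)", "classification": "pii"}]},
    {"id": "vault", "nodeType": "database", "label": "Customer Vault", "flows": []},
]
ops = build_ops(inventory)
```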

When the architecture changes - a vendor swap, a new processor, a deprecated cache - the agent applies the delta and the visual diff highlights exactly what's different from the prior audit period. The revision history IS the audit trail.
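
The period-over-period delta can be computed before any ops are applied by diffing two inventory snapshots keyed on stable system ids (the snapshot shape below is hypothetical):

```python
# Sketch: compute the audit-period delta between two inventory snapshots.
# Keys are stable system ids; values stand in for each system's classification state.

def inventory_delta(prev: dict, curr: dict) -> dict:
    prev_ids, curr_ids = set(prev), set(curr)
    return {
        "added": sorted(curr_ids - prev_ids),
        "removed": sorted(prev_ids - curr_ids),
        "changed": sorted(i for i in prev_ids & curr_ids if prev[i] != curr[i]),
    }

# A vendor swap shows up as added+removed; a deprecated cache as removed.
q2 = {"webapp": "pii", "legacy-cache": "pii", "vault": "pii"}
q3 = {"webapp": "tokenized", "vault": "pii", "sendgrid": "pii"}
delta = inventory_delta(q2, q3)
```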

## What the output includes

- Frame-grouped diagram with explicit trust boundaries (customer device, application backend, third-party processors).
- Sensitive-data edges visually styled distinctly from regular flows.
- A complete revision history accessible via `dsp_list_revisions` with timestamps and per-revision messages - each revision documents what changed and why.
- Visual diff between any two revisions for "what changed since the last audit?" reviews.
- Persistent stable IDs so a system that's been in the architecture for years has the same `elementId` across all audit periods, simplifying cross-period comparison.
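
One way to keep `elementId` stable across audit periods is to derive it deterministically from the system's canonical name rather than generating it at random. The slug scheme below is an assumption, not the tool's actual id format:

```python
import re

# Sketch: derive a stable elementId from a system's canonical name so the
# same system keeps the same id in every audit period.

def element_id(system_name: str) -> str:
    slug = re.sub(r"[^a-z0-9]+", "-", system_name.lower()).strip("-")
    return f"sys-{slug}"
```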
## Agent workflow
Maintain a regulator-friendly data-flow diagram showing where PII / KYC / financial data moves between systems, what each system stores, and how the topology has evolved over time, by combining a structured data-classification inventory with a Zindex scene that carries an immutable audit trail.

**Inputs**

- Data-classification inventory (typically a YAML/JSON file in compliance/ describing which fields each system stores and their classification)
- Trust-boundary configuration (which systems sit inside vs outside the company perimeter)
- Existing Zindex scene id (one per audited workflow - e.g. one for KYC, one for billing)
- Zindex API key with scene-write scope

**Outputs**

- Updated persisted scene with one node per system, edges labelled by what data crosses (e.g. 'full PII (DPA)'), and a frame around any external trust boundary
- Rendered SVG suitable for inclusion in a SOC 2 / ISO 27001 audit pack
- Revision diff showing what changed between this audit cycle and the previous one (which is the auditor-facing answer to 'what changed since last quarter?')
- Per-revision watermark trail proving the diagram was pinned to a specific date

**Steps**

1. **Fetch the persisted compliance scene** - Compliance scenes are long-lived - one scene per audited workflow, with ten or twenty revisions stretching back over the audit history. Fetch the current revision before changing anything.
   _MCP: `dsp_get_scene` · HTTP: `GET /v1/scenes/${SCENE_ID}`_

2. **Read the data-classification inventory** - Parse compliance/data-classification.yaml. Each entry maps a system to the data it stores and its classification (`public`, `internal`, `pii`, `kyc`, `financial`). Edges between systems are derived from which classification crosses each boundary.
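
A minimal sketch of the inventory shape this step assumes, with a helper that derives the edge list from it. The field names (`system`, `stores`, `sends`) are hypothetical; adapt them to your actual `compliance/data-classification.yaml` schema.

```python
# Sketch: one normalized inventory entry (shape is an assumption) and a helper
# that derives (source, target, classification) edges from the "sends" flows.

ENTRY = {
    "system": "webapp",
    "stores": ["email", "name"],
    "classification": "pii",
    "sends": [{"to": "vault", "classification": "pii"}],
}

def derive_edges(entries: list[dict]) -> list[tuple[str, str, str]]:
    """An edge exists wherever a classification crosses a system boundary."""
    return [(e["system"], s["to"], s["classification"])
            for e in entries for s in e.get("sends", [])]
```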

3. **Compute the diff against the persisted scene** - For each system in the classification, ensure a node exists with the expected nodeType (`actor`, `service`, `database`, `externalSystem`). For each data flow, ensure an edge exists with the expected label (e.g. 'full PII (DPA)' or 'tokenized'). Style PII-bearing edges with a red stroke; tokenized/hashed edges with green; internal-only edges in default. Frame any external-vendor systems inside a dashed trust-boundary frame so the auditor can see what data crosses the company perimeter.
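
The styling rule in this step (red for unprotected PII, green for tokenized/hashed, default otherwise) is mechanical enough to encode directly. The returned style dict is a sketch; the real edge-style attributes may differ.

```python
# Sketch: map a flow's protection posture to the stroke colour described above.
# Protection wins over classification: a tokenized PII flow renders green.

def edge_style(classification: str, protection=None) -> dict:
    if protection in ("tokenized", "hashed"):
        return {"stroke": "green"}
    if classification in ("pii", "kyc", "financial"):
        return {"stroke": "red"}
    return {}  # internal-only traffic keeps the default style
```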

4. **Apply the operation batch** - Send the diff as one applyOps batch. Use a `revisionMessage` that reads like an audit-log entry - 'Q3 2026: add tokenization layer between webapp and customer vault'. The revision message will surface in `dsp_diff_scene` output and ends up in the audit trail.
   _MCP: `dsp_apply_ops` · HTTP: `POST /v1/scenes/${SCENE_ID}/applyOps`_
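
The `errorPolicy` and `revisionMessage` fields appear elsewhere in this doc; the envelope shape around them below is an assumption, shown only to illustrate how the batch and the audit-log-style message travel together:

```python
import json

# Sketch: assemble the applyOps request body. Envelope field placement is
# an assumption - only errorPolicy and revisionMessage come from this doc.

def apply_ops_body(ops: list[dict], message: str) -> str:
    return json.dumps({
        "ops": ops,
        "errorPolicy": "allOrNothing",
        "revisionMessage": message,  # reads like an audit-log entry
    })

body = apply_ops_body(
    [{"op": "createNode", "elementId": "token-svc", "nodeType": "service",
      "label": "Tokenization Layer"}],
    "Q3 2026: add tokenization layer between webapp and customer vault",
)
```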

5. **Validate the new revision** - Confirm the topology is valid. Pay attention to `LABEL_DUPLICATION_DETECTED` (two edges with the same label between different systems can confuse an auditor) and resolve before publishing.
   _MCP: `dsp_validate_scene` · HTTP: `POST /v1/scenes/validate`_
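
A cheap pre-flight for `LABEL_DUPLICATION_DETECTED` is to scan the edge list locally before calling the validator. The edge dict shape below is hypothetical:

```python
from collections import defaultdict

# Sketch: flag any label that appears on edges between different system pairs -
# the condition this doc says confuses auditors.

def duplicate_labels(edges: list[dict]) -> list[str]:
    pairs = defaultdict(set)
    for e in edges:
        pairs[e["label"]].add((e["from"], e["to"]))
    return sorted(label for label, p in pairs.items() if len(p) > 1)
```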

6. **Render the data-flow diagram** - Render to SVG. The watermark stamps scene-id + revision + date so the auditor always knows which version of the diagram they are looking at, with no separate metadata sidecar.
   _MCP: `dsp_render_scene` · HTTP: `POST /v1/scenes/${SCENE_ID}/render`_

7. **Generate the audit-cycle diff** - On audit cycles (typically quarterly), call `dsp_diff_scene` with `from=` the revision pinned at the start of the audit period and `to=` the current revision. The output is the canonical answer to 'what changed since last audit?' - far better evidence than a stack of dated screenshots.
   _MCP: `dsp_diff_scene` · HTTP: `GET /v1/scenes/${SCENE_ID}/diff?from=${AUDIT_START_REVISION}&to=${NEW_REVISION}`_
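
Building the diff request and summarising its output might look like this. The base URL is a placeholder, and the diff JSON shape (top-level `added` / `removed` / `changed` arrays) is an assumption:

```python
from urllib.parse import urlencode

# Sketch: construct the diff URL from step 7 and summarise the response
# for the audit pack. The response shape is assumed, not documented here.

def diff_url(base: str, scene_id: str, start: int, head: int) -> str:
    return f"{base}/v1/scenes/{scene_id}/diff?" + urlencode({"from": start, "to": head})

def summarise(diff: dict) -> str:
    return (f"{len(diff.get('added', []))} added, "
            f"{len(diff.get('removed', []))} removed, "
            f"{len(diff.get('changed', []))} changed since last audit")
```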

8. **Publish the SVG + diff to the audit pack** - Drop the rendered SVG and the diff JSON into the audit-pack directory. Both are traceable back to the persisted scene by id + revision; auditors who want to verify the diagram against current systems can replay the workflow at the recorded revision number.
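
Traceability back to scene id + revision can be baked into the artifact names themselves. The directory layout and naming scheme below are hypothetical:

```python
from pathlib import Path

# Sketch: name audit-pack artifacts by scene id + revision so each file is
# traceable back to the persisted scene without a metadata sidecar.

def audit_pack_paths(pack_dir: str, scene_id: str, revision: int):
    base = Path(pack_dir) / f"{scene_id}-r{revision}"
    return base.with_suffix(".svg"), Path(f"{base}.diff.json")
```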
## Agent prompt

Drop this into a system prompt for an MCP-connected agent.

```
You are an automated compliance-documentation agent. Your job is to keep a regulator-friendly data-flow diagram in sync with the actual systems and data classifications, so that SOC 2 / ISO 27001 / GDPR / KYC audits can rely on the diagram as primary evidence rather than dated screenshots.

The persisted Zindex scene id is `${SCENE_ID}`; it already exists. Each audited workflow has its own scene - one for KYC, one for billing, one for any vendor-share boundary you need to document. Treat each scene as the canonical, immutably revisioned record of how data flowed through the system at every point in time.

Workflow on every run (typically scheduled monthly, plus on every PR that touches `compliance/data-classification.yaml`):

1. Read `compliance/data-classification.yaml` (or the equivalent inventory). Each entry maps a system to (a) the data it stores, (b) the data classification (`public`, `internal`, `pii`, `kyc`, `financial`), and (c) whether the system is internal or an external vendor.

2. Call `dsp_get_scene({ sceneId: "${SCENE_ID}" })` to read the current revision and elements. Do not modify anything yet - this run might be a no-op if classifications haven't changed.

3. Diff the parsed classification against the persisted scene. For each system, ensure a node exists with the right nodeType (`actor`, `service`, `database`, `externalSystem`) and label. For each data-flow edge, ensure the edge exists with a label that names what data crosses (e.g. 'full PII (DPA)', 'tokenized', 'hashed events'). Apply edge styles consistently: red stroke for unprotected PII, green for tokenized/hashed, default for internal traffic - auditors and engineers should be able to read the protection posture at a glance. If any external vendors are touched, wrap them in a dashed trust-boundary frame so the perimeter is explicit.

4. Call `dsp_apply_ops` with one batch. `errorPolicy: "allOrNothing"`. The `revisionMessage` should read like an audit-log entry: 'Q3 2026: add tokenization layer between webapp and customer vault' or '2026-04-15: add SendGrid as PII recipient under DPA'. Revision messages surface in `dsp_diff_scene`; they ARE the audit trail.

5. Call `dsp_validate_scene`. Resolve any `LABEL_DUPLICATION_DETECTED` (two edges with the same label between different systems is genuinely confusing in an audit context - rename or anchor to a column to disambiguate). `EDGE_LABEL_SUPPRESSED_REDUNDANT` should not appear on this diagram family; if it does, an edge label is matching a column in an ER diagram and you've used the wrong scene.

6. Call `dsp_render_scene({ format: "svg", theme: "clean" })`. The rendered SVG carries a watermark with scene-id + revision + date - leave the watermark on; auditors rely on it.

7. On audit-cycle runs (quarterly is typical), call `dsp_diff_scene({ from: ${AUDIT_START_REVISION}, to: NEW_REVISION })`. The output lists added / removed / changed elements over the audit period - this is the canonical answer to 'what changed since last audit?' and is far stronger evidence than a stack of dated screenshots. Drop both the SVG and the diff JSON into the audit-pack directory.

Hard rules: never hand-edit the rendered SVG (the watermark and scene-id traceability are the whole point). Never delete a revision or rewrite history - the immutability is what makes the audit trail credible. Never commit a scene that fails validation; auditors who spot-check the diagram against live systems must always be able to re-validate it. Treat the data-classification inventory as the source of truth; if a system is not classified, it does not belong in the diagram until it is.
```
## Validation

Captured `POST /v1/scenes/validate` response: **valid** (0 diagnostics)

_Scene validates with no diagnostics._
## Resources
- **Canonical scene**: [/examples/compliance-pii-flow.scene.json](/examples/compliance-pii-flow.scene.json)
- **Operations envelope**: [/examples/compliance-pii-flow.ops.json](/examples/compliance-pii-flow.ops.json)
- **Workflow recipe**: [/examples/compliance-pii-flow.workflow.json](/examples/compliance-pii-flow.workflow.json)
- **Revision diff**: [/examples/compliance-pii-flow.diff.json](/examples/compliance-pii-flow.diff.json)
- **GitHub Actions workflow**: [/examples/compliance-pii-flow.github-actions.yml](/examples/compliance-pii-flow.github-actions.yml)
- **Rendered SVG**: [/examples/compliance-pii-flow.svg](/examples/compliance-pii-flow.svg)
- **Human page**: [/examples/compliance-pii-flow](/examples/compliance-pii-flow)
- **Manifest**: [/examples/index.json](/examples/index.json)
## Related examples

- [/examples/er-diagram-from-migrations](/examples/er-diagram-from-migrations)
- [/examples/living-architecture-docs](/examples/living-architecture-docs)
- [/examples/api-dependency-map](/examples/api-dependency-map)
- [/examples/request-flow-from-handler](/examples/request-flow-from-handler)