Introduction

insigz is a geospatial intelligence fusion platform. This documentation explains how it's built, how to put data into it, how to reason over it, and how to run sessions on top of it. Read in order; each section depends on the previous one.


What insigz is

insigz is the layer between continuously-published world data — AIS, electrical grids, sanctions registries, news feeds, weather, cyber advisories — and the people who have to make decisions on top of it. We fuse live sources across maritime, energy, sanctions, news and research into a single canonical model: Source → Observation → Entity → Event → Case → Report.

Three things make insigz different from "another intelligence platform":

PRINCIPLE
AI supports, never decides. This is the binding constraint that shapes every agent surface. See #oversight for how it's enforced.

Quickstart

A working insigz tenant in five steps. Assumes you have a contract scope and a deployment target (Cloud Run on GCP — our Swiss europe-west6 region, or a customer region).

# 1. Provision a tenant database
$ insigzctl tenant create --name=baltic-2027 --region=eu-central

# 2. Seed the canonical model
$ insigzctl migrate --tenant=baltic-2027 --version=latest

# 3. Enable the data sources you need
$ insigzctl sources enable ais entso-e ofac ncsc noaa

# 4. Verify ingestion
$ insigzctl observe --since="5 min ago" --count
   14,602 observations

# 5. Open the working surface
$ insigzctl open

Within ~10 minutes you should see the live map populating in the Workshop view. From there, scenarios and sessions are built using the Workshop authoring tools.

Core concepts

Five nouns are enough to describe everything insigz models:

Source

An external system insigz reads from. Examples: ais.marinetraffic, entso-e.transparency, ofac.sdn. Each source has an ingestor, a polling cadence, and a provenance signature.

Observation

An atomic, timestamped, geolocated fact produced by an ingestor. Observations are immutable and signed by their source. Example: vessel:imo:9123456 position:(59.43,24.75) at:2027-02-15T04:17:08Z source:ais.

Entity

A noun in the world. Vessels, cables, ports, substations, sanctioned parties, news outlets. Entities accumulate observations over time and have stable canonical IDs across sessions.

Event

A change in entity state, derived from one or more observations. Examples: cable-throughput-drop, ais-gap, designation-added. Events carry confidence, provenance, and a back-reference to their constituting observations.

Case

A bound collection of events that mean something together. Analysts open cases; insigz suggests bindings. A case is the unit of analysis — what becomes a report.
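The relationships between these nouns can be sketched in a few lines of Python. This is an illustration, not the insigz schema: the class fields, the `derive_ais_gaps` helper, the one-hour silence threshold and the fixed confidence of 1.0 are all hypothetical, chosen only to show an Event being derived from Observations and keeping a back-reference to them.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)  # frozen mirrors "observations are immutable"
class Observation:
    id: str
    source_id: str
    entity_id: str
    observed_at: datetime


@dataclass
class Event:
    kind: str                    # e.g. "ais-gap"
    entity_id: str
    confidence: float
    observation_ids: list[str]   # back-reference to constituting observations


@dataclass
class Case:
    title: str
    events: list[Event] = field(default_factory=list)


def derive_ais_gaps(observations, max_silence=timedelta(hours=1)):
    """Emit an 'ais-gap' Event wherever one entity's consecutive observations
    are further apart than max_silence (hypothetical rule and threshold)."""
    by_entity = {}
    for obs in sorted(observations, key=lambda o: o.observed_at):
        by_entity.setdefault(obs.entity_id, []).append(obs)

    events = []
    for entity_id, series in by_entity.items():
        for prev, curr in zip(series, series[1:]):
            if curr.observed_at - prev.observed_at > max_silence:
                # confidence 1.0 is a placeholder, not a calibrated value
                events.append(Event("ais-gap", entity_id, 1.0, [prev.id, curr.id]))
    return events


t0 = datetime(2027, 2, 15, 4, 0, tzinfo=timezone.utc)
observations = [
    Observation("OBS-1", "ais", "vessel:imo:9123456", t0),
    Observation("OBS-2", "ais", "vessel:imo:9123456", t0 + timedelta(minutes=10)),
    Observation("OBS-3", "ais", "vessel:imo:9123456", t0 + timedelta(hours=3)),
]
case = Case("Baltic AIS gaps", derive_ais_gaps(observations))
```

Here the case ends up holding a single ais-gap event that points back at OBS-2 and OBS-3, the two observations on either side of the silence.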

The canonical model

Every table in every insigz tenant is one of: a source registry, an observation log, an entity table, an event log, a case binding, or a report manifest. There are no exceptions and no escape hatches.

CREATE TABLE observations (
  id              uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  source_id       text NOT NULL REFERENCES sources(id),
  entity_id       uuid REFERENCES entities(id),
  observed_at     timestamptz NOT NULL,
  geom            geometry(Point, 4326),
  payload         jsonb NOT NULL,
  signature       bytea NOT NULL,
  received_at     timestamptz NOT NULL DEFAULT now()
);

Every observation has a signature from its source — an Ed25519 signature over a canonical hash of (source_id, observed_at, payload). The signature is what makes the audit log defensible: if a faculty member disputes a consequence, we can trace it back through events to specific signed observations.
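As a rough sketch of what "a canonical hash of (source_id, observed_at, payload)" could look like: the example below assumes SHA-256 and a sorted-key, compact JSON encoding as the canonical form. Neither choice is specified here, and `canonical_hash` is a hypothetical name. The Ed25519 signing step itself is left out; it would be applied to the returned digest with whichever Ed25519 library the ingestor uses.

```python
import hashlib
import json


def canonical_hash(source_id: str, observed_at: str, payload: dict) -> bytes:
    """SHA-256 over a deterministic JSON encoding of the signed triple.

    Assumptions: SHA-256 as the hash and sorted-key compact JSON as the
    canonical form; the documentation only says "a canonical hash".
    """
    canonical = json.dumps(
        [source_id, observed_at, payload],
        sort_keys=True,           # key order must not affect the digest
        separators=(",", ":"),    # no insignificant whitespace
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(canonical).digest()


# Same facts, different key order: the digests should match.
a = canonical_hash("ais", "2027-02-15T04:17:08Z", {"lat": 59.43, "lon": 24.75})
b = canonical_hash("ais", "2027-02-15T04:17:08Z", {"lon": 24.75, "lat": 59.43})
# One second later: the digest should differ.
c = canonical_hash("ais", "2027-02-15T04:17:09Z", {"lat": 59.43, "lon": 24.75})
```

The point of canonicalising before hashing is that two byte-different encodings of the same observation verify against the same signature, while any change to the signed fields does not.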

SEE ALSO
The full schema reference lives at #schema. The data-source catalog is at #sources.

Framework lineage

The Source → Observation → Entity → Event → Case → Report chain is a deliberate synthesis of two established traditions, not a bespoke invention. Naming the lineage matters: it lets any reviewer with a fusion or platform background place the model immediately.

Defense / IC data fusion (JDL / DFIG)

The Joint Directors of Laboratories model is the reference framework for intelligence data fusion. Our stages map onto its levels almost one-to-one:

JDL level                            insigz stage
L0     source / signal refinement    Source, Observation
L1     object refinement             Entity
L2     situation refinement          Event
L2/L3  situation & threat            Case
L3     impact assessment             Report
L4     process refinement            the platform itself (ingestion, tasking, audit)

Entity-centric ontology (commercial)

The closer comparison is the modern ontology pattern — object types (an entity or event), properties (its characteristics), and link types (relationships), all mapped from source data. insigz grounds its AI analyst at the observation / entity layer, the same move ontology platforms made when the ontology became the backbone for AI agents from ~2023. The analyst cites [OBS-####] rather than asserting unsupported facts.

Why "Case" is a stage

In pure sensor-fusion vocabulary, Case isn't a level — it belongs to investigative / compliance / OSINT practice (case management, sanctions work, link analysis). It's a deliberate, recognisable choice for our maritime, sanctions and academic users; for a hard-defense audience the same stage reads as "investigation."

CAVEAT
The chain is a communication device, not a claim that data only flows left-to-right. Fusion isn't strictly linear — any level can run on another's output. What earns credibility is the properties around the chain: provenance & lineage, entity resolution, grounding, auditability, and time-travel reconstruction (AsOf). The public explainer is at knowledge.html#lineage.

Source onboarding — the Ontology Agent

◐ PREVIEW · NOT YET GA
The Ontology Agent ships today as a guided, scripted preview of the onboarding flow; full self-serve onboarding of arbitrary sources is on the roadmap. The Q9 guarantee below — the agent drafts, a human commits every schema change — is firm.

Adding a source is traditionally a two-specialist job: a Connector Engineer wires the feed and maps its fields to the canonical model, and a Canonical Model Steward (ontologist) decides how those fields become entities and events without forking the schema. The Ontology Agent compresses both into one guided flow driven from the analyst chat. You drop a source — a URL, an API spec, a sample payload, a file, or a catalog pick — and the agent profiles it, drafts the connector, drafts the field→Observation mapping, proposes which entity/event types it maps onto (reusing existing ones wherever possible), proposes resolution keys, and dry-runs the result against a live sample. It drafts; a human commits.

PRINCIPLE · Q9
Onboarding is agent-drafted, human-committed. The agent proposes connector config, mappings, new types and resolution rules; the Steward approves a diff before anything touches the canonical model. No silent schema changes, no silent entity merges. Q9 is the extension of Q8 (#oversight) into the data layer: the canonical model is a protected branch; the agent opens the pull request, a human merges it.

The staged pipeline

The agent runs a fixed, auditable pipeline. Each stage proposes an artifact and waits for approve / edit / reject.

# org-layer: make a source exist & be trustworthy
0 Intake     → pull a representative sample (N records)
1 Profile    → protocol, rate, auth; field types, units, semantic roles; PII flags
2 Connector  → endpoint, auth binding, cadence, health probe
3 Mapping    → native field → Observation field; observed_property; UTC/WGS84
4 Ontology   → match existing entity/event types (reuse>new); resolution keys
5 Dry-run    → replay sample, no commit: resolved/new/dedup/conflicts + coverage
6 Steward    → Q9 GATE: human approves the full diff (logged)
7 Catalog    → register in org connector catalog + entitlement
# project-layer: decide to use it (existing flow)
8 Connect    → Project Lead adds it to a project, sets alerts
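The approve / edit / reject gating of the org-layer stages above can be sketched as a small state machine. This is a minimal illustration under stated assumptions: the `OnboardingRun` class, its status values, and the rule that an edit keeps the run on the same stage for re-proposal are all hypothetical, and the reviewer address is a placeholder.

```python
from dataclasses import dataclass, field

# Org-layer stages 0-7 from the list above; stage 8 (Connect) belongs to the
# separate project-layer flow and is not modelled here.
STAGES = ["intake", "profile", "connector", "mapping",
          "ontology", "dry-run", "steward", "catalog"]


@dataclass
class OnboardingRun:
    source: str
    stage_index: int = 0
    status: str = "in-progress"   # in-progress | rejected | done
    audit_log: list = field(default_factory=list)

    @property
    def stage(self) -> str:
        return STAGES[self.stage_index]

    def decide(self, decision: str, reviewer: str) -> None:
        """Record a human decision on the current stage's proposed artifact."""
        if self.status != "in-progress":
            raise RuntimeError(f"run is already {self.status}")
        if decision not in ("approve", "edit", "reject"):
            raise ValueError(f"unknown decision: {decision!r}")
        self.audit_log.append((self.stage, decision, reviewer))
        if decision == "reject":
            self.status = "rejected"
        elif decision == "approve":
            if self.stage_index == len(STAGES) - 1:
                self.status = "done"
            else:
                self.stage_index += 1
        # "edit" is assumed to keep the run on the same stage for re-proposal


run = OnboardingRun("spire-ais")
run.decide("edit", "steward@example.org")      # artifact revised, stage unchanged
for _ in STAGES:
    run.decide("approve", "steward@example.org")
```

Every decision lands in `audit_log` before the state changes, so a rejected or completed run still carries the full record of who decided what at which stage.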

Automated vs. human-gated

The Observation schema

One shape for every fact, from every source — the atomic unit the analyst is grounded on and lineage is tracked at.

// Observation — atomic, timestamped, sourced fact
{
  "id": "OBS-50412",
  "source_id": "spire-ais",
  "observed_property": "vessel_position",  // typed enum; governs value shape
  "observed_at": "2027-02-17T14:33:07Z",    // EVENT time (when true), UTC
  "ingested_at": "2027-02-17T14:33:11Z",    // SYSTEM time (when received), UTC
  "entity_ref": "VESSEL-SF12",             // resolved entity, or null
  "entity_candidate": null,
  "geo": { "lat": 59.81, "lon": 24.77, "geohash": "ud9…" },  // WGS84 | null
  "value": { "sog_kn": 0.2, "cog_deg": 184 },        // schema per observed_property
  "source_native": { "mmsi": 209123456 },          // raw fields, preserved for replay
  "provenance": {
    "connector_type": "ws",        // ws | http | file | manual
    "signature": "ed25519:…",       // Ed25519 signature over the record hash
    "classification": "OPEN",      // → drives entitlement/visibility
    "resolution": { "method": "deterministic", "confidence": 1.0, "keys_used": ["mmsi"] }
  }
}
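A minimal validation sketch for this shape follows. It checks only what the example above states directly (the fields shown, UTC timestamps, WGS84 coordinate ranges); `validate_observation`, the required-field list and the error strings are hypothetical and not part of any insigz API.

```python
from datetime import datetime

REQUIRED = ("id", "source_id", "observed_property",
            "observed_at", "ingested_at", "provenance")


def parse_utc(ts: str) -> datetime:
    """Parse an ISO-8601 timestamp that carries a trailing 'Z' (UTC)."""
    if not ts.endswith("Z"):
        raise ValueError(f"timestamp must be UTC with a 'Z' suffix: {ts}")
    return datetime.fromisoformat(ts[:-1] + "+00:00")


def validate_observation(obs: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passed."""
    problems = [f"missing field: {key}" for key in REQUIRED if key not in obs]
    if problems:
        return problems

    for key in ("observed_at", "ingested_at"):
        try:
            parse_utc(obs[key])
        except ValueError as err:
            problems.append(f"{key}: {err}")

    geo = obs.get("geo")   # geo may legitimately be null
    if geo is not None:
        if not -90 <= geo["lat"] <= 90:
            problems.append("geo.lat outside WGS84 range")
        if not -180 <= geo["lon"] <= 180:
            problems.append("geo.lon outside WGS84 range")
    return problems


sample = {
    "id": "OBS-50412",
    "source_id": "spire-ais",
    "observed_property": "vessel_position",
    "observed_at": "2027-02-17T14:33:07Z",
    "ingested_at": "2027-02-17T14:33:11Z",
    "geo": {"lat": 59.81, "lon": 24.77},
    "provenance": {"connector_type": "ws"},
}
ok = validate_observation(sample)
bad = validate_observation({**sample, "geo": {"lat": 95.0, "lon": 24.77}})
```

A real validator would also check `value` against the schema selected by `observed_property`; that lookup is omitted because the per-property schemas are not shown here.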

Entity resolution

Resolution is where fusion creates value — and where it most often goes wrong at scale. Rules are declared per entity type, proposed by the agent at stage 4, approved by the Steward.

// runtime — per incoming Observation
1. try deterministic keys in priority order   // e.g. VESSEL: imo > mmsi > callsign
     unique hit            → attach, confidence 1.0, method=deterministic
2. else block + score (probabilistic)
     block on coarse key (geo-cell+time, or name prefix); score signals → s∈[0,1]
3. decide by band:
     s ≥ AUTO   (≈0.92)            → attach, method=probabilistic
     REVIEW (≈0.65) ≤ s < AUTO     → candidate + MERGE-REVIEW task (no silent merge · Q9)
     s < REVIEW                    → create NEW candidate entity
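The runtime steps above can be sketched in Python as follows. The `resolve` function, the shape of the `index` lookup and the `score_candidates` callable are hypothetical stand-ins: blocking and scoring are delegated to the callable because their internals are not described here. The thresholds are the approximate values quoted above.

```python
AUTO = 0.92     # approximate band edges quoted above
REVIEW = 0.65

# Deterministic keys per entity type, highest priority first.
DETERMINISTIC_KEYS = {"VESSEL": ["imo", "mmsi", "callsign"]}


def resolve(keys, entity_type, index, score_candidates):
    """Decide what to do with one incoming Observation.

    keys:             identifying fields carried by the observation
    index:            {(key_name, key_value): [entity_id, ...]} (hypothetical)
    score_candidates: callable returning [(entity_id, score in [0, 1]), ...]
    """
    # 1. deterministic keys in priority order; only a unique hit attaches
    for name in DETERMINISTIC_KEYS.get(entity_type, []):
        value = keys.get(name)
        if value is None:
            continue
        hits = index.get((name, value), [])
        if len(hits) == 1:
            return {"action": "attach", "entity": hits[0],
                    "method": "deterministic", "confidence": 1.0}

    # 2. block + score, keeping the best-scoring candidate
    scored = score_candidates(keys)
    entity, score = max(scored, key=lambda pair: pair[1], default=(None, 0.0))

    # 3. decide by band
    if score >= AUTO:
        action = "attach"
    elif score >= REVIEW:
        action = "merge-review"       # staged for a human, never merged silently
    else:
        action, entity = "new-candidate", None
    return {"action": action, "entity": entity,
            "method": "probabilistic", "confidence": score}


index = {("imo", 9123456): ["VESSEL-SF12"]}
exact = resolve({"imo": 9123456}, "VESSEL", index, lambda k: [])
fuzzy = resolve({"callsign": "XYZ1"}, "VESSEL", index,
                lambda k: [("VESSEL-SF12", 0.80)])
unknown = resolve({"callsign": "XYZ1"}, "VESSEL", index,
                  lambda k: [("VESSEL-SF12", 0.30)])
```

In this run `exact` attaches deterministically, `fuzzy` falls in the review band and becomes a merge-review task, and `unknown` creates a new candidate entity.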

ANTI-FORK
The biggest risk of many sources is schema sprawl — 50 feeds inventing 50 near-duplicate "vessel" types. At stage 4 the agent embeds the source's concepts, matches them against existing types, and defaults to reuse; new types are proposed only when the fit is poor, as the minimal covering type, with near-neighbours shown. Every accepted change is a versioned ontology_diff in the audit log. The customer-facing explainer is at knowledge.html#agent.

The Autonomy Layer — Watches & Proposals

Watches turn insigz from a platform you query into one that watches for you — without crossing into autonomous action. A Watch runs a grounded agent against the live feed on a trigger or schedule and stages a Proposal; a human Accepts, Modifies or Rejects it from the Proposal Inbox. A committed proposal becomes an ordinary Event / Case-update / Alert — no parallel data world.

PRINCIPLE · Q10
Autonomy is bounded and reversible. An agent may auto-apply only internal, reversible outputs (attach an observation, flag an entity, raise an alert, open a WATCH-status case) above the AUTO confidence band (≈0.92). Anything customer-facing or irreversible — report drafts, published artifacts, external notifications — always stages for review. Enforced in the band function and again server-side. Q10 extends Q8/Q9 from one-off approvals to always-on monitoring.

Autonomy bands

confidence ≥ AUTO (0.92)  AND  output ∈ auto_allowed_kinds  → auto-apply + audit
REVIEW (0.65) ≤ confidence < AUTO                            → stage for review (alert)
confidence < REVIEW                                          → stage for review, flagged low
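A minimal sketch of a band function matching the three rows above. The output-kind names in `AUTO_ALLOWED_KINDS` are hypothetical labels for the internal, reversible outputs Q10 lists. The sketch also makes one case explicit that the rows leave implicit: a high-confidence output of a kind that is not auto-allowed falls through to ordinary review.

```python
AUTO = 0.92
REVIEW = 0.65

# Hypothetical identifiers for the internal, reversible outputs named in Q10.
AUTO_ALLOWED_KINDS = {
    "attach-observation",
    "flag-entity",
    "raise-alert",
    "open-watch-case",
}


def band(confidence: float, kind: str) -> str:
    """Map a proposal's confidence and output kind to a handling band."""
    if confidence >= AUTO and kind in AUTO_ALLOWED_KINDS:
        return "auto-apply"           # applied and written to the audit log
    if confidence >= REVIEW:
        return "review"               # staged; raises a watch-review alert
    return "review-flagged-low"       # staged and marked low confidence


decisions = [
    band(0.95, "raise-alert"),    # internal and reversible, above AUTO
    band(0.95, "report-draft"),   # customer-facing: never auto-applied
    band(0.70, "raise-alert"),
    band(0.30, "flag-entity"),
]
```

Because the kind check sits in the same condition as the confidence check, no confidence value can promote a customer-facing or irreversible output to auto-apply.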

Each needs-review proposal emits a watch-review alert into the one notification inbox (no standalone queue). The Review panel’s Show Logic opens the cited [OBS-####] observations + the tool-use trace + the reasoning chain — the same provenance substrate the analyst chat uses.

Trust Evals & the Scorecard

Before an agent drives a watch, an eval suite proves it. Evaluators: citation-accuracy (fraction of claims citing a real observation — the defensibility number), refusal-correctness (does it refuse when unsupported?), answer-fidelity (vs. an expert ground-truth), plus contains-key-details and ROUGE. Runs compare against a baseline so a model/prompt change can’t silently regress citation accuracy. The output is a signed Scorecard (“Every claim cited · citation-accuracy 0.98 · n=120”) that attaches to a report or case — for regulated buyers this is the buy reason.
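The citation-accuracy evaluator can be illustrated with a short sketch. It assumes one particular reading of "fraction of claims citing a real observation": a claim counts as supported only if it carries at least one [OBS-####] citation and every ID it cites exists. The function name and the sample claims are hypothetical.

```python
import re

CITATION = re.compile(r"\[(OBS-\d+)\]")


def citation_accuracy(claims, known_observation_ids):
    """Fraction of claims whose citations all resolve to known observations.

    Assumed definition: an uncited claim, or a claim citing any unknown ID,
    counts as unsupported.
    """
    if not claims:
        return 0.0
    supported = 0
    for claim in claims:
        cited = CITATION.findall(claim)
        if cited and all(obs_id in known_observation_ids for obs_id in cited):
            supported += 1
    return supported / len(claims)


known = {"OBS-50412", "OBS-50413"}
claims = [
    "The vessel was nearly stationary at 14:33 UTC [OBS-50412].",
    "It then changed course [OBS-50413] [OBS-99999].",   # one unknown ID
    "The crew intended to loiter.",                      # no citation at all
]
score = citation_accuracy(claims, known)
```

With these three claims only the first is fully supported, so the score is one third. Comparing such a score against a stored baseline is what would let a model or prompt change be caught before it regresses.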

Analyst Studio

A no-code builder for grounded agents. Config (instructions · tool scope · grounding rules · model) on the left, a live test panel on the right. Every published agent inherits the grounding contract — citation-required, the literal refusal string, an observation scope — so you cannot ship an ungrounded analyst. Publishing makes the agent selectable in the New-Watch picker and as an Eval target. The four defaults (analyst chat, Ontology Agent, Adjudicator, After-Action) are read-only.

The Entity Network

The Entity Network renders the canonical model as a force-directed graph. Nodes are entities; edges are ties derived from shared observations and events. The layout runs a 3D force simulation client-side and projects it to a navigable point cloud — grab to rotate, scroll to zoom, click a node to open the Entity Inspector.

Edges are typed and weighted. Intra-community edges take the community colour; cross-community ties render as faint bridges; agent-suggested ties render dashed until a human confirms them (Q10). Community assignment and centrality are computed over the live tie graph, so the colouring and node sizing reflect the current state of the model, not a cached snapshot.

Pattern of Life

Pattern of Life scores an entity's activity over time against a baseline computed from its own history. The baseline band uses robust statistics (median + robust spread) so it resists single-point noise. Each time bucket is compared to the band; buckets outside it are flagged and scored.

// per bucket
band   = robust_baseline(entity, window)      // median ± k·MAD
z      = (count - band.center) / band.spread  // robust deviation
anomaly = count < band.lo OR count > band.hi
direction = count > band.hi ? 'elevated' : 'suppressed'
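The per-bucket scoring above can be made concrete with the standard library. This is a sketch under assumptions: the band is taken literally as median ± k·MAD with an unscaled MAD as the spread, k = 3 is a placeholder, and `score_bucket` is a hypothetical name. Unlike the pseudocode, it reports a direction only for anomalous buckets and guards against a zero spread.

```python
import math
from statistics import median


def score_bucket(count, history, k=3.0):
    """Score one time bucket against a robust baseline of the entity's history."""
    center = median(history)
    mad = median([abs(x - center) for x in history])   # median absolute deviation
    lo, hi = center - k * mad, center + k * mad

    anomaly = count < lo or count > hi
    if mad > 0:
        z = (count - center) / mad
    elif count == center:
        z = 0.0
    else:
        z = math.copysign(math.inf, count - center)    # flat history, any change is extreme

    direction = None
    if anomaly:
        direction = "elevated" if count > hi else "suppressed"
    return {"z": z, "anomaly": anomaly, "direction": direction, "band": (lo, hi)}


history = [10, 12, 11, 9, 10, 13, 11]   # counts per bucket in the baseline window
spike = score_bucket(30, history)
quiet = score_bucket(2, history)
normal = score_bucket(11, history)
```

For this history the median is 11 and the MAD is 1, giving a band of 8 to 14; a count of 30 is flagged as elevated, 2 as suppressed, and 11 is not flagged. A single outlier in the history barely moves either statistic, which is the robustness property the baseline relies on.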

Analysts brush a window to aggregate (total, peak, anomalies vs expected) or click a bucket to inspect a single moment and open the evidence. A scored deviation is the canonical trigger for a Standing Watch — the same condition surfaced here is what a watch evaluates against the live feed before staging a Proposal. See #autonomy.

Human-in-loop oversight

The agent triad — Adjudicator, Inject Generator, After-Action analyst — runs in suggestion mode. Every output is queued for human review. Implementation:

INSERT INTO adjudicated_consequences (
  action_id, variant, text, confidence_low, confidence_high,
  reasoning, historical_analogs, approval
) VALUES (
  $1, $2, $3, $4, $5, $6, $7, 'pending'
);

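-- approval state transitions only via a faculty session;
-- the pending guard keeps an already-decided row from being approved again
UPDATE adjudicated_consequences
   SET approval = 'approved', approved_by = $1, approved_at = now()
 WHERE id = $2
   AND approval = 'pending';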

Faculty users have a database-level bypass on visibility views but not on approval requirements. The two policies are independent and both are enforced in Postgres, not in application code. See #deploy-arch for the row-level-security configuration.

