Introduction
insigz is a geospatial intelligence fusion platform. This documentation explains how it's built, how to put data into it, how to reason over it, and how to run sessions on top of it. Read in order; each section depends on the previous one.
What insigz is
insigz is the layer between continuously-published world data — AIS, electrical grids, sanctions registries, news feeds, weather, cyber advisories — and the people who have to make decisions on top of it. We fuse live sources across maritime, energy, sanctions, news and research into a single canonical model: Source → Observation → Entity → Event → Case → Report.
Three things make insigz different from "another intelligence platform":
- The data is live. AIS positions update every minute. ENTSO-E grid flows update every 15 minutes. News and sanctions update as they're published. The world the platform shows is the world you're acting on.
- The AI shows its work. Every agent suggestion carries its reasoning. Faculty and analysts approve, edit, or override. Nothing publishes autonomously.
- One canonical model. No bespoke pipelines, no proprietary formats. Add a source: write one ingestor, run one migration. Everything else inherits.
See #oversight for how it's enforced.
Quickstart
A working insigz tenant in five steps. Assumes you have a contract scope and a deployment target (Cloud Run on GCP — our Swiss europe-west6 region, or a customer region).
# 1. Provision a tenant database
$ insigzctl tenant create --name=baltic-2027 --region=eu-central
# 2. Seed the canonical model
$ insigzctl migrate --tenant=baltic-2027 --version=latest
# 3. Enable the data sources you need
$ insigzctl sources enable ais entso-e ofac ncsc noaa
# 4. Verify ingestion
$ insigzctl observe --since="5 min ago" --count
14,602 observations
# 5. Open the working surface
$ insigzctl open
Within ~10 minutes you should see the live map populating in the Workshop view. From there, scenarios and sessions are built using the Workshop authoring tools.
Core concepts
Five nouns are enough to describe everything insigz models:
Source
An external system insigz reads from. Examples: ais.marinetraffic, entso-e.transparency, ofac.sdn. Each source has an ingestor, a polling cadence, and a provenance signature.
Observation
An atomic, timestamped, geolocated fact produced by an ingestor. Observations are immutable and signed by their source. Example: vessel:imo:9123456 position:(59.43,24.75) at:2027-02-15T04:17:08Z source:ais.
Entity
A noun in the world. Vessels, cables, ports, substations, sanctioned parties, news outlets. Entities accumulate observations over time and have stable canonical IDs across sessions.
Event
A change in entity state, derived from one or more observations. Examples: cable-throughput-drop, ais-gap, designation-added. Events carry confidence, provenance, and a back-reference to their constituting observations.
Case
A bound collection of events that mean something together. Analysts open cases; insigz suggests bindings. A case is the unit of analysis — what becomes a report.
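As a rough illustration of how the five nouns reference each other, here is a minimal Python sketch. The class layout, field names, and the 0.8 confidence value are assumptions made for this example; they are not the insigz schema, which is defined in SQL in the next section.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Source:
    id: str                  # e.g. "ofac.sdn"
    cadence_seconds: int     # polling cadence (illustrative field)


@dataclass(frozen=True)      # frozen: observations are immutable
class Observation:
    id: str
    source_id: str
    observed_at: datetime
    payload: dict
    entity_id: Optional[str] = None


@dataclass
class Entity:
    id: str                  # stable canonical ID
    kind: str                # "vessel", "cable", "port", ...
    observation_ids: list = field(default_factory=list)


@dataclass
class Event:
    kind: str                # e.g. "ais-gap"
    entity_id: str
    confidence: float
    observation_ids: list    # back-reference to constituting observations


@dataclass
class Case:
    title: str
    events: list = field(default_factory=list)

    def evidence(self) -> set:
        """All observation IDs reachable from this case's events."""
        return {oid for ev in self.events for oid in ev.observation_ids}


obs = Observation(
    id="obs-1",
    source_id="ais",
    observed_at=datetime(2027, 2, 15, 4, 17, 8, tzinfo=timezone.utc),
    payload={"lat": 59.43, "lon": 24.75},
    entity_id="vessel:imo:9123456",
)
vessel = Entity(id=obs.entity_id, kind="vessel", observation_ids=[obs.id])
gap = Event(kind="ais-gap", entity_id=vessel.id, confidence=0.8,
            observation_ids=[obs.id])
case = Case(title="Example case", events=[gap])
print(case.evidence())  # {'obs-1'}
```

The point of the sketch is the direction of the references: a case never holds raw data, only events, and every event can be walked back to the observations behind it.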
The canonical model
Every table in every insigz tenant is one of: a source registry, an observation log, an entity table, an event log, a case binding, or a report manifest. There are no exceptions and no escape hatches.
CREATE TABLE observations (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
source_id text NOT NULL REFERENCES sources(id),
entity_id uuid REFERENCES entities(id),
observed_at timestamptz NOT NULL,
geom geometry(Point, 4326),
payload jsonb NOT NULL,
signature bytea NOT NULL,
received_at timestamptz NOT NULL DEFAULT now()
);
Every observation has a signature from its source — an Ed25519 signature over a canonical hash of (source_id, observed_at, payload). The signature is what makes the audit log defensible: if a faculty member disputes a consequence, we can trace it back through events to specific signed observations.
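The text does not say how the canonical hash is computed, so the following is only a sketch under stated assumptions: SHA-256 as the hash, and sorted-key compact JSON as the canonical encoding. The function name canonical_hash is hypothetical. The Ed25519 signing step itself is omitted because it needs a cryptography library outside the Python standard library; it would sign the digest produced here.

```python
import hashlib
import json


def canonical_hash(source_id: str, observed_at: str, payload: dict) -> bytes:
    """SHA-256 over a deterministic JSON encoding of the three signed fields.

    Assumption: the encoding and hash function are illustrative choices,
    not the documented insigz canonical form.
    """
    canonical = json.dumps(
        [source_id, observed_at, payload],
        sort_keys=True,          # key order must not change the hash
        separators=(",", ":"),   # no insignificant whitespace
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(canonical).digest()


a = canonical_hash("ais", "2027-02-15T04:17:08Z", {"lat": 59.43, "lon": 24.75})
b = canonical_hash("ais", "2027-02-15T04:17:08Z", {"lon": 24.75, "lat": 59.43})
c = canonical_hash("ais", "2027-02-15T04:17:08Z", {"lat": 59.44, "lon": 24.75})

print(a == b)  # True: same fields, different key order
print(a == c)  # False: the payload changed
```

A canonical encoding matters because the signature must verify regardless of how a given JSON serializer happens to order keys, while any change to the signed fields must break it.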
See #schema. The data-source catalog is at #sources.
Framework lineage
The Source → Observation → Entity → Event → Case → Report chain is a deliberate synthesis of two established traditions — not a bespoke invention. Naming the lineage matters: it places the model for any reviewer with a fusion or platform background.
Defense / IC data fusion (JDL / DFIG)
The Joint Directors of Laboratories model is the reference framework for intelligence data fusion. Our stages map onto its levels almost one-to-one:
JDL level                           insigz stage
L0     source / signal refinement   Source, Observation
L1     object refinement            Entity
L2     situation refinement         Event
L2/L3  situation & threat           Case
L3     impact assessment            Report
L4     process refinement           the platform itself (ingestion, tasking, audit)
Entity-centric ontology (commercial)
The closer comparison is the modern ontology pattern — object types (an entity or event), properties (its characteristics), and link types (relationships), all mapped from source data. insigz grounds its AI analyst at the observation / entity layer, the same move ontology platforms made when the ontology became the backbone for AI agents from ~2023. The analyst cites [OBS-####] rather than asserting unsupported facts.
Why "Case" is a stage
In pure sensor-fusion vocabulary, Case isn't a level — it belongs to investigative / compliance / OSINT practice (case management, sanctions work, link analysis). It's a deliberate, recognisable choice for our maritime, sanctions and academic users; for a hard-defense audience the same stage reads as "investigation."
The public explainer is at knowledge.html#lineage.
Source onboarding — the Ontology Agent
Adding a source is traditionally a two-specialist job: a Connector Engineer wires the feed and maps its fields to the canonical model, and a Canonical Model Steward (ontologist) decides how those fields become entities and events without forking the schema. The Ontology Agent compresses both into one guided flow driven from the analyst chat. You drop a source — a URL, an API spec, a sample payload, a file, or a catalog pick — and the agent profiles it, drafts the connector, drafts the field→Observation mapping, proposes which entity/event types it maps onto (reusing existing ones wherever possible), proposes resolution keys, and dry-runs the result against a live sample. It drafts; a human commits.
This carries human-in-loop oversight (#oversight) into the data layer: the canonical model is a protected branch; the agent opens the pull request, a human merges it.
The staged pipeline
The agent runs a fixed, auditable pipeline. Each stage proposes an artifact and waits for approve / edit / reject.
# org-layer: make a source exist & be trustworthy
0 Intake → pull a representative sample (N records)
1 Profile → protocol, rate, auth; field types, units, semantic roles; PII flags
2 Connector → endpoint, auth binding, cadence, health probe
3 Mapping → native field → Observation field; observed_property; UTC/WGS84
4 Ontology → match existing entity/event types (reuse>new); resolution keys
5 Dry-run → replay sample, no commit: resolved/new/dedup/conflicts + coverage
6 Steward → Q9 GATE: human approves the full diff (logged)
7 Catalog → register in org connector catalog + entitlement
# project-layer: decide to use it (existing flow)
8 Connect → Project Lead adds it to a project, sets alerts
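The org-layer stages above can be sketched as a loop in which the agent proposes and a reviewer decides. This is a minimal illustration, not the platform's implementation: the names STAGES, Decision, and run_pipeline, and the shape of the propose and review callbacks, are all assumptions. Stage 8 (Connect) is left out because it belongs to the project layer.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Org-layer stages 0-7, in the order listed above.
STAGES = ["intake", "profile", "connector", "mapping",
          "ontology", "dry-run", "steward", "catalog"]


@dataclass
class Decision:
    verdict: str                   # "approve" | "edit" | "reject"
    edited: Optional[dict] = None  # replacement artifact when verdict == "edit"


def run_pipeline(propose: Callable[[str, dict], dict],
                 review: Callable[[str, dict], Decision]) -> tuple:
    """Run stages in order; every artifact waits for a human decision."""
    accepted: dict = {}
    audit: list = []
    for stage in STAGES:
        artifact = propose(stage, accepted)   # agent drafts
        decision = review(stage, artifact)    # human decides
        audit.append((stage, decision.verdict))
        if decision.verdict == "reject":
            return None, audit                # nothing is committed
        if decision.verdict == "edit":
            artifact = decision.edited        # human-edited version wins
        accepted[stage] = artifact
    return accepted, audit


def propose(stage, accepted):
    return {"stage": stage, "draft": True}


def reject_at_steward(stage, artifact):
    return Decision("reject") if stage == "steward" else Decision("approve")


result, audit = run_pipeline(propose, reject_at_steward)
print(result)     # None
print(audit[-1])  # ('steward', 'reject')
```

Returning the audit trail even on rejection reflects the "(logged)" note on the Steward gate: the decision record survives whether or not anything is committed.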
Automated vs. human-gated
- Agent drafts: protocol/field detection, unit·timestamp·geo normalization, connector config + health probe, field→Observation mapping, proposed resolution keys & thresholds, dry-run replay with conflict/dedup detection, and reuse suggestions (anti-fork).
- Human only (never autonomous): committing a new entity/event type (Steward, Q9), supplying credentials/classification (Admin), merging two entities in the review band (analyst/Steward), connecting a source into a project (Project Lead).
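One way to picture the split above is a lookup from gated action to the human roles allowed to perform it. The action and role identifiers below are hypothetical labels derived from the list, not names from the insigz API.

```python
# Hypothetical action and role names, paraphrasing the "Human only" list.
HUMAN_ONLY = {
    "commit_new_type": {"steward"},
    "supply_credentials": {"admin"},
    "merge_entities": {"analyst", "steward"},
    "connect_source_to_project": {"project_lead"},
}


def may_perform(role: str, action: str) -> bool:
    """Ungated actions are open to any role; gated ones need a listed human role."""
    allowed = HUMAN_ONLY.get(action)
    if allowed is None:
        return True        # not gated: drafting, profiling, dry-runs
    return role in allowed


print(may_perform("agent", "draft_mapping"))      # True
print(may_perform("agent", "commit_new_type"))    # False
print(may_perform("steward", "commit_new_type"))  # True
```

Because "agent" never appears in any allowed set, the agent is structurally unable to perform a gated action rather than merely discouraged from it.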
The Observation schema
One shape for every fact, from every source — the atomic unit the analyst is grounded on and lineage is tracked at.
// Observation — atomic, timestamped, sourced fact
{
"id": "OBS-50412",
"source_id": "spire-ais",
"observed_property": "vessel_position", // typed enum; governs value shape
"observed_at": "2027-02-17T14:33:07Z", // EVENT time (when true), UTC
"ingested_at": "2027-02-17T14:33:11Z", // SYSTEM time (when received), UTC
"entity_ref": "VESSEL-SF12", // resolved entity, or null
"entity_candidate": null,
"geo": { "lat": 59.81, "lon": 24.77, "geohash": "ud9…" }, // WGS84 | null
"value": { "sog_kn": 0.2, "cog_deg": 184 }, // schema per observed_property
"source_native": { "mmsi": 209123456 }, // raw fields, preserved for replay
"provenance": {
"connector_type": "ws", // ws | http | file | manual
"signature": "ed25519:…", // Ed25519 signature over the record hash
"classification": "OPEN", // → drives entitlement/visibility
"resolution": { "method": "deterministic", "confidence": 1.0, "keys_used": ["mmsi"] }
}
}
- Bitemporal (observed_at vs ingested_at): this is what powers the AsOf control (Q2). Do not collapse the two axes.
- source_native is never discarded: normalization is additive; lineage and replay need the raw fields.
- signature + classification on every record: a degraded feed can be down-weighted or quarantined without corrupting the picture; access is scoped at the row.
- value is polymorphic by observed_property: keep a registry of property types; a genuinely new property type is part of the Q9 diff.
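To show why the two time axes must stay separate, here is a small sketch of an AsOf-style filter. The function names and the second record (OBS-50413, a late arrival) are invented for illustration; only the observed_at and ingested_at field names come from the schema above.

```python
from datetime import datetime


def parse_utc(ts: str) -> datetime:
    """Parse an ISO-8601 timestamp with a trailing 'Z' into an aware datetime."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def as_of(observations: list, event_time: str, system_time: str) -> list:
    """Facts true by event_time that had been received by system_time."""
    ev, sy = parse_utc(event_time), parse_utc(system_time)
    return [
        o for o in observations
        if parse_utc(o["observed_at"]) <= ev and parse_utc(o["ingested_at"]) <= sy
    ]


observations = [
    {"id": "OBS-50412", "observed_at": "2027-02-17T14:33:07Z",
     "ingested_at": "2027-02-17T14:33:11Z"},
    # Hypothetical late arrival: happened earlier, received an hour later.
    {"id": "OBS-50413", "observed_at": "2027-02-17T14:30:00Z",
     "ingested_at": "2027-02-17T15:30:00Z"},
]

known_then = as_of(observations, "2027-02-17T14:40:00Z", "2027-02-17T14:40:00Z")
known_now = as_of(observations, "2027-02-17T14:40:00Z", "2027-02-17T16:00:00Z")
print([o["id"] for o in known_then])  # ['OBS-50412']
print([o["id"] for o in known_now])   # ['OBS-50412', 'OBS-50413']
```

Both queries ask about the same moment in the world; they differ only in what the platform knew at the time, which is exactly the distinction a single timestamp would erase.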
Entity resolution
Resolution is where fusion creates value — and where it most often goes wrong at scale. Rules are declared per entity type, proposed by the agent at stage 4, approved by the Steward.
// runtime — per incoming Observation
1. try deterministic keys in priority order // e.g. VESSEL: imo > mmsi > callsign
unique hit → attach, confidence 1.0, method=deterministic
2. else block + score (probabilistic)
block on coarse key (geo-cell+time, or name prefix); score signals → s∈[0,1]
3. decide by band:
s ≥ AUTO (≈0.92) → attach, method=probabilistic
REVIEW (≈0.65) ≤ s < AUTO → candidate + MERGE-REVIEW task (no silent merge · Q9)
s < REVIEW → create NEW candidate entity
- Candidate → confirmed — probabilistic/unresolved attaches create candidate entities; promotion is an analyst/Steward action, logged.
- Merge & split are first-class, audited, reversible — resolution is never destructive; a merge can be undone, restoring both lineages.
- Attribute conflicts are kept, not overwritten: disagreeing sources are both retained with source_ref; the entity surfaces the freshest value and flags the conflict. No last-writer-wins.
- Thresholds are per-type and tunable: AUTO/REVIEW bands live on the entity type, version-controlled with the ontology diff.
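The runtime rules for resolution can be sketched in Python as follows. This is a simplified illustration under assumptions: the index structure, the resolve function, and its return shape are hypothetical, and the probabilistic blocking and scoring step is not implemented; the caller passes in an already-computed best match.

```python
AUTO, REVIEW = 0.92, 0.65  # approximate defaults from the text; tunable per type
VESSEL_KEYS = ["imo", "mmsi", "callsign"]  # deterministic priority order


def resolve(obs_keys: dict, index: dict, score=None) -> dict:
    """Return an attach / merge-review / new-candidate decision.

    index maps (key_name, key_value) -> list of entity IDs.
    score is an optional (entity_id, s) pair from a probabilistic matcher.
    """
    for key in VESSEL_KEYS:
        value = obs_keys.get(key)
        hits = index.get((key, value), []) if value is not None else []
        if len(hits) == 1:  # unique hit only; ambiguous keys fall through
            return {"action": "attach", "entity": hits[0],
                    "method": "deterministic", "confidence": 1.0}
    if score is not None:
        entity, s = score
        if s >= AUTO:
            return {"action": "attach", "entity": entity,
                    "method": "probabilistic", "confidence": s}
        if s >= REVIEW:  # never merged silently: a human reviews it
            return {"action": "merge-review", "entity": entity,
                    "method": "probabilistic", "confidence": s}
    return {"action": "new-candidate", "entity": None,
            "method": "none", "confidence": 0.0}


index = {("imo", 9123456): ["VESSEL-SF12"]}
print(resolve({"imo": 9123456}, index)["action"])                            # attach
print(resolve({"mmsi": 209123456}, index, ("VESSEL-SF12", 0.80))["action"])  # merge-review
print(resolve({"mmsi": 209123456}, index, ("VESSEL-SF12", 0.40))["action"])  # new-candidate
```

Note that a deterministic key with more than one hit is treated as unresolved rather than picking one, which keeps the "unique hit" condition from the rules intact.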
Approved changes are logged as an ontology_diff in the audit log. The customer-facing explainer is at knowledge.html#agent.
The Autonomy Layer — Watches & Proposals
Watches turn insigz from a platform you query into one that watches for you — without crossing into autonomous action. A Watch runs a grounded agent against the live feed on a trigger or schedule and stages a Proposal; a human Accepts, Modifies or Rejects it from the Proposal Inbox. A committed proposal becomes an ordinary Event / Case-update / Alert — no parallel data world.
Autonomy bands
confidence ≥ AUTO (0.92) AND output ∈ auto_allowed_kinds → auto-apply + audit
REVIEW (0.65) ≤ confidence < AUTO → stage for review (alert)
confidence < REVIEW → stage for review, flagged low
Each needs-review proposal emits a watch-review alert into the one notification inbox (no standalone queue). The Review panel’s Show Logic opens the cited [OBS-####] observations + the tool-use trace + the reasoning chain — the same provenance substrate the analyst chat uses.
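The autonomy bands reduce to a small routing function. The sketch below uses hypothetical names (route_proposal and the return labels), and it makes one assumption the listing leaves open: a high-confidence proposal whose kind is not in auto_allowed_kinds is staged for ordinary review.

```python
AUTO, REVIEW = 0.92, 0.65  # band thresholds as given in the listing


def route_proposal(confidence: float, kind: str, auto_allowed_kinds: set) -> str:
    """Map a proposal to auto-apply, review, or low-confidence review."""
    if confidence >= AUTO and kind in auto_allowed_kinds:
        return "auto-apply"   # applied and audited
    if confidence >= REVIEW:
        return "review"       # includes high-confidence but non-allowed kinds (assumption)
    return "review-low"       # staged for review, flagged low


allowed = {"alert"}  # hypothetical allow-list
print(route_proposal(0.95, "alert", allowed))        # auto-apply
print(route_proposal(0.95, "case-update", allowed))  # review
print(route_proposal(0.70, "alert", allowed))        # review
print(route_proposal(0.30, "alert", allowed))        # review-low
```

Auto-apply requires both conditions at once, so widening autonomy is a deliberate change to the allow-list rather than a side effect of a confident model.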
Trust Evals & the Scorecard
Before an agent drives a watch, an eval suite proves it. Evaluators: citation-accuracy (fraction of claims citing a real observation — the defensibility number), refusal-correctness (does it refuse when unsupported?), answer-fidelity (vs. an expert ground-truth), plus contains-key-details and ROUGE. Runs compare against a baseline so a model/prompt change can’t silently regress citation accuracy. The output is a signed Scorecard (“Every claim cited · citation-accuracy 0.98 · n=120”) that attaches to a report or case — for regulated buyers this is the buy reason.
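As an illustration of the citation-accuracy evaluator and the baseline comparison, here is a minimal sketch. It assumes claims arrive as separate strings and that a claim counts as supported only when it cites at least one observation and every cited ID exists; the stricter second condition, the function names, and the sample claims are assumptions, not the documented evaluator.

```python
import re

CITATION = re.compile(r"\[(OBS-\d+)\]")


def citation_accuracy(claims: list, known_ids: set) -> float:
    """Fraction of claims that cite at least one ID and only known IDs."""
    if not claims:
        return 0.0
    supported = 0
    for claim in claims:
        cited = CITATION.findall(claim)
        if cited and all(c in known_ids for c in cited):
            supported += 1
    return supported / len(claims)


def regressed(current: float, baseline: float, tolerance: float = 0.0) -> bool:
    """True when the current run falls below the baseline by more than tolerance."""
    return current < baseline - tolerance


known = {"OBS-50412", "OBS-50413"}
claims = [
    "Vessel slowed to 0.2 kn [OBS-50412].",
    "It loitered near the cable [OBS-50412][OBS-50413].",
    "The owner is sanctioned.",      # no citation
    "Speed resumed [OBS-99999].",    # cites an unknown observation
]

score = citation_accuracy(claims, known)
print(score)                   # 0.5
print(regressed(score, 0.98))  # True
```

Running regressed against a stored baseline on every model or prompt change is what turns the metric from a report into a gate.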
Analyst Studio
A no-code builder for grounded agents. Config (instructions · tool scope · grounding rules · model) on the left, a live test panel on the right. Every published agent inherits the grounding contract — citation-required, the literal refusal string, an observation scope — so you cannot ship an ungrounded analyst. Publishing makes the agent selectable in the New-Watch picker and as an Eval target. The four defaults (analyst chat, Ontology Agent, Adjudicator, After-Action) are read-only.
The Entity Network
The Entity Network renders the canonical model as a force-directed graph. Nodes are entities; edges are ties derived from shared observations and events. The layout runs a 3D force simulation client-side and projects it to a navigable point cloud — grab to rotate, scroll to zoom, click a node to open the Entity Inspector.
Edges are typed and weighted. Intra-community edges take the community colour; cross-community ties render as faint bridges; agent-suggested ties render dashed until a human confirms them (Q10). Community assignment and centrality are computed over the live tie graph, so the colouring and node sizing reflect the current state of the model, not a cached snapshot.
- Colour by community (default), entity type, or risk.
- Centrality drives node size and brightness — degree within the live subgraph.
- Ego-network — select a node to scope the graph to its neighbourhood.
- Every edge resolves to the observations and events that established it; nothing in the graph is ungrounded.
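Degree centrality and the ego-network scope are simple to express over an adjacency map. The sketch below uses made-up entity IDs and hypothetical function names; it ignores edge types and weights, which the real graph carries.

```python
from collections import defaultdict


def build_graph(ties: list) -> dict:
    """Undirected adjacency map from (a, b) tie pairs."""
    adj = defaultdict(set)
    for a, b in ties:
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def degree_centrality(adj: dict) -> dict:
    """Number of ties per node; this is what would drive node size."""
    return {node: len(neighbours) for node, neighbours in adj.items()}


def ego_network(adj: dict, node: str) -> dict:
    """Subgraph of the node, its neighbours, and the ties among them."""
    keep = {node} | adj.get(node, set())
    return {n: adj.get(n, set()) & keep for n in keep}


ties = [("VESSEL-A", "PORT-1"), ("VESSEL-A", "OWNER-X"),
        ("VESSEL-B", "PORT-1"), ("OWNER-X", "BANK-Y")]
graph = build_graph(ties)

print(degree_centrality(graph)["VESSEL-A"])       # 2
print(sorted(ego_network(graph, "VESSEL-B")))     # ['PORT-1', 'VESSEL-B']
```

Recomputing degree on the scoped subgraph, rather than reusing global values, matches the "degree within the live subgraph" wording above.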
Pattern of Life
Pattern of Life scores an entity's activity over time against a baseline computed from its own history. The baseline band uses robust statistics (median + robust spread) so it resists single-point noise. Each time bucket is compared to the band; buckets outside it are flagged and scored.
// per bucket
band = robust_baseline(entity, window) // median ± k·MAD
z = (count - band.center) / band.spread // robust deviation
anomaly = count < band.lo OR count > band.hi
direction = anomaly ? (count > band.hi ? 'elevated' : 'suppressed') : null // set only for flagged buckets
Analysts brush a window to aggregate (total, peak, anomalies vs expected) or click a bucket to inspect a single moment and open the evidence. A scored deviation is the canonical trigger for a Standing Watch — the same condition surfaced here is what a watch evaluates against the live feed before staging a Proposal. See #autonomy.
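The per-bucket scoring can be made concrete with the standard library. This sketch takes the band as median ± k·MAD, as the pseudocode comment states; the default k = 3, the fallback spread of 1.0 when the MAD is zero, and the sample history are assumptions for illustration.

```python
from statistics import median


def score_bucket(count: float, history: list, k: float = 3.0) -> dict:
    """Score one time bucket against a median ± k·MAD band from the entity's history."""
    center = median(history)
    mad = median([abs(x - center) for x in history])
    spread = mad if mad > 0 else 1.0  # assumed fallback to avoid dividing by zero
    lo, hi = center - k * spread, center + k * spread
    anomaly = count < lo or count > hi
    direction = None
    if anomaly:
        direction = "elevated" if count > hi else "suppressed"
    return {
        "z": (count - center) / spread,  # robust deviation
        "anomaly": anomaly,
        "direction": direction,
    }


history = [10, 12, 11, 9, 10, 13, 11]  # hypothetical counts per bucket
print(score_bucket(25, history))  # {'z': 14.0, 'anomaly': True, 'direction': 'elevated'}
print(score_bucket(11, history))  # {'z': 0.0, 'anomaly': False, 'direction': None}
print(score_bucket(2, history))   # {'z': -9.0, 'anomaly': True, 'direction': 'suppressed'}
```

With this history the median is 11 and the MAD is 1, so the band is 8 to 14; a single extreme value added to the history would barely move either statistic, which is the resistance to single-point noise described above.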
Human-in-loop oversight
The agent triad — Adjudicator, Inject Generator, After-Action analyst — runs in suggestion mode. Every output is queued for human review. Implementation:
INSERT INTO adjudicated_consequences (
action_id, variant, text, confidence_low, confidence_high,
reasoning, historical_analogs, approval
) VALUES (
$1, $2, $3, $4, $5, $6, $7, 'pending'
);
-- approval state transitions only via a faculty session
UPDATE adjudicated_consequences
SET approval = 'approved', approved_by = $1, approved_at = now()
WHERE id = $2;
Faculty users have a database-level bypass on visibility views but not on approval requirements. The two policies are independent and both are enforced in Postgres, not in application code. See #deploy-arch for the row-level-security configuration.