Glossary

Key terms and concepts in the Archivus platform.


Core Concepts

Knowledge Graph

A graph database structure where entities, relationships, and claims are stored as interconnected nodes. Forms the foundation of Archivus's verifiable intelligence.

Entity

A person, organization, location, date, or concept extracted from documents. Entities are nodes in the Knowledge Graph with rich context including aliases, descriptions, and external identifiers.

Claim

A statement or assertion extracted from a document, with supporting evidence and confidence scores. Claims can support or contradict other claims, enabling epistemic reasoning.

Provenance

The complete chain of evidence showing where information came from, who created it, when, and with what level of confidence. Every fact in Archivus has traceable provenance.

Quadruple Model

An extension of the traditional triple model (subject, predicate, object) that adds context: (subject, predicate, object, CONTEXT). Context includes temporal, geographic, and provenance information.
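
As a rough illustration of the quadruple shape, the sketch below models a fact as a triple plus a context object. The class names and the context fields (valid_from, location, source_document) are assumptions chosen for this example, not Archivus's actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Context:
    # Hypothetical fields standing in for temporal, geographic,
    # and provenance information.
    valid_from: Optional[str] = None
    location: Optional[str] = None
    source_document: Optional[str] = None


@dataclass(frozen=True)
class Quadruple:
    subject: str
    predicate: str
    obj: str
    context: Context = field(default_factory=Context)


# Illustrative data only.
fact = Quadruple(
    "Acme Corp",
    "acquired",
    "Widget Ltd",
    Context(valid_from="2021-03-01", location="Berlin", source_document="doc-123"),
)

# Dropping the context recovers the traditional triple.
triple = (fact.subject, fact.predicate, fact.obj)
```

Keeping the context in its own object means the plain triple can still be recovered when interoperating with triple-based tools.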


AI & Verification

GOLAG

Game-Oriented Lagrangian Agent Governance: an evolutionary verification system in which AI agents compete and improve through calibrated confidence budgets and quadratic voting.

Lagrangian

Mathematical optimization function balancing confidence, context match, and risk. Low Lagrangian values trigger human escalation; high values enable autonomous action.

Quadratic Voting

Voting mechanism in which casting n votes costs n² (cost = votes²), so each additional vote costs more than the last. This encourages honest confidence calibration and prevents overconfident agents from dominating decisions.
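
The cost rule can be sketched in a few lines. The function names and the idea of spending from an integer budget are assumptions made for illustration; only the cost = votes² relationship comes from the definition above.

```python
import math


def vote_cost(votes: int) -> int:
    """Cost of casting `votes` votes under the quadratic rule."""
    if votes < 0:
        raise ValueError("votes must be non-negative")
    return votes ** 2


def max_votes(budget: int) -> int:
    """Largest number of votes affordable with `budget` (hypothetical helper)."""
    if budget < 0:
        raise ValueError("budget must be non-negative")
    return math.isqrt(budget)


cost_of_three = vote_cost(3)      # 3 votes cost 9
affordable = max_votes(50)        # 7 votes cost 49; 8 would cost 64
```

Because cost grows quadratically, doubling the number of votes quadruples the cost, which is what makes overstating confidence expensive.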

Expected Calibration Error (ECE)

Metric measuring how well an agent's confidence scores match actual accuracy. Well-calibrated agents (low ECE) accumulate influence over time.
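
One common way to compute ECE is to group predictions into confidence bins and take the weighted average gap between accuracy and mean confidence in each bin. The sketch below follows that standard binned formulation; the bin count and function name are assumptions, and Archivus may compute the metric differently.

```python
def expected_calibration_error(confidences, outcomes, n_bins=10):
    """Binned ECE: sum over bins of (bin share) * |accuracy - mean confidence|."""
    if len(confidences) != len(outcomes):
        raise ValueError("confidences and outcomes must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Assign each prediction to an equal-width confidence bin.
    bins = [[] for _ in range(n_bins)]
    for conf, correct in zip(confidences, outcomes):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, correct))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece


# An agent that says 90% and is right 9 times out of 10 is well calibrated.
calibrated = expected_calibration_error([0.9] * 10, [True] * 9 + [False])
# An agent that says 90% but is right only half the time is overconfident.
overconfident = expected_calibration_error([0.9] * 10, [True] * 5 + [False] * 5)
```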

Expert Agent

An agent with at least 95% accuracy over 20 or more decisions, which earns a +50 confidence budget bonus. Expert status is domain-specific.
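
The thresholds above translate into a simple check. This is a minimal sketch that assumes both thresholds are inclusive and that the bonus is added to some base budget; the function names and the base budget value are hypothetical.

```python
EXPERT_MIN_ACCURACY_PERCENT = 95   # from the definition above
EXPERT_MIN_DECISIONS = 20          # from the definition above
EXPERT_BUDGET_BONUS = 50           # from the definition above


def is_expert(correct: int, total: int) -> bool:
    """True when the agent meets both thresholds in a single domain."""
    if total < EXPERT_MIN_DECISIONS:
        return False
    # Integer comparison avoids floating-point rounding at exactly 95%.
    return correct * 100 >= EXPERT_MIN_ACCURACY_PERCENT * total


def confidence_budget(base_budget: int, correct: int, total: int) -> int:
    """Base budget plus the expert bonus, if earned (base budget is assumed)."""
    bonus = EXPERT_BUDGET_BONUS if is_expert(correct, total) else 0
    return base_budget + bonus


budget = confidence_budget(100, correct=19, total=20)  # 19/20 is exactly 95%
```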

Actionability Confidence (AC)

Measure of an agent's ability to make autonomous decisions. Evolves based on performance through replicator dynamics.


Document Processing

AI Processing

Automated document analysis including text extraction, entity recognition, summarization, and classification using Claude, Gemini, and OpenAI models.

OCR (Optical Character Recognition)

Text extraction from images and scanned documents using Tesseract, Google Cloud Vision, or AWS Textract.

Text Extraction

Converting document content into machine-readable text while preserving structure and formatting.

Embedding

Vector representation of text enabling semantic similarity search. Generated using OpenAI's embedding models.

Semantic Search

Finding documents by meaning rather than exact keyword matches, using vector similarity in embedding space.
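
Vector similarity is often measured with cosine similarity. The sketch below ranks a tiny in-memory index against a query vector; the three-dimensional vectors are toy values made up for illustration (real embeddings have hundreds or thousands of dimensions), and the function names are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, index, top_k=3):
    """Return the top_k (doc_id, score) pairs, most similar first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for document embeddings.
index = {
    "invoice": [0.9, 0.1, 0.0],
    "contract": [0.1, 0.9, 0.0],
    "memo": [0.0, 0.2, 0.9],
}
results = search([1.0, 0.2, 0.0], index, top_k=2)
```

A production system would typically delegate this ranking to a vector index rather than scanning every document.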


Architecture

Multi-Tenant

Architecture where multiple organizations (tenants) share the same infrastructure while maintaining complete data isolation through Row-Level Security (RLS).

Row-Level Security (RLS)

PostgreSQL feature ensuring every database query is automatically scoped to the authenticated user's tenant, preventing data leakage.

DAG (Directed Acyclic Graph)

Workflow representation where nodes are tasks and edges show dependencies. No circular dependencies allowed, ensuring workflows complete.
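
The acyclic property is what guarantees a valid execution order exists. The sketch below uses Python's standard graphlib module (Python 3.9+) to order a small workflow and to detect a cycle; the task names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Each task maps to the set of tasks it depends on (hypothetical workflow).
workflow = {
    "ocr": set(),
    "extract_entities": {"ocr"},
    "summarize": {"ocr"},
    "index": {"extract_entities", "summarize"},
}

# static_order() yields tasks so that dependencies always come first.
order = list(TopologicalSorter(workflow).static_order())


def has_cycle(graph) -> bool:
    """True if the dependency graph contains a circular dependency."""
    try:
        TopologicalSorter(graph).prepare()
    except CycleError:
        return True
    return False


cyclic = has_cycle({"a": {"b"}, "b": {"a"}})
```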

CGR3 Pipeline

Context Graph Reasoning with Retrieve-Rank-Reason. A three-stage process:

1. Retrieve: Extract entities and query graph
2. Rank: LLM ranks candidates with entity context
3. Reason: Generate fluent response grounded in verified facts


Federation

Federation

System enabling multiple Archivus instances to share verified facts while preserving data sovereignty. Raw documents never leave tenant boundaries.

Trust Chain

Complete record of every cross-tenant interaction with explicit trust levels. No implicit access is allowed.

Data Sovereignty

Principle that raw documents and sensitive data remain within the tenant's control. Only verified, anonymized facts flow through federation.

Hedera Consensus Service

Public ledger providing tamper-proof timestamps for federated fact exchanges, enabling trustless verification.


Enterprise Features

BYOB (Bring Your Own Bucket)

Enterprise feature allowing storage of documents in your own S3/Azure/GCS bucket instead of Archivus-managed storage.

White Label

Enterprise deployment option where Archivus runs under your brand with custom domain, colors, and logos.

MCP (Model Context Protocol)

Protocol for integrating external AI tools and services into Archivus workflows. Enables extensibility without vendor lock-in.

Compliance Backbone

MotherDuck/DuckDB analytics layer for audit trails, compliance reporting, and historical analysis stored in Parquet format on S3.


User Concepts

Tenant

An organization's isolated instance of Archivus with its own users, documents, and settings. The subdomain identifies the tenant (e.g., acme.archivus.app).

Workspace

Organizational container for documents and folders, similar to a filing cabinet. Can be personal, shared, or team-owned.

Cabinet

Another term for Workspace, used interchangeably.

Folder

Container for organizing documents within a workspace, supporting nested hierarchies.

Tag

Label applied to documents for categorization. Can be manual or AI-generated.

Rule

Automated logic that triggers actions based on document properties (e.g., auto-route invoices to Finance folder).


API & Development

API Key

Long-lived authentication token for server-to-server integrations. Prefixed with ak_live_ or ak_test_.

JWT Token

Short-lived user session token for web/mobile applications. Expires after 24 hours.

Webhook

HTTP callback triggered when events occur (e.g., document processed, workflow completed).

Rate Limit

Maximum number of API requests allowed per minute, based on subscription tier.

Idempotency

Property ensuring that repeated requests with the same idempotency key produce the same result without additional side effects.
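
A server can provide this guarantee by remembering the result for each key. The sketch below is an in-memory illustration only; the class name, the result shape, and the side-effect counter are assumptions, and it does not describe how the Archivus API stores idempotency keys.

```python
class IdempotentProcessor:
    """Replays the stored result when an idempotency key is seen again."""

    def __init__(self):
        self._results = {}
        self.side_effects = 0  # counts how many times real work was done

    def handle(self, idempotency_key, payload):
        if idempotency_key in self._results:
            # Duplicate request: return the original result, do no new work.
            return self._results[idempotency_key]
        self.side_effects += 1
        result = {"id": f"doc-{self.side_effects}", "payload": payload}
        self._results[idempotency_key] = result
        return result


processor = IdempotentProcessor()
first = processor.handle("key-1", {"name": "report.pdf"})
retry = processor.handle("key-1", {"name": "report.pdf"})
other = processor.handle("key-2", {"name": "invoice.pdf"})
```

Here the retry returns the same result as the first call, and only two units of work are performed across the three requests.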


Subscription Tiers

Free

50 one-time AI credits, no GOLAG, basic features only.

Starter

500 AI credits/month, no GOLAG, standard features.

Pro

1,500 AI credits/month, basic GOLAG (3 domains), advanced search.

Team

6,000 AI credits/month, standard GOLAG (7 domains), DAG workflows, view-only federation.

Enterprise

Unlimited AI credits, full GOLAG (13 domains), full federation, BYOB, white label, SSO.


Technical Terms

Neuro-Symbolic AI

AI approach combining neural networks (LLMs) with symbolic reasoning (knowledge graphs) for verifiable intelligence.

Epistemic Reasoning

Reasoning about knowledge itself: what is known, with what confidence, and supported by what evidence.

Replicator Dynamics

Mathematical model from evolutionary biology used to evolve agent capabilities: AC(gen+1) = AC(gen) + α(AC_best - AC_avg).
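
Applied to a population of agents, the update rule above shifts every agent's AC by α times the gap between the best and the average score. The sketch below assumes α is a small positive step size and that AC values are plain floats; the function name and the default α are illustrative, not documented values.

```python
def evolve(population, alpha=0.1):
    """One generation of AC(gen+1) = AC(gen) + alpha * (AC_best - AC_avg)."""
    if not population:
        return []
    ac_best = max(population)
    ac_avg = sum(population) / len(population)
    shift = alpha * (ac_best - ac_avg)
    return [ac + shift for ac in population]


# Best is 0.9 and average is 0.7, so each agent moves up by 0.1 * 0.2 = 0.02.
next_gen = evolve([0.5, 0.7, 0.9], alpha=0.1)
```

When all agents share the same AC, the best equals the average and the population stops changing.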

Hash Chain

Cryptographic chain in which each block contains the hash of the previous block, enabling tamper detection without blockchain overhead.
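
Tamper detection follows from recomputing each hash and comparing it with the stored one. The sketch below builds a small chain with SHA-256; the block layout, the all-zero genesis value, and the function names are assumptions for illustration rather than the Archivus format.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the first block's predecessor


def block_hash(prev_hash, payload):
    """SHA-256 over the previous hash and the payload, serialized deterministically."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({"prev": prev_hash, "payload": payload, "hash": block_hash(prev_hash, payload)})


def verify_chain(chain):
    """True only if every block links to its predecessor and its hash matches."""
    expected_prev = GENESIS_HASH
    for block in chain:
        if block["prev"] != expected_prev:
            return False
        if block["hash"] != block_hash(block["prev"], block["payload"]):
            return False
        expected_prev = block["hash"]
    return True


chain = []
for event in ("document uploaded", "claim extracted", "claim verified"):
    append_block(chain, {"event": event})

valid_before = verify_chain(chain)
chain[1]["payload"]["event"] = "claim deleted"  # simulate tampering
valid_after = verify_chain(chain)
```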

Context Window

Maximum amount of text an LLM can process in a single request (e.g., 200K tokens for Claude Opus 4.5).

Prompt Caching

Claude feature caching repeated prompt prefixes to reduce costs by 70-90%.


File Formats

Parquet

Columnar storage format used for analytics data on S3. Efficient compression and query performance.

JSONL

JSON Lines format where each line is a complete JSON object. Used for bulk data exports and imports.
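
Reading and writing the format needs only the standard json module. In the sketch below the record fields are made up for illustration, and an in-memory stream stands in for a file.

```python
import io
import json


def write_jsonl(records, stream):
    """Write one JSON object per line."""
    for record in records:
        stream.write(json.dumps(record) + "\n")


def read_jsonl(stream):
    """Parse each non-empty line as an independent JSON object."""
    return [json.loads(line) for line in stream if line.strip()]


# Hypothetical export records.
records = [
    {"id": "doc-1", "title": "Invoice"},
    {"id": "doc-2", "title": "Contract"},
]

buffer = io.StringIO()
write_jsonl(records, buffer)
buffer.seek(0)
round_trip = read_jsonl(buffer)
```

Because every line parses on its own, large exports can be streamed and processed one record at a time.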

RDF (Resource Description Framework)

Standard for representing graph data. Archivus supports RDF export for interoperability.