Glossary¶
Key terms and concepts in the Archivus platform.
Core Concepts¶
Knowledge Graph¶
A graph database structure where entities, relationships, and claims are stored as interconnected nodes. Forms the foundation of Archivus's verifiable intelligence.
Entity¶
A person, organization, location, date, or concept extracted from documents. Entities are nodes in the Knowledge Graph with rich context including aliases, descriptions, and external identifiers.
Claim¶
A statement or assertion extracted from a document, with supporting evidence and confidence scores. Claims can support or contradict other claims, enabling epistemic reasoning.
Provenance¶
The complete chain of evidence showing where information came from, who created it, when, and with what level of confidence. Every fact in Archivus has traceable provenance.
Quadruple Model¶
An extension of the traditional triple model (subject, predicate, object) that adds context: (subject, predicate, object, CONTEXT). Context includes temporal, geographic, and provenance information.
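As a rough illustration of the quadruple shape, the sketch below models a fact as a triple plus a context record. The class names, field names, and sample values are hypothetical; the glossary only states that context covers temporal, geographic, and provenance information.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Context:
    # Hypothetical fields: one each for temporal, geographic, and provenance info.
    valid_from: Optional[str] = None
    location: Optional[str] = None
    source_document: Optional[str] = None


@dataclass(frozen=True)
class Quadruple:
    subject: str
    predicate: str
    obj: str
    context: Context

    def as_triple(self) -> Tuple[str, str, str]:
        # Dropping the context recovers the traditional triple.
        return (self.subject, self.predicate, self.obj)


# Made-up example values, for illustration only.
fact = Quadruple(
    "Acme Corp",
    "acquired",
    "Widget Ltd",
    Context(valid_from="2021-03-01", location="Delaware", source_document="doc-123"),
)
```

Keeping the context as its own record means the same triple can appear more than once with different sources or time ranges.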
AI & Verification¶
GOLAG¶
Game-Oriented Lagrangian Agent Governance - Evolutionary verification system where AI agents compete and improve through calibrated confidence budgets and quadratic voting.
Lagrangian¶
Mathematical optimization function balancing confidence, context match, and risk. Low Lagrangian values trigger human escalation; high values enable autonomous action.
Quadratic Voting¶
Voting mechanism where vote cost equals votes squared (cost = votes²), forcing honest confidence calibration. Prevents overconfident agents from dominating decisions.
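The cost rule above (cost = votes²) can be sketched in a few lines. The function names and the idea of an integer budget are assumptions made for illustration, not the Archivus API.

```python
import math


def vote_cost(votes: int) -> int:
    """Cost of casting `votes` votes under the rule cost = votes squared."""
    if votes < 0:
        raise ValueError("votes must be non-negative")
    return votes ** 2


def max_votes(budget: int) -> int:
    """Largest whole number of votes affordable with `budget` (assumed integer)."""
    if budget < 0:
        raise ValueError("budget must be non-negative")
    return math.isqrt(budget)
```

Because cost grows quadratically, doubling the number of votes on one decision quadruples its cost, which is what discourages an agent from spending heavily unless it is genuinely confident.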
Expected Calibration Error (ECE)¶
Metric measuring how well an agent's confidence scores match actual accuracy. Well-calibrated agents (low ECE) accumulate influence over time.
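For reference, a common way to compute ECE is to group predictions into equal-width confidence bins and take the weighted average gap between mean confidence and accuracy in each bin. The sketch below follows that general definition; the bin count and function name are assumptions, and Archivus may compute the metric differently.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    outcomes: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Binned ECE: sum over bins of (bin size / n) * |mean confidence - accuracy|."""
    if len(confidences) != len(outcomes):
        raise ValueError("confidences and outcomes must have the same length")
    if any(c < 0.0 or c > 1.0 for c in confidences):
        raise ValueError("confidences must lie in [0, 1]")
    n = len(confidences)
    if n == 0:
        return 0.0

    bins = [[] for _ in range(n_bins)]
    for conf, correct in zip(confidences, outcomes):
        # A confidence of exactly 1.0 falls into the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, correct))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(mean_conf - accuracy)
    return ece
```

An agent that reports 0.9 confidence on ten decisions and gets five right has a gap of about 0.4 in that bin, so its ECE is about 0.4.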
Expert Agent¶
An agent with at least 95% accuracy over at least 20 decisions, which earns a +50 confidence budget bonus. Expert status is domain-specific.
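The thresholds above translate into a simple check. This sketch reads "95%+" and "20+" as inclusive lower bounds; the function names and the notion of a base budget are hypothetical.

```python
EXPERT_MIN_DECISIONS = 20   # from the glossary: 20+ decisions
EXPERT_BONUS = 50           # from the glossary: +50 confidence budget


def is_expert(correct: int, total: int) -> bool:
    """True if accuracy is at least 95% over at least 20 decisions."""
    if total < EXPERT_MIN_DECISIONS:
        return False
    # Integer comparison avoids floating-point rounding at exactly 95%.
    return correct * 100 >= 95 * total


def confidence_budget(base_budget: int, correct: int, total: int) -> int:
    """Base budget (a hypothetical input) plus the expert bonus when earned."""
    return base_budget + (EXPERT_BONUS if is_expert(correct, total) else 0)
```

Since expert status is domain-specific, a real implementation would presumably track `correct` and `total` per domain rather than per agent.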
Actionability Confidence (AC)¶
Measure of an agent's ability to make autonomous decisions. Evolves based on performance through replicator dynamics.
Document Processing¶
AI Processing¶
Automated document analysis including text extraction, entity recognition, summarization, and classification using Claude, Gemini, and OpenAI models.
OCR (Optical Character Recognition)¶
Text extraction from images and scanned documents using Tesseract, Google Cloud Vision, or AWS Textract.
Text Extraction¶
Converting document content into machine-readable text while preserving structure and formatting.
Embedding¶
Vector representation of text enabling semantic similarity search. Generated using OpenAI's embedding models.
Semantic Search¶
Finding documents by meaning rather than exact keyword matches, using vector similarity in embedding space.
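A minimal sketch of vector-similarity ranking, assuming cosine similarity as the metric (the glossary does not name one) and tiny hand-written vectors in place of real embeddings. The function names are illustrative.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec: Sequence[float], index: Dict[str, Sequence[float]], top_k: int = 3) -> List[str]:
    """Return the ids of the top_k documents most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]


# Toy 2-dimensional "embeddings"; real embedding vectors have many more dimensions.
toy_index = {
    "invoice": [1.0, 0.0],
    "contract": [0.0, 1.0],
    "receipt": [0.9, 0.1],
}
```

With this toy index, a query vector close to `[1.0, 0.0]` ranks "invoice" and "receipt" ahead of "contract" even though no keywords are compared.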
Architecture¶
Multi-Tenant¶
Architecture where multiple organizations (tenants) share the same infrastructure while maintaining complete data isolation through Row-Level Security (RLS).
Row-Level Security (RLS)¶
PostgreSQL feature ensuring every database query is automatically scoped to the authenticated user's tenant, preventing data leakage.
DAG (Directed Acyclic Graph)¶
Workflow representation where nodes are tasks and edges show dependencies. Circular dependencies are not allowed, so every workflow has a valid execution order and cannot loop indefinitely.
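To illustrate why acyclicity matters, the sketch below derives an execution order with the standard-library `graphlib` module (Python 3.9+). The task names and the `execution_order` helper are made up for the example; this is not how Archivus schedules workflows.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def execution_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Order tasks so each runs after its dependencies.

    `dependencies` maps a task to the set of tasks it depends on.
    Raises graphlib.CycleError if the graph contains a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical three-step workflow: OCR, then extraction, then summarization.
workflow = {
    "ocr": set(),
    "extract": {"ocr"},
    "summarize": {"extract"},
}
```

Passing a graph such as `{"a": {"b"}, "b": {"a"}}` raises `CycleError`, which is the situation the no-cycles rule is meant to exclude.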
CGR3 Pipeline¶
Context Graph Reasoning with Retrieve-Rank-Reason - Three-stage process:
1. Retrieve: Extract entities and query graph
2. Rank: LLM ranks candidates with entity context
3. Reason: Generate fluent response grounded in verified facts
Federation¶
Federation¶
System enabling multiple Archivus instances to share verified facts while preserving data sovereignty. Raw documents never leave tenant boundaries.
Trust Chain¶
Complete record of every cross-tenant interaction with explicit trust levels. No implicit access is allowed.
Data Sovereignty¶
Principle that raw documents and sensitive data remain within the tenant's control. Only verified, anonymized facts flow through federation.
Hedera Consensus Service¶
Public ledger providing tamper-proof timestamps for federated fact exchanges, enabling trustless verification.
Enterprise Features¶
BYOB (Bring Your Own Bucket)¶
Enterprise feature allowing storage of documents in your own S3/Azure/GCS bucket instead of Archivus-managed storage.
White Label¶
Enterprise deployment option where Archivus runs under your brand with custom domain, colors, and logos.
MCP (Model Context Protocol)¶
Protocol for integrating external AI tools and services into Archivus workflows. Enables extensibility without vendor lock-in.
Compliance Backbone¶
MotherDuck/DuckDB analytics layer for audit trails, compliance reporting, and historical analysis stored in Parquet format on S3.
User Concepts¶
Tenant¶
An organization's isolated instance of Archivus with its own users, documents, and settings. The subdomain identifies the tenant (e.g., acme.archivus.app).
Workspace¶
Organizational container for documents and folders, similar to a filing cabinet. Can be personal, shared, or team-owned.
Cabinet¶
Synonym for Workspace; the two terms are used interchangeably.
Folder¶
Container for organizing documents within a workspace, supporting nested hierarchies.
Tag¶
Label applied to documents for categorization. Can be manual or AI-generated.
Rule¶
Automated logic that triggers actions based on document properties (e.g., auto-route invoices to Finance folder).
API & Development¶
API Key¶
Long-lived authentication token for server-to-server integrations. Prefixed with ak_live_ or ak_test_.
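The two prefixes make it possible to tell live keys from test keys before sending a request. A minimal sketch, assuming only the prefixes named above; the function name and the sample key bodies in the usage note are invented.

```python
def key_environment(api_key: str) -> str:
    """Classify an API key as 'live' or 'test' from its documented prefix."""
    if api_key.startswith("ak_live_"):
        return "live"
    if api_key.startswith("ak_test_"):
        return "test"
    raise ValueError("API key must start with ak_live_ or ak_test_")
```

For example, `key_environment("ak_test_example")` returns `"test"`, which a deployment script could use to refuse test keys in production.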
JWT Token¶
Short-lived user session token for web/mobile applications. Expires after 24 hours.
Webhook¶
HTTP callback triggered when events occur (e.g., document processed, workflow completed).
Rate Limit¶
Maximum number of API requests allowed per minute, based on subscription tier.
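A per-minute limit can be modelled as a sliding window over recent request times. The glossary does not say whether Archivus uses a fixed or sliding window, so the window type, class name, and parameters below are assumptions. Timestamps are passed in explicitly to keep the sketch deterministic.

```python
from collections import deque


class PerMinuteLimiter:
    """Sliding-window limiter: at most max_requests in any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float = 60.0) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def allow(self, now: float) -> bool:
        """Record and allow a request at time `now` (seconds), or reject it."""
        # Forget requests that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_requests:
            self._timestamps.append(now)
            return True
        return False
```

A client could run the same check locally, with `time.monotonic()` as the clock, to avoid sending requests that would be rejected.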
Idempotency¶
Property ensuring that duplicate requests carrying the same idempotency key produce the same result without repeating side effects.
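One way to picture idempotency keys is a handler that stores the first result for each key and replays it on retries. This is an in-memory sketch with hypothetical names; a real service would persist the results and expire old keys.

```python
from typing import Any, Callable, Dict


class IdempotentHandler:
    """Runs `action` at most once per idempotency key and replays the stored result."""

    def __init__(self, action: Callable[[Any], Any]) -> None:
        self._action = action
        self._results: Dict[str, Any] = {}

    def handle(self, idempotency_key: str, payload: Any) -> Any:
        if idempotency_key in self._results:
            # Duplicate request: return the stored result, skip the side effect.
            return self._results[idempotency_key]
        result = self._action(payload)
        self._results[idempotency_key] = result
        return result
```

If a client times out and retries with the same key, the action runs once and both calls receive the same result.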
Subscription Tiers¶
Free¶
50 one-time AI credits, no GOLAG, basic features only.
Starter¶
500 AI credits/month, no GOLAG, standard features.
Pro¶
1,500 AI credits/month, basic GOLAG (3 domains), advanced search.
Team¶
6,000 AI credits/month, standard GOLAG (7 domains), DAG workflows, view-only federation.
Enterprise¶
Unlimited AI credits, full GOLAG (13 domains), full federation, BYOB, white label, SSO.
Technical Terms¶
Neuro-Symbolic AI¶
AI approach combining neural networks (LLMs) with symbolic reasoning (knowledge graphs) for verifiable intelligence.
Epistemic Reasoning¶
Reasoning about knowledge itself - what is known, with what confidence, supported by what evidence.
Replicator Dynamics¶
Mathematical model from evolutionary biology used to evolve agent capabilities: AC(gen+1) = AC(gen) + α(AC_best - AC_avg).
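The update rule above can be written directly as a function. The glossary does not define α; this sketch treats it as a step-size parameter, and the function name and sample values in the usage note are illustrative.

```python
def next_ac(ac: float, ac_best: float, ac_avg: float, alpha: float) -> float:
    """One generation of the stated rule: AC(gen+1) = AC(gen) + alpha * (AC_best - AC_avg)."""
    return ac + alpha * (ac_best - ac_avg)
```

For instance, with AC = 0.5, AC_best = 0.9, AC_avg = 0.6, and α = 0.1, the next value is approximately 0.53. When the best agent equals the average, the gap is zero and AC stays unchanged.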
Hash Chain¶
Cryptographic chain where each block contains the hash of the previous block, so altering any block invalidates the hashes that follow it. Enables tamper detection without blockchain overhead.
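A minimal hash chain can be built with the standard library. SHA-256, the JSON encoding, the all-zero genesis value, and the block layout are assumptions chosen for the sketch; the glossary does not specify them.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS_HASH = "0" * 64  # assumed placeholder for the first block's predecessor


def block_hash(prev_hash: str, payload: Any) -> str:
    """SHA-256 over the previous hash and a canonical JSON form of the payload."""
    data = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def append_block(chain: List[Dict[str, Any]], payload: Any) -> None:
    """Append a block that commits to the hash of the block before it."""
    prev = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({"prev": prev, "payload": payload, "hash": block_hash(prev, payload)})


def verify(chain: List[Dict[str, Any]]) -> bool:
    """Recompute every hash and check each link; False means tampering."""
    prev = GENESIS_HASH
    for block in chain:
        if block["prev"] != prev:
            return False
        if block["hash"] != block_hash(block["prev"], block["payload"]):
            return False
        prev = block["hash"]
    return True
```

Editing the payload of an early block changes its recomputed hash, so `verify` fails at that block without needing any distributed consensus.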
Context Window¶
Maximum amount of text an LLM can process in a single request (e.g., 200K tokens for Claude Opus 4.5).
Prompt Caching¶
Claude feature caching repeated prompt prefixes to reduce costs by 70-90%.
File Formats¶
Parquet¶
Columnar storage format used for analytics data on S3. Efficient compression and query performance.
JSONL¶
JSON Lines format where each line is a complete JSON object. Used for bulk data exports and imports.
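Reading and writing JSONL needs only the standard `json` module. The helper names below are illustrative, and the record fields in the usage note are made up rather than taken from an Archivus export.

```python
import json
from typing import Any, Iterable, List


def dumps_jsonl(records: Iterable[Any]) -> str:
    """Serialize records as JSON Lines: one JSON value per line."""
    return "".join(json.dumps(record) + "\n" for record in records)


def loads_jsonl(text: str) -> List[Any]:
    """Parse JSON Lines text, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```

Because each line is independent, a large export can be processed one record at a time; for example, `loads_jsonl(dumps_jsonl([{"id": 1}, {"id": 2}]))` returns the original two records.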
RDF (Resource Description Framework)¶
Standard for representing graph data. Archivus supports RDF export for interoperability.