Document Intelligence

Archivus turns uploaded documents into structured knowledge. Each document becomes a queryable asset in your Knowledge Graph, with entities, relationships, and claims extracted automatically.

Smart Classification

Starter+

Automatically identify document types with AI confidence scoring:

  • 95%+ accuracy across 20+ document types
  • Supports contracts, invoices, receipts, reports, policies, and more
  • Confidence scores for audit trails
  • AI-suggested tags based on content analysis

Example Classification:

Input: Uploaded PDF
Output: INVOICE (97% confidence)
Tags: #finance, #vendor-acme, #accounts-payable
Entities: Acme Corp, $15,420.50, Jan 15 2026
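
A result like the one above can be pictured as a small record. The sketch below is illustrative only: the class name `ClassificationResult`, its fields, and the 0.90 review threshold are assumptions made for this example and are not taken from the Archivus API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClassificationResult:
    """Hypothetical shape of a classification result (not the Archivus API)."""
    doc_type: str
    confidence: float  # fraction between 0.0 and 1.0
    tags: List[str] = field(default_factory=list)
    entities: List[str] = field(default_factory=list)

    def needs_review(self, threshold: float = 0.90) -> bool:
        # The 0.90 threshold is an assumption; choose one that fits your audit policy.
        return self.confidence < threshold


# Values copied from the example classification above.
result = ClassificationResult(
    doc_type="INVOICE",
    confidence=0.97,
    tags=["#finance", "#vendor-acme", "#accounts-payable"],
    entities=["Acme Corp", "$15,420.50", "Jan 15 2026"],
)
```

Keeping the confidence score next to the predicted type lets low-confidence results be routed to manual review while the score itself stays available for audit trails.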

Entity Extraction

Pro+

Extract structured data from unstructured documents:

Extracted Entity Types:

  • People: Names, roles, contact information
  • Organizations: Companies, departments, vendors
  • Dates: Deadlines, effective dates, expirations
  • Amounts: Prices, fees, totals, line items
  • Locations: Addresses, jurisdictions, regions

Knowledge Graph Integration:

All extracted entities populate your tenant-scoped Knowledge Graph with:

  • Temporal context (when was this true?)
  • Geographic context (where did this apply?)
  • Provenance tracking (what document proves this?)
  • Relationship mapping (who is connected to whom?)
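
One way to picture the four kinds of context is a single fact record that carries them together. This is a minimal sketch under stated assumptions: `GraphFact`, its field names, the relation label, and the file name are hypothetical and do not describe the actual Knowledge Graph schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class GraphFact:
    """Hypothetical fact record; not the actual Knowledge Graph schema."""
    subject: str                 # relationship mapping: who ...
    relation: str                # ... is connected how ...
    obj: str                     # ... to whom
    valid_from: date             # temporal context: start of validity
    valid_to: Optional[date]     # temporal context: end of validity (None = open-ended)
    jurisdiction: Optional[str]  # geographic context, if the document states one
    source_document: str         # provenance: the document that supports the fact

    def holds_on(self, day: date) -> bool:
        """Return True if the fact was valid on the given day."""
        if day < self.valid_from:
            return False
        return self.valid_to is None or day <= self.valid_to


fact = GraphFact(
    subject="Acme Corp",
    relation="has_service_agreement_with",  # hypothetical relation label
    obj="Vendor Inc",
    valid_from=date(2026, 1, 1),
    valid_to=date(2026, 12, 31),
    jurisdiction=None,
    source_document="service-agreement.pdf",  # hypothetical file name
)
```

With a record like this, "when was this true?" becomes a date-range check and "what document proves this?" is a field lookup.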

AI-Enhanced OCR

Starter+

Extract text from scanned documents and images, with AI-based correction to improve accuracy:

  • 95%+ accuracy, compared with 80-85% for traditional OCR
  • Context-aware error correction
  • Supports scanned PDFs and photos in JPEG, PNG, TIFF, and HEIC formats
  • Structure preservation during extraction

Perfect for mobile document capture and legacy document digitization.

Document Summarization

Starter+

Generate summaries tailored to document type:

Quick Summary (1 credit)

  • 2-3 sentence overview
  • Key entities and amounts
  • Primary purpose or action items

Full Summary (2 credits)

  • Comprehensive summary with sections
  • Key points and themes
  • Important deadlines or obligations
  • Context-aware formatting

Example Summary:

Service agreement between Acme Corp and Vendor Inc for software development services. Contract term: Jan 1 - Dec 31, 2026. Total value: $150,000. Payment terms: Monthly invoicing with Net-30 terms.
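
The per-summary credit costs listed above (1 credit for a Quick Summary, 2 for a Full Summary) make budgeting simple arithmetic. The helper below is a sketch; the function name `summary_cost` and the dictionary are hypothetical, and only the two credit prices come from this page.

```python
# Credit prices as stated on this page.
SUMMARY_CREDITS = {"quick": 1, "full": 2}


def summary_cost(quick: int = 0, full: int = 0) -> int:
    """Return the total credits for the given number of summaries."""
    if quick < 0 or full < 0:
        raise ValueError("summary counts must be non-negative")
    return quick * SUMMARY_CREDITS["quick"] + full * SUMMARY_CREDITS["full"]


# 10 quick summaries and 5 full summaries: 10 * 1 + 5 * 2 = 20 credits
print(summary_cost(quick=10, full=5))  # 20
```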

Multi-Document Analysis

Pro+

Analyze multiple documents simultaneously:

Document Comparison

  • Side-by-side comparison of 2-10 documents
  • Identify similarities and differences
  • Spot inconsistencies or contradictions
  • Version tracking for contracts

Multi-Document Q&A

  • Ask questions across document collections
  • Synthesize answers from various sources
  • Source attribution for each finding
  • Detect conflicts or discrepancies

Use Cases:

  • Compare vendor proposals to find best pricing
  • Analyze contract versions to track changes
  • Cross-reference financial statements
  • Research questions across multiple reports

Chat with Documents

Pro+

Natural language question answering with citations:

  • Multi-turn conversations with context
  • Citations to specific document sections
  • Hallucination prevention with confidence indicators
  • Answers grounded in document content

Example Conversation:

You: What is the refund policy?

AI: According to section 4.2, the refund policy allows for full
refunds within 30 days of purchase if the product is returned
in original condition. [Section 4.2, Page 8]

You: Are there any exceptions?

AI: Yes, section 4.3 notes that custom items are not eligible
for refunds. [Section 4.3, Page 8]
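
Citations in the form shown above can be pulled out of an answer for logging or verification. The pattern below is inferred only from this example conversation (`[Section X.Y, Page N]`); the real citation format may differ, and `extract_citations` is a hypothetical helper.

```python
import re
from typing import List, Tuple

# Assumed format, inferred from the example: "[Section 4.2, Page 8]"
CITATION_RE = re.compile(r"\[Section (\d+(?:\.\d+)*), Page (\d+)\]")


def extract_citations(answer: str) -> List[Tuple[str, int]]:
    """Return (section, page) pairs for every citation found in the answer."""
    return [(m.group(1), int(m.group(2))) for m in CITATION_RE.finditer(answer)]


answer = (
    "Yes, section 4.3 notes that custom items are not eligible "
    "for refunds. [Section 4.3, Page 8]"
)
print(extract_citations(answer))  # [('4.3', 8)]
```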

Batch Processing

Team+

Process large document volumes efficiently:

  • Parallel processing for batch uploads
  • 95% of documents processed in under 60 seconds
  • Background job queues prevent UI blocking
  • Automatic retry logic for failed operations
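
The page does not describe how the automatic retry works. As an illustration of the general idea only, the sketch below retries a failing operation with exponential backoff; the function name, the attempt count, and the backoff schedule are all assumptions, not Archivus behavior.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on any exception with exponential backoff.

    The delay doubles after each failure: base_delay, 2 * base_delay, ...
    The last failure is re-raised to the caller.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to test, because a test can record the delays instead of waiting for them.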

Document Welcome Messages

Pro+

First-time document analysis generates personalized welcome messages:

  • Key insights preview
  • Suggested questions to ask
  • Quick action recommendations
  • 1 credit per document (included in processing)

Supported File Types

Documents

  • PDF (native and scanned)
  • Microsoft Office (Word, Excel, PowerPoint)
  • Google Docs exports
  • Plain text (TXT, Markdown, CSV)
  • Rich text (RTF)

Images with OCR

  • JPEG, PNG, TIFF, HEIC
  • Scanned documents
  • Photos of documents

Archive Formats

  • ZIP (extract and process contents)

File Size Limits

Tier         Single File   Total Storage
Free         50 MB         1 GB
Starter      100 MB        10 GB
Pro          200 MB        50 GB
Team         500 MB        250 GB
Enterprise   1 GB          Unlimited
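
A client can check the single-file limit before uploading. The sketch below encodes the Single File column from the table above; the helper name `fits_single_file_limit` is hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption, since the page does not say whether limits use binary or decimal units.

```python
MB = 1024 * 1024  # assumption: binary megabytes; the page does not specify

# Single-file limits per tier, in MB, from the table above.
SINGLE_FILE_LIMIT_MB = {
    "free": 50,
    "starter": 100,
    "pro": 200,
    "team": 500,
    "enterprise": 1024,  # 1 GB, under the same binary-unit assumption
}


def fits_single_file_limit(size_bytes: int, tier: str) -> bool:
    """Return True if a file of size_bytes is within the tier's single-file limit."""
    try:
        limit_mb = SINGLE_FILE_LIMIT_MB[tier.lower()]
    except KeyError:
        raise ValueError(f"unknown tier: {tier!r}") from None
    return 0 <= size_bytes <= limit_mb * MB


# A 150 MB file fits on Pro (200 MB limit) but not on Starter (100 MB limit).
print(fits_single_file_limit(150 * MB, "Pro"))      # True
print(fits_single_file_limit(150 * MB, "Starter"))  # False
```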

Performance

Processing Speed:

  • 95% of documents processed in under 60 seconds
  • Parallel processing for batch uploads
  • Real-time status updates via WebSocket

Accuracy:

  • 95%+ classification accuracy
  • 95%+ OCR accuracy (vs. 80-85% for traditional OCR)
  • Entity extraction validated against Knowledge Graph

Integration Points

Document intelligence integrates throughout the platform:

  • Search: Extracted entities power semantic search
  • Rules Engine: Classification triggers automation
  • Workflows: Documents flow through DAG pipelines
  • Knowledge Graph: Entities populate the graph
  • Analytics: Document metrics for reporting

Getting Started

  1. Upload documents via the web UI, the API, or integrations
  2. Automatic processing extracts entities and metadata
  3. Review the results in the document details panel
  4. Query the extracted intelligence using search or the AI assistant
  5. Build automation with rules and workflows