Document Intelligence
Archivus transforms documents into structured knowledge. Every uploaded document becomes a queryable asset in your Knowledge Graph, with entities, relationships, and claims extracted automatically.
Smart Classification
Starter+
Automatically identify document types with AI confidence scoring:
- 95%+ accuracy across 20+ document types
- Supports contracts, invoices, receipts, reports, policies, and more
- Confidence scores for audit trails
- AI-suggested tags based on content analysis
Example Classification:
Input: Uploaded PDF
Output: INVOICE (97% confidence)
Tags: #finance, #vendor-acme, #accounts-payable
Entities: Acme Corp, $15,420.50, Jan 15 2026
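A confidence score is most useful when something acts on it. The sketch below shows one way to hold a classification result and flag low-confidence results for human review. The class, field names, and the 0.90 threshold are assumptions made for illustration, not part of the Archivus API.

```python
from dataclasses import dataclass, field


@dataclass
class ClassificationResult:
    """Hypothetical result shape; field names are assumptions."""
    document_type: str
    confidence: float  # 0.0 to 1.0
    tags: list[str] = field(default_factory=list)


REVIEW_THRESHOLD = 0.90  # assumed cut-off, not an Archivus default


def needs_review(result: ClassificationResult,
                 threshold: float = REVIEW_THRESHOLD) -> bool:
    """Flag classifications whose confidence falls below the threshold."""
    return result.confidence < threshold


invoice = ClassificationResult(
    "INVOICE", 0.97, ["#finance", "#vendor-acme", "#accounts-payable"]
)
print(needs_review(invoice))  # False
```

Storing the score alongside the decision, rather than only the label, is what makes the result usable in an audit trail.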
Entity Extraction
Pro+
Extract structured data from unstructured documents:
Extracted Entity Types:
- People: Names, roles, contact information
- Organizations: Companies, departments, vendors
- Dates: Deadlines, effective dates, expirations
- Amounts: Prices, fees, totals, line items
- Locations: Addresses, jurisdictions, regions
Knowledge Graph Integration:
All extracted entities populate your tenant-scoped Knowledge Graph with:
- Temporal context (when was this true?)
- Geographic context (where did this apply?)
- Provenance tracking (what document proves this?)
- Relationship mapping (who is connected to whom?)
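To make the four kinds of context concrete, here is a minimal sketch of a graph fact that carries a relationship, a validity window, an optional jurisdiction, and a source document. The `GraphFact` class, its fields, and the file name are hypothetical and only illustrate the idea; the actual Knowledge Graph schema may differ.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class GraphFact:
    """Hypothetical fact record; not the real Knowledge Graph schema."""
    subject: str
    relation: str                         # relationship mapping
    obj: str
    source_document: str                  # provenance
    valid_from: Optional[date] = None     # temporal context
    valid_to: Optional[date] = None
    jurisdiction: Optional[str] = None    # geographic context


def facts_true_on(facts, day):
    """Return facts whose validity window contains `day`.

    A missing start or end date is treated as unbounded.
    """
    return [
        f for f in facts
        if (f.valid_from is None or f.valid_from <= day)
        and (f.valid_to is None or day <= f.valid_to)
    ]


facts = [
    GraphFact("Acme Corp", "contracts_with", "Vendor Inc",
              source_document="service-agreement.pdf",
              valid_from=date(2026, 1, 1), valid_to=date(2026, 12, 31)),
]
active = facts_true_on(facts, date(2026, 6, 1))
print([f.source_document for f in active])  # ['service-agreement.pdf']
```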
AI-Enhanced OCR
Starter+
Extract text from scanned documents and images with AI accuracy enhancement:
- 95%+ accuracy, compared with 80-85% for traditional OCR
- Context-aware error correction
- Supports scanned PDFs and photos in JPEG, PNG, TIFF, and HEIC formats
- Structure preservation during extraction
Well suited to mobile document capture and legacy document digitization.
Document Summarization
Starter+
Generate summaries tailored to document type:
Quick Summary (1 credit)
- 2-3 sentence overview
- Key entities and amounts
- Primary purpose or action items
Full Summary (2 credits)
- Comprehensive summary with sections
- Key points and themes
- Important deadlines or obligations
- Context-aware formatting
Example Summary:
Service agreement between Acme Corp and Vendor Inc for software development services. Contract term: Jan 1 - Dec 31, 2026. Total value: $150,000. Payment terms: Monthly invoicing with Net-30 terms.
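The credit costs above make it easy to estimate usage before a large run. The helper below is a small sketch of that arithmetic; the function name and the `"quick"`/`"full"` keys are assumptions chosen for this example.

```python
# Credits per summary, taken from the costs listed above.
SUMMARY_CREDITS = {"quick": 1, "full": 2}


def summary_cost(counts: dict[str, int]) -> int:
    """Total credits for a mix of summary types, e.g. {"quick": 40, "full": 10}."""
    total = 0
    for kind, n in counts.items():
        if kind not in SUMMARY_CREDITS:
            raise ValueError(f"unknown summary type: {kind!r}")
        if n < 0:
            raise ValueError("counts must be non-negative")
        total += SUMMARY_CREDITS[kind] * n
    return total


print(summary_cost({"quick": 40, "full": 10}))  # 60
```

Forty quick summaries and ten full summaries cost 40 × 1 + 10 × 2 = 60 credits.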
Multi-Document Analysis
Pro+
Analyze multiple documents simultaneously:
Document Comparison
- Side-by-side comparison of 2-10 documents
- Identify similarities and differences
- Spot inconsistencies or contradictions
- Version tracking for contracts
Multi-Document Q&A
- Ask questions across document collections
- Synthesize answers from various sources
- Source attribution for each finding
- Detect conflicts or discrepancies
Use Cases:
- Compare vendor proposals to find best pricing
- Analyze contract versions to track changes
- Cross-reference financial statements
- Research questions across multiple reports
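As a rough illustration of version tracking, the sketch below diffs every pair of documents line by line using Python's standard `difflib` module. This is a plain text diff, not the AI-based comparison Archivus performs. The function name and sample contract text are invented for the example, and only the 2-10 document range comes from the feature list above.

```python
import difflib
from itertools import combinations

MIN_DOCS, MAX_DOCS = 2, 10  # range stated for document comparison


def compare_versions(docs):
    """Return changed lines for every pair of documents.

    `docs` maps a document name to its extracted text.
    """
    if not MIN_DOCS <= len(docs) <= MAX_DOCS:
        raise ValueError(
            f"expected {MIN_DOCS}-{MAX_DOCS} documents, got {len(docs)}"
        )
    changes = {}
    for (name_a, text_a), (name_b, text_b) in combinations(docs.items(), 2):
        diff = difflib.ndiff(text_a.splitlines(), text_b.splitlines())
        # Keep removed ("- ") and added ("+ ") lines; drop context and hint lines.
        changes[(name_a, name_b)] = [
            line for line in diff if line.startswith(("- ", "+ "))
        ]
    return changes


# Invented sample text for illustration.
versions = {
    "contract-v1": "Term: 12 months\nPayment: Net-30",
    "contract-v2": "Term: 12 months\nPayment: Net-45",
}
changed = compare_versions(versions)[("contract-v1", "contract-v2")]
for line in changed:
    print(line)
```

A textual diff only catches wording changes. Spotting contradictions between differently worded clauses needs semantic comparison, which is where the AI analysis comes in.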
Chat with Documents
Pro+
Natural language question answering with citations:
- Multi-turn conversations with context
- Citations to specific document sections
- Hallucination prevention with confidence indicators
- Answers grounded in document content
Example Conversation:
You: What is the refund policy?
AI: According to section 4.2, the refund policy allows for full
refunds within 30 days of purchase if the product is returned
in original condition. [Section 4.2, Page 8]
You: Are there any exceptions?
AI: Yes, section 4.3 notes that custom items are not eligible
for refunds. [Section 4.3, Page 8]
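If you post-process answers yourself, citations in the bracketed form shown in the conversation above can be pulled out with a regular expression. This sketch assumes citations always look like `[Section 4.2, Page 8]`; the real response format may be structured differently, so treat the pattern and the `Citation` type as assumptions.

```python
import re
from typing import NamedTuple


class Citation(NamedTuple):
    section: str
    page: int


# Matches the "[Section 4.2, Page 8]" form used in the example conversation.
CITATION_RE = re.compile(
    r"\[Section\s+(?P<section>\d+(?:\.\d+)*),\s*Page\s+(?P<page>\d+)\]"
)


def extract_citations(answer: str) -> list[Citation]:
    """Return every section/page citation found in an answer."""
    return [
        Citation(m["section"], int(m["page"]))
        for m in CITATION_RE.finditer(answer)
    ]


answer = (
    "Yes, section 4.3 notes that custom items are not eligible "
    "for refunds. [Section 4.3, Page 8]"
)
print(extract_citations(answer))  # [Citation(section='4.3', page=8)]
```

An answer that yields no citations at all is a reasonable candidate for a manual check.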
Batch Processing
Team+
Process large document volumes efficiently:
- Parallel processing for batch uploads
- 95% of documents processed in under 60 seconds
- Background job queues prevent UI blocking
- Automatic retry logic for failed operations
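The combination of parallel processing and automatic retries can be sketched with Python's standard `concurrent.futures` module. This is a generic client-side illustration of the pattern, not Archivus's internal job queue. The function names, worker count, attempt limit, and backoff delays are all assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def with_retry(func, arg, attempts=3, base_delay=0.1):
    """Call func(arg), retrying with exponential backoff on any exception."""
    for attempt in range(1, attempts + 1):
        try:
            return func(arg)
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** (attempt - 1))


def process_batch(paths, process_one, max_workers=4):
    """Process documents in parallel.

    Returns (results, failures), both keyed by path. One failing
    document does not stop the rest of the batch.
    """
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            path: pool.submit(with_retry, process_one, path) for path in paths
        }
        for path, future in futures.items():
            try:
                results[path] = future.result()
            except Exception as exc:
                failures[path] = exc
    return results, failures
```

Here `process_one` stands in for whatever handles a single document, such as an upload call. Collecting failures separately lets the caller report or resubmit them without losing the successful results.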
Document Welcome Messages
Pro+
The first analysis of a document generates a personalized welcome message with:
- Key insights preview
- Suggested questions to ask
- Quick action recommendations
- 1 credit per document (included in processing)
Supported File Types
Documents
- PDF (native and scanned)
- Microsoft Office (Word, Excel, PowerPoint)
- Google Docs exports
- Plain text (TXT, Markdown, CSV)
- Rich text (RTF)
Images with OCR
- JPEG, PNG, TIFF, HEIC
- Scanned documents
- Photos of documents
Archive Formats
- ZIP (extract and process contents)
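Before uploading an archive, it can help to check which of its members fall under the supported types. The sketch below does this with Python's standard `zipfile` module. The extension set is an assumption derived from the formats listed above (for example, `.docx` for Word), and the sample file names are invented. Archivus may decide what to process by other means.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Assumed extensions for the formats listed above.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".xlsx", ".pptx", ".txt", ".md", ".csv", ".rtf",
    ".jpg", ".jpeg", ".png", ".tif", ".tiff", ".heic",
}


def supported_members(archive) -> list[str]:
    """List archive members whose extension is in SUPPORTED_EXTENSIONS.

    `archive` is a path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        return [
            info.filename
            for info in zf.infolist()
            if not info.is_dir()
            and PurePosixPath(info.filename).suffix.lower() in SUPPORTED_EXTENSIONS
        ]


# Build a small in-memory archive to demonstrate.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("contracts/msa.pdf", b"%PDF-")
    zf.writestr("notes.md", "# notes")
    zf.writestr("setup.exe", b"")
buffer.seek(0)
print(supported_members(buffer))  # ['contracts/msa.pdf', 'notes.md']
```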
File Size Limits
| Tier | Single File | Total Storage |
|---|---|---|
| Free | 50 MB | 1 GB |
| Starter | 100 MB | 10 GB |
| Pro | 200 MB | 50 GB |
| Team | 500 MB | 250 GB |
| Enterprise | 1 GB | Unlimited |
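A client can check these limits before starting an upload. The sketch below encodes the table as a lookup and tests both the single-file and total-storage limits. It assumes binary units (1 MB = 1024² bytes), which the table does not specify, and the function name is made up for this example.

```python
MB = 1024 ** 2  # assumption: binary units; the table does not say
GB = 1024 ** 3

# tier -> (single-file limit, total storage limit) in bytes; None = unlimited
TIER_LIMITS = {
    "free": (50 * MB, 1 * GB),
    "starter": (100 * MB, 10 * GB),
    "pro": (200 * MB, 50 * GB),
    "team": (500 * MB, 250 * GB),
    "enterprise": (1 * GB, None),
}


def can_upload(tier: str, file_size: int, used_storage: int) -> bool:
    """Return True if a file fits both limits for the given tier."""
    max_file, max_total = TIER_LIMITS[tier.lower()]
    if file_size > max_file:
        return False
    return max_total is None or used_storage + file_size <= max_total


print(can_upload("Free", 60 * MB, 0))     # False: exceeds the 50 MB file limit
print(can_upload("Starter", 60 * MB, 0))  # True
```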
Performance
Processing Speed:
- 95% of documents processed in under 60 seconds
- Parallel processing for batch uploads
- Real-time status updates via WebSocket
Accuracy:
- 95%+ classification accuracy
- 95%+ OCR accuracy (vs. 80-85% for traditional OCR)
- Entity extraction validated against Knowledge Graph
Integration Points
Document intelligence integrates throughout the platform:
- Search: Extracted entities power semantic search
- Rules Engine: Classification triggers automation
- Workflows: Documents flow through DAG pipelines
- Knowledge Graph: Entities populate the graph
- Analytics: Document metrics for reporting
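A DAG pipeline means each stage runs only after the stages it depends on. The sketch below shows the idea with Python's standard `graphlib` module. The stage names and their dependencies are hypothetical and do not describe Archivus's actual workflow definitions.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each stage maps to the set of stages it depends on.
pipeline = {
    "ocr": set(),
    "classify": {"ocr"},
    "extract_entities": {"ocr"},
    "summarize": {"classify"},
    "update_graph": {"extract_entities"},
}

# static_order() yields stages so that dependencies always come first.
order = list(TopologicalSorter(pipeline).static_order())
print(order)
```

Several valid orders exist for this graph. The only guarantee is that a stage never appears before its dependencies, and `graphlib` raises `CycleError` if the graph contains a cycle.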
Getting Started
1. Upload documents via the web UI, API, or integrations
2. Let automatic processing extract entities and metadata
3. Review results in the document details panel
4. Query intelligence using search or the AI assistant
5. Build automation with rules and workflows