Proactive Research Engine
Autonomous Research with Source of Truth Grounding (Pro+)
Overview
The Proactive Research Engine extends Archie’s capabilities beyond your document library. Research the web, validate claims against external sources, and ground findings in your verified Source of Truth documents.
Core Capabilities
Web Research
Search and synthesize information from the web:
User: "Research current market rates for cloud storage services"
Archie:
1. Searching Perplexity Sonar for "cloud storage pricing 2026"
2. Found 12 relevant sources
3. Synthesizing findings...
Results:
- AWS S3: $0.023/GB/month (Standard)
- Azure Blob: $0.0184/GB/month (Hot tier)
- Google Cloud: $0.020/GB/month (Standard)
- Wasabi: $6.99/TB/month (no egress fees)
Sources:
- aws.amazon.com/s3/pricing
- azure.microsoft.com/pricing/details/storage
- cloud.google.com/storage/pricing
Power Mode (Pro+)
Multi-provider parallel search for comprehensive research:
┌─────────────────────────────────────────────────────────────┐
│                       Research Query                        │
│     "Compare enterprise document management solutions"      │
└──────────────────────────┬──────────────────────────────────┘
                           │
              ┌────────────┼────────────┐
              │            │            │
        ┌─────▼─────┐ ┌────▼────┐ ┌─────▼─────┐
        │ Perplexity│ │  Brave  │ │   Bing    │
        │   Sonar   │ │ Search  │ │   News    │
        └─────┬─────┘ └────┬────┘ └─────┬─────┘
              │            │            │
        ┌─────▼────────────▼────────────▼─────┐
        │         Result Aggregation          │
        │ Deduplication │ Ranking │ Synthesis │
        └──────────────────┬──────────────────┘
                           │
                     ┌─────▼─────┐
                     │ Grounding │
                     │  Against  │
                     │    SoT    │
                     └───────────┘
Power Mode Providers:
- Perplexity Sonar - AI-powered search synthesis
- Brave Search - Privacy-focused web search
- Bing News - News and current events
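The fan-out and aggregation flow in the diagram can be sketched in Python as follows. This is a minimal illustration, not Archie's implementation: the three provider functions are stubs that return canned results, and the `url` and `score` fields, the deduplicate-by-URL rule, and the rank-by-score rule are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical provider stubs; a real integration would call each search API.
def perplexity_sonar(query):
    return [{"url": "https://example.com/a", "score": 0.9},
            {"url": "https://example.com/b", "score": 0.7}]

def brave_search(query):
    return [{"url": "https://example.com/b", "score": 0.8},
            {"url": "https://example.com/c", "score": 0.6}]

def bing_news(query):
    return [{"url": "https://example.com/a", "score": 0.5}]

PROVIDERS = [perplexity_sonar, brave_search, bing_news]

def power_mode_search(query, providers=PROVIDERS):
    """Query all providers in parallel, then deduplicate and rank the hits."""
    with ThreadPoolExecutor(max_workers=len(providers)) as pool:
        batches = list(pool.map(lambda provider: provider(query), providers))

    # Deduplicate by URL, keeping the highest-scoring copy of each result.
    best = {}
    for batch in batches:
        for hit in batch:
            seen = best.get(hit["url"])
            if seen is None or hit["score"] > seen["score"]:
                best[hit["url"]] = hit

    # Rank by score, highest first (assumed ranking rule).
    return sorted(best.values(), key=lambda hit: hit["score"], reverse=True)

results = power_mode_search("Compare enterprise document management solutions")
for hit in results:
    print(hit["url"], hit["score"])
```

The grounding step is left out here because it depends on your Source of Truth collections.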
Source of Truth Grounding
Validate research findings against your verified documents:
User: "Research competitor pricing and compare to our current contracts"
Archie:
1. Researching competitor pricing...
2. Found external data on 5 competitors
3. Grounding against Source of Truth collection...
Findings:
- Market average: $50/user/month
- Our current contract (Contract-2024.pdf): $45/user/month
- Our rate is 10% below the market average
⚠️ Discrepancy found:
- External source claims competitor X offers $35/user
- Source of Truth (Vendor-Analysis.pdf) shows $42/user
- Recommend verification before decisions
Confidence: High (0.89) - Grounded in 3 Source of Truth documents
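The arithmetic behind the findings in this transcript can be reproduced with a short Python sketch. The figures come from the example; the function names and the 5% tolerance for flagging a discrepancy are assumptions for illustration, not documented Archie behavior.

```python
def percent_below(reference, value):
    """Return how far `value` sits below `reference`, as a percentage."""
    return (reference - value) / reference * 100

def is_discrepancy(external, sot, tolerance=0.05):
    """Flag an external figure that differs from the Source of Truth figure
    by more than `tolerance`, measured relative to the SoT figure."""
    return abs(external - sot) / sot > tolerance

market_average = 50.0  # $/user/month, from web research
our_contract = 45.0    # $/user/month, from Contract-2024.pdf

print(f"Contract is {percent_below(market_average, our_contract):.0f}% below market")

# Competitor X: external source says $35, Vendor-Analysis.pdf says $42.
print("Discrepancy:", is_discrepancy(external=35.0, sot=42.0))
```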
Research Tools
propose_research
Initiate a research task:
POST /api/v1/archie/chat
{
  "message": "Research current trends in AI document processing",
  "tools": ["propose_research"]
}
Response:
{
  "research_id": "res_123",
  "status": "proposed",
  "scope": {
    "query": "AI document processing trends 2026",
    "sources": ["web", "news"],
    "depth": "comprehensive"
  },
  "estimated_credits": 5
}
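For readers calling the endpoint from code, the request above could be assembled in Python roughly like this. The host `archie.example.com` is a placeholder and bearer-token authentication is an assumption; only the path and body come from the example. The sketch builds the request without sending it.

```python
import json
from urllib.request import Request

BASE_URL = "https://archie.example.com"  # hypothetical host, replace with yours

def build_propose_research_request(message, api_token):
    """Build (but do not send) the POST /api/v1/archie/chat request."""
    payload = {"message": message, "tools": ["propose_research"]}
    return Request(
        url=f"{BASE_URL}/api/v1/archie/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Bearer-token auth is an assumption; check your deployment.
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )

req = build_propose_research_request(
    "Research current trends in AI document processing", api_token="YOUR_TOKEN"
)
print(req.get_method(), req.full_url)
# Send with urllib.request.urlopen(req) once the host and token are real.
```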
search_web
Direct web search:
{
  "tool": "search_web",
  "parameters": {
    "query": "enterprise OCR accuracy benchmarks",
    "sources": ["perplexity"],
    "max_results": 10
  }
}
ground_finding
Validate against Source of Truth:
{
  "tool": "ground_finding",
  "parameters": {
    "finding": "Market leader offers 99.5% OCR accuracy",
    "source_of_truth_collection": "verified_benchmarks"
  }
}
Response:
{
  "grounding_status": "partial_match",
  "confidence": 0.75,
  "supporting_documents": [
    {
      "document": "OCR-Benchmarks-2025.pdf",
      "relevant_excerpt": "Leading providers achieve 98-99% accuracy...",
      "match_score": 0.82
    }
  ],
  "discrepancies": [
    "External claim: 99.5% vs Document claim: 99% maximum"
  ]
}
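A client might act on this response along the following lines. This is a sketch under stated assumptions: the `review_grounding` helper and its 0.8 confidence threshold are hypothetical, and only the field names are taken from the response above.

```python
def review_grounding(response, min_confidence=0.8):
    """Summarize a ground_finding response and decide whether a person
    should review it. The threshold is an illustrative assumption."""
    documents = response.get("supporting_documents", [])
    best = max(documents, key=lambda doc: doc["match_score"], default=None)
    discrepancies = response.get("discrepancies", [])
    return {
        "status": response["grounding_status"],
        "best_document": best["document"] if best else None,
        "needs_review": bool(discrepancies) or response["confidence"] < min_confidence,
    }

response = {
    "grounding_status": "partial_match",
    "confidence": 0.75,
    "supporting_documents": [
        {
            "document": "OCR-Benchmarks-2025.pdf",
            "relevant_excerpt": "Leading providers achieve 98-99% accuracy...",
            "match_score": 0.82,
        }
    ],
    "discrepancies": ["External claim: 99.5% vs Document claim: 99% maximum"],
}

verdict = review_grounding(response)
print(verdict)
```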
validate_claim
Fact-check external claims:
{
  "tool": "validate_claim",
  "parameters": {
    "claim": "Company X processes 1 million documents per day",
    "validation_sources": ["web", "source_of_truth"]
  }
}
Response:
{
  "claim_status": "unverified",
  "web_evidence": {
    "supporting": 2,
    "contradicting": 1,
    "neutral": 3
  },
  "source_of_truth_evidence": {
    "found": false,
    "message": "No relevant documents in Source of Truth"
  },
  "recommendation": "Verify directly with Company X"
}
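One way to read the `web_evidence` counts is as a support ratio among the sources that take a position. The heuristic below is an assumption for illustration; how Archie derives `claim_status` from these counts is not described here.

```python
def support_ratio(web_evidence):
    """Share of decisive web sources that support the claim.

    Neutral sources are ignored. Returns None when no source takes a side.
    """
    decisive = web_evidence["supporting"] + web_evidence["contradicting"]
    if decisive == 0:
        return None
    return web_evidence["supporting"] / decisive

web_evidence = {"supporting": 2, "contradicting": 1, "neutral": 3}
ratio = support_ratio(web_evidence)
print(f"{ratio:.2f} of decisive sources support the claim")
```

With two supporting sources against one contradicting source and nothing in the Source of Truth, the evidence is mixed, which fits the `unverified` status in the example.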
Research Task Management
Task Lifecycle
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Proposed   │────▶│   Active    │────▶│  Completed  │
└─────────────┘     └──────┬──────┘     └─────────────┘
                           │
                    ┌──────▼──────┐
                    │   Paused    │
                    │ (User input │
                    │   needed)   │
                    └─────────────┘
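The lifecycle can be modeled as a small state machine. The sketch below is illustrative only. The `paused` and `completed` status strings are assumed to be lowercase like `proposed` and `active`, and the Paused to Active transition is an assumption, since the diagram does not show how a paused task resumes. Cancellation is left out because the resulting state is not documented.

```python
from enum import Enum

class TaskStatus(Enum):
    PROPOSED = "proposed"
    ACTIVE = "active"
    PAUSED = "paused"        # assumed string value
    COMPLETED = "completed"  # assumed string value

# Transitions drawn in the diagram, plus PAUSED -> ACTIVE (an assumption).
ALLOWED = {
    TaskStatus.PROPOSED: {TaskStatus.ACTIVE},
    TaskStatus.ACTIVE: {TaskStatus.PAUSED, TaskStatus.COMPLETED},
    TaskStatus.PAUSED: {TaskStatus.ACTIVE},
    TaskStatus.COMPLETED: set(),
}

def transition(current, target):
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target

state = TaskStatus.PROPOSED
state = transition(state, TaskStatus.ACTIVE)
state = transition(state, TaskStatus.PAUSED)  # waiting for user input
state = transition(state, TaskStatus.ACTIVE)
state = transition(state, TaskStatus.COMPLETED)
print(state.value)
```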
API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /research/tasks | GET | List research tasks |
| /research/tasks | POST | Create research task |
| /research/tasks/{id} | GET | Get task details |
| /research/tasks/{id}/status | GET | Get task status |
| /research/tasks/{id}/results | GET | Get research results |
| /research/tasks/{id}/cancel | POST | Cancel task |
Research Task Status
GET /api/v1/research/tasks/res_123/status
Response:
{
  "task_id": "res_123",
  "status": "active",
  "progress": {
    "phase": "grounding",
    "percent_complete": 75,
    "sources_searched": 8,
    "documents_grounded": 3
  },
  "preliminary_findings": 5,
  "estimated_completion": "2026-01-18T10:35:00Z"
}
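A client that waits for a task to finish could poll this endpoint. The sketch below keeps the HTTP call out of the loop by accepting any `fetch_status` callable, so it runs here against canned responses. The `completed` and `paused` status strings, the polling interval, and the helper name are assumptions.

```python
import time

STOP_STATES = {"completed", "paused"}  # assumed status strings

def wait_for_task(fetch_status, task_id, interval=5.0, max_polls=60, sleep=time.sleep):
    """Poll until the task completes or pauses for user input.

    `fetch_status` is any callable returning the parsed JSON body of
    GET /api/v1/research/tasks/{id}/status for the given task id.
    """
    for _ in range(max_polls):
        status = fetch_status(task_id)
        if status["status"] in STOP_STATES:
            return status
        sleep(interval)
    raise TimeoutError(f"task {task_id} still running after {max_polls} polls")

# Usage with canned responses standing in for the HTTP call.
canned = iter([
    {"task_id": "res_123", "status": "active", "progress": {"percent_complete": 75}},
    {"task_id": "res_123", "status": "completed", "progress": {"percent_complete": 100}},
])
final = wait_for_task(lambda _task_id: next(canned), "res_123", sleep=lambda _s: None)
print(final["status"], final["progress"]["percent_complete"])
```

Injecting `sleep` keeps the example instant to run; in real use, leave the default and pick an interval that suits the `estimated_completion` value.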
Intelligence Reports
Research tasks automatically generate professional, publication-ready intelligence reports.
Report Modes
Standard Mode (Pro):
- Concise research brief (800-1500 words)
- Executive summary with key takeaway
- Findings organized by confidence level
- Tiered source list
Power Mode (Team+):
- Comprehensive intelligence report (2000-3500 words)
- Quick Facts metrics table
- Table of Contents with linked sections
- Detailed evidence assessment with source quality matrix
- Contradictions & limitations analysis
- Strategic implications with risk monitoring
- Prioritized recommendations with success metrics
Report Structure (Power Mode)
# [Topic]: Intelligence Report
**Report Date:** January 19, 2026
**Classification:** Research Intelligence Report
**Confidence Level:** HIGH - Multiple Tier 1 sources verified
---
## Executive Summary
[4-5 sentence summary of key findings]
> **Bottom Line:** [One powerful sentence for executives]
---
## Quick Facts
| Metric | Value |
|--------|-------|
| Total Findings | 47 |
| Verified Findings | 12 |
| Source Quality | Primarily Tier 1 |
| Confidence Level | HIGH |
---
## Table of Contents
1. Key Findings
2. Detailed Analysis
3. Evidence Assessment
4. Contradictions & Limitations
5. Strategic Implications
6. Recommendations
7. Sources & Methodology
---
[Full sections follow...]
Accessing Reports
Via UI:
- Navigate to Research tab
- Click on completed task
- View full report or download as Markdown
Via API:
GET /api/v1/research/tasks/{id}/report
Response:
{
  "task_id": "res_123",
  "topic": "AI regulations in Canada",
  "report": "# AI Regulations in Canada: Intelligence Report\n\n..."
}
Report Features
| Feature | Standard | Power Mode |
|---|---|---|
| Executive Summary | ✓ | ✓ |
| Bottom Line Callout | ✓ | ✓ |
| Quick Facts Table | - | ✓ |
| Table of Contents | - | ✓ |
| Evidence Assessment | - | ✓ |
| Source Quality Matrix | - | ✓ |
| Risk Monitoring Table | - | ✓ |
| Success Metrics | - | ✓ |
Source of Truth Collections
Marking Collections as Source of Truth
PUT /api/v1/collections/{id}
{
  "is_source_of_truth": true,
  "sot_priority": 1,
  "sot_categories": ["pricing", "contracts", "policies"]
}
Source of Truth Hierarchy
- Priority 1 - Highest authority (e.g., signed contracts)
- Priority 2 - Official documents (e.g., policies)
- Priority 3 - Reference materials (e.g., guidelines)
When grounding, higher priority sources take precedence.
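The precedence rule can be sketched as picking the match with the lowest `sot_priority` number. The document names and values below are hypothetical, and keeping the first match on a tie is an assumption, since tie-breaking is not specified.

```python
def authoritative_match(matches):
    """Return the match from the highest-authority SoT document.

    Priority 1 outranks 2, which outranks 3, so the lowest number wins.
    On a tie, the first match listed is kept (assumed behavior).
    """
    return min(matches, key=lambda match: match["sot_priority"])

# Hypothetical grounding matches that disagree on a price.
matches = [
    {"document": "pricing-guidelines.pdf", "sot_priority": 3, "value": "$48/user/month"},
    {"document": "signed-contract.pdf", "sot_priority": 1, "value": "$45/user/month"},
    {"document": "pricing-policy.pdf", "sot_priority": 2, "value": "$46/user/month"},
]

winner = authoritative_match(matches)
print(winner["document"], winner["value"])
```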
Epistemic Metadata
Research results include confidence and provenance tracking:
{
  "finding": "Cloud storage market growing 25% YoY",
  "epistemic_metadata": {
    "confidence": 0.85,
    "confidence_factors": {
      "source_reliability": 0.9,
      "recency": 0.8,
      "corroboration": 0.85
    },
    "provenance": {
      "primary_source": "Gartner Research Report 2026",
      "secondary_sources": ["AWS Blog", "Azure Documentation"],
      "grounded_in_sot": true,
      "sot_documents": ["Market-Analysis-2025.pdf"]
    },
    "limitations": [
      "Data from Q3 2025, may not reflect recent changes"
    ]
  }
}
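How the overall `confidence` is derived from `confidence_factors` is not stated. The values in this example happen to be consistent with an unweighted mean of the three factors, which the sketch below computes. Treat this as one plausible reading, not the documented formula.

```python
from statistics import fmean

def overall_confidence(confidence_factors):
    """Unweighted mean of the factor scores, rounded to two decimals.

    Assumption: the real aggregation may weight factors differently.
    """
    return round(fmean(confidence_factors.values()), 2)

confidence_factors = {
    "source_reliability": 0.9,
    "recency": 0.8,
    "corroboration": 0.85,
}

confidence = overall_confidence(confidence_factors)
print(confidence)
```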
Use Cases
Competitive Intelligence
"Research our top 3 competitors' pricing and compare to our current rates"
→ Web search for competitor pricing
→ Ground against internal pricing documents
→ Generate comparison report
Due Diligence
"Research this company's financials and validate against our assessment"
→ Search financial news and reports
→ Ground against internal due diligence docs
→ Flag discrepancies for review
Policy Compliance
"Research latest GDPR requirements and compare to our current policy"
→ Search for GDPR updates
→ Ground against current privacy policy
→ Identify gaps and recommendations
Market Research
"Research current trends in document AI and summarize for leadership"
→ Multi-source web research
→ Synthesize findings
→ Generate executive summary
Credits and Limits
| Operation | Credits | Notes |
|---|---|---|
| Basic web search | 2 | Single provider |
| Power Mode search | 5 | Multi-provider parallel |
| Ground finding | 2 | Per finding validated |
| Validate claim | 2 | Per claim checked |
| Full research task | 5-15 | Depends on scope |
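To budget a session, the per-operation costs in this table can be summed in a few lines of Python. The operation keys are made-up labels for the table rows. The "Full research task" row is left out because it is priced as a range, and the table does not say whether a full task is billed as the sum of its parts.

```python
# Credits per operation, from the table above (keys are illustrative labels).
CREDITS = {
    "basic_search": 2,
    "power_search": 5,
    "ground_finding": 2,
    "validate_claim": 2,
}

def estimate_credits(operations):
    """Sum credits for a plan given as {operation: count}."""
    return sum(CREDITS[name] * count for name, count in operations.items())

# One Power Mode search, three findings grounded, one claim validated.
plan = {"power_search": 1, "ground_finding": 3, "validate_claim": 1}
total = estimate_credits(plan)
print(total)  # 5 + 3*2 + 2 = 13
```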
Tier Limits
| Tier | Research Tasks/Day | Power Mode |
|---|---|---|
| Pro | 20 | Available |
| Team | 100 | Available |
| Enterprise | Unlimited | Available |
Best Practices
Research Quality
- Be Specific - Narrow queries yield better results
- Use Source of Truth - Ground findings in verified docs
- Check Confidence - Review epistemic metadata
- Verify Discrepancies - Don’t auto-trust external sources
Cost Optimization
- Start Narrow - Expand scope if needed
- Use Basic Search First - Upgrade to Power Mode if insufficient
- Batch Related Queries - Combine related research
Source of Truth Management
- Curate Carefully - Only mark verified documents as SoT
- Set Priorities - Establish clear authority hierarchy
- Keep Current - Update SoT documents regularly
- Categorize - Enable targeted grounding
API Reference
Research Tools in Archie
| Tool | Description | Credits |
|---|---|---|
| propose_research | Start research task | 2 |
| search_web | Web search | 2-5 |
| get_research_status | Check progress | 0 |
| ground_finding | Validate vs SoT | 2 |
| validate_claim | Fact-check claim | 2 |
Research Task Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /research/tasks | GET | List tasks |
| /research/tasks | POST | Create task |
| /research/tasks/{id} | GET | Task details |
| /research/tasks/{id}/results | GET | Task results |
Ready to research? Ask Archie: “Research current market trends in [your topic]”