Bring Your Own AI Provider

Use Your Own AI Infrastructure for Document Intelligence (Enterprise)


Overview

Enterprise customers can configure Archivus to use their own AI provider accounts. This enables cost control, compliance with data processing requirements, and integration with existing AI infrastructure agreements.


Supported Providers

Provider       Models                                             Use Case
Anthropic      Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku   Default; best document understanding
OpenAI         GPT-4 Turbo, GPT-4o, GPT-3.5 Turbo                 Alternative; cost optimization
Azure OpenAI   GPT-4, GPT-3.5 (Azure-hosted)                      Enterprise compliance, data residency

Configuration

Anthropic (Claude)

PUT /api/v1/admin/ai/provider
Content-Type: application/json
Authorization: Bearer YOUR_ADMIN_TOKEN

{
  "provider": "anthropic",
  "config": {
    "api_key": "sk-ant-...",
    "default_model": "claude-3-5-sonnet-20241022",
    "fallback_model": "claude-3-haiku-20240307",
    "max_tokens": 4096,
    "rate_limit_rpm": 1000
  }
}

Supported Models:

  • claude-3-5-sonnet-20241022 - Best quality/speed balance
  • claude-3-opus-20240229 - Highest quality
  • claude-3-haiku-20240307 - Fastest, most cost-effective
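
The Anthropic configuration above can be assembled and sanity-checked in an admin script before it is sent. A minimal sketch in Python (the build_provider_config helper is illustrative, not part of any Archivus SDK):

```python
def build_provider_config(api_key,
                          default_model="claude-3-5-sonnet-20241022",
                          fallback_model="claude-3-haiku-20240307",
                          max_tokens=4096,
                          rate_limit_rpm=1000):
    """Build the JSON body for PUT /api/v1/admin/ai/provider (Anthropic)."""
    # Anthropic keys use the sk-ant- prefix; catch obvious mistakes early.
    if not api_key.startswith("sk-ant-"):
        raise ValueError("Anthropic API keys start with 'sk-ant-'")
    return {
        "provider": "anthropic",
        "config": {
            "api_key": api_key,
            "default_model": default_model,
            "fallback_model": fallback_model,
            "max_tokens": max_tokens,
            "rate_limit_rpm": rate_limit_rpm,
        },
    }
```

Send the returned dict as the request body with your HTTP client of choice, using the admin bearer token shown above.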

OpenAI

PUT /api/v1/admin/ai/provider
{
  "provider": "openai",
  "config": {
    "api_key": "sk-...",
    "organization_id": "org-...",
    "default_model": "gpt-4-turbo-preview",
    "fallback_model": "gpt-3.5-turbo",
    "max_tokens": 4096
  }
}

Supported Models:

  • gpt-4-turbo-preview - Latest GPT-4
  • gpt-4o - Optimized GPT-4
  • gpt-3.5-turbo - Cost-effective option

Azure OpenAI

PUT /api/v1/admin/ai/provider
{
  "provider": "azure_openai",
  "config": {
    "endpoint": "https://your-resource.openai.azure.com",
    "api_key": "...",
    "api_version": "2024-02-15-preview",
    "deployment_name": "gpt-4-deployment",
    "fallback_deployment": "gpt-35-turbo-deployment"
  }
}

Requirements:

  • Azure OpenAI Service resource
  • Model deployments created
  • API access enabled
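
Unlike the other providers, Azure OpenAI is addressed by deployment name rather than model name, and every request carries an api-version query parameter. As an illustration of how the endpoint, deployment_name, and api_version fields combine (this follows the standard Azure OpenAI URL scheme; it is not an Archivus API):

```python
def azure_chat_url(endpoint, deployment_name, api_version):
    """Build the Azure OpenAI chat-completions URL for a deployment."""
    return (f"{endpoint.rstrip('/')}/openai/deployments/{deployment_name}"
            f"/chat/completions?api-version={api_version}")
```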

Model Routing

Configure intelligent model selection based on task type:

PUT /api/v1/admin/ai/routing
{
  "routing_rules": [
    {
      "task_type": "document_summary",
      "model": "claude-3-haiku-20240307",
      "reason": "Fast, cost-effective for summaries"
    },
    {
      "task_type": "document_analysis",
      "model": "claude-3-5-sonnet-20241022",
      "reason": "Best quality for detailed analysis"
    },
    {
      "task_type": "document_comparison",
      "model": "claude-3-5-sonnet-20241022",
      "reason": "Complex reasoning required"
    },
    {
      "task_type": "entity_extraction",
      "model": "claude-3-haiku-20240307",
      "reason": "Structured output, fast"
    },
    {
      "task_type": "dag_workflow",
      "model": "claude-3-5-sonnet-20241022",
      "reason": "Multi-step reasoning"
    }
  ]
}
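
Conceptually, routing is a lookup from task type to model, with unmatched task types falling back to the provider's default model. A minimal sketch of that resolution logic (the fallback behavior is an assumption for illustration, not documented API semantics):

```python
ROUTING_RULES = {
    "document_summary": "claude-3-haiku-20240307",
    "document_analysis": "claude-3-5-sonnet-20241022",
    "document_comparison": "claude-3-5-sonnet-20241022",
    "entity_extraction": "claude-3-haiku-20240307",
    "dag_workflow": "claude-3-5-sonnet-20241022",
}

DEFAULT_MODEL = "claude-3-5-sonnet-20241022"  # the provider's default_model

def resolve_model(task_type):
    """Return the routed model, or the default when no rule matches."""
    return ROUTING_RULES.get(task_type, DEFAULT_MODEL)
```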

Task Types

Task Type             Description            Recommended Model
document_summary      Quick summaries        Haiku / GPT-3.5
document_analysis     Deep analysis          Sonnet / GPT-4
document_comparison   Multi-doc comparison   Sonnet / GPT-4
entity_extraction     Extract entities       Haiku / GPT-3.5
classification        Categorization         Haiku / GPT-3.5
qa_simple             Basic Q&A              Haiku / GPT-3.5
qa_complex            Complex reasoning      Sonnet / GPT-4
dag_workflow          Workflow nodes         Sonnet / GPT-4
research              Web research           Sonnet / GPT-4

Cost Management

Usage Tracking

Monitor AI costs in real time:

GET /api/v1/admin/ai/usage?period=month

Response:
{
  "period": "2026-01",
  "total_tokens": 15420000,
  "total_requests": 28500,
  "cost_estimate_usd": 234.50,
  "by_model": [
    {
      "model": "claude-3-5-sonnet-20241022",
      "input_tokens": 10000000,
      "output_tokens": 2500000,
      "requests": 15000,
      "cost_usd": 187.50
    },
    {
      "model": "claude-3-haiku-20240307",
      "input_tokens": 2500000,
      "output_tokens": 420000,
      "requests": 13500,
      "cost_usd": 47.00
    }
  ],
  "by_task_type": [
    {"task": "document_analysis", "requests": 8000, "cost_usd": 95.00},
    {"task": "document_summary", "requests": 12000, "cost_usd": 28.00}
  ]
}
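
In the example above, the per-model rows sum to the top-level totals, which makes consistency checks easy in a monitoring script. A sketch using the payload shown:

```python
usage = {
    "total_tokens": 15_420_000,
    "cost_estimate_usd": 234.50,
    "by_model": [
        {"model": "claude-3-5-sonnet-20241022", "input_tokens": 10_000_000,
         "output_tokens": 2_500_000, "requests": 15_000, "cost_usd": 187.50},
        {"model": "claude-3-haiku-20240307", "input_tokens": 2_500_000,
         "output_tokens": 420_000, "requests": 13_500, "cost_usd": 47.00},
    ],
}

def model_cost_total(usage):
    """Sum the per-model cost breakdown."""
    return round(sum(m["cost_usd"] for m in usage["by_model"]), 2)

def token_total(usage):
    """Sum input and output tokens across all models."""
    return sum(m["input_tokens"] + m["output_tokens"] for m in usage["by_model"])
```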

Cost Optimization

Automatic Optimization:

  • Prompt caching for repeated content (70-90% savings)
  • Context windowing for long conversations (60-80% savings)
  • Smart model routing based on task complexity

Manual Controls:

  • Set per-model token limits
  • Configure cost alerts
  • Enable/disable specific models

Failover Configuration

Configure automatic failover for high availability:

PUT /api/v1/admin/ai/failover
{
  "enabled": true,
  "primary_provider": "anthropic",
  "failover_providers": [
    {
      "provider": "openai",
      "trigger_on": ["rate_limit", "service_unavailable"],
      "cooldown_minutes": 5
    },
    {
      "provider": "azure_openai",
      "trigger_on": ["service_unavailable"],
      "cooldown_minutes": 10
    }
  ],
  "circuit_breaker": {
    "failure_threshold": 5,
    "reset_timeout_seconds": 60
  }
}

Failover Triggers:

  • rate_limit - Provider rate limit exceeded
  • service_unavailable - Provider API down
  • timeout - Request timeout
  • error_rate - High error rate detected
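
The circuit_breaker settings translate to a simple state machine: after failure_threshold consecutive failures the provider is skipped, and after reset_timeout_seconds a single probe request is allowed through again. A minimal illustration of that pattern (not Archivus's actual implementation):

```python
import time

class CircuitBreaker:
    """Open after N consecutive failures; allow a probe after the reset timeout."""

    def __init__(self, failure_threshold=5, reset_timeout_seconds=60.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout_seconds
        self.failures = 0
        self.opened_at = None  # monotonic timestamp when the circuit opened

    def allow_request(self, now=None):
        now = time.monotonic() if now is None else now
        if self.opened_at is None:
            return True
        if now - self.opened_at >= self.reset_timeout:
            # Half-open: reset and let one request probe the provider.
            self.opened_at = None
            self.failures = 0
            return True
        return False  # circuit open: route to a failover provider instead

    def record_failure(self, now=None):
        now = time.monotonic() if now is None else now
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now  # open the circuit

    def record_success(self):
        self.failures = 0
        self.opened_at = None
```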

Security

Credential Management

  • All API keys encrypted with AES-256-GCM
  • Keys stored separately from application data
  • Automatic key rotation support
  • Audit logging for key access

Data Privacy

Your AI provider configuration ensures:

  • Direct API Calls - Requests go directly from Archivus to your provider
  • No Data Logging - Archivus doesn't log prompts or responses
  • Your Agreement - AI usage is governed by your own provider agreement
  • Data Residency - Use Azure OpenAI to keep processing in a specific region

Embeddings Configuration

Configure embedding models for semantic search:

PUT /api/v1/admin/ai/embeddings
{
  "provider": "openai",
  "config": {
    "api_key": "sk-...",
    "model": "text-embedding-3-small",
    "dimensions": 1536
  }
}

Supported Embedding Models:

  • text-embedding-3-small (OpenAI) - Default, cost-effective
  • text-embedding-3-large (OpenAI) - Higher quality
  • text-embedding-ada-002 (OpenAI) - Legacy
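
Semantic search works by comparing query and document embeddings, typically with cosine similarity. A self-contained sketch (the short vectors stand in for the 1536-dimensional output of text-embedding-3-small):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```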

Validation

Test Provider Configuration

POST /api/v1/admin/ai/provider/validate
{
  "provider": "anthropic",
  "config": {
    "api_key": "sk-ant-..."
  }
}

Response:
{
  "valid": true,
  "provider": "anthropic",
  "available_models": [
    "claude-3-5-sonnet-20241022",
    "claude-3-opus-20240229",
    "claude-3-haiku-20240307"
  ],
  "rate_limits": {
    "requests_per_minute": 1000,
    "tokens_per_minute": 100000
  }
}

Tier Availability

Feature                   Team      Enterprise
Archivus-managed AI       Default   Available
BYOB Anthropic            -         Available
BYOB OpenAI               -         Available
BYOB Azure OpenAI         -         Available
Custom Model Routing      -         Available
Multi-Provider Failover   -         Available
Usage Analytics           Basic     Advanced

API Reference

AI Provider Endpoints

Endpoint                      Method   Description
/admin/ai/provider            GET      Current provider config
/admin/ai/provider            PUT      Configure AI provider
/admin/ai/provider/validate   POST     Test provider config
/admin/ai/routing             PUT      Configure model routing
/admin/ai/failover            PUT      Configure failover
/admin/ai/usage               GET      Usage statistics
/admin/ai/embeddings          PUT      Configure embeddings


Need help configuring BYOB AI? Contact Enterprise Support →