Bring Your Own AI Provider
Use Your Own AI Infrastructure for Document Intelligence (Enterprise)
Overview
Enterprise customers can configure Archivus to use their own AI provider accounts. This enables cost control, compliance with data processing requirements, and integration with existing AI infrastructure agreements.
Supported Providers
| Provider | Models | Use Case |
|---|---|---|
| Anthropic | Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku | Default, best document understanding |
| OpenAI | GPT-4 Turbo, GPT-4o, GPT-3.5 Turbo | Alternative, cost optimization |
| Azure OpenAI | GPT-4, GPT-3.5 (Azure-hosted) | Enterprise compliance, data residency |
Configuration
Anthropic (Claude)
```http
PUT /api/v1/admin/ai/provider
Content-Type: application/json
Authorization: Bearer YOUR_ADMIN_TOKEN

{
  "provider": "anthropic",
  "config": {
    "api_key": "sk-ant-...",
    "default_model": "claude-3-5-sonnet-20241022",
    "fallback_model": "claude-3-haiku-20240307",
    "max_tokens": 4096,
    "rate_limit_rpm": 1000
  }
}
```
Supported Models:
- `claude-3-5-sonnet-20241022` - Best quality/speed balance
- `claude-3-opus-20240229` - Highest quality
- `claude-3-haiku-20240307` - Fastest, most cost-effective
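For teams that script their admin setup, a minimal sketch of applying the configuration above is shown below. The base URL and the environment variable names are placeholders assumed for this example, not values defined by this guide:

```python
import os
import requests  # third-party HTTP client

# Placeholder deployment URL and admin token; substitute your own.
BASE_URL = os.environ.get("ARCHIVUS_URL", "https://archivus.example.com")
ADMIN_TOKEN = os.environ["ARCHIVUS_ADMIN_TOKEN"]

payload = {
    "provider": "anthropic",
    "config": {
        "api_key": os.environ["ANTHROPIC_API_KEY"],
        "default_model": "claude-3-5-sonnet-20241022",
        "fallback_model": "claude-3-haiku-20240307",
        "max_tokens": 4096,
        "rate_limit_rpm": 1000,
    },
}

resp = requests.put(
    f"{BASE_URL}/api/v1/admin/ai/provider",
    json=payload,
    headers={"Authorization": f"Bearer {ADMIN_TOKEN}"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```

The same pattern applies to the OpenAI and Azure OpenAI payloads shown in the following subsections.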
OpenAI
```http
PUT /api/v1/admin/ai/provider

{
  "provider": "openai",
  "config": {
    "api_key": "sk-...",
    "organization_id": "org-...",
    "default_model": "gpt-4-turbo-preview",
    "fallback_model": "gpt-3.5-turbo",
    "max_tokens": 4096
  }
}
```
Supported Models:
- `gpt-4-turbo-preview` - Latest GPT-4
- `gpt-4o` - Optimized GPT-4
- `gpt-3.5-turbo` - Cost-effective option
Azure OpenAI
```http
PUT /api/v1/admin/ai/provider

{
  "provider": "azure_openai",
  "config": {
    "endpoint": "https://your-resource.openai.azure.com",
    "api_key": "...",
    "api_version": "2024-02-15-preview",
    "deployment_name": "gpt-4-deployment",
    "fallback_deployment": "gpt-35-turbo-deployment"
  }
}
```
Requirements:
- Azure OpenAI Service resource
- Model deployments created
- API access enabled
Model Routing
Configure intelligent model selection based on task type:
```http
PUT /api/v1/admin/ai/routing

{
  "routing_rules": [
    {
      "task_type": "document_summary",
      "model": "claude-3-haiku-20240307",
      "reason": "Fast, cost-effective for summaries"
    },
    {
      "task_type": "document_analysis",
      "model": "claude-3-5-sonnet-20241022",
      "reason": "Best quality for detailed analysis"
    },
    {
      "task_type": "document_comparison",
      "model": "claude-3-5-sonnet-20241022",
      "reason": "Complex reasoning required"
    },
    {
      "task_type": "entity_extraction",
      "model": "claude-3-haiku-20240307",
      "reason": "Structured output, fast"
    },
    {
      "task_type": "dag_workflow",
      "model": "claude-3-5-sonnet-20241022",
      "reason": "Multi-step reasoning"
    }
  ]
}
```
Task Types
| Task Type | Description | Recommended Model |
|---|---|---|
| `document_summary` | Quick summaries | Haiku/GPT-3.5 |
| `document_analysis` | Deep analysis | Sonnet/GPT-4 |
| `document_comparison` | Multi-doc comparison | Sonnet/GPT-4 |
| `entity_extraction` | Extract entities | Haiku/GPT-3.5 |
| `classification` | Categorization | Haiku/GPT-3.5 |
| `qa_simple` | Basic Q&A | Haiku/GPT-3.5 |
| `qa_complex` | Complex reasoning | Sonnet/GPT-4 |
| `dag_workflow` | Workflow nodes | Sonnet/GPT-4 |
| `research` | Web research | Sonnet/GPT-4 |
Cost Management
Usage Tracking
Monitor AI costs in real-time:
```http
GET /api/v1/admin/ai/usage?period=month
```

Response:

```json
{
  "period": "2026-01",
  "total_tokens": 15420000,
  "total_requests": 28500,
  "cost_estimate_usd": 234.50,
  "by_model": [
    {
      "model": "claude-3-5-sonnet-20241022",
      "input_tokens": 10000000,
      "output_tokens": 2500000,
      "requests": 15000,
      "cost_usd": 187.50
    },
    {
      "model": "claude-3-haiku-20240307",
      "input_tokens": 2500000,
      "output_tokens": 420000,
      "requests": 13500,
      "cost_usd": 47.00
    }
  ],
  "by_task_type": [
    {"task": "document_analysis", "requests": 8000, "cost_usd": 95.00},
    {"task": "document_summary", "requests": 12000, "cost_usd": 28.00}
  ]
}
```
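As an illustration, a small script could poll this endpoint and warn when estimated spend crosses a budget. The budget figure, base URL, and environment variable names below are assumptions made for the sketch, not part of the API:

```python
import os
import requests

BASE_URL = os.environ.get("ARCHIVUS_URL", "https://archivus.example.com")
ADMIN_TOKEN = os.environ["ARCHIVUS_ADMIN_TOKEN"]
MONTHLY_BUDGET_USD = 500.00  # assumed budget for this example

resp = requests.get(
    f"{BASE_URL}/api/v1/admin/ai/usage",
    params={"period": "month"},
    headers={"Authorization": f"Bearer {ADMIN_TOKEN}"},
    timeout=30,
)
resp.raise_for_status()
usage = resp.json()

cost = usage["cost_estimate_usd"]
print(f"{usage['period']}: {usage['total_requests']} requests, ${cost:.2f} estimated")
if cost > MONTHLY_BUDGET_USD:
    print(f"WARNING: estimated spend exceeds budget of ${MONTHLY_BUDGET_USD:.2f}")

# Break down spend by model to spot where routing could be cheaper.
for entry in usage["by_model"]:
    print(f"  {entry['model']}: {entry['requests']} requests, ${entry['cost_usd']:.2f}")
```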
Cost Optimization
Automatic Optimization:
- Prompt caching for repeated content (70-90% savings; see the worked example below)
- Context windowing for long conversations (60-80% savings)
- Smart model routing based on task complexity
Manual Controls:
- Set per-model token limits
- Configure cost alerts
- Enable/disable specific models
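To make the caching figure concrete, here is a back-of-the-envelope calculation. The per-token price, cached fraction, and cache discount are illustrative assumptions; only the 70-90% savings range comes from the list above:

```python
# Illustrative arithmetic only; the constants below are assumed for this example.
PRICE_PER_INPUT_TOKEN_USD = 3.00 / 1_000_000   # assumed $3 per million input tokens
CACHED_FRACTION = 0.8                           # assume 80% of input tokens are repeated context
CACHE_DISCOUNT = 0.9                            # assume cached tokens cost 90% less

monthly_input_tokens = 10_000_000

baseline = monthly_input_tokens * PRICE_PER_INPUT_TOKEN_USD
with_caching = (
    monthly_input_tokens * (1 - CACHED_FRACTION) * PRICE_PER_INPUT_TOKEN_USD
    + monthly_input_tokens * CACHED_FRACTION * PRICE_PER_INPUT_TOKEN_USD * (1 - CACHE_DISCOUNT)
)

print(f"Baseline input cost: ${baseline:.2f}")
print(f"With prompt caching: ${with_caching:.2f}")
print(f"Savings:             {100 * (1 - with_caching / baseline):.0f}%")  # ~72% here
```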
Failover Configuration
Configure automatic failover for high availability:
```http
PUT /api/v1/admin/ai/failover

{
  "enabled": true,
  "primary_provider": "anthropic",
  "failover_providers": [
    {
      "provider": "openai",
      "trigger_on": ["rate_limit", "service_unavailable"],
      "cooldown_minutes": 5
    },
    {
      "provider": "azure_openai",
      "trigger_on": ["service_unavailable"],
      "cooldown_minutes": 10
    }
  ],
  "circuit_breaker": {
    "failure_threshold": 5,
    "reset_timeout_seconds": 60
  }
}
```
Failover Triggers:
- `rate_limit` - Provider rate limit exceeded
- `service_unavailable` - Provider API down
- `timeout` - Request timeout
- `error_rate` - High error rate detected
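The circuit breaker settings can be read as: after `failure_threshold` consecutive failures the provider is skipped, and traffic is retried once `reset_timeout_seconds` has elapsed. Archivus applies this server-side; the sketch below is only a generic illustration of that behavior, not Archivus code:

```python
import time

class CircuitBreaker:
    """Minimal illustration of failure_threshold / reset_timeout_seconds semantics."""

    def __init__(self, failure_threshold=5, reset_timeout_seconds=60):
        self.failure_threshold = failure_threshold
        self.reset_timeout_seconds = reset_timeout_seconds
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed (provider usable)

    def allow_request(self):
        if self.opened_at is None:
            return True
        # Re-allow traffic after the reset timeout so the provider can recover.
        if time.monotonic() - self.opened_at >= self.reset_timeout_seconds:
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record_success(self):
        self.failures = 0

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            # Open the circuit: subsequent requests fail over to the next provider.
            self.opened_at = time.monotonic()
```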
Security
Credential Management
- All API keys encrypted with AES-256-GCM
- Keys stored separately from application data
- Automatic key rotation support
- Audit logging for key access
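For reference, AES-256-GCM encryption of a credential looks roughly like the sketch below, using the widely available `cryptography` package. It illustrates the cipher named above; it is not Archivus's internal key-management code, and in practice the key would live in a KMS/HSM rather than in process memory:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# A 256-bit key; in production this is held in a KMS/HSM, separate from application data.
key = AESGCM.generate_key(bit_length=256)
aesgcm = AESGCM(key)

api_key = b"sk-ant-..."          # the credential to protect (placeholder)
nonce = os.urandom(12)           # standard 96-bit GCM nonce, unique per encryption
aad = b"provider=anthropic"      # authenticated-but-unencrypted context

ciphertext = aesgcm.encrypt(nonce, api_key, aad)
plaintext = aesgcm.decrypt(nonce, ciphertext, aad)
assert plaintext == api_key
```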
Data Privacy
Your AI provider configuration ensures:
- Direct API Calls - Requests go directly from Archivus to your configured provider
- No Data Logging - Archivus does not log prompts or responses
- Your Agreement - Data processing is governed by your own provider agreement
- Data Residency - Use Azure OpenAI to keep processing within specific regions
Embeddings Configuration
Configure embedding models for semantic search:
```http
PUT /api/v1/admin/ai/embeddings

{
  "provider": "openai",
  "config": {
    "api_key": "sk-...",
    "model": "text-embedding-3-small",
    "dimensions": 1536
  }
}
```
Supported Embedding Models:
- `text-embedding-3-small` (OpenAI) - Default, cost-effective
- `text-embedding-3-large` (OpenAI) - Higher quality
- `text-embedding-ada-002` (OpenAI) - Legacy
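Embedding dimensions must match between your stored vectors and the configured model (1536 in the example above). A quick sanity check and cosine-similarity comparison, shown here only as an illustration rather than an Archivus API, looks like this:

```python
import math

def cosine_similarity(a, b):
    """Similarity between two embedding vectors of equal length (e.g. 1536 dims)."""
    if len(a) != len(b):
        raise ValueError("embedding dimensions must match the configured model")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional vectors standing in for 1536-dimensional embeddings.
query_vec = [0.1, 0.3, 0.2, 0.9]
doc_vec = [0.2, 0.1, 0.25, 0.8]
print(f"similarity: {cosine_similarity(query_vec, doc_vec):.3f}")
```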
Validation
Test Provider Configuration
```http
POST /api/v1/admin/ai/provider/validate

{
  "provider": "anthropic",
  "config": {
    "api_key": "sk-ant-..."
  }
}
```

Response:

```json
{
  "valid": true,
  "provider": "anthropic",
  "available_models": [
    "claude-3-5-sonnet-20241022",
    "claude-3-opus-20240229",
    "claude-3-haiku-20240307"
  ],
  "rate_limits": {
    "requests_per_minute": 1000,
    "tokens_per_minute": 100000
  }
}
```
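A common pattern is to validate credentials before switching providers. A sketch of that flow, reusing the placeholder base URL and token environment variables assumed in the earlier configuration example:

```python
import os
import requests

BASE_URL = os.environ.get("ARCHIVUS_URL", "https://archivus.example.com")
HEADERS = {"Authorization": f"Bearer {os.environ['ARCHIVUS_ADMIN_TOKEN']}"}

config = {"provider": "anthropic", "config": {"api_key": os.environ["ANTHROPIC_API_KEY"]}}

# Step 1: validate the credentials without changing the active provider.
check = requests.post(
    f"{BASE_URL}/api/v1/admin/ai/provider/validate",
    json=config, headers=HEADERS, timeout=30,
)
check.raise_for_status()
result = check.json()

# Step 2: apply the configuration only if validation succeeded.
if result["valid"]:
    print("Models available:", ", ".join(result["available_models"]))
    requests.put(
        f"{BASE_URL}/api/v1/admin/ai/provider",
        json=config, headers=HEADERS, timeout=30,
    ).raise_for_status()
else:
    print("Validation failed; provider not changed.")
```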
Tier Availability
| Feature | Team | Enterprise |
|---|---|---|
| Archivus-managed AI | Default | Available |
| BYO Anthropic | - | Available |
| BYO OpenAI | - | Available |
| BYO Azure OpenAI | - | Available |
| Custom Model Routing | - | Available |
| Multi-Provider Failover | - | Available |
| Usage Analytics | Basic | Advanced |
API Reference
AI Provider Endpoints
| Endpoint | Method | Description |
|---|---|---|
| `/admin/ai/provider` | GET | Current provider config |
| `/admin/ai/provider` | PUT | Configure AI provider |
| `/admin/ai/provider/validate` | POST | Test provider config |
| `/admin/ai/routing` | PUT | Configure model routing |
| `/admin/ai/failover` | PUT | Configure failover |
| `/admin/ai/usage` | GET | Usage statistics |
| `/admin/ai/embeddings` | PUT | Configure embeddings |
Related Documentation
Need help configuring BYO AI? Contact Enterprise Support →