Google Drive Migration

Import your documents from Google Drive into Archivus with intelligent processing, metadata preservation, and optional AI-powered analysis.


Overview

The Google Drive Migration feature allows you to bulk import documents from your Google Drive account into Archivus. The migration wizard guides you through selecting folders, configuring import settings, and monitoring progress. Once imported, your documents are automatically indexed for semantic search and available for AI-powered analysis.

Availability: Teams plan and above


Quick Start

  1. Connect Google Drive - Go to Settings > Integrations and connect your Google account
  2. Start Migration - Navigate to Migrations and click “New Migration”
  3. Select Folders - Browse and select the Google Drive folders to import
  4. Review & Start - Confirm your settings and start the migration
  5. Monitor Progress - Track the import progress in real-time

Prerequisites

Before starting a migration, ensure you have:

  • Teams Plan or higher - Migration is available on Teams, Business, and Enterprise plans
  • Google Account Connected - Your Google Drive must be linked via OAuth in Settings > Integrations
  • Sufficient Storage - Verify you have enough storage quota for the documents you plan to import

Step-by-Step Guide

Step 1: Connect Google Drive

If you haven’t already connected your Google account:

  1. Go to Settings > Integrations
  2. Find Google Drive in the list of available integrations
  3. Click Connect
  4. Sign in with your Google account and grant the requested permissions
  5. You’ll be redirected back to Archivus once connected

Permissions requested:

  • View files and folders in your Google Drive
  • Download files from your Google Drive

Step 2: Create a New Migration Project

  1. Navigate to Migrations in the main navigation
  2. Click the New Migration button
  3. You’ll be taken through a 4-step wizard

Step 3: Configure the Migration

Source Selection

Select Google Drive as your source. Other providers like Dropbox and OneDrive are coming soon.

Project Details

Field         Description
Project Name  Give your migration a descriptive name (e.g., “Q4 2024 Financial Reports”)
Description   Optional notes about what you’re importing and why

Folder Selection

Use the folder browser to select which folders to import:

  1. Browse your Google Drive folder structure
  2. Click the checkbox next to folders you want to import
  3. Selected folders appear as chips below the browser
  4. You can select multiple folders from different locations
  5. Subfolders within selected folders are included automatically

Tip: Select specific folders rather than your entire Drive to keep imports focused and manageable.

Review

Before creating, review your configuration:

  • Source: Google Drive
  • Project name and description
  • Selected folders (shown as a list)

Click Create Project to proceed.

Step 4: Start the Migration

After creating the project:

  1. The system scans your selected folders to discover all files
  2. Once scanning completes, you’ll see a count of files to be imported
  3. Review the file list if desired (filters available by type, size, status)
  4. Click Start Migration to begin importing

Migration Statuses

Your migration project will progress through these statuses:

Status                 Description
Draft                  Project created but not yet configured
Configured             Ready to scan for files
Scanning               Discovering files in selected folders
Ready                  Scan complete, ready to start migration
In Progress            Actively importing documents
Paused                 Migration paused (can be resumed)
Completed              All documents successfully imported
Completed with Errors  Migration finished but some files failed
Cancelled              Migration was cancelled by user
Failed                 Migration encountered a critical error
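
When scripting against the API, the statuses above can be modelled as a small enumeration so that a polling loop knows when to stop. This is a minimal sketch: the lowercase identifiers are assumptions inferred from the display names (only "ready", "completed", "failed", and "cancelled" appear in the examples on this page), so check them against real API responses.

```python
from enum import Enum


class MigrationStatus(str, Enum):
    # Identifier strings are assumed from the display names in the table above.
    DRAFT = "draft"
    CONFIGURED = "configured"
    SCANNING = "scanning"
    READY = "ready"
    IN_PROGRESS = "in_progress"
    PAUSED = "paused"
    COMPLETED = "completed"
    COMPLETED_WITH_ERRORS = "completed_with_errors"
    CANCELLED = "cancelled"
    FAILED = "failed"


# Statuses after which the migration will not change again.
TERMINAL_STATUSES = frozenset({
    MigrationStatus.COMPLETED,
    MigrationStatus.COMPLETED_WITH_ERRORS,
    MigrationStatus.CANCELLED,
    MigrationStatus.FAILED,
})


def is_terminal(status: str) -> bool:
    """Return True if a polling loop can stop at this status."""
    try:
        return MigrationStatus(status) in TERMINAL_STATUSES
    except ValueError:
        # Unknown status string: keep polling rather than stop early.
        return False
```

Treating an unknown status as non-terminal is a deliberate choice here: a script that stops early on an unrecognized value could report a migration as finished while it is still running.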

Managing Migrations

Pause a Migration

If you need to temporarily stop an in-progress migration:

  1. Open the migration project
  2. Click the Pause button
  3. The migration will stop after completing the current file
  4. Resume anytime by clicking Resume

Resume a Migration

To continue a paused migration:

  1. Open the migration project
  2. Click Resume
  3. The migration continues from where it stopped

Cancel a Migration

To permanently stop a migration:

  1. Open the migration project
  2. Click Cancel
  3. Files already imported will remain in Archivus
  4. Remaining files will not be imported

Delete a Migration Project

Migration projects in a terminal status (Completed, Completed with Errors, Cancelled, or Failed) can be deleted:

  1. Open the migration project
  2. Click Delete (only available for terminal statuses)
  3. This removes the migration record but not the already-imported documents
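
The rules in this section can be summarized as a lookup from status to permitted actions, which is handy for validating a request before sending it. This is only a sketch: the status identifiers are assumed, and whether a paused migration can be cancelled is a guess that this page does not confirm.

```python
# Status identifiers are assumed lowercase forms of the display names.
ALLOWED_ACTIONS = {
    "ready": {"start"},
    "in_progress": {"pause", "cancel"},
    "paused": {"resume", "cancel"},  # cancelling while paused is an assumption
    "completed": {"delete"},
    "completed_with_errors": {"delete"},
    "cancelled": {"delete"},
    "failed": {"delete"},
}


def can_perform(status: str, action: str) -> bool:
    """Return True if the action is expected to be available in this status."""
    return action in ALLOWED_ACTIONS.get(status, set())
```

A check such as can_perform("in_progress", "delete") returns False, which matches the note that Delete is only offered for terminal statuses.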

Google Docs Handling

Native Google files (Documents, Spreadsheets, Presentations, Drawings) are automatically converted during import:

Google Format  Export Options
Documents      PDF (default), DOCX, ODT, TXT, HTML
Spreadsheets   XLSX (default), ODS, CSV, PDF
Presentations  PDF (default), PPTX, ODP
Drawings       PDF (default), PNG, SVG

Unless you configure otherwise, each type uses its default export format: PDF for documents, presentations, and drawings, and XLSX for spreadsheets.
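
To illustrate how the table above might be applied, the sketch below picks an export format from a file's Google Drive MIME type. The MIME type strings are the ones Google Drive uses for native files; the function name and the "preferred" parameter are hypothetical and do not describe how Archivus stores this setting.

```python
from typing import Optional

# Export options per native Google type; the first entry is the default.
EXPORT_OPTIONS = {
    "application/vnd.google-apps.document": ["pdf", "docx", "odt", "txt", "html"],
    "application/vnd.google-apps.spreadsheet": ["xlsx", "ods", "csv", "pdf"],
    "application/vnd.google-apps.presentation": ["pdf", "pptx", "odp"],
    "application/vnd.google-apps.drawing": ["pdf", "png", "svg"],
}


def choose_export_format(mime_type: str, preferred: Optional[str] = None) -> Optional[str]:
    """Return the export extension, or None for files that need no conversion."""
    options = EXPORT_OPTIONS.get(mime_type)
    if options is None:
        return None  # regular uploaded file (PDF, DOCX, ...): imported as-is
    if preferred is None:
        return options[0]
    if preferred not in options:
        raise ValueError(f"{preferred!r} is not an export option for {mime_type}")
    return preferred
```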


AI Agent Integration

Supercharge your migration by assigning AI agents to process documents as they’re imported.

Available Processing Phases

Phase        When It Runs
Post-Import  Immediately after each document is imported
Batch        After every N documents (configurable batch size)
Completion   Once when the entire migration finishes
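
The timing of the three phases is easiest to see in a small simulation. The generator below is a sketch of the ordering described in the table, not Archivus code; the phase identifiers and the function name are made up for illustration.

```python
from typing import Iterable, Iterator, List, Tuple


def agent_triggers(doc_ids: Iterable[str], batch_size: int) -> Iterator[Tuple[str, List[str]]]:
    """Yield (phase, documents) events in the order the phases would fire."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    all_docs: List[str] = []
    for doc_id in doc_ids:
        all_docs.append(doc_id)
        yield ("post_import", [doc_id])  # fires for every imported document
        batch.append(doc_id)
        if len(batch) == batch_size:
            yield ("batch", batch)  # fires after every N documents
            batch = []
    # A trailing partial batch is not flushed here; whether Archivus
    # processes it is not documented on this page.
    yield ("completion", all_docs)  # fires once at the end
```

With three documents and a batch size of 2, this yields three post-import events, one batch event after the second document, and a single completion event.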

How to Assign Agents

  1. Open your migration project
  2. Go to the Agents tab
  3. Click Add Agent
  4. Select an agent from your available agents
  5. Choose the processing phase
  6. (For batch phase) Set the batch size
  7. Optionally enable auto-approval for agent actions

Example Use Cases

Use Case            Phase        Description
Summarize each doc  Post-Import  Automatically generate summaries as documents arrive
Topic analysis      Batch        Generate topic reports every 50 documents
Full report         Completion   Create comprehensive summary of all imported content
Auto-tagging        Post-Import  Apply AI-generated tags to each document
Compliance check    Post-Import  Scan documents for compliance issues

Filtering Options

When scanning folders, you can configure filters to control which files are imported.

File Type Filtering

Include or exclude specific file types:

Include: pdf, docx, xlsx
Exclude: tmp, bak, log

Size Limits

Set a maximum file size to skip large files that might not be needed.

Setting       Description
No limit      Import all files regardless of size
Custom limit  Skip files larger than specified size
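
The file type and size filters can be thought of as one predicate applied to each discovered file. The sketch below assumes that an exclude match wins over an include match and that extensions are compared case-insensitively; neither detail is stated on this page, and the function name is hypothetical.

```python
import os
from typing import Optional, Set


def passes_filters(
    name: str,
    size_bytes: int,
    include: Optional[Set[str]] = None,
    exclude: Optional[Set[str]] = None,
    max_bytes: Optional[int] = None,
) -> bool:
    """Return True if a file should be imported under the given filters."""
    ext = os.path.splitext(name)[1].lstrip(".").lower()
    if exclude and ext in exclude:
        return False
    if include and ext not in include:
        return False
    if max_bytes is not None and size_bytes > max_bytes:
        return False  # "Custom limit": skip files larger than the limit
    return True
```

Passing max_bytes=None corresponds to the "No limit" setting.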

Exclusion Patterns

Use wildcard patterns to exclude files:

Pattern    Effect
*.tmp      Exclude temporary files
~*         Exclude files starting with tilde
archive/*  Exclude files in “archive” folders
.hidden*   Exclude hidden files
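
Wildcard patterns like these behave much like shell globs. The sketch below uses Python's standard fnmatch module to approximate them; it assumes that patterns without a slash are tested against the file name only and that patterns with a slash are tested at every folder depth. The matching rules Archivus actually applies may differ.

```python
from fnmatch import fnmatchcase
from posixpath import basename
from typing import Iterable


def is_excluded(path: str, patterns: Iterable[str]) -> bool:
    """Check a '/'-separated path (relative to the selected folder) against patterns."""
    name = basename(path)
    segments = path.split("/")
    for pattern in patterns:
        if "/" in pattern:
            # Path-style pattern such as archive/*: try it at every folder depth.
            tails = ("/".join(segments[i:]) for i in range(len(segments)))
            if any(fnmatchcase(tail, pattern) for tail in tails):
                return True
        elif fnmatchcase(name, pattern):
            return True
    return False
```

For example, with the four patterns from the table, "projects/archive/old.pdf" and "~$draft.docx" are excluded, while "projects/report.pdf" is kept.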

Shared Files

Choose whether to include files that have been shared with you or only files you own:

Option          Description
My files only   Only import files you own
Include shared  Also import files shared with you

Progress Tracking

Monitor your migration in real-time:

  • Progress Bar - Visual percentage of completion
  • File Counts - Total files, successfully processed, and failed
  • Byte Progress - Track data transferred
  • Status Badges - Animated indicators during active operations
  • File List - Browse individual files with their status

Progress Details

Metric       Description
Total Items  Number of files discovered during scan
Processed    Files successfully imported
Failed       Files that encountered errors
Skipped      Files excluded by filters or detected as duplicates
Progress %   Overall completion percentage
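
As a worked example of how these metrics relate, the sketch below derives a percentage from the counts. The formula is an assumption: it treats processed, failed, and skipped files as all counting toward completion, which this page does not state explicitly.

```python
def progress_percent(total: int, processed: int, failed: int, skipped: int) -> float:
    """Estimate overall completion from the item counts (assumed formula)."""
    if total <= 0:
        return 0.0  # nothing discovered yet, e.g. while still scanning
    handled = processed + failed + skipped
    return round(100.0 * min(handled, total) / total, 1)
```

With 200 total items, 90 processed, 4 failed, and 6 skipped, 100 of 200 files have been handled, so the function returns 50.0.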

Error Handling

If files fail to import:

  1. Failed files are tracked separately and don’t stop the migration
  2. View failed files in the project’s file list (filter by “Failed”)
  3. Each failed file shows its error reason
  4. Failed files are retried automatically, up to 3 times
  5. After max retries, files are marked as permanently failed
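
The retry behavior described above can be sketched as a small wrapper around a per-file import function. This is an illustration only: it assumes "up to 3 times" means three retries after the first attempt, and the function names, result labels, and backoff delay are hypothetical.

```python
import time
from typing import Callable, Optional, Tuple

MAX_RETRIES = 3  # from the list above


def import_with_retries(
    import_file: Callable[[str], None],
    file_id: str,
    max_retries: int = MAX_RETRIES,
    delay_seconds: float = 0.0,
) -> Tuple[str, Optional[str]]:
    """Return ("imported", None) or ("permanently_failed", last_error)."""
    last_error: Optional[str] = None
    # One initial attempt plus up to max_retries retries.
    for attempt in range(1 + max_retries):
        try:
            import_file(file_id)
            return ("imported", None)
        except Exception as exc:  # any failure counts as a failed attempt
            last_error = str(exc)
            if attempt < max_retries:
                # Exponential backoff between attempts (assumed policy).
                time.sleep(delay_seconds * (2 ** attempt))
    return ("permanently_failed", last_error)
```

Because the error is recorded rather than raised, one failing file does not stop the rest of the migration, which mirrors the first point in the list.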

Common Failure Reasons

Error               Cause                          Solution
File too large      Exceeds size limit             Adjust size filter or skip
Unsupported format  File type not recognized       Convert to supported format
Permission denied   Access revoked in Drive        Re-authorize connection
File deleted        File removed during migration  No action needed
Network timeout     Connection issues              Automatic retry

Duplicate Detection

Archivus automatically detects duplicates:

  • Files are identified by their content hash
  • If a file already exists in Archivus, it won’t be imported again
  • Duplicate detection prevents wasted storage and processing time
  • Duplicates are counted as “Skipped” in progress tracking
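
Content-hash deduplication can be sketched in a few lines. The example below uses SHA-256 as a stand-in, since this page does not say which hash function Archivus uses, and the function names are hypothetical.

```python
import hashlib
from typing import Iterable, List, Set, Tuple


def content_hash(data: bytes) -> str:
    """Hash file contents; SHA-256 is an assumed choice of algorithm."""
    return hashlib.sha256(data).hexdigest()


def import_unique(
    files: Iterable[Tuple[str, bytes]], known_hashes: Set[str]
) -> Tuple[List[str], List[str]]:
    """Return (imported names, skipped names), updating known_hashes in place."""
    imported: List[str] = []
    skipped: List[str] = []
    for name, data in files:
        digest = content_hash(data)
        if digest in known_hashes:
            skipped.append(name)  # same bytes already stored: count as Skipped
        else:
            known_hashes.add(digest)
            imported.append(name)
    return imported, skipped
```

Because the comparison is on content rather than file name, a renamed copy of an existing document is still detected as a duplicate.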

API Reference

For programmatic access to migration features:

Create Migration Project

curl -X POST https://api.archivus.app/api/v1/migrations/projects \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "X-Tenant-Subdomain: your-tenant" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Q4 2024 Documents",
    "description": "Importing Q4 financial reports",
    "source_type": "google_drive",
    "config": {
      "folder_ids": ["folder_id_1", "folder_id_2"]
    }
  }'

List Migration Projects

curl https://api.archivus.app/api/v1/migrations/projects \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "X-Tenant-Subdomain: your-tenant"

Start Migration

curl -X POST https://api.archivus.app/api/v1/migrations/projects/{id}/start \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "X-Tenant-Subdomain: your-tenant"

Check Progress

curl https://api.archivus.app/api/v1/migrations/projects/{id}/progress \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "X-Tenant-Subdomain: your-tenant"

All Migration Endpoints

Endpoint                            Method  Description
/migrations/projects                POST    Create a new migration project
/migrations/projects                GET     List all migration projects
/migrations/projects/{id}           GET     Get project details
/migrations/projects/{id}           DELETE  Delete a project
/migrations/projects/{id}/scan      POST    Start folder scanning
/migrations/projects/{id}/start     POST    Begin migration
/migrations/projects/{id}/pause     POST    Pause migration
/migrations/projects/{id}/resume    POST    Resume migration
/migrations/projects/{id}/cancel    POST    Cancel migration
/migrations/projects/{id}/progress  GET     Get current progress
/migrations/projects/{id}/items     GET     List files with filters
/migrations/projects/{id}/agents    POST    Assign an AI agent
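
The lifecycle endpoints in the table share one URL shape, so a single helper can build any of them. The sketch below only constructs the request with Python's standard library and does not send it; the helper name is hypothetical, while the base URL and headers are taken from the curl examples above.

```python
import urllib.request

API_BASE = "https://api.archivus.app/api/v1"

# POST actions from the endpoint table that take no request body here.
PROJECT_ACTIONS = {"scan", "start", "pause", "resume", "cancel"}


def build_action_request(
    project_id: str, action: str, api_key: str, tenant: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request for a project lifecycle action."""
    if action not in PROJECT_ACTIONS:
        raise ValueError(f"Unknown action: {action!r}")
    return urllib.request.Request(
        url=f"{API_BASE}/migrations/projects/{project_id}/{action}",
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "X-Tenant-Subdomain": tenant,
        },
    )
```

To send the request, pass the returned object to urllib.request.urlopen. The project ID format used in any call is whatever the create endpoint returns.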

Best Practices

Plan Your Migration

  1. Organize first - Clean up your Google Drive before migrating
  2. Start small - Test with a small folder before migrating large amounts
  3. Check storage - Ensure you have sufficient storage quota
  4. Off-peak timing - Run large migrations during off-peak hours

Folder Organization

  1. Be selective - Import only what you need, not everything
  2. Use descriptive names - Name migration projects clearly
  3. Group logically - Create separate migrations for different projects or departments

AI Agent Setup

  1. Pre-create agents - Set up AI agents before starting the migration
  2. Test agents first - Verify agents work as expected on a few documents
  3. Start with post-import - Begin with post-import phase for immediate results
  4. Monitor AI credits - Ensure you have sufficient credits for processing

Large Migrations

For migrations with 1,000+ files:

  1. Break into batches - Create multiple smaller migration projects
  2. Monitor progress - Check in periodically for any issues
  3. Use off-hours - Run during nights or weekends for better performance
  4. Assign agents wisely - Use batch processing instead of post-import for efficiency
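
Breaking a large import into several projects can be scripted by chunking the folder list. The sketch below produces one request body per project, using the same payload shape as the Create Migration Project example; the function name and the "(part N)" naming scheme are illustrative choices.

```python
from typing import Dict, List


def split_into_projects(
    base_name: str, folder_ids: List[str], folders_per_project: int
) -> List[Dict]:
    """Return one create-project payload per chunk of folders."""
    if folders_per_project < 1:
        raise ValueError("folders_per_project must be at least 1")
    payloads = []
    for start in range(0, len(folder_ids), folders_per_project):
        chunk = folder_ids[start:start + folders_per_project]
        part = start // folders_per_project + 1
        payloads.append({
            "name": f"{base_name} (part {part})",
            "source_type": "google_drive",
            "config": {"folder_ids": chunk},
        })
    return payloads
```

Each payload can then be posted to /migrations/projects in turn, starting the next project only after the previous one finishes.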

Troubleshooting

“Google Drive not connected”

Go to Settings > Integrations and connect your Google account. You’ll need to sign in and grant permissions.

“Scanning taking too long”

Scan time grows with the number of files and subfolders, because every selected folder and its subfolders must be listed before the file count is known. Wait for the scan to complete before starting the migration.

“Some files failed to import”

Check the failed files list for specific error messages. Common fixes:

  • Retry the migration (already-imported files will be skipped)
  • Check if files still exist in Google Drive
  • Verify file permissions haven’t changed
  • Check file sizes are within limits

“Migration stuck at 0%”

  1. Check your internet connection
  2. Verify Google Drive is still connected (Settings > Integrations)
  3. Try pausing and resuming the migration
  4. If issues persist, cancel and create a new migration

“Not enough storage”

Your Archivus storage quota may be full:

  1. Check storage usage in Settings > Usage
  2. Delete unused documents
  3. Upgrade your plan for more storage
  4. Reduce migration scope by excluding large files

“Token expired”

Your Google authorization may have expired:

  1. Go to Settings > Integrations
  2. Disconnect Google Drive
  3. Reconnect and re-authorize
  4. Resume or restart your migration

Code Examples

Python - Create and Run Migration

import requests
import time

API_BASE = "https://api.archivus.app/api/v1"
HEADERS = {
    "Authorization": "Bearer YOUR_API_KEY",
    "X-Tenant-Subdomain": "your-tenant",
    "Content-Type": "application/json"
}

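# Statuses after which polling should stop. "completed_with_errors" is an
# assumed identifier for the "Completed with Errors" status.
TERMINAL_STATUSES = {"completed", "completed_with_errors", "failed", "cancelled"}

def create_migration(name, folder_ids):
    """Create a new migration project."""
    response = requests.post(
        f"{API_BASE}/migrations/projects",
        headers=HEADERS,
        json={
            "name": name,
            "source_type": "google_drive",
            "config": {"folder_ids": folder_ids}
        },
        timeout=30
    )
    response.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return response.json()

def start_migration(project_id):
    """Start the migration after scanning."""
    response = requests.post(
        f"{API_BASE}/migrations/projects/{project_id}/start",
        headers=HEADERS,
        timeout=30
    )
    response.raise_for_status()

def get_progress(project_id):
    """Check migration progress."""
    response = requests.get(
        f"{API_BASE}/migrations/projects/{project_id}/progress",
        headers=HEADERS,
        timeout=30
    )
    response.raise_for_status()
    return response.json()

def run_migration(name, folder_ids):
    """Create and run a migration, polling until complete."""
    # Create project
    project = create_migration(name, folder_ids)
    project_id = project["id"]
    print(f"Created project: {project_id}")

    # Wait for scan to complete. If scanning does not start automatically for
    # API-created projects, trigger it with POST /migrations/projects/{id}/scan.
    while True:
        progress = get_progress(project_id)
        status = progress["status"]
        if status == "ready":
            break
        if status in TERMINAL_STATUSES:
            # Avoid polling forever if the project fails or is cancelled mid-scan
            raise RuntimeError(f"Scan did not complete (status: {status})")
        print(f"Scanning... {progress.get('total_items', 0)} files found")
        time.sleep(5)

    # Start migration
    start_migration(project_id)
    print("Migration started")

    # Poll until the migration reaches a terminal status
    while True:
        progress = get_progress(project_id)
        status = progress["status"]
        pct = progress.get("progress_percent", 0)

        print(f"Status: {status} - {pct}% complete")

        if status in TERMINAL_STATUSES:
            break
        time.sleep(10)

    print(f"Migration finished with status: {status}")
    return progress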

# Usage
run_migration(
    "Q4 Reports",
    ["1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms"]
)

JavaScript - Monitor Migration Progress

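async function monitorMigration(projectId, apiKey, tenant) {
  const headers = {
    'Authorization': `Bearer ${apiKey}`,
    'X-Tenant-Subdomain': tenant
  };

  // 'completed_with_errors' is an assumed identifier for the
  // "Completed with Errors" status
  const terminalStatuses = ['completed', 'completed_with_errors', 'failed', 'cancelled'];

  const checkProgress = async () => {
    const response = await fetch(
      `https://api.archivus.app/api/v1/migrations/projects/${projectId}/progress`,
      { headers }
    );
    if (!response.ok) {
      throw new Error(`Progress request failed: HTTP ${response.status}`);
    }
    return response.json();
  };

  // Poll every 5 seconds, waiting for each request to finish before
  // scheduling the next one so that slow responses never overlap
  while (true) {
    const progress = await checkProgress();

    console.log(`Status: ${progress.status}`);
    console.log(`Progress: ${progress.progress_percent}%`);
    console.log(`Files: ${progress.processed_items}/${progress.total_items}`);

    if (terminalStatuses.includes(progress.status)) {
      console.log(`Migration finished with status: ${progress.status}`);
      return progress;
    }
    await new Promise((resolve) => setTimeout(resolve, 5000));
  }
}

// Usage
monitorMigration('proj_abc123', 'YOUR_API_KEY', 'your-tenant')
  .catch((error) => console.error(error));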

Future Enhancements

Coming soon:

  • Dropbox integration
  • OneDrive/SharePoint integration
  • Scheduled migrations - Set up recurring imports
  • Priority-based processing - Process important files first
  • Custom field mapping - Map Google Drive metadata to Archivus fields
  • Incremental sync - Automatically sync new files

Questions? Contact support@archivusdms.com