Upgrades & Migrations

Vulcan is designed for zero-downtime upgrades with automatic database migrations.

How Upgrades Work

┌─────────────────────────────────────────────────────────────┐
│                        Upgrade Flow                         │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. New version pushed to main branch                       │
│     ↓                                                       │
│  2. GitHub Actions builds Docker image                      │
│     ↓                                                       │
│  3. ECS starts new task with new image                      │
│     ↓                                                       │
│  4. New task runs migrations automatically                  │
│     ↓                                                       │
│  5. Health check passes → traffic routes to new task        │
│     ↓                                                       │
│  6. Old tasks drain connections gracefully                  │
│     ↓                                                       │
│  7. Old tasks terminate                                     │
│                                                             │
│  ✅ Database persists throughout (no data loss)             │
└─────────────────────────────────────────────────────────────┘

Database Migrations

Vulcan uses a sequential migration system that runs automatically on startup.

How It Works

  1. Version tracking — Each migration has a unique version number
  2. Tracked in database — Applied migrations are recorded in a migrations table
  3. Transactional — Each migration runs in a transaction (automatic rollback on error)
  4. Idempotent — Safe to run multiple times (won't re-apply)
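
The four properties above can be sketched in a few lines of Go. This is a minimal illustration, not Vulcan's actual runner: the `Migration` type, the `applyPending` function, and the in-memory `applied` map (standing in for the migrations table) are all hypothetical, and the per-migration transaction is omitted.

```go
package main

import (
	"fmt"
	"sort"
)

// Migration is a hypothetical, simplified stand-in for a migration entry.
type Migration struct {
	Version     int
	Description string
	Up          func() error
}

// applyPending runs every migration whose version is not yet in applied,
// in ascending version order, and records each success in applied.
// It stops at the first error and returns the versions applied so far.
func applyPending(all []Migration, applied map[int]bool) ([]int, error) {
	sorted := append([]Migration(nil), all...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Version < sorted[j].Version })

	var ran []int
	for _, m := range sorted {
		if applied[m.Version] {
			continue // already recorded, so it is never re-applied
		}
		if err := m.Up(); err != nil {
			return ran, fmt.Errorf("migration %d (%s): %w", m.Version, m.Description, err)
		}
		applied[m.Version] = true
		ran = append(ran, m.Version)
	}
	return ran, nil
}

func main() {
	noop := func() error { return nil }
	all := []Migration{
		{Version: 2, Description: "Add threat_intel_config table", Up: noop},
		{Version: 1, Description: "Add relay_tokens table", Up: noop},
	}
	applied := map[int]bool{1: true} // version 1 ran on a previous startup

	ran, err := applyPending(all, applied)
	fmt.Println(ran, err) // [2] <nil>

	ran, err = applyPending(all, applied)
	fmt.Println(ran, err) // [] <nil>
}
```

The second call applies nothing, which is the sense in which startup is safe to repeat.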

Migration Log

On startup, you'll see:

INF connecting to postgres...
INF postgres connected successfully
INF running database migrations...
INF Applying migration version=1 description="Add relay_tokens table"
INF Migration applied version=1 duration=45ms
INF Applying migration version=2 description="Add threat_intel_config table"
INF Migration applied version=2 duration=12ms
INF migrations complete

Checking Migration Status

SELECT version, description, applied_at, duration_ms 
FROM migrations
ORDER BY version;

Data Persistence

What Survives Upgrades

Data                            | Storage        | Preserved
--------------------------------|----------------|----------
Tenants & users                 | RDS PostgreSQL | ✅ Yes
Discovered assets (nodes/edges) | RDS PostgreSQL | ✅ Yes
Compliance findings             | RDS PostgreSQL | ✅ Yes
Credentials (encrypted)         | RDS PostgreSQL | ✅ Yes
Audit logs                      | RDS PostgreSQL | ✅ Yes
Generated documents             | RDS + S3       | ✅ Yes
Scheduled jobs                  | RDS PostgreSQL | ✅ Yes
License information             | RDS PostgreSQL | ✅ Yes

What Reconnects Automatically

Component             | Behavior
----------------------|----------------------------
WebSocket connections | Clients auto-reconnect
Relay connectors      | Auto-reconnect with backoff
Agent heartbeats      | Resume on next interval
Scheduled scans       | Continue from schedule

ECS Deployment Details

Rolling Update Strategy

deploymentConfiguration:
  maximumPercent: 200
  minimumHealthyPercent: 100

This means:

  • New tasks start before old tasks stop
  • At least 100% capacity maintained during deploy
  • Up to 200% capacity during transition
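
As a worked example of these percentages, the sketch below computes the task-count bounds ECS enforces during a rolling deployment. ECS rounds the minimum up and the maximum down to whole tasks. The helper name `taskBounds` and the desired count of 3 are assumptions for illustration only.

```go
package main

import "fmt"

// taskBounds returns the lowest and highest number of running tasks
// allowed during a rolling deployment for a given desired count.
// The minimum is rounded up and the maximum is rounded down.
func taskBounds(desired, minHealthyPct, maxPct int) (lo, hi int) {
	lo = (desired*minHealthyPct + 99) / 100 // ceiling division
	hi = desired * maxPct / 100             // floor division
	return lo, hi
}

func main() {
	// Hypothetical service with 3 desired tasks and the settings above.
	lo, hi := taskBounds(3, 100, 200)
	fmt.Printf("desired=3: between %d and %d tasks\n", lo, hi)
}
```

With 100/200, a 3-task service never drops below 3 running tasks and may briefly run 6, so the cluster needs headroom for a full second copy of the service.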

Health Checks

ECS waits for health checks before routing traffic:

healthCheck:
  command: ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"]
  interval: 30
  timeout: 5
  retries: 3
  startPeriod: 60
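
For the check above to gate traffic on migrations, the /health endpoint has to fail until startup work is done. The following is a minimal sketch of one way to do that, assuming a Go net/http service; the `ready` flag, `healthHandler`, and `probe` are hypothetical names, not Vulcan's actual implementation.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// ready flips to true once startup work (such as migrations) has finished.
var ready atomic.Bool

// healthHandler answers 200 when ready and 503 otherwise. curl -f treats
// any status >= 400 as a failure, so the container check fails until ready.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	if !ready.Load() {
		http.Error(w, "starting", http.StatusServiceUnavailable)
		return
	}
	w.WriteHeader(http.StatusOK)
	fmt.Fprintln(w, "ok")
}

// probe calls the handler in-process and returns the HTTP status code.
func probe() int {
	rec := httptest.NewRecorder()
	healthHandler(rec, httptest.NewRequest(http.MethodGet, "/health", nil))
	return rec.Code
}

func main() {
	fmt.Println(probe()) // 503 while migrations are still running
	ready.Store(true)    // call this after migrations complete
	fmt.Println(probe()) // 200 once the task is ready
}
```

Failures during the 60 second startPeriod do not count toward the retry limit, which gives migrations time to finish before the task can be marked unhealthy.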

Connection Draining

ALB drains connections gracefully:

  • New requests → new tasks
  • In-flight requests → complete on old tasks
  • Default drain time: 300 seconds

Rollback Procedures

Automatic Rollback

If a new task fails health checks, ECS automatically:

  1. Stops the failing task
  2. Keeps old tasks running
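  3. Reports deployment failure once the deployment circuit breaker trips (without the circuit breaker, ECS keeps retrying new tasks)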

Manual Rollback

Option 1: Revert to previous image

# Deploy previous version
aws ecs update-service \
  --cluster vulcan-prod \
  --service vulcan-prod \
  --task-definition vulcan-prod:PREVIOUS_REVISION \
  --force-new-deployment

Option 2: Database migration rollback

# Open a shell in the running container (ECS Exec; requires --interactive)
aws ecs execute-command --cluster vulcan-prod --task TASK_ID --interactive --command "/bin/sh"

# Roll back last N migrations
vulcan migrate down 2

Option 3: Point-in-time recovery

# Restore RDS to a specific time (creates a new instance; the original is left untouched)
aws rds restore-db-instance-to-point-in-time \
  --source-db-instance-identifier vulcan-prod \
  --target-db-instance-identifier vulcan-prod-restored \
  --restore-time 2026-04-10T12:00:00Z

Migration Best Practices

Safe Migration Patterns

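✅ Safe                    | ❌ Avoid
---------------------------|------------------
ADD COLUMN                 | DROP COLUMN
CREATE TABLE IF NOT EXISTS | DROP TABLE
CREATE INDEX CONCURRENTLY  | ALTER COLUMN TYPE
New nullable columns       | Renaming columns

Note: PostgreSQL does not allow CREATE INDEX CONCURRENTLY inside a transaction block, so it cannot be used as-is in a migration that runs in a transaction.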
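
The "avoid" column can be turned into a rough pre-merge check. The sketch below is a hypothetical helper (`riskyStatements` is not part of Vulcan) that flags those patterns in a migration's Up SQL by plain text matching. It is only a heuristic: it does not parse SQL and will miss variants such as RENAME without the COLUMN keyword.

```go
package main

import (
	"fmt"
	"regexp"
)

// risky lists the patterns from the "avoid" column as case-insensitive regexes.
var risky = []*regexp.Regexp{
	regexp.MustCompile(`(?i)\bDROP\s+COLUMN\b`),
	regexp.MustCompile(`(?i)\bDROP\s+TABLE\b`),
	regexp.MustCompile(`(?i)\bALTER\s+COLUMN\s+\S+\s+(SET\s+DATA\s+)?TYPE\b`),
	regexp.MustCompile(`(?i)\bRENAME\s+COLUMN\b`),
}

// riskyStatements returns the first match of each risky pattern found in sql.
func riskyStatements(sql string) []string {
	var hits []string
	for _, re := range risky {
		if m := re.FindString(sql); m != "" {
			hits = append(hits, m)
		}
	}
	return hits
}

func main() {
	fmt.Println(riskyStatements("ALTER TABLE assets ADD COLUMN notes TEXT"))
	fmt.Println(riskyStatements("ALTER TABLE assets DROP COLUMN notes"))
}
```

A check like this belongs on Up statements only, since Down migrations legitimately drop what Up created.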

Adding a New Migration

  1. Add to internal/storage/postgres/migrations.go:

     {
         Version:     6,
         Description: "Add new_feature table",
         // Up creates the table and its tenant index in one transaction.
         Up: func(ctx context.Context, tx pgx.Tx) error {
             _, err := tx.Exec(ctx, `
                 CREATE TABLE IF NOT EXISTS new_feature (
                     id         TEXT PRIMARY KEY,
                     tenant_id  TEXT NOT NULL,
                     created_at TIMESTAMPTZ DEFAULT NOW()
                 );
                 CREATE INDEX IF NOT EXISTS idx_new_feature_tenant
                     ON new_feature(tenant_id);
             `)
             return err
         },
         // Down reverses Up; dropping the table also drops its index.
         Down: func(ctx context.Context, tx pgx.Tx) error {
             _, err := tx.Exec(ctx, `DROP TABLE IF EXISTS new_feature;`)
             return err
         },
     },

  2. Test locally:

     # Apply migration
     go run ./cmd/vulcan serve --db-type postgres --db-url "$DB_URL"

     # Verify
     psql "$DB_URL" -c "SELECT * FROM migrations ORDER BY version;"
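
Because the system is sequential, a duplicated or skipped version number is an easy mistake when adding an entry. The sketch below is a hypothetical unit-test helper, assuming versions start at 1 and are listed in ascending order as the startup log suggests; `Migration` and `validateSequence` are illustrative names rather than Vulcan's own.

```go
package main

import "fmt"

// Migration is a hypothetical, trimmed-down view of a migration entry.
type Migration struct {
	Version     int
	Description string
}

// validateSequence checks that versions are exactly 1, 2, 3, ... in slice
// order, which rules out gaps, duplicates, and out-of-order entries.
func validateSequence(ms []Migration) error {
	for i, m := range ms {
		if want := i + 1; m.Version != want {
			return fmt.Errorf("migration %q has version %d, want %d", m.Description, m.Version, want)
		}
	}
	return nil
}

func main() {
	ms := []Migration{
		{Version: 1, Description: "Add relay_tokens table"},
		{Version: 2, Description: "Add threat_intel_config table"},
		{Version: 2, Description: "Add new_feature table"}, // duplicate version
	}
	fmt.Println(validateSequence(ms))
}
```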

Monitoring Upgrades

CloudWatch Metrics

Monitor during deployments:

  • ECS/CPUUtilization — Should stay stable
  • ECS/MemoryUtilization — Should stay stable
  • ALB/HealthyHostCount — Should not drop to zero
  • ALB/UnHealthyHostCount — Should return to zero
  • RDS/DatabaseConnections — May spike briefly

Deployment Alerts

Set up alerts for:

ECS Deployment State = FAILED
ALB HealthyHostCount < 1
RDS DatabaseConnections > 80% of max
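
The three conditions above can be expressed as a small evaluation function, for example inside a custom deployment watcher. This is a sketch only: the `Snapshot` type and `alerts` function are hypothetical, and in practice these thresholds would normally live in CloudWatch alarms rather than application code.

```go
package main

import "fmt"

// Snapshot holds the values the alert rules look at. How they are
// collected (CloudWatch, ECS API) is outside this sketch.
type Snapshot struct {
	DeploymentState  string // e.g. "IN_PROGRESS", "COMPLETED", "FAILED"
	HealthyHosts     int
	DBConnections    int
	DBMaxConnections int
}

// alerts returns one message per rule that the snapshot violates.
func alerts(s Snapshot) []string {
	var out []string
	if s.DeploymentState == "FAILED" {
		out = append(out, "ECS deployment failed")
	}
	if s.HealthyHosts < 1 {
		out = append(out, "no healthy hosts behind the ALB")
	}
	// Integer form of: connections > 80% of max.
	if s.DBConnections*100 > s.DBMaxConnections*80 {
		out = append(out, "RDS connections above 80% of max")
	}
	return out
}

func main() {
	s := Snapshot{DeploymentState: "IN_PROGRESS", HealthyHosts: 2, DBConnections: 85, DBMaxConnections: 100}
	fmt.Println(alerts(s))
}
```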

Scheduled Maintenance Windows

For major version upgrades that require extended migrations:

  1. Announce maintenance — Notify users in advance
  2. Scale down — Reduce to single task
  3. Run migrations — May take longer for large datasets
  4. Verify — Check migration status and data integrity
  5. Scale up — Return to normal capacity
  6. Monitor — Watch metrics for 30 minutes

FAQ

Do I need to stop the service for upgrades?

No. The rolling deployment strategy ensures zero downtime for normal upgrades.

What if a migration fails?

The migration runs in a transaction. On failure:

  1. Transaction rolls back automatically
  2. Task fails health check
  3. ECS keeps old tasks running
  4. No data is modified

How long do migrations take?

Most migrations complete in under 1 second. Large table alterations may take longer, but these are rare and announced in release notes.

Can I skip migrations?

No. Migrations are required and run automatically. Skipping would cause schema mismatches and errors.

How do I check what migrations have run?

SELECT * FROM migrations ORDER BY version;

Or via API:

GET /api/v1/admin/migrations