Upgrades & Migrations
Vulcan is designed for zero-downtime upgrades with automatic database migrations.
How Upgrades Work
┌─────────────────────────────────────────────────────────────┐
│ Upgrade Flow │
├─────────────────────────────────────────────────────────────┤
│ │
│ 1. New version pushed to main branch │
│ ↓ │
│ 2. GitHub Actions builds Docker image │
│ ↓ │
│ 3. ECS starts new task with new image │
│ ↓ │
│ 4. New task runs migrations automatically │
│ ↓ │
│ 5. Health check passes → traffic routes to new task │
│ ↓ │
│ 6. Old tasks drain connections gracefully │
│ ↓ │
│ 7. Old tasks terminate │
│ │
│ ✅ Database persists throughout — no data loss │
└─────────────────────────────────────────────────────────────┘
Database Migrations
Vulcan uses a sequential migration system that runs automatically on startup.
How It Works
- Version tracking: Each migration has a unique version number
- Tracked in database: Applied migrations are recorded in a migrations table
- Transactional: Each migration runs in a transaction (automatic rollback on error)
- Idempotent: Safe to run multiple times (applied migrations are not re-applied)
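The properties above can be illustrated with a small in-memory sketch. This is not Vulcan's runner: the Migration type here is a simplified stand-in, the applied map plays the role of the migrations table, and a failing Up function stands in for a rolled-back transaction. The names Run and errBoom are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Migration is a simplified, hypothetical stand-in for a real migration entry.
// A real Up function would receive a database transaction.
type Migration struct {
	Version     int
	Description string
	Up          func() error
}

// errBoom is a placeholder failure used only for this demonstration.
var errBoom = errors.New("boom")

// Run applies every migration whose version is not yet recorded in applied,
// in ascending version order. It stops at the first failure and does not
// record the failing version, which mimics a rolled-back transaction.
func Run(all []Migration, applied map[int]bool) ([]int, error) {
	sorted := append([]Migration(nil), all...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Version < sorted[j].Version })

	var ran []int
	for _, m := range sorted {
		if applied[m.Version] {
			continue // already recorded, so a rerun skips it
		}
		if err := m.Up(); err != nil {
			return ran, fmt.Errorf("migration %d (%s): %w", m.Version, m.Description, err)
		}
		applied[m.Version] = true
		ran = append(ran, m.Version)
	}
	return ran, nil
}

func main() {
	ok := func() error { return nil }
	migs := []Migration{
		{Version: 2, Description: "Add threat_intel_config table", Up: ok},
		{Version: 1, Description: "Add relay_tokens table", Up: ok},
	}
	applied := map[int]bool{}

	ran, err := Run(migs, applied)
	fmt.Println("first start:", ran, err)

	ran, err = Run(migs, applied)
	fmt.Println("second start:", ran, err)

	migs = append(migs, Migration{Version: 3, Description: "hypothetical failing migration", Up: func() error { return errBoom }})
	ran, err = Run(migs, applied)
	fmt.Println("failing start:", ran, err, "version 3 recorded:", applied[3])
}
```

The second call applies nothing because every version is already recorded, and the failing migration leaves no record behind, so the next startup would retry it.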
Migration Log
On startup, you'll see:
INF connecting to postgres...
INF postgres connected successfully
INF running database migrations...
INF Applying migration version=1 description="Add relay_tokens table"
INF Migration applied version=1 duration=45ms
INF Applying migration version=2 description="Add threat_intel_config table"
INF Migration applied version=2 duration=12ms
INF migrations complete
Checking Migration Status
SELECT version, description, applied_at, duration_ms
FROM migrations
ORDER BY version;
Data Persistence
What Survives Upgrades
| Data | Storage | Preserved |
|---|---|---|
| Tenants & users | RDS PostgreSQL | ✅ Yes |
| Discovered assets (nodes/edges) | RDS PostgreSQL | ✅ Yes |
| Compliance findings | RDS PostgreSQL | ✅ Yes |
| Credentials (encrypted) | RDS PostgreSQL | ✅ Yes |
| Audit logs | RDS PostgreSQL | ✅ Yes |
| Generated documents | RDS + S3 | ✅ Yes |
| Scheduled jobs | RDS PostgreSQL | ✅ Yes |
| License information | RDS PostgreSQL | ✅ Yes |
What Reconnects Automatically
| Component | Behavior |
|---|---|
| WebSocket connections | Clients auto-reconnect |
| Relay connectors | Auto-reconnect with backoff |
| Agent heartbeats | Resume on next interval |
| Scheduled scans | Continue from schedule |
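As a rough illustration of "auto-reconnect with backoff", the sketch below computes a capped exponential delay. The one-second base and 30-second cap are assumptions chosen for the example, not Vulcan's documented settings, and backoffDelay is a hypothetical helper. Production clients usually add random jitter as well.

```go
package main

import (
	"fmt"
	"time"
)

// backoffDelay returns the wait before reconnect attempt n (0-based):
// base doubled once per previous attempt, never exceeding limit.
func backoffDelay(attempt int, base, limit time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt && d < limit; i++ {
		d *= 2
	}
	if d > limit {
		d = limit
	}
	return d
}

func main() {
	// Assumed values for illustration only.
	const base, limit = time.Second, 30 * time.Second
	for attempt := 0; attempt <= 6; attempt++ {
		fmt.Printf("attempt %d: wait %s\n", attempt, backoffDelay(attempt, base, limit))
	}
}
```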
ECS Deployment Details
Rolling Update Strategy
deploymentConfiguration:
  maximumPercent: 200
  minimumHealthyPercent: 100
This means:
- New tasks start before old tasks stop
- At least 100% capacity maintained during deploy
- Up to 200% capacity during transition
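To make the percentages concrete, the following sketch converts them into task counts for a given desired count. The desired counts used here are assumptions for illustration, and deployBounds is a hypothetical helper. It follows the rounding ECS documents for rolling updates: the lower bound is rounded up and the upper bound is rounded down.

```go
package main

import "fmt"

// deployBounds returns the minimum number of tasks kept running and the
// maximum number of tasks allowed during a rolling deployment.
func deployBounds(desired, minHealthyPct, maxPct int) (minTasks, maxTasks int) {
	minTasks = (desired*minHealthyPct + 99) / 100 // round up
	maxTasks = desired * maxPct / 100             // round down
	return minTasks, maxTasks
}

func main() {
	// Desired counts are example values, not Vulcan defaults.
	for _, desired := range []int{1, 2, 4} {
		lo, hi := deployBounds(desired, 100, 200)
		fmt.Printf("desired=%d: at least %d running, at most %d in total\n", desired, lo, hi)
	}
}
```

With 100 and 200 percent, a service with two desired tasks never drops below two running tasks and may briefly run four.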
Health Checks
ECS waits for health checks before routing traffic:
healthCheck:
  command: ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"]
  interval: 30
  timeout: 5
  retries: 3
  startPeriod: 60
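One way a service can cooperate with this check is to report unhealthy until startup work such as migrations has finished. The sketch below shows that idea only. It is an assumption about how such an endpoint could behave, not Vulcan's actual handler, and the names ready, healthHandler, and probe are hypothetical. Because curl -f treats HTTP status codes of 400 and above as failures, a 503 response keeps the container check failing.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// ready is set to true once startup work (for example, migrations) is done.
var ready atomic.Bool

// healthHandler answers 200 when ready and 503 otherwise.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	if !ready.Load() {
		http.Error(w, "starting", http.StatusServiceUnavailable)
		return
	}
	w.WriteHeader(http.StatusOK)
	fmt.Fprintln(w, "ok")
}

// probe calls the handler in-process and returns the HTTP status code.
func probe() int {
	rec := httptest.NewRecorder()
	healthHandler(rec, httptest.NewRequest(http.MethodGet, "/health", nil))
	return rec.Code
}

func main() {
	fmt.Println("before startup completes:", probe())
	ready.Store(true) // a real service would do this after migrations succeed
	fmt.Println("after startup completes:", probe())
}
```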
Connection Draining
ALB drains connections gracefully:
- New requests → new tasks
- In-flight requests → complete on old tasks
- Default drain time: 300 seconds
Rollback Procedures
Automatic Rollback
If a new task fails health checks, ECS automatically:
- Stops the failing task
- Keeps old tasks running
- Reports deployment failure
Manual Rollback
Option 1: Revert to previous image
# Deploy previous version
aws ecs update-service \
  --cluster vulcan-prod \
  --service vulcan-prod \
  --task-definition vulcan-prod:PREVIOUS_REVISION \
  --force-new-deployment
Option 2: Database migration rollback
# Open a shell in the running container (uses ECS Exec, which must be enabled on the service)
aws ecs execute-command --cluster vulcan-prod --task TASK_ID --interactive --command "/bin/sh"
# Roll back last N migrations
vulcan migrate down 2
Option 3: Point-in-time recovery
# Restore RDS to specific time
aws rds restore-db-instance-to-point-in-time \
  --source-db-instance-identifier vulcan-prod \
  --target-db-instance-identifier vulcan-prod-restored \
  --restore-time 2026-04-10T12:00:00Z
Migration Best Practices
Safe Migration Patterns
| ✅ Safe | ❌ Avoid |
|---|---|
| ADD COLUMN | DROP COLUMN |
| CREATE TABLE IF NOT EXISTS | DROP TABLE |
| CREATE INDEX CONCURRENTLY | ALTER COLUMN TYPE |
| New nullable columns | Renaming columns |
Adding a New Migration
- Add to internal/storage/postgres/migrations.go:
{
	Version:     6,
	Description: "Add new_feature table",
	Up: func(ctx context.Context, tx pgx.Tx) error {
		_, err := tx.Exec(ctx, `
			CREATE TABLE IF NOT EXISTS new_feature (
				id TEXT PRIMARY KEY,
				tenant_id TEXT NOT NULL,
				created_at TIMESTAMPTZ DEFAULT NOW()
			);
			CREATE INDEX IF NOT EXISTS idx_new_feature_tenant
				ON new_feature(tenant_id);
		`)
		return err
	},
	Down: func(ctx context.Context, tx pgx.Tx) error {
		_, err := tx.Exec(ctx, `DROP TABLE IF EXISTS new_feature;`)
		return err
	},
},
- Test locally:
# Apply migration
go run ./cmd/vulcan serve --db-type postgres --db-url "$DB_URL"
# Verify
psql "$DB_URL" -c "SELECT * FROM migrations ORDER BY version;"
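Before running a new migration against a database, it can help to check the version numbers themselves. The sketch below is one possible check, suitable for a unit test. The checkVersions helper is hypothetical, and the rule that versions form a gap-free sequence starting at 1 is an assumption of this sketch. The source only states that versions are unique and sequential.

```go
package main

import (
	"fmt"
	"sort"
)

// checkVersions reports the first problem found in a list of migration
// versions: a non-positive number, a duplicate, or a gap in the sequence.
// Requiring a gap-free sequence starting at 1 is an assumption of this sketch.
func checkVersions(versions []int) error {
	sorted := append([]int(nil), versions...)
	sort.Ints(sorted)
	for i, v := range sorted {
		want := i + 1
		switch {
		case v < 1:
			return fmt.Errorf("version %d is not positive", v)
		case i > 0 && v == sorted[i-1]:
			return fmt.Errorf("version %d is duplicated", v)
		case v != want:
			return fmt.Errorf("expected version %d, found %d", want, v)
		}
	}
	return nil
}

func main() {
	fmt.Println(checkVersions([]int{1, 2, 3, 4, 5, 6}))
	fmt.Println(checkVersions([]int{1, 2, 2}))
	fmt.Println(checkVersions([]int{1, 2, 4}))
}
```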
Monitoring Upgrades
CloudWatch Metrics
Monitor during deployments:
- ECS/CPUUtilization: Should stay stable
- ECS/MemoryUtilization: Should stay stable
- ALB/HealthyHostCount: Should not drop to zero
- ALB/UnHealthyHostCount: Should return to zero
- RDS/DatabaseConnections: May spike briefly
Deployment Alerts
Set up alerts for:
ECS Deployment State = FAILED
ALB HealthyHostCount < 1
RDS DatabaseConnections > 80% of max
Scheduled Maintenance Windows
For major version upgrades that require extended migrations:
- Announce maintenance — Notify users in advance
- Scale down — Reduce to single task
- Run migrations — May take longer for large datasets
- Verify — Check migration status and data integrity
- Scale up — Return to normal capacity
- Monitor — Watch metrics for 30 minutes
FAQ
Do I need to stop the service for upgrades?
No. The rolling deployment strategy ensures zero downtime for normal upgrades.
What if a migration fails?
The migration runs in a transaction. On failure:
- Transaction rolls back automatically
- Task fails health check
- ECS keeps old tasks running
- No changes from the failed migration are committed
How long do migrations take?
Most migrations complete in under 1 second. Large table alterations may take longer, but these are rare and announced in release notes.
Can I skip migrations?
No. Migrations are required and run automatically. Skipping would cause schema mismatches and errors.
How do I check what migrations have run?
SELECT * FROM migrations ORDER BY version;
Or via API:
GET /api/v1/admin/migrations