Skip to content
🔒

Login Required

You need to be logged in to view this content. This page requires Developer access.

Database Sync Architecture

Story: 2-5-2 Database Sync After Email Verification Version: 1.0 Last Updated: 2025-11-25

Overview

The database sync system replicates verified user PII from the user-service canister to the oracle-bridge PostgreSQL database for operational use (marketing, exports, queries).

Architecture Principles

1. Canister is Authoritative

  • Source of Truth: User-service canister (encrypted PII + recovery keys)
  • Convenience Copy: PostgreSQL database (plaintext PII for marketing)
  • Recovery: Database can be lost/rebuilt from canister data

2. Zero-Knowledge Maintained

  • Canister Storage: Encrypted PII (AES-256-GCM with master recovery key)
  • Database Storage: Plaintext PII (synced after verification)
  • Client-Side Encryption: User password never sent to canister
  • Sync Limitation: Only plaintext provided by frontend can be synced

3. Sync After Verification Only

  • Privacy-Respecting: No sync of unverified users
  • Trigger: Email verification completion (verify_code() success)
  • Non-Blocking: Sync failure does NOT block verification success

4. Canister Signature Authentication

  • Ed25519 Signatures: All sync requests must be signed by canister
  • Prevents Tampering: Oracle-bridge validates signature before accepting data
  • Timestamp Validation: 5-minute window to prevent replay attacks

Sync Flow

Automatic Sync (Normal Flow)

┌─────────────────┐
│   Frontend      │
│  (React + II)   │
└────────┬────────┘

         │ 1. User submits verification code
         │    + email
         │    + verification code
         │    + first_name (plaintext)
         │    + last_name (plaintext)


┌─────────────────────────────────────────┐
│  User-Service Canister                  │
│  verify_code(email, code, first, last)  │
└────────┬────────────────────────────────┘

         │ 2. Validate code + email_hash
         │ 3. Mark user as verified
         │ 4. Update verified_at timestamp

         │ 5. Spawn async sync task
         │    (fire-and-forget)


┌──────────────────────────────────────────┐
│  sync_individual_to_database()           │
│  - Build JSON body with plaintext PII    │
│  - Generate Ed25519 signature            │
│  - HTTP POST to oracle-bridge            │
└────────┬─────────────────────────────────┘

         │ 6. HTTP outcall with signature
         │    Headers:
         │    - X-Canister-ID: <canister_id>
         │    - X-Signature: <ed25519_sig>
         │    - X-Timestamp: <milliseconds>


┌──────────────────────────────────────────┐
│  Oracle-Bridge                           │
│  POST /sync/individual                   │
└────────┬─────────────────────────────────┘

         │ 7. requireSignature middleware
         │    - Fetch canister public key
         │    - Verify Ed25519 signature
         │    - Check timestamp (5 min window)

         │ 8. Validate required fields
         │    - id, email, first_name, last_name
         │    - email_hash, submitted_at, etc.

         │ 9. UPSERT to PostgreSQL
         │    - INSERT ... ON CONFLICT UPDATE
         │    - Idempotent (safe to retry)

         │ 10. Log audit event
         │     - Action: SYNC
         │     - User: canister


┌──────────────────────────────────────────┐
│  PostgreSQL Database                     │
│  individuals table                       │
│  - Plaintext PII stored                  │
│  - synced_at = NOW()                     │
└──────────────────────────────────────────┘

Error Handling:

  • Sync failure is logged but does NOT block verification
  • User sees "Email verified successfully!" even if sync fails
  • Sync can be retried manually via admin endpoint

Manual Resync (Admin Flow)

Option 1: Specific User Resync

bash
curl -X POST https://oracle-bridge.example.com/admin/resync/individual \
  -H "Content-Type: application/json" \
  -H "X-Canister-ID: <canister_id>" \
  -H "X-Signature: <signature>" \
  -H "X-Timestamp: <timestamp>" \
  -d '{
    "id": "user-123",
    "email": "john@example.com",
    "first_name": "John",
    "last_name": "Doe",
    "email_hash": "abc123...",
    "verified": true,
    "verified_at": "1700000000000000000",
    "gdpr_marketing_consent": true
  }'

Use Cases:

  • Sync failure during automatic sync
  • Database record corruption
  • Manual data correction

Option 2: Batch Metadata Resync

bash
npm run resync -- \
  --canister-id rrkah-fqaaa-aaaaa-aaaaq-cai \
  --verified-only \
  --dry-run

What Gets Synced:

  • ✅ email_hash (SHA-256)
  • ✅ verified status (boolean)
  • ✅ verified_at timestamp
  • ✅ submitted_at timestamp
  • ❌ Plaintext PII (NOT available - encrypted in canister)

Limitations:

  • Cannot recover plaintext email, first_name, last_name
  • Users must re-verify email to populate plaintext fields
  • Only syncs verification metadata

API Specification

POST /sync/individual

Authentication: Canister signature (Ed25519)

Headers:

Content-Type: application/json
X-Canister-ID: <canister_id>
X-Signature: <base64_signature>
X-Timestamp: <milliseconds>

Request Body:

json
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "email": "user@example.com",
  "first_name": "John",
  "last_name": "Doe",
  "email_hash": "2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae",
  "verified": true,
  "verified_at": "1700000000000000000",
  "submitted_at": "1699000000000000000",
  "encryption_key_id": "key-123",
  "gdpr_marketing_consent": true,
  "gdpr_deleted": false
}

Response (Success - 200 OK):

json
{
  "success": true
}

Response (Validation Error - 400 Bad Request):

json
{
  "error": "Missing required fields: id, email, first_name, last_name, email_hash, submitted_at, encryption_key_id"
}

Response (Authentication Error - 401 Unauthorized):

json
{
  "error": "Invalid canister signature"
}

Response (Server Error - 500 Internal Server Error):

json
{
  "error": "Failed to sync individual data"
}

POST /admin/resync/individual

Authentication: Canister signature (Ed25519)

Purpose: Manual resync for specific user (admin operation)

Request/Response: Same as /sync/individual but logs action as MANUAL_RESYNC


GET /admin/resync/status

Authentication: Canister signature (Ed25519)

Purpose: Get database sync statistics

Response (200 OK):

json
{
  "success": true,
  "statistics": {
    "total_users": "1523",
    "verified_users": "1204",
    "deleted_users": "12",
    "oldest_sync": "2025-10-01T10:30:00.000Z",
    "latest_sync": "2025-11-25T14:22:33.000Z",
    "stale_syncs": "45"
  }
}

Signature Authentication

Canister Registration

Before syncing, canister must register its public key:

typescript
// 1. Canister generates Ed25519 keypair during init
const keypair = nacl.sign.keyPair();

// 2. Canister registers public key with oracle-bridge
POST /canister/register
{
  "canisterId": "rrkah-fqaaa-aaaaa-aaaaq-cai",
  "publicKey": "<base64_public_key>",
  "signature": "<self_signature>",
  "timestamp": 1700000000000
}

Signature Generation (Canister)

rust
// user-service/src/lib.rs:3099-3118
fn sign_http_request(
    method: &str,
    path: &str,
    body: &str,
    timestamp: u64,
    signing_key_bytes: &[u8]
) -> Result<String, String> {
    let signing_key = SigningKey::from_bytes(&key_bytes);

    // Message format: timestamp + method + path + body
    let message = format!("{}{}{}{}", timestamp, method, path, body);

    // Sign with Ed25519
    let signature = signing_key.sign(message.as_bytes());

    // Return base64-encoded signature
    Ok(base64_engine.encode(signature.to_bytes()))
}

Signature Verification (Oracle-Bridge)

typescript
// oracle-bridge/src/middleware/signature-auth.ts:33-118
export async function requireSignature(req, res, next) {
  const canisterId = req.headers['x-canister-id'];
  const signature = req.headers['x-signature'];
  const timestamp = req.headers['x-timestamp'];

  // 1. Validate timestamp (5-minute window)
  const now = Date.now();
  const fiveMinutes = 5 * 60 * 1000;
  if (Math.abs(now - parseInt(timestamp)) > fiveMinutes) {
    return res.status(401).json({ error: 'Signature expired' });
  }

  // 2. Fetch canister public key (from cache or storage)
  const publicKey = await getCanisterPublicKey(canisterId);

  // 3. Reconstruct message
  const message = timestamp + req.method + req.path + JSON.stringify(req.body);

  // 4. Verify Ed25519 signature
  const verified = nacl.sign.detached.verify(
    Buffer.from(message, 'utf-8'),
    Buffer.from(signature, 'base64'),
    Buffer.from(publicKey, 'base64')
  );

  if (!verified) {
    return res.status(401).json({ error: 'Invalid signature' });
  }

  // 5. Attach canister ID to request
  req.canisterId = canisterId;
  next();
}

Database Operations

UPSERT Implementation

typescript
// oracle-bridge/src/db/individuals.ts:27-87
export async function upsertIndividual(data: IndividualData): Promise<void> {
  const query = `
    INSERT INTO individuals (
      id, email, first_name, last_name, email_hash,
      verified, verified_at, submitted_at,
      canister_id, encryption_key_id,
      gdpr_marketing_consent, gdpr_deleted, synced_at
    )
    VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, NOW())
    ON CONFLICT (id) DO UPDATE SET
      email = EXCLUDED.email,
      first_name = EXCLUDED.first_name,
      last_name = EXCLUDED.last_name,
      verified = EXCLUDED.verified,
      verified_at = EXCLUDED.verified_at,
      gdpr_marketing_consent = EXCLUDED.gdpr_marketing_consent,
      gdpr_deleted = EXCLUDED.gdpr_deleted,
      synced_at = NOW(),
      updated_at = NOW()
  `;

  await pool.query(query, values);
}

Idempotency:

  • Multiple syncs with same data = no duplicate rows
  • Latest data always wins (EXCLUDED values)
  • synced_at always updated to track freshness

Error Handling

Canister-Side Errors

rust
// Sync failure is logged but does NOT block verification
ic_cdk::spawn(async move {
    match sync_individual_to_database(...).await {
        Ok(_) => ic_cdk::println!("✅ Synced user {} to database", user_id),
        Err(e) => ic_cdk::println!("⚠️  Database sync failed: {}", e),
        // User verification still succeeds!
    }
});

Rationale: Email verification is critical user flow, database sync is secondary.

Oracle-Bridge Errors

typescript
try {
  await upsertIndividual(individualData);
  await logAudit('SYNC', 'individuals', id, 'canister', ...);
  res.json({ success: true });
} catch (error) {
  console.error('Error syncing individual:', error);
  res.status(500).json({ error: 'Failed to sync individual data' });
}

Audit Logging:

  • Audit failures do NOT block sync (internal error handling)
  • Logged to console for monitoring

Monitoring

Sync Health Indicators

  1. Stale Syncs: Users with synced_at > 7 days old

    sql
    SELECT COUNT(*) FROM individuals
    WHERE synced_at < NOW() - INTERVAL '7 days';
  2. Unverified Users: Should NOT be in database

    sql
    SELECT COUNT(*) FROM individuals WHERE verified = FALSE;
  3. GDPR Compliance: Deleted users excluded from marketing

    sql
    SELECT COUNT(*) FROM individuals WHERE gdpr_deleted = TRUE;
  4. Audit Log Analysis: Track sync operations

    sql
    SELECT action, COUNT(*) FROM audit_log
    WHERE table_name = 'individuals'
      AND created_at > NOW() - INTERVAL '24 hours'
    GROUP BY action;

Security Considerations

1. Signature Validation

  • Algorithm: Ed25519 (collision-resistant, 256-bit security)
  • Timestamp Window: 5 minutes (prevents replay attacks)
  • Public Key Storage: Cached in oracle-bridge (verified during registration)

2. Data Integrity

  • Canister Signature: Ensures data comes from trusted canister
  • HTTPS Only: All communication over TLS 1.3
  • No Raw Credentials: Database password in environment variables only

3. Privacy

  • Zero-Knowledge Canister: Never sees plaintext passwords or keys
  • Plaintext Database: Only stores data after user email verification
  • GDPR Soft Deletes: Deleted users marked, not hard-deleted

Performance

Throughput

  • Async Sync: Non-blocking, spawned as background task
  • Database Connection Pool: 20 connections max
  • Query Timeout: 2 seconds per operation

Optimization

  • Indexes: All query columns indexed (email, email_hash, verified, etc.)
  • UPSERT: Single query (no SELECT + INSERT/UPDATE race condition)
  • Pagination: Batch resync uses 100 users per page

Production Infrastructure

Implemented: Epic 2.5, Story 2-5-3 (2025-11-25)

Automated Backups

  • Schedule: Daily at 2:00 AM UTC, monthly on 1st of each month
  • Storage: AWS S3 (s3://hwdao-database-backups/)
  • Encryption: Optional GPG (AES-256) or AWS KMS
  • Retention: 30 daily backups + 12 monthly backups
  • Validation: Monthly restoration tests to staging database
  • Scripts: oracle-bridge/scripts/backup-database.sh, validate-backup.sh

SSL/TLS Security

  • Protocol: TLS 1.2+ (TLS 1.0/1.1 disabled)
  • Certificates: Let's Encrypt or cloud provider managed
  • Connection Mode: sslmode=require in DATABASE_URL
  • Auto-Renewal: Certbot cron job (twice daily)
  • Monitoring: Certificate expiration alerts (30 days, 7 days)

Monitoring and Alerting

  • Metrics: Uptime (60s checks), disk space, connection pool, slow queries, backup success
  • Alerts: Critical (database down, disk >90%, backup failure), Warning (disk >80%, slow queries)
  • Dashboard: Grafana with Prometheus metrics
  • Configuration: oracle-bridge/monitoring/ (prometheus.yml, alert-rules.yml, grafana-dashboard.json)

Disaster Recovery

  • RTO: 4 hours (time to full restoration)
  • RPO: 24 hours (maximum data loss)
  • Primary Method: Backup restoration (Procedure D in runbook)
  • Fallback Method: Canister metadata resync (Procedure A)
  • Runbook: docs/operations/database-rebuild-runbook.md

Production Readiness Checklist

  • [x] Automated daily backups configured
  • [x] SSL/TLS enforced for all connections
  • [x] Monitoring dashboards deployed
  • [x] Alerting configured for critical failures
  • [x] Disaster recovery procedures documented
  • [x] Monthly backup validation scheduled
  • [x] Certificate auto-renewal configured

Next Steps for Deployment:

  1. Configure production environment variables (DATABASE_URL with SSL, BACKUP_S3_BUCKET)
  2. Set up AWS S3 bucket for backups
  3. Install and configure Prometheus + Grafana
  4. Schedule cron jobs for backups and validation
  5. Configure certbot for SSL certificate management
  6. Test disaster recovery procedures in staging
  7. Schedule first quarterly disaster recovery drill

Future Enhancements

  • Replication: Multi-region read replicas for high availability
  • Point-in-Time Recovery: Transaction log archival for sub-daily RPO
  • Query Optimization: Automated slow query analysis and index recommendations
  • Capacity Planning: Predictive analytics for storage and connection pool scaling

Hello World Co-Op DAO