Database Connectivity Issues ​

Last Updated: 2025-12-04
Alert: EmailDeliveryFailures, OracleBridgeHighErrorRate
Severity: Warning / High
Response Time: < 1 hour

Overview ​

This runbook covers troubleshooting connectivity issues between the oracle-bridge service and the external databases and services it depends on, including the PostgreSQL database used for user data sync (Epic 2.5).

Affected Systems ​

  • Oracle-bridge service (off-chain)
  • PostgreSQL database (user data sync)
  • External services (Stripe, email provider)

Symptoms ​

  • Alert firing: EmailDeliveryFailures or OracleBridgeHighErrorRate
  • Email verification emails not being sent
  • Stripe webhooks not being processed
  • Database sync failures
  • Oracle-bridge health check failing

Diagnosis ​

Step 1: Check Oracle-Bridge Health ​

bash
# Check health endpoint
curl -s https://oracle.helloworlddao.com/health

# Expected response: {"status": "ok"}
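
The check above can be scripted so that it returns an exit code instead of relying on a visual comparison. The following is a minimal sketch, assuming the health endpoint returns the JSON body shown above; the function names `health_body_ok` and `check_oracle_bridge_health` are hypothetical and not part of oracle-bridge.

```bash
# Return 0 when a health-response body reports "status": "ok", 1 otherwise.
# Whitespace around the colon is tolerated.
health_body_ok() {
  printf '%s' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}

# Hypothetical wrapper: fetch the endpoint (capped at 5 seconds) and
# evaluate the body. The default URL is the one used in this runbook.
check_oracle_bridge_health() {
  local url="${1:-https://oracle.helloworlddao.com/health}"
  local body
  body="$(curl -s --max-time 5 "$url")" || return 1
  health_body_ok "$body"
}
```

Calling `check_oracle_bridge_health && echo healthy || echo unhealthy` gives a one-line answer that can also be reused in a cron job or deploy script.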

Step 2: Check Database Connectivity ​

If oracle-bridge can't connect to PostgreSQL:

bash
# SSH to oracle-bridge server
ssh deploy@oracle-bridge-server

# Test database connection
psql -h <db-host> -U <db-user> -d helloworlddao -c "SELECT 1"

# Check connection pooling:
# review the connection pool metrics in the oracle-bridge logs (see Step 4)

Step 3: Check External Service Status ​

| Service | Status Page |
| --- | --- |
| SendGrid | https://status.sendgrid.com |
| Stripe | https://status.stripe.com |
| PostgreSQL (cloud) | Check cloud provider status |

Step 4: Review Logs ​

bash
# Oracle-bridge logs
ssh deploy@oracle-bridge-server
cd /home/deploy/oracle-bridge
npm run logs

# Look for:
# - Connection refused
# - Timeout errors
# - Authentication failures
# - SSL/TLS errors
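
To go from a log line to the matching Resolution scenario more quickly, the error categories above can be mapped onto scenario letters. This is a rough sketch: the function name `classify_db_error` is hypothetical, and the matched substrings are common PostgreSQL client and Node.js socket error texts, which may differ from what oracle-bridge actually logs.

```bash
# Print the Resolution scenario letter (A-E) that best matches a log line.
# Returns 1 and prints "unknown" when nothing matches.
classify_db_error() {
  local line
  # Lowercase the input so matching is case-insensitive.
  line="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$line" in
    *"too many clients"*|*"too many connections"*|*"remaining connection slots"*)
      echo "B" ;;  # Connection pool exhausted
    *"password authentication failed"*|*"no pg_hba.conf entry"*)
      echo "D" ;;  # Authentication failures (checked before SSL: pg_hba errors mention "SSL off")
    *"ssl"*|*"tls"*|*"certificate"*)
      echo "C" ;;  # SSL/TLS issues
    *"connection refused"*|*"econnrefused"*)
      echo "A" ;;  # Database connection refused
    *"timed out"*|*"timeout"*|*"etimedout"*)
      echo "E" ;;  # Network/firewall issues
    *)
      echo "unknown"; return 1 ;;
  esac
}
```

For example, `classify_db_error 'connect ECONNREFUSED 10.0.0.5:5432'` prints `A`, pointing at Scenario A.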

Resolution ​

Scenario A: Database Connection Refused ​

bash
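# Check if the database accepts connections
# (-d is given explicitly; without it psql falls back to a database named after the user)
psql -h <db-host> -U <db-user> -d helloworlddao -c "SELECT 1"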

# If connection refused:
# 1. Check if database server is running
# 2. Check firewall rules allow connection
# 3. Check security group settings (cloud)
# 4. Verify database credentials

Scenario B: Connection Pool Exhausted ​

If seeing "too many connections" errors:

bash
# Check current connections
psql -h <db-host> -U <db-user> -d helloworlddao -c \
  "SELECT count(*) FROM pg_stat_activity WHERE datname = 'helloworlddao'"

# Kill idle connections if needed
psql -h <db-host> -U <db-user> -d helloworlddao -c \
  "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'helloworlddao' AND state = 'idle' AND state_change < NOW() - INTERVAL '10 minutes'"

For persistent issues:

  1. Increase max_connections in PostgreSQL (the change requires a server restart)
  2. Reduce connection pool size in oracle-bridge
  3. Put a connection pooler (PgBouncer) in front of the database
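
Before choosing between these options, it helps to know how close the server is to its limit. The sketch below turns the connection count from the query above and the server's `max_connections` setting into a usage percentage; the function names and the 80% default threshold are assumptions for illustration, not values taken from the oracle-bridge configuration.

```bash
# Print the integer percentage of max_connections currently in use.
# Both arguments must be plain non-negative integers; max must be > 0.
connection_usage_pct() {
  local current="$1" max="$2" n
  for n in "$current" "$max"; do
    case "$n" in
      ''|*[!0-9]*)
        echo "usage: connection_usage_pct <current> <max>" >&2
        return 2 ;;
    esac
  done
  [ "$max" -gt 0 ] || { echo "max must be greater than 0" >&2; return 2; }
  echo $(( current * 100 / max ))
}

# Return 0 when usage is at or above the threshold (default 80, an assumed value).
connections_near_limit() {
  local pct
  pct="$(connection_usage_pct "$1" "$2")" || return 2
  [ "$pct" -ge "${3:-80}" ]
}
```

The two inputs can be read with `psql -At`, for example `psql -h <db-host> -U <db-user> -d helloworlddao -At -c "SHOW max_connections"` for the limit, and the same flags on the count query above for the current value.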

Scenario C: SSL/TLS Issues ​

If seeing SSL handshake failures:

bash
# Test SSL connection (-starttls postgres requires OpenSSL 1.1.1 or newer)
openssl s_client -connect <db-host>:5432 -starttls postgres

# Verify certificate
# Check if CA cert is installed correctly
# Update SSL mode in connection string if needed
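
When checking the SSL mode, it is useful to confirm what the connection string actually requests. The sketch below extracts the `sslmode` parameter from a libpq-style connection URI and falls back to `prefer`, which is libpq's default when the parameter is absent. The function name `sslmode_of_uri` is hypothetical, and the assumption that oracle-bridge uses a libpq-style URI is not confirmed by this runbook; a Node.js driver may interpret the parameter differently.

```bash
# Print the sslmode query parameter of a libpq-style connection URI,
# or "prefer" (libpq's default) when it is not set.
sslmode_of_uri() {
  local uri="$1" query pair
  # No query string at all: nothing to parse.
  case "$uri" in
    *\?*) ;;
    *) echo "prefer"; return 0 ;;
  esac
  query="${uri#*\?}"
  # Walk the '&'-separated key=value pairs one at a time.
  while [ -n "$query" ]; do
    pair="${query%%&*}"
    case "$pair" in
      sslmode=*) echo "${pair#sslmode=}"; return 0 ;;
    esac
    case "$query" in
      *"&"*) query="${query#*&}" ;;
      *) query="" ;;
    esac
  done
  echo "prefer"
}
```

Valid libpq values are `disable`, `allow`, `prefer`, `require`, `verify-ca`, and `verify-full`; only the last two verify the server certificate against the installed CA.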

Scenario D: Authentication Failures ​

bash
# Verify credentials work
psql -h <db-host> -U <db-user> -d helloworlddao

# If password auth fails:
# 1. Verify password in environment variables
# 2. Check pg_hba.conf allows connection method
# 3. Verify user exists with correct permissions

Scenario E: Network/Firewall Issues ​

bash
# Test network connectivity
nc -zv <db-host> 5432

# If timeout:
# 1. Check security groups (cloud)
# 2. Check network ACLs
# 3. Check local firewall
# 4. Check VPC peering if applicable

Scenario F: Oracle-Bridge Service Down ​

bash
# SSH to server
ssh deploy@oracle-bridge-server

# Check service status
pm2 status

# Restart if needed
pm2 restart oracle-bridge

# Check logs for startup errors
pm2 logs oracle-bridge

Database Recovery ​

Recovering from Connection Issues ​

  1. Restart connection pools:

    bash
    pm2 restart oracle-bridge
  2. Clear stale connections:

    sql
    SELECT pg_terminate_backend(pid)
    FROM pg_stat_activity
    WHERE datname = 'helloworlddao'
    AND state = 'idle'
    AND state_change < NOW() - INTERVAL '5 minutes';
  3. Verify sync is working:

    • Check recent records in database
    • Compare with canister state

Data Sync Verification ​

After connectivity is restored:

sql
-- Check latest sync timestamp
SELECT MAX(updated_at) FROM users;

-- Count records
SELECT COUNT(*) FROM users;

-- Compare with the canister count (run from a shell, not from psql):
-- dfx canister call user_service get_stats

External Service Issues ​

Email Provider (SendGrid) ​

If email delivery is failing:

  1. Check SendGrid status page
  2. Review SendGrid dashboard for:
    • Bounces
    • Blocks
    • Spam reports
  3. Check API key validity
  4. Review rate limits

Stripe ​

If Stripe webhooks are failing:

  1. Check Stripe dashboard > Developers > Webhooks
  2. Review failed webhook attempts
  3. Verify webhook secret is correct
  4. Check endpoint URL is reachable

Post-Resolution ​

Step 1: Verify Services ​

bash
# Health check
curl -s https://oracle.helloworlddao.com/health

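# Test email sending (the Content-Type header is required for a JSON body;
# curl -d otherwise sends application/x-www-form-urlencoded)
curl -X POST https://oracle.helloworlddao.com/test-email \
  -H "Authorization: Bearer <test-token>" \
  -H "Content-Type: application/json" \
  -d '{"email": "test@example.com"}'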

Step 2: Monitor ​

  • Watch Grafana for 30 minutes
  • Verify error rates return to normal
  • Confirm email delivery is working

Step 3: Document ​

If this was a significant outage:

  1. Document root cause
  2. Update monitoring for earlier detection
  3. Create tickets for improvements

Prevention ​

Database Best Practices ​

  1. Connection pooling - Use PgBouncer or built-in pooling
  2. Health checks - Regular database health monitoring
  3. Alerts - Set up connection count alerts
  4. Backups - Regular automated backups

Network Resilience ​

  1. Retry logic - Implement exponential backoff
  2. Circuit breakers - Fail fast when database is down
  3. Timeouts - Set appropriate connection timeouts
  4. Redundancy - Consider read replicas
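
The retry-with-exponential-backoff item above can be illustrated with a small shell helper. This is a sketch of the pattern only: the name `retry_with_backoff` and its argument order are assumptions, and oracle-bridge itself would implement the equivalent logic in its own code rather than in shell.

```bash
# retry_with_backoff <max_attempts> <base_delay_seconds> <command...>
# Runs the command until it succeeds or max_attempts is reached,
# doubling the delay after every failed attempt.
retry_with_backoff() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "attempt ${attempt} failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$(( delay * 2 ))
    attempt=$(( attempt + 1 ))
  done
}
```

During an incident the same helper can wait for the database to come back, for example `retry_with_backoff 5 1 pg_isready -h <db-host> -p 5432`, which waits 1, 2, 4, and 8 seconds between attempts. Capping the number of attempts is what keeps this from masking a real outage; pairing it with a circuit breaker avoids hammering a database that is already down.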

Escalation ​

| Condition | Action |
| --- | --- |
| Database unrecoverable | Contact DBA / cloud support |
| Data corruption suspected | Contact team lead immediately |
| Cloud provider issue | Open support ticket |
| Prolonged email outage | Contact SendGrid support |

Hello World Co-Op DAO