Testing Pyramid Guidelines

This document defines what belongs at each test layer for each repository type in the FounderyOS ecosystem.

Overview

                    /\
                   /  \
                  /E2E \        Slow, expensive, high confidence
                 /      \       Browser-based user flows
                /________\
               /          \
              / Integration \   Medium speed, real dependencies
             /    Tests      \  Multi-component interactions
            /________________\
           /                  \
          /    Unit Tests      \  Fast, isolated, high coverage
         /                      \ Pure functions, components, stores
        /________________________\

Pyramid Principle: More tests at the bottom (fast, cheap), fewer at the top (slow, expensive).

Repository Type 1: Frontend (React/Vite)

Examples: foundery-os-suite

Unit Tests

Tools: Vitest + React Testing Library + jsdom

What belongs here:

Store logic (atoms, actions, computed values)
Service functions (data transformation, validation, API helpers)
Utility functions (formatters, parsers, calculations)
Component behavior (renders correctly with props, handles events)
Hooks (custom hook logic)

Coverage target: 80%+ for business logic, 70%+ for components

Mocking strategy:

typescript

// DO: Mock external dependencies
vi.mock('../services/captureService', () => ({
  createCapture: vi.fn().mockResolvedValue({ id: 1, title: 'Test' }),
}));

// DO: Mock browser APIs not available in jsdom
vi.stubGlobal('localStorage', localStorageMock);

// DO: Stub browser APIs that third-party UI libraries expect but jsdom lacks
class ResizeObserverMock {
  observe = () => {};
  unobserve = () => {};
  disconnect = () => {};
}
vi.stubGlobal('ResizeObserver', ResizeObserverMock);

// DON'T: Mock the code under test
// DON'T: Mock simple utilities that can run in jsdom
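
The `localStorageMock` passed to `vi.stubGlobal` above is not defined in this guide. One way to build it, as a minimal sketch, is an in-memory object that exposes the same methods as the Web Storage API. The factory name `createLocalStorageMock` is an assumption made for illustration.

```typescript
// Hypothetical helper: an in-memory stand-in for window.localStorage.
// It mirrors the Web Storage API surface (getItem, setItem, removeItem,
// clear, key, length) so code under test can use it like the real thing.
function createLocalStorageMock() {
  let entries = new Map<string, string>();

  return {
    get length(): number {
      return entries.size;
    },
    getItem(key: string): string | null {
      // Real localStorage returns null, not undefined, for missing keys.
      return entries.has(key) ? entries.get(key)! : null;
    },
    setItem(key: string, value: string): void {
      // Real localStorage coerces every value to a string.
      entries.set(key, String(value));
    },
    removeItem(key: string): void {
      entries.delete(key);
    },
    clear(): void {
      entries = new Map();
    },
    key(index: number): string | null {
      return Array.from(entries.keys())[index] ?? null;
    },
  };
}

const localStorageMock = createLocalStorageMock();
```

Calling `localStorageMock.clear()` in a `beforeEach` hook keeps one test's writes from leaking into the next.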

Example patterns:

typescript

// Store test - pure logic
describe('chatStore', () => {
  beforeEach(() => {
    resetChatStore();
  });

  it('should create conversation and auto-select it (AC-2.4.1.4)', () => {
    const convId = createConversation('agent');
    expect($activeConversationId.get()).toBe(convId);
  });
});

// Service test - data transformation
describe('chatService', () => {
  it('should parse message timestamps correctly', () => {
    saveMessage('conv-1', createMockMessage({ id: 'msg-1' }));
    const messages = getMessages('conv-1');
    expect(messages[0].timestamp).toBeInstanceOf(Date);
  });
});

// Component test - behavior not visuals
describe('ChatMessage', () => {
  it('should render user message content', () => {
    render(<ChatMessage message={mockMessage} />);
    expect(screen.getByText('Hello, World!')).toBeInTheDocument();
  });

  it('should display agent name when provided', () => {
    render(<ChatMessage message={{ ...mockMessage, agentName: 'Winston' }} />);
    expect(screen.getByText('Winston')).toBeInTheDocument();
  });
});
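
The service test above depends on `getMessages` turning stored timestamps back into `Date` objects, because JSON serialization reduces a `Date` to an ISO string. The following is a sketch of that roundtrip under assumed names: the `Message` shape, the storage key format, and the in-memory `storage` map are illustrative and not the actual `chatService` implementation.

```typescript
// Assumed message shape; the real type in chatService may differ.
interface Message {
  id: string;
  content: string;
  timestamp: Date;
}

// In-memory stand-in for localStorage so the sketch runs anywhere.
const storage = new Map<string, string>();

// Hypothetical key format for one conversation's messages.
const storageKey = (conversationId: string): string => `chat:${conversationId}`;

type StoredMessage = Omit<Message, 'timestamp'> & { timestamp: string };

function getMessages(conversationId: string): Message[] {
  const raw = storage.get(storageKey(conversationId));
  if (raw === undefined) return [];
  // JSON.parse yields timestamps as ISO strings; revive them as Dates.
  const parsed = JSON.parse(raw) as StoredMessage[];
  return parsed.map((m) => ({ ...m, timestamp: new Date(m.timestamp) }));
}

function saveMessage(conversationId: string, message: Message): void {
  const messages = [...getMessages(conversationId), message];
  // JSON.stringify calls Date.prototype.toJSON, producing ISO strings.
  storage.set(storageKey(conversationId), JSON.stringify(messages));
}
```

Without the revival step in `getMessages`, the `toBeInstanceOf(Date)` assertion would fail, because the value read back would be a string.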

Integration Tests

Tools: Vitest + React Testing Library (jsdom)

What belongs here:

Multi-component workflows (modal open -> fill form -> submit -> close)
Store + service + component interactions
Form validation flows
State persistence roundtrips (save -> reload -> verify)

Coverage target: Critical user paths only

Mocking strategy:

typescript

// DO: Mock external services (API calls, canisters)
vi.mock('../services/captureService', () => ({
  createCapture: vi.fn().mockResolvedValue({ id: 1 }),
}));

// DON'T: Mock internal store/service communication
// Let stores and services interact naturally

Example pattern:

typescript

describe('CaptureModal Integration Tests', () => {
  it('should complete full capture creation workflow', async () => {
    openModal();
    render(<CaptureModal />);

    // 1. Select type
    fireEvent.click(screen.getByRole('button', { name: 'Project' }));

    // 2. Fill in fields
    fireEvent.change(screen.getByLabelText(/title/i), {
      target: { value: 'My Epic Project' }
    });

    // 3. Submit
    fireEvent.click(screen.getByRole('button', { name: /create project/i }));

    // 4. Verify modal closes
    await waitFor(() => {
      expect(screen.queryByText('New Project')).toBeNull();
    });
  });
});

E2E Tests

Tools: Playwright

What belongs here:

Critical user flows (login, core features)
Cross-page navigation
Visual regression (screenshot comparison)
Accessibility verification
Real browser interactions (hover, drag-drop, keyboard)

Coverage target: Critical paths only (5-10 scenarios)

Example pattern:

typescript

// e2e/capture.spec.ts
test('create and view capture', async ({ page }) => {
  await page.goto('/captures');

  // Open modal
  await page.click('[data-testid="new-capture-button"]');

  // Fill form
  await page.fill('[data-testid="title-input"]', 'Test Capture');

  // Submit
  await page.click('[data-testid="submit-button"]');

  // Verify in list
  await expect(page.locator('text=Test Capture')).toBeVisible();
});

Manual QA

What requires human verification:

Visual design fidelity
Animation smoothness
Responsive behavior at edge breakpoints
Complex drag-and-drop interactions
Color accessibility (WCAG contrast)
Real third-party integrations (OAuth flows)

Repository Type 2: Canister (Rust/IC)

Examples: auth-service, membership, governance, treasury, otter-camp

Unit Tests

Tools: cargo test with #[test] attribute

What belongs here:

Pure functions (validation, transformation, calculations)
State management helpers
Candid type serialization
Error handling paths
Business logic that doesn't require IC runtime

Coverage target: 80%+ for business logic

Mocking strategy:

rust

// DO: Test pure functions directly
#[test]
fn test_hash_email() {
    let result = hash_email("Test@Example.com");
    assert_eq!(result.len(), 64); // SHA-256 digest as hex: 64 characters
}

// DO: Use test fixtures for complex types
fn create_test_session() -> UserSession {
    UserSession {
        session_id: "test-session".to_string(),
        user_id: "test-user".to_string(),
        // ...
    }
}

// DON'T: Try to mock ic_cdk functions in unit tests
// Use integration tests for IC-specific behavior

Integration Tests

Tools: PocketIC (local IC replica)

What belongs here:

Canister lifecycle (install, upgrade)
Query and update calls
Inter-canister communication
Access control validation
Session/token management
Full workflow validation

Coverage target: All public canister methods

Mocking strategy:

rust

// DO: Use PocketIC for real IC behavior
fn setup() -> (PocketIc, Principal) {
    let pic = PocketIc::new();
    let canister_id = pic.create_canister();
    pic.add_cycles(canister_id, 2_000_000_000_000);
    pic.install_canister(canister_id, wasm, init_args, None);
    (pic, canister_id)
}

// DO: Mock external canisters when testing one canister in isolation
// Create mock_user_service for testing auth-service

// DON'T: Skip PocketIC tests - they're the primary integration test layer

Example patterns:

rust

// Smoke test - basic canister functionality
#[test]
fn test_health_check() {
    let (pic, canister_id) = setup();

    let response = pic.query_call(
        canister_id,
        Principal::anonymous(),
        "health",
        encode_one(()).unwrap(),
    ).unwrap();

    let health: String = decode_one(&unwrap_wasm_result(response)).unwrap();
    assert_eq!(health, "ok");
}

// Business logic test
#[test]
fn test_create_session_basic() {
    let (pic, canister_id) = setup();

    let response = pic.update_call(
        canister_id,
        Principal::anonymous(),
        "create_session_for_user",
        encode_args((
            "user-test".to_string(),
            AuthMethodType::EmailPassword,
            // ...
        )).unwrap(),
    ).unwrap();

    let session: UserSession = decode_one(&unwrap_wasm_result(response)).unwrap();

    assert_eq!(session.user_id, "user-test");
    assert!(!session.access_token.is_empty());
}

E2E Tests

Tools: PocketIC multi-canister setup + Frontend E2E

What belongs here:

Full user flows spanning frontend + canisters
Multi-canister interactions (auth -> membership -> governance)
Token burn/mint flows
NFT lifecycle

Coverage target: Critical business flows only

Manual QA

What requires human verification:

Mainnet deployment verification
Cycle consumption monitoring
Upgrade migration success
Real wallet integrations (Internet Identity)

Repository Type 3: Service (TypeScript/Node)

Examples: oracle-bridge, foundery-os-agents

Unit Tests

Tools: Vitest

What belongs here:

Route handler logic (with mocked dependencies)
Middleware functions
Service layer business logic
Data transformation
Input validation
Error handling paths

Coverage target: 80%+ for business logic

Mocking strategy:

typescript

// DO: Mock external services (canisters, third-party APIs)
const mockValidateAccessToken = vi.fn();
vi.mock('../../src/ic/auth-client.js', () => ({
  validateAccessToken: (...args: unknown[]) => mockValidateAccessToken(...args),
}));

// DO: Use supertest for HTTP route testing
import request from 'supertest';
import express from 'express';

const app = express();
app.use('/api/chat', chatRouter);

await request(app)
  .post('/api/chat')
  .set('Authorization', 'Bearer valid-token')
  .send({ agentId: 'test', message: 'hello' })
  .expect(200);

// DON'T: Make real HTTP calls to external services
// DON'T: Connect to real databases in unit tests

Example patterns:

typescript

// Route test with auth mocking
describe('chat routes', () => {
  beforeEach(() => {
    mockValidateAccessToken.mockResolvedValue('user-456');
  });

  it('should return 401 without Authorization header (AC-1.3.2.2)', async () => {
    const response = await request(app)
      .post('/api/chat')
      .send({ agentId: 'test', message: 'hello' })
      .expect(401);

    expect(response.body.code).toBe('MISSING_TOKEN');
  });

  it('should process chat with valid token (AC-1.3.2.1)', async () => {
    mockInvoke.mockResolvedValue({ message: 'Response' });

    const response = await request(app)
      .post('/api/chat')
      .set('Authorization', 'Bearer valid-token')
      .send({ agentId: 'chat-agent', message: 'Test' })
      .expect(200);

    expect(response.body.content).toBe('Response');
  });
});

// Middleware test
describe('requireSession', () => {
  it('should attach userId for valid token', async () => {
    mockReq.headers = { authorization: 'Bearer valid-token-123' };
    mockValidateAccessToken.mockResolvedValue('user-abc');

    await requireSession(mockReq, mockRes, mockNext);

    expect(mockReq.userId).toBe('user-abc');
    expect(mockNext).toHaveBeenCalled();
  });
});
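
The middleware test above uses `mockReq`, `mockRes`, and `mockNext` without showing how they are built. A dependency-free sketch follows. It assumes `requireSession` reads a Bearer token, calls `validateAccessToken`, and responds with 401 otherwise. The factory `createRequireSession`, the minimal `Req`/`Res` types, and the `INVALID_TOKEN` code are hypothetical; `MISSING_TOKEN` is the code the route test above expects.

```typescript
// Minimal structural types standing in for Express's Request/Response.
interface Req {
  headers: { authorization?: string };
  userId?: string;
}
interface Res {
  statusCode: number;
  body?: unknown;
  status(code: number): Res;
  json(payload: unknown): Res;
}
type Next = () => void;
type ValidateToken = (token: string) => Promise<string>;

// Hypothetical factory: injecting the validator keeps the middleware
// testable without module-level mocking.
function createRequireSession(validateAccessToken: ValidateToken) {
  return async (req: Req, res: Res, next: Next): Promise<void> => {
    const header = req.headers.authorization;
    if (!header || !header.startsWith('Bearer ')) {
      res.status(401).json({ code: 'MISSING_TOKEN' });
      return;
    }
    try {
      req.userId = await validateAccessToken(header.slice('Bearer '.length));
      next();
    } catch {
      res.status(401).json({ code: 'INVALID_TOKEN' }); // assumed error code
    }
  };
}

// Hand-rolled response double that records what the middleware sent.
function createMockRes(): Res {
  const res: Res = {
    statusCode: 200,
    status(code) {
      res.statusCode = code;
      return res;
    },
    json(payload) {
      res.body = payload;
      return res;
    },
  };
  return res;
}
```

In a Vitest suite the same doubles could be created in `beforeEach`, with `vi.fn()` in place of the plain `next` callback so that calls can be asserted.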

Integration Tests

Tools: Vitest + real service dependencies (containerized)

What belongs here:

Database interactions (with test database)
External API integrations (with sandbox/test accounts)
Full request/response cycles
Webhook handling

Coverage target: External integration points

Example pattern:

typescript

// Integration test with a mock Stripe client
describe('Payment Flow Integration', () => {
  it('should complete full payment flow with success', async () => {
    // Step 1: Create checkout session
    const session = await mockStripeClient.createCheckoutSession({
      amount: 2500,
      customer_email: 'test@example.com',
      // ...
    });
    expect(session.payment_status).toBe('unpaid');

    // Step 2: Process payment
    const paymentIntent = await mockStripeClient.processPayment(
      session.id,
      '4242424242424242'
    );
    expect(paymentIntent.status).toBe('succeeded');

    // Step 3: Verify webhook construction
    const webhookEvent = mockStripeClient.constructWebhookEvent(
      session.id,
      paymentIntent.id,
      session.amount_total,
      session.metadata
    );
    expect(webhookEvent.type).toBe('checkout.session.completed');
  });
});

E2E Tests

Tools: Supertest against running service + Docker

What belongs here:

Full API flow testing
Authentication/authorization flows
Rate limiting verification
Service health checks

Manual QA

What requires human verification:

Production deployment verification
Real payment processing (with test cards)
Canister communication from production service

Anti-Patterns to Avoid

1. Testing Implementation Details

typescript

// BAD: Tests internal state shape
it('stores message in _messages array', () => {
  store._messages.push(message);
  expect(store._messages[0]).toBe(message);
});

// GOOD: Tests observable behavior
it('displays message after adding', () => {
  addMessage(message);
  expect($messages.get()).toContain(message);
});
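
To make the contrast concrete, here is a small self-contained sketch of a store whose backing array is private and whose only read path is `$messages.get()`. The `atom` helper is a hand-written stand-in, not the store library the project uses, and the `Message` type is assumed.

```typescript
// Hand-written stand-in for an atom-style store (illustrative only).
function atom<T>(initial: T) {
  let value = initial;
  const listeners = new Set<(v: T) => void>();
  return {
    get: (): T => value,
    set(next: T): void {
      value = next;
      listeners.forEach((listener) => listener(next));
    },
    subscribe(listener: (v: T) => void): () => void {
      listeners.add(listener);
      // The returned function removes the listener again.
      return () => {
        listeners.delete(listener);
      };
    },
  };
}

interface Message {
  id: string;
  content: string;
}

// Public API: consumers read through $messages and write through addMessage.
const $messages = atom<readonly Message[]>([]);

function addMessage(message: Message): void {
  // Replacing the array (instead of pushing) lets subscribers see a new value.
  $messages.set([...$messages.get(), message]);
}
```

A behavior-level test touches only `addMessage` and `$messages.get()`, so changing how the store holds its data internally would not break it.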

2. Over-Mocking

typescript

// BAD: Mocking everything including what you're testing
vi.mock('../stores/chatStore');
it('creates conversation', () => {
  createConversation('agent');
  expect(createConversation).toHaveBeenCalled(); // Only proves the mock was called
});

// GOOD: Let real code run, mock only external boundaries
it('creates conversation and updates store', () => {
  const convId = createConversation('agent');
  expect($conversations.get()).toContainEqual(
    expect.objectContaining({ id: convId })
  );
});

3. Brittle Visual Tests

typescript

// BAD: Testing exact SVG coordinates
expect(chart.querySelector('path')).toHaveAttribute('d', 'M0,100L50,80...');

// GOOD: Test semantic structure and accessibility
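// toHaveAttribute compares by equality, so wrap the pattern in expect.stringMatching
expect(screen.getByRole('img')).toHaveAttribute('aria-label', expect.stringMatching(/burndown/i));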

4. Flaky Time-Dependent Tests

typescript

// BAD: Real timers in tests
await new Promise(r => setTimeout(r, 5000));

// GOOD: Use fake timers or proper async handling
vi.useFakeTimers();
await act(async () => {
  vi.advanceTimersByTime(5000);
});
vi.useRealTimers();
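
Fake timers work by replacing the scheduler with one the test controls. The idea can be sketched without any test framework: code that receives its scheduler as a parameter can be driven by a manual clock. `createManualClock` and `debounce` below are hypothetical helpers written for this illustration.

```typescript
// A scheduler interface narrow enough to fake by hand.
interface Clock {
  setTimeout(callback: () => void, delayMs: number): number;
  clearTimeout(id: number): void;
}

// Hypothetical manual clock: time moves only when advance() is called.
function createManualClock() {
  let now = 0;
  let nextId = 1;
  const pending = new Map<number, { dueAt: number; callback: () => void }>();

  const clock: Clock = {
    setTimeout(callback, delayMs) {
      const id = nextId++;
      pending.set(id, { dueAt: now + delayMs, callback });
      return id;
    },
    clearTimeout(id) {
      pending.delete(id);
    },
  };

  function advance(ms: number): void {
    now += ms;
    // Collect timers that are due, earliest first. Timers scheduled by a
    // callback during this call wait for the next advance().
    const due = Array.from(pending.entries())
      .filter(([, timer]) => timer.dueAt <= now)
      .sort(([, a], [, b]) => a.dueAt - b.dueAt);
    for (const [id, timer] of due) {
      // Skip timers cancelled by an earlier callback in this batch.
      if (pending.delete(id)) timer.callback();
    }
  }

  return { clock, advance };
}

// Code under test receives the clock instead of using global timers.
function debounce(fn: () => void, waitMs: number, clock: Clock): () => void {
  let timerId: number | undefined;
  return () => {
    if (timerId !== undefined) clock.clearTimeout(timerId);
    timerId = clock.setTimeout(fn, waitMs);
  };
}
```

A test can then call `advance(5000)` and assert immediately, with no real waiting and no dependence on machine speed.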

5. Testing Third-Party Library Internals

typescript

// BAD: Testing how Radix UI works internally
it('Radix Dialog uses portal', () => {
  expect(document.body.querySelector('[data-radix-portal]')).toBeTruthy();
});

// GOOD: Test your component's behavior
it('modal becomes visible when opened', () => {
  openModal();
  expect(screen.getByRole('dialog')).toBeVisible();
});

Coverage Expectations Summary

Layer	Frontend	Canister	Service
Unit	80%+ business logic, 70%+ components	80%+ business logic	80%+ business logic
Integration	Critical workflows	All public methods	Integration points
E2E	5-10 critical paths	Business-critical flows	Full API flows
Manual QA	Visual, accessibility	Mainnet, upgrades	Production verification
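
The unit-layer percentages in the table can be checked mechanically. As a sketch, the function below compares measured coverage against the targets stated in this guide. The `CoverageReport` shape, the category names, and the function name are assumptions; a real setup would more likely rely on the coverage tool's own threshold options.

```typescript
type RepoType = 'frontend' | 'canister' | 'service';

// Unit-test coverage targets (percent), taken from the summary table.
const unitCoverageTargets: Record<RepoType, Record<string, number>> = {
  frontend: { logic: 80, components: 70 },
  canister: { logic: 80 },
  service: { logic: 80 },
};

// Assumed report shape: measured coverage percent per category.
type CoverageReport = Partial<Record<string, number>>;

// Returns a human-readable message for each category below its target.
function findCoverageGaps(repo: RepoType, report: CoverageReport): string[] {
  const gaps: string[] = [];
  for (const [category, target] of Object.entries(unitCoverageTargets[repo])) {
    const actual = report[category];
    if (actual === undefined) {
      gaps.push(`${category}: no coverage data (target ${target}%)`);
    } else if (actual < target) {
      gaps.push(`${category}: ${actual}% is below the ${target}% target`);
    }
  }
  return gaps;
}
```

An empty result means every unit-layer target for that repository type is met; a CI step could fail the build when the list is non-empty.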

When to Add Tests at Each Layer

Add a Unit Test When:

You write a new pure function or utility
You add a new store action or computed value
You create a new component with conditional rendering
You add new validation logic
You find a bug that a unit test should have caught

Add an Integration Test When:

Multiple components need to work together
Store changes affect multiple components
You need to test a user workflow (e.g., form submission)
You need to test real persistence (localStorage, IndexedDB)

Add an E2E Test When:

You are testing critical business paths (login, payment, core features)
You need to verify that cross-page navigation works
You need visual regression testing
You are testing features that only work in a real browser (drag-drop, file upload)

Do Manual QA When:

Visual design review is required
Accessibility must be tested with assistive technology
Real third-party integrations need verification
A production deployment needs verification

PocketIC vs Full Dev Environment Decision Matrix

When testing canister functionality, choose the appropriate testing approach:

Use PocketIC (`cargo test`) When:

Scenario	Example	Why PocketIC
Single canister in isolation	Testing dao-admin CRUD operations	Fast, focused, no external dependencies
Backend-only stories	FOS-3.1.8 (PocketIC tests for dao-admin)	No UI to verify, canister logic only
Access control verification	Testing admin vs non-admin permissions	PocketIC simulates IC principal handling
Candid encoding/decoding	Ensuring types match .did file	Real Candid serialization in tests
CI pipeline	GitHub Actions test runs	Fast (~50s for 40 tests), no setup required
Pure business logic	Financial calculations, state transitions	Deterministic, reproducible results

Command: cargo test in canister repo

Use Full Dev Environment (`npm run dev:full`) When:

Scenario	Example	Why Full Environment
Frontend stories	FOS-3.2 (DAO Admin UI)	Need running UI to verify
Cross-canister flows	auth-service → membership → governance	Multiple canisters interacting
E2E user journeys	Login → Create capture → View in list	Full stack integration
Playwright testing	Visual verification, accessibility	Requires browser + running app
Oracle bridge integration	Payment confirmation, price feeds	Off-chain service involved
Manual QA	Pre-release verification	Real user experience

Command: npm run dev:full in frontend repo (starts PocketIC + canisters + oracle-bridge + Vite)

Decision Flowchart

Is there a UI component to verify?
├── YES → Use Full Dev Environment
└── NO → Does it involve multiple canisters?
         ├── YES → Use Full Dev Environment
         └── NO → Use PocketIC
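
The flowchart is small enough to express as a function, which can be handy in tooling such as a story template or a CI helper. This is a sketch only: the type names, the `StoryTraits` fields, and the function name are assumptions, while the two commands are the ones named in this guide.

```typescript
type TestEnvironment = 'pocket-ic' | 'full-dev-environment';

interface StoryTraits {
  hasUiToVerify: boolean;
  involvesMultipleCanisters: boolean;
}

// Mirrors the flowchart: a UI or multiple canisters means the full stack.
function chooseTestEnvironment(story: StoryTraits): TestEnvironment {
  if (story.hasUiToVerify) return 'full-dev-environment';
  if (story.involvesMultipleCanisters) return 'full-dev-environment';
  return 'pocket-ic';
}

// The command this guide names for each environment.
const environmentCommands: Record<TestEnvironment, string> = {
  'pocket-ic': 'cargo test',
  'full-dev-environment': 'npm run dev:full',
};
```

For example, a backend-only story touching a single canister resolves to `pocket-ic`, and therefore to `cargo test`.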

Testing Layer Summary

                              Full Dev Environment
                                      │
    ┌─────────────────────────────────┴─────────────────────────────────┐
    │                                                                     │
    │   Playwright E2E Tests                                             │
    │   • User journeys through UI                                        │
    │   • Visual verification                                             │
    │   • Cross-canister flows via frontend                              │
    │                                                                     │
    └─────────────────────────────────────────────────────────────────────┘

                              PocketIC (cargo test)
                                      │
    ┌─────────────────────────────────┴─────────────────────────────────┐
    │                                                                     │
    │   Canister Integration Tests                                        │
    │   • Single canister CRUD operations                                │
    │   • Access control verification                                     │
    │   • Candid encoding/decoding                                        │
    │   • State management                                                │
    │                                                                     │
    └─────────────────────────────────────────────────────────────────────┘

                              Unit Tests (cargo test)
                                      │
    ┌─────────────────────────────────┴─────────────────────────────────┐
    │                                                                     │
    │   Pure Rust Functions                                               │
    │   • Validation logic                                                │
    │   • Data transformation                                             │
    │   • Helper functions                                                │
    │                                                                     │
    └─────────────────────────────────────────────────────────────────────┘

Story Type → Testing Approach

Story Type	Primary Tests	QA Verification
Canister backend	PocketIC	`cargo test` passes
Frontend component	Vitest + RTL	Playwright in dev:full
Frontend + canister	Both	Playwright in dev:full
Service (Node.js)	Vitest + supertest	API testing
Cross-canister integration	PocketIC multi-canister	dev:full E2E

References

Vitest Documentation
React Testing Library
PocketIC Documentation
PocketIC Setup Guide
Playwright Documentation
Source: foundery-os-suite/src/**/__tests__/*.test.{ts,tsx}
Source: auth-service/tests/*.rs
Source: foundery-os-agents/tests/**/*.test.ts
