FAS Troubleshooting Guide
Common issues encountered during the Frontend Application Split project and their solutions. These are real problems found and fixed during FAS-1 through FAS-8.1.
Table of Contents
- GitHub Packages Issues
- CI/CD Issues
- Build Issues
- SEO Issues (marketing-suite)
- Game Issues (otter-camp-suite)
- SSO Issues (dao-suite, dao-admin-suite)
- React Form Patterns
- Console Logging Patterns (Game Suites)
- React Hook Stability Patterns
- CSP Issues (all suites)
- Deployment Issues
- Testing Issues
- CI/CD Maintenance
- RBAC Issues
- Related Documentation
GitHub Packages Issues
"npm ERR! 401 Unauthorized" during npm install
Cause: Missing or invalid GitHub Packages authentication.
Fix: Ensure ~/.npmrc has a valid PAT with read:packages scope:
@hello-world-co-op:registry=https://npm.pkg.github.com
//npm.pkg.github.com/:_authToken=ghp_YOUR_TOKEN_HERE
"npm ERR! 404 Not Found" for @hello-world-co-op packages
Cause: Wrong npm scope or package not published.
Fix:
- Scope MUST be @hello-world-co-op (not @helloworlddao). The npm scope must match the GitHub org name Hello-World-Co-Op.
- Verify the package exists: npm view @hello-world-co-op/ui --registry=https://npm.pkg.github.com
- Ensure you have org-level Read access to the package.
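The scope check above can be automated by scanning the dependency names in package.json before running npm install. The following is a minimal sketch, not project code: findWrongScopes is a hypothetical helper, and the version ranges in the sample are placeholders.

```typescript
// Hypothetical helper, not part of any FAS package.
const EXPECTED_SCOPE = "@hello-world-co-op";
// Scopes known to be wrong; "@helloworlddao" is the one named in the fix above.
const WRONG_SCOPES: string[] = ["@helloworlddao"];

interface ScopeFix {
  found: string; // dependency name as written in package.json
  suggested: string; // same package under the expected scope
}

function findWrongScopes(dependencies: Record<string, string>): ScopeFix[] {
  const fixes: ScopeFix[] = [];
  for (const name of Object.keys(dependencies)) {
    for (const scope of WRONG_SCOPES) {
      if (name.startsWith(scope + "/")) {
        // Keep the package name and swap only the scope.
        fixes.push({ found: name, suggested: EXPECTED_SCOPE + name.slice(scope.length) });
      }
    }
  }
  return fixes;
}

// Sample dependency map (version ranges are placeholders).
const sampleDependencies: Record<string, string> = {
  "@hello-world-co-op/ui": "^1.0.0",
  "@helloworlddao/auth": "^1.0.0",
  react: "^18.0.0",
};

const scopeFixes = findWrongScopes(sampleDependencies);
```

In a real suite the input would be the dependencies and devDependencies objects read from package.json.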
Cross-Repo Package Access Denied
Cause: GITHUB_TOKEN in CI cannot read packages from other repos without explicit access grants.
Fix: In GitHub UI, go to the package Settings -> Manage Access -> grant Read access to the organization. This must be done for each published package (api, auth, ui).
"npm deprecate" or "npm unpublish" Fails
Cause: GitHub Packages does not support npm deprecate or npm unpublish for org-scoped packages.
Fix: There is no workaround -- publish a new version instead.
CI/CD Issues
Reusable Workflow "startup_failure"
Cause: Using short SHA refs (e.g., @fef6709) in reusable workflow uses: references causes startup_failure in tag-triggered contexts.
Fix: Use @main for reusable workflow references:
# WRONG -- causes startup_failure
uses: Hello-World-Co-Op/.github/.github/workflows/package-publish.yml@fef6709
# CORRECT
uses: Hello-World-Co-Op/.github/.github/workflows/package-publish.yml@main
Note: SHA-pinning is still required for regular actions (e.g., actions/checkout@SHA). Only reusable workflows from the org's .github repo should use @main.
"file:" References Break CI
Cause: package-lock.json contains file:../pkg references from local npm link development. In CI, these become broken symlinks.
Fix: The package-publish.yml workflow automatically detects file: in the lock file and falls back to npm install (instead of npm ci). To prevent the issue:
- Do not commit file: references in package.json
- Run npm install with registry versions before committing package-lock.json
- If CI fails, check package-lock.json for file: entries
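The last check can be scripted. The sketch below walks a parsed package-lock.json and collects every string value that starts with file:. It is an illustration only: the detection logic inside package-publish.yml is not shown in this guide, findFileReferences is a hypothetical name, and the sample lock content is made up. It catches literal file: strings only.

```typescript
// Minimal JSON type so the walker stays type-safe.
type JsonValue = string | number | boolean | null | JsonValue[] | { [key: string]: JsonValue };

interface FileReference {
  path: string; // dotted path to the offending value
  value: string; // the "file:..." string itself
}

// Recursively collects every string value that starts with "file:".
function findFileReferences(node: JsonValue, path: string = "root"): FileReference[] {
  if (typeof node === "string") {
    return node.startsWith("file:") ? [{ path, value: node }] : [];
  }
  if (node === null || typeof node !== "object") {
    return []; // numbers, booleans and null cannot hold a reference
  }
  const hits: FileReference[] = [];
  if (Array.isArray(node)) {
    node.forEach((child, index) => {
      hits.push(...findFileReferences(child, `${path}[${index}]`));
    });
  } else {
    for (const key of Object.keys(node)) {
      hits.push(...findFileReferences(node[key], `${path}.${key}`));
    }
  }
  return hits;
}

// Made-up lock file fragment with one local reference.
const sampleLock: JsonValue = {
  name: "example-suite",
  lockfileVersion: 3,
  packages: {
    "": { dependencies: { "@hello-world-co-op/ui": "file:../ui" } },
    "node_modules/react": { version: "18.2.0" },
  },
};

const fileRefs = findFileReferences(sampleLock);
```

A pre-commit script could parse the real lock file with JSON.parse and fail when the returned list is non-empty.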
gh CLI Cannot Query Packages
Cause: The default gh CLI token does not have read:packages scope.
Fix: Use CI workflows for package operations instead of gh CLI. The GITHUB_TOKEN in GitHub Actions has the necessary scopes when configured with packages: read permission.
"gh run rerun --failed" Does Not Pick Up Workflow Changes
Cause: gh run rerun --failed uses the cached workflow definition from the original run.
Fix: Push a new tag to trigger a fresh workflow run instead of rerunning a failed one.
Build Issues
Empty Vite Chunks Warning
Cause: manualChunks in vite.config.ts references packages that are not actually imported (e.g., auth: ['@hello-world-co-op/auth'] when auth is only a transitive dependency via ui).
Fix: Remove entries from manualChunks for packages that are not directly imported:
// Only include packages you actually import in your code
manualChunks: {
  vendor: ['react', 'react-dom', 'react-router-dom'],
  ui: ['@hello-world-co-op/ui'],
  // Don't include auth/api if only used transitively
}
Tailwind Classes Missing from Shared Packages
Cause: Tailwind content paths do not include the package's dist directory.
Fix: Add the package dist to tailwind.config.js content:
content: [
  './index.html',
  './src/**/*.{ts,tsx}',
  './node_modules/@hello-world-co-op/ui/dist/**/*.js', // CRITICAL
]
CJS require() in ESM Files
Cause: Using require('tailwindcss-animate') in a file that is treated as ESM.
Fix: Use ESM imports:
// WRONG
const animate = require('tailwindcss-animate');
// CORRECT
import tailwindcssAnimate from 'tailwindcss-animate';
Terser Drops console.error/console.warn in Production
Cause: Using drop_console: true in terser options removes ALL console output including errors and warnings.
Fix: Use pure_funcs to selectively remove only info-level logs:
// vite.config.ts
build: {
  minify: 'terser',
  terserOptions: {
    compress: {
      drop_debugger: true,
      pure_funcs: ['console.log', 'console.debug', 'console.info'],
      // console.error and console.warn are preserved
    },
  },
}
manualChunks Circular Dependency (think-tank-suite)
Cause: Custom manualChunks configuration in vite.config.ts caused circular chunk dependencies (vendor-react <-> vendor-misc <-> vendor-markdown), resulting in ReferenceError: Cannot access 's' before initialization.
Fix: Remove manualChunks and let Rollup handle code-splitting automatically. Fixed in commit 5e68c31 during FAS-8.1.
SEO Issues (marketing-suite)
vite-ssg Build Time > 30s
Cause: Pre-rendering all routes with ReactDOMServer.renderToString() is inherently slow.
Fix: This is expected behavior (~30s for pre-rendering). Total build time stays within NFR1 (< 3 min). If adding many routes, use selective pre-rendering -- only pre-render SEO-critical pages.
Sitemap Missing Pages
Cause: Routes are not included in the sitemap generation script.
Fix: Verify the routes array in scripts/generate-sitemap.ts includes all public routes. New routes must be added to both scripts/generate-sitemap.ts and src/entry-prerender.tsx.
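Because new routes have to be registered in two places, a small comparison can catch a route that was added to only one of them. This is a sketch under assumptions: diffRoutes is a hypothetical helper, and the route lists are examples rather than the contents of the real files.

```typescript
interface RouteDiff {
  missingFromSitemap: string[]; // pre-rendered but absent from the sitemap
  missingFromPrerender: string[]; // in the sitemap but never pre-rendered
}

// Compares the two route lists and reports what each side is missing.
function diffRoutes(sitemapRoutes: string[], prerenderRoutes: string[]): RouteDiff {
  const inSitemap = new Set(sitemapRoutes);
  const inPrerender = new Set(prerenderRoutes);
  return {
    missingFromSitemap: prerenderRoutes.filter((route) => !inSitemap.has(route)),
    missingFromPrerender: sitemapRoutes.filter((route) => !inPrerender.has(route)),
  };
}

// Example route lists (hypothetical).
const exampleSitemapRoutes = ["/", "/about", "/pricing"];
const examplePrerenderRoutes = ["/", "/about", "/blog"];

const routeDiff = diffRoutes(exampleSitemapRoutes, examplePrerenderRoutes);
```

One option is to export a single routes array from a shared module and import it in both scripts, which removes the need for the comparison altogether.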
Meta Tags Not Rendering in Pre-Rendered HTML
Cause: react-helmet-async HelmetProvider is not wrapping the component tree during SSR.
Fix: Ensure HelmetProvider wraps your app in src/entry-prerender.tsx:
import ReactDOMServer from 'react-dom/server';
import { HelmetProvider } from 'react-helmet-async';
const helmetContext = {};
const html = ReactDOMServer.renderToString(
  <HelmetProvider context={helmetContext}>
    <App />
  </HelmetProvider>
);
// Extract helmet data from helmetContext.helmet
Game Issues (otter-camp-suite)
Phaser.js Bundle Size Too Large
Cause: Phaser.js (~340KB gzipped) included in the initial bundle.
Fix: Extract Phaser to a standalone lazy-loaded chunk. The game page is lazy-loaded in App.tsx:
const OtterCampPage = lazy(() => import('./pages/OtterCampPage'));
This keeps the initial bundle at ~58KB. Phaser and game code load only when the user navigates to /otter-camp.
Game Canvas Not Rendering (Black Screen)
Cause: Phaser.AUTO configuration issue, missing assets, or wrong route.
Fix:
- Navigate to /otter-camp (not root /)
- Check browser console for Phaser errors
- Verify Phaser.AUTO mode in game config (falls back to Canvas if WebGL unavailable)
- Check that game assets exist in public/assets/
Avatar Creator State Loss on Navigation
Cause: Game state not persisted to localStorage.
Fix: Verify localStorage persistence in avatar creation flow. Character data is saved to localStorage after creation and read back on game load.
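The save-then-read-back flow can be sketched as follows. Everything specific here is an assumption for illustration: the storage key, the AvatarData shape, and the helper names are hypothetical, and the in-memory store only stands in for window.localStorage so the snippet runs outside a browser.

```typescript
// Subset of the Web Storage API used here; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical avatar shape and storage key.
interface AvatarData {
  name: string;
  furColor: string;
}
const AVATAR_STORAGE_KEY = "otter_camp_avatar";

function saveAvatar(store: KeyValueStore, avatar: AvatarData): void {
  store.setItem(AVATAR_STORAGE_KEY, JSON.stringify(avatar));
}

// Returns null when nothing is stored or the stored value is unusable.
function loadAvatar(store: KeyValueStore): AvatarData | null {
  const raw = store.getItem(AVATAR_STORAGE_KEY);
  if (raw === null) return null;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null) {
      const candidate = parsed as { name?: unknown; furColor?: unknown };
      if (typeof candidate.name === "string" && typeof candidate.furColor === "string") {
        return { name: candidate.name, furColor: candidate.furColor };
      }
    }
    return null; // valid JSON but not an avatar
  } catch {
    return null; // corrupted JSON
  }
}

// In-memory stand-in for localStorage.
class MemoryStore implements KeyValueStore {
  private items = new Map<string, string>();
  getItem(key: string): string | null {
    const value = this.items.get(key);
    return value === undefined ? null : value;
  }
  setItem(key: string, value: string): void {
    this.items.set(key, value);
  }
}

const demoStore = new MemoryStore();
saveAvatar(demoStore, { name: "Ollie", furColor: "brown" });
const restoredAvatar = loadAvatar(demoStore);
```

Validating the parsed value on load means a corrupted or outdated entry sends the player back to avatar creation instead of crashing the game scene.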
SSO Issues (dao-suite, dao-admin-suite)
Infinite Redirect Loop in ProtectedRoute
Cause: Unstable useCallback references in ProtectedRoute component cause the auth check to re-trigger on every render, creating an infinite loop between the suite and the login page.
Fix: Use stable useCallback references with proper dependency arrays:
// WRONG -- causes infinite loop
const checkAuth = useCallback(() => {
  if (!isAuthenticated) {
    navigate('/login');
  }
}, [isAuthenticated, navigate]); // navigate changes on every render
// CORRECT -- stable ref
const checkAuth = useCallback(() => {
  if (!isAuthenticated) {
    window.location.href = loginUrl;
  }
}, [isAuthenticated, loginUrl]);
This was a critical bug fixed in FAS-8.1.
Cookie SSO Not Populating localStorage
Cause: The authCookieClient bridge between cookie-based SSO and localStorage is not running, or the cookie domain is wrong.
Fix:
- Verify cookie domain is .helloworlddao.com (leading dot required)
- Implement authCookieClient that calls oracle-bridge /auth/session on app load
- On success, populate localStorage.user_data with session data
- Check browser DevTools -> Application -> Cookies for correct cookie configuration
oracle-bridge ERR_ERL_KEY_GEN_IPV6
Cause: express-rate-limit IPv6 address validation fails when generating rate limit keys.
Fix: This was fixed in FAS-8.1. Pull the latest oracle-bridge code. If the issue persists, ensure Node.js is listening on IPv4 (0.0.0.0) not IPv6 (::1).
AUTH_COOKIE_DOMAIN Not Set in Staging
Cause: Missing AUTH_COOKIE_DOMAIN environment variable in the staging deployment of oracle-bridge.
Fix: Set the AUTH_COOKIE_DOMAIN environment variable to .helloworlddao.com in the staging oracle-bridge deployment configuration.
Login Works on One Suite but Fails on Another
Cause: Cookie scoped to a single subdomain instead of the parent domain.
Fix: Verify the cookie's Domain attribute is .helloworlddao.com (with leading dot). A cookie set without a Domain attribute, or with a single subdomain as its Domain, is not sent to the other suites' subdomains. See Cross-Suite Auth Debugging for detailed diagnosis steps.
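To reason about which hosts receive the cookie, the matching rule browsers apply can be approximated in a few lines. This is a simplified sketch of cookie domain matching (it ignores paths, the Secure flag, and IP addresses); cookieIsSentTo and the two suite host names are hypothetical.

```typescript
// domainAttribute: value of the cookie's Domain attribute, or null if omitted.
// setByHost: host that sent the Set-Cookie header.
// requestHost: host the browser is about to contact.
function cookieIsSentTo(
  domainAttribute: string | null,
  setByHost: string,
  requestHost: string
): boolean {
  const request = requestHost.toLowerCase();
  if (domainAttribute === null || domainAttribute === "") {
    // No Domain attribute: host-only cookie, exact host match required.
    return request === setByHost.toLowerCase();
  }
  // Strip an optional leading dot before comparing.
  const domain = domainAttribute.replace(/^\./, "").toLowerCase();
  // Match the domain itself or any subdomain of it.
  return request === domain || request.endsWith("." + domain);
}

// Hypothetical suite hosts under the shared parent domain.
const loginHost = "dao.helloworlddao.com";
const otherSuiteHost = "admin.helloworlddao.com";

const sharedCookieVisible = cookieIsSentTo(".helloworlddao.com", loginHost, otherSuiteHost);
const hostOnlyCookieVisible = cookieIsSentTo(null, loginHost, otherSuiteHost);
const subdomainCookieVisible = cookieIsSentTo("dao.helloworlddao.com", loginHost, otherSuiteHost);
```

Only the first case reaches the second suite, which matches the symptom in this section: login works where the cookie was set and fails everywhere else.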
React Form Patterns
react-hook-form watch() Infinite Loop
Cause: watch() with no arguments returns a new object on every render. Putting the result in a useEffect dependency array causes infinite re-renders because the object reference changes each time.
Fix: Use field-specific watch('fieldName') instead of watch(), or use the subscription API:
// WRONG -- causes infinite loop
const formValues = watch(); // New object ref every render
useEffect(() => {
  syncToServer(formValues);
}, [formValues]); // Re-runs every render!
// FIX 1: Watch specific fields (returns stable primitive values)
const email = watch('email');
const name = watch('name');
useEffect(() => {
  syncToServer({ email, name });
}, [email, name]); // Only re-runs when values actually change
// FIX 2: Use subscription API (no re-renders at all)
useEffect(() => {
  const subscription = watch((values) => {
    syncToServer(values);
  });
  return () => subscription.unsubscribe();
}, [watch]); // watch ref is stable from useForm
This was a recurring pattern found during FAS-4 (marketing-suite) development.
Duplicate Title Tags with react-helmet-async
Cause: index.html has a <title> tag and components also set titles via <Helmet>. In pre-rendered output, both appear.
Fix: Remove the <title> tag from index.html and manage all titles through react-helmet-async. For pre-rendered pages, ensure HelmetProvider wraps the component tree in entry-prerender.tsx.
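A quick regression check for this issue is to count title tags in the pre-rendered HTML. The sketch below is an assumption-laden illustration: countTitleTags is a hypothetical helper, the sample markup is made up, and the regular expression would also count title elements inside inline SVG.

```typescript
// Counts opening <title> tags, with or without attributes.
function countTitleTags(html: string): number {
  const matches = html.match(/<title[\s>]/gi);
  return matches === null ? 0 : matches.length;
}

// Made-up pre-rendered output showing the duplicate-title problem.
const duplicatedHtml =
  '<html><head><title>Hello World</title>' +
  '<title data-rh="true">About</title></head><body></body></html>';

// Output after removing the static <title> from index.html.
const fixedHtml =
  '<html><head><title data-rh="true">About</title></head><body></body></html>';

const duplicatedCount = countTitleTags(duplicatedHtml);
const fixedCount = countTitleTags(fixedHtml);
```

A build step could run this over each generated HTML file and fail when the count is not exactly one.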
Console Logging Patterns (Game Suites)
Production Console Statement Cleanup
Cause: Game code (Phaser.js) frequently uses console.log for debugging during development. The otter-camp-suite had 404 console statements across 57 files at extraction time.
Fix: Wrap debug logs in development-only guards:
// WRONG -- logs in production
console.log('Player position:', player.x, player.y);
// CORRECT -- only logs in development
if (import.meta.env.DEV) {
  console.log('Player position:', player.x, player.y);
}
Finding console statements: Run this command to audit console usage:
grep -rn "console\.\(log\|debug\|info\|warn\|error\)" src/ --include="*.ts" --include="*.tsx" | wc -l
Note: console.error and console.warn should generally be preserved (they indicate real problems). Only wrap console.log, console.debug, and console.info in DEV guards. Vite's terser config (pure_funcs) also strips these in production builds, but DEV guards are preferred for game code because they prevent the performance cost of string formatting.
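With hundreds of call sites, repeating the guard by hand gets noisy. One possible alternative is a small wrapper that takes a message factory, so the string is only built when logging is enabled. This is a sketch, not the pattern used in otter-camp-suite: createDebugLog is a hypothetical helper, and in a Vite app the first argument would be import.meta.env.DEV.

```typescript
type MessageFactory = () => string;

// Returns a logger that only calls the factory (and the sink) in development.
function createDebugLog(
  isDev: boolean,
  sink: (message: string) => void
): (makeMessage: MessageFactory) => void {
  if (!isDev) {
    return () => {}; // production: the message is never formatted
  }
  return (makeMessage) => sink(makeMessage());
}

// Demonstration with a captured sink instead of console.log.
const capturedMessages: string[] = [];
let factoryCalls = 0;
const buildMessage: MessageFactory = () => {
  factoryCalls += 1;
  return "Player position: 10, 20";
};

const productionLog = createDebugLog(false, (message) => { capturedMessages.push(message); });
productionLog(buildMessage); // skipped: the factory is not invoked

const developmentLog = createDebugLog(true, (message) => { capturedMessages.push(message); });
developmentLog(buildMessage); // runs the factory and records the message
```

The trade-off is that a runtime flag is not removed by the bundler the way a literal if (import.meta.env.DEV) block is, so the inline guard remains the better choice in hot paths such as per-frame update loops.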
React Hook Stability Patterns
These patterns prevent infinite re-render loops and unstable references in React components. All were discovered during FAS-6 through FAS-8 suite development.
Custom Hook Functions Must Use useCallback
When a custom hook returns a function, that function gets a new reference on every render unless wrapped in useCallback. Any useEffect depending on it will re-trigger infinitely.
// PROBLEM: useAuthService() returns new refreshTokens ref each render
function useAuthService() {
  const refreshTokens = async (token: string) => { /* ... */ };
  return { refreshTokens }; // New function ref every render!
}
// useEffect re-runs infinitely
useEffect(() => {
  refreshTokens(token);
}, [refreshTokens]); // refreshTokens changes every render
// FIX: Use useCallback in the hook
function useAuthService() {
  const refreshTokens = useCallback(async (token: string) => {
    /* ... */
  }, []); // Stable ref
  return { refreshTokens };
}
Mount-Only Effects for Auth Checks
When an auth check should run once on mount, use an empty dependency array with an eslint-disable comment explaining why:
useEffect(() => {
  const checkAuth = async () => {
    const session = await checkSession();
    if (session.authenticated) {
      setAuthenticated(true);
    }
  };
  checkAuth();
  // eslint-disable-next-line react-hooks/exhaustive-deps
}, []); // Mount-only: auth check should not re-run on state changes
This pattern was used in dao-suite and dao-admin-suite ProtectedRoute after fixing the infinite loop (FAS-8.1).
Cookie SSO to localStorage Bridge
When adding cookie-based SSO to a suite that has legacy components reading localStorage, the bridge must populate localStorage before any component reads it:
// In ProtectedRoute (runs before children render)
const session = await checkSession();
if (session.authenticated) {
  // Bridge: populate localStorage for legacy components
  if (!localStorage.getItem('user_data')) {
    localStorage.setItem('user_data', JSON.stringify({
      userId: session.userId || 'sso-user',
      email: '',
      firstName: 'Member',
      lastName: '',
    }));
  }
  setAuthenticated(true);
  return; // Skip localStorage token check
}
Without this bridge, components like Dashboard read null from localStorage and redirect to /login, causing an infinite redirect loop (FAS-8.1 bug).
CSP Issues (all suites)
connect-src Blocking *.helloworlddao.com
Cause: Content Security Policy does not allow connections to *.helloworlddao.com subdomains.
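Fix: Add the wildcard domain to the CSP connect-src directive in vite.config.ts. Note that server.headers only applies to the Vite dev server; the deployed suite needs the same directive wherever its response headers are configured.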
// vite.config.ts
server: {
  headers: {
    'Content-Security-Policy': "connect-src 'self' https://*.helloworlddao.com https://ic0.app https://icp0.io wss://*"
  }
}
wss:// WebSocket Blocked
Cause: CSP connect-src does not include wss: protocol for WebSocket connections.
Fix: Add wss: to the connect-src directive:
connect-src 'self' https://*.helloworlddao.com wss://*.helloworlddao.com
Deployment Issues
dfx deploy Fails with "vite auto-detected"
Cause: dfx detects vite.config.ts and tries to rebuild the project, conflicting with the pre-built dist/ directory.
Fix: Temporarily move vite.config.ts and package.json aside before deployment (as done in deploy-staging.yml), then restore them afterwards:
mv vite.config.ts vite.config.ts.bak
mv package.json package.json.bak
echo '{"name":"suite","private":true,"scripts":{"build":"echo No build needed"}}' > package.json
dfx deploy <canister_name> --network ic --yes
mv vite.config.ts.bak vite.config.ts
mv package.json.bak package.json
Canister Out of Cycles
Cause: IC canisters consume cycles for storage and computation. The user-service canister ran critically low (0.5 TC) during FAS-8.1.
Fix: Check canister status and top up:
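dfx canister status <canister_id> --network ic
# If low, add cycles. dfx wallet send only transfers to another cycles wallet;
# to top up an arbitrary canister, use deposit-cycles:
dfx canister deposit-cycles <cycles_amount> <canister_id> --network ic
Prevention: Monitor cycle balances before major deployments. Keep at least 2 TC as a buffer.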
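The 2 TC buffer can be turned into a simple pre-deployment check. The sketch below assumes 1 TC is one trillion (10^12) cycles and that the balance is available as a digit string with optional underscore separators; parseCycles and needsTopUp are hypothetical helpers, and the exact output format of dfx canister status is not assumed.

```typescript
const CYCLES_PER_TC = 1e12; // 1 TC = one trillion cycles
const MIN_BUFFER_TC = 2; // buffer recommended in this guide

// Parses a cycle count such as "500_000_000_000" into a number.
// Plain numbers are precise enough here (exact up to roughly 9,000 TC).
function parseCycles(text: string): number {
  const digits = text.replace(/_/g, "").trim();
  if (!/^\d+$/.test(digits)) {
    throw new Error(`Not a cycle count: ${text}`);
  }
  return Number(digits);
}

// True when the balance is below the required buffer.
function needsTopUp(balanceCycles: number, minBufferTc: number = MIN_BUFFER_TC): boolean {
  return balanceCycles < minBufferTc * CYCLES_PER_TC;
}

// 0.5 TC, the level user-service reached during FAS-8.1.
const lowBalance = parseCycles("500_000_000_000");
const healthyBalance = parseCycles("3_000_000_000_000");

const lowNeedsTopUp = needsTopUp(lowBalance);
const healthyNeedsTopUp = needsTopUp(healthyBalance);
```

A CI job could feed the reported balance through this check and stop the deployment when a top-up is needed.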
DNS Propagation Delay
Cause: DNS changes take time to propagate (5-60 minutes).
Fix: During DNS propagation, use direct canister URLs for testing:
https://<canister-id>.icp0.io
These bypass DNS entirely and are available immediately after canister deployment.
Identity PEM Cleanup in CI
Cause: DFX identity PEM files left on CI runners are a security risk.
Fix: Always clean up in a post-step with if: always():
- name: Cleanup identity
  if: always()
  run: |
    rm -rf ~/.config/dfx/identity/github-ci
    if [ -d ~/.config/dfx/identity/github-ci ]; then
      echo "Warning: Identity directory still exists"
      exit 1
    fi
Testing Issues
vitest.setup.ts "vi is not defined"
Cause: Using vi.fn() in setup file without importing vi from vitest, relying on globals.
Fix: Add explicit import:
import { vi } from 'vitest';
Barrel Export Violations
Cause: Re-exporting utility functions from a component barrel file (e.g., components/governance/index.ts re-exporting from utils/votingHelpers.ts).
Fix: Only export from the same directory in barrel files. Import utilities directly from their source path.
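The same-directory rule is easy to check mechanically once the export sources of a barrel file are known. The sketch below is a hypothetical checker (findBarrelViolations is not an existing tool, and the component names are invented); it treats any source other than a sibling file as a violation.

```typescript
// Matches "./Name" only: a sibling file in the barrel's own directory.
const SAME_DIRECTORY = /^\.\/[^/]+$/;

// Returns the export sources that reach outside the barrel's directory.
function findBarrelViolations(exportSources: string[]): string[] {
  return exportSources.filter((source) => !SAME_DIRECTORY.test(source));
}

// Sources as they might appear in components/governance/index.ts.
const governanceBarrelSources = [
  "./ProposalCard", // hypothetical component
  "./VoteButton", // hypothetical component
  "../../utils/votingHelpers", // the violation described above
];

const barrelViolations = findBarrelViolations(governanceBarrelSources);
```

Consumers would then import the helper directly, for example from utils/votingHelpers, instead of through the component barrel.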
@dfinity/agent v2 vs v3 Mismatch
Cause: Different packages or suites use different major versions of @dfinity/agent.
Status: Known limitation. Documented but not blocking. Requires coordinated upgrade across all packages (tracked as deferred-dfinity-v3-upgrade).
CI/CD Maintenance
GitHub Actions SHA Pinning Update Runbook
All GitHub Actions in our workflows are SHA-pinned for supply-chain security. When updating to newer versions:
Step 1: Identify the action and target version
# Example: updating actions/checkout from v4.1.7 to v4.2.0
# Find the full SHA for the tag on GitHub
gh api repos/actions/checkout/git/ref/tags/v4.2.0 --jq '.object.sha'
Step 2: Find all usages across workflows
cd ~/git/dot-github/.github/workflows
grep -rn "actions/checkout@" *.yml
Step 3: Update the SHA in all workflows
# Replace old SHA with new SHA in all workflow files
# Example: sed -i 's/old_sha/new_sha/g' *.yml
# Then verify the change
git diff
Step 4: Test the update
Push a test commit to a branch and verify the workflow runs successfully before merging to main. Check the Actions tab for any startup_failure errors.
Important rules:
- Always use full 40-character SHA (not short refs)
- Never use @main or @v4 for regular actions (only for org reusable workflows)
- Update all workflows at once to keep SHAs consistent
- Document the version mapping in a comment: # actions/checkout v4.2.0
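The first two rules can be checked automatically for each uses: value in a workflow. This is a sketch under assumptions: checkUsesRef is a hypothetical helper, the SHA in the sample is a placeholder rather than a real commit, and local (./) and docker:// references are not handled.

```typescript
// Reusable workflows from the org's .github repo are the one @main exception.
const ORG_WORKFLOW_PREFIX = "Hello-World-Co-Op/.github/.github/workflows/";
const FULL_SHA = /^[0-9a-f]{40}$/;

// Returns null when the reference follows the rules, otherwise a reason.
function checkUsesRef(uses: string): string | null {
  const at = uses.lastIndexOf("@");
  if (at === -1) {
    return "missing @ref";
  }
  const target = uses.slice(0, at);
  const ref = uses.slice(at + 1);
  if (target.startsWith(ORG_WORKFLOW_PREFIX)) {
    return ref === "main" ? null : "org reusable workflows should use @main";
  }
  return FULL_SHA.test(ref) ? null : "regular actions need a full 40-character SHA";
}

// Placeholder SHA for illustration only (not a real commit).
const placeholderSha = "0123456789abcdef0123456789abcdef01234567";

const pinnedAction = checkUsesRef("actions/checkout@" + placeholderSha);
const tagAction = checkUsesRef("actions/checkout@v4");
const orgWorkflowMain = checkUsesRef(ORG_WORKFLOW_PREFIX + "package-publish.yml@main");
const orgWorkflowShortSha = checkUsesRef(ORG_WORKFLOW_PREFIX + "package-publish.yml@fef6709");
```

Running this over the grep output from Step 2 gives a quick pass or fail list before the change is pushed.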
GitHub Environment Protection Rules
Configure environment protection rules in the GitHub repository settings to prevent accidental production deployments:
Staging environments:
- No required reviewers (automated deploy on push to main)
- Branch restriction: main only
- Deployment timeout: 30 minutes
Production environments (when ready):
- Required reviewers: 1 (any team member)
- Branch restriction: main only
- Wait timer: 5 minutes (cool-down before deploy starts)
- Deployment timeout: 30 minutes
Setting up in GitHub UI:
- Go to Repository → Settings → Environments
- Create environment (e.g., staging, production)
- Add protection rules as described above
- Reference in workflow: environment: staging
Current state: Staging environments are configured for all 6 suites. Production environments should be configured before the production DNS cutover.
RBAC Issues
User Sees /unauthorized but Should Have Admin Access
Cause: User's session does not include the required role, or roles were not properly assigned in auth-service.
Diagnosis:
- Check role assignment in auth-service:
# Check if user has the Admin role
dfx canister call auth-service has_role '("2vxsx-fae", variant { Admin })' --network ic --query
# Get all roles for user
dfx canister call auth-service get_user_roles '("2vxsx-fae")' --network ic --query
Expected: (true) for has_role, or (vec { variant { Admin }; variant { Member } }) for get_user_roles
- Check if roles are in the session:
// In browser DevTools Console (on the frontend suite)
const response = await fetch('https://staging-oracle.helloworlddao.com/api/auth/session', {
  credentials: 'include'
});
const data = await response.json();
console.log('Roles:', data.roles);
Expected: ["admin", "member"]
If roles are missing from the session but present in auth-service, the user needs to log out and back in (roles are cached at login time).
Fix:
If role is not assigned:
# Assign Admin role
dfx canister call auth-service assign_role '("2vxsx-fae", variant { Admin })' --network ic
If role is assigned but not in session:
- User must log out and log back in
- Roles are cached in the session token at login time
- Changes to roles do not take effect retroactively
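The two diagnosis results can be combined into a single decision. The sketch below is illustrative only: adviseOnRole is a hypothetical helper, and it assumes the canister reports roles as variant names such as Admin while the session reports lowercase strings such as "admin", as in the expected outputs above.

```typescript
type RoleAdvice = "assign-role" | "relogin" | "ok";

// Case-insensitive membership test, since the canister and session differ in casing.
function hasRole(roles: string[], required: string): boolean {
  const wanted = required.toLowerCase();
  return roles.some((role) => role.toLowerCase() === wanted);
}

// Decides which fix applies for a user who lands on /unauthorized.
function adviseOnRole(
  required: string,
  canisterRoles: string[],
  sessionRoles: string[]
): RoleAdvice {
  if (!hasRole(canisterRoles, required)) {
    return "assign-role"; // not assigned in auth-service yet
  }
  if (!hasRole(sessionRoles, required)) {
    return "relogin"; // assigned, but the session predates the change
  }
  return "ok"; // role present in both places; look elsewhere
}

const notAssigned = adviseOnRole("admin", ["Member"], ["member"]);
const staleSession = adviseOnRole("admin", ["Admin", "Member"], ["member"]);
const allGood = adviseOnRole("admin", ["Admin", "Member"], ["admin", "member"]);
```

After an "assign-role" result the user still has to log out and back in, because the new role only appears in sessions created after the assignment.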
Roles Not Appearing in Session
Cause: auth-service not properly deployed, oracle-bridge not calling auth-service correctly, or frontend not reading from oracle-bridge session endpoint.
Diagnosis:
- Verify auth-service is deployed and has role data:
# Check auth-service canister status
dfx canister status auth-service --network ic
# Try to get roles for a known user
dfx canister call auth-service get_user_roles '("2vxsx-fae")' --network ic --query
- Verify oracle-bridge session endpoint returns roles:
# Test oracle-bridge endpoint (replace with your session cookie)
curl -H "Cookie: session=YOUR_SESSION_COOKIE" \
  https://staging-oracle.helloworlddao.com/api/auth/session
Expected response:
{
  "authenticated": true,
  "user_id": "2vxsx-fae",
  "roles": ["admin", "member"]
}
- Check frontend auth store:
// In browser DevTools Console
console.log('User roles:', $userRoles?.get());
Expected: ["admin", "member"]
Fix:
If auth-service is not deployed:
cd ~/git/auth-service
dfx deploy auth-service --network ic
If oracle-bridge is not returning roles:
- Check oracle-bridge logs for errors calling auth-service
- Verify AUTH_SERVICE_CANISTER_ID environment variable is set correctly
- Restart oracle-bridge service
If frontend is not reading roles:
- Verify @hello-world-co-op/auth@^0.2.0 is installed
- Check that fetchSession() in auth store calls oracle-bridge endpoint
- Verify $userRoles atom is being populated
RoleGuard Shows Fallback for Admin User
Cause: AuthProvider not wrapping the app, roles not loaded, or RoleGuard checking before auth completes.
Diagnosis:
- Check if AuthProvider wraps app:
// In src/App.tsx or src/main.tsx
import { AuthProviderBridge } from './components/auth/AuthProviderBridge';
function App() {
  return (
    <AuthProviderBridge> {/* This must wrap everything */}
      <Router>...</Router>
    </AuthProviderBridge>
  );
}
- Check if roles are loaded:
// In application code, e.g. a temporary debug line (a static import cannot be typed into the DevTools Console)
import { $userRoles } from './stores/auth';
console.log('Roles loaded:', $userRoles.get());
Expected: ["admin", "member"] (not [] or undefined)
- Check RoleGuard usage:
// Component using RoleGuard
function AdminPanel() {
  return (
    <RoleGuard role="admin" fallback={<div>Not authorized</div>}>
      <AdminContent />
    </RoleGuard>
  );
}
Fix:
If AuthProvider is missing:
- Wrap app root with <AuthProviderBridge> (Auth Bridge pattern suites)
- For Auth Direct pattern (governance-suite), ensure auth state is loaded before rendering protected components
If roles are not loaded:
- Check network tab for successful /api/auth/session call
- Verify session endpoint returns roles array
- Force session refresh by logging out and back in
If RoleGuard renders before auth completes:
// Add loading state
function AdminPanel() {
  const { roles, isLoading } = useRoles();
  if (isLoading) {
    return <div>Loading...</div>;
  }
  return (
    <RoleGuard role="admin" fallback={<div>Not authorized</div>}>
      <AdminContent />
    </RoleGuard>
  );
}
How to Check Roles via dfx CLI
Get all roles for a user:
dfx canister call auth-service get_user_roles '("2vxsx-fae")' --network ic --query
Check if user has a specific role:
# Check Admin role
dfx canister call auth-service has_role '("2vxsx-fae", variant { Admin })' --network ic --query
# Check Moderator role
dfx canister call auth-service has_role '("2vxsx-fae", variant { Moderator })' --network ic --query
# Check Member role
dfx canister call auth-service has_role '("2vxsx-fae", variant { Member })' --network ic --query
Validate session with role:
# Replace ACCESS_TOKEN with actual session token
dfx canister call auth-service validate_session_with_role '("ACCESS_TOKEN", "admin")' --network ic --query
Get user's Principal ID from email:
# If you only know the email, use auth-service to look up the Principal
dfx canister call auth-service get_user_by_email '("user@example.com")' --network ic --query
How to Assign Admin Role
Via dfx CLI (requires controller access):
# Assign Admin role
dfx canister call auth-service assign_role '("2vxsx-fae", variant { Admin })' --network ic
# Assign Moderator role
dfx canister call auth-service assign_role '("2vxsx-fae", variant { Moderator })' --network ic
Via another Admin user (programmatically):
// In auth-service canister or via inter-canister call
#[update(guard = "is_admin")]
async fn assign_admin_role(user_id: Principal) -> Result<(), String> {
    add_role(user_id, Role::Admin);
    Ok(())
}
Important: After assigning a role, the user must log out and log back in for the change to take effect (roles are cached in session tokens).
Canister Method Returns "Access Denied" for Admin
Cause: Canister is not correctly calling validate_session_with_role, or access token is not being passed from frontend.
Diagnosis:
- Check if frontend is passing token:
// In frontend service function
async function callAdminMethod() {
  const token = getAccessToken();
  console.log('Sending token:', token ? 'present' : 'MISSING');
  const actor = await createActor('canister-id');
  const result = await actor.admin_method(token, data);
  return result;
}
- Check canister logs:
# View recent canister logs
dfx canister logs <canister-name> --network ic
Look for:
- "SECURITY: Unauthorized admin access attempt" (auth check failed)
- "Auth service error" (inter-canister call failed)
- Manually test role validation:
# Test validate_session_with_role directly with a real token
dfx canister call auth-service validate_session_with_role '("YOUR_ACCESS_TOKEN", "admin")' --network ic --query
Fix:
If token is not being passed:
- Update frontend service to include access token in canister calls
- Ensure token is retrieved from auth store or cookie
If canister role check is failing:
- Verify AUTH_SERVICE_CANISTER_ID constant in canister code
- Check that canister is calling validate_session_with_role correctly
- Ensure error handling is fail-closed (denies access on error)
Example correct implementation:
#[update]
async fn admin_method(access_token: String, data: String) -> Result<String, String> {
    let auth_service_id = Principal::from_text(AUTH_SERVICE_CANISTER_ID)
        .map_err(|e| format!("Config error: {}", e))?;
    let result: Result<(Result<SessionInfo, String>,), _> = call(
        auth_service_id,
        "validate_session_with_role",
        (access_token, "admin".to_string()),
    ).await;
    match result {
        Ok((Ok(session_info),)) => {
            ic_cdk::println!("AUDIT: Admin action by {}", session_info.user_id);
            perform_admin_operation(data)
        }
        Ok((Err(e),)) => {
            ic_cdk::println!("SECURITY: Access denied - {}", e);
            Err(format!("Access denied: {}", e))
        }
        Err((code, msg)) => {
            ic_cdk::println!("ERROR: Auth service call failed - {:?}: {}", code, msg);
            Err("Authentication service unavailable".to_string())
        }
    }
}
Stale Roles After Role Change
Cause: Roles are cached in session tokens at login time. Changes do not take effect retroactively.
Expected Behavior:
- User logs in → Session created with current roles at that moment
- Admin changes user's roles → Stored in auth-service state
- User's active session still has old roles (cached)
- User logs out and back in → New session with updated roles
Fix:
For immediate role change enforcement:
- Implement session invalidation in auth-service (requires backend support)
- Force user to log out (revoke session token)
For non-urgent role changes:
- User will get updated roles on next login (within 24 hours when session expires)
Mitigation for time-sensitive changes:
# Backend: Invalidate user's sessions (if implemented)
dfx canister call auth-service invalidate_user_sessions '("user-id")' --network ic
// Frontend: Clear local session and redirect to login
localStorage.removeItem('access_token');
window.location.href = '/login';
Related Documentation
- RBAC Integration Guide -- Complete RBAC integration documentation
- Architecture Overview -- System architecture (includes RBAC section)
- Repository Map -- Which repo to change
- Local Setup Guide -- Getting started
- Suite Creation Guide -- Creating new suites
- Cross-Suite Auth Debugging -- Detailed SSO diagnosis
- Rollback Procedures -- Suite-specific rollback steps