Operations
Make sure you can deploy confidently and know when things break.
Run this audit before going to production, after infrastructure changes, or when reliability becomes a concern.
Useful for developers and ops teams ensuring production readiness.
Why This Matters
You can't fix what you can't see. Good operations practices let you ship with confidence, catch problems before users report them, and recover quickly when things go wrong. Without monitoring and reliable deployments, you're flying blind.
What to Check
Focus on these four areas when reviewing operations:
Deployment Pipeline
Can you deploy safely? Check that deployments are automated, tests run before every deploy, and you can roll back quickly if something goes wrong.
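The checks above can be sketched as a small deploy gate. This is a minimal illustration, not any platform's API: `DeployHooks`, `deployWithGate`, and the hook names are hypothetical, and in a real setup each hook would wrap your CI runner and hosting provider.

```typescript
// Hypothetical hooks; in practice these would call your CI and hosting tools.
interface DeployHooks {
  runTests: () => boolean;               // true when the test suite passes
  deploy: (version: string) => void;     // ships the given version
  healthCheck: () => boolean;            // true when the new release looks healthy
  rollback: (toVersion: string) => void; // restores a previous version
}

type DeployResult = "deployed" | "blocked_by_tests" | "rolled_back";

// Deploys `next` only if tests pass, and rolls back to `current`
// when the post-deploy health check fails.
function deployWithGate(current: string, next: string, hooks: DeployHooks): DeployResult {
  if (!hooks.runTests()) {
    return "blocked_by_tests"; // never ship a failing build
  }
  hooks.deploy(next);
  if (!hooks.healthCheck()) {
    hooks.rollback(current);
    return "rolled_back";
  }
  return "deployed";
}

// Usage with stubbed hooks that record what happened.
const deployLog: string[] = [];
const outcome = deployWithGate("v1", "v2", {
  runTests: () => true,
  deploy: (v) => { deployLog.push(`deploy ${v}`); },
  healthCheck: () => false, // simulate a bad release
  rollback: (v) => { deployLog.push(`rollback ${v}`); },
});
console.log(outcome, deployLog); // outcome is "rolled_back"
```

Passing the hooks in as functions keeps the gate logic testable without touching real infrastructure.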
Monitoring
Do you know when it breaks? Error tracking, uptime monitoring, and performance dashboards should alert you to problems before users complain.
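As one possible alerting rule, the sketch below fires only after several consecutive failed uptime checks, so a single network blip does not page anyone. The function name `shouldAlert` and the default threshold of 3 are assumptions for illustration; hosted monitors typically expose a similar setting.

```typescript
// One uptime probe result: true = site responded OK, false = check failed.
type CheckResult = boolean;

// Returns true when the most recent `threshold` checks all failed.
function shouldAlert(history: CheckResult[], threshold: number = 3): boolean {
  if (threshold <= 0 || history.length < threshold) {
    return false; // not enough data to decide
  }
  return history.slice(-threshold).every((ok) => !ok);
}

console.log(shouldAlert([true, false, false, false])); // true
console.log(shouldAlert([false, false, true, false])); // false
```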
Resilience
Does it recover gracefully? Error boundaries, retry logic, and graceful degradation help the app survive failures without crashing completely.
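Retry logic and graceful degradation might look like the following sketch. The helper names (`backoffDelayMs`, `withRetry`, `withFallback`) and the default delays are hypothetical choices, not part of any library.

```typescript
// Delay before retry number `attempt` (0-based): doubles each time, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs: number = 200, maxMs: number = 5000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Retries a failing async operation up to `maxAttempts` times, then rethrows.
async function withRetry<T>(op: () => Promise<T>, maxAttempts: number = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt)); // wait before the next try
      }
    }
  }
  throw lastError;
}

// Graceful degradation: return a fallback value instead of failing outright.
async function withFallback<T>(op: () => Promise<T>, fallback: T): Promise<T> {
  try {
    return await withRetry(op);
  } catch {
    return fallback;
  }
}
```

For example, `withFallback(() => loadRecommendations(), [])` would render an empty recommendations list when a (hypothetical) `loadRecommendations` call keeps failing, rather than breaking the page.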
Environment Config
Are environments properly separated? Production secrets should never leak to development. Environment variables should be documented and secure.
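One way to keep environment variables documented and checked is to list the required names in code and validate them at startup. This is a sketch under assumptions: the variable names `DATABASE_URL` and `SENTRY_DSN` are placeholders, and in a Node app you would pass `process.env` as the `env` argument.

```typescript
type Env = Record<string, string | undefined>;

// Returns the names of required variables that are missing or blank.
function missingEnvVars(env: Env, required: string[]): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Throws at startup so a misconfigured environment fails fast, not mid-request.
function assertEnv(env: Env, required: string[]): void {
  const missing = missingEnvVars(env, required);
  if (missing.length > 0) {
    // Report names only; never log secret values.
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}

// Placeholder variable names for illustration.
const REQUIRED_VARS = ["DATABASE_URL", "SENTRY_DSN"];
const missingVars = missingEnvVars({ DATABASE_URL: "postgres://localhost/dev" }, REQUIRED_VARS);
console.log(missingVars); // only "SENTRY_DSN" is reported as missing
```

The required list doubles as documentation: anyone setting up a new environment can read it to see what must be configured.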
Stage Expectations
What operational standards apply at each stage:
Skip
Manual deploy is fine. No monitoring needed. Just get it running.
Light
Basic CI/CD in place. Error tracking active. Can deploy without fear.
Full
Full pipeline with tests. Monitoring dashboard. Alerting configured.
Complete
Runbooks documented. On-call rotation. Incident response ready.
Incident Response Steps
When something goes wrong, follow this pattern:
1. Detect: Alert received, verify the issue exists
2. Communicate: Notify stakeholders, update status page
3. Contain: Prevent further damage (feature flag, rollback)
4. Diagnose: Find root cause using logs and monitoring
5. Fix: Implement and deploy solution
6. Verify: Confirm issue resolved, monitoring normal
7. Review: Post-incident review, document learnings
Common Issues
Manual deployments required
Set up auto-deploy from the main branch using Vercel, Netlify, or GitHub Actions.
No way to roll back
Use Vercel instant rollback or implement deployment versioning.
Don't know when the site is down
Set up uptime monitoring with alerts (UptimeRobot, Vercel, etc.).
Errors crash entire app
Add React error boundaries at page and component level.
Secrets in version control
Move to environment variables. Use platform secret manager.
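To illustrate the idea behind error boundaries without depending on a framework, the sketch below wraps each page section so that one failure yields fallback markup instead of breaking everything. This is an analogy, not React's API: in React itself, an error boundary is a class component that implements `static getDerivedStateFromError` or `componentDidCatch`. The names `Renderer` and `withBoundary` are hypothetical.

```typescript
// A section renderer returns markup (here, a plain string) or throws.
type Renderer = () => string;

// Wraps a renderer so a failure is reported and replaced by fallback markup,
// instead of propagating and taking down the whole page.
function withBoundary(
  render: Renderer,
  fallback: string,
  report: (err: unknown) => void
): Renderer {
  return () => {
    try {
      return render();
    } catch (err) {
      report(err); // e.g. forward to your error tracker
      return fallback;
    }
  };
}

const reportedErrors: unknown[] = [];
const track = (err: unknown): void => { reportedErrors.push(err); };

const pageSections: Renderer[] = [
  withBoundary(() => "<header>Shop</header>", "", track),
  withBoundary(() => { throw new Error("cart failed"); }, "<p>Cart unavailable</p>", track),
];

console.log(pageSections.map((render) => render()).join(""));
// <header>Shop</header><p>Cart unavailable</p>
```

Wrapping at both page and component level follows the same pattern: the closer the boundary is to the failing piece, the more of the page stays usable.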
Run with AI: Use /audit deploy to have an AI agent check your deployment and monitoring setup. The agent will analyze CI/CD configuration, error handling, and environment variables.