Testing in the Real World

Why staging environments lie to you and how to test enterprise web apps with real data, real users, and real load.
20 July 2018·7 min read
John Li
Chief Technology Officer
Your staging environment is lying to you. It has clean data, predictable load, and none of the edge cases that real users generate in the first hour of production. Every enterprise project I've worked on has had at least one moment where something worked perfectly in staging and fell over in production. The difference is whether you planned for that moment or got surprised by it.

What You Need to Know

  • Staging environments are useful for development workflow but unreliable as predictors of production behaviour
  • The three gaps that cause production failures: data volume, concurrent load, and integration behaviour under stress
  • Testing with production-like data is non-negotiable for enterprise systems
  • Load testing should happen continuously, not as a final gate before launch

Why Staging Lies

The Data Gap

Staging databases are small. They have hundreds of records where production will have millions. Queries that return instantly against 500 rows take 12 seconds against 5 million rows. Indexes that aren't needed at staging scale become critical at production scale. Data patterns that are clean and consistent in staging are messy, inconsistent, and full of edge cases in production.
I've seen a search feature work perfectly in staging and time out in production because the staging database had 200 products and production had 180,000. The query wasn't optimised for scale because it never needed to be, until it did.
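The effect is easy to reproduce in miniature. This is a hedged sketch using an in-memory SQLite table (the `products` table and `category` column are illustrative, not from any real project): the same query is timed as a full scan and again after an index is added.

```python
# Sketch: the same query against 500,000 rows, without and then with an
# index. Table and column names are illustrative assumptions.
import sqlite3
import time

def timed_search(conn, term):
    start = time.perf_counter()
    row = conn.execute(
        "SELECT COUNT(*) FROM products WHERE category = ?", (term,)
    ).fetchone()
    return row[0], time.perf_counter() - start

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, category TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    ((i, f"cat-{i % 50}") for i in range(500_000)),
)

count, slow = timed_search(conn, "cat-7")   # full table scan
conn.execute("CREATE INDEX idx_category ON products (category)")
count, fast = timed_search(conn, "cat-7")   # index lookup
print(f"scan: {slow:.4f}s  indexed: {fast:.4f}s  rows: {count}")
```

Against a 200-row staging table, both versions look instant; the gap only shows up at volume, which is exactly the point.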

The Load Gap

Staging has one or two testers clicking through the application sequentially. Production has 500 users hitting it simultaneously, running reports that lock database tables, uploading files that consume bandwidth, and triggering background processes that compete for CPU.
Concurrency issues don't surface under light load: race conditions, deadlocks, connection pool exhaustion, and memory leaks that only become visible after hours of sustained use. These are the bugs staging can't catch, because staging doesn't replicate the conditions that create them.
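A race condition is the canonical example. The sketch below (illustrative names, not from any real system) has four threads doing a read-modify-write on a shared counter. One or two sequential testers would never hit it; concurrent workers lose updates unless the operation is locked.

```python
# Sketch of a race condition that light load never triggers: concurrent
# read-modify-write on shared state loses updates without a lock.
import threading
import time

class Account:
    def __init__(self):
        self.balance = 0
        self.lock = threading.Lock()

    def deposit_unsafe(self):
        value = self.balance       # read
        time.sleep(0)              # yield: widens the race window
        self.balance = value + 1   # write back a possibly stale value

    def deposit_safe(self):
        with self.lock:            # read-modify-write is now atomic
            value = self.balance
            time.sleep(0)
            self.balance = value + 1

def run(method, workers=4, times=250):
    acct = Account()
    threads = [
        threading.Thread(
            target=lambda: [getattr(acct, method)() for _ in range(times)]
        )
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return acct.balance

print("unsafe:", run("deposit_unsafe"))  # often < 1000: lost updates
print("safe:  ", run("deposit_safe"))    # always 1000
```

Note that the unsafe version may even pass occasionally, which is what makes these bugs so hard to catch without sustained concurrent load.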

The Integration Gap

In staging, your integrations connect to sandbox environments with predictable responses. In production, they connect to real systems that time out, return unexpected data, change their API behaviour without warning, and occasionally go down entirely.
The integration that worked perfectly in staging fails in production because the real SAP instance returns date formats that the sandbox never did. The payment gateway that responded in 200ms in sandbox takes 4 seconds under real transaction load. The email service that delivered instantly in testing starts queuing during peak hours.

How to Test for Reality

1. Use Production-Like Data

Copy production data into your test environment. Sanitise it for privacy, of course. Remove personally identifiable information, mask sensitive fields, comply with your data governance requirements. But keep the volume, the variety, and the messiness.
If you can't copy production data, generate synthetic data that matches production characteristics. Same volume. Same distribution of values. Same proportion of edge cases and malformed records. This takes effort, but it's the single most valuable investment in testing quality.
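As a hedged illustration of what "matching production characteristics" means in practice, here is a minimal generator. The field names and defect ratios are assumptions; the real ratios should come from profiling the production database.

```python
# Sketch: synthetic records that keep production's messiness.
# Field names and defect ratios are illustrative assumptions.
import random
import string

def synthetic_customer(rng):
    name = "".join(rng.choices(string.ascii_letters + " ", k=rng.randint(2, 40)))
    record = {
        "name": name.strip(),
        "email": f"{name[:5].strip()}@example.com".lower(),
    }
    roll = rng.random()
    if roll < 0.03:
        record["email"] = None                       # ~3% missing, as in prod
    elif roll < 0.05:
        record["email"] = "not-an-email"             # ~2% malformed
    elif roll < 0.06:
        record["name"] = record["name"] + "\u00a0"   # trailing non-breaking space
    return record

rng = random.Random(42)
dataset = [synthetic_customer(rng) for _ in range(100_000)]
missing = sum(1 for r in dataset if r["email"] is None)
print(f"{len(dataset)} records, {missing} with missing email")
```

The deliberate defects matter as much as the volume: a validation layer that has never seen a null email or a non-breaking space in testing will meet both in its first production hour.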

2. Load Test Continuously

Don't save load testing for the end. Run it every sprint. Define baseline performance metrics early and test against them continuously. When a code change causes a 20% regression in response time, catch it the week it was introduced, not three months later when the architecture has been built on top of it.
Tools like JMeter and Gatling are mature enough for most enterprise scenarios. The hard part isn't the tooling. It's defining realistic load profiles. How many concurrent users? What mix of operations? What peak-to-average ratio? Get this from the client's operational data, not from guesswork.
A load test with the wrong profile is worse than no load test at all. It gives you false confidence.
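The "test against baselines continuously" idea can be sketched in a few lines. This harness is an assumption-laden stand-in, not a JMeter or Gatling replacement: `fake_request` simulates an endpoint, the baseline value is invented, and a real profile would come from the client's operational data.

```python
# Sketch of a continuous load check: fire concurrent "requests" and fail
# the build when p95 latency regresses past the agreed baseline.
import concurrent.futures
import random
import statistics
import time

BASELINE_P95_MS = 250   # agreed early, checked every sprint (assumed value)

def fake_request(seed):
    rng = random.Random(seed)
    time.sleep(rng.uniform(0.01, 0.05))   # stand-in for a real HTTP call

def p95(latencies_ms):
    # 95th percentile: last of the 19 cut points from n=20 quantiles
    return statistics.quantiles(latencies_ms, n=20)[-1]

def run_load_test(concurrency=20, requests=200):
    def timed(i):
        start = time.perf_counter()
        fake_request(i)
        return (time.perf_counter() - start) * 1000
    with concurrent.futures.ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(timed, range(requests)))
    return p95(latencies)

observed = run_load_test()
print(f"p95 = {observed:.1f} ms (baseline {BASELINE_P95_MS} ms)")
assert observed <= BASELINE_P95_MS, "latency regression: fail the build"
```

Wired into CI, the final assertion is what turns load testing from a launch gate into a per-sprint regression check.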

3. Test Integration Failure Modes

Don't just test that integrations work. Test that they fail gracefully. What happens when the upstream system is slow? What happens when it's down entirely? What happens when it returns invalid data?
Build integration tests that simulate degraded conditions. Inject latency. Return error responses. Drop connections. Your application needs to handle all of these without losing data or showing users a blank screen.
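One way to build those tests is a fault-injecting wrapper around the upstream client. The sketch below is a set of assumptions, not a real client: `FaultyUpstream` randomly injects failures and latency, and `fetch_with_fallback` shows the caller-side handling under test.

```python
# Sketch of fault injection for integration tests: wrap the upstream
# call, inject failures and latency, and verify graceful degradation.
# Client, fault rates, and field names are illustrative assumptions.
import random
import time

class FaultyUpstream:
    """Wraps an upstream call and injects degraded conditions."""
    def __init__(self, call, rng, latency_s=2.0, error_rate=0.3):
        self.call, self.rng = call, rng
        self.latency_s, self.error_rate = latency_s, error_rate

    def __call__(self, *args):
        roll = self.rng.random()
        if roll < self.error_rate:
            raise ConnectionError("injected upstream failure")
        if roll < self.error_rate + 0.3:
            time.sleep(self.latency_s)        # injected slow response
        return self.call(*args)

def fetch_with_fallback(client, order_id, timeout_s=1.0):
    """Caller-side handling under test: treat slow as degraded, fail soft."""
    start = time.perf_counter()
    try:
        result = client(order_id)
        if time.perf_counter() - start > timeout_s:
            return {"status": "degraded", "order": None}   # too slow to trust
        return {"status": "ok", "order": result}
    except ConnectionError:
        return {"status": "unavailable", "order": None}    # graceful failure

real_call = lambda order_id: {"id": order_id}
client = FaultyUpstream(real_call, random.Random(7), latency_s=0.01)
results = [
    fetch_with_fallback(client, i, timeout_s=0.005)["status"] for i in range(20)
]
print({s: results.count(s) for s in set(results)})
```

The test then asserts that every outcome is one the application handles deliberately, with no lost data and no blank screen.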

4. Include Real Users Early

Beta testing with actual enterprise users catches things that no automated test can. Users don't follow happy paths. They paste data from Excel with hidden characters. They open the application in three tabs simultaneously. They start a process, get interrupted, and come back to it two days later.
We typically run a two-week beta period with 10-20 real users before any enterprise launch. Those two weeks consistently produce the most valuable testing feedback of the entire project.

5. Monitor Like You're Already in Production

Set up production-grade monitoring in your test environment. Application performance monitoring. Error tracking. Log aggregation. Database query analysis. If you're going to use these tools in production (and you should), use them in testing first. Learn to read them. Establish baselines.
When something goes wrong in production, and something will, you need to be able to diagnose it in minutes, not hours. That diagnostic skill comes from using the tools before you need them urgently.
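"Establish baselines" can start as simply as instrumenting calls and aggregating their latencies. This sketch is illustrative (metric names and the workload are invented); a real project would ship the same numbers to its APM tool rather than print them.

```python
# Minimal sketch of baseline-building: record each call's duration under
# a metric name, then summarise. Names and workload are illustrative.
import collections
import functools
import statistics
import time

metrics = collections.defaultdict(list)

def timed(name):
    """Decorator: record the duration of each call under `name`."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                metrics[name].append((time.perf_counter() - start) * 1000)
        return inner
    return wrap

@timed("report.generate")
def generate_report(rows):
    return sum(rows)              # stand-in for real work

for n in range(1, 51):
    generate_report(range(n * 100))

samples = metrics["report.generate"]
print(f"report.generate: n={len(samples)} "
      f"median={statistics.median(samples):.3f}ms "
      f"max={max(samples):.3f}ms")
```

Once numbers like these exist from the test environment, a production anomaly is a deviation from a known baseline rather than a mystery.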

The Launch Conversation

Every enterprise launch should include a clear, honest conversation about risk. What are the known gaps in testing? What are the monitoring plans for the first 48 hours? Who's on call? What's the rollback plan if something goes seriously wrong?
The goal isn't zero risk. That's impossible. The goal is understood risk with a plan for every plausible failure mode. Clients respect honesty about risk far more than false confidence that everything will be fine.