Skip to main content

The Database Nobody Designed

Every enterprise has one: a database that grew without design. What happens when you try to build on top of it.
10 February 2017·6 min read
John Li
John Li
Chief Technology Officer
Every enterprise has one. It started as a simple Access database in 2004. Someone added a few tables. Then a few more. Then someone else added their own tables without knowing the first set existed. Twelve years later, it is the backbone of three business processes, nobody understands the schema, and two columns are named "Status" and "Status2."
Every enterprise has this database. Sometimes it's Access. Sometimes it's a SQL Server instance that's been running on a machine under someone's desk since the Clark government. Sometimes it's a collection of Excel files that someone optimistically calls "the system." The details change. The pattern doesn't.

How It Happens

Nobody sets out to build a bad database. It happens incrementally, over years, through decisions that were each reasonable at the time.
A team needs to track something. They create a table. The fields make sense for the current requirement. Six months later, a new requirement arrives. Rather than redesigning the schema, someone adds a column. Then another. Then a "Notes" field that becomes a dumping ground for everything that doesn't fit elsewhere. Then a "Type" column that has fourteen possible values, three of which mean the same thing, and two of which nobody can explain.
The relationships between tables are implicit rather than explicit. There's no foreign key constraint because someone removed it when it "kept causing errors" during a data import in 2009. The data types are inconsistent. Dates are stored as strings in some tables and datetime in others. Currency fields are decimal in one table and float in another, which means the numbers don't quite match when you join them.
Nobody designed this. It grew.

What Happens When You Build On It

Eventually, the organisation decides they need a proper application. A web portal, a reporting dashboard, an automated workflow. The requirement is clear. The budget is approved. The project kicks off.
Then the development team meets the database.
The first week is spent understanding the schema. Not from documentation, because there isn't any. From reverse-engineering the tables, reading the stored procedures, and asking staff what each field means. The answers are usually "I think it's..." or "that was before my time."
The second week is spent discovering the data quality issues. Null values where there shouldn't be any. Duplicate records that differ by one field. Orphaned rows that reference records deleted years ago. Date fields with values like "TBC" and "asap."
By the third week, the project plan needs revising. The "simple" data layer that was going to take two weeks is now a month. Half of that is data cleaning. The other half is building an abstraction layer that presents a coherent data model on top of the incoherent reality underneath.
The database is never as clean as the client thinks it is. Budget for the moment when someone says "oh, we also have this other spreadsheet."
John Li
Chief Technology Officer

The Temptation to Rebuild

The obvious answer is: redesign the database. Start fresh. Build a proper schema with proper relationships and proper constraints. Migrate the data. Done.
In theory, yes. In practice, the existing database is connected to everything. Reports pull from it. Exports feed into other systems. Staff have built processes around its quirks. Rebuilding the database means rebuilding every downstream dependency, and discovering dependencies nobody knew existed.
We've done database migrations. They work. But they're their own project, not a line item in the application build. The organisations that treat migration as a quick prerequisite get burned. The ones that scope it honestly, with its own timeline and budget, get a foundation that lasts.

What We Actually Do

When we encounter the database nobody designed, we follow three steps.
First, we map it. Every table, every relationship, every stored procedure. We document what we find, including the inconsistencies and the unknowns. This map becomes the foundation for every decision that follows.
Second, we decide what to fix and what to abstract. Not everything needs to be perfect. Some tables are messy but stable and rarely touched. Others are actively causing problems and need to be restructured. The pragmatic choice is to fix what matters and build an abstraction layer over the rest.
Third, we build the new application against a clean data access layer, not directly against the messy schema. The application sees a coherent data model. The data access layer translates between that model and the reality underneath. When the client is ready to improve the database further, the application doesn't need to change.
It's not elegant. But it works. And it's honest about the state of most enterprise data.
The database nobody designed is one of the most common challenges in NZ enterprise projects. Pretending it's not there doesn't help. Acknowledging it, scoping for it, and building around it intelligently is what separates projects that succeed from projects that stall three weeks in.