Why we need RFCs for database implementations!

We eschew reinventing the wheel in every aspect of development, yet when it comes to creating a customer database or an order database, everyone creates their own. They roll out their own database model, usually to their regret, and then have to roll their own create, read, update, delete (CRUD) services. We’ve been using databases for decades, and somehow we have failed to eliminate the constant repetition of the same work. Every day someone is creating a Customer or Order database model from scratch. Why?

We have technologies like Entity Framework, NHibernate, LINQ to SQL, etc… to help make sure we don’t reinvent the wheel for object-relational mapping (ORM, a technique to talk to a relational database with object-oriented code), but we are constantly reinventing it just above the ORM layer because we are reinventing the database itself. Look at the design below. The blocks in red are the ones we are reinventing over and over again.

Reinventing the database wheel

Red = Layers in which we reinvent the wheel and duplicate a lot of code

Let’s look at how this causes a chain reaction of reinventing the wheel. Not only do we reinvent the wheel for the services, we do so for the UI. You can break it down further from UI to MVC or MVVM, or your favorite UI pattern, and you will see that we are duplicating a lot of code that we just don’t need to duplicate.

This also has an effect on Business Intelligence (BI). Because everyone has created a completely different database design, all the queries to get data from the database are different. These queries are used to build reports or to mine the data into a data warehouse. The data often has to go through an Extract, Transform, Load (ETL) procedure to even make it reportable. Why?

Reinventing the database wheel chain reaction

Why are we all accepting the need to reinvent the wheel in so many layers? Is the open source world missing the vision of separation of concerns? We have entire projects that are tightly coupled end to end; some are far better and are loosely coupled end to end. But we don’t have a well-tuned engine of replaceable parts.

Why is it that we don’t have a place for a common database schema and ORM design? The closest thing I have found is this:
http://www.databaseanswers.org/data_models

This list of common schemas is really great, but it is just one box of the image above, and that box isn’t even complete. We need not only common data models, but we need them in a source control repository so people can branch them, change them, and contribute back. We need versions for MS SQL, Oracle, MySQL, PostgreSQL, etc… We need tests to make sure the data models scale. We need a common services layer, and not just in one language. Perhaps Java, C#, and PHP to start out with, so companies don’t have to change the language they code in to use this.

Imagine it. Software startups could really get a jumpstart if they could download a schema for the database they want to use (perhaps PostgreSQL) and a common service layer in their language (perhaps C#), and then build their proprietary idea on top of that. The startup could claim to implement the standard data model (SDM), so enterprise companies would know its software will drop in.
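To make the idea a little more concrete, here is a minimal sketch of what a downloadable schema plus a ready-made service layer might look like. Nothing here is an actual standard: the `customer` table, its columns, and the `CustomerService` class are all hypothetical, and SQLite stands in for whatever database you would really pick.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional

# Hypothetical "standard" customer schema; the column names are assumptions.
CUSTOMER_SCHEMA = """
CREATE TABLE IF NOT EXISTS customer (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    email TEXT NOT NULL UNIQUE
)
"""


@dataclass
class Customer:
    id: Optional[int]
    name: str
    email: str


class CustomerService:
    """CRUD operations written once against the shared schema."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(CUSTOMER_SCHEMA)

    def create(self, name: str, email: str) -> Customer:
        cur = self.conn.execute(
            "INSERT INTO customer (name, email) VALUES (?, ?)", (name, email)
        )
        self.conn.commit()
        return Customer(cur.lastrowid, name, email)

    def read(self, customer_id: int) -> Optional[Customer]:
        row = self.conn.execute(
            "SELECT id, name, email FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return Customer(*row) if row else None

    def update(self, customer: Customer) -> None:
        self.conn.execute(
            "UPDATE customer SET name = ?, email = ? WHERE id = ?",
            (customer.name, customer.email, customer.id),
        )
        self.conn.commit()

    def delete(self, customer_id: int) -> None:
        self.conn.execute("DELETE FROM customer WHERE id = ?", (customer_id,))
        self.conn.commit()
```

A startup would then write `CustomerService(sqlite3.connect("app.db"))` and move straight on to its own idea instead of rewriting these four methods yet again.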

UI designers could start building purchasable UIs for these database models and services.

BI companies could start selling prepackaged content reports along with their BI software.

This blows up even bigger when you start stacking these up. We could start using either inheritance or data model bundling. A shopping cart data model could be a combination of a Person data model, an Order data model, and a Product data model. A Software Product data model might inherit from the Product data model and add a few software-specific tables, etc… When a full solution is released, each bundled model is independent of the rest. Each could live in its own database instance on a different server if desired.
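Here is a rough sketch of how inheritance and bundling could be described in code. The `DataModel` and `Bundle` classes and every table name below are made up for illustration; a real standard would have to define them.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class DataModel:
    """A named set of tables that may extend a parent model."""
    name: str
    tables: Tuple[str, ...]
    parent: Optional["DataModel"] = None

    def all_tables(self) -> Tuple[str, ...]:
        # Inherited tables come first, then the tables this model adds.
        inherited = self.parent.all_tables() if self.parent else ()
        return inherited + self.tables


@dataclass(frozen=True)
class Bundle:
    """A solution assembled from independent data models."""
    name: str
    models: Tuple[DataModel, ...]

    def tables_by_model(self) -> Dict[str, Tuple[str, ...]]:
        # Tables stay grouped per model, so each model can be deployed alone.
        return {m.name: m.all_tables() for m in self.models}


# Hypothetical models; the table names are assumptions.
person = DataModel("Person", ("person", "address"))
order = DataModel("Order", ("order", "order_line"))
product = DataModel("Product", ("product", "price"))

# Inheritance: a software product is a product plus a few extra tables.
software_product = DataModel(
    "SoftwareProduct", ("license", "release"), parent=product
)

# Bundling: a shopping cart is a combination of three independent models.
shopping_cart = Bundle("ShoppingCart", (person, order, product))
```

Because the bundle keeps the tables grouped by model rather than flattening them, nothing stops each model from being deployed to its own database instance.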

Now you can see how integration becomes so much easier. You start seeing how two completely separate applications could share the same data source. No more ETL from one system to the next for integration purposes. You wouldn’t need to do that. Imagine a company has already implemented its Customer Relationship Management (CRM) system using Standard Data Models (SDMs) and now wants to add an E-Commerce shopping cart. They choose an E-Commerce shopping cart that also uses the SDMs. Instead of having to migrate customers into the E-Commerce solution, they simply redirect the E-Commerce solution’s customer data model to point to their existing CRM system. The rest of the E-Commerce data models can point to a separate database.
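A small sketch of that redirect, assuming each data model is routed to its own connection. The `SOURCES` mapping, the `place_order` function, and the table layouts are hypothetical, and two in-memory SQLite databases stand in for the CRM server and the E-Commerce server.

```python
import sqlite3

# Stand-ins for two separate database servers.
crm_db = sqlite3.connect(":memory:")
shop_db = sqlite3.connect(":memory:")

# The CRM already owns the customer data model and its data.
crm_db.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
crm_db.execute("INSERT INTO customer (name) VALUES ('Ada')")
crm_db.commit()

# The shopping cart only creates the models it owns itself.
shop_db.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL, total REAL NOT NULL)"
)

# Per-model routing: the customer model points at the existing CRM database.
SOURCES = {"customer": crm_db, "order": shop_db}


def place_order(customer_id: int, total: float) -> int:
    """Look the customer up in the CRM, then store the order in the shop."""
    found = SOURCES["customer"].execute(
        "SELECT id FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    if found is None:
        raise ValueError(f"unknown customer {customer_id}")
    cur = SOURCES["order"].execute(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        (customer_id, total),
    )
    SOURCES["order"].commit()
    return cur.lastrowid
```

One trade-off shows up right away: the database cannot enforce a foreign key across two servers, so the service layer has to check that the customer exists, which is what `place_order` does here.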

I feel like everyone is missing the big picture here. What do you think?
