I am finally scratching an itch I’ve had for a while: figuring out how to build a relational database management system, or RDBMS. Why do this? Well, when I was in the throes of my graduate degree, a professor told us about a contract he had once taken. As part of the contract he decided to build an RDBMS from the ground up. “How hard can it be?” he thought to himself. It turned out to be a lot harder than he expected. Then about a month ago silly little questions arose in my head, like “How does an RDBMS do that?” and “Why haven’t you built a compiler recently?” So of course, the only thing I can do is build an RDBMS and blog about my progress. Certainly nothing would make me happier than to share my programming problems with the world in general. I should also make it clear that even though I am pretty sure of the general outline of the project, I am not certain about my final solution. A quick note: I may use DBMS instead of RDBMS in some places. I mean the same thing; RDBMS is the more precise term, but DBMS is the more common one.

There are three popular commercial RDBMSs and two popular open source ones. The three commercial products I am thinking of are Oracle, IBM DB2, and Microsoft SQL Server, though there are many more excellent commercial systems out there. The vendors listed also offer a basic edition of their engine for free. These free editions are really meant for development, education, or really small databases. You certainly wouldn’t want to run a web-based store on one, unless you are only expecting a couple of customers at a time. The two open source systems are MySQL and PostgreSQL. They are relatively complete, but they do lack some of the more sophisticated features of the commercial offerings. There are open source extensions for these as well, but I will leave that to you to explore if you desire. For completeness I should also mention that not all databases are relational.

So what am I looking to do? What is required to say I have made an RDBMS? First, of course, it must be able to store and retrieve data. It should do so in a relational way, that is, within the framework of Relational Algebra in general and normalization more specifically. It should implement the SQL language for querying the database. For now, it should be composed of two parts: a command line interface, and a service or daemon that manages the files.
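To make “in a relational way” a little more concrete, here is a minimal Python sketch of two Relational Algebra operations, selection and projection, over an in-memory table. Everything in it is an assumption for illustration only: the `employees` data, the `select` and `project` helper names, and the choice of a list of dicts as the table representation. It is not the design I will end up with.

```python
# Hypothetical in-memory "relation": a list of rows, each row a dict.
employees = [
    {"id": 1, "name": "Ada", "dept": "eng"},
    {"id": 2, "name": "Grace", "dept": "eng"},
    {"id": 3, "name": "Edgar", "dept": "research"},
]


def select(relation, predicate):
    """Selection: keep only the rows for which predicate(row) is true."""
    return [row for row in relation if predicate(row)]


def project(relation, columns):
    """Projection: keep only the named columns, dropping duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[c] for c in columns)
        if key not in seen:  # a relation is a set, so duplicates are removed
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


# Roughly: SELECT DISTINCT name FROM employees WHERE dept = 'eng';
eng_names = project(select(employees, lambda r: r["dept"] == "eng"), ["name"])
```

Composing the two functions mirrors how a simple SQL query breaks down: the WHERE clause is a selection and the column list is a projection.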

Finally, as I go I will try to write working code, so that I can be sure what I am saying isn’t completely wrong. That doesn’t guarantee mistakes won’t happen, only that I am trying to prevent them. I will try to post the results to SourceForge. I will also be using PostgreSQL to develop my SQL code and examples and to validate my results.

This is the basic framework that I want to try:

  1. Basic concepts of relational databases
  2. Relational Algebra
  3. Normalization
  4. SQL Basics
  5. Advanced SQL
  6. RDBMS Security
  7. RDBMS Management
  8. Building a daemon
  9. Efficient Data Storage
  10. Compiling SQL
  11. Error Trapping
  12. Building a Command Line Interface

Well, I am off to get my mind around the first topic and how to present it in a non-pedantic way.
