Saturday, June 28, 2008

Other than RDBMS.

"A database is a structured collection of records or data." Wikipedia.

That is how the Wikipedia article for Database starts, and it is a good beginning. The article then proceeds to talk primarily about Relational Databases, which is not surprising since Relational Databases and Relational Database Management Systems are the dominant solution to many of the problems encountered when trying to maintain and use a collection of data.

These problems are nontrivial, to say the least, and their solutions often have conflicting requirements, which results in the RDBMS feeling heavy and bloated.

So it is not surprising that there are quite a few special-purpose databases (again, a database is a collection of data) that handle unique needs, and these databases are not RDBMS. There are also some practices developers routinely engage in when they do not want the heavy weight of an RDBMS on their development.

Here is a non-exhaustive list:

1) configuration files
These files help an application answer some questions:
In what context do I run?
How do I find the resources I need? Like the RDBMS I use.

It should be noted that there are a couple of common implementations of configuration databases: the Registry, and INI or XML config files.
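As a rough illustration of a config file answering those two questions, here is a minimal Python sketch using the standard library's configparser. The INI content, section names, key names and host name are all made up for the example; a real application would define its own.

```python
import configparser

# Hypothetical INI content; every section, key and value here is illustrative.
SAMPLE_INI = """
[app]
environment = production

[database]
host = db.example.internal
port = 5432
"""

def load_settings(text):
    """Parse INI text and return the answers the application needs at startup."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        # "In what context do I run?"
        "environment": parser.get("app", "environment"),
        # "How do I find the resources I need?"
        "db_host": parser.get("database", "host"),
        "db_port": parser.getint("database", "port"),
    }

settings = load_settings(SAMPLE_INI)
print(settings)
```

The point is only that the file is a small, queryable collection of data, which is all the definition of a database asks for.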

2) log files
As soon as an application is running in production it will encounter difficulties, and there are some common questions that need to be answered:
What happened recently?
What is noteworthy?
In what order did things happen?

Even if things are okay you will always want some source for diagnostic information.

Two dominant databases in this area are text files that are simply appended to, and the Event Log.
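The append-only text file can be sketched in a few lines of Python. This is an assumption-laden toy, not how any particular logging library works: the tab-separated line format, the function names and the level strings are all invented for the example, and it assumes messages contain no newlines.

```python
import os
import tempfile
from datetime import datetime, timezone

def append_event(path, level, message):
    """Append one timestamped line; existing lines are never rewritten."""
    stamp = datetime.now(timezone.utc).isoformat()
    with open(path, "a", encoding="utf-8") as log:
        log.write(f"{stamp}\t{level}\t{message}\n")

def recent_events(path, count, level=None):
    """Return up to `count` of the newest entries, oldest first.

    Each entry is [timestamp, level, message]. Passing `level` keeps only
    the noteworthy entries of that level.
    """
    with open(path, encoding="utf-8") as log:
        rows = [line.rstrip("\n").split("\t", 2) for line in log]
    if level is not None:
        rows = [row for row in rows if row[1] == level]
    return rows[-count:]

# Write to a throwaway directory so the sketch leaves nothing behind in the cwd.
log_path = os.path.join(tempfile.mkdtemp(), "app.log")
append_event(log_path, "INFO", "service started")
append_event(log_path, "ERROR", "could not reach database")
append_event(log_path, "INFO", "retry succeeded")

last_two = recent_events(log_path, 2)              # what happened recently?
errors = recent_events(log_path, 10, level="ERROR")  # what is noteworthy?
```

Because lines are only ever appended, their position in the file answers the third question, the order in which things happened, for free.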

3) documents and xml
Often the data needs to be transported; in fact, a lot of the terminology in the early days of the web talked of documents and transporting documents. XML, due to its internal structure and syntax, is becoming the dominant solution for transferring data. In fact, the config files mentioned earlier are often XML because it solves the problem of how to get the initial startup data installed.
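To make the transport idea concrete, here is a small Python sketch that turns some records into an XML document and reads them back on the other side, using the standard library's xml.etree.ElementTree. The element and attribute names (printers, printer, name) and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

def to_xml(printers):
    """Serialize (name, location) pairs into an XML string for transport."""
    root = ET.Element("printers")
    for name, location in printers:
        item = ET.SubElement(root, "printer", {"name": name})
        item.text = location
    return ET.tostring(root, encoding="unicode")

def from_xml(text):
    """Rebuild the (name, location) pairs from the XML string."""
    root = ET.fromstring(text)
    return [(item.get("name"), item.text) for item in root.findall("printer")]

records = [("laser-1", "2nd floor"), ("plotter", "R&D lab")]
payload = to_xml(records)      # this string is what would travel over the wire
received = from_xml(payload)
print(payload)
```

The structure and syntax do the work here: the receiver needs no knowledge of the sender beyond the element names, and special characters such as the ampersand are escaped by the serializer.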

4) Directory Services (Active Directory)
Separate from configuration that is unique to a program, there is information that is global in nature:
Where are things located? People, printers, computers.
What is allowed? Who can view what? Who can change what?
Since answering these questions requires security, and security is itself a global concern, it is natural to keep the security information in the Active Directory as well.

5) programs
You might make the case that programs are nothing more than documents, and you would be correct, which makes them a database as well. However, there are enough of them, and they have enough unique concerns, to deserve their own entry. At their core, programs are nothing more than ordered lists of instructions. But to provide anything useful there must be a means of interacting with the flow of the program: either people interact with the program, or other machines or programs do.

6) file systems
Again, back to the original definition: any collection of data is a database. A file system is a collection of information on how to find files, along with details like creation date, ownership and so forth.

7) traditional RDBMS
This is not a complete list of databases other than RDBMS, so I will close with RDBMS, but only to point out a couple of things about them.

RDBMS seek to answer a question authoritatively:
What happened to the data across all of time and space?

This is a hard thing to do, and issues involving performance and reliability arose immediately. Reliability forced the notion of transactions, which can be summarized in an odd way as "until it happens, it did not happen".
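A small Python sketch with the standard library's sqlite3 module shows what "until it happens, it did not happen" looks like in practice. The accounts table, the names and the transfer function are invented for the example; SQLite stands in for any RDBMS with transactions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

def transfer(conn, source, target, amount):
    """Move `amount` between accounts; both updates commit or neither does."""
    with conn:  # commits on success, rolls back if an exception escapes the block
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, source)
        )
        (remaining,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (source,)
        ).fetchone()
        if remaining < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, target)
        )

def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

transfer(conn, "alice", "bob", 30)
after_ok = balances(conn)

try:
    transfer(conn, "alice", "bob", 500)  # the debit runs, then the check fails
except ValueError:
    pass
after_failed = balances(conn)  # the half-finished transfer left no trace
```

In the failed transfer the debit statement really did execute, but because the transaction never committed, as far as the database is concerned it did not happen.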
