BigTable Concept: Why do the World’s Smartest People Ignore Relational DBs?

In the era of the Internet, the key problem is scalability. As cloud’s popularity climbs up, we are hearing more about the constraints. So far, I only had time to play with Google’s App Engine and Microsoft’s Azure Services Platform. Cloud developers are mainly shocked by the new non-relational databases that cloud services use as the only alternative. Google calls it BigTable and Microsoft finds a new place in its own terminology dictionary for BLOB. Many start to wonder what the hype about the relational databases was over the past 30 years. Foremost, let’s clear that this is not a replacement, but a more efficient way to store data by eliminating not-that-fundamental super engineered functionality layers of the current relational database management systems. Yes, good news for people makes living by designing super large and highly normalized databases to ensure data integrity.

On a relational database, everything is in control; you can add constrains to ensure nobody will be able to enter a duplicated row. Or in deletion, you can program DBMS to handle the useless orphan rows. But the best, a relational DBMS is going to pre-process your SQL query before executing to avoid silly performance mistakes you can make. Think of the environment now: constraints over constraints, query execution strategies, high-level of dependence and complex indexing methods. This package works great unless you want to distribute the tables to different machines. Can you image joining two tables where tables are distributed over 100.000 nodes? In a Google case, this is the everyday problem (or better, call it an every millisecond issue). Luckily, Google’s data has characteristics; according to Jeffrey Dean, they are able to manage constraints DBMSes serve to process data, on the application level. Consequently, Google keeps data in a very basic form as <key,value,timestamp> tuples.

BigTable looks like a very large B+ tree. It has 3 levels of hierarchy. All of tables are sorted and those tables are separated into pieces called tablets. First two levels are made of metadata tables to locate you to the right  tablet. Root tablet is not distributed, but with helps of prefetching and extreme caching, it is actually not the bottleneck of the system. Final level tablets points to physical files (managed by Google File System). GFS provides 3 copies for each file on the system, so no matter if a machine is going down, they still have 2 other copies somewhere else. In the 2nd figure, a row of a tablet is illustrated. com.cnn.www is the key in this case and value has three different columns: contents, anchor:cnnsi.com and anchor:my.look.ca. Notice the timestamps, these fields may contain more than one version of entry. In this case, as Google crawler finds updated content on www.cnn.com, a new layer is being added. This enables and leads BigTable to provide a three dimensional data presentation.

In the end of the day, BigTable is not rocket science. It is compact and easy to adopt. It is very straight-forward. Many friends know I came with a very similar concept while designing Rootapi two years ago, those were the times I havent heard of BigTable. Additionally I was saving values as JSON (equality operation was enough in querying) in blocks which were multiples of the sector size of my physical hard drives. IO operations were super fast, JSON based web services were super fast and it was highly distributable, although I couldn’t find a great environment to explore the severe situations deeply.

As we move on the cloud, this is the way we are going to look at data storage. If you need more technical details, I highly recommend you to take a look at the following references:

  1. BigTable: A Distributed Structured Storage System
  2. Bigtable: A Distributed Storage System for Structured Data – Original publication paper of BigTable, appeared in OSDI’06.
  3. Google File System An introduction to GFS by Aaron Kimball.