NoSQL is one of the most talked-about terms in the software industry today. NoSQL databases are widely adopted, but they are NOT a replacement for the traditional relational database management system (RDBMS), which stores data in relational tables. Put simply, NoSQL exists to fill the gaps found in the traditional RDBMS.
In this article, I will discuss the NoSQL database and its various aspects.
What is NoSQL
NoSQL – usually interpreted as ‘Not only SQL’ – refers to databases that store and retrieve data in a manner different from the traditional RDBMS, which depends heavily on tabular relations. This approach was adopted for the following reasons:
- Design Simplicity/Performance – In NoSQL the data structure is often simple, such as key-value pairs or documents, rather than a set of related tables. Because the data structure is simple and easy to manage, NoSQL can be faster than its relational counterparts for the access patterns it is designed for. Performance is a major differentiator.
- Horizontal Scalability – NoSQL database implementations can be scaled out or in by adding or removing servers as and when required.
So the two most influential factors behind NoSQL databases are ‘Performance’ and ‘Scalability’. NoSQL databases are designed to address these drawbacks of the relational model.
Different Types of NoSQL Databases
There are different types of NoSQL databases available in the market. Let us have a look to get an understanding of each.
- Key Value paired database – This is the simplest and most commonly used type of NoSQL database. Each item is stored as an attribute name, called the key, along with its value. So it is basically a key-value pair. E.g. Redis, Riak.
- Graph Stores – This category of NoSQL is used to store information about networks, such as social networking data. E.g. Neo4j, HyperGraphDB.
- Document Database – This is an extended form of the key value paired database in which every key is associated with a complex data structure known as a document. Documents can contain key-value pairs or even nested documents. E.g. MongoDB, CouchDB.
- Wide Column storage – These are optimized for queries over large data sets. These databases store columns of data together instead of rows. E.g. Cassandra, HBase.
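To make the four categories concrete, here is a minimal sketch that models the same small data set in each style using plain Python dictionaries and lists. The variable names and sample records are hypothetical, and the wide column layout follows this article's simplified "columns instead of rows" description; real products expose their own APIs and storage formats.

```python
# Key-value: an opaque value looked up by a single key.
kv_store = {"user:1": "Alice", "user:2": "Bob"}

# Document: each key maps to a structured, possibly nested document.
documents = {
    "user:1": {"name": "Alice", "address": {"city": "Pune"}, "skills": ["java", "sql"]},
    "user:2": {"name": "Bob"},  # documents need not share the same fields
}

# Wide column (simplified): values are grouped by column, so one column
# can be scanned without reading the rest of each row.
wide_columns = {
    "name": {"user:1": "Alice", "user:2": "Bob"},
    "city": {"user:1": "Pune"},
}

# Graph: nodes plus labelled edges (relationships) between them.
graph_edges = [("user:1", "FOLLOWS", "user:2")]


def followers_of(user_id):
    """Return the ids of users that have a FOLLOWS edge pointing at user_id."""
    return [src for src, label, dst in graph_edges if label == "FOLLOWS" and dst == user_id]
```

Reading the same users back from each structure shows the trade-off: the key-value store answers only "what is stored under this key", while the document, column and graph layouts each make a different kind of query cheap.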
Advantages of NoSQL
Compared to traditional relational databases, NoSQL databases are generally easier to scale and can offer better performance for suitable workloads. Relational databases are often said to struggle in the following scenarios:
- Relational databases can struggle to handle very large volumes of data, especially when it is semi-structured or unstructured.
- Relational databases are a poor fit for agile environments, which are sprint based and require rapid iteration and frequent code releases.
- Relational tables do not map directly onto the objects used in object oriented programming, so extra mapping code is needed between the two.
- If you want to store hierarchical objects with query capabilities, an RDBMS is usually not the recommended solution; NoSQL databases tend to perform better here.
- For a cloud deployment, which is a distributed environment, a traditional RDBMS is harder to spread across many servers.
So in the above scenarios NoSQL can fill the gaps. The NoSQL data model is efficient and has a scalable architecture, compared to the relational model, which can be expensive to scale and typically follows a monolithic architecture.
NoSQL Allows Us to Have Dynamic Schema for the Database
In a relational database, we need to define the schema at the very beginning. Any relational database needs to know in advance the data that we want to store, e.g. for an employee’s record, the name, department, phone number, address etc. We also need to know the data types and their possible sizes in advance. This approach presents challenges in agile development, because every time we include a new feature we may need to modify the schema, which can make the application unstable. For example, if we decide to add the spouse and kids details of every employee, we will need to add a few more columns and then migrate the old data into the new table. If the database is large, this migration can take a significant amount of time and may result in long downtime. If we need to make these kinds of changes frequently, managing the downtime becomes quite problematic.
NoSQL based databases are designed to handle these kinds of situations. In NoSQL databases, we can insert data without a pre-defined schema, which makes our lives easier when making changes at the database level. This helps rapid development and also makes code integration easier.
So in NoSQL, the advantage of ‘Dynamic Schema’ gives us a lot of flexibility for managing the ever changing demands of web applications.
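The contrast can be sketched with Python's built-in `sqlite3` module standing in for a relational database and a plain list of dictionaries standing in for a document store. The `employee` table, its columns and the sample records are hypothetical. Adding a column to a tiny in-memory SQLite table is instant, so this only illustrates the extra schema step, not the migration cost on a large production database.

```python
import sqlite3

# Relational side: the schema is declared before any row is stored.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, department TEXT, phone TEXT)")
conn.execute("INSERT INTO employee VALUES ('Asha', 'Finance', '555-0100')")

# A new requirement (spouse details) needs a schema change first.
conn.execute("ALTER TABLE employee ADD COLUMN spouse TEXT")
conn.execute(
    "INSERT INTO employee (name, department, phone, spouse) VALUES (?, ?, ?, ?)",
    ("Ravi", "Sales", "555-0101", "Meera"),
)
relational_rows = conn.execute(
    "SELECT name, spouse FROM employee ORDER BY name"
).fetchall()

# Document side: each record carries its own fields, so new attributes
# (even nested ones) can appear without touching existing records.
employees = [
    {"name": "Asha", "department": "Finance", "phone": "555-0100"},
    {
        "name": "Ravi",
        "department": "Sales",
        "phone": "555-0101",
        "spouse": "Meera",
        "kids": [{"name": "Anil", "age": 4}],
    },
]

# Readers must tolerate missing fields, since nothing enforces them.
spouses = {e["name"]: e.get("spouse") for e in employees}
```

The flexibility has a cost: with no schema to enforce the shape of a record, the application code has to cope with fields that may or may not be present.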
Because of the way they are structured, relational databases typically scale vertically, i.e. if we need to scale the database of an application, we host the entire database on a single, more powerful server. This keeps all the data available in one place, but the approach is relatively expensive and the single server becomes a single point of failure. To get past this bottleneck it is advisable to scale horizontally rather than vertically.
Sharding, which spreads a database across multiple server instances, can be done with SQL based databases, but it is often accomplished with the help of Storage Area Networks or SANs. Since the databases don’t provide this feature natively, it becomes the responsibility of the developer to deploy multiple relational databases across different systems. Each database instance stores its own portion of the data independently. The developer needs to write application code to distribute the data, distribute the queries and collate the results across all the database instances. In addition, code must be developed to handle resource failures, to perform joins across the different databases, and to take care of data rebalancing and replication. Many benefits of the relational database, like transactional integrity, are compromised when employing manual sharding.
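A minimal sketch of what manual sharding asks of the application code, using in-memory dictionaries as stand-ins for separate database servers. The names `SHARDS`, `shard_for`, `put`, `get` and `find_by_department` are hypothetical, and in this sketch each record is routed to exactly one shard by hashing its key. Failure handling, joins and rebalancing are left out, which is precisely the extra work the developer would still have to do.

```python
import zlib

# Three independent "databases"; in practice each would live on its own server.
SHARDS = [{}, {}, {}]


def shard_for(key):
    """Pick a shard with a stable hash so a key always routes to the same place."""
    return SHARDS[zlib.crc32(key.encode("utf-8")) % len(SHARDS)]


def put(key, record):
    shard_for(key)[key] = record


def get(key):
    return shard_for(key).get(key)


def find_by_department(department):
    """Scatter the query to every shard, then gather and merge the results."""
    matches = []
    for shard in SHARDS:
        matches.extend(k for k, rec in shard.items() if rec["department"] == department)
    return sorted(matches)


for i in range(10):
    put(f"emp:{i}", {"department": "Sales" if i % 2 == 0 else "Finance"})
```

A lookup by key touches one shard, but any query on another attribute, such as `find_by_department("Sales")`, has to visit every shard and merge the answers in application code.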
On the other hand, NoSQL databases generally support automatic sharding, i.e., these databases can spread the data across many database instances automatically. The application does not even need to be aware of the composition of the server pool. Data and query load are automatically balanced across servers, and when a server goes down, it can be replaced quickly and transparently, with little or no disruption to the application.
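The idea of hiding the server pool from the application can be sketched as a small class that owns the routing and moves records when a node is added. The `Cluster` class and its methods are hypothetical and do not mirror any particular product. The sketch redistributes every record on each change for simplicity, whereas real systems typically use schemes such as consistent hashing or range partitioning to move far less data.

```python
import zlib


class Cluster:
    """Toy store that hides its shards behind put/get (hypothetical API)."""

    def __init__(self, node_count):
        self._nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        return self._nodes[zlib.crc32(key.encode("utf-8")) % len(self._nodes)]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key):
        return self._node_for(key).get(key)

    def add_node(self):
        """Grow the pool and move every record to its new home."""
        records = [item for node in self._nodes for item in node.items()]
        self._nodes = [{} for _ in range(len(self._nodes) + 1)]
        for key, value in records:
            self.put(key, value)

    def node_count(self):
        return len(self._nodes)


cluster = Cluster(node_count=2)
for i in range(100):
    cluster.put(f"order:{i}", i)

cluster.add_node()  # callers keep using put/get exactly as before
```

From the caller's point of view nothing changes after `add_node()`: the same `get` calls return the same values, even though many records now live on a different node.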
With Cloud computing in place, this approach becomes easier to adopt. Cloud providers like Amazon Web Services (AWS) can provide virtually unlimited capacity on demand and also take care of many important database administration tasks. Developers are no longer required to build complicated and expensive platforms to support their applications, and are free to concentrate on writing application code, which requires more attention given the complexity of the business. This approach is also often more cost effective.
The commonly used NoSQL databases support automatic data replication. Thus we get high availability of the data and also disaster recovery, without needing separate applications to manage these tasks.
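How replication leads to availability can be sketched with a toy store that copies every write to several replicas and reads from the first one that is still healthy. The `ReplicatedStore` class is hypothetical, and it leaves out what real databases must also handle, such as bringing a recovered replica back up to date and resolving conflicting writes.

```python
class ReplicatedStore:
    """Toy key-value store that writes every record to all healthy replicas."""

    def __init__(self, replica_count):
        self._replicas = [{} for _ in range(replica_count)]
        self._alive = [True] * replica_count

    def put(self, key, value):
        for replica, alive in zip(self._replicas, self._alive):
            if alive:
                replica[key] = value

    def get(self, key):
        # Read from the first healthy replica; fail only if none is left.
        for replica, alive in zip(self._replicas, self._alive):
            if alive:
                return replica.get(key)
        raise RuntimeError("no replica available")

    def fail(self, index):
        """Simulate losing one server."""
        self._alive[index] = False


store = ReplicatedStore(replica_count=3)
store.put("emp:1", {"name": "Asha"})
store.fail(0)  # the first replica goes down
```

After the simulated failure, `store.get("emp:1")` is still served from a surviving replica, which is the behaviour the application relies on without having to manage the copies itself.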
Implementing NoSQL Databases
Most organizations start with a trial implementation of a NoSQL database, which helps them develop an understanding of the software and the technology, since DBAs used to the relational model can find the NoSQL approach difficult to adopt at first. Many NoSQL databases are open source, allowing developers to download the software and start proof of concept (POC) development without having to deal with licensing challenges. Since the development cycles are shorter and faster, developers can take the opportunity to innovate and explore new areas which might produce better results.
We have discussed the NoSQL database and its various aspects. It should now be clear that NoSQL is not a replacement for the traditional RDBMS; rather, it addresses a different set of use cases for which an RDBMS is not well suited. NoSQL databases are continuously evolving and will come with more features in the near future. To conclude the discussion, let’s have a quick look at the following points.
- NoSQL stands for ‘Not Only SQL’.
- NoSQL based databases differ from the traditional databases in the approach of storing and retrieving data.
- NoSQL based databases can be faster than their relational counterparts for the workloads they are designed for.
- Different types of NoSQL databases are:
  - Key Value Paired
  - Graph Stores
  - Document Database
  - Wide Column based