MongoDB Architecture

What is the Architecture of MongoDB?

MongoDB is a distributed, general-purpose, document-based database designed for building modern applications in the cloud era. It stores data in JSON-like documents, which are more expressive and flexible than the rows and columns of a traditional table. Its query language can filter on any field, however deeply that field is nested within a document. MongoDB also supports aggregation, geospatial search, text search, and graph traversal.
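To make the idea of filtering on nested fields concrete, here is a minimal sketch in plain Python. It is a toy model, not MongoDB's query engine: the helper names get_path and find, and the sample documents, are assumptions made for illustration. It only mimics equality matching on a dotted path such as "address.geo.country".

```python
_MISSING = object()  # sentinel for "path not present in this document"


def get_path(doc, path):
    """Follow a dotted path such as "address.geo.country" through nested dicts."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def find(collection, query):
    """Return the documents whose value at every dotted path equals the query value."""
    return [
        doc
        for doc in collection
        if all(get_path(doc, path) == expected for path, expected in query.items())
    ]


# Hypothetical sample documents with two levels of nesting.
users = [
    {"name": "Ada", "address": {"city": "London", "geo": {"country": "UK"}}},
    {"name": "Lin", "address": {"city": "Taipei", "geo": {"country": "TW"}}},
]

matches = find(users, {"address.geo.country": "UK"})
print([doc["name"] for doc in matches])
```

The caller names the full path to the field and does not need to write any code to walk the document, which is the property the paragraph above describes.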

MongoDB is built on the following three core architectural principles.

1. MongoDB Query Language & The Document Data Model

Developers can use the document data model and the MongoDB query language to build transactional, operational, and analytical applications.

2. A Global Multi-Cloud Database

Because MongoDB can be deployed across clouds, a user can move an application from a private cloud to a public cloud without changing application code.

3. MongoDB Cloud

MongoDB Cloud gives applications a unified experience and provides a set of integrated services.

MongoDB Distributed System Architecture

MongoDB has a distributed system architecture that provides high availability and redundancy by keeping multiple copies of the data on multiple database servers. This is achieved with replica sets. A replica set is a group of mongod processes that all maintain the same data set.

A replica set follows a single-primary model. It has exactly one primary node and one or more secondaries. A replica set can have at most 50 members, of which at most 7 can vote in elections, and a typical deployment starts with three members. The primary records every change to its data set in the operation log, also called the oplog. The secondaries read the primary's oplog and apply those changes to their own data sets. If the primary fails, the remaining members hold an election and an eligible secondary becomes the new primary, which then accepts read and write operations.
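The oplog mechanism and failover can be sketched with a small in-memory model. This is a deliberately simplified illustration, not how MongoDB implements replication: the class names Node and ToyReplicaSet are hypothetical, the oplog is a plain Python list, and the "election" simply promotes the surviving node that has applied the most entries, whereas a real election also involves votes, priorities, and timeouts.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}      # this node's copy of the data set
        self.applied = 0    # number of oplog entries applied so far

    def apply(self, entry):
        op, key, value = entry
        if op == "set":
            self.data[key] = value
        elif op == "delete":
            self.data.pop(key, None)
        self.applied += 1


class ToyReplicaSet:
    def __init__(self, names):
        self.nodes = [Node(name) for name in names]
        self.primary = self.nodes[0]   # assumption: first node starts as primary
        self.oplog = []                # ordered log of writes accepted by the primary

    def write(self, op, key, value=None):
        """All writes go to the primary, which logs and applies them."""
        entry = (op, key, value)
        self.oplog.append(entry)
        self.primary.apply(entry)

    def sync(self, node):
        """A secondary replays the oplog entries it has not applied yet."""
        for entry in self.oplog[node.applied:]:
            node.apply(entry)

    def fail_primary(self):
        """Drop the primary and promote the most up-to-date survivor (simplified)."""
        survivors = [n for n in self.nodes if n is not self.primary]
        self.nodes = survivors
        self.primary = max(survivors, key=lambda n: n.applied)
        # Entries the new primary never received are discarded in this toy model.
        self.oplog = self.oplog[: self.primary.applied]
        return self.primary


rs = ToyReplicaSet(["a", "b", "c"])
rs.write("set", "x", 1)
rs.write("set", "y", 2)
rs.sync(rs.nodes[2])              # only "c" catches up; "b" lags behind
new_primary = rs.fail_primary()   # "a" fails; "c" is the most up to date
rs.write("set", "z", 3)           # the new primary accepts writes
rs.sync(rs.nodes[0])              # "b" replays the whole oplog and converges
print(new_primary.name, rs.nodes[0].data == new_primary.data)
```

The point of the sketch is that replaying the same ordered log brings every member to the same state, which is why a caught-up secondary can take over when the primary is lost.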

The following figure shows a MongoDB database cluster. In a sharded cluster, the data is spread across multiple machines. Each shard is deployed as a replica set with its own primary and secondaries, and the shards together hold the complete data set. Sharding distributes data across multiple systems, and it is how MongoDB achieves horizontal scaling.

High-Level Components of MongoDB Architecture

Let us look at the high-level components of the MongoDB architecture.

1. Application and MongoDB Driver

The application connects to MongoDB through a driver for its programming language, for example the Node.js driver, and uses it to run its operations. MongoDB provides drivers for many languages, including those listed below.

C Driver
C++ Driver
C# Driver
Go Driver
Java Driver
Node.js Driver
PHP Driver
Python Driver
Ruby Driver
Rust Driver
Scala Driver
Swift Driver

2. Query Router

The query router, also called the mongos process, is the interface and entry point for applications in a sharded cluster. The application connects to the query router and does not need to know about the underlying shards and replica sets. The query router accepts a query from the application, routes it to the shard or shards that hold the relevant data, merges the results, and sends them back to the application.
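The routing step can be illustrated with a toy router in Python. This is a sketch under stated assumptions, not the mongos implementation: the class ToyRouter, the shard key user_id, and the range boundaries are all hypothetical. A query that includes the shard key goes to a single shard, and a query without it is sent to every shard and the results are merged.

```python
import bisect


class ToyRouter:
    """Routes documents by ranges of a hypothetical shard key, "user_id"."""

    def __init__(self, boundaries, shards):
        # shards[0] holds keys below boundaries[0]; the last shard holds the rest.
        assert len(shards) == len(boundaries) + 1
        self.boundaries = boundaries
        self.shards = shards          # each shard is modeled as a list of documents
        self.last_targets = 0         # how many shards the last query touched

    def _shard_for(self, key):
        return self.shards[bisect.bisect_right(self.boundaries, key)]

    def insert(self, doc):
        self._shard_for(doc["user_id"]).append(doc)

    def find(self, query):
        if "user_id" in query:
            targets = [self._shard_for(query["user_id"])]   # targeted query
        else:
            targets = self.shards                            # broadcast to all shards
        self.last_targets = len(targets)
        results = []
        for shard in targets:
            results.extend(
                doc for doc in shard
                if all(doc.get(k) == v for k, v in query.items())
            )
        return results


router = ToyRouter(boundaries=[100, 200], shards=[[], [], []])
for user_id, city in [(5, "Oslo"), (150, "Lima"), (250, "Oslo")]:
    router.insert({"user_id": user_id, "city": city})

by_key = router.find({"user_id": 150})
targets_for_key = router.last_targets
by_city = router.find({"city": "Oslo"})
targets_for_city = router.last_targets
print(targets_for_key, targets_for_city)
```

In this model the application only ever calls the router, which mirrors the way an application talks to the query router instead of to individual shards.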

3. Shard

Sharding is a technique for distributing data across multiple nodes. With sharding, MongoDB can support large data sets and deliver high-throughput operations. It scales horizontally: the data is spread across multiple systems, and more nodes can be added as requirements grow.
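As a rough illustration of how a shard key can spread documents across nodes, the sketch below hashes each key and takes the result modulo the number of shards. This is an assumption made to keep the example short: the function shard_for and the order IDs are hypothetical, and MongoDB's own hashed sharding assigns ranges of hashed values to chunks instead of using a simple modulo.

```python
import hashlib
from collections import Counter


def shard_for(key, shard_count):
    """Map a key to a shard index with a stable hash (toy scheme, not MongoDB's)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Distribute 1000 hypothetical order IDs over 3 shards and count the result.
counts = Counter(shard_for(order_id, 3) for order_id in range(1000))
print(sorted(counts.items()))
```

Each shard receives a share of the keys, so storage and request load are divided among the machines. A plain modulo scheme would move most keys when a shard is added, which is one reason real systems use range or chunk based placement.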

4. Primary Replica Set Member

The primary replica set member receives all write operations and, by default, all read operations as well. It records every write performed on the data set in the oplog.

5. Secondary Replica Set Member

A secondary replica set member maintains a copy of the primary's data set. It reads the primary's oplog and applies the changes to its own data set asynchronously.
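One consequence of asynchronous application is that a secondary can briefly return older data than the primary. The following minimal sketch shows that effect. The classes Primary and Secondary, the pull method, and the batch_size parameter are hypothetical names chosen for this illustration and do not correspond to MongoDB internals.

```python
class Primary:
    def __init__(self):
        self.data = {}
        self.oplog = []            # ordered list of (key, value) writes

    def write(self, key, value):
        self.data[key] = value
        self.oplog.append((key, value))


class Secondary:
    def __init__(self, source):
        self.source = source
        self.data = {}
        self.position = 0          # index of the next oplog entry to apply

    def lag(self):
        """Number of oplog entries this secondary has not applied yet."""
        return len(self.source.oplog) - self.position

    def pull(self, batch_size):
        """Apply up to batch_size pending entries, in order."""
        batch = self.source.oplog[self.position:self.position + batch_size]
        for key, value in batch:
            self.data[key] = value
        self.position += len(batch)


primary = Primary()
secondary = Secondary(primary)
primary.write("stock", 10)
primary.write("stock", 9)
primary.write("stock", 8)

secondary.pull(batch_size=1)
stale_read = secondary.data["stock"]   # only the first write has been applied
lag_before = secondary.lag()

secondary.pull(batch_size=10)
fresh_read = secondary.data["stock"]   # now matches the primary
print(stale_read, lag_before, fresh_read, secondary.lag())
```

An application that reads from secondaries therefore has to tolerate slightly stale results, while reads from the primary reflect the latest acknowledged writes in this model.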