
MarkLogic Server 11.0 Product Documentation
Scalability, Availability, and Failover Guide — Chapter 4

Clustering in MarkLogic Server

This chapter describes the basics of how clustering works in MarkLogic Server, and includes the following sections:

  • Overview of Clustering
  • Evaluator/Data Node Architecture
  • Communication Between Nodes
  • Communication Between Clusters
  • Installation

Overview of Clustering

You can combine multiple instances of MarkLogic Server to run as a cluster. The cluster has multiple machines (hosts), each running an instance of MarkLogic Server. Each host in a cluster is sometimes called a node, and each node in the cluster has its own copy of all of the configuration information for the entire cluster.

When deployed as a cluster, MarkLogic Server implements a shared-nothing architecture. There is no single host in charge; each host communicates with every other host, and each node in the cluster maintains its own copy of the configuration. The security database, as well as all of the other databases in the cluster, are available to each node in the cluster. This shared-nothing architecture has great advantages when it comes to scalability and availability. As your scalability needs grow, you simply add more nodes.
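
For example, the following XQuery sketch lists every host in the cluster from whichever node runs it, using the built-in functions xdmp:hosts(), xdmp:host(), and xdmp:host-name(). Because every node carries the full cluster configuration, the same list comes back on any host; only the marker for the evaluating host changes.

    (: List every host in the cluster and mark the one evaluating the query.
       The cluster configuration is replicated to every node, so this can be
       run from any host with the same result. :)
    for $id in xdmp:hosts()
    return fn:concat(
      xdmp:host-name($id),
      if ($id eq xdmp:host()) then " (this host)" else ""
    )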

For more details of how clustering works in MarkLogic Server, see Getting Started with Distributed Deployments.

Evaluator/Data Node Architecture

There are two roles a node in a MarkLogic Server cluster can perform:

  • Evaluator node (e-node)
  • Data node (d-node)

E-nodes evaluate XQuery programs, XCC/XDBC requests, WebDAV requests, and other server requests. If the request does not need any forest data to complete, then an e-node request is evaluated entirely on the e-node. If the request needs forest data (for example, a document in a database), then it communicates with one or more d-nodes to service the forest data. Once it gets the content back from the d-node, the e-node finishes processing the request (performs the filter portion of query processing) and sends the results to the application.
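
As a rough illustration of this split, the sketch below (with a made-up search term) contrasts a filtered search with an unfiltered one. The cts:word-query is resolved against indexes on the d-nodes that host the database's forests; in the filtered case the e-node then inspects the candidate fragments before counting them, while the "unfiltered" option skips that e-node filtering step and counts the index results as-is.

    (: Index resolution for the word query happens on the d-nodes; filtering,
       when requested, is the e-node's portion of query processing. :)
    let $query := cts:word-query("cluster")
    return (
      fn:count(cts:search(fn:collection(), $query)),               (: filtered on the e-node :)
      fn:count(cts:search(fn:collection(), $query, "unfiltered"))  (: index resolution only :)
    )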

D-nodes are responsible for maintaining transactional integrity during insert, update, and delete operations. This transactional integrity includes forest journaling, forest recovery, backup operations, and on-disk forest management. D-nodes are also responsible for providing forest optimization (merges), index maintenance, and content retrieval. D-nodes service e-nodes when the e-nodes require content returned from a forest. A d-node gets the communication from an e-node, then sends the results of the index resolution back to the e-node. The d-node part of the request includes the index resolution portion of query processing. Also, each d-node performs the work needed for merges for any forests hosted on that node.
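
A quick way to see what a d-node is managing is to ask for the status of the forests attached to a database. The sketch below uses the built-ins xdmp:database-forests() and xdmp:forest-status(); the exact child elements of the returned status (state, journal and merge information, and so on) vary by release, so the status is returned unparsed here.

    (: Return the raw status of every forest in the current database. The
       status element carries the information a d-node maintains for the
       forest, such as its state and any merge activity. :)
    for $f in xdmp:database-forests(xdmp:database())
    return xdmp:forest-status($f)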

It is possible for a single node to act as both an e-node and a d-node. In single host configurations, both e-node and d-node activities are carried out by a single host. In a cluster, it is also possible for some or all of the hosts to have shared e-node and d-node duties. In large configurations, however, it is usually best to have e-nodes and d-nodes operate on separate hosts in the cluster. For more details about this distributed architecture, see Getting Started with Distributed Deployments.

Communication Between Nodes

Each node in a cluster communicates with all of the other nodes in the cluster at periodic intervals. This periodic communication, known as a heartbeat, circulates key information about host status and availability between the nodes in a cluster. Through this mechanism, the cluster determines which nodes are available and communicates configuration changes with other nodes in the cluster. If a node goes down for some reason, it stops sending heartbeats to the other nodes in the cluster.
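
To observe the cluster's current view of its hosts, you can inspect each host's status from any node. The sketch below uses xdmp:hosts() and xdmp:host-status(); the precise contents of the host-status element depend on the release, so it is returned as-is rather than parsed for specific fields.

    (: Report each host's status as seen from the node evaluating the query. :)
    for $id in xdmp:hosts()
    return xdmp:host-status($id)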

The cluster uses the heartbeat to determine if a node in the cluster is down. A heartbeat from a given node communicates its view of the cluster at the moment of the heartbeat. This determination is based on a vote from each node in the cluster, based on each node's view of the current state of the cluster. To vote a node out of the cluster, there must be a quorum of nodes voting to remove it. A quorum occurs when more than 50% of the total number of nodes in the cluster (including any nodes that are down) vote the same way. Therefore, you need at least 3 nodes in the cluster to reach a quorum. The vote that each host casts is based on how long it has been since it last received a heartbeat from the node in question. If a quorum of nodes determines that a node is down, then that node is disconnected from the cluster. The wait time for a host to be disconnected from the cluster is typically considerably longer than the time needed to restart a host, so restarts should not cause hosts to be disconnected from the cluster (and therefore should not cause forests to fail over). There are configuration parameters that determine how long to wait before removing a node (for details, see XDQP Timeout, Host Timeout, and Host Initial Timeout Parameters).
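
As a concrete illustration of the quorum arithmetic, the sketch below computes the minimum number of same-way votes needed to remove a host: more than 50% of all hosts in the cluster, counting hosts that are currently down. For a three-host cluster this is 2, for four hosts it is 3, and a two-host cluster can never reach a quorum against one of its members, which is why at least three nodes are needed.

    (: Smallest number of hosts that constitutes "more than 50% of the total
       number of nodes in the cluster". Hosts that are down still count
       toward the total. :)
    let $total := fn:count(xdmp:hosts())
    return $total idiv 2 + 1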

Each node in the cluster continues listening for the heartbeat from the disconnected node to see if it has come back up, and if a quorum of nodes in the cluster once again receives heartbeats from that node, it automatically rejoins the cluster.

The heartbeat mechanism enables the cluster to recover gracefully from things like hardware failures or other events that might make a host unresponsive. This occurs automatically, without human intervention; machines can go down and automatically come back up without requiring intervention from an administrator. If the node that goes down hosts content in a forest, then the database to which that forest belongs goes offline until the forest either comes back up or is detached from the database. If you have failover enabled and configured for that forest, it attempts to fail over the forest to a secondary host (that is, one of the secondary hosts will attempt to mount the forest). Once that occurs, the database will come back online. For details on failover, see High Availability of Data Nodes With Failover.
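
A minimal sketch of configuring failover for a single forest with the Admin API follows. The forest name "content-01" and host name "dnode-02.example.com" are placeholders, and the functions admin:forest-set-failover-enable and admin:forest-add-failover-host are assumed here for local-disk failover; see High Availability of Data Nodes With Failover for the complete procedure for your failover type.

    (: Sketch only: enable failover on a forest and assign it a failover host.
       Forest and host names are hypothetical. :)
    import module namespace admin = "http://marklogic.com/xdmp/admin"
      at "/MarkLogic/admin.xqy";

    let $config := admin:get-configuration()
    let $config := admin:forest-set-failover-enable(
                     $config, xdmp:forest("content-01"), fn:true())
    let $config := admin:forest-add-failover-host(
                     $config, xdmp:forest("content-01"),
                     xdmp:host("dnode-02.example.com"))
    return admin:save-configuration($config)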

Communication Between Clusters

Database replication uses inter-cluster communication. Communication between clusters uses the XDQP protocol. Before you can configure database replication, each cluster in the replication scheme must be aware of the configuration of the other clusters. This is accomplished by coupling the local cluster to the foreign cluster. For more information, see Inter-cluster Communication and Configuring Database Replication in the Database Replication Guide and Clusters in the Administrator's Guide.

Installation

Installation of new nodes in a cluster is simply a matter of installing MarkLogic Server on a machine and entering the connection information for any existing node in the cluster you want to join. Once a node joins a cluster, depending on how that node is configured, you can use it to process queries and/or to manage content in forests. Under normal conditions, you can also perform cluster-wide administrative tasks from any host in the cluster. For details on the installation process, see the Installation Guide.

If you are installing MarkLogic 9.0-4 or later, you may have to install the MarkLogic Converters package separately. For more details, see MarkLogic Converters Installation Changes Starting at Release 9.0-4 in the Installation Guide.
