This chapter describes the other procedures for configuring failover for a forest. The tasks here apply to both local-disk and shared-disk failover. For details about how failover works and the requirements for failover, see High Availability of Data Nodes With Failover. For details on configuring local-disk failover, see Configuring Local-Disk Failover for a Forest. For details on configuring shared-disk failover, see Configuring Shared-Disk Failover for a Forest. This chapter includes the following sections:
The XDQP port is the port on which each host in the cluster listens for communication with the other hosts in the cluster. It is set to 7999 by default. The XDQP port must be open on the network so that all other hosts in the cluster can communicate over it. If the XDQP port is not available, hosts can get XDMP-FORESTMNT and other errors when trying to start up. For more details about communication between nodes in a cluster, see Communication Between Nodes.
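A quick way to confirm that the XDQP port is open between hosts is a plain TCP connection test. The following is a minimal sketch, not a MarkLogic tool: the function name and the host names in the usage section are hypothetical, and the only value taken from the text is the default port 7999. A successful connection shows only that the port is reachable, not that the cluster is healthy.

```python
import socket

XDQP_PORT = 7999  # default XDQP port named in the text


def xdqp_port_reachable(host: str, port: int = XDQP_PORT, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical host names; replace with the hosts in your cluster.
    for host in ["ml-host-1.example.com", "ml-host-2.example.com"]:
        state = "open" if xdqp_port_reachable(host) else "NOT reachable"
        print(f"{host}:{XDQP_PORT} {state}")
```

Run the check from every host toward every other host, because a firewall rule can block traffic in one direction only.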
Each group configuration has an xdqp timeout, a host timeout, and a host initial timeout setting. These settings monitor activity over the XDQP port, and govern the time periods that will induce failover in various scenarios.
xdqp timeout is the time, in seconds, after which communication between e-node and d-node hosts (which happens over an internal MarkLogic Server protocol named XDQP) will time out if the host is unresponsive. If an xdqp timeout is reached during a request (for example, during a query), a message is logged to the ErrorLog.txt file and the request is retried until the host timeout is reached. If the d-node host is still unresponsive at that point, the request fails with an exception. The host timeout is the time, in seconds, after which a host will time out if the host is unresponsive, and then it will be disconnected from the cluster. The xdqp timeout must be less than the host timeout, and should typically be about one-third the value of the host timeout. This allows the system to restart the connection with the unresponsive host after the xdqp timeout occurs but before the host is disconnected from the cluster. The host timeout is what can trigger a forest to fail over. For details on when a forest will fail over, see Scenarios that Cause a Forest to Fail Over.
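The relationship between the two settings can be checked mechanically before a configuration change is applied. The sketch below is an illustration only: the function names, the tolerance used to interpret "about one-third", and the sample values are assumptions, not MarkLogic behavior or defaults.

```python
from typing import List


def check_timeouts(xdqp_timeout_s: int, host_timeout_s: int,
                   ratio_tolerance: float = 0.5) -> List[str]:
    """Return a list of findings; an empty list means both rules are met.

    ratio_tolerance is an assumed reading of "about one-third": a
    host/xdqp ratio between 2.5 and 3.5 is accepted by default.
    """
    if xdqp_timeout_s <= 0 or host_timeout_s <= 0:
        raise ValueError("timeouts must be positive numbers of seconds")

    findings = []
    if xdqp_timeout_s >= host_timeout_s:
        # Hard rule from the text.
        findings.append("error: xdqp timeout must be less than host timeout")
    else:
        ratio = host_timeout_s / xdqp_timeout_s
        if abs(ratio - 3.0) > ratio_tolerance:
            # Soft guideline from the text.
            findings.append(
                f"note: host/xdqp ratio is {ratio:.1f}; about 3 is typical"
            )
    return findings


def suggest_xdqp_timeout(host_timeout_s: int) -> int:
    """Suggest an xdqp timeout of roughly one-third of the host timeout."""
    return max(1, round(host_timeout_s / 3))


if __name__ == "__main__":
    # Sample values chosen for illustration only.
    print(check_timeouts(10, 30))
    print(check_timeouts(30, 30))
    print(suggest_xdqp_timeout(45))
```

With a ratio near 3, roughly two retry windows remain after the first xdqp timeout before the host timeout disconnects the host.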
The default settings for xdqp timeout and host timeout should work well for most configurations. If, however, you are seeing hosts disconnect from the cluster because of timeouts, you can raise these limits (keeping the 1:3 ratio between xdqp timeout and host timeout). Keep in mind, though, that if the hosts are timing out, some other issue might be causing the timeouts (such as a network problem or a disconnected cable).
host initial timeout is the time, in seconds, that an instance of MarkLogic Server will wait for another node to come online when the cluster first starts up before deciding that the node is down. This setting is designed to allow for 'staggered' cluster startups, where one machine might take a little longer to reboot than another, and to avoid unneeded failover of forests during this initial system startup period. The default setting is 4 minutes, and is based on the amount of time it might take for an entire system to reboot (after a power outage, for example). Failover for any forests on a particular host will not be initiated during cluster startup until this time period has elapsed. If you know that your machines take more or less time to start up, you can change the host initial timeout accordingly.
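The effect of host initial timeout can be pictured as a grace window measured from cluster startup. The following sketch is a simplified model for reasoning about the setting, not MarkLogic's internal logic. The function names and the measured boot times are hypothetical, and treating the exact boundary as "window over" is an assumption; only the 4-minute default comes from the text.

```python
from typing import Iterable

HOST_INITIAL_TIMEOUT_S = 4 * 60  # default of 4 minutes stated in the text


def failover_suppressed(seconds_since_cluster_start: float,
                        host_initial_timeout_s: int = HOST_INITIAL_TIMEOUT_S) -> bool:
    """Model: failover is not initiated while the startup window is open."""
    return seconds_since_cluster_start < host_initial_timeout_s


def timeout_covers_boot_times(boot_times_s: Iterable[float],
                              host_initial_timeout_s: int = HOST_INITIAL_TIMEOUT_S) -> bool:
    """True if the slowest host comes online inside the startup window."""
    return max(boot_times_s) < host_initial_timeout_s


if __name__ == "__main__":
    # Hypothetical measured reboot times, in seconds, for three hosts.
    boot_times = [95.0, 140.0, 310.0]
    if not timeout_covers_boot_times(boot_times):
        print("Slowest host boots after the window closes; "
              "consider raising host initial timeout.")
```

In this example the third host takes 310 seconds to boot, which is longer than the 240-second default, so its forests could fail over during an otherwise normal cluster restart.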
You can disable failover at two levels of granularity: you can disable failover for a group, or you can disable failover for an individual forest. To disable failover, navigate to the group or the forest and set failover enable to false. Then, if a primary host fails, the affected forests will not fail over.
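The same change can in principle be scripted rather than made by hand. The sketch below only builds the HTTP request and does not send it. It assumes the MarkLogic Management REST API exposes group and forest properties at /manage/v2/groups/{name}/properties and /manage/v2/forests/{name}/properties on port 8002 and names the setting failover-enable; verify the endpoint, property name, and authentication scheme against the REST API reference for your server version before relying on it. The function name and the base URL are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_disable_failover_request(base_url: str, kind: str, name: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request that sets failover-enable to false.

    kind is "groups" or "forests". The URL layout and the property name
    are assumptions about the Management REST API, as noted above.
    """
    if kind not in ("groups", "forests"):
        raise ValueError("kind must be 'groups' or 'forests'")

    url = f"{base_url}/manage/v2/{kind}/{urllib.parse.quote(name)}/properties"
    body = json.dumps({"failover-enable": False}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical base URL; the request is printed, not sent.
    req = build_disable_failover_request("http://localhost:8002", "forests", "Security2")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending the request would also require credentials for a user with administrative privileges, which this sketch deliberately leaves out.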
The Security database and all of the other auxiliary databases (Schemas, Modules, and Triggers) are set up by default to use private forests, which are forests that store their data in the default data directory. If you want to configure these databases to use shared-disk failover, then you must first move their forests to public forests (forests that have a directory specified) that store their data on a supported CFS.
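To move a private forest to a public forest, using the Security database as an example:
1. Create a new public forest (for example, Security2). For the new forest, specify a data directory on a supported CFS.
2. Attach the new forest to the database. For a forest named Security2, navigate to the Security database in the Admin Interface, select the Forests link in the tree menu, and check the attached button corresponding to the forest named Security2.
3. Retire the private forest. For the Security forest in the Security database, check the retired button.
4. Wait for the data to migrate from the private forest (Security) to the public forest (Security2). You can check the status of the migration operation, as described in Checking the Rebalancer Status in the Administrator's Guide.
5. Detach the private forest. For the Security forest, uncheck the attached button. Then click OK to detach it.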