This chapter describes the procedure for configuring shared-disk failover for a forest. For details about how failover works and the requirements for failover, see High Availability of Data Nodes With Failover. For details on configuring local-disk failover, see Configuring Local-Disk Failover for a Forest.
For other failover administrative procedures that apply to both local-disk and shared-disk failover, see Other Failover Configuration Tasks.
failover enable button is set to true.
In the database to which you plan to attach shared-disk failover forests, MarkLogic recommends setting the journaling option to strict. The strict setting does an explicit file sync to the journal before committing the transaction. The default fast setting syncs to the operating system but does not do an explicit file sync; it relies on the operating system to write to the disk. If the operating system fails suddenly (for example, a power failure or a system crash) after the transaction is committed but before the journal has been written to disk, it is possible to lose a transaction in fast mode, but in strict mode you will not lose any transactions.
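The difference between the two journaling settings can be illustrated with a small sketch. This is not MarkLogic's implementation; the function name append_journal and its mode argument are hypothetical, and the code only models the distinction described above: handing bytes to the operating system versus also forcing an explicit file sync.

```python
import os


def append_journal(path, record, mode="fast"):
    """Append one record to a journal file (illustrative model only).

    mode="fast":   hand the bytes to the operating system and return.
    mode="strict": additionally call os.fsync so the bytes are written
                   to disk before the caller treats the write as durable.
    """
    if mode not in ("fast", "strict"):
        raise ValueError("mode must be 'fast' or 'strict'")
    with open(path, "ab") as journal:
        journal.write(record)
        journal.flush()  # push Python's buffer to the operating system
        if mode == "strict":
            os.fsync(journal.fileno())  # explicit file sync to disk
    return mode
```

In this model, a sudden operating system failure right after a fast write returns can lose the record, because nothing has forced it out of the operating system's cache. The strict path pays for the extra os.fsync call on every write.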
Setting the journaling option to strict will slow each transaction down, because the transaction has to wait for the filesystem to report that the journal file was written to disk. In some cases the slowdown is significant and in others it is relatively modest, but strict is always slower than fast. If you are doing many small transactions one after another, the slowdown can be considerable. In these cases, the best practice is to batch the updates into larger transactions (for example, update 100 documents per transaction instead of 1 document per transaction). The optimal batch size will vary based on the size of your documents, the hardware on which you are running MarkLogic Server, and other factors. You should run performance benchmarks with your own workload to determine your optimum batch size.
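The batching advice can be sketched as follows. This is a minimal illustration, not a MarkLogic client: commit_batch is a hypothetical callable standing in for whatever code writes one group of documents in a single transaction, and the default of 100 is only the example figure from the text, not a recommended value.

```python
def batched(items, batch_size=100):
    """Yield consecutive slices of items, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def load_in_batches(documents, commit_batch, batch_size=100):
    """Call commit_batch once per batch and return the number of calls.

    commit_batch is a hypothetical callable that performs one transaction
    for the whole batch, so each batch pays for one journal sync instead
    of one sync per document.
    """
    transactions = 0
    for batch in batched(documents, batch_size):
        commit_batch(batch)
        transactions += 1
    return transactions
```

With this shape, the batch size is a single parameter that you can vary while benchmarking your own workload.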
The reason for setting the option to strict for shared-disk failover is that, with shared-disk failover, you are specifically trying to guard against hosts going down when there is only a single copy of the data. With local-disk failover, the replica forests each store a copy of the data, so not synchronizing the filesystem for each transaction is less risky: the operating system would have to fail suddenly on the master host and on all of the replica hosts for a transaction to be lost. For local-disk failover, it therefore tends to be a more acceptable performance trade-off to set the option to fast (although you could use strict if you want to be even safer). You should evaluate your requirements and SLAs when making these trade-offs, as what is acceptable for one environment might not be acceptable for another.
journaling option and set it to strict.
/veritas/marklogic on your primary host and all of your failover hosts, specify
failover enable. Note that failover enable must be set to true at both the forest and the group level, as described in Enabling Failover in a Group, for failover to be active.
failover hosts section, choose a failover host. If you are specifying more than one failover host, choose additional failover hosts in the drop-down lists below.
If a forest fails over to a failover host, it will remain mounted locally to the failover host until the host unmounts the forest. If you have a failed-over forest and want to revert it to the primary host (unfailover the forest), you must either restart the forest or restart the host on which the forest is locally mounted. After restarting, the forest will automatically mount locally on the primary host if the primary host is back online and corrected. To check the status of the hosts in the cluster, see the Cluster Status Page in the Admin Interface.
unmounted, the forest might not have completed mounting. Refresh the page and the Mount State should indicate that the forest is open.
The forest is restarted, and if the primary host is available, the primary host will mount the forest. If the primary host is not available, the first failover host will try to mount the forest, and so on until there are no more failover hosts to try. If you look in the ErrorLog.txt log file for the primary host, you will see a message similar to the following:
2007-03-28 13:20:29.644 Info: Mounted forest myFailoverForest locally on /veritas/marklogic/Forests/myFailoverForest
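The order in which hosts try to mount a restarted forest, as described above, can be modeled with a short sketch. This is an illustration of the stated ordering only, not MarkLogic code; the function name choose_mount_host and the host names in the usage note are hypothetical.

```python
def choose_mount_host(primary, failover_hosts, available):
    """Return the host that would mount the forest, or None.

    Models the order described in the text: the primary host is tried
    first, then each failover host in its configured order, until one
    is available or there are no more hosts to try.
    """
    for host in [primary, *failover_hosts]:
        if host in available:
            return host
    return None
```

For example, with a primary host named host-a and failover hosts host-b and host-c (hypothetical names), the function returns host-a whenever host-a is in the available set, and returns host-b only when host-a is unavailable and host-b is available.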