Are there signs of a serious problem?
If you are encountering a serious problem in which MarkLogic Server is unable to effectively service your applications, some questions to ask are:
Did MarkLogic Server abort or fail to start? This may indicate that there is not enough disk space for the log files on the log file device. If this is the cause, you will need to either add more disk space or free up enough disk space for the log files.
Is an application unable to update data in MarkLogic Server? This may indicate that you have exceeded the 64-stand limit for a forest. A forest can reach this limit when it runs out of merge space or when merges are suppressed.
Are queries failing with SVC-MEMALLOC messages? This indicates that there is not enough memory or swap space. You may need to add memory or reconfigure your swap memory, as described in Query Performance in MarkLogic Server in Query Performance and Tuning Guide.
Are there any forests in the async replicating state? This state indicates that a primary forest is asynchronously catching up to its replica forest after a failover or that a new replica forest was added to a primary forest that already contains content. If a forest has failed over, see Scenarios that Cause a Forest to Fail Over in Scalability, Availability, and Failover Guide for possible causes.
Are there any serious messages in the error logs? The various log levels are described in Understanding the Log Levels in Administrating MarkLogic Server. All log messages at Warning level and higher should be investigated. Messages at Notice level should be tracked, as some conditions initially arising at Notice level later progress to become warnings or errors. Messages at Debug level are often necessary to determine the root cause of incidents, and messages at Info level are largely informational. Log messages that indicate a particularly serious problem include:
Repeated server restart messages. Possible causes include a corrupted forest, segmentation faults, or some problem with the host’s operating system.
XDQP disconnect. Possible causes include an XDQP timeout or a network failure.
Forest unmounted. Possible causes include a disabled forest, a forest that has run out of merge space, or corrupted forest data.
SVC-* errors. These are system-level errors that result from timeouts, socket connect issues, lack of memory, and so on.
XDMP-BAD errors. These indicate serious internal error conditions that shouldn’t happen. Look at the error text for details and the logs for context. If you have an active maintenance contract, you can contact MarkLogic Technical Support for help.
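The log triage described above can be sketched as a small script. This is a minimal illustration rather than a MarkLogic tool: it assumes each log entry starts with a timestamp followed by the level name and a colon (for example "2024-01-01 10:00:02.000 Warning: ..."), and the names triage, LINE_RE, CODE_RE, and SAMPLE are hypothetical. The sample lines are synthetic placeholders, not real server output. Only the SVC-* and XDMP-BAD codes are matched, because the exact wording of restart, XDQP disconnect, and forest unmount messages is not given here and may vary by release.

```python
import re
from collections import Counter

# Assumed entry layout: "YYYY-MM-DD HH:MM:SS.mmm Level: message text"
LINE_RE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ (\w+): (.*)$")

# Error codes called out as serious: any SVC-* code and XDMP-BAD codes.
CODE_RE = re.compile(r"\b(SVC-[A-Z]+|XDMP-BAD[A-Z]*)\b")

# Log levels, ordered from least to most severe.
LEVELS = ["Finest", "Finer", "Fine", "Debug", "Config", "Info",
          "Notice", "Warning", "Error", "Critical", "Alert", "Emergency"]


def triage(lines, investigate_from="Warning", track="Notice"):
    """Split log lines into those to investigate and those to track.

    Returns (investigate, tracked, codes), where codes counts the
    SVC-* and XDMP-BAD error codes seen in any recognised entry.
    """
    threshold = LEVELS.index(investigate_from)
    investigate, tracked, codes = [], [], Counter()
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_RE.match(line)
        if not match:
            continue  # continuation lines and unrecognised layouts are skipped
        level, message = match.groups()
        if level not in LEVELS:
            continue
        if LEVELS.index(level) >= threshold:
            investigate.append(line)
        elif level == track:
            tracked.append(line)
        codes.update(CODE_RE.findall(message))
    return investigate, tracked, codes


# Synthetic sample entries for illustration only.
SAMPLE = [
    "2024-01-01 10:00:00.000 Info: example informational message",
    "2024-01-01 10:00:01.000 Notice: example notice message",
    "2024-01-01 10:00:02.000 Warning: SVC-MEMALLOC: example text",
    "2024-01-01 10:00:03.000 Error: XDMP-BAD: example text",
    "    example continuation line without a timestamp",
]

investigate, tracked, codes = triage(SAMPLE)
print(len(investigate), "to investigate;", len(tracked), "to track")
for code, count in sorted(codes.items()):
    print(code, count)
```

To run this against a real log, pass an open file instead of SAMPLE, for example triage(open(path)) with path pointing at the error log on your host. The location of that file depends on your installation, so it is left as a parameter.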