Ops Director Guide — Chapter 1

Monitoring MarkLogic with Ops Director

Ops Director enables administrators to monitor MarkLogic clusters ranging from a single node to large multi-node deployments. A single Ops Director server can monitor multiple clusters. Ops Director provides a unified browser-based interface for easy access and navigation.

Ops Director presents a consolidated view of your MarkLogic infrastructure to streamline monitoring and troubleshooting of clusters using alerting, performance, and log data. Ops Director provides enterprise-grade security for your cluster configuration and performance data with robust role-based access control and information security powered by MarkLogic Server.

Ops Director is designed to accommodate your evolving IT strategy, whether for one or many clusters, on premises or the cloud. Ops Director complements the current MarkLogic admin tools, bringing enterprise IT administrators a simple, flexible, and proactive management experience.

Ops Director decreases the learning curve for administrators with guided visual representations of databases, clusters, and applications. It enables users to identify potential problems and bring them to attention before they occur. It also provides learning and analysis opportunities with centralized data collection, delivery, and storage.

You can use Ops Director for the following tasks:

  • To keep track of the day-to-day operations of your MarkLogic Server environment.
  • To plan initial capacity and fine-tune your MarkLogic Server environment. For details on how to configure your MarkLogic Server cluster, see the Scalability, Availability, and Failover Guide.
  • To troubleshoot application performance problems. For details on how to troubleshoot and resolve performance issues, see the Query Performance and Tuning Guide.
  • To troubleshoot application errors and failures.

The monitoring metrics and thresholds of interest will vary depending on your specific hardware/software environment and configuration of your MarkLogic Server cluster. This chapter lists some of the metrics of interest when configuring and troubleshooting MarkLogic Server. However, MarkLogic Server is just one part of your overall environment. The health of your cluster depends on the health of the underlying infrastructure, such as network bandwidth, disk I/O, memory, and CPU.


  • Ops Director Application Cluster -- a MarkLogic cluster that contains the host that runs the Ops Director application.
  • Managed Cluster -- a MarkLogic cluster that is specifically configured to be managed by the Ops Director application.
  • Ops Director Application -- Ops Director application server that is responsible for communication with the browser.
  • Ops Director System -- Ops Director application server that is responsible for inter-cluster communication.
  • Resource Group -- a configuration that represents one or more resources of a certain type, such as hosts, App Servers, databases, and so on.
  • RBAC -- Role-Based Access Control -- a method of regulating access to computer or network resources based on the roles of individual users within an enterprise.
  • XDQP -- XML Data Query Protocol -- a MarkLogic internal protocol used for communication between nodes in a cluster.
  • View -- a top-level page of Ops Director UI, which you can access from the menu bar; provides a high-level view of MarkLogic resources.
  • Tab -- a next-level page of Ops Director UI, which you can access by clicking one of the tabs in a view; enables you to drill down to specific resources or resource types.
  • Report -- a document-centric display of information about one or more assets for a specific date or time period.
  • Rates -- The amount of data (MB per second) currently being read from or written to a resource.
  • Loads -- The execution time (in seconds) of current read and write requests on a resource, which includes the time requests spend in the wait queue when maximum throughput is achieved.
  • Color-coded severity -- colors used in graphic representations of alert severity, as indicated in the following table:
    Color            Alert Severity
    Red              Critical
    Yellow           At Risk / Warning
    Green            Healthy
    Dark Gray        Maintenance
    Light Gray       Offline
    Blue             Security
    Dark Green       Information
    White / Hollow   Unknown

Ops Director High-Level Architecture

An Ops Director instance includes two application servers:

  • Ops Director Application server, which by default runs on port 8008, provides services to the Ops Director browser application.
  • Ops Director System server, which by default runs on port 8009, receives data transmitted from Managed Clusters and stores it in the Ops Director database.

When the Ops Director application is installed on a cluster, you can view these two application servers on the Summary page of the Admin Interface.

Ops Director Deployment Configurations

A single Ops Director instance can manage multiple clusters. Managed clusters collect information about the state of their cluster (error logs, configuration data, and meters) and securely send that data to the Ops Director System. The Ops Director System receives this information and stores it in the Ops Director database. The Ops Director Application executes queries against the Ops Director database to retrieve information for display in the browser.

Ops Director is designed to adapt to a range of MarkLogic deployments, from a single cluster in a single data center to hundreds of clusters. Ops Director has three setup options:

Collocated: A cluster serves as both an Ops Director Application Cluster and a Managed Cluster.

Single Data Center: A cluster serves as an Ops Director Application Cluster that communicates with Managed Clusters.

Hybrid: A cluster serves as both an Ops Director Application Cluster and a Managed Cluster, and also communicates with other Managed Clusters.

A multi-cluster configuration can manage local and remote clusters across a network, so long as all of the clusters are available and not concealed behind a firewall. All communication between an Ops Director Application Cluster and Managed Clusters is performed securely.

Ops Director Security

In a multi-cluster environment, where Ops Director provides alerting, management, and reporting capabilities, different administrators will have different goals. In order to ensure that each type of administrator is capable of achieving their goals while preventing them from causing harm or accessing information to which they should not have access, Ops Director employs comprehensive role-based access controls. The assets and views displayed to an administrator are based upon group participation and the group's roles. Each role has specific privileges. The combination of an administrator's role, resource group membership, and the asset will determine which capabilities are available to that administrator.

When a Managed Cluster connects to Ops Director, it grants Ops Director the ability to perform actions by making requests back to the Managed Cluster. These requests are protected by certificate-based authentication. User ids and passwords are not sent across the network. Ops Director's users can manage clusters on which they have no direct login capability.

This section provides a conceptual overview of Ops Director security. The procedures for configuring Ops Director security are described in Installing and Configuring Ops Director and Console Settings View.

Communication with Managed Clusters

When a cluster agrees to be managed by Ops Director, it will establish a long-lived secure communication channel between itself and Ops Director using certificates. Ops Director requests signed with the appropriate certificate will be able to communicate with the Managed Cluster and perform admin tasks on the cluster. The certificate authority for these certificates can be either self-signed, as described in Installing Ops Director on an Application Cluster, or externally signed, as described in Securing Ops Director with Externally Signed Certificates.

For a particular request, Ops Director determines what roles and privileges should be in effect for that request. It will then make the request to the Managed Cluster, using a short-lived certificate, to perform some action. It will include in that request an identifier for the request.

The Managed Cluster will pass that identifier back to Ops Director, over the secure channel, to obtain the set of roles and privileges that should apply for the request.

Once that context is established, it will be used to perform the request which will succeed or fail on its own merits.
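
The exchange described above can be pictured with a minimal Python sketch. It is an illustration only, not the Ops Director implementation: the class names, method names, and privilege strings are hypothetical, certificates and the secure channel are omitted, and treating an identifier as single-use is an assumption of this sketch.

```python
import uuid


class OpsDirector:
    """Hypothetical stand-in for the Ops Director side of the exchange."""

    def __init__(self):
        # Maps a request identifier to the roles and privileges decided for it.
        self._pending = {}

    def prepare_request(self, roles, privileges):
        """Decide the context for a request and return its identifier."""
        request_id = str(uuid.uuid4())
        self._pending[request_id] = {
            "roles": frozenset(roles),
            "privileges": frozenset(privileges),
        }
        return request_id

    def resolve_context(self, request_id):
        """Called back by the Managed Cluster to obtain the context.

        Returns None for an unknown identifier. Removing the entry after one
        lookup is an assumption made for this sketch.
        """
        return self._pending.pop(request_id, None)


class ManagedCluster:
    """Hypothetical stand-in for a Managed Cluster."""

    def __init__(self, ops_director):
        self.ops_director = ops_director

    def handle(self, request_id, required_privilege):
        """Look up the context, then let the request succeed or fail on it."""
        context = self.ops_director.resolve_context(request_id)
        if context is None:
            return False
        return required_privilege in context["privileges"]
```

In this sketch the request itself carries only the identifier; the roles and privileges travel separately when the Managed Cluster asks for them.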

The procedure for establishing secure certificate-based communication between Ops Director and the Managed Clusters is described in Installing and Configuring Ops Director.

Resource Groups

In a large environment, it is useful to group resources together, for example, 'the staging hosts,' or 'the production databases.' In Ops Director, these are called resource groups. Resource groups are homogeneous (every resource in a group has the same type), which removes complexity from the meaning of 'apply this action to this group.' For example, 'show me statistics for the production databases group.'

Each resource group consists of the following:

  • name -- The user-visible name for the resource group.
  • type -- The type of resources contained in the resource group.
  • resources -- The list of resources in the resource group.
  • role -- An optional role or roles that control access to this resource group.
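
As a rough illustration of these four fields, the following Python sketch models a resource group and enforces the rule that all members share one type. The class names, field types, and example values are assumptions made for this sketch and do not reflect how Ops Director stores resource groups.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Resource:
    """Hypothetical resource: a name plus a type such as 'host' or 'database'."""
    name: str
    type: str


@dataclass
class ResourceGroup:
    name: str                                             # user-visible name
    type: str                                             # type shared by all members
    resources: List[Resource] = field(default_factory=list)
    roles: List[str] = field(default_factory=list)        # optional access roles

    def add(self, resource: Resource) -> None:
        """Add a resource, rejecting any whose type differs from the group's."""
        if resource.type != self.type:
            raise ValueError(
                f"group '{self.name}' holds {self.type} resources, "
                f"not {resource.type}"
            )
        self.resources.append(resource)
```

For example, a group created with `type="database"` accepts `Resource("Documents", "database")` and raises `ValueError` for `Resource("host-1", "host")`.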

The resource group role enables you to establish access to a resource group at a finer and more ad hoc granularity than is provided by the established roles. It is likely that roles defined within the enterprise are fairly coarse-grained and that changing roles (in an external LDAP server, for example), may be considered too 'heavy weight' for ad hoc groupings.

For more detail on roles and resource groups and how to create them, see Console Settings View.

An administrator can configure a resource group so that it grants additional privileges within the context of that group.

Execute Privileges

The following privileges are specific to Ops Director:

  • http://marklogic.com/xdmp/privileges/opsdir-admin -- Grants Ops Director Administrator privileges.
  • http://marklogic.com/xdmp/privileges/opsdir-license-admin -- Grants privileges to manage licenses in Ops Director.
  • http://marklogic.com/xdmp/privileges/opsdir-user -- Grants privileges to use Ops Director.

These execute privileges are pre-defined and included with every installation of MarkLogic Server. You can view them in the Execute Privileges Summary page of the Admin Interface.

Data Collected by Ops Director

Once configured, Ops Director begins collecting data from all the hosts in the Managed Clusters and storing it in the Ops Director database.

Three types of data are collected: log data, configuration data, and metering data.

Log Data

Log messages generated at or above the level specified for delivery to Ops Director will be queued and sent as quickly as possible. Errors generated at a level of critical or higher will block while they are being sent in order to make sure that a delivery attempt is made before the host restarts.

Server log messages are filtered by log level. You can configure the minimum log level for log messages sent to Ops Director on the Ops Director Setup page of the Admin Interface. For example, if you set the minimum log level to 'error', then you will see all messages at the 'error' level and above.

The minimum log level might be configured to be:

  • fine
  • debug
  • config
  • info
  • notice
  • warning
  • error
  • critical
  • alert
  • emergency

    Setting the minimum log level to a finer granularity than info is not recommended.
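
The minimum-level filter can be sketched in a few lines of Python. This is an illustration of the ordering only, assuming the levels are listed above from finest to most severe; the function names are hypothetical and are not part of any MarkLogic API.

```python
# Log levels in the order listed above, assumed to run from finest to most severe.
LOG_LEVELS = ["fine", "debug", "config", "info", "notice",
              "warning", "error", "critical", "alert", "emergency"]


def should_send(message_level: str, minimum_level: str) -> bool:
    """Return True if a message meets the configured minimum log level."""
    return LOG_LEVELS.index(message_level) >= LOG_LEVELS.index(minimum_level)


def blocks_while_sending(message_level: str) -> bool:
    """Return True for 'critical' or higher, which block during delivery."""
    return LOG_LEVELS.index(message_level) >= LOG_LEVELS.index("critical")
```

With the minimum set to 'error', `should_send("warning", "error")` is false, while 'error', 'critical', 'alert', and 'emergency' messages all pass the filter.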

Configuration Data

Whenever the configuration of a Managed Cluster changes, Ops Director is notified.

When Ops Director is notified of a configuration change, if the change is more recent than the local data that Ops Director has for a particular configuration, Ops Director will retrieve the necessary payloads from the Managed Clusters in the form of resource documents. This configuration data includes any changes made to server configuration files. Ops Director leverages this interaction to obtain information about every applicable resource on each Managed Cluster: groups, hosts, databases, application servers, etc.

Timestamps allow Ops Director to maintain a history of the configuration of Managed Clusters over time.

Ops Director calls a host on the Managed Cluster to get the modified configurations. Any new or changed configuration is saved with a new timestamped URI. Efficient access to the most recent configuration is managed with properties.
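
The timestamp comparison and timestamped URIs described above might look like the following Python sketch. The URI layout, the `.xml` suffix, and both function names are assumptions made for illustration; the actual URI scheme used by Ops Director is not documented here.

```python
from datetime import datetime, timezone
from typing import Optional


def needs_refresh(notified_at: datetime, local_at: Optional[datetime]) -> bool:
    """Fetch a configuration only if the change is newer than the local copy."""
    return local_at is None or notified_at > local_at


def timestamped_uri(resource_path: str, changed_at: datetime) -> str:
    """Build a hypothetical URI so each saved version has its own document."""
    stamp = changed_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{resource_path}/{stamp}.xml"
```

Because every change is saved under a new URI, earlier versions remain available, which is what allows a configuration history to be kept over time.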

Metering Data

Each Managed Cluster sends metering data (such as documents from the Meters database) to Ops Director.

Metering data is filtered by time period (raw, hourly, or daily). The managed hosts introduce a small, random adjustment to the actual transmission intervals to avoid the situation where every managed host transmits to Ops Director at exactly the same moment every time.
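
The random adjustment is a form of jitter. A minimal Python sketch of the idea follows; the 10% bound, the function name, and the parameters are assumptions chosen for illustration, not values used by MarkLogic.

```python
import random
from typing import Optional


def next_send_delay(base_interval_s: float,
                    max_jitter_fraction: float = 0.1,
                    rng: Optional[random.Random] = None) -> float:
    """Return the base interval shifted by a small random amount.

    Each host draws its own adjustment, so hosts configured with the same
    interval drift apart instead of transmitting in lockstep.
    """
    rng = rng or random.Random()
    jitter = rng.uniform(-max_jitter_fraction, max_jitter_fraction)
    return base_interval_s * (1.0 + jitter)
```

With an hourly interval of 3600 seconds and the assumed 10% bound, each delay falls between 3240 and 3960 seconds.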

Explicit Limits on Type and Rates

You can configure a Managed Cluster to define the type of data sent to Ops Director and its frequency, for example, hourly metering data. By default, Ops Director will accept the configuration as it is defined for the Managed Cluster.

If necessary, you can configure Ops Director to override settings provided by the Managed Cluster, for example, to receive only daily metering data. You can do this using internal functions of the XQuery API. Contact MarkLogic technical support for details.

Security of Collected Data

Ops Director is expected to operate within an enterprise, and not likely to be on the public internet. Nevertheless, the service is designed to be secure and to protect customer confidential data.

  • Local and transient storage of data is secured at the same level or better than the server protects the same data in other use cases.
  • Data transmission uses only HTTP/SSL encrypted secure channels in a 'point to point' architecture.
  • Received data is stored in a MarkLogic database and is thus as secure as any other MarkLogic data.

Unattended Operation

The collection service runs unattended and requires no active management by the source administrator. Performance of the cluster is not significantly affected by this service.

'Best Attempt' SLA

Ops Director collects and transmits data in a 'Best Attempt' SLA (Service Level Agreement). Transmissions are not 'transactional' in the sense of database operations and have less priority than all critical and most other MarkLogic services. Since the service requires network connectivity, bandwidth, and local resources that compete with existing services, it is possible that even in periods of continuous functional operation the service may not be able to provide uninterrupted and complete streams. A 'Best Attempt' approach is used to provide for periods of resource interruptions, heavy load, and improperly configured systems, while providing the configured data within the SLA.

If the connection between Ops Director and a Managed Cluster goes down, the Managed Cluster keeps a buffer of high-priority data that it will transmit to Ops Director when the connection is restored. However, some data might be lost, depending on the volume of data and the duration of the outage. Priority is given to retaining errors and warnings.
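
One way to picture this behavior is a bounded buffer that, when full, gives up its lowest-priority entry first. The Python sketch below illustrates that idea under stated assumptions: the class name, the numeric priorities, and the eviction rule are hypothetical and are not taken from the MarkLogic implementation.

```python
import heapq
import itertools


class BestAttemptBuffer:
    """Bounded buffer that keeps the highest-priority items during an outage."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        # Min-heap of (priority, sequence, item): the root is always the
        # lowest-priority entry, and the oldest one among equal priorities.
        self._heap = []
        self._seq = itertools.count()

    def add(self, priority: int, item: str) -> None:
        """Buffer an item; when full, keep it only if it outranks the root."""
        entry = (priority, next(self._seq), item)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
        elif priority > self._heap[0][0]:
            heapq.heapreplace(self._heap, entry)
        # Otherwise the new item is dropped: some data may be lost.

    def drain(self) -> list:
        """Return buffered items in arrival order and empty the buffer."""
        items = [item for _, _, item in sorted(self._heap, key=lambda e: e[1])]
        self._heap.clear()
        return items
```

If errors are given a higher number than informational messages, a long outage fills the buffer with errors and warnings while lower-priority entries are discarded, and `drain()` returns what survived once the connection is restored.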

Data Type Separation

To decouple the collection, transmission, storage, and access requirements of the different kinds of data, Ops Director maintains a separate logical 'stream' for each of the types of data. Each type of data (log, configuration, meter) is collected, configured, prioritized, identified, and delivered independently. This allows for simpler logic and specialization for the sending, receiving, and access use cases.

Data Transmission

All data is delivered over HTTP(S) to Ops Director. Data is stored in MarkLogic on the Ops Director Application Cluster. Ops Director operates inside the enterprise; there is no expectation that Ops Director should refuse data from some clusters or be required to guard against denial-of-service or other malicious behavior.