Ops Director Guide — Chapter 1

Monitoring MarkLogic with Ops Director

Ops Director enables you to monitor MarkLogic clusters ranging from a single node to large multi-node deployments. A single Ops Director server can monitor multiple clusters. Ops Director provides a unified browser-based interface for easy access and navigation.

Ops Director presents a consolidated view of your MarkLogic infrastructure, to streamline monitoring and troubleshooting of clusters with alerting, performance, and log data. Ops Director provides enterprise-grade security of your cluster configuration and performance data with robust role-based access control and information security powered by MarkLogic Server.

Overview

Ops Director is designed to accommodate your evolving IT strategy, whether for one or many clusters, on premises or the cloud. Ops Director complements the current MarkLogic admin tools, bringing enterprise IT administrators a simple, flexible, and proactive management experience.

Ops Director decreases your learning curve by using guided visual representations of databases, clusters, and applications. It enables you to identify potential problems and bring them to attention before they occur. It also provides learning and analysis opportunities with centralized data collection, delivery, and storage.

You can use Ops Director for the following tasks:

  • To keep track of the day-to-day operations of your MarkLogic Server environment.
  • To plan initial capacity and fine-tune your MarkLogic Server environment. For details on how to configure your MarkLogic Server cluster, see the Scalability, Availability, and Failover Guide.
  • To troubleshoot application performance problems. For details on how to troubleshoot and resolve performance issues, see the Query Performance and Tuning Guide.
  • To troubleshoot application errors and failures.

The monitoring metrics and thresholds of interest will vary depending on your specific hardware and software environment and configuration of your MarkLogic Server cluster. This chapter lists some of the metrics of interest when configuring and troubleshooting MarkLogic Server. However, MarkLogic Server is just one part of your overall environment. The health of your cluster depends on the health of the underlying infrastructure, such as network bandwidth, disk I/O, memory, and CPU.

Terms

  • Ops Director Application Cluster -- a MarkLogic cluster that contains the host that runs the Ops Director application.
  • Managed Cluster -- a MarkLogic cluster that is specifically configured to be managed by the Ops Director application.
  • Ops Director Application -- Ops Director application server that is responsible for communication with the browser.
  • Ops Director System -- Ops Director application server that is responsible for inter-cluster communication.
  • Resource Group -- a configuration that represents one or more resources of a certain type, such as hosts, App Servers, databases, and so on.
  • RBAC -- Role-Based Access Control -- a method of regulating access to computer or network resources based on the roles of individual users within an enterprise.
  • XDQP -- XML Data Query Protocol -- a MarkLogic internal protocol used for communication between nodes in a cluster.
  • CSV -- Comma-Separated Values -- a file format that stores tabular data as plain text, where each line of the file is a data record, each record consists of one or more fields, and the fields are separated by commas.
  • View -- a top-level page of Ops Director UI, which you can access from the menu bar; provides a high-level view of MarkLogic resources.
  • Tab -- a next-level page of Ops Director UI, which you can access by clicking on one of the tabs in a view; enables you to drill down for specific resources or resource types.
  • Report -- a document-centric display of information about one or more assets for a specific date and time period.
  • Rates -- The amount of data (MB per second) currently being read from or written to a resource.
  • Loads -- The execution time (in seconds) of current read and write requests on a resource, which includes the time requests spend in the wait queue when maximum throughput is achieved.
  • Color-coded severity -- colors used in graphic representations of alert severity, as indicated in the following table:
    Color           Alert Severity
    Red             Critical
    Yellow          At Risk / Warning
    Green           Healthy
    Dark Gray       Maintenance
    Light Gray      Offline
    Blue            Security
    Dark Green      Information
    White / Hollow  Unknown
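
The color table above can be expressed as a simple lookup. The following is a minimal Python sketch, not part of Ops Director; the names SEVERITY_COLORS and color_for are hypothetical, and falling back to the Unknown color for unrecognized severities is an assumption.

```python
# Severity-to-color mapping taken from the table in this guide.
SEVERITY_COLORS = {
    "critical": "Red",
    "at risk": "Yellow",
    "warning": "Yellow",
    "healthy": "Green",
    "maintenance": "Dark Gray",
    "offline": "Light Gray",
    "security": "Blue",
    "information": "Dark Green",
    "unknown": "White / Hollow",
}


def color_for(severity: str) -> str:
    """Return the display color for a severity name.

    Assumption: a severity that is not in the table is shown as Unknown.
    """
    return SEVERITY_COLORS.get(severity.strip().lower(), SEVERITY_COLORS["unknown"])
```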

Ops Director High-Level Architecture

An Ops Director instance includes two application servers:

  • Ops Director Application server, which by default runs on port 8008, provides services to the Ops Director browser application.
  • Ops Director System server, which by default runs on port 8009, receives data transmitted from Managed Clusters and stores it in the Ops Director database.

When the Ops Director application is installed on a cluster, you can view these two application servers in the Summary page of the Admin Interface.

Ops Director Deployment Configurations

A single Ops Director instance can manage multiple clusters. Managed clusters collect information about the state of their cluster (error logs, configuration data, and meters) and securely send that data to the Ops Director System. The Ops Director System receives this information and stores it in the Ops Director database. The Ops Director Application executes queries against the Ops Director database to retrieve information for display in the browser.

Ops Director is designed to adapt to a range of MarkLogic deployments, from a single cluster in a single data center to hundreds of clusters. Ops Director has three possible deployment configurations:

  • Collocated: A cluster serves as both an Ops Director Application Cluster and a Managed Cluster.
  • Single Data Center: A cluster serves as an Ops Director Application Cluster that communicates with Managed Clusters.
  • Hybrid: A cluster serves as both an Ops Director Application Cluster and a Managed Cluster, and also communicates with other Managed Clusters.
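
To make the distinction between the three configurations concrete, here is a small Python sketch that names the configuration of an Ops Director Application Cluster. The function and its parameters are hypothetical and serve only as an illustration; Ops Director does not expose such a function.

```python
def deployment_configuration(manages_itself: bool, other_managed_clusters: int) -> str:
    """Name the deployment configuration of an Ops Director Application Cluster.

    manages_itself: the Application Cluster is also a Managed Cluster.
    other_managed_clusters: number of separate Managed Clusters it communicates with.
    """
    if other_managed_clusters < 0:
        raise ValueError("other_managed_clusters must not be negative")
    if manages_itself and other_managed_clusters == 0:
        return "Collocated"
    if not manages_itself and other_managed_clusters > 0:
        return "Single Data Center"
    if manages_itself and other_managed_clusters > 0:
        return "Hybrid"
    # Neither managing itself nor any other cluster: nothing is being monitored.
    raise ValueError("an Application Cluster with nothing to manage matches no configuration")
```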

A multi-cluster configuration can manage local and remote clusters across a network as long as all of the clusters are available and not concealed behind a firewall. All communication between an Ops Director Application Cluster and Managed Clusters is performed securely.

Ops Director can manage only clusters that run either the same version of MarkLogic Server as the version it was shipped with or one of the older versions. This limitation has an impact on colocating Ops Director on clusters with other applications (hybrid deployment configuration). You must not put Ops Director on a cluster with other applications that you would be reluctant to upgrade to the latest version of MarkLogic Server. If you choose to run Ops Director in a hybrid environment, make sure to put Ops Director on a cluster that you are willing and able to upgrade first, because Ops Director will not be able to manage any upgraded clusters until Ops Director Application Cluster has been upgraded.
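
The version rule above amounts to a simple comparison. The following Python sketch illustrates it under the assumption that version strings look like "10.0-4" (numeric parts separated by dots and hyphens); parse_version and can_manage are hypothetical helper names, not Ops Director functions.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a version such as '10.0-4' into a comparable tuple (10, 0, 4)."""
    return tuple(int(part) for part in re.split(r"[.-]", version))


def can_manage(ops_director_version: str, managed_version: str) -> bool:
    """True when the Managed Cluster runs the same or an older server version."""
    return parse_version(managed_version) <= parse_version(ops_director_version)
```

For example, can_manage("9.0-11", "10.0-1") is False, which is why the Ops Director Application Cluster should be upgraded before the clusters it manages.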

Ops Director Security

In a multi-cluster environment, where Ops Director provides alerting, management, and reporting capabilities, different administrators have different goals. To ensure that each type of administrator can achieve their goals without causing harm or accessing information to which they are not supposed to have access, Ops Director employs comprehensive role-based access controls. The assets and views displayed to an administrator are based upon group participation and the group's roles. Each role has specific privileges. The combination of an administrator's role, resource group membership, and the asset determines which capabilities are available to that administrator.

When a Managed Cluster connects to Ops Director, it grants Ops Director the ability to perform actions by making requests back to the Managed Cluster. These requests are protected by certificate-based authentication. User IDs and passwords are not sent across the network. Ops Director enables you to manage clusters on which you have no direct login capability.

This section provides a conceptual overview of Ops Director security. The procedures for configuring Ops Director security are described in Installing and Configuring Ops Director and Console Settings View.

Communication with Managed Clusters

When a cluster agrees to be managed by Ops Director, it establishes a long-lived secure communication channel between itself and Ops Director using certificates. Ops Director requests signed with the appropriate certificate can communicate with the Managed Cluster and perform admin tasks on the cluster. The certificate authority for these certificates can be either self-signed, as described in Installing Ops Director on an Application Cluster, or externally signed, as described in Securing Ops Director with Externally Signed Certificates.

For a particular request, Ops Director determines which roles and privileges are in effect for that request. It then makes the request to the Managed Cluster, using a short-lived certificate, to perform some action. In the request, Ops Director includes an identifier for the request.

The Managed Cluster passes that identifier back to Ops Director, over the secure channel, to obtain the set of roles and privileges that apply for the request. Once that context is established, it is used to perform the request, which will succeed or fail on its own merits.
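
The exchange described above can be modeled in a few lines. This Python sketch is a conceptual illustration only: it omits certificates and networking entirely, all class and method names are hypothetical, and treating a request identifier as single-use is an assumption rather than documented behavior.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, FrozenSet


@dataclass
class OpsDirector:
    # Request identifier -> privileges in effect for that request.
    pending: Dict[str, FrozenSet[str]] = field(default_factory=dict)

    def start_request(self, privileges: FrozenSet[str]) -> str:
        """Record the security context and return the identifier sent with the request."""
        request_id = uuid.uuid4().hex
        self.pending[request_id] = privileges
        return request_id

    def resolve_context(self, request_id: str) -> FrozenSet[str]:
        """Called back by the Managed Cluster; unknown identifiers yield no privileges."""
        return self.pending.pop(request_id, frozenset())


@dataclass
class ManagedCluster:
    ops_director: OpsDirector

    def handle(self, request_id: str, required_privilege: str) -> bool:
        """Fetch the context for the identifier, then let the request succeed or fail."""
        context = self.ops_director.resolve_context(request_id)
        return required_privilege in context
```

Note that no user ID or password appears anywhere in the exchange; the Managed Cluster learns only the roles and privileges tied to the identifier.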

The procedure for establishing secure certificate-based communication between Ops Director and the Managed Clusters is described in Installing and Configuring Ops Director.

Security and Database Dependencies of Managed Clusters

When you connect a managed cluster to Ops Director, a SecureManage application server is created for all groups that contain hosts from this cluster.

Each SecureManage application server is a copy of the ManageServer, with support for SSL and certificate-based authentication. This allows Ops Director to communicate with managed clusters using SSL without knowing any user credentials on the managed cluster.

The SecureManage application server uses the App-Services database and the database's forests. If you disable or delete the SecureManage application server or any of its resources (such as the App-Services database and the forests for that database) on a managed cluster, Ops Director will not be able to retrieve information from this managed cluster, and the cluster will be assigned the Unknown state. Likewise, if you remove an External Certificate from a managed cluster, Ops Director will not be able to retrieve information from this managed cluster, and the cluster will be assigned the Unknown state.

Resource Groups

In a large environment, it is useful to group resources together, for example, 'the staging hosts' or 'the production databases.' In Ops Director, these are called resource groups. Resource groups are homogeneous: each group contains resources of a single type. This keeps the meaning of 'apply this action to this group' simple, as in 'show me statistics for the production databases group.'

Each resource group consists of the following:

  • name -- The user-visible name for the resource group.
  • type -- The type of resources contained in the resource group.
  • resources -- The list of resources in the resource group.
  • role -- An optional role or roles that control access to this resource group.
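
The four fields above map naturally onto a small data structure. The following Python sketch is illustrative only; the class names are hypothetical and the type strings (such as "database" and "host") are assumed labels, not Ops Director identifiers. It also shows how a group can enforce that all of its resources share one type.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Resource:
    name: str
    type: str  # assumed labels such as "host", "database", "server", "cluster"


@dataclass
class ResourceGroup:
    name: str                                            # user-visible name
    type: str                                            # type of every resource in the group
    resources: List[Resource] = field(default_factory=list)
    roles: List[str] = field(default_factory=list)       # optional roles controlling access

    def add(self, resource: Resource) -> None:
        """Add a resource, rejecting any whose type differs from the group's type."""
        if resource.type != self.type:
            raise ValueError(
                f"group '{self.name}' holds {self.type} resources, not {resource.type}"
            )
        self.resources.append(resource)
```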

The resource group role enables you to establish access to a resource group at a finer and more ad hoc granularity than is provided by the established roles. It is likely that roles defined within the enterprise are fairly coarse-grained and that changing roles (in an external LDAP server, for example) may be considered too heavyweight for ad hoc groupings. For more details on roles and resource groups and how to create them, see Console Settings View.

You can configure a resource group so it grants additional privileges within the context of that group.

Access Inheritance in Resource Groups

Resource groups consist of a single resource type and an explicitly enumerated list of resources.

Inheritance in resource groups extends the reach of a single resource group across multiple resource types, as follows:

  • A cluster in a cluster resource group inherits all of the resources in the cluster (hosts, application servers, databases, and forests).
  • A host in a host resource group inherits all of the databases and forests on that host.
  • A database in a database resource group inherits all the forests in the database.
  • An application server in a server resource group inherits the databases that the server uses.

Inheritance is transitive. If a server resource group gives you access to a database, you have access to its forests. The following inheritance access rules apply in Ops Director:

  • If you can access a cluster, you can access all resources of this cluster: hosts, application servers, databases, and forests.
  • If you can access a host, you can access all databases and forests of that host.
  • If you can access a database, you can access all forests of that database.
  • If you can access an application server, you can access all databases that this application server uses and all forests of these databases.
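
Because inheritance is transitive, the set of resources you can reach is the closure of the "contains or uses" relationship. The Python sketch below walks a small, made-up topology to show this; the resource names, the "type:name" labels, and the accessible function are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical topology: each resource maps to the resources it directly contains or uses.
TOPOLOGY: Dict[str, List[str]] = {
    "cluster:prod": ["host:h1", "server:app1", "database:docs"],
    "host:h1": ["database:docs"],
    "server:app1": ["database:docs"],
    "database:docs": ["forest:docs-1", "forest:docs-2"],
}


def accessible(start: str, topology: Dict[str, List[str]]) -> Set[str]:
    """Return the starting resource plus everything reachable from it by inheritance."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        resource = stack.pop()
        if resource in seen:
            continue
        seen.add(resource)
        stack.extend(topology.get(resource, []))
    return seen
```

Access to server:app1 therefore yields database:docs and, through it, both forests, but not host:h1.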

Execute Privileges

The following privileges are specific to Ops Director:

  • http://marklogic.com/xdmp/privileges/opsdir-admin -- Grants Ops Director Administrator privileges.
  • http://marklogic.com/xdmp/privileges/opsdir-license-admin -- Grants privileges to manage licenses in Ops Director.
  • http://marklogic.com/xdmp/privileges/opsdir-user -- Grants privileges to use Ops Director.

These execute privileges are pre-defined and included with every installation of MarkLogic Server. You can view them in the Execute Privileges Summary page of the Admin Interface.

Data Collected by Ops Director

Once configured, Ops Director begins collecting data from all the hosts in the Managed Clusters and storing it in the Ops Director database.

The following types of data are collected:

Log Data

Log messages generated at or above the level specified for delivery to Ops Director will be queued and sent as quickly as possible. Errors generated at a level of critical or higher will block while they are being sent to make sure that a delivery attempt is made before the host restarts.

Server log messages are filtered by the log level. You can configure the minimum log level for log messages sent to Ops Director via the Admin Interface, Ops Director Setup page. For example, if you configure the minimum log level to 'error', then you will see all messages at the 'error' level and above.

The minimum log level can be configured as:

  • fine
  • debug
  • config
  • info
  • notice
  • warning
  • error
  • critical
  • alert
  • emergency

    It is not recommended to configure the minimum log level to a finer granularity than info.
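
The filtering rule can be sketched as a comparison of positions in the level list. This Python example assumes the levels listed above run from least to most severe; LOG_LEVELS and should_send are hypothetical names used only for illustration.

```python
# Levels in the order listed in this guide, assumed to run from least to most severe.
LOG_LEVELS = [
    "fine", "debug", "config", "info", "notice",
    "warning", "error", "critical", "alert", "emergency",
]


def should_send(message_level: str, minimum_level: str) -> bool:
    """True when a message is at or above the configured minimum log level."""
    return LOG_LEVELS.index(message_level) >= LOG_LEVELS.index(minimum_level)
```

With the minimum set to 'error', a 'critical' message is sent and a 'warning' message is not.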

Configuration Data

Whenever the configuration of a Managed Cluster changes, Ops Director is notified.

When Ops Director is notified of a configuration change, if the change is more recent than the local data that Ops Director has for a particular configuration, Ops Director retrieves the necessary payloads from the Managed Clusters in the form of resource documents. This configuration data includes any changes made to server configuration files. Ops Director leverages this interaction to obtain information about every applicable resource on each Managed Cluster: groups, hosts, databases, application servers, and so forth.

Timestamps allow Ops Director to maintain a history of the configuration of Managed Clusters over time.

Ops Director calls a host on the Managed Cluster to get the modified configurations. Any new or changed configuration is saved with a new timestamped URI. Efficient access to the most recent configuration is managed with properties.
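
The following Python sketch illustrates the idea of timestamped versions with a pointer to the newest one. It is not how Ops Director stores data: the URI layout, the class, and the in-memory dictionaries (which stand in for the database and its properties) are all assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


class ConfigHistory:
    """Keeps every saved configuration version and a pointer to the newest one."""

    def __init__(self) -> None:
        self.documents: Dict[str, str] = {}          # URI -> configuration payload
        self.latest_uri: Dict[str, str] = {}         # resource -> URI of newest version
        self.latest_time: Dict[str, datetime] = {}   # resource -> timestamp of newest version

    def save(self, resource: str, payload: str, changed_at: datetime) -> Optional[str]:
        """Store a version only if it is more recent than the one already held."""
        known = self.latest_time.get(resource)
        if known is not None and changed_at <= known:
            return None
        stamp = changed_at.strftime("%Y%m%dT%H%M%SZ")
        uri = f"/config/{resource}/{stamp}.xml"      # hypothetical URI layout
        self.documents[uri] = payload
        self.latest_uri[resource] = uri
        self.latest_time[resource] = changed_at
        return uri

    def current(self, resource: str) -> Optional[str]:
        """Return the newest payload without scanning the full history."""
        uri = self.latest_uri.get(resource)
        return None if uri is None else self.documents[uri]


history = ConfigHistory()
history.save("databases/Documents", "<v1/>", datetime(2020, 1, 1, tzinfo=timezone.utc))
history.save("databases/Documents", "<v2/>", datetime(2020, 1, 2, tzinfo=timezone.utc))
```

Older versions remain stored under their own URIs, which is what allows the configuration history to be reconstructed later.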

Metering Data

Each Managed Cluster sends metering data (such as documents from the Meters database) to Ops Director. Filtering of metered data is by time (raw, hourly, or daily). The managed hosts introduce a small, random adjustment factor in the actual intervals to avoid the situation where every managed host transmits to Ops Director at exactly the same moment every time.
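
A random adjustment of this kind is commonly called jitter. The Python sketch below shows one way to compute a jittered interval; the 10 percent default and the function name next_interval are assumptions, since this guide does not state the actual adjustment that MarkLogic applies.

```python
import random
from typing import Optional


def next_interval(base_seconds: float,
                  jitter_fraction: float = 0.1,
                  rng: Optional[random.Random] = None) -> float:
    """Return the base interval shifted by a random amount within +/- jitter_fraction."""
    if rng is None:
        rng = random.Random()
    return base_seconds * (1 + rng.uniform(-jitter_fraction, jitter_fraction))
```

With an hourly base interval and the assumed 10 percent fraction, each host would transmit somewhere between 54 and 66 minutes after its previous transmission, so hosts drift apart instead of sending in lockstep.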

Explicit Limits on Type and Rates

You can configure a Managed Cluster to define the type of data sent to Ops Director and its frequency, for example, hourly metering data. By default, Ops Director accepts the configuration as it is defined for the Managed Cluster.

If necessary, you can configure Ops Director to override settings provided by the Managed Cluster, for example, to receive only daily metering data. You can do this with the internal functions of the XQuery API. Contact MarkLogic technical support for details.

Security of Collected Data

Ops Director is expected to operate within an enterprise rather than on the public Internet. Nevertheless, the service is designed to be secure and to protect customer confidential data.

  • Local and transient storage of data is secured at the same level or better than the server protects the same data in other use cases.
  • Data transmission uses only HTTP/SSL encrypted secure channels in a 'point to point' architecture.
  • Received data is stored in a MarkLogic database and is thus as secure as any other MarkLogic data.

Unattended Operation

The collection service runs unattended and requires no active management by the source administrator. Performance of the cluster is not significantly affected by this service.

'Best Attempt' SLA

Ops Director collects and transmits data in a 'Best Attempt' SLA (Service Level Agreement). Transmissions are not 'transactional' in the sense of database operations and have less priority than all critical and most other MarkLogic services. Since the service requires network connectivity, bandwidth, and local resources that compete with existing services, it is possible that even in periods of continuous functional operation the service may not be able to provide uninterrupted and complete streams. A 'Best Attempt' approach is used to provide for periods of resource interruptions, heavy load, and improperly configured systems, while providing the configured data within the SLA.

If the connection between Ops Director and a Managed Cluster goes down, the Managed Cluster keeps a buffer of high-priority data that it will transmit to Ops Director when the connection is restored. However, some data might be lost, depending on the volume of data and the duration of the outage. Priority is given to retaining errors and warnings.
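
The buffering behavior can be pictured as a bounded buffer that discards the least important entries first. The Python sketch below is a simplified model, not the actual implementation: the capacity, the priority values, and the OutageBuffer class are all hypothetical.

```python
import heapq
from itertools import count
from typing import List, Tuple

# Assumed priorities: errors outrank warnings, and everything else ranks lowest.
PRIORITY = {"error": 2, "warning": 1}


class OutageBuffer:
    """Bounded buffer that drops the lowest-priority, oldest entry when full."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._heap: List[Tuple[int, int, str]] = []  # (priority, sequence, message)
        self._sequence = count()

    def add(self, kind: str, message: str) -> None:
        entry = (PRIORITY.get(kind, 0), next(self._sequence), message)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
        else:
            # Push the new entry and discard whichever entry is now least important.
            heapq.heappushpop(self._heap, entry)

    def drain(self) -> List[str]:
        """Return the retained messages in arrival order and empty the buffer."""
        messages = [msg for _, _, msg in sorted(self._heap, key=lambda e: e[1])]
        self._heap.clear()
        return messages
```

A long outage with heavy traffic fills the buffer, at which point informational entries are lost before warnings, and warnings before errors.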

Data Type Separation

To decouple the collection, transmission, storage, and access requirements of the different kinds of data, Ops Director maintains a separate logical 'stream' for each of the types of data. Each type of data (log, configuration, meter) is collected, configured, prioritized, identified, and delivered independently. This allows for simpler logic and specialization for send, receive, and access use cases.

Data Transmission

All data is delivered over HTTP(S) to Ops Director. Data is stored in MarkLogic on the Ops Director Application Cluster. Ops Director operates inside the enterprise; there is no expectation that Ops Director will refuse data from some clusters or be required to guard against denial-of-service or other malicious behavior.
