MarkLogic 9 Product Documentation
Monitoring MarkLogic Guide
— Chapter 3

MarkLogic Server Monitoring History

This chapter describes how to use the Admin Interface and the Monitoring History dashboard to capture and use historical performance data for a MarkLogic cluster. The same Monitoring History operations can also be performed using the XQuery and REST APIs, as described in the XQuery and XSLT Reference Guide and the MarkLogic REST API Reference.

All MB and GB metrics described in this chapter are base-2 (1 MB = 2^20 bytes and 1 GB = 2^30 bytes).

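The main topics in the chapter are:

  • Overview
  • Enabling Monitoring History on a Group
  • Setting the Monitoring History Data Retention Policy
  • Viewing Monitoring History
  • Historical Performance Charts by Resource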

Overview

The Monitoring History feature allows you to capture and view critical performance data from your cluster. Once the performance data has been collected, you can view the data in the Monitoring History pages. The top-level Monitoring History page provides an overview of the performance metrics for all of the key resources in your cluster. For each resource, you can drill down for more detail. You can also adjust the time span of the viewed data and apply filters to view the data for select resources to compare and spot exceptions.

By default, the performance data is stored in the Meters database. Monitoring history capture is enabled at the group level. Typically you have one group per cluster. You can also configure a consolidated Meters database that captures performance metrics from multiple groups. The group configuration defines which database is used to store performance metrics for that group (defaulting to a shared Meters database per cluster), as well as all configuration parameters for performance metrics, such as the frequency of data capture and how long to retain the performance data. The Meters database can participate in all normal database replication, security, and failover operations.

Enabling Monitoring History on a Group

To collect monitoring history data for your cluster, you must enable performance metering for your group.

  1. Log into the Admin Interface.
  2. Click the Groups icon on the left tree menu.
  3. Locate the Performance Metering Enabled field toward the bottom of the Group Configure page and click on true.

    You can configure the parameters for collecting monitoring history data, as described in the following table.

    Parameter | Description
    meters database | The database in which performance monitoring history data and usage metrics documents are stored. By default, historical performance and usage metrics are stored in the Meters database.
    performance metering period | The performance metering period, in minutes. Performance data is collected at each period. The period can be any value of 1 minute or more. If you are collecting monitoring history for multiple groups, you should either set the same period for each group or configure your filter to view the history data for one group at a time.
    performance metering retain raw | The number of days raw performance monitoring history data is retained. See Setting the Monitoring History Data Retention Policy for details.
    performance metering retain hourly | The number of days hourly performance monitoring history data is retained. See Setting the Monitoring History Data Retention Policy for details.
    performance metering retain daily | The number of days daily performance monitoring history data is retained. See Setting the Monitoring History Data Retention Policy for details.
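
The same group settings can also be applied through the REST API mentioned at the start of this chapter. The following is a minimal sketch rather than a definitive procedure. The endpoint path, the JSON property names, the group name Default, the host name, and the helper build_metering_request are all assumptions that should be checked against the MarkLogic REST API Reference. The sketch only builds the request and does not send it.

```python
import json
import urllib.request


def build_metering_request(host, group="Default", period_minutes=1,
                           retain_raw=7, retain_hourly=30, retain_daily=90):
    """Build (but do not send) a PUT request that enables performance metering."""
    if period_minutes < 1:
        raise ValueError("the metering period must be 1 minute or more")
    # Assumption: the property names mirror the Admin Interface field names.
    payload = {
        "performance-metering-enabled": True,
        "performance-metering-period": period_minutes,
        "performance-metering-retain-raw": retain_raw,
        "performance-metering-retain-hourly": retain_hourly,
        "performance-metering-retain-daily": retain_daily,
    }
    # Assumption: group properties are exposed at this path on port 8002.
    url = f"http://{host}:8002/manage/v2/groups/{group}/properties"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = build_metering_request("monitor-host", period_minutes=5)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending the request requires credentials for a user with sufficient privileges. That part is deliberately left out of the sketch.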

Setting the Monitoring History Data Retention Policy

The retention policy (for raw, hourly, daily) is a value set in days. If performance metering is enabled, then all data that is older than that many days for the specified period (raw, hour, day) is deleted. The retention policy is set at a group level, so different groups can have different retention policies. For example, GroupA may have raw set to 1 day and GroupB may have raw set to 10 days. The cleanup code follows this retention value on a per-group basis.

Metering data can become orphaned, meaning that it no longer belongs to an existing group. Some examples of when this can occur are:

  • Deleting a group
  • Importing metering data from another cluster

Any metering data that no longer belongs to an active group in the current cluster is deleted. To avoid this, either turn off metering or, instead of deleting a group, move its hosts out of the group and keep the group in the cluster configuration.

Older monitoring history data that you load (for example, by restoring a backup of the Meters database) is immediately subject to the data retention policy. For this reason, turn off performance metering before restoring any data that is older than the time specified by your retention policy.

Data is deleted no sooner than the retention period specifies, but, for various reasons, it may be kept for an unspecified amount of time beyond that period.

Changing the retention policy from smaller to larger values does not restore data that has already been deleted.

The default data retention policy settings are shown in the following table. To maximize efficiency, it is a best practice to retain raw data for the fewest days and daily data for the most days.

Period | Retention Period
Raw | 7 Days
Hourly | 30 Days
Daily | 90 Days
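
The retention rules in this section can be sketched in a few lines of Python. This is an illustration of the described behavior, not the server's cleanup code: the document shape (a dict with group, period, and time keys) and the function names are hypothetical. The function reports only whether data is eligible for deletion, because actual deletion may happen later.

```python
from datetime import datetime, timedelta

# Default retention, in days, for each period.
DEFAULT_RETENTION_DAYS = {"raw": 7, "hour": 30, "day": 90}


def is_expired(doc_time, period, now, retention_days=DEFAULT_RETENTION_DAYS):
    """Return True if data for the given period is older than its retention."""
    cutoff = now - timedelta(days=retention_days[period])
    return doc_time < cutoff


def eligible_for_deletion(doc, active_groups, now):
    """Decide whether a metering document may be deleted.

    doc: hypothetical dict with 'group', 'period', and 'time' keys.
    active_groups: maps each active group name to its retention settings.
    """
    if doc["group"] not in active_groups:
        return True  # orphaned data belongs to no active group
    return is_expired(doc["time"], doc["period"], now, active_groups[doc["group"]])


now = datetime(2024, 3, 31, 12, 0)
groups = {
    "GroupA": {"raw": 1, "hour": 30, "day": 90},
    "GroupB": {"raw": 10, "hour": 30, "day": 90},
}
sample = {"group": "GroupA", "period": "raw", "time": now - timedelta(days=3)}
print(eligible_for_deletion(sample, groups, now))                          # True
print(eligible_for_deletion({**sample, "group": "GroupB"}, groups, now))   # False
print(eligible_for_deletion({**sample, "group": "Deleted"}, groups, now))  # True
```

The three calls correspond to the GroupA and GroupB example above and to data left behind by a deleted group.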

Viewing Monitoring History

You can display the Monitoring History dashboard by doing the following:

  1. Open a browser and enter the URL:
    http://monitor-host:8002/

    where monitor-host is a host in the cluster you want to monitor.

  2. At the top of the page, click on Monitoring, and then click on History in the pull-down menu.

  3. The Monitoring History page appears. From the Monitoring History Overview page, you can navigate to any of the pages described in this chapter.

Each line in a chart represents a metric for the resource. In the Overview page, the lines represent an aggregate of the metrics for all of the cluster resources. In each Details page, the lines represent the metric for each specific resource.

Chart titles on the Overview page include bracketed information specifying how chart data gathered across multiple resources is aggregated. For example, [Sum of Hosts] means that the data retrieved from one or more hosts is summed for display as points on the chart.
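
As a rough illustration of what [Sum of Hosts] means, the sketch below adds up one metric across hosts for each capture time. The tuple layout of the samples and the function name sum_of_hosts are hypothetical and are not taken from the Meters database schema.

```python
from collections import defaultdict


def sum_of_hosts(samples):
    """Sum one metric across hosts for each capture time.

    samples: iterable of (host, timestamp, value) tuples (a hypothetical shape).
    Returns a list of (timestamp, total) pairs sorted by timestamp.
    """
    totals = defaultdict(float)
    for _host, timestamp, value in samples:
        totals[timestamp] += value
    return sorted(totals.items())


samples = [
    ("host-1", "10:00", 2.0), ("host-2", "10:00", 3.5),
    ("host-1", "11:00", 1.0), ("host-2", "11:00", 0.5),
]
print(sum_of_hosts(samples))  # [('10:00', 5.5), ('11:00', 1.5)]
```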

Each point on a line represents a period in which the performance data was captured. Hovering over a chart point displays the name of the resource metric, along with the performance value for the metric at that point in time.

The displayed metrics (in megabytes per second) are color coded. To display a legend that shows which color represents which metric, click on the red dot in the upper right-hand section of the graph. To close the legend, click on the 'x' in the upper right-hand portion of the legend window.

To simplify the view of charts on a page, you can collapse a chart or a group of charts for a resource by clicking on the triangle in the upper right-hand portion of the chart or chart group.

To expand a collapsed chart view, click on the triangle in the upper right-hand portion of the collapsed chart.

Viewing Monitoring History by Time Span and Frequency

As described in Enabling Monitoring History on a Group, the frequency at which performance metrics are captured is configurable, in whole minutes. The snapshots of performance metrics for each host are rolled up into a summary document that contains aggregate calculations on the values for that host.

You can configure your view of the captured performance data by time span and frequency.

The Time Span settings are located in the upper left corner of the Monitoring History page.

The following settings control how the data is displayed:

  • A date/time range, down to the granularity of a minute, that determines the time span of the displayed data. (By default, this is the last 24 hours.)
  • A period interval that determines the frequency of the displayed data. The possible intervals are shown in the following table.
    Period | Description
    Raw | Display the performance data just as it was captured with the set frequency.
    Hour | Display the performance data, in aggregate form, per hour. (This is the default.)
    Day | Display the performance data, in aggregate form, per day.
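
To make the difference between the Raw, Hour, and Day periods concrete, the sketch below rolls raw samples up into hourly or daily buckets. It assumes the aggregate is a simple mean. The text above does not say which aggregate calculations the server stores, so treat the choice of mean, the sample layout, and the function name roll_up as illustrative assumptions.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def roll_up(samples, period):
    """Group (datetime, value) samples by hour or day and average each bucket."""
    if period == "raw":
        return sorted(samples)
    fmt = {"hour": "%Y-%m-%d %H:00", "day": "%Y-%m-%d"}[period]
    buckets = defaultdict(list)
    for when, value in samples:
        buckets[when.strftime(fmt)].append(value)
    return [(label, mean(values)) for label, values in sorted(buckets.items())]


raw = [
    (datetime(2024, 1, 1, 9, 0), 4.0),
    (datetime(2024, 1, 1, 9, 30), 6.0),
    (datetime(2024, 1, 1, 10, 0), 8.0),
]
print(roll_up(raw, "hour"))  # [('2024-01-01 09:00', 5.0), ('2024-01-01 10:00', 8.0)]
print(roll_up(raw, "day"))   # [('2024-01-01', 6.0)]
```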

To zoom in on part of the time span, click on the start time of the zoom on any chart, hold down the left mouse button, and drag to the end time. The selected time frame is highlighted, and all of the charts on the page display the zoomed-in time span. Navigating to another Monitoring History page resets all of the charts to the time span selected in the TIME SPAN panel.

After changing the time span, the period, or both, click on refresh to display the updated charts. Clicking refresh also applies any changes you have made to the Filters settings. For details about filters, see Filtering Monitoring History by Resources. If you have zoomed in on a portion of a time span, refresh redisplays the charts using the time span selected in the TIME SPAN panel.

You can use the Shortcut links to display either the last hour, day or 30 days of performance data. Selecting a Shortcut link will automatically refresh the displayed charts.

Each Shortcut also sets the Period value, as shown in the following table.

Shortcut | Period
1h | Raw
1d | Hour
30d | Day
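
The effect of a Shortcut link can be expressed as a small lookup, sketched below. The function name apply_shortcut and the choice to end the span at the current minute are assumptions made for illustration. The mapping from shortcut to period follows the table above.

```python
from datetime import datetime, timedelta

# Shortcut -> (length of the time span, period), following the table above.
SHORTCUTS = {
    "1h": (timedelta(hours=1), "raw"),
    "1d": (timedelta(days=1), "hour"),
    "30d": (timedelta(days=30), "day"),
}


def apply_shortcut(name, now):
    """Return (start, end, period) for a shortcut, truncated to whole minutes."""
    span, period = SHORTCUTS[name]
    end = now.replace(second=0, microsecond=0)
    return end - span, end, period


start, end, period = apply_shortcut("1d", datetime(2024, 5, 2, 14, 30, 45))
print(start, end, period)  # 2024-05-01 14:30:00 2024-05-02 14:30:00 hour
```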

Labeling Monitoring History Time Spans

You can use the Label feature to capture and tag metrics for the set time span. You can store any number of labels. These labels can be used to identify events, instances, and periods of time. Labels can be added, updated or deleted at any time. Labels themselves are not stored with the raw metric data. They are only used for reporting purposes.

  1. To create a label for your current view of the Monitoring History, select New Label from the Label pull-down menu.

  2. In the Create a New Label popup window, the name of the label is the time span of the currently displayed charts, by default.

  3. You can keep the default name for the label, or change it to be more descriptive. Click Save.

  4. You can edit your label names or delete labels by selecting Edit Labels from the Labels pull-down menu.

  5. In the Edit Labels popup window, you can either edit the label name or delete the label. To delete a label, hover over the label and click on the garbage can icon to the right. When finished editing, click Close.

    If you edit a label and, before closing the Edit Labels window, decide not to save your edits, press the Esc key to terminate the edits and keep the original labels.

  6. You can view all of the labels that have data within the currently selected timespan by clicking on the triangle to the right of the Labels section at the top of the Monitoring History page to expand the Labels chart.

  7. Each label appears as a timeline. Hover over a timeline to display the label name. Click on a timeline to update the view to the time span associated with the label. Selecting a timeline is functionally equivalent to selecting a label from the Label menu in that it updates the view with the start and end times in the TIME SPAN panel.

    If your labeled data has been purged from the Meters database, as the result of the retention policy or some other reason, the label will remain but there will be no data associated with that label.

  8. You can click on the label icon at the top right-hand portion of the page to create a label for the currently displayed time span. Follow the same procedure as described in steps 2 and 3 to finish creating the label.

If the data for a label does not fall within the currently displayed timespan, the label will not be displayed in the Labels chart. To display the charts for such labels, select the label from the Label pull-down menu.

Filtering Monitoring History by Resources

You can set filters to display only the stored performance metrics for selected resources. You can filter by groups and databases and, within each group, by hosts and servers. By default, the metrics for all of the resources in the cluster are displayed.

Filter types that are active for the current view have headings highlighted in blue. For example, on the Overview page, all filters are active while on the Databases Detail view, only database resources are active.

In the filters panel, you can check or uncheck a resource to display or not display the performance metrics for that resource.

To focus on the resources of interest, you can collapse a category by clicking on the triangle in the right-hand section of the panel. The number of resources in the collapsed category is displayed.

Clicking the checkmark updates the charts with the current filter settings. It does not apply any changes that may have been made to the above TIME SPAN settings.

You can mouse over the resource names in the filter list to get extra information about the resources. For example, mousing over a host name shows the number of forests associated with the host and mousing over a server name shows the server type.

Historical Performance Charts by Resource

From the Monitoring History dashboard, you can view Overview and Detailed performance metrics in graph form for each resource in the cluster. In the Overview page, the lines on a graph represent an aggregate of the metrics for all of the cluster resources of that type. In each Details page, the lines represent the metric for each specific resource in the cluster.

To view the Detail page for a resource, click on the down arrow at the upper left-hand section of the resource graph on the Overview page.

To return to the Overview page from a Detail page, click on the up arrow at the upper left-hand section of the resource graph on the Detail page.

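This section describes the Overview and Detail pages for the following resources:

  • Disk Performance Data
  • CPU Performance Data
  • Memory Performance Data
  • XDQP Server Requests Performance Data
  • Server Performance Data
  • Network Performance Data
  • Database Performance Data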

Disk Performance Data

The Overview page displays a graph of the aggregate I/O performance data for the disks used by the hosts selected in the filter.

As described in Viewing Monitoring History, you can hover on a period point to view what disk operation was taking place at that point in time. Each performance metric is described in the following table.

Metric | Description
Writes | The disk I/O performance during journal and save write operations. This is the sum of journal-write-rate, save-write-rate, and large-write-rate.
Query Traffic | The disk I/O performance during a query or queries. This is the sum of query-read-rate and large-read-rate.
Merge Reads | The disk I/O performance during a merge read operation.
Merge Writes | The disk I/O performance during a merge write operation.
Backup Reads | Throughput of reading backup data from disk, in megabytes per second.
Backup Writes | Throughput of writing data for backups, in megabytes per second.
Restore Reads | Disk read throughput for restore, in megabytes per second.
Restore Writes | Disk writing throughput for restore, in megabytes per second.
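
The two composite metrics in this table, Writes and Query Traffic, are sums of more detailed rates. The sketch below shows that arithmetic. The rate names come from the table, while the dict layout and the function name disk_overview are hypothetical.

```python
def disk_overview(rates):
    """Derive the overview Writes and Query Traffic values from detail rates.

    rates: dict of rate name -> MB/sec. Missing rates count as 0.
    """
    writes = (rates.get("journal-write-rate", 0.0)
              + rates.get("save-write-rate", 0.0)
              + rates.get("large-write-rate", 0.0))
    query_traffic = (rates.get("query-read-rate", 0.0)
                     + rates.get("large-read-rate", 0.0))
    return {"Writes": writes, "Query Traffic": query_traffic}


sample = {
    "journal-write-rate": 1.5,
    "save-write-rate": 2.0,
    "large-write-rate": 0.5,
    "query-read-rate": 3.0,
    "large-read-rate": 0.25,
}
print(disk_overview(sample))  # {'Writes': 4.0, 'Query Traffic': 3.25}
```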

Click on the arrow in the upper left-hand section of the DISKS graph in the Overview page to view charts that present more detailed disk performance metrics.

The metrics displayed by the charts on the DISKS DETAIL page are described in the following table.

Chart | Definition of Displayed Metric
Journal Write Rate | The moving average of data writes to the journal.
Save Write Rate | The moving average of data writes to in-memory stands.
Query Read Rate | The moving average of reading query data from disk.
Merge Read Rate | The moving average of reading merge data from disk.
Merge Write Rate | The moving average of writing data for merges.
Backup Read Rate | Throughput of reading backup data from disk, in megabytes per second.
Backup Write Rate | Throughput of writing data for backups, in megabytes per second.
Restore Read Rate | Disk read throughput for restore, in megabytes per second.
Restore Write Rate | Disk writing throughput for restore, in megabytes per second.
Large Binary Read Rate | The moving average of reading large documents from disk.
Large Binary Write Rate | The moving average of writing data for large documents to disk.

By default, Host data is viewed in aggregated form and must be viewed that way if multiple hosts are selected. On the DISKS DETAIL page, you can roll over any Host filter to reveal the Select and Expand button. Clicking this button deselects all of the other Hosts across all Groups and applies all pending filter changes. The expanded charts display the data for each forest on that host as a separate line in each chart.

To return to the aggregate view, click on the Aggregate button on an expanded Host. Doing so also applies all pending filter changes to the displayed charts.

CPU Performance Data

The Overview page displays a graph of the aggregate performance data for the CPUs used by the hosts selected in the filter.

As described in Viewing Monitoring History, you can hover on a period point to view what CPU operation was taking place at that point in time. Each performance metric in the CPU Overview chart is described in the following table.

Metric | Description
User | Total percentage of CPU used running user processes that are not niced.
Nice | Total percentage of CPU used running user processes that are niced.
System | Total percentage of CPU used running the operating system kernel and its processes.
I/O Wait | Total percentage of CPU time spent waiting for I/O operations to complete.
IRQ | Total percentage of CPU utilization for servicing soft interrupts.
Steal | Total percentage of CPU 'stolen' from this virtual machine by the hypervisor for other tasks (such as running another virtual machine).

Click on the arrow in the upper left-hand section of the CPU graph in the Overview page to view graphs that present more detailed CPU performance metrics. The charts on the CPU DETAIL page are described in the following table.

Chart | Description
I/O Wait | The percentage of CPU used waiting for I/O operations to complete for each host.
User | The percentage of CPU used running user processes that are not niced for each host.
System | The percentage of CPU used running the operating system kernel and its processes for each host.
Nice | The percentage of CPU used running user processes that are niced for each host.
Steal | The percentage of CPU 'stolen' from this virtual machine by the hypervisor for other tasks (such as running another virtual machine) for each host.
Idle | The percentage of CPU that is not doing any work for each host.
IRQ | The percentage of CPU servicing soft interrupts for each host.

Memory Performance Data

The Overview page displays a graph of the aggregate performance data for the Memory used by the hosts selected in the filter.

As described in Viewing Monitoring History, you can hover on a period point to view the memory metrics at that point in time. Each chart and its associated performance metrics are described in the following table.

Chart | Description
Memory Footprint | The total amount (in GB) of memory consumed by all of the hosts in the cluster. The displayed metrics are:
  • RSS: The total amount of GB of Process Resident Size (RSS) consumed by the cluster.
  • Anon: The total amount of GB of Process Anonymous Memory consumed by the cluster.
Memory Size | The amount of space forest data files take up in memory.
Memory I/O | The number of pages per second moved between memory and disk. The displayed metrics are:
  • Page-In Rate: The page-in rate (from Linux /proc/vmstat) for the cluster in pages/sec.
  • Page-Out Rate: The page-out rate (from Linux /proc/vmstat) for the cluster in pages/sec.
  • Swap-In Rate: The swap-in rate (from Linux /proc/vmstat) for the cluster in pages/sec.
  • Swap-Out Rate: The swap-out rate (from Linux /proc/vmstat) for the cluster in pages/sec.
Virtual Memory | Size of virtual memory used by different objects and processes; includes:
  • Data Files: Size of virtual memory mapped to data files.
  • Forests: Size of virtual memory used by forests.
  • Unclosed Stands: Size of virtual memory used by unclosed stands.
  • Caches: Size of virtual memory used by caches.
  • Registered Queries: Size of virtual memory used to store registered queries.
  • Joins: Size of virtual memory used for join processing.

Click on the arrow in the upper left-hand section of the MEMORY graph in the Overview page to view graphs that present more detailed MEMORY performance metrics. The charts on the MEMORY DETAIL page are described in the following table. The displayed metrics are drawn from /proc/vmstat.

Chart | Description
RSS | The amount of GB of Process Resident Size (RSS) for each host in the cluster.
Anon | The amount of GB of Process Anonymous Memory for each host in the cluster.
Process Size | The number of MB of total process memory for the MarkLogic process.
Huge Pages | The size of huge pages for the MarkLogic process in MB. Available on Linux platform. Sum of Sizes after /anon_hugepage in /proc/[MLpid]/smaps.
System Free Memory | The free system memory in MB. MemFree from /proc/meminfo on Linux, m.ullAvailPhys from GlobalMemoryStatusEx(m) on Windows.
Page-In Rate | The page-in rate (in pages/sec) for each host in the cluster.
Page-Out Rate | The page-out rate (in pages/sec) for each host in the cluster.
Swap-In Rate | The swap-in rate (in pages/sec) for each host in the cluster.
Swap-Out Rate | The swap-out rate (in pages/sec) for each host in the cluster.
Data File Memory | Size of virtual memory mapped to data files.
Forest Memory | Size of virtual memory used by forests.
Unclosed Stand Memory | Size of virtual memory used by unclosed stands.
Cache Memory | Size of virtual memory used by caches.
Registered Query Memory | Size of virtual memory used to store registered queries.
Join Memory | Size of virtual memory used for join processing.
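
The paging and swapping rates are described as coming from /proc/vmstat on Linux, where the counters are cumulative since boot. A per-second rate therefore has to be derived from two samples. The sketch below shows one way to do that. It is not MarkLogic's implementation: the use of the pswpin and pswpout counters, the 60-second interval, and both function names are assumptions. The sample text is inline so that the sketch runs on any platform.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def rates_per_second(before, after, elapsed_seconds, names):
    """Convert cumulative counters from two samples into per-second rates."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return {n: (after[n] - before[n]) / elapsed_seconds for n in names}


# Two hypothetical samples taken 60 seconds apart.
first = parse_vmstat("pswpin 1200\npswpout 3000\n")
second = parse_vmstat("pswpin 1800\npswpout 3300\n")
print(rates_per_second(first, second, 60, ["pswpin", "pswpout"]))
# {'pswpin': 10.0, 'pswpout': 5.0}
```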

XDQP Server Requests Performance Data

The Overview page displays a graph of the aggregate performance data for the XDQP Server Requests processed by the hosts selected in the filter.

Each chart and associated performance metrics are described in the following table.

Chart | Description
XDQP Server Request Rate | Number of XDQP requests processed per second.
XDQP Server Request Time | Average response time to XDQP requests from other nodes.

Click on the arrow in the upper left-hand section of the XDQP SERVER REQUESTS graph in the Overview page to view graphs that present more detailed performance metrics. The charts on the XDQP SERVER REQUESTS DETAIL page are described in the following table.

Chart | Description
XDQP Server Request Rate | Number of XDQP requests processed per second.
XDQP Server Request Time | Average response time to XDQP requests from other nodes.

Server Performance Data

The Overview page displays graphs of the aggregate performance data for the App Servers selected in the filter.

The Overview page displays the charts described in the following table.

Chart | Description
App Server Request Rate | The total number of requests being processed per second, across all of the App Servers.
App Server Latency | The average time (in seconds) it takes to process queries, across all of the App Servers.
Task Server Queue Size | The number of tasks in the Task Server queue.
Expanded Tree Cache Hits/Misses | The number of times per second that queries could use (Hits) and could not use (Misses) the expanded tree cache.

With the exception of the Task Server Queue Size chart, which only displays the queue size for the one task server, the color-coded metrics for the server charts are as shown in the following table.

Metric | Description
HTTP | The metrics for the HTTP servers.
ODBC | The metrics for the ODBC servers.
WebDAV | The metrics for the WebDAV servers.
XDBC | The metrics for the XDBC servers.
Task | The metrics for the Task server.

Click on the arrow in the upper left-hand section of the SERVERS graph in the Overview page to view graphs that present more detailed performance metrics for each App Server. The charts displayed on the SERVERS DETAIL page are described in the following table.

If there are multiple groups defined, server names have the group that they are associated with in square brackets in the legend and rollovers.

The number of servers displayed out of the number of servers of each type in the cluster (for example, HTTP) is shown in the upper right-hand section of each server type group.

The following detailed charts are displayed for each type of App Server:

Chart | Description
Request Rate | The number of queries being processed per second by each App Server.
Latency | The average time it takes each App Server to process queries.
Expanded Tree Cache Rate Hits | The number of times queries could use the expanded tree cache on each App Server.
Expanded Tree Cache Rate Misses | The number of times queries could not use the expanded tree cache on each App Server.
Queue Size (Task Server only) | The number of tasks in the Task Server queue on each host.
Send Rate (for any type of App Server except Task Server) | Throughput of application servers of that type sending data, in megabytes per second.
Receive Rate (for any type of App Server except Task Server) | Throughput of application servers of that type receiving data, in megabytes per second.

Network Performance Data

The network performance data graphs display performance in terms of XDQP reads and writes. XDQP is the protocol MarkLogic uses for internal host-to-host communication on port 7999.

The Overview page displays XDQP performance metrics as the sum of XDQP activity across the cluster. High XDQP rates are usually not an issue unless they are high enough to saturate your internal network. Higher usage occurs during data load and query execution. Merges do not involve XDQP.

If XDQP is excessively high during loads, running the MarkLogic Content Pump (mlcp) with fast forest placement will minimize XDQP communication needs. For details on the MarkLogic Content Pump, see Loading Content Using MarkLogic Content Pump in the Loading Content Into MarkLogic Server Guide.

The Overview page displays a chart with the metrics described in the following table.

Metric | Description
Network | The network traffic between nodes in the cluster. Heavy queries or ingestion will create a spike. The displayed metrics are:
  • XDQP Read: The total volume of all XDQP reads between hosts in the cluster. This is the sum of xdqp-client-receive-rate and xdqp-server-receive-rate.
  • XDQP Write: The total volume of all XDQP writes between hosts in the cluster. This is the sum of xdqp-client-send-rate and xdqp-server-send-rate.
  • Foreign XDQP Read: The total volume of all XDQP reads by the hosts in the cluster from a foreign cluster. This is the sum of foreign-xdqp-client-receive-rate and foreign-xdqp-server-receive-rate.
  • Foreign XDQP Write: The total volume of all XDQP writes by the hosts in the cluster to a foreign cluster. This is the sum of foreign-xdqp-client-send-rate and foreign-xdqp-server-send-rate.
External KMS Request Rate | Number of requests per second to the external key management server.
External KMS Request Time | Average round-trip time for a request to an external key management server.
LDAP Request Rate | Number of requests per second to the LDAP server.
LDAP Request Time | Average round-trip time for a request to an LDAP server.

Click on the arrow in the upper left-hand section of the NETWORK graph in the Overview page to view graphs that present more detailed performance metrics for each host in the cluster. The charts displayed on the NETWORK DETAIL page are described in the following table.

Chart | Description
XDQP Read Rate | The amount of data (in MB/sec) read over XDQP by each host in the cluster. This is the sum of xdqp-client-receive-rate and xdqp-server-receive-rate.
XDQP Write Rate | The amount of data (in MB/sec) written over XDQP by each host in the cluster. This is the sum of xdqp-client-send-rate and xdqp-server-send-rate.
XDQP Read Load | The execution time (in seconds) of read requests by each host in the cluster. This is the sum of xdqp-client-receive-load and xdqp-server-receive-load.
XDQP Write Load | The execution time (in seconds) of write requests by each host in the cluster. This is the sum of xdqp-client-send-load and xdqp-server-send-load.
Foreign XDQP Read Rate | The amount of data (in MB/sec) read over XDQP by each host in the cluster from a foreign cluster. This is the sum of foreign-xdqp-client-receive-rate and foreign-xdqp-server-receive-rate.
Foreign XDQP Write Rate | The amount of data (in MB/sec) written over XDQP by each host in the cluster to a foreign cluster. This is the sum of foreign-xdqp-client-send-rate and foreign-xdqp-server-send-rate.
Foreign XDQP Read Load | The execution time (in seconds) of read requests by each host in the cluster from a foreign cluster. This is the sum of foreign-xdqp-client-receive-load and foreign-xdqp-server-receive-load.
Foreign XDQP Write Load | The execution time (in seconds) of write requests by each host in the cluster to a foreign cluster. This is the sum of foreign-xdqp-client-send-load and foreign-xdqp-server-send-load.
External KMS Request Rate | Number of requests per second to the external key management server.
External KMS Request Time | Average round-trip time for a request to an external key management server.
LDAP Request Rate | Number of requests per second to the LDAP server.
LDAP Request Time | Average round-trip time for a request to an LDAP server.
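
The relationship between the per-host XDQP rates and the cluster-wide values on the Overview page can be sketched as follows. The sums follow the definitions given for XDQP Read and XDQP Write in the Overview table (client plus server receive rates, and client plus server send rates). The HostXdqp class, its field names, and the cluster_xdqp function are hypothetical illustrations rather than part of any MarkLogic API.

```python
from dataclasses import dataclass


@dataclass
class HostXdqp:
    """Per-host XDQP rates in MB/sec (field names mirror the metric names)."""
    client_receive_rate: float = 0.0
    server_receive_rate: float = 0.0
    client_send_rate: float = 0.0
    server_send_rate: float = 0.0

    @property
    def read_rate(self):
        # xdqp-client-receive-rate + xdqp-server-receive-rate
        return self.client_receive_rate + self.server_receive_rate

    @property
    def write_rate(self):
        # xdqp-client-send-rate + xdqp-server-send-rate
        return self.client_send_rate + self.server_send_rate


def cluster_xdqp(hosts):
    """Sum the per-host read and write rates across the cluster."""
    return {
        "XDQP Read": sum(h.read_rate for h in hosts),
        "XDQP Write": sum(h.write_rate for h in hosts),
    }


hosts = [
    HostXdqp(1.0, 2.0, 0.5, 0.5),
    HostXdqp(3.0, 1.0, 1.5, 2.5),
]
print(cluster_xdqp(hosts))  # {'XDQP Read': 7.0, 'XDQP Write': 5.0}
```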

Database Performance Data

The Overview page displays graphs of the aggregate performance data for all of the databases in the cluster.

The following table describes the charts displayed in the Databases section of the Overview page.

Chart Description
Fragments

Displays the aggregate number of fragments in all of the databases in the cluster.

The displayed lines are:

  • Active Fragments: The fragments available to queries.
  • Deleted Fragments: The fragments to be deleted during the next merge operation.
Storage Footprint

The total disk capacity (in GBs) used by all of the databases in the cluster.

The displayed lines are:

  • Data Size: The amount of data in the forest data directories.
  • Fast Data Size: The amount of data in the forest fast data directories.
  • Large Data Size: The amount of data in the forest large data directories.
Lock Rate

The number of locks set per second across all of the databases in the cluster.

The displayed lines are:

  • Read: The number of read locks set per second.
  • Write: The number of write locks set per second.
  • Deadlock: The number of deadlocks per second.
Lock Wait Load

The aggregate time (in seconds) transactions wait for locks;

The displayed lines are:

  • Read: The time transactions wait for read locks.
  • Write: The time transactions wait for write locks.
Lock Hold Load

The aggregate time (in seconds) locks are held.

The displayed lines are:

  • Read: The time read locks are held.
  • Write: The time write locks are held.
Deadlock Wait Load The aggregate time (in seconds) deadlocks remain unresolved.
Database Replication

The amount of data (in MB per second) sent by and received from this cluster and foreign clusters.

The displayed lines are:

  • Database Replication Send: The amount of data sent to foreign clusters.
  • Database Replication Receive: The amount of data received from foreign clusters.
List Cache Hits/Misses

The number of times per second that queries could use (Hits) and could not use (Misses) the list cache.

The displayed lines are:

  • List Cache Hit Rate: The average number of hits on the list cache.
  • List Cache Miss Rate: The average number of misses on the list cache.
Compressed Tree Cache Hits/Misses

The number of times per second that queries could use (Hits) and could not use (Misses) the compressed tree cache.

The displayed lines are:

  • Compressed Tree Cache Hit Rate: The average number of hits on the compressed tree cache.
  • Compressed Tree Cache Miss Rate: The average number of misses on the compressed tree cache.
Triple Cache Hits/Misses

The number of times per second that queries could use (Hits) and could not use (Misses) the triple cache.

The displayed lines are:

  • Triple Cache Hit Rate: The average number of hits on the triple cache.
  • Triple Cache Miss Rate: The average number of misses on the triple cache.
Triple Value Cache Hits/Misses

The number of times per second that queries could use (Hits) and could not use (Misses) the triple value cache.

The displayed lines are:

  • Triple Value Cache Hit Rate: The average number of hits on the triple value cache.
  • Triple Value Cache Miss Rate: The average number of misses on the triple value cache.
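
The hit and miss rates above are often easier to compare as a single hit ratio, hits / (hits + misses). The following Python sketch shows that arithmetic. The ratio is a derived figure, not a chart provided by Monitoring History, and the function name and sample readings are hypothetical values chosen for illustration.

```python
from typing import Optional


def cache_hit_ratio(hit_rate: float, miss_rate: float) -> Optional[float]:
    """Return hits / (hits + misses), or None when there were no lookups."""
    total = hit_rate + miss_rate
    if total == 0:
        return None
    return hit_rate / total


# Hypothetical readings as (hits per second, misses per second).
# These are illustrative numbers, not real measurements.
readings = {
    "list cache": (900.0, 100.0),
    "compressed tree cache": (450.0, 50.0),
    "triple cache": (0.0, 0.0),
}

ratios = {name: cache_hit_ratio(h, m) for name, (h, m) in readings.items()}

for name, ratio in ratios.items():
    label = "no lookups" if ratio is None else f"{ratio:.1%}"
    print(f"{name}: {label}")
```

Returning None for a cache with no lookups keeps an idle cache from being reported as a 0% hit ratio.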

Click on the arrow in the upper left-hand section of the DATABASES graph in the Overview page to view graphs that present more detailed performance metrics for each database. The charts displayed on the DATABASES DETAIL page are described in the following table. The metrics for each database in the cluster are displayed as a separate line.

Chart: Description
Active Fragments: The number of active fragments (the fragments available to queries) in each database.
Deleted Fragments: The number of deleted fragments (the fragments to be removed by the next merge operation) in each database.
Data Size: The amount of data in the data directories of the forests attached to each database.
Fast Data Size: The amount of data in the fast data directories of the forests attached to each database.
Large Data Size: The amount of data in the large data directories of the forests attached to each database.
Read Lock Rate: The number of read locks set per second on each database.
Write Lock Rate: The number of write locks set per second on each database.
Deadlock Rate: The number of deadlocks per second on each database.
Read Lock Wait Load: The time (in seconds) transactions wait for read locks on each database.
Write Lock Wait Load: The time (in seconds) transactions wait for write locks on each database.
Deadlock Wait Load: The aggregate time (in seconds) deadlocks remain unresolved on each database.
Read Lock Hold Load: The time (in seconds) read locks are held on each database.
Write Lock Hold Load: The time (in seconds) write locks are held on each database.
Database Replication Send Rate: The amount of replication data (in MB per second) sent by each database to foreign clusters.
Database Replication Receive Rate: The amount of replication data (in MB per second) received by each database from foreign clusters.
Database Replication Send Load: The time (in seconds) it takes each database to send replication data to foreign clusters.
Database Replication Receive Load: The time (in seconds) it takes each database to receive replication data from foreign clusters.
Database Replication Lag: The time (in seconds) by which the replica database lags behind the master database.
List Cache Hit Rate: The average number of times per second that queries could use the list cache (hits).
List Cache Miss Rate: The average number of times per second that queries could not use the list cache (misses).
Compressed Tree Cache Hit Rate: The average number of times per second that queries could use the compressed tree cache (hits).
Compressed Tree Cache Miss Rate: The average number of times per second that queries could not use the compressed tree cache (misses).
Triple Cache Hit Rate: The average number of times per second that queries could use the triple cache (hits).
Triple Cache Miss Rate: The average number of times per second that queries could not use the triple cache (misses).
Triple Value Cache Hit Rate: The average number of times per second that queries could use the triple value cache (hits).
Triple Value Cache Miss Rate: The average number of times per second that queries could not use the triple value cache (misses).
Reindex Refragment Rate: The rate of reindexing and refragmenting.
Rebalance Rate: The rate of rebalancing.
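
One way to read the Active Fragments and Deleted Fragments charts together is to compute the share of fragments in each database that are awaiting removal by a merge. The following Python sketch illustrates the calculation. The database names, fragment counts, and the 25% threshold are all hypothetical; Monitoring History does not define or report such a threshold.

```python
def deleted_fraction(active: int, deleted: int) -> float:
    """Return deleted / (active + deleted), or 0.0 for an empty database."""
    total = active + deleted
    return deleted / total if total else 0.0


# Hypothetical (active, deleted) fragment counts per database.
fragments = {
    "Documents": (950_000, 50_000),
    "Meters": (120_000, 80_000),
}

# Arbitrary illustrative threshold, not a MarkLogic recommendation.
THRESHOLD = 0.25

flagged = sorted(
    name
    for name, (active, deleted) in fragments.items()
    if deleted_fraction(active, deleted) > THRESHOLD
)

for name, (active, deleted) in fragments.items():
    print(f"{name}: {deleted_fraction(active, deleted):.1%} deleted")
print("Above threshold:", flagged)
```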

Exporting and Printing Monitoring History

You can export and print your monitoring history data.

To export the monitoring history data to an Excel spreadsheet file, click Export in the upper-right portion of the Monitoring History page.

The metrics are displayed in separate tabs at the bottom of the spreadsheet.

To print the charts displayed on the current page, click Print. This opens the print dialog, from which you can print the charts.

Using Ops Director to Monitor History

You can also use Ops Director to monitor MarkLogic Server. For more information, see the Ops Director Guide.
