
MarkLogic 10 Product Documentation
Installation Guide for All Platforms
— Chapter 1

Requirements and Database Compatibility

This chapter introduces MarkLogic Server, lists the product requirements and supported platforms, and describes database compatibility with previous releases.

Introduction

MarkLogic Server is a powerful NoSQL database for harnessing your digital content base, complete with enterprise features demanded by real-world, mission-critical applications. MarkLogic enables you to build complex applications that interact with large volumes of content in JSON, XML, SGML, HTML, and other popular content formats, as well as binary formats. The unique architecture of MarkLogic ensures that your applications are both scalable and high-performance, delivering query results at search-engine speeds while providing transactional integrity over the underlying content repository. MarkLogic can be configured for a distributed environment, enabling you to scale your infrastructure through hardware expansion.

This installation guide explains the procedures needed to install MarkLogic on your system. It is intended for a technical audience. This document explains only how to install the software, not how to use it. To learn how to get started using the software, see the rest of the MarkLogic documentation, available on docs.marklogic.com.

MarkLogic Server Assumptions

When MarkLogic installs, it sets memory and other settings based on the characteristics of the computer on which it is running. MarkLogic is a scalable, multi-threaded server product, and as such it assumes that it has the entire machine available to it, including the CPU and disk I/O capacity. It is important to follow the guidelines in this chapter. Furthermore, MarkLogic assumes that only one MarkLogic Server process runs on any given machine, so running multiple instances of MarkLogic on a single machine is not recommended.

MarkLogic Server expects the system clocks to be synchronized across all the nodes in a cluster. The clock skew should be less than 0.5 seconds. You should use a time service such as NTP to keep your system clocks synchronized. For more details, see the following Knowledge Base article:

https://help.marklogic.com/knowledgebase/article/View/24/15/synchronizing-system-clocks-in-a-cluster
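As a rough illustration of the 0.5 second limit, the following sketch compares two timestamps (seconds since the epoch, with fractions) taken on two nodes at nearly the same moment. The function name clock_skew_ok is hypothetical and is not part of MarkLogic. How you collect the timestamps is up to you; the comment assumes GNU date, which supports the %N format. The check is only as good as the simultaneity of the two samples, so it does not replace a time service such as NTP.

```shell
#!/bin/sh
# Hypothetical helper, not a MarkLogic tool.
# Succeeds (exit status 0) when two epoch timestamps differ by less than 0.5 seconds.
clock_skew_ok() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    d = a - b
    if (d < 0) d = -d        # absolute difference in seconds
    exit !(d < 0.5)          # exit 0 when the skew is within the limit
  }'
}

# Example (assumes GNU date on both hosts; capture the values as close together as possible):
#   t1=$(date +%s.%N)                    # on node 1
#   t2=$(ssh node2 date +%s.%N)          # on node 2; "node2" is a placeholder host name
#   clock_skew_ok "$t1" "$t2" && echo "skew within 0.5 s"
```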

Memory, Disk Space, and Swap Space Requirements

Before installing the software, be sure that your system meets the following requirements:

  • For a production deployment, MarkLogic recommends at least 8 vCPUs per host, with 8 GB of memory per vCPU. For example, for a production host with 16 vCPUs, the recommended memory is at least 128 GB. For bare-metal systems, a hardware thread (hyperthread) is equivalent to a vCPU. Use memory-optimized cloud compute instances or virtual machines. Memory requirements may increase over time as projects evolve and databases grow with more content and more indexes. See comment [1] in the following table.
  • For a prototyping or development deployment, MarkLogic requires a minimum of 4 GB of system memory and recommends at least 8 GB of memory. See comment [1] in the following table.
  • For small forests that will not grow, such as Security and Schemas, the reserve size is two times the size of the forest.

    For data forests, we recommend that you target a size of 500 GB, where 400 GB is allocated to content, and 100 GB is left as reserved space to handle merges. See comment [2] in the following table for details about this storage calculation.

  • On Linux systems, you need at least as much swap space as the amount of physical memory on the machine or 32 GB, whichever is lower. MarkLogic also recommends setting Linux Huge Pages on Red Hat Enterprise Linux 7 or 8 systems to 3/8 the size of your physical memory. For details on setting up Huge Pages, see https://access.redhat.com/solutions/1578873 on the Red Hat website. (Note: This is Subscriber Exclusive Content.)

    If you have Huge Pages set up on a Linux system, your swap space on that machine must be at least the size of your physical memory minus the size of your Huge Pages (because Linux Huge Pages are not swapped), or 32 GB, whichever is lower. For example, if you have 48 GB of physical memory and Huge Pages set to 18 GB, then you need swap space of 30 GB (48 - 18).

    At system startup on Linux machines, MarkLogic Server logs a message to the ErrorLog.txt file showing the Huge Page size, and the message indicates if the size is below the recommended level.

    If you are using Red Hat Enterprise Linux 7, you must turn off Transparent Huge Pages (Transparent Huge Pages are configured automatically by the operating system). For details on disabling Transparent Huge Pages, see https://kb.informatica.com/solution/23/PublishingImages/Disable%20Transparent%20Huehpages%20on%20Linux%207.pdf or see the Red Hat instructions for how to disable transparent huge pages.

  • On Windows systems, MarkLogic recommends a swap (page) file of twice the physical memory. You configure this in the System Control Panel > Advanced system settings > Performance Settings > Advanced tab. Set the Virtual memory settings on that tab to twice your physical memory.

    No.  Comment

    [1]  MarkLogic automatically configures itself to reserve as much system memory as it can the first time it runs. If you need to change the default configuration, you can manually override these defaults later using the Admin Interface.

    [2]  For content forests that are expected to grow over time, with the default merge settings, you need to reserve 100 GB of storage. Here is the calculation:

         You need free space of at least 2 times the merge max size per forest, regardless of the forest size. Therefore, with the default merge max size of 48 GB, you need at least 96 GB of free space. Additionally, if the journal space is not yet allocated, you need free disk space of 2 times the journal size. Therefore, to be safe, you need 100 GB of free space for each content forest.
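
The sizing rules in this section reduce to simple arithmetic. The following is a minimal sketch of that arithmetic in whole gigabytes; the function names are hypothetical, and the functions only restate the numbers given above (8 GB per vCPU, Huge Pages at 3/8 of physical memory, Linux swap capped at 32 GB, and 2 times the merge max size of free space). Shell integer division truncates, so results for memory sizes that are not multiples of 8 are rounded down.

```shell
#!/bin/sh
# Hypothetical sizing helpers; all inputs and outputs are whole gigabytes.

# Recommended production memory: 8 GB per vCPU.
recommended_memory_gb() {
  echo $(( $1 * 8 ))
}

# Recommended Linux Huge Pages size: 3/8 of physical memory (truncated).
huge_pages_gb() {
  echo $(( $1 * 3 / 8 ))
}

# Linux swap requirement: physical memory minus Huge Pages, or 32 GB, whichever is lower.
# $1 = physical memory, $2 = Huge Pages size (optional, defaults to 0).
required_swap_gb() {
  physical=$1
  huge=${2:-0}
  need=$(( physical - huge ))
  if [ "$need" -gt 32 ]; then
    need=32
  fi
  echo "$need"
}

# Minimum free space per content forest for merges: 2 times the merge max size.
# $1 = merge max size (optional, defaults to the 48 GB default mentioned above).
merge_free_space_gb() {
  merge_max=${1:-48}
  echo $(( 2 * merge_max ))
}
```

For the examples given above, recommended_memory_gb 16 prints 128, huge_pages_gb 48 prints 18, required_swap_gb 48 18 prints 30, and merge_free_space_gb prints 96. The 100 GB guidance adds headroom for journals on top of the 96 GB figure.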

Supported Platforms

MarkLogic Server is supported on the following platforms:

Platform and comment:

  • Microsoft Windows Server 2019, Microsoft Windows Server 2016, and Microsoft Windows Server 2012 (x64)

    Microsoft Windows Server supports the Open Neural Network Exchange format (ONNX) for machine learning. To use the ONNX APIs and obtain the required Machine Learning libraries, download the GPU-enabled version of MarkLogic Server for Windows.

  • Microsoft Windows 10 (x64) Desktop

    Microsoft Windows 10 (x64) is supported for development only. Use Windows Server for production.

  • Mac OS X 10.14 or later

    Mac OS X is supported for development only. Conversion (Office and PDF) and entity enrichment are not available on Mac OS X. A 64-bit capable processor is required (http://support.apple.com/kb/HT3696).

    The Apple M1 chip is not supported.

  • Docker

    Docker is supported for development only. Run one Docker container per host. For more details, see https://developer.marklogic.com/code/docker/.

  • Red Hat Enterprise Linux 7 (x64), Red Hat Enterprise Linux 8 (x64), CentOS 7 (x64), CentOS 8 (x64), Amazon Linux 1 (x64), and Amazon Linux 2 (x64)

    Red Hat Enterprise Linux 7 (x64) and CentOS 7 (x64) are supported on VMware ESXi 6.0 and Kernel-based Virtual Machine. Starting with MarkLogic 10.0-2, Red Hat Enterprise Linux 8 (x64) and CentOS 8 (x64) are also supported.

    CentOS 7 and 8 (x64) are supported on the Azure platform.

    All Linux platforms support the Open Neural Network Exchange format (ONNX) for machine learning. To use the ONNX APIs and obtain the required Machine Learning libraries, download MarkLogic Server for Linux.

    One of the none, deadline, mq-deadline, or kyber I/O schedulers is required to ensure efficient disk I/O for MarkLogic Server on Linux. When configuring an I/O scheduler with SSDs in a virtualized environment (including any cloud-based virtual machines), set the OS I/O scheduler to none for 4.x kernels, or to noop or none for older 3.x kernels. For more details, see http://help.marklogic.com/Knowledgebase/Article/View/8/0/notes-on-io-schedulers, https://lonesysadmin.net/2013/12/06/use-elevator-noop-for-linux-virtual-machines/, and https://access.redhat.com/solutions/5427.

    For a list of packages required for each Linux platform, see Appendix: Packages by Linux Platform.

    MarkLogic now supports the 1-Click AWS option in AWS Marketplace. Because of this, the published MarkLogic AMIs will have a data volume predefined.
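
To check the Linux I/O scheduler requirement described above, you can read the scheduler file that the kernel exposes under /sys/block for each block device; the active scheduler is the name shown in square brackets. The following is a minimal sketch with hypothetical function names. It accepts only the four schedulers named in the requirement and does not cover the noop case for older 3.x kernels. The device name sda in the comment is a placeholder.

```shell
#!/bin/sh
# Hypothetical helpers, not part of MarkLogic.

# Print the active scheduler from the contents of a queue/scheduler file,
# for example "[mq-deadline] kyber bfq none" -> "mq-deadline".
# If no bracketed name is present, the input is printed unchanged.
active_scheduler() {
  case $1 in
    *\[*\]*)
      s=${1#*\[}            # drop everything up to and including "["
      echo "${s%%\]*}"      # drop everything from "]" onward
      ;;
    *)
      echo "$1"
      ;;
  esac
}

# Succeed when the given scheduler name is one of those named in the requirement.
is_supported_scheduler() {
  case $1 in
    none|deadline|mq-deadline|kyber) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (sda is a placeholder device name):
#   current=$(active_scheduler "$(cat /sys/block/sda/queue/scheduler)")
#   is_supported_scheduler "$current" || echo "unsupported scheduler: $current"
```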

Supported Filesystems

MarkLogic relies on the operating system for filesystem operations. While any filesystem that works properly (including under heavy load) should work, the following table lists the operating systems along with the filesystems under which they are supported. Other filesystems may work but have not been thoroughly tested by MarkLogic.

Operating system and supported filesystems:

  • Linux (all varieties)

    XFS (recommended), EXT3, and EXT4, as well as the clustered filesystems for shared-disk failover mentioned in Requirements for Shared-Disk Failover in the Scalability, Availability, and Failover Guide.

    Do not use data=writeback with EXT3 and EXT4 filesystems.

    NAS is supported on Red Hat Enterprise Linux 7 and NetApp.

  • Windows: NTFS
  • Mac OS: HFS+
  • All: Amazon S3 (no journaling with S3)

The Solaris OS is not supported for MarkLogic 10.

Java Virtual Machine Requirements

MarkLogic Server can function with or without a Java Virtual Machine (JVM). MarkLogic Server itself requires an installed JVM only if you use HDFS (Hadoop Distributed File System).

The Amazon AMIs provided by MarkLogic have a JDK pre-installed, which is used during the MarkLogic bootstrap process to set up and configure MarkLogic in the Amazon environment. Therefore, you do not need to install a JVM on any EC2 instance.

The following MarkLogic products and features require a JVM to either run or install:

MarkLogic supports the Java 8, 9, 10, and 11 versions of the following JVMs:

  • Oracle/Sun
  • OpenJDK

    The IBM JRE is not supported.

By default, MarkLogic looks for Java in the location specified via the JAVA_HOME environment variable or in a specific set of default locations. If JAVA_HOME is not set in the startup environment, MarkLogic uses the first JRE or JDK found in one of the following locations. These locations are searched in the order listed.

  • /usr/java/default
  • /usr/java/latest
  • /usr/java/jdk1.N* where N is a supported Java version. For example, /usr/java/jdk1.7.0_79 qualifies if Java 7 is a supported Java version.
  • /usr/lib/jvm/java
  • /usr/lib/jvm/java-openjdk
  • /usr/lib/jvm/jre-1.N.0-*.x86_64 where N is a supported Java version, such as Java 8.
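
The search order above can be approximated with a short script, which may help you predict which Java installation will be picked up on a given host. This is only a sketch of the documented order, not the logic MarkLogic actually runs. The function name find_java_home and its two optional arguments (a root prefix for trying the search against a scratch directory tree, and the Java version N, defaulting to 8) are assumptions made for illustration. The sketch checks only that a directory exists, not that it contains a working JRE or JDK.

```shell
#!/bin/sh
# Hypothetical helper that mimics the documented lookup order.
# $1 = optional root prefix (default: empty, meaning the real filesystem)
# $2 = optional Java version N used in the jdk1.N* and jre-1.N.0-* patterns (default: 8)
find_java_home() {
  root=${1:-}
  n=${2:-8}

  # JAVA_HOME, when set, takes precedence over the default locations.
  if [ -n "${JAVA_HOME:-}" ]; then
    echo "$JAVA_HOME"
    return 0
  fi

  # Default locations, in the order listed above. Unmatched glob patterns
  # stay literal and simply fail the directory test.
  for candidate in \
    "$root/usr/java/default" \
    "$root/usr/java/latest" \
    "$root"/usr/java/jdk1."$n"* \
    "$root/usr/lib/jvm/java" \
    "$root/usr/lib/jvm/java-openjdk" \
    "$root"/usr/lib/jvm/jre-1."$n".0-*.x86_64
  do
    if [ -d "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done

  return 1    # nothing found
}
```

Calling find_java_home with no arguments searches the real filesystem and prints the first match, or returns a non-zero status if none of the locations exist.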

If you have Java installed in a different location, you can communicate your JAVA_HOME to MarkLogic through the file /etc/marklogic.conf. For example:

# Creates or overwrites /etc/marklogic.conf; run as root.
cat > /etc/marklogic.conf <<'EOF'
export JAVA_HOME=/path/to/your/jdk
EOF

Upgrades and Database Compatibility

MarkLogic 10 supports upgrading databases from MarkLogic 7, or from MarkLogic 8 or later. If you are upgrading from an earlier version of MarkLogic Server, you must first upgrade to MarkLogic 7 or MarkLogic 8 before moving to MarkLogic 10. For the upgrade procedure, see Upgrading from Previous Releases.

During the upgrade, the Security database, the Schemas database, and the configuration files are automatically upgraded. The Security database is upgraded with the latest execute privileges and the Schemas database is upgraded with the latest version of the Schemas used by MarkLogic Server. The upgrade occurs as part of the installation procedure.

Databases that contain your own content are also upgraded to work with MarkLogic 10; once you upgrade to MarkLogic 10, you will no longer be able to use that database with previous versions of MarkLogic.

MarkLogic Corporation strongly recommends performing a backup of your databases before upgrading to MarkLogic 10. Additionally, MarkLogic Corporation recommends that you first upgrade to the latest maintenance release of the major version of MarkLogic you are running before upgrading to MarkLogic 10.

For the procedure for upgrading to MarkLogic 10, see Upgrading from Previous Releases. For details about known incompatibilities between MarkLogic 7 or MarkLogic 8 and MarkLogic 10, see Known Incompatibilities with Previous Releases in the Release Notes.
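
The upgrade rule stated at the start of this section can be summarized as a small decision function. This is an illustrative sketch only; the function name upgrade_path and its output labels are hypothetical, and the function takes just the major version number of the release you are currently running.

```shell
#!/bin/sh
# Hypothetical helper: classify the upgrade path to MarkLogic 10
# from the major version currently installed.
upgrade_path() {
  major=$1
  if [ "$major" -ge 10 ]; then
    echo "current"          # already on MarkLogic 10
  elif [ "$major" -ge 7 ]; then
    echo "direct"           # MarkLogic 7, 8, or 9: upgrade straight to 10
  else
    echo "intermediate"     # earlier releases: upgrade to 7 or 8 first
  fi
}
```

For example, upgrade_path 6 prints intermediate, and upgrade_path 8 prints direct.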

This section contains database compatibility information between various releases, and includes the following sections.

Prerequisites for Application Services Portion of the Upgrade

When upgrading from releases prior to MarkLogic 7 to MarkLogic 10, the upgrade reconfigures the Docs and App Services App Servers, which by default are on port 8000 and port 8002 in older releases. In order for those App Servers to be upgraded, the following conditions must be met:

  • Either no App Server is running on port 8000 or the App Server on port 8000 has a root of Docs/.
  • Either no App Server is running on port 8002 or the App Server on port 8002 has a root of Apps/ or Apps/appbuilder.

If the above conditions are met, then those App Servers are reconfigured during the MarkLogic 10 upgrade and the resulting configurations have the following settings:

App-Services App Server:

  • Port: 8000
  • Name: App-Services
  • Root: Apps/
  • Error Handler: error-handler.xqy
  • URL Rewriter: rewriter.xqy
  • Database: App-Services

Manage App Server:

  • Port: 8002
  • Name: Manage
  • Root: Apps/
  • Error Handler: manage/error-handler.xqy
  • URL Rewriter: manage/rewriter.xqy
  • Database: App-Services
  • Privilege: manage

If the conditions are not met, then the upgrade logs an error to the ErrorLog.txt file and the Application Services portion of the upgrade is skipped. MarkLogic Server will still operate, but you will not be able to use Query Console, the Management API, and the rest of the Application Services features. To restore the Application Services functionality after a failed upgrade, create two App Servers with the preceding configurations. If you have any problems and you have an active maintenance contract, you can contact MarkLogic Technical Support for help.
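
The two preconditions listed earlier in this section can be expressed as a simple check that you might run against your own inventory of App Servers before upgrading. This is a sketch under stated assumptions: the function name app_server_upgradable is hypothetical, and you supply the port and root values yourself (for example, from the Admin Interface); the script does not query MarkLogic. An empty root means that no App Server is running on that port.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the App Server on the given port
# satisfies the documented precondition for the Application Services upgrade.
# $1 = port (8000 or 8002), $2 = App Server root, or empty if no server uses the port.
app_server_upgradable() {
  port=$1
  root=${2:-}

  # No App Server on the port always satisfies the condition.
  if [ -z "$root" ]; then
    return 0
  fi

  case "$port:$root" in
    8000:Docs/) return 0 ;;
    8002:Apps/|8002:Apps/appbuilder) return 0 ;;
    *) return 1 ;;
  esac
}

# Example:
#   app_server_upgradable 8000 "Docs/" && app_server_upgradable 8002 "Apps/" \
#     && echo "Application Services upgrade preconditions are met"
```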

Compatibility of MarkLogic 10 Databases With MarkLogic 7 and 8

MarkLogic 10 does not require a reindex from MarkLogic 7 or MarkLogic 8 databases. Therefore, if you are upgrading from MarkLogic 7 or MarkLogic 8, the database will not reindex, even if reindex enable is set to true.

MarkLogic Converters Installation Changes Starting at Release 9.0-4

MarkLogic converters are used to convert Microsoft Office Word, Excel, and PowerPoint documents, as well as Adobe PDF files, to XHTML. MarkLogic filters are used to filter a variety of document formats, extract metadata and text from them, and return XHTML. The following MarkLogic XQuery API functions, described in the MarkLogic XQuery and XSLT Function Reference, provide this functionality:

xdmp:word-convert
xdmp:excel-convert
xdmp:powerpoint-convert
xdmp:pdf-convert
xdmp:document-filter

Converters/filters are also used as part of the conversion pipeline in the Content Processing Framework. For more details, see The Default Conversion Option in the Content Processing Framework Guide.

Prior to MarkLogic release 9.0-4, converters/filters were bundled and automatically installed with MarkLogic Server. Starting at MarkLogic release 9.0-4, converters/filters are offered as a separate package, MarkLogic Converters.

This change provides better flexibility and enables you to install/uninstall MarkLogic converters/filters separately from MarkLogic Server.

With this change, MarkLogic Server does not include MarkLogic Converters. To use converters/filters, install both packages: MarkLogic Server and MarkLogic Converters. Any attempt to use converters/filters on a MarkLogic node that does not have MarkLogic Converters installed throws an XDMP-CVTNOTFOUND error.

The version of MarkLogic Converters is synchronized with the version of MarkLogic Server. For example, MarkLogic Converters 9.0-4 corresponds to MarkLogic Server 9.0-4 and may be installed with it.

You can obtain the version of MarkLogic Converters installed on a node by calling the MarkLogic server-side API function xdmp:host-status and examining the value of the converters-version element in the response. If the converters package is not installed on a node, the converters-version element is empty.

MarkLogic Converters packages for all supported platforms are available for download at the same location where MarkLogic Server packages are available, namely at http://developer.marklogic.com.

If you want to use the converters package with MarkLogic 9.0-4 or later, you will have to perform a two-step installation: first install MarkLogic Server and then install MarkLogic Converters.

For details on MarkLogic Server and MarkLogic Converters installation for all supported platforms, see Installing MarkLogic.

If you want to uninstall MarkLogic 9.0-4 or later, and if the converters package was previously installed with it, you will have to perform a two-step uninstall: first uninstall MarkLogic Converters and then uninstall MarkLogic Server.

For details on MarkLogic Server and MarkLogic Converters uninstall for all supported platforms, see Removing MarkLogic.
