You can use the Management REST API to script setting up and adding hosts to a cluster.
The scripts and examples in this chapter use the curl command line tool to make HTTP requests. If you are not familiar with curl, see Introduction to the curl Tool in REST Application Developer's Guide. You can replace curl in the example scripts with another tool capable of sending HTTP requests.
The examples and scripts assume a Unix shell environment. If you are on Windows and do not have access to a Unix-like shell environment such as Cygwin, you will not be able to use these scripts directly. To understand how to transform the key curl requests for Windows, see Modifying the Example Commands for Windows in REST Application Developer's Guide.
You can run the sample scripts in this chapter from any host from which the MarkLogic Server hosts are reachable.
The Management REST API is a set of interfaces for administering, monitoring, and configuring a MarkLogic Server cluster and the resources it contains, such as databases, forests, and App Servers. Though you can interactively set up a cluster during product installation using the Admin Interface, you can only script setting up a cluster and adding hosts to it using the Management REST API.
Cluster creation has two phases: you first fully initialize the first host in the cluster (or a standalone host), and then you add additional hosts. The process of bringing up the first host differs from adding subsequent hosts. You must not apply the first-host initialization process to a host you expect to eventually add to a cluster.
License installation is optional. If you choose to install a license, you can install it during the basic initialization or add it later. Installing a license later causes an additional restart. Licenses must be installed separately on each host. They are not shared across the cluster.
The diagram below shows the sequence of Management API REST requests required to bring up a multi-host cluster.
When a request causes a restart, MarkLogic Server returns a restart element or key-value pair that includes the last startup time of all affected hosts. You can use this information with GET /admin/v1/timestamp to determine when the restart is complete; for details, see Using the Timestamp Service to Verify a Restart.
Use the procedure outlined in this section to set up the first or only host in a cluster.
You must not use this procedure to bring up the 2nd through Nth host in a cluster. Once you initialize security by calling POST /admin/v1/instance-admin, you cannot add the host to a different cluster without reinstalling MarkLogic Server.
Setting up the first (or only) host in a cluster involves the following Management REST API requests to the host:
POST http://bootstrap-host:8001/admin/v1/init
POST http://bootstrap-host:8001/admin/v1/instance-admin
The following procedure outlines the scriptable steps:
sudo rpm -i /your/location/MarkLogic-8.0-1.x86_64.rpm
sudo /sbin/service MarkLogic start
curl -X POST -d "" http://${BOOTSTRAP_HOST}:8001/admin/v1/init
    curl -X POST -H "Content-type: application/x-www-form-urlencoded" \
      --data "admin-username=${USER}" --data "admin-password=${PASS}" \
      --data "wallet-password=${WPASS}" --data "realm=${SEC_REALM}" \
      http://${BOOTSTRAP_HOST}:8001/admin/v1/instance-admin
When this procedure is completed, MarkLogic Server is fully operational on the host, and you can configure forests, databases, and App Servers, or add additional hosts to the cluster.
Once you successfully complete POST /admin/v1/instance-admin, security is initialized, and all subsequent requests require authentication.
The following bash shell script initializes the first (or only) host in a cluster. Use the script by specifying the host to initialize on the command line. If no host is given, localhost is assumed.
this_script [options] bootstrap_host
Use the command line options in the following table to tailor the script to your environment:
Option | Description |
---|---|
-a auth_mode | The HTTP authentication method to use for requests that require authentication. Allowed values: basic, digest, anyauth. Default: anyauth. |
-p password | The password for the administrative user to use for HTTP requests that require authentication. Default: password. |
-r sec_realm | The authentication realm for the host. For details, see Realm in Administrator's Guide. Default: public. |
-u username | The administrative username to use for HTTP requests that require authentication. Default: admin. |
This script makes use of the restart checking technique described in Using the Timestamp Service to Verify a Restart. This script performs only minimal error checking and is not meant for production use.
    #!/bin/bash
    ################################################################
    # Use this script to initialize the first (or only) host in
    # a MarkLogic Server cluster. Use the options to control admin
    # username and password, authentication mode, and the security
    # realm. If no hostname is given, localhost is assumed. Only
    # minimal error checking is performed, so this script is not
    # suitable for production use.
    #
    # Usage:  this_command [options] hostname
    #
    ################################################################

    BOOTSTRAP_HOST="localhost"
    USER="admin"
    PASS="password"
    WPASS="wpass"
    AUTH_MODE="anyauth"
    SEC_REALM="public"
    N_RETRY=5
    RETRY_INTERVAL=10

    #######################################################
    # restart_check(hostname, baseline_timestamp, caller_lineno)
    #
    # Use the timestamp service to detect a server restart, given a
    # baseline timestamp. Use N_RETRY and RETRY_INTERVAL to tune
    # the test length. Include authentication in the curl command
    # so the function works whether or not security is initialized.
    #   $1 :  The hostname to test against
    #   $2 :  The baseline timestamp
    #   $3 :  Invoker's LINENO, for improved error reporting
    # Returns 0 if restart is detected, exits with an error if not.
    #
    function restart_check {
      LAST_START=`$AUTH_CURL "http://$1:8001/admin/v1/timestamp"`
      for i in `seq 1 ${N_RETRY}`; do
        if [ "$2" == "$LAST_START" ] || [ "$LAST_START" == "" ]; then
          sleep ${RETRY_INTERVAL}
          LAST_START=`$AUTH_CURL "http://$1:8001/admin/v1/timestamp"`
        else
          return 0
        fi
      done
      echo "ERROR: Line $3: Failed to restart $1"
      exit 1
    }

    #######################################################
    # Parse the command line

    OPTIND=1
    while getopts ":a:p:w:r:u:" opt; do
      case "$opt" in
        a) AUTH_MODE=$OPTARG ;;
        p) PASS=$OPTARG ;;
        w) WPASS=$OPTARG ;;
        r) SEC_REALM=$OPTARG ;;
        u) USER=$OPTARG ;;
        \?) echo "Unrecognized option: -$OPTARG" >&2; exit 1 ;;
      esac
    done
    shift $((OPTIND-1))

    if [ $# -ge 1 ]; then
      BOOTSTRAP_HOST=$1
      shift
    fi

    # Suppress progress meter, but still show errors
    CURL="curl -s -S"
    # Add authentication related options, required once security is initialized
    AUTH_CURL="${CURL} --${AUTH_MODE} --user ${USER}:${PASS}"

    #######################################################
    # Bring up the first (or only) host in the cluster. The following
    # requests are sent to the target host:
    #   (1) POST /admin/v1/init
    #   (2) POST /admin/v1/instance-admin?admin-username=W&admin-password=X&wallet-password=Y&realm=Z
    # GET /admin/v1/timestamp is used to confirm restarts.

    # (1) Initialize the server
    echo "Initializing $BOOTSTRAP_HOST..."
    $CURL -X POST -d "" http://${BOOTSTRAP_HOST}:8001/admin/v1/init
    sleep 10

    # (2) Initialize security and, optionally, licensing. Capture the last
    #     restart timestamp and use it to check for successful restart.
    TIMESTAMP=`$CURL -X POST \
       -H "Content-type: application/x-www-form-urlencoded" \
       --data "admin-username=${USER}" --data "admin-password=${PASS}" \
       --data "wallet-password=${WPASS}" --data "realm=${SEC_REALM}" \
       http://${BOOTSTRAP_HOST}:8001/admin/v1/instance-admin \
       | grep "last-startup" \
       | sed 's%^.*<last-startup.*>\(.*\)</last-startup>.*$%\1%'`
    if [ "$TIMESTAMP" == "" ]; then
      echo "ERROR: Failed to get instance-admin timestamp." >&2
      exit 1
    fi

    # Test for successful restart
    restart_check $BOOTSTRAP_HOST $TIMESTAMP $LINENO

    echo "Initialization complete for $BOOTSTRAP_HOST..."
    exit 0
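The script above passes the value of -a straight to curl as an option name, so a mistyped method only surfaces as a curl error. The following is a minimal sketch of an up-front check, assuming the allowed values are exactly those listed in the options table. The function name validate_auth_mode is hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): accept only the
# authentication methods listed in the options table.
#   $1 : The authentication method given with -a
# Returns 0 if the method is allowed, 1 otherwise.
function validate_auth_mode {
  case "$1" in
    basic|digest|anyauth)
      return 0 ;;
    *)
      echo "ERROR: Unsupported authentication method: $1" >&2
      return 1 ;;
  esac
}
```

One way to use it is to call `validate_auth_mode "$AUTH_MODE" || exit 1` immediately after the option parsing loop, before any request is sent.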
Use the procedure described in this section to configure the 2nd through Nth hosts in a cluster.
Once you configure the first host in a cluster, add additional hosts to the cluster by using the following series of Management REST API requests for each host:
POST http://joining-host:8001/admin/v1/init
GET http://joining-host:8001/admin/v1/server-config
POST http://bootstrap-host:8001/admin/v1/cluster-config
POST http://joining-host:8001/admin/v1/cluster-config
The following procedure outlines the scriptable steps:
sudo rpm -i /your/location/MarkLogic-8.0-1.x86_64.rpm
sudo /sbin/service MarkLogic start
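Send a POST request to /admin/v1/init on the joining host. Authentication is not required. This step causes a restart.
    curl -X POST -d "" http://${JOINING_HOST}:8001/admin/v1/init
Send a GET request to /admin/v1/server-config on the joining host to fetch its configuration. The following example captures the configuration in the shell variable JOINER_CONFIG. To save it to a file instead, use the curl -o option.
    JOINER_CONFIG=`curl -s -S -X GET -H "Accept: application/xml" \
      http://${JOINING_HOST}:8001/admin/v1/server-config`
Send the joining host's configuration to the bootstrap host in a POST request to /admin/v1/cluster-config. The following example command assumes the input server config is in the shell variable JOINER_CONFIG and saves the output cluster configuration to cluster-config.zip.
    curl -s -S --digest --user admin:password -X POST \
      -o cluster-config.zip -d "group=Default" \
      --data-urlencode "server-config=${JOINER_CONFIG}" \
      -H "Content-type: application/x-www-form-urlencoded" \
      http://${BOOTSTRAP_HOST}:8001/admin/v1/cluster-config
Send the cluster configuration data to the joining host in a POST request to /admin/v1/cluster-config, completing the join sequence.
    curl -s -S -X POST -H "Content-type: application/zip" \
      --data-binary @./cluster-config.zip \
      http://${JOINING_HOST}:8001/admin/v1/cluster-config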
Once this procedure completes, the joining host is a fully operational member of the cluster.
The following bash shell script initializes one or more hosts and adds them to an existing cluster.
The example script completes the cluster join sequence for each host serially. However, you can also add hosts concurrently. For details, see Adding Hosts to a Cluster Concurrently.
Use the script by specifying at least two hosts on the command line. A fully initialized host that is already part of the cluster must be the first parameter.
this_script [options] bootstrap_host joining_host [joining_host...]
Use the command line options in the following table to tailor the script to your environment:
The script makes use of the restart checking technique described in Using the Timestamp Service to Verify a Restart. This script performs only minimal error checking and is not meant for production use.
    #!/bin/bash
    ################################################################
    # Use this script to initialize and add one or more hosts to a
    # MarkLogic Server cluster. The first (bootstrap) host for the
    # cluster should already be fully initialized.
    #
    # Use the options to control admin username and password,
    # authentication mode, and the security realm. At least two hostnames
    # must be given: A host already in the cluster, and at least one host
    # to be added to the cluster. Only minimal error checking is performed,
    # so this script is not suitable for production use.
    #
    # Usage:  this_command [options] cluster-host joining-host(s)
    #
    ################################################################

    USER="admin"
    PASS="password"
    AUTH_MODE="anyauth"
    N_RETRY=5
    RETRY_INTERVAL=10

    #######################################################
    # restart_check(hostname, baseline_timestamp, caller_lineno)
    #
    # Use the timestamp service to detect a server restart, given a
    # baseline timestamp. Use N_RETRY and RETRY_INTERVAL to tune
    # the test length. Include authentication in the curl command
    # so the function works whether or not security is initialized.
    #   $1 :  The hostname to test against
    #   $2 :  The baseline timestamp
    #   $3 :  Invoker's LINENO, for improved error reporting
    # Returns 0 if restart is detected, exits with an error if not.
    #
    function restart_check {
      LAST_START=`$AUTH_CURL "http://$1:8001/admin/v1/timestamp"`
      for i in `seq 1 ${N_RETRY}`; do
        if [ "$2" == "$LAST_START" ] || [ "$LAST_START" == "" ]; then
          sleep ${RETRY_INTERVAL}
          LAST_START=`$AUTH_CURL "http://$1:8001/admin/v1/timestamp"`
        else
          return 0
        fi
      done
      echo "ERROR: Line $3: Failed to restart $1"
      exit 1
    }

    #######################################################
    # Parse the command line

    OPTIND=1
    while getopts ":a:p:u:" opt; do
      case "$opt" in
        a) AUTH_MODE=$OPTARG ;;
        p) PASS=$OPTARG ;;
        u) USER=$OPTARG ;;
        \?) echo "Unrecognized option: -$OPTARG" >&2; exit 1 ;;
      esac
    done
    shift $((OPTIND-1))

    if [ $# -ge 2 ]; then
      BOOTSTRAP_HOST=$1
      shift
    else
      echo "ERROR: At least two hostnames are required." >&2
      exit 1
    fi
    ADDITIONAL_HOSTS=$@

    # Curl command for all requests. Suppress progress meter (-s),
    # but still show errors (-S)
    CURL="curl -s -S"
    # Curl command when authentication is required, after security
    # is initialized.
    AUTH_CURL="${CURL} --${AUTH_MODE} --user ${USER}:${PASS}"

    #######################################################
    # Add one or more hosts to a cluster. For each host joining
    # the cluster:
    #   (1) POST /admin/v1/init (joining host)
    #   (2) GET /admin/v1/server-config (joining host)
    #   (3) POST /admin/v1/cluster-config (bootstrap host)
    #   (4) POST /admin/v1/cluster-config (joining host)
    # GET /admin/v1/timestamp is used to confirm restarts.

    for JOINING_HOST in $ADDITIONAL_HOSTS; do
      echo "Adding host to cluster: $JOINING_HOST..."

      # (1) Initialize MarkLogic Server on the joining host
      TIMESTAMP=`$CURL -X POST -d "" \
         http://${JOINING_HOST}:8001/admin/v1/init \
         | grep "last-startup" \
         | sed 's%^.*<last-startup.*>\(.*\)</last-startup>.*$%\1%'`
      if [ "$TIMESTAMP" == "" ]; then
        echo "ERROR: Failed to initialize $JOINING_HOST" >&2
        exit 1
      fi
      restart_check $JOINING_HOST $TIMESTAMP $LINENO

      # (2) Retrieve the joining host's configuration
      JOINER_CONFIG=`$CURL -X GET -H "Accept: application/xml" \
            http://${JOINING_HOST}:8001/admin/v1/server-config`
      echo $JOINER_CONFIG | grep -q "^<host"
      if [ "$?" -ne 0 ]; then
        echo "ERROR: Failed to fetch server config for $JOINING_HOST"
        exit 1
      fi

      # (3) Send the joining host's config to the bootstrap host, receive
      #     the cluster config data needed to complete the join. Save the
      #     response data to cluster-config.zip.
      $AUTH_CURL -X POST -o cluster-config.zip -d "group=Default" \
            --data-urlencode "server-config=${JOINER_CONFIG}" \
            -H "Content-type: application/x-www-form-urlencoded" \
            http://${BOOTSTRAP_HOST}:8001/admin/v1/cluster-config
      if [ "$?" -ne 0 ]; then
        echo "ERROR: Failed to fetch cluster config from $BOOTSTRAP_HOST"
        exit 1
      fi
      if [ `file cluster-config.zip | grep -cvi "zip archive data"` -eq 1 ]; then
        echo "ERROR: Failed to fetch cluster config from $BOOTSTRAP_HOST"
        exit 1
      fi

      # (4) Send the cluster config data to the joining host, completing
      #     the join sequence.
      TIMESTAMP=`$CURL -X POST -H "Content-type: application/zip" \
          --data-binary @./cluster-config.zip \
          http://${JOINING_HOST}:8001/admin/v1/cluster-config \
          | grep "last-startup" \
          | sed 's%^.*<last-startup.*>\(.*\)</last-startup>.*$%\1%'`
      restart_check $JOINING_HOST $TIMESTAMP $LINENO
      rm ./cluster-config.zip

      echo "...$JOINING_HOST successfully added to the cluster."
    done
The REST Management API is designed to support safe, concurrent cluster topology changes. For example, you can send server configuration data to a bootstrap host from multiple hosts, at the same time.
Only the REST Management API offers safe concurrency support. If you make changes using another interface, such as the Admin Interface, Configuration Manager, XQuery Admin API, or REST Packaging API, no such concurrency guarantees exist. With any other interface, even in combination with the REST Management API, you must ensure that no concurrent change requests occur.
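As a rough illustration of a concurrent join, the sketch below starts one background job per joining host and then waits for all of them. It assumes you have wrapped the four-request join sequence in a function of your own that takes a host name and returns non-zero on failure. The names run_concurrently and join_host are hypothetical. Each job should write its cluster configuration to a distinct file (for example, one named after the host) so that concurrent jobs do not overwrite each other's data.

```shell
#!/bin/bash
# run_concurrently(command, host...)
#
# Hypothetical helper: run "command host" in the background for every
# host, then wait for all jobs. Returns 0 only if every job succeeded.
#   $1    : Name of a function or command that joins a single host
#   $2... : The hosts to process
function run_concurrently {
  local cmd=$1
  shift
  local pids=() hosts=() failed=0 host i

  # Start one background job per host and remember its process id
  for host in "$@"; do
    "$cmd" "$host" &
    pids+=($!)
    hosts+=("$host")
  done

  # Collect the exit status of each job
  for i in "${!pids[@]}"; do
    if ! wait "${pids[$i]}"; then
      echo "ERROR: Join failed for ${hosts[$i]}" >&2
      failed=1
    fi
  done
  return $failed
}
```

With a join_host function defined, the serial loop could be replaced by a call such as `run_concurrently join_host host2 host3 host4`.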
When you use the Management REST API to perform an operation that causes MarkLogic Server to restart, the request that caused the restart returns a response that includes the last restart time, similar to the following:
    <restart xmlns="http://marklogic.com/manage">
      <last-startup host-id="13544732455686476949">2013-05-15T09:01:43.019261-07:00</last-startup>
      <link>
        <kindref>timestamp</kindref>
        <uriref>/admin/v1/timestamp</uriref>
      </link>
      <message>Check for new timestamp to verify host restart.</message>
    </restart>
If the operation causes multiple hosts to restart, the data in the response includes a last-startup timestamp for all affected hosts.
You can use the last-startup timestamp in conjunction with /admin/v1/timestamp to detect a successful restart. If MarkLogic Server is operational, a GET request to /admin/v1/timestamp returns a 200 (OK) HTTP status code and the timestamp of the last MarkLogic Server startup:
    $ curl --anyauth --user user:password -X GET \
        http://localhost:8001/admin/v1/timestamp
    2013-05-15T10:34:38.932514-07:00
If such a request returns an HTTP response code other than 200 (OK), it is not safe to proceed with subsequent administrative requests.
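A script can test the status code directly instead of inspecting the response body. The following is a minimal sketch that uses the curl -w option to print only the HTTP status code. The function names and the ML_USER and ML_PASS variables are hypothetical, and the defaults shown are placeholders.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the original scripts.

# Print the HTTP status code returned by GET /admin/v1/timestamp.
#   $1 : The hostname to test against
# curl prints 000 if no HTTP response was received.
function timestamp_status {
  curl -s -o /dev/null -w '%{http_code}' \
    --anyauth --user "${ML_USER:-admin}:${ML_PASS:-password}" \
    "http://$1:8001/admin/v1/timestamp"
}

# Return 0 only if the given status code is 200 (OK).
#   $1 : An HTTP status code
function status_ok {
  [ "$1" = "200" ]
}
```

For example, `status_ok "$(timestamp_status localhost)" || exit 1` stops a script before it sends further administrative requests to a host that is not ready.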
By comparing the last-startup timestamp to the current timestamp, you can detect when the restart is completed. If the timestamp doesn't change after some reasonable time, you can conclude the restart was not successful. The following bash shell function performs this check:
    #!/bin/bash
    # ...
    AUTH_CURL="curl -s -S --${AUTH_MODE} --user ${USER}:${PASS}"
    N_RETRY=5
    RETRY_INTERVAL=10
    # ...
    function restart_check {
      LAST_START=`$AUTH_CURL "http://$1:8001/admin/v1/timestamp"`
      for i in `seq 1 ${N_RETRY}`; do
        if [ "$2" == "$LAST_START" ] || [ "$LAST_START" == "" ]; then
          sleep ${RETRY_INTERVAL}
          LAST_START=`$AUTH_CURL "http://$1:8001/admin/v1/timestamp"`
        else
          return 0
        fi
      done
      echo "ERROR: Line $3: Failed to restart $1"
      exit 1
    }
To use the function, capture the timestamp from an operation that causes a restart, and pass the host name, timestamp, and current line number to the function. The following example assumes only a single host is involved in the restart and uses the sed line editor to strip the timestamp from the <restart/> data returned by the request.
    TIMESTAMP=`curl -s -S -X POST ... \
        http://${HOST}:8001/admin/v1/instance-admin \
        | grep "last-startup" \
        | sed 's%^.*<last-startup.*>\(.*\)</last-startup>.*$%\1%'`
    if [ "$TIMESTAMP" == "" ]; then
      echo "ERROR: Failed to get restart timestamp." >&2
      exit 1
    else
      restart_check $HOST $TIMESTAMP $LINENO
    fi
The /admin/v1/timestamp service requires digest authentication only after security is initialized, but the restart_check function shown here skips this distinction for simplicity and always passes authentication information.
An operation that causes multiple hosts to restart requires a more sophisticated check that iterates through all the last-startup host IDs and timestamps.
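As a starting point for such a check, the following sketch extracts every host ID and timestamp pair from an XML restart response. It assumes the response has the shape shown earlier in this section, with one last-startup element per host. The function name parse_restart_pairs is hypothetical. Mapping each host ID to a host name that you can query with /admin/v1/timestamp is left to the caller.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original scripts.
#
# Read a <restart/> response on stdin and print one line per affected
# host in the form "host-id timestamp".
function parse_restart_pairs {
  # Join the response into one line, isolate each last-startup element,
  # then keep only the host-id attribute value and the trimmed timestamp.
  tr -d '\n' \
    | grep -o '<last-startup[^>]*>[^<]*</last-startup>' \
    | sed 's%<last-startup[^>]*host-id="\([^"]*\)"[^>]*>[[:space:]]*\([^<[:space:]]*\)[[:space:]]*</last-startup>%\1 \2%'
}
```

A caller could then loop over the output with `while read HOST_ID TIMESTAMP; do ... done`, look up the host name for each HOST_ID, and call restart_check for each host.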
This section describes REST Management API conventions for the format of input data, response data, and error details.
Most methods of the REST Management API accept input as XML or JSON. Some methods accept URL-encoded form data (MIME type application/x-www-form-urlencoded). Use the HTTP Content-type request header to indicate the format of your input.
Many methods can return data as XML or JSON. The monitoring GET methods, such as GET /manage/v2/clusters, also support HTML.
The response data format is controlled through either the HTTP Accept header or a format request parameter (where available). When both the header and the parameter are present, the format parameter takes precedence.
If a request results in an error, the body of the response includes error details. The MIME type of error details in the response is determined by the format request parameter (where supported), Accept header, or request Content-type header, in that order of precedence.
For example, if a request supplies XML input (request Content-type set to application/xml), but specifies JSON output using the format parameter, then error details are returned as JSON. If a request supplies JSON input, but no Accept header or format parameter, then error details are returned as JSON.
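The precedence rule can be modeled on the client side, which is useful when a script must decide how to parse an error body. The following is a simplified sketch that considers only XML and JSON and ignores other MIME types and wildcard Accept values. The function names normalize_format and error_format are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers that model the precedence described above.

# Map a format name or MIME type to "json" or "xml".
# Prints nothing for empty or unrecognized values.
function normalize_format {
  case "$1" in
    json|application/json) echo "json" ;;
    xml|application/xml)   echo "xml" ;;
  esac
}

# error_format(format_param, accept_header, content_type)
#
# Print the expected format of error details, checking the format
# parameter first, then the Accept header, then the Content-type header.
# Returns 1 if none of them yields a recognized format.
function error_format {
  local candidate result
  for candidate in "$1" "$2" "$3"; do
    result=$(normalize_format "$candidate")
    if [ -n "$result" ]; then
      echo "$result"
      return 0
    fi
  done
  return 1
}
```

For the two cases in the text, `error_format json "" application/xml` and `error_format "" "" application/json` both print json.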
Once you have initialized the hosts in your cluster, you can configure databases and App Servers using the Admin Interface, the XQuery Admin API, or the Management REST API. The REST Management API supports scripting of many administrative tasks, including the ones listed in the table below. For more details, see the REST Management API Reference.
Operation | REST Method |
---|---|
Restart or shut down a cluster | POST /manage/v2/clusters/{id|name} |
Restart or shut down a host | POST /manage/v2/hosts/{id|name} |
Remove a host from a cluster | DELETE /admin/v1/host-config |
Create a forest | POST /manage/v2/forests |
Enable or disable a forest | PUT /manage/v2/forests/{id|name}/properties |
Combine forests or migrate data to a new data directory | PUT /manage/v2/forests |
Delete a forest | DELETE /manage/v2/forests/{id|name} |
Change properties of a host, such as hostname or group | PUT /manage/v2/hosts/{id|name}/properties |
Create a database partition | POST /manage/v2/databases/{id|name}/partitions |
Resize, transfer, or migrate a partition | PUT /manage/v2/databases/{id|name}/partitions/{name} |
Package database and App Server configurations for deployment on another host | |
Install packaged database and App Server configuration on a host | POST /manage/v2/packages/{pkgname}/install |
Monitor real time usage and status of a cluster and its resources | Where resource is one of: clusters, databases, forests, groups, hosts, or servers. |
Manage historical usage of a cluster and its resources | Where resource is one of: clusters, databases, forests, groups, hosts, or servers. |
You can also use the Admin Interface and the XQuery Admin API to perform these and other operations.