MarkLogic Server on Amazon EC2 Guide — Chapter 3

Deploying MarkLogic on EC2 Using CloudFormation

This chapter describes how to deploy MarkLogic Server using a CloudFormation Template.

What CloudFormation Template Version to Use

There are two versions of the MarkLogic CloudFormation Template, Version 1.0 and Version 2.0. The main difference between them is the database used to store the metadata:

  • Version 1.0 -- This version stores the metadata in SimpleDB. Version 1.0 is compatible with MarkLogic 7.x to 8.0.1 for new installs. Use this version if updating an existing EC2 cluster, as described in Upgrading the MarkLogic AMI.
  • Version 2.0 -- This version stores the metadata in DynamoDB. Version 2.0 is compatible with MarkLogic 8.0.3 and later, but not with earlier releases.
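
The compatibility rules above can be expressed as a small lookup. The following is a minimal sketch, not part of the MarkLogic tooling: the function name and the dotted version format (as written in this chapter) are assumptions, and releases the text does not mention, such as 8.0.2, are rejected rather than guessed.

```python
def template_version(marklogic_version: str) -> str:
    """Return the template version ("1.0" or "2.0") for a new install.

    Hypothetical helper: accepts dotted versions such as "7.0.4" or "8.0.3".
    """
    parts = tuple(int(p) for p in marklogic_version.split("."))
    parts = parts + (0,) * (3 - len(parts))  # pad "7.0" to (7, 0, 0)
    if (7, 0, 0) <= parts <= (8, 0, 1):
        return "1.0"  # metadata stored in SimpleDB
    if parts >= (8, 0, 3):
        return "2.0"  # metadata stored in DynamoDB
    raise ValueError("no template version documented for " + marklogic_version)


print(template_version("8.0.3"))
```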

Overview

A Managed Cluster is automatically initialized and pre-configured with recommended topology, such as the one illustrated below.

The sample CloudFormation templates implement a simple example of this reference architecture and make use of the Managed Cluster feature. Regardless of how the cluster is created, the necessary components need to be created, configured, and deployed in a controlled fashion.

CloudFormation is an AWS technology that allows you to specify the set of components necessary for creating a Stack. You can use one of the provided CloudFormation templates to create a Managed Cluster. The Managed Cluster templates create:

  • IAM Roles necessary for running AWS services without needing to pass in security credentials
  • Security groups to control the incoming network traffic delivered to the instances
  • AutoScaling groups, one per node
  • Launch Configuration for the AutoScaling Groups
  • Load balancer fronting all of the nodes
  • EBS Volumes for each node

When using the CloudFormation templates, there are parameters that must be filled in (either via the AWS Console or any third-party command line tool that can launch a CloudFormation stack). These parameters include:

  • What Zone each node will run in
  • The admin user and password for initially creating the security database
  • The SSH key pair name (used to log in to the instances once they are started)
  • The size (in GB) and EBS type of the volumes to create for the initial data volume /var/opt/MarkLogic
  • The EC2 instance type of the created instance.
  • Optional: The Simple Notification Service (SNS) topic to be used to capture messages from the AutoScaling Groups and Managed Cluster Support startup procedure.

When launched, CloudFormation creates all the necessary resources. On startup, the Amazon EC2 nodes recognize that they are part of a Managed Cluster and perform the following actions without user intervention:

  • Attach any volumes associated with this node
  • Create a filesystem, if needed
  • Mount the filesystem
  • Start MarkLogic
  • Apply and accept the EC2 license
  • Either create the initial node (master) and set the admin username and password or attach to the cluster
  • Associate the node with the Load Balancer

The Load Balancer detects proper running of MarkLogic via the HealthCheck App Server on port 7997 and will only direct traffic to that node if it has verified that the MarkLogic instance is up and running.
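
To illustrate how a load balancer of this kind decides when a node may receive traffic, the sketch below simulates consecutive-probe thresholds. It is an illustration only: the function name is hypothetical, and the default thresholds (3 healthy, 5 unhealthy) are taken from the HealthCheck block of the sample template later in this chapter. Any result other than HTTP 200, including a timeout represented here as None, counts as a failed probe.

```python
def track_health(status_codes, healthy_threshold=3, unhealthy_threshold=5,
                 start_healthy=False):
    """Return the in-service state of a node after each probe result."""
    healthy = start_healthy
    streak = 0  # consecutive probes that disagree with the current state
    states = []
    for code in status_codes:
        ok = (code == 200)
        if ok == healthy:
            streak = 0  # probe agrees with current state, nothing to change
        else:
            streak += 1
            needed = healthy_threshold if ok else unhealthy_threshold
            if streak >= needed:
                healthy = ok  # enough consecutive results to flip the state
                streak = 0
        states.append(healthy)
    return states


# A node that answers 200 three times in a row is put in service.
print(track_health([200, 200, 200]))
```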

Each AutoScaling Group (ASG) detects system stability and will terminate and restart the node if the operating system is having problems. At any time you can pause the cluster by setting the ASG NodesPerZone value to 0 for all nodes. You can then restart the nodes by resetting NodesPerZone to a value of 1 - 20 for each ASG. On restart, whether resuming from a pause or after the ASG detects a fault and restarts the server, the system will automatically do the following:

  • Detect any previously attached volumes and re-attach them
  • Detect if the hostname has changed since the previous start and, if so, rename the host to the new hostname in the MarkLogic cluster
  • Re-attach to the cluster
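
The relationship between NodesPerZone and cluster size can be sketched as follows. This is a minimal illustration, assuming the three-zone layout and the 0 - 20 range used by the sample template; the function name is hypothetical.

```python
ZONES = 3  # the sample template creates one AutoScaling Group per zone


def cluster_size(nodes_per_zone: int) -> int:
    """Return the total number of nodes; 0 means the cluster is paused."""
    if not 0 <= nodes_per_zone <= 20:
        raise ValueError("NodesPerZone must be between 0 and 20")
    return nodes_per_zone * ZONES


print(cluster_size(1))  # one node in each of the three zones
```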

Deployment and Startup

MarkLogic is started either as a system service (from /etc/init.d) or manually (for example, service MarkLogic start). The standard install starts MarkLogic on the next reboot after install; however, it may be started via a script or system configuration at any point.

Any customization to the startup environment must be completely in place before MarkLogic starts the first time after an install so that it properly configures its role (single, cluster master, cluster joiner), detects the correct data volumes, Java JVM, paths, and other configurable information. This section describes the AWS-specific configuration variables.

MarkLogic is typically configured to start on boot, but also may be started manually. All startup paths should be configured to inherit the same environment so that behavior is consistent. The biggest variation depends on whether or not MarkLogic is pre-installed on the AMI.

During the init process, the interaction and dependency between MarkLogic services and other services may need to be considered especially if using an AMI without MarkLogic pre-installed and configured.

The following table shows the typical startup ordering of services on an AWS Linux system.

Order   Service
02      lvm2-monitor
08      ip6tables
08      iptables
10      network
11      auditd
12      rsyslog
58      ntpd
80      sendmail
85      MarkLogic (Version 7)
86      tomcat-jsvc
98[c]   cloud-final (all user-defined upstart and cloud-init scripts)
98[M]   MarkLogic (Version 8)
99      local (/etc/rc.local)

Note that cloud-init has several components. Using very low-level configuration, you can arrange for file and config data to be populated in the cloud-config stage (52), but deployment tools use this stage for their own purposes. The most common mechanism is 'user scripts', which run in cloud-final (98[c]).

In Version 8, MarkLogic was moved to the LSB init configuration format which adds a dependency to run after cloud-final. This allows user configuration to be applied before MarkLogic whether or not it was pre-installed.

When MarkLogic is started, the following process runs:

  1. /etc/init.d/MarkLogic is invoked, either via init (e.g. /etc/rc5.d/S98MarkLogic) or manually (e.g. service MarkLogic start).
  2. /etc/sysconfig/MarkLogic is sourced, which performs the following steps.
  3. Core environment variables are set to their default values.
  4. /etc/marklogic.conf is sourced (if it exists). This can modify or add variables.
  5. If MARKLOGIC_EC2_HOST != 1, no additional EC2-specific processing is performed.
  6. If MARKLOGIC_HOSTNAME is not already defined, it is derived from EC2 metadata, using the first available of the following, in order:
    • public-hostname
    • public-ipv4
    • local-hostname
    • local-ipv4
    • hostname
  7. MARKLOGIC_AWS_ROLE is fetched from the IAM Role associated with the instance.
  8. MARKLOGIC_EBS is set to /dev/sdf if not already set.
  9. If MARKLOGIC_EC2_USERDATA != 0, then EC2 user data is read and parsed. Any name/value pairs overwrite existing settings.
  10. If MARKLOGIC_CLUSTER_NAME, MARKLOGIC_NODE_NAME and MARKLOGIC_CLUSTER_MASTER are all defined, then the Managed Cluster logic is performed. This includes:
    • Forming or joining a cluster
    • Creating / attaching data volumes
    • Resolving hostname changes
    • Updating cluster configuration

      This process is repeated on every boot and service start.

  11. If Step 10 is performed, all resolved variables are cached by writing to /usr/local/mlcmd.conf to avoid the overhead of recalculating the values on a restart.
  12. If Step 10 is not performed, the following occurs:
    • If MARKLOGIC_ADMIN_AUTOCREATE is set and not empty:
      • MARKLOGIC_ADMIN_PASSWORD is set to the value of the EC2 metadata whose key is $MARKLOGIC_ADMIN_AUTOCREATE. This overwrites any previous setting of MARKLOGIC_ADMIN_PASSWORD.
      • If MARKLOGIC_ADMIN_PASSWORD is not empty and MARKLOGIC_ADMIN_USERNAME is empty, then set MARKLOGIC_ADMIN_USERNAME="admin".
      • If MARKLOGIC_ADMIN_PASSWORD and MARKLOGIC_ADMIN_USERNAME are both not empty then:

    • Log the success or failure to the system log and console.
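
The layering of settings in the startup process above can be sketched in a few lines. This is an illustration of the ordering only, not the actual init script: the function names are hypothetical, settings are modeled as dictionaries of strings, the IAM role lookup (step 7) is omitted, and the node name variable is spelled MARKLOGIC_NODE_NAME as in the UserData of the sample template.

```python
HOSTNAME_KEYS = ["public-hostname", "public-ipv4", "local-hostname",
                 "local-ipv4", "hostname"]


def parse_user_data(text):
    """Parse name=value lines from EC2 user data into a dictionary."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        pairs[name.strip()] = value.strip()
    return pairs


def resolve_settings(defaults, conf_file, metadata, user_data_text=""):
    """Apply defaults, then /etc/marklogic.conf values, then EC2 processing."""
    settings = dict(defaults)                       # step 3
    settings.update(conf_file)                      # step 4
    if settings.get("MARKLOGIC_EC2_HOST") != "1":   # step 5
        return settings
    if not settings.get("MARKLOGIC_HOSTNAME"):      # step 6
        for key in HOSTNAME_KEYS:
            if metadata.get(key):
                settings["MARKLOGIC_HOSTNAME"] = metadata[key]
                break
    settings.setdefault("MARKLOGIC_EBS", "/dev/sdf")        # step 8
    if settings.get("MARKLOGIC_EC2_USERDATA") != "0":       # step 9
        settings.update(parse_user_data(user_data_text))    # user data wins
    return settings


def is_managed_cluster(settings):
    """Step 10: all three cluster variables must be defined."""
    required = ("MARKLOGIC_CLUSTER_NAME", "MARKLOGIC_NODE_NAME",
                "MARKLOGIC_CLUSTER_MASTER")
    return all(name in settings for name in required)
```

For example, a node whose metadata has no public hostname but does have a public IPv4 address resolves MARKLOGIC_HOSTNAME to that address, because the keys are tried in the documented order.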

Creating a CloudFormation Stack using the AWS Console

This section describes how to use the AWS Console to create a CloudFormation Stack from a template. This section describes each step in the procedure, but does not discuss all of the options for each step. For more details, see:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-console-create-stack.html.

Before you can create a CloudFormation Stack, you will need the following:

The following procedure describes how to create a CloudFormation Stack from a template:

  1. Click on the box in the upper left-hand portion of the Services page to access the Amazon Web Services home page:

  2. In the Amazon Web Services home page, click on CloudFormation:

  3. In the CloudFormation Stacks page, click Create Stack.

  4. In the Select Template window, enter a name for your stack. Click Choose File and select the CloudFormation script file you created in Sample CloudFormation Template. Alternatively, you can click Provide a Template URL and enter one of the URLs listed in http://developer.marklogic.com/products/aws. When done, click Continue.

    Your Stack Name is used to identify all of the resources for your stack, including the names of your EBS volumes. It is a best practice to name your stack with an easily identifiable name, such as your user name. The EBS volumes for all but the first node in each zone are not removed when you delete the stack, so you will want to be able to easily identify those volumes should you want to remove them after deleting your stack.

  5. In the Specify Parameters window, enter the information shown in the table below. When done, click Continue.

    Parameter Name Description
    AdminUser The username you want to use to log in as the MarkLogic Administrator.
    AdminPass The password you want to use to log in as the MarkLogic Administrator.
    IAMRole The name of the IAM Role you created in Creating an IAM Role.
    InstanceType The type of EC2 instance to launch. These vary by release, product type, zone, region, and availability. Refer to http://developer.marklogic.com/products/aws for the current supported values for these fields. For details on each instance type, see http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-types.html.

    Only HVM instance types are now supported for Marketplace AMIs. PVM types may be used with custom AMIs.

    KeyName The name of the Key Pair you created in Creating a Key Pair.
    Licensee The name of the licensee obtained from your MarkLogic representative. Enter none if you plan to enter the license information later.
    LicenseKey The license key obtained from your MarkLogic representative. Enter none if you plan to enter the license information later.
    LogSNS The Simple Notification Service (SNS) needed for logging. Enter the entire Topic ARN as it appears in the SNS Dashboard (for example, arn:aws:sns:us-east-1:1234567890123456:mytopic). For details on how to obtain an SNS Topic, see Creating a Simple Notification Service (SNS) Topic.
    NodesPerZone The number of nodes (hosts) to create for each zone. For example, a value of 1 will create one node for each zone, a total of three nodes for the cluster. A value of 0 will pause all nodes.
    SpotPrice Spot price for instances in USD/Hour. This is 0 by default. If not 0, a spot request at the given price is used for the instances instead of on-demand.
    VolumeSize The initial EBS volume size (GB). The range of valid values is 10 - 1000. The default is 10.
    Zone1 Zone2 Zone3 The zones on which to create each host in the cluster. Each zone in your cluster should be in the same region, such as us-east or us-west.
  6. In the Add Tags window, enter any tags for your stack. The tag(s) you provide identify your EC2 resources in the EC2 dashboard. For example, if you identify the Key as Name, the given Value (Test Stack, for example) will appear in the Name column of the Instance list in the EC2 dashboard. For details on tags, see http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-console-add-tags.html. When done, click Continue.

  7. In the Review window, review the settings. Click edit to make any changes. When done, click Continue.

  8. You will be notified that the stack is being created. Click Close.

  9. The name, create date, and status of your stack will appear at the top of the page.

  10. Stack creation takes a few minutes, depending on the speed of AWS and the number of resources you are creating in the stack. You can use the Events tab in the bottom portion of the page to view the progress of your stack creation. Click Refresh to see the latest status.

  11. A status of CREATE_COMPLETE indicates that your AutoScaling groups have been created. Wait approximately 5-10 minutes for your EC2 instances to boot up before navigating to the Outputs tab and clicking the Load Balancer URL in the Value column. This will open the MarkLogic Admin Interface on an available instance.

    If the URL in the Outputs tab does not work, wait another 5-10 minutes and try again.

  12. Log in using the administrator username and password you specified in Step 5.

    Do not make any changes in the Administrator Interface until all of the hosts have been created and joined the cluster. If in doubt about the status of your stack, check the logs from the SNS topic described in Creating a Simple Notification Service (SNS) Topic.

Creating a CloudFormation Stack using the AWS Command-Line Interface

In addition to using the AWS CloudFormation console, you can use the AWS CloudFormation command line interface (CLI) to create a CloudFormation stack. The AWS CloudFormation CLI is described in http://aws.amazon.com/cli/.

The AWS command line tools do not work with spaces for CloudFormation parameter values. Any parameter values containing a space will result in an error.

The list of CLI commands is documented in http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/CFN_CMD.html.

The following is a summary on how to create a stack using the AWS CloudFormation CLI:

  1. Install and configure AWS CloudFormation CLI environment for your system, as described in http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-installing-cli.html.
  2. Call the cfn-create-stack command with parameters similar to those shown in Creating a CloudFormation Stack using the AWS Console.
  3. The cfn-create-stack command runs asynchronously, so it returns an id for the stack before the stack is created. You can use the cfn-describe-stack-events command with the stack id to check the status of your stack.
  4. Once the stack is created, you can use the cfn-describe-stacks command to obtain the URL to the MarkLogic Admin Interface.
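
Because parameter values containing spaces cause an error, it can help to check them before invoking any command line tool. The sketch below is one possible pre-flight check, not part of the AWS tooling: the function name is hypothetical, and the output uses the ParameterKey/ParameterValue pairs that the CloudFormation CreateStack API accepts in its Parameters list. It does not call AWS.

```python
def build_parameters(values):
    """Return CreateStack-style parameter entries, rejecting values with spaces."""
    bad = sorted(name for name, value in values.items() if " " in str(value))
    if bad:
        raise ValueError("parameter values must not contain spaces: "
                         + ", ".join(bad))
    return [{"ParameterKey": name, "ParameterValue": str(value)}
            for name, value in values.items()]


params = build_parameters({"AdminUser": "admin", "NodesPerZone": 1})
print(params)
```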

Sample CloudFormation Template

CloudFormation Templates consist of JSON code that is used to create a collection of AWS resources known as a stack. CloudFormation Templates are described in detail in http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/template-guide.html. This section describes the CloudFormation Template used to create a stack that consists of a three-plus node MarkLogic cluster.

The Sample Templates available from http://developer.marklogic.com/products/aws are designed to demonstrate the architecture and IT requirements for the managed cluster feature and be useable out of the box as an example only. A production template will likely need to be customized to accommodate your specific IT requirements and may hard code many of the values exposed as parameters and mappings in these examples. For example, if you will only run in one region, there is no need for a mapping table of Region to AMI ID.

Before attempting to modify this template, it is a best practice to run the unmodified template, as described in Creating a CloudFormation Stack using the AWS Console, to become familiar with the procedures for building a cloud stack.

The main sections of the CloudFormation Template are as follows:

Parameters Declaration

The Parameters portion of the template defines the parameters necessary to build your MarkLogic cluster. The three zone parameters define the availability zones in which the servers in the cluster are to be created. All of the zones should be in the same region, as described in http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html.

For a description of each parameter, see the table in Step 5 of Creating a CloudFormation Stack using the AWS Console.

{
  "AWSTemplateFormatVersion":"2010-09-09",
  "Description":"Create a cluster with three or more nodes, multi az, load balanced, MarkLogic Cluster. MarkLogic-8.0-20150512.x86_64.rpm",
  "Parameters":
   {
    "AdminUser":
     {
      "Description":"The MarkLogic Administrator Username",
      "Type":"String"
     },
    "AdminPass":
     {
      "Description":"The MarkLogic Administrator Password",
      "Type":"String",
      "NoEcho":"true"
     },
    "InstanceType":
     {
      "Description":"Type of EC2 instance to launch",
      "Type":"String",
      "Default":"r3.8xlarge",
      "AllowedValues":
[ "t2.small", "t2.medium", "m1.medium", "m3.medium", "m3.large",
"m3.xlarge", "m3.2xlarge", "cc1.4xlarge", "cc2.8xlarge", "c3.large",
"c3.xlarge", "c3.2xlarge", "c3.4xlarge", "c3.8xlarge", "cr1.8xlarge",
"r3.large", "r3.xlarge", "r3.2xlarge", "r3.4xlarge", "r3.8xlarge",
"i2.xlarge", "i2.2xlarge", "i2.4xlarge", "i2.8xlarge", "hi1.4xlarge",
"hs1.8xlarge" ]
     },
    "IAMRole": {
      "Description":"IAM Role",
      "Type":"String"
     },
    "Licensee": {
      "Description":"The MarkLogic Licensee or 'none'",
      "Type":"String",
      "Default":"none"
     },
    "LicenseKey": {
      "Description":"The MarkLogic License Key or 'none'",
      "Type":"String",
      "Default":"none"
     },
    "KeyName": {
      "Description":"Name of an existing EC2 KeyPair to enable SSH access to the instance",
      "Type":"String"
     },
    "LogSNS": {
      "Description":"SNS Topic for logging - optional/advanced",
      "Type":"String",
      "Default":"none"
     },
    "NodesPerZone": {
      "Description":"Total number of nodes per Zone. (3 zones). Set to 0 to shutdown/hibernate",
      "Type":"Number",
      "MinValue":"0",
      "MaxValue":"20",
      "Default":"1"
     },
    "SpotPrice": {
      "Description":"Spot price for instances in USD/Hour - Optional/advanced",
      "Type":"Number",
      "MinValue":"0",
      "MaxValue":"2",
      "Default":"0"
     },
    "VolumeSize": {
      "Description":"The EBS Data volume size (GB) for all nodes",
      "Type":"Number",
      "MinValue":"10",
      "MaxValue":"1000",
      "Default":"10"
     },
    "VolumeType" : {
      "Description" : "The EBS Data volume Type",
      "Type" : "String",
      "AllowedValues" : [ "standard", "gp2" ],
      "Default" : "gp2"
    },
    "Zone1": {
      "Description":"The AZ Zone 1 (e.g. us-west-2a)",
      "Type":"String",
      "AllowedValues":
       [ "ap-northeast-1a", "ap-northeast-1b", "ap-northeast-1c", 
"ap-southeast-1a", "ap-southeast-1b", "ap-southeast-2a", 
"ap-southeast-2b", "eu-west-1a", "eu-west-1b", "eu-west-1c", 
"sa-east-1a", "sa-east-1b", "us-east-1a", "us-east-1b", "us-east-1c",
"us-east-1d", "us-east-1e", "us-west-1a", "us-west-1b", "us-west-1c",
"us-west-2a", "us-west-2b", "us-west-2c" ]
     },
    "Zone2": {
      "Description":"The AZ Zone 2",
      "Type":"String",
      "AllowedValues":
       [ "ap-northeast-1a", "ap-northeast-1b", "ap-northeast-1c", 
"ap-southeast-1a", "ap-southeast-1b", "ap-southeast-2a", 
"ap-southeast-2b", "eu-west-1a", "eu-west-1b", "eu-west-1c", 
"sa-east-1a", "sa-east-1b", "us-east-1a", "us-east-1b", "us-east-1c",
"us-east-1d", "us-east-1e", "us-west-1a", "us-west-1b", "us-west-1c",
"us-west-2a", "us-west-2b", "us-west-2c" ]
     },
    "Zone3": {
      "Description":"The AZ Zone 3",
      "Type":"String",
      "AllowedValues":
       [ "ap-northeast-1a", "ap-northeast-1b", "ap-northeast-1c", 
"ap-southeast-1a", "ap-southeast-1b", "ap-southeast-2a", 
"ap-southeast-2b", "eu-west-1a", "eu-west-1b", "eu-west-1c", 
"sa-east-1a", "sa-east-1b", "us-east-1a", "us-east-1b", "us-east-1c",
"us-east-1d", "us-east-1e", "us-west-1a", "us-west-1b", "us-west-1c",
"us-west-2a", "us-west-2b", "us-west-2c" ]
     }
   },
  "Conditions": {
    "UseLogSNS": {
      "Fn::Not": [ {
         "Fn::Equals": [ {
            "Ref":"LogSNS"
           },"none"]
        }]
     },
    "UseSpot": {
      "Fn::Not": [ {
         "Fn::Equals": [ {
            "Ref":"SpotPrice"
           }, 0 ]
        }]
     }
   },
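
The two conditions and the numeric limits declared above can be mirrored outside CloudFormation to check a set of inputs before launching a stack. This is a rough sketch of the intent of the template, not of how CloudFormation itself evaluates it; the function names are hypothetical and the limits are copied from the MinValue and MaxValue entries shown above.

```python
NUMBER_LIMITS = {
    "NodesPerZone": (0, 20),
    "SpotPrice": (0, 2),
    "VolumeSize": (10, 1000),
}


def evaluate_conditions(params):
    """Mirror UseLogSNS (LogSNS is not 'none') and UseSpot (SpotPrice is not 0)."""
    return {
        "UseLogSNS": params.get("LogSNS", "none") != "none",
        "UseSpot": float(params.get("SpotPrice", 0)) != 0,
    }


def check_number_limits(params):
    """Return a message for each numeric parameter outside its declared range."""
    errors = []
    for name, (low, high) in NUMBER_LIMITS.items():
        if name in params and not low <= float(params[name]) <= high:
            errors.append("%s must be between %s and %s" % (name, low, high))
    return errors


print(evaluate_conditions({"LogSNS": "none", "SpotPrice": 0}))
```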

Mappings Declaration

The Mappings portion of the template provides a way of looking up values from a table.

The AWSInstanceType2Arch map defines the values for all of the possible instance types. The PVM value defines the instance type as a Paravirtual Machine, whereas the HVM value defines the instance type as a Hardware Virtual Machine. The AWSRegionArch2AMI map defines the AMIs for each region. Each region has both a PVM and HVM AMI.

For details on HVM AMIs, see http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using_cluster_computing.html.

"Mappings":
  {
    "AWSInstanceType2Arch": {
      "c1.medium" : {
        "Arch" : "PVM"
      },
      "c1.xlarge" : {
        "Arch" : "PVM"
      },
      "c3.2xlarge" : {
        "Arch" : "HVM"
      },
      "c3.4xlarge" : {
        "Arch" : "HVM"
      },
      "c3.8xlarge" : {
        "Arch" : "HVM"
      },
      "c3.large" : {
        "Arch" : "HVM"
      },
      "c3.xlarge" : {
        "Arch" : "HVM"
      },
      "cc2.8xlarge" : {
        "Arch" : "HVM"
      },
      "cr1.8xlarge" : {
        "Arch" : "HVM"
      },
      "hi1.4xlarge" : {
        "Arch" : "HVM"
      },
      "hs1.8xlarge" : {
        "Arch" : "HVM"
      },
      "i2.2xlarge" : {
        "Arch" : "HVM"
      },
      "i2.4xlarge" : {
        "Arch" : "HVM"
      },
      "i2.8xlarge" : {
        "Arch" : "HVM"
      },
      "i2.xlarge" : {
        "Arch" : "HVM"
      },
      "m1.large" : {
        "Arch" : "PVM"
      },
      "m1.medium" : {
        "Arch" : "PVM"
      },
      "m1.small" : {
        "Arch" : "PVM"
      },
      "m1.xlarge" : {
        "Arch" : "PVM"
      },
      "m2.2xlarge" : {
        "Arch" : "PVM"
      },
      "m2.4xlarge" : {
        "Arch" : "PVM"
      },
      "m2.xlarge" : {
        "Arch" : "PVM"
      },
      "m3.2xlarge" : {
        "Arch" : "HVM"
      },
      "m3.large" : {
        "Arch" : "HVM"
      },
      "m3.medium" : {
        "Arch" : "HVM"
      },
      "m3.xlarge" : {
        "Arch" : "HVM"
      },
      "r3.2xlarge" : {
        "Arch" : "HVM"
      },
      "r3.4xlarge" : {
        "Arch" : "HVM"
      },
      "r3.8xlarge" : {
        "Arch" : "HVM"
      },
      "r3.large" : {
        "Arch" : "HVM"
      },
      "r3.xlarge" : {
        "Arch" : "HVM"
      },
      "t2.medium" : {
        "Arch" : "HVM"
      },
      "t2.small" : {
        "Arch" : "HVM"
      },
      "cc1.4xlarge" : {
        "Arch" : "HVM"
      }
    },

AWSRegionArch2AMI describes the AMIs used. These will change for each new release of MarkLogic.

    "AWSRegionArch2AMI": {
      "us-east-1" : {
        "HVM" : "ami-a5ee08ce"
      },
      "us-west-1" : {
        "HVM" : "ami-d313fb97"
      },
      "us-west-2" : {
        "HVM" : "ami-ed81bddd"
      },
      "eu-west-1" : {
        "HVM" : "ami-e9b9c99e"
      },
      "ap-southeast-1" : {
        "HVM" : "ami-181e264a"
      },
      "ap-southeast-2" : {
        "HVM" : "ami-a993ea93"
      },
      "ap-northeast-1" : {
        "HVM" : "ami-ae61b0ae"
      },
      "sa-east-1" : {
        "HVM" : "ami-35b13028"
      }
    }
  },
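
The launch configurations later in the sample template resolve the AMI with a nested Fn::FindInMap: the instance type is first mapped to an architecture, and the region and architecture are then mapped to an AMI id. The sketch below reproduces that two-step lookup with plain dictionaries. The function and variable names are hypothetical, and only a few entries from the maps above are copied in.

```python
# Excerpts of the two maps shown above.
INSTANCE_TYPE_TO_ARCH = {
    "r3.8xlarge": "HVM",
    "c3.large": "HVM",
    "m1.medium": "PVM",
}
REGION_ARCH_TO_AMI = {
    "us-east-1": {"HVM": "ami-a5ee08ce"},
    "us-west-2": {"HVM": "ami-ed81bddd"},
}


def find_ami(region, instance_type):
    """Two-step lookup: instance type -> architecture -> AMI id for the region."""
    arch = INSTANCE_TYPE_TO_ARCH[instance_type]
    return REGION_ARCH_TO_AMI[region][arch]


print(find_ami("us-east-1", "r3.8xlarge"))
```

With the excerpt above, a PVM instance type such as m1.medium raises a KeyError, because the AMI map shown lists only HVM images.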

Resources Declaration

The Resources portion of the template defines all of the AWS resources created for your stack by this template. Each resource is defined as a specific AWS type. The details of each resource type are described in http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-template-resource-type-ref.html.

The resources defined in this template include:

  • Elastic Block Store (EBS) volumes
  • DynamoDB Table (DynamoDB is the Amazon implementation of the Metadata Database)
  • AutoScaling Groups (ASG). For each ASG, there are the following resources:
    • Security Group
    • Instance Type
    • Identity and Access Management (IAM) Instance Profile
    • Launch Configuration
    • UserData
    • Elastic Load Balancer (ELB)
  • ELB ports
  • Health Check values
  • Security Group for each EC2 Instance

The first part of the Resources portion defines the EBS volumes used by /var/opt/MarkLogic for the first node in Zone1, Zone2 and Zone3. For details on the AWS::EC2::Volume type, see http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-ec2-ebs-volume.html.

All EBS volume definitions are similar to MarklogicVolume1 for Zone1, shown below.

  "Resources":
   {
    "MarklogicVolume1":
     {
      "Type":"AWS::EC2::Volume",
      "Properties": {
        "AvailabilityZone": {
          "Ref":"Zone1"
         },
        "Size": {
          "Ref":"VolumeSize"
         },
        "Tags": [ {
          "Key":"Name",
          "Value":"MarkLogicData 1"
          }],
        "VolumeType" : {
          "Ref" : "VolumeType"
        }
      }
    },

MarkLogicDDBTable creates a DynamoDB database used as the Metadata Database, described in Amazon EC2 Terminology, and returns the name of the DynamoDB Table.

The read and write capacity are both set to 10 for a three-node template and 2 for a single-node template. It is critical to make sure you have enough capacity provisioned for peak periods, which occur when the instances in a large cluster are restarted simultaneously. If you don't have enough capacity, the cluster may not recouple correctly when nodes are replaced following termination. You can set a CloudWatch alarm on capacity, which can either alert you manually or trigger a script to modify the capacity.

For details on the AWS::DynamoDB::Table type, see http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-dynamodb-table.html.

"MarkLogicDDBTable" : {
      "Type" : "AWS::DynamoDB::Table",
      "Properties" : {
        "AttributeDefinitions" : [ {
          "AttributeName" : "node",
          "AttributeType" : "S"
        } ],
        "KeySchema" : [ {
          "AttributeName" : "node",
          "KeyType" : "HASH"
        } ],
        "ProvisionedThroughput" : {
          "WriteCapacityUnits" : "10",
          "ReadCapacityUnits" : "10"
        }
      }
    },

MarkLogicServerGroup1, MarkLogicServerGroup2 and MarkLogicServerGroup3 are the AutoScaling Groups (ASGs) for Zone1, Zone2 and Zone3. For details on the AWS::AutoScaling::AutoScalingGroup type, see http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-as-group.html. All of them are similar to MarkLogicServerGroup1 for Zone1, shown below.

"MarkLogicServerGroup1":
     {
      "Type":"AWS::AutoScaling::AutoScalingGroup",
      "Properties": {
        "AvailabilityZones": [ {
           "Ref":"Zone1"
          } ],
        "LaunchConfigurationName": {
          "Ref":"LaunchConfig1"
         },
        "MinSize":"0",
        "MaxSize": {
          "Ref":"NodesPerZone"
         },
        "DesiredCapacity": {
          "Ref":"NodesPerZone"
         },
        "Cooldown": "300",
        "HealthCheckType": "EC2",
        "HealthCheckGracePeriod": "300",
        "LoadBalancerNames": [ {
           "Ref":"ElasticLoadBalancer"
        } ],
        "NotificationConfiguration": {
           "Fn::If" : [ "UseLogSNS", {
             "TopicARN": {
               "Ref":"LogSNS"
             },

NotificationTypes describes the notifications to be sent to the SNS Topic supplied to the cloud formation script to allow monitoring of AutoScaling group actions.

          "NotificationTypes": ["autoscaling:EC2_INSTANCE_LAUNCH",
"autoscaling:EC2_INSTANCE_LAUNCH_ERROR",
"autoscaling:EC2_INSTANCE_TERMINATE",
"autoscaling:EC2_INSTANCE_TERMINATE_ERROR"]
        }, {
          "Ref" : "AWS::NoValue"
        } ]
      }
    }
  },

LaunchConfig1, LaunchConfig2 and LaunchConfig3 are the Launch Configurations for ASG 1, ASG 2 and ASG 3. These describe how to look up the AMI id associated with the region, instance type, and architecture (PVM vs. HVM). All are similar to that below for ASG 1. For details on the AWS::AutoScaling::LaunchConfiguration type, see http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-as-launchconfig.html.

    "LaunchConfig1": {
      "Type":"AWS::AutoScaling::LaunchConfiguration",
      "Properties": {
        "KeyName": {
          "Ref":"KeyName"
         },
        "ImageId": {
          "Fn::FindInMap": ["AWSRegionArch2AMI", {
             "Ref":"AWS::Region"
          }, {
          "Fn::FindInMap": ["AWSInstanceType2Arch", {
             "Ref":"InstanceType"
          }, "Arch" ]
        } ]
      },

Each Launch Configuration has a UserData and a SecurityGroups property, as shown below.

The UserData property is populated with the data assigned to the variables described in AWS Configuration Variables. Below is the UserData property for ASG 1.

In VolumeSize, the ,* defines the volume size for the 2nd and any additional nodes in each ASG. The # indicates that the nodes are dynamically named and a numeric suffix is added from 1 - MaxNodesPerZone.

         "UserData": {
           "Fn::Base64" : {
             "Fn::Join" : [ "", [ "MARKLOGIC_CLUSTER_NAME=", {
               "Ref" : "MarkLogicDDBTable"
             }, "\n", "MARKLOGIC_EBS_VOLUME=", {
               "Ref" : "MarklogicVolume1"
             }, ",:", {
               "Ref" : "VolumeSize"
             }, "::", {
               "Ref" : "VolumeType"
             }, "::,*\n", "MARKLOGIC_NODE_NAME=NodeA#\n",
                "MARKLOGIC_ADMIN_USERNAME=", {
               "Ref" : "AdminUser"
             }, "\n", "MARKLOGIC_ADMIN_PASSWORD=", {
               "Ref" : "AdminPass"
             }, "\n", "MARKLOGIC_CLUSTER_MASTER=1\n", 
                "MARKLOGIC_LICENSEE=", {
               "Ref" : "Licensee"
             }, "\n", "MARKLOGIC_LICENSE_KEY=", {
               "Ref" : "LicenseKey"
             }, "\n", "MARKLOGIC_LOG_SNS=", {
               "Ref" : "LogSNS"
             }, "\n" ] ]
           }
         },
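
To see what the instance actually receives, the Fn::Join and Fn::Base64 above can be reproduced locally. The sketch below joins the same fragments in the same order with an empty separator and then Base64-encodes the result. The function names and the sample argument values are hypothetical; in a real stack the table name and volume id come from the Ref values of MarkLogicDDBTable and MarklogicVolume1.

```python
import base64


def render_user_data(table, volume_id, volume_size, volume_type,
                     admin_user, admin_pass,
                     licensee="none", license_key="none", log_sns="none"):
    """Join the UserData fragments in the order used by the template."""
    parts = [
        "MARKLOGIC_CLUSTER_NAME=", table, "\n",
        "MARKLOGIC_EBS_VOLUME=", volume_id, ",:", str(volume_size),
        "::", volume_type, "::,*\n",
        "MARKLOGIC_NODE_NAME=NodeA#\n",
        "MARKLOGIC_ADMIN_USERNAME=", admin_user, "\n",
        "MARKLOGIC_ADMIN_PASSWORD=", admin_pass, "\n",
        "MARKLOGIC_CLUSTER_MASTER=1\n",
        "MARKLOGIC_LICENSEE=", licensee, "\n",
        "MARKLOGIC_LICENSE_KEY=", license_key, "\n",
        "MARKLOGIC_LOG_SNS=", log_sns, "\n",
    ]
    return "".join(parts)


def encode_user_data(text):
    """Base64-encode the rendered text, as Fn::Base64 does."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


# Placeholder values for illustration only.
text = render_user_data("example-table", "vol-0123", 10, "gp2", "admin", "secret")
print(text)
```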

Each Launch Configuration has a SecurityGroups property that assigns the security group defined by InstanceSecurityGroup to the Amazon EC2 instances in the Auto Scaling group. Each property is like the following.

"SecurityGroups": [ {
           "Ref":"InstanceSecurityGroup"
          }],
        "InstanceType": {
          "Ref":"InstanceType"
         },
        "IamInstanceProfile": {
          "Ref":"IAMRole"
         },
        "SpotPrice": {
          "Fn::If": ["UseSpot", {
             "Ref":"SpotPrice"
            }, {
             "Ref":"AWS::NoValue"
            } ]
         }
       }
     },

ElasticLoadBalancer is the Load Balancer for all of the ASGs. For details on the AWS::ElasticLoadBalancing::LoadBalancer type, see http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-ec2-elb.html.

"ElasticLoadBalancer": {
      "Type":"AWS::ElasticLoadBalancing::LoadBalancer",
      "Properties": {
        "AppCookieStickinessPolicy": [ {
           "CookieName":"SessionID",
           "PolicyName":"MLSession"
          } ],
        "AvailabilityZones": [ {
           "Ref":"Zone1"
          }, {
           "Ref":"Zone2"
          }, {
           "Ref":"Zone3"
          }],
        "ConnectionDrainingPolicy": {
            "Enabled": "true",
            "Timeout": "60"
         },
        "CrossZone": "true",

Listeners defines all of the ports the Elastic Load Balancer (ELB) opens to the public.

    "Listeners": [ {
           "LoadBalancerPort": "8000",
           "InstancePort": "8000",
           "Protocol":"HTTP",
           "PolicyNames": ["MLSession"]
          }, {
           "LoadBalancerPort": "8001",
           "InstancePort": "8001",
           "Protocol":"HTTP",
           "PolicyNames": ["MLSession"]
          }, {
           "LoadBalancerPort": "8002",
           "InstancePort": "8002",
           "Protocol":"HTTP",
           "PolicyNames": ["MLSession"]
          }, {
           "LoadBalancerPort": "8003",
           "InstancePort": "8003",
           "Protocol":"HTTP",
           "PolicyNames": ["MLSession"]
          }, {
           "LoadBalancerPort": "8004",
           "InstancePort": "8004",
           "Protocol":"HTTP",
           "PolicyNames": ["MLSession"]
          }, {
           "LoadBalancerPort": "8005",
           "InstancePort": "8005",
           "Protocol":"HTTP",
           "PolicyNames": ["MLSession"]
          }, {
           "LoadBalancerPort": "8006",
           "InstancePort": "8006",
           "Protocol":"HTTP",
           "PolicyNames": ["MLSession"]
          }, {
           "LoadBalancerPort": "8007",
           "InstancePort": "8007",
           "Protocol":"HTTP",
           "PolicyNames": ["MLSession"]
          }, {
           "LoadBalancerPort": "8008",
           "InstancePort": "8008",
           "Protocol":"HTTP",
           "PolicyNames": ["MLSession"]
          } ],
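
Because the nine listener entries differ only in port number, they can be generated rather than typed by hand. The following is a minimal sketch, assuming you build the template with a script; the function name build_listeners is hypothetical, while the keys and values are taken from the Listeners section above.

```python
import json


def build_listeners(first_port=8000, last_port=8008, policy="MLSession"):
    """Return one HTTP listener per port, mirroring the Listeners entries."""
    return [
        {
            "LoadBalancerPort": str(port),
            "InstancePort": str(port),
            "Protocol": "HTTP",
            "PolicyNames": [policy],
        }
        for port in range(first_port, last_port + 1)
    ]


if __name__ == "__main__":
    # Print the fragment so it can be compared with the template.
    print(json.dumps({"Listeners": build_listeners()}, indent=2))
```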

HealthCheck checks the health of each MarkLogic instance by contacting its HealthCheck App Server on port 7997 at the interval, in seconds, specified by Interval. Any response other than "200 OK" within the Timeout period (in seconds) counts as a failed check. After UnhealthyThreshold consecutive failed checks, the instance is considered unhealthy and is removed from the ELB; after HealthyThreshold consecutive successful checks, it is returned to service. For details on the HealthCheck parameters, see http://docs.aws.amazon.com/ElasticLoadBalancing/latest/APIReference/API_HealthCheck.html.

        "HealthCheck": {
          "Target":"HTTP:7997/",
          "HealthyThreshold": "3",
          "UnhealthyThreshold": "5",
          "Interval": "10",
          "Timeout": "5"
         }
       }
     },
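
The HealthCheck values above imply rough detection times, since an instance changes state only after a run of consecutive check results. The sketch below works through that arithmetic. It is an approximation that ignores the Timeout and the point within an interval at which a failure begins, and the function name is an assumption for this example.

```python
def health_check_timing(interval, healthy_threshold, unhealthy_threshold):
    """Estimate seconds of consecutive results needed to change state."""
    return {
        "seconds_to_unhealthy": interval * unhealthy_threshold,
        "seconds_to_healthy": interval * healthy_threshold,
    }


if __name__ == "__main__":
    # Values from the HealthCheck section of the sample template.
    timing = health_check_timing(interval=10, healthy_threshold=3,
                                 unhealthy_threshold=5)
    print(timing)
```

With the sample values, a failing instance is removed after roughly 50 seconds and a recovered instance is restored after roughly 30 seconds.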

InstanceSecurityGroup defines the Security Group for each EC2 Instance. For details on the AWS::EC2::SecurityGroup type, see http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-ec2-security-group.html.

     "InstanceSecurityGroup": {
      "Type":"AWS::EC2::SecurityGroup",
      "Properties": {
        "GroupDescription":"Enable SSH access and HTTP access on the inbound port",
        "SecurityGroupIngress": [ {
           "IpProtocol":"tcp",
           "FromPort": "22",
           "ToPort": "22",
           "CidrIp":"0.0.0.0/0"
          }, {
           "IpProtocol":"tcp",
           "FromPort": "7998",
           "ToPort": "7998",
           "CidrIp":"0.0.0.0/0"
          }, {
           "IpProtocol":"tcp",
           "FromPort": "8000",
           "ToPort": "8010",
           "SourceSecurityGroupOwnerId": {
             "Fn::GetAtt": [
               "ElasticLoadBalancer","SourceSecurityGroup.OwnerAlias"
             ] },
           "SourceSecurityGroupName": {
             "Fn::GetAtt": [
               "ElasticLoadBalancer","SourceSecurityGroup.GroupName"
             ] }
          }, 
          {
           "IpProtocol":"tcp",
           "FromPort": "8000",
           "ToPort": "8010",
           "CidrIp":"0.0.0.0/0"
          }, {
           "IpProtocol":"tcp",
           "FromPort": "7997",
           "ToPort": "7997",
           "SourceSecurityGroupOwnerId": {
             "Fn::GetAtt": [
               "ElasticLoadBalancer","SourceSecurityGroup.OwnerAlias"
             ] },
           "SourceSecurityGroupName": {
             "Fn::GetAtt": [
               "ElasticLoadBalancer","SourceSecurityGroup.GroupName"
             ] }
          } ]
       }
     },

InstanceSecurityGroupIngress defines the ingress rules that control the inbound traffic that is allowed to reach your instances. For details on the AWS::EC2::SecurityGroupIngress type, see http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-ec2-security-group-ingress.html.

InstanceSecurityGroupIngress is declared separately from InstanceSecurityGroup because the rule names the security group as its own traffic source (on port 7999), and a resource cannot reference itself from within its own definition.

"InstanceSecurityGroupIngress": {
      "Type":"AWS::EC2::SecurityGroupIngress",
      "Properties": {
        "IpProtocol":"tcp",
        "GroupName": {
          "Ref":"InstanceSecurityGroup"
         },
        "FromPort": "7999",
        "ToPort": "7999",
        "SourceSecurityGroupName": {
          "Ref":"InstanceSecurityGroup"
         }
       }
     }
   },

Outputs Declaration

If the CloudFormation launch is successful, Outputs generates the URL of the ELB pointing to the MarkLogic Admin Interface port (8001).

  "Outputs": {
    "URL": {
      "Description":"The URL of the MarkLogic Cluster",
      "Value": {
        "Fn::Join": [ "", [ "http://", {
            "Fn::GetAtt": ["ElasticLoadBalancer","DNSName"]
           },":8001" ] ]
       }
     }
   }
 }
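
To see how the Fn::Join expression in Outputs assembles the URL, the sketch below reproduces the concatenation in Python. The helper names are assumptions for this example, and the DNS name is a made-up placeholder standing in for the value that Fn::GetAtt returns for the load balancer.

```python
def fn_join(delimiter, parts):
    """Mimic Fn::Join: concatenate parts with the given delimiter."""
    return delimiter.join(parts)


def admin_url(elb_dns_name):
    """Build the Admin Interface URL the same way the Outputs section does."""
    return fn_join("", ["http://", elb_dns_name, ":8001"])


if __name__ == "__main__":
    # Placeholder DNS name; the real value comes from the deployed ELB.
    print(admin_url("example-elb.example.com"))
```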

Using CloudFormation with Secure Credentials

The sample templates are not designed for production environments. Most deployments have specific infrastructure and integration requirements that you will need to address. An important issue is how to manage secure credentials for MarkLogic in an automated "hands off" process. The sample templates pass the Admin Password in plain text as CloudFormation parameters, which are then converted into simple EC2 User Data name/value pairs. This is not a secure method of handling credentials.

As mentioned in Configuration using the /etc/marklogic.conf File, an alternative to EC2 UserData is to create /etc/marklogic.conf during the deployment. This can be done fairly easily in CloudFormation. For production deployments using CloudFormation, the AWS::CloudFormation::Init resource (and the cfn-init helper commands) are recommended for deployment and configuration. See: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-init.html.

If you are not using CloudFormation, you can use the cloud-init service, the low-level API that CloudFormation uses, directly. See http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html for details.

With the CloudInit resource, EC2 UserData is only used for a small 'bootstrap' script that accesses the configuration variables from the template metadata resource securely via cfn-init. By passing a reference to a secure channel for credentials instead of the credentials themselves, no confidential data is passed directly from the origin to the EC2 instance. This process is recommended by AWS and discussed in this posting:

http://blogs.aws.amazon.com/application-management/post/Tx3LKFZ27CWZBKO/Authenticated-File-Downloads-with-CloudFormation

There are many options for configuring the necessary authentication and providing protected storage and access. Choosing the appropriate configuration is specific to your requirements and integration strategy and should be part of your overall IT and security planning. Integrating MarkLogic deployment with CloudFormation or another orchestration tool requires only that the file /etc/marklogic.conf be created prior to the first startup of MarkLogic on that instance.

Below are snippets of the Launch Configuration and AutoScalingGroup sections from an example CloudFormation template that uses CloudInit and a secure S3 bucket for the admin password. The URL of the S3 file does not itself need to be confidential, so it may safely be passed as a CloudFormation parameter and stored for the lifetime of the instance. In the Launch Configuration, a simple script invokes cfn-init, passing a reference to the Metadata attribute associated with the AutoScalingGroup for a zone. The Metadata attribute is a sibling of "Properties" in the AutoScalingGroup section.

The "files" entry in the AutoScalingGroup section writes /etc/marklogic.conf with owner and group root and mode 000400 (readable only by the owner).

The "services" entry in the AutoScalingGroup section starts MarkLogic after CloudInit is complete and restarts it if /etc/marklogic.conf or /etc/sysconfig/MarkLogic is updated by CloudInit in the future.

Example Launch Configuration Snippet:

"LaunchConfig1" : {
      "Type" : "AWS::AutoScaling::LaunchConfiguration",
      "Properties" : {
         ....
         "UserData": {"Fn::Base64": {"Fn::Join": [
           "",
           [
             "#!/bin/bash\n",
             "function error_exit\n",
             "{\n",
             "logger -t MarkLogic  \"$1\"\n",
             "exit 1\n",
             "}\n",
             "yum update -y aws-cfn-bootstrap\n",
             "yum update -y\n",
             "# Install application\n",
             "/opt/aws/bin/cfn-init -v -s ",
             {"Ref": "AWS::StackId"}, " -r ASG1  --region ",
             {"Ref": "AWS::Region"}, " || error_exit 'Failed to run cfn-init'\n",
             "\n",
             "# All is well so signal success\n",
             "\n"
           ]
         ]}}
      }
}

Example AutoScalingGroup Snippet:

"ASG1" : {
  "Type" : "AWS::AutoScaling::AutoScalingGroup",
  "Properties" : {
    .....
  },
  "Metadata": {
    "MarkLogic::MetaDataVersion": "2015-07-17-14:49:23",
    "AWS::CloudFormation::Init": {
      "config": {
        "files": {
          "/etc/marklogic.conf": {
            "content": {"Fn::Join": [
              "",
              [
                "MARKLOGIC_CLUSTER_NAME=", {"Ref": "MarkLogicDDBTable"}, "\n",
                "MARKLOGIC_EBS_VOLUME=", {"Ref": "MarkLogicVolume1"}, "\n",
                "MARKLOGIC_NODE_NAME=NodeA#\n",
                "MARKLOGIC_ADMIN_USERNAME=", {"Ref": "AdminUser"}, "\n",
                "# Password obtained via protected S3 file\n",
                "# MARKLOGIC_ADMIN_PASSWORD=\n",
                "# $(s3 cp --region us-west-2 s3://bucket/secret-password - ) \n",
                "MARKLOGIC_ADMIN_PASSWORD=$( aws s3 --region ",
                {"Ref": "AWS::Region"}, " cp ", {"Ref": "AdminPassS3URL"}, " - )\n",
                "MARKLOGIC_CLUSTER_MASTER=0\n"
              ]
            ]},
            "mode": "000400",
            "owner": "root",
            "group": "root"
          }
        },
        "services": {
          "sysvinit": {
            "MarkLogic": {
              "enabled": "true",
              "ensureRunning": "true",
              "files": [
                "/etc/marklogic.conf",
                "/etc/sysconfig/MarkLogic"
              ]
            }
          }
        }
      }
    }
  }
}
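
To make the file permissions concrete, the following sketch writes a configuration file in the same NAME=value style with mode 0400, which is what the "mode": "000400" setting requests. It is an illustration only, assuming a POSIX system: it writes to a temporary directory rather than /etc, does not change ownership to root, and uses hypothetical values and function names.

```python
import os
import stat
import tempfile


def render_marklogic_conf(settings):
    """Render NAME=value lines in the style of /etc/marklogic.conf."""
    return "".join(f"{name}={value}\n" for name, value in settings.items())


def write_owner_read_only(path, content):
    """Create the file with mode 0400 so only its owner can read it."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o400)
    with os.fdopen(fd, "w") as handle:
        handle.write(content)
    os.chmod(path, 0o400)  # enforce the mode regardless of the process umask


if __name__ == "__main__":
    # Hypothetical values; the template fills these in from stack parameters.
    settings = {
        "MARKLOGIC_NODE_NAME": "NodeA#",
        "MARKLOGIC_ADMIN_USERNAME": "admin",
        "MARKLOGIC_CLUSTER_MASTER": "0",
    }
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "marklogic.conf")
        write_owner_read_only(target, render_marklogic_conf(settings))
        print(oct(stat.S_IMODE(os.stat(target).st_mode)))
```

The password is deliberately left out of the sketch; in the template it is fetched from the protected S3 file rather than stored in the stack parameters.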

Deleting a CloudFormation Stack

To delete a CloudFormation stack, follow the procedure described in http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-console-delete-stack.html.

Deleting your CloudFormation stack removes most of the EC2 resources (instances, security groups, and so on) created by your CloudFormation template. The exception is that not all EBS volumes are removed. To remove the remaining EBS volumes after deleting your stack, you must delete them manually by following the procedure described in http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-deleting-volume.html.

When a stack is deleted, the EBS volume that was created for the first node in each zone is also deleted. However, the EBS volumes for any additional nodes in each zone are not deleted, because they were not created directly in the CloudFormation stack but as part of the startup process of the additional nodes.
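
When looking for leftover volumes, it can help to filter a volume listing down to those that are no longer attached. The sketch below does not call AWS; it assumes you already have records shaped like the entries of an EC2 DescribeVolumes response, with VolumeId and State fields, where a State of "available" means the volume is not attached. The function name and sample IDs are hypothetical. Other unattached volumes in the account would match as well, so review the list before deleting anything.

```python
def unattached_volume_ids(volumes):
    """Return IDs of volumes whose State is 'available' (not attached)."""
    return [volume["VolumeId"] for volume in volumes
            if volume.get("State") == "available"]


if __name__ == "__main__":
    # Made-up records for illustration only.
    sample = [
        {"VolumeId": "vol-example1", "State": "in-use"},
        {"VolumeId": "vol-example2", "State": "available"},
    ]
    print(unattached_volume_ids(sample))
```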
