This chapter describes how to launch a MarkLogic Server AMI and access the MarkLogic Server Admin interface. This chapter includes the following sections:
This section describes how to access an instance of MarkLogic Server in EC2.
There are two ways to access a MarkLogic Server instance:
The difference is that the ELB URL directs you to any available instance of MarkLogic Server in your cluster. If you want to access a specific instance, as you would when running the mlcmd script described in Using the mlcmd Script, use the Public DNS for that instance.
You can access the MarkLogic Admin Interface through the ELB by clicking the URL in the Outputs portion of the CloudFormation Console, as described in step 11 of Creating a CloudFormation Stack using the AWS Console.
This section describes how to access MarkLogic Server through the ELB from the EC2 Dashboard.
You use the URL to access MarkLogic Server on any of the ports you have defined as ELB ports. For example, if the URL is DLEE-CF-L-ElasticL-OCCR192PW0OO-925510329.us-east-1.elb.amazonaws.com, then to access MarkLogic port 7800 you would enter the following URL into the browser:

http://DLEE-CF-L-ElasticL-OCCR192PW0OO-925510329.us-east-1.elb.amazonaws.com:7800
Do not use a load balancer to access MarkLogic port 8001. The Admin Interface is not designed to be used behind a load balancer.
This section describes how to access MarkLogic Server through the Public DNS of an Instance.
You use the Public DNS to formulate the URL to access MarkLogic Server. For example, if the Public DNS is ec2-54-242-94-98.compute-1.amazonaws.com, then to access the Admin Interface you would enter the following URL into the browser:

http://ec2-54-242-94-98.compute-1.amazonaws.com:8001
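The mapping from Public DNS and port to URL can be sketched in a short shell snippet; the DNS name below is the example value from the text:

```shell
# Build a MarkLogic URL from an instance's Public DNS (example value from the text).
# Port 8001 is the Admin Interface; app server ports such as 7800 work the same way.
PUBLIC_DNS="ec2-54-242-94-98.compute-1.amazonaws.com"
ADMIN_URL="http://${PUBLIC_DNS}:8001"
echo "$ADMIN_URL"
```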
You may need to SSH into an EC2 instance for certain tasks, such as checking the log files for that instance, as described in Detecting EC2 Errors.
You cannot SSH to the load balancer; you must SSH to a specific EC2 instance. To SSH into an EC2 instance, you must have the key pair used by the instance downloaded to your local host.
To SSH into an instance, do the following: enter ec2-user as the User Name, provide the path to your copy of the key pair you downloaded to your local host, and click Launch SSH Client. Click Yes for each prompt.
Alternatively, you can open a shell window and SSH into an instance using the following command:
ssh -i /path/to/keypair.pem ec2-user@<Public DNS>
For example, if your key pair, named newkey.pem, is stored in your c:/stuff/ directory, you can access the instance with a public DNS of ec2-54-242-94-98.compute-1.amazonaws.com as follows:
ssh -i c:/stuff/newkey.pem ec2-user@ec2-54-242-94-98.compute-1.amazonaws.com
Startup errors are stored in the /var/log/messages file on each instance. To view the messages file, SSH into an instance as described in Accessing an EC2 Instance.
To access the messages file, you must be the super user. For example, to tail the messages file, enter:
sudo tail -f /var/log/messages
You can also capture errors related to the CloudFormation stack by means of the SNS Topic, as described in Creating a Simple Notification Service (SNS) Topic.
The mlcmd script supports startup operations and advanced use of the Managed Cluster features. The mlcmd script is installed as an executable script at /opt/MarkLogic/bin/mlcmd.
To run mlcmd, you must be logged into the host as root or with root privileges. You must also have Java installed, with the java command in the PATH or with JAVA_HOME set to the JRE or JDK home directory. The first time you start MarkLogic on your server, the /var/local/mlcmd.conf file is created; this file is required by the mlcmd script. Once /var/local/mlcmd.conf has been created, it is not necessary for MarkLogic to be running in order to use the mlcmd script.
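As a sketch, the prerequisites above (java on the PATH or JAVA_HOME set, and the presence of /var/local/mlcmd.conf) can be verified before invoking mlcmd. The check function below is illustrative, not part of the product; the paths are the ones given in the text:

```shell
# Sketch: verify the mlcmd prerequisites described above before running it.
check_mlcmd_prereqs() {
  # java must be on the PATH, or JAVA_HOME must point to a JRE/JDK home
  if command -v java >/dev/null 2>&1 || [ -n "$JAVA_HOME" ]; then
    echo "java: ok"
  else
    echo "java: missing (add java to PATH or set JAVA_HOME)"
  fi
  # mlcmd.conf is created the first time MarkLogic starts on this host
  if [ -f /var/local/mlcmd.conf ]; then
    echo "mlcmd.conf: ok"
  else
    echo "mlcmd.conf: missing (start MarkLogic once to create it)"
  fi
}
check_mlcmd_prereqs
```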
The syntax of mlcmd is as follows:

mlcmd command

The mlcmd commands are listed below:
mlcmd Command | Description |
---|---|
sync-volumes-from-mdb | Attaches EBS volumes not currently attached to this instance. |
sync-volumes-to-mdb | Synchronizes the EBS volumes currently attached to the system and stores them in the Metadata Database. |
init-volumes-from-system | Initializes volumes using the same process performed on startup. |
leave-cluster | Removes the host on which it is executed from the cluster. |
This command looks in the Metadata Database and does the following: it attaches and mounts each EBS volume recorded there that is not currently attached to this instance, and adds a tag prefixed with marklogic: to the EBS volume.

This command can be run at any time after the initial startup. It synchronizes the EBS volumes currently attached to the system and stores them in the Metadata Database so that, on the next restart, they will be attached and mounted.
No changes to existing attachments, filesystem, or mount points are performed.
This command looks at the current system and attempts to initialize volumes using the same process performed on startup.
This command can be executed on a host to remove that host from the cluster. It also removes the host from the cluster configuration information stored in the Metadata Database, leaving the host server in a pre-initialized state (the same as a fresh install). If the server is restarted, it will re-join the cluster in the same manner as an initial start.
Use the optional -terminate argument to terminate the instance and decrement the DesiredCount attribute of the AutoScaling group by one after leaving the cluster.
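A minimal sketch of removing and terminating a node, run as root on the host itself; the guard is illustrative, so the command only executes where mlcmd is actually installed:

```shell
# Run (as root) on the host to be removed: leave the cluster, terminate the
# instance, and decrement the AutoScaling group's DesiredCount, per the
# -terminate behavior described above. The existence check is illustrative.
MLCMD=/opt/MarkLogic/bin/mlcmd
if [ -x "$MLCMD" ]; then
  "$MLCMD" leave-cluster -terminate
else
  echo "mlcmd not found at $MLCMD; run this on a MarkLogic EC2 host"
fi
```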
Amazon S3 support is built into MarkLogic Server as an available file system type. You configure S3 access at the group level. Once you have configured a group for S3, any forest in the group can be placed on S3 by specifying an S3 Path. Additionally, any host in the group can do backups to S3, restore from S3, as well as read and write directories and files on S3.
Transaction journaling does not work on S3 because the S3 file system cannot perform the file operations necessary to maintain a journal. Unless your S3 forest is configured with a fast data directory or its updates allowed setting is read-only, you must set the journaling option on your database to off before attaching the forest to the database. This is not a requirement for backup/restore operations on a database, however.
To configure MarkLogic to access Amazon S3, do the following:
Follow the directions in http://docs.aws.amazon.com/gettingstarted/latest/wah/getting-started-create-bucket.html to set up your S3 bucket.
Bucket names containing a period (.) can cause multiple problems; for maximum compatibility with S3, use a dash (-) instead.
Bucket names are global; they are not scoped to your account. Choose bucket names that have a good chance of being universally unique. For example, use zippy-software-org-test rather than test.
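The naming advice above (dashes rather than periods) can be sketched as a small validation function; the function name is illustrative, not a MarkLogic or AWS tool:

```shell
# Sketch: flag S3 bucket names that contain periods, per the compatibility
# note above. Function name is illustrative.
check_bucket_name() {
  case "$1" in
    *.*) echo "warn: '$1' contains a period; prefer dashes" ;;
    *)   echo "ok: '$1'" ;;
  esac
}
check_bucket_name "zippy-software-org-test"
check_bucket_name "zippy.software.org.test"
```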
Do not use the S3 Management Console to upload your content to S3. Instead, follow any of the procedures described in the Loading Content Into MarkLogic Server Guide after you have completed the configuration procedures.
The S3 Endpoint is configured by specifying the S3 properties for your MarkLogic group.
Setting | Description |
---|---|
s3 domain | The domain used for the S3 endpoint. The default value is set for your region. However, you can change it, if necessary. References to the regional endpoints can be found at http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region. |
s3 protocol | You can choose either http or https for communication with S3. The default is http. |
s3 server side encryption | Storage on S3 can participate in server-side encryption. The default is none, but you can set aes256 to enable server-side encryption. |
In order to use S3 you must supply S3 credentials. You can configure S3 credentials in one of three ways:
The order of precedence for locating S3 credentials is:
Fields for specifying the S3 credentials are located in the Security, Credentials, Configure tab.
You can set a pair of environment variables that the server will use as S3 AWS Credentials. These can be passed in as EC2 User Data or set into the environment in which MarkLogic runs.
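As a hedged sketch, the environment variables can be exported in the environment MarkLogic starts in (or passed as EC2 User Data). The variable names below are an assumption; confirm them against your MarkLogic release's documentation, and the key values are obvious placeholders:

```shell
# Sketch: supply S3 credentials via environment variables, per the text above.
# Variable names are an assumption -- verify them for your MarkLogic release.
# Key values are placeholders, not real credentials.
export MARKLOGIC_AWS_ACCESS_KEY="AKIAEXAMPLEKEY"
export MARKLOGIC_AWS_SECRET_KEY="example-secret"
echo "S3 credentials set in the MarkLogic environment"
```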
If you run an EC2 instance with an associated IAM Role, you can select a policy template that provides S3 access, such as 'Amazon S3 Full Access' or 'Amazon S3 Read Only Access.'
Your IAM Role is used for your security credentials, so you do not need to store any AWS Credentials in MarkLogic or on the EC2 instance in order to access S3 resources. This is the most secure way of accessing S3.
IAM roles are only used by the server if the MARKLOGIC_AWS_ROLE environment variable is set. This happens automatically unless you disable the EC2 configuration (for example, by setting MARKLOGIC_EC2_HOST=0), in which case the server will not use the MARKLOGIC_AWS_ROLE variable.
Set the data directory for the forest to a valid S3 path. For details on setting the forest data directory, see Creating a Forest in the Administrator's Guide. Multiple forests can be configured for the same bucket.
An S3 path has the following form:

s3://bucket/directory/file
Item | Description |
---|---|
bucket | The name of your S3 bucket. |
directory | Zero or more directory names, separated by forward slashes (/). |
file | The filename, if the path is to a specific file. |
For a directory path (such as a forest data directory), a bucket by itself is sufficient; files will be placed in the bucket root.
Example paths to S3 directories:

s3://my-company-bucket
s3://my-company-bucket/directory
s3://my-company-bucket/dir1/dir2/dir3

Example paths to S3 files:

s3://my-company-bucket/file.xml
s3://my-company-bucket/directory/file.txt
s3://my-company-bucket/dir1/dir2/dir3/file.txt
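Composing a forest data directory path from the bucket and directory parts described in the table above can be sketched as follows; the bucket and directory names are illustrative:

```shell
# Sketch: compose an s3:// data directory path from its parts.
# Bucket and directory names here are illustrative.
BUCKET="my-company-bucket"
DIR="Data/Forests"
FOREST_DIR="s3://${BUCKET}/${DIR}"
echo "$FOREST_DIR"
```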
Unless your S3 forest is configured with a fast data directory or its updates allowed setting is read-only, you must set journaling on your database to off before attaching the forest to the database. Failure to do so will result in a forest error, and you will have to restart the forest after you have disabled journaling on the database.
Load content into your S3 database using any of the methods described in the Loading Content Into MarkLogic Server Guide, and run a query to confirm that you have successfully configured MarkLogic Server with S3.
Content uploaded directly to your bucket using the S3 Management Console will not be recognized by MarkLogic Server.
If you have created your stack using the 3+ CloudFormation template, you can temporarily add nodes and forests to scale up your cluster for periods of heavy use, then remove them later when fewer resources are needed.
Adding more hosts to a cluster is simple: use the Update Stack feature to reapply the 3+ CloudFormation template and provide a larger number for the NodesPerZone setting. Alternatively, you can add hosts by means of your Auto Scaling Groups. The recommended way to scale up the data capacity of your cluster is to add additional volumes, as described in Creating an EBS Volume and Attaching it to an Instance.
Scaling a cluster down involves some manual intervention. The procedure is as follows: use the tieredstorage API to migrate partitions or forests to a volume on another host (for details on migrating data, see Migrating Forests and Partitions in the Administrator's Guide), then run the leave-cluster -terminate command on each host to be removed. This causes the node to leave the cluster and adjusts the AutoScaling Group DesiredCount setting. For details, see leave-cluster.

An existing CloudFormation template can be updated as long as the software and IT architecture are compatible and the changes do not require destructive modifications to AWS resources needed by the Managed Cluster feature or your data. General guidance on the effects of template updates can be found at http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks.html.
The latest sample templates are versioned, starting with V8.0.3. Major version number changes represent incompatible implementations of the Managed Cluster feature, so an EC2 stack created using an earlier CloudFormation template cannot be updated with a new CloudFormation template.
Within the same major template version, upgrades are supported by updating the AMI ids in your original template with the latest AMI ids from Marketplace corresponding to the same:
Depending on your customizations, some stack updates may be destructive to a running cluster even when using the same version and AMIs, and should be tested before being applied to a production workload.
The following procedure describes how to upgrade your stack to use new AMIs for a new release of MarkLogic within the same major template version:
The AWSRegionArch2AMI definition in your original template might look like the following:

"AWSRegionArch2AMI": {
    "us-east-1": { "PVM":"ami-41633528", "HVM":"ami-4363352a" },
    "us-west-2": { "PVM":"ami-e85bc5d8", "HVM":"ami-ea5bc5da" },
    "eu-west-1": { "PVM":"ami-68fa1c1f", "HVM":"ami-56fa1c21" }
  }
},
If, for example, your instances are located in the us-east-1
and us-west-2
regions, open the new template, locate the AWSRegionArch2AMI
definition, and copy the AMI ids for the us-east-1
and us-west-2
regions. For example, the new template contains:
"us-east-1" : { "HVM" : "ami-96ffe7fe" }, "us-west-2" : { "HVM" : "ami-75d3f245" },
You can then update your AWSRegionArch2AMI definition as follows:

"AWSRegionArch2AMI": {
    "us-east-1": { "PVM":"ami-41633528", "HVM":"ami-96ffe7fe" },
    "us-west-2": { "PVM":"ami-e85bc5d8", "HVM":"ami-75d3f245" },
    "eu-west-1": { "PVM":"ami-68fa1c1f", "HVM":"ami-56fa1c21" }
  }
},
Some changes made outside of CloudFormation before the upgrade will cause the upgrade to fail.
security-upgrade.xqy screen)

The following procedure describes how to upgrade instances that are brought up directly from an AMI. For each MarkLogic instance in your cluster, do the following:
AWS provides robust monitoring of EC2 instances, EBS volumes, and other services via the CloudWatch service. You can use CloudWatch to set thresholds on individual AWS services and send notifications via SMS or Email when these thresholds have been exceeded. For example, you can set a threshold on excessive storage throughput. You can also create your own metrics to monitor with CloudWatch. For example, you might write a custom metric to monitor the current free memory on your instances and to alarm or trigger an automatic response should a memory threshold be exceeded.
For details on the use of CloudWatch, see http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/CHAP_UsingCloudWatch.html.
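As one hedged illustration of setting such a threshold, an alarm on EBS write throughput could be created with the AWS CLI. The alarm name, threshold, volume ID, and SNS topic ARN below are placeholders, and the snippet only attempts the call where the aws CLI is installed and configured:

```shell
# Sketch: CloudWatch alarm on EBS write bytes via the AWS CLI.
# Alarm name, threshold, volume ID, and topic ARN are placeholders.
if command -v aws >/dev/null 2>&1 && aws sts get-caller-identity >/dev/null 2>&1; then
  # Alarm fires when summed write bytes over 5 minutes exceed ~100 MB
  aws cloudwatch put-metric-alarm \
    --alarm-name ml-ebs-write-throughput \
    --namespace AWS/EBS --metric-name VolumeWriteBytes \
    --statistic Sum --period 300 --evaluation-periods 1 \
    --threshold 100000000 --comparison-operator GreaterThanThreshold \
    --dimensions Name=VolumeId,Value=vol-0123456789abcdef0 \
    --alarm-actions arn:aws:sns:us-east-1:123456789012:ml-alerts || true
  STATUS="attempted"
else
  STATUS="skipped"
fi
echo "alarm creation: $STATUS"
```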
This section describes the steps for migrating your data and configuration from a data center to EC2.
There are a number of ways you could migrate from a local data center to EC2. The following is one possible procedure.
This section describes how to create an EBS volume and attach it to your MarkLogic Server instance.
In general, it is a best practice to have one volume per node and one forest per volume.
Use the following procedure to create an EBS volume.
The zones for your instance and EBS volume may not be the same by default.
When finished, click Yes, Create. Locate the reference to this new volume in the right-hand section of the management console and verify that its State is available.
This section describes how to use the EC2 Dashboard to attach a volume to an instance.
/dev/sdf. Click Yes, Attach when you are finished. Locate the reference to this volume in the right-hand section of the management console and verify that the status is in-use. If the status is not in-use, continue to click Refresh until it changes to in-use.

At any time, you can pause the cluster by using the Update Stack feature to reapply the CloudFormation template to your stack, setting the ASG NodesPerZone value to 0 for all nodes. You can later restart the nodes by resetting NodesPerZone to a value of 1 - 20 for each ASG.
Do not manually stop your MarkLogic instances from the EC2 dashboard; each AutoScaling Group will detect that they have stopped and will automatically restart them. The same is true if you shut down MarkLogic from the Admin Interface or by means of a MarkLogic API call.
To terminate your MarkLogic cluster, you can delete the stack, as described in Deleting a CloudFormation Stack.