MarkLogic 10 Product Documentation
MarkLogic Server on Microsoft® Azure® Guide
— Chapter 3

Managing MarkLogic on Azure

This chapter describes how to manage, scale, and monitor MarkLogic Server in the Azure environment. It covers the following topics:

Accessing Your MarkLogic Azure VM

After the Azure VM is up and available, navigate to the Virtual Machines tab on your Azure portal and click on the VM you just created. You can find the IP address there for accessing the MarkLogic Admin Console at port 8001.

In the MarkLogic Azure image, the following ports are open by default: 8000, 8001, 8002, 7997.
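One way to confirm that these ports are reachable from your workstation is a plain TCP connection test. The following is a minimal sketch using only the Python standard library. The function name check_ports is illustrative and is not part of any MarkLogic or Azure tooling.

```python
import socket

# Ports the MarkLogic Azure image opens by default (from the list above).
DEFAULT_PORTS = (8000, 8001, 8002, 7997)


def check_ports(host, ports=DEFAULT_PORTS, timeout=3.0):
    """Return a dict mapping each port to True if a TCP connection succeeds."""
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError on refusal, timeout, or DNS failure.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results
```

Call check_ports with the IP address shown for your VM in the Azure portal. A port reported as unreachable may be blocked by a network security group rule rather than closed on the VM itself.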

Attaching Data Disks

To preserve your data if the VM is re-provisioned, you must attach a persistent data disk to your Azure VM. Microsoft recommends using managed disks over unmanaged disks for new VMs.

The steps for attaching data disks to your VM from the Azure Portal can be found here:

https://docs.microsoft.com/en-us/azure/virtual-machines/linux/attach-disk-portal

Microsoft Azure Storage is the back-end to store VM disks and has two main SKUs, with different performance characteristics:

  • The Premium storage SKUs provide high-performance, low-latency disk support for I/O intensive workloads.
  • The Standard storage based SKUs offer a better price/GB ratio.

Not all VMs support attaching Premium Storage. In addition, Microsoft Azure provides unmanaged and managed disks in each of the two performance tiers. Azure recommends using managed disks for new VMs and converting your previous unmanaged disks to managed disks, to take advantage of the many features available in Managed Disks.

The MarkLogic Tiered Storage feature enables you to take advantage of many Azure Storage types. For more information, see Tiered Storage in the Administrator's Guide.

Place your most frequently used data on Premium Storage data disks for faster access. Place your less frequently used data on Standard Storage data disks at a lower cost. If you are considering leveraging the MarkLogic Tiered Storage capabilities with different Azure Storage types, please read the following discussion on scalability and performance targets:

https://docs.microsoft.com/en-us/azure/storage/common/storage-scalability-targets

Partitioning and Mounting Disks

To use data disks after they have been attached, you must also partition and mount them. The steps for partitioning and mounting data disks can be found here:

https://docs.microsoft.com/en-us/azure/virtual-machines/linux/add-disk

Loading Content into MarkLogic

To load content into MarkLogic Server on Azure, refer to the Loading Content Into MarkLogic Server Guide.

Configuring MarkLogic for Azure Blob Storage

Azure Blob Storage is similar to Amazon S3, and better in some ways: it does not have S3's read-after-write eventual consistency limitation, and it supports appending to blobs, so journal files work with it.

In MarkLogic, Azure pathnames are of the form azure://mycontainer/mydirectory/myfile. They are similar to MarkLogic S3 pathnames, which are of the form s3://mybucket/mydirectory/myfile. What S3 calls a bucket, Azure calls a container.
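To make the pathname structure concrete, the sketch below splits such a pathname into its scheme, container (or bucket), and the path within it. It is an illustration written with the Python standard library, not code from MarkLogic; the function name parse_storage_path is hypothetical.

```python
from urllib.parse import urlsplit


def parse_storage_path(path):
    """Split a cloud storage pathname into (scheme, container, object path)."""
    parts = urlsplit(path)
    scheme = parts.scheme.lower()
    if scheme not in ("azure", "s3"):
        raise ValueError(f"not an Azure or S3 pathname: {path!r}")
    if not parts.netloc:
        raise ValueError(f"missing container or bucket name: {path!r}")
    # The first component after '//' is the container; the rest is the blob path.
    return scheme, parts.netloc, parts.path.lstrip("/")
```

For the example pathname above, the container is mycontainer and the path within it is mydirectory/myfile.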

To use Azure Blob Storage, you need an Azure storage account and a container for your blobs. You can create and manage Azure storage accounts on the Azure portal: https://portal.azure.com/

For production environments, Premium or Standard storage disks in Azure are required. MarkLogic does not support regular or replica forests on Azure Blob Storage. Azure Blob Storage is supported for backups and for read-only forests used with the Tiered Storage feature.

After logging into the portal, click Storage accounts in the left navigation area. From there you can manage an existing storage account or add a new one. Storage account names are global, so Microsoft provides guidance on how to name them: https://docs.microsoft.com/en-us/azure/architecture/best-practices/naming-conventions#naming-rules-and-restrictions

Within your storage account, you also need at least one container for your blobs. On the Storage Account Management page, click Blobs to get to the containers page. From there you can manage an existing container or add a new one.

The Storage Account Management page also has an Access Keys link. You will need this link shortly, so keep the browser open to your Storage Account Management page.

MarkLogic needs Azure storage credentials to access Azure Blob Storage. To configure Azure storage credentials in MarkLogic, go to Security -> Credentials in the MarkLogic Admin GUI. On that page, right below where you would enter AWS credentials for S3, you can enter your Azure storage credentials.

You may also use your Azure VM identity to access Azure Blob Storage. Please see this web tutorial to learn how to create a storage account and a VM with identity on Azure.

Enter your Azure storage account and your Azure storage key. Your Azure storage account is the name of your storage account. Your Azure storage key is an access key automatically generated by Microsoft and provided to you. You can get your Azure storage key from the Storage Account Management page of the Azure portal.

On the Storage Account Management page on the Azure portal, click the Access keys link. You can use either the Key value for key1 or the Key value for key2; it doesn't matter which you use.

The easiest way to do this is to have two open browser windows, one open to the MarkLogic Admin GUI, and one open to the Azure Portal. Then you can just copy the value from one browser window and paste it into the other.

Azure storage access keys are like passwords, so they should be treated as passwords. MarkLogic stores them in encrypted form in its security database.

Once you have configured your Azure storage credentials in MarkLogic, you can access your containers as file systems. The filesystem built-in functions work with Azure pathnames, which are of the form azure://mycontainer/mydirectory/myfile. External binary references can have Azure pathnames, and MarkLogic accesses them transparently.

You can easily back up to an Azure pathname, and you can also archive journals to Azure. You can specify an Azure directory as a forest data directory. Forests on Azure Blob Storage can have journals, and they can be transactionally and robustly read/write, instead of just read-only as they are on S3. For production use, however, keep in mind the support restrictions on forests described earlier in this section.

You can script the configuration of Azure storage credentials with the security API functions sec:credentials-get-azure and sec:credentials-set-azure. The REST Management API endpoint /manage/v2/credentials/properties also supports credentials configuration.

If there are no Azure credentials in the Security database, MarkLogic will check environment variables. See Environment Variables for more information.

If you want MarkLogic Server to access Azure Blob Storage through a proxy server, you can specify the URL of the proxy server by setting MARKLOGIC_AZURE_STORAGE_PROXY in the MarkLogic Azure Solution Template, or you can use the Admin Interface to configure the proxy server for Azure Blob Storage, as follows:

  1. Log into the Admin Interface.
  2. Click the Groups icon on the left tree menu.
  3. Click the Configure tab at the top right.
  4. Locate the group for which you want to view settings.
  5. Click the icon for this group.
  6. Enter the URL to the Azure Blob Storage proxy server in the azure storage proxy field toward the bottom of the configuration page. The proxy URL should start with https:// (for example, https://proxy.marklogic.com:8080). If you don't specify the port number, MarkLogic assumes the proxy server is listening on port 8080.

    If the MARKLOGIC_AZURE_STORAGE_PROXY variable is set and azure storage proxy is not set in the Admin Interface group configuration, the value of MARKLOGIC_AZURE_STORAGE_PROXY is used. If MARKLOGIC_AZURE_STORAGE_PROXY is set and azure storage proxy is also set in the Admin Interface group configuration, the Admin Interface azure storage proxy setting is used.
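The precedence and default-port rules above can be summarized in a few lines of code. The following sketch only models the documented behavior; it is not MarkLogic's implementation, and the function name effective_azure_proxy and its parameters are hypothetical.

```python
import os
from urllib.parse import urlsplit

DEFAULT_PROXY_PORT = 8080  # port assumed when the proxy URL omits one


def effective_azure_proxy(group_setting=None, environ=None):
    """Return (host, port) of the proxy that would be used, or None."""
    environ = os.environ if environ is None else environ
    # The group-level Admin Interface setting takes precedence over the variable.
    url = group_setting or environ.get("MARKLOGIC_AZURE_STORAGE_PROXY")
    if not url:
        return None
    parts = urlsplit(url)
    return parts.hostname, parts.port or DEFAULT_PROXY_PORT
```

With both values set, the group setting is returned; with only the environment variable set, the variable's value is returned, with port 8080 filled in when the URL does not name a port.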

Credentials Configuration

You can script the configuration of Azure storage credentials with the security API functions sec:credentials-get-azure and sec:credentials-set-azure, which are part of security.xqy. The REST Management API endpoint /manage/v2/credentials/properties also supports credentials configuration.
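As an illustration of the REST route, the sketch below reads the current credentials configuration from the endpoint named above. It rests on several assumptions that you should verify against your own installation: that the Management API is served on port 8002 over plain HTTP, that the app server uses digest authentication, and that the format=json query parameter is accepted. The function names are hypothetical.

```python
import urllib.request


def credentials_properties_url(host, port=8002, fmt="json"):
    """Build the Management API URL for the credentials properties resource."""
    return f"http://{host}:{port}/manage/v2/credentials/properties?format={fmt}"


def fetch_credentials_properties(host, user, password, port=8002):
    """GET the credentials configuration (assumes HTTP digest authentication)."""
    url = credentials_properties_url(host, port)
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, url, user, password)
    opener = urllib.request.build_opener(
        urllib.request.HTTPDigestAuthHandler(password_mgr)
    )
    with opener.open(url, timeout=10) as response:
        return response.read().decode("utf-8")
```

The sketch deliberately covers only reading the properties. The payload format for updating them is not described in this chapter, so consult the REST Management API reference before scripting changes.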

Environment Variables

If there are no Azure credentials in the Security database, MarkLogic checks environment variables. You can specify your Azure storage account name with the environment variable MARKLOGIC_AZURE_STORAGE_ACCOUNT and your Azure storage access key with the environment variable MARKLOGIC_AZURE_STORAGE_KEY. On Linux, you can set these MarkLogic environment variables with export commands in /etc/marklogic.conf. Because Azure storage access keys are like passwords, it is better to let MarkLogic keep them encrypted in its Security database, but this alternate mechanism is available if needed.
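If you generate /etc/marklogic.conf from a provisioning script, the export lines can be rendered as in the following sketch. The helper name marklogic_conf_exports is hypothetical; only the two variable names come from this guide.

```python
import shlex


def marklogic_conf_exports(account, key):
    """Render export lines for the two Azure storage environment variables."""
    settings = {
        "MARKLOGIC_AZURE_STORAGE_ACCOUNT": account,
        "MARKLOGIC_AZURE_STORAGE_KEY": key,
    }
    # shlex.quote adds shell quoting only when a value contains characters
    # that the shell would otherwise interpret.
    return "\n".join(
        f"export {name}={shlex.quote(value)}" for name, value in settings.items()
    )
```

Because the resulting file contains the access key in clear text, restrict its permissions accordingly.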

Adding a Node to the Existing Cluster

To add more VMs as additional hosts for a MarkLogic cluster on the Azure platform, set the number of VM instances as needed from your Azure portal. For more details, see Adding a Host to a Cluster in the Administrator's Guide.

Since the IP address for each node changes as you start and stop an instance on Azure, make sure a public DNS name is assigned to every node before stopping an instance so that this node can join the cluster again.

MarkLogic recommends assigning a public DNS name to your VMs. For instructions on how to do this, see the following:

https://docs.microsoft.com/en-us/azure/virtual-machines/create-fqdn

Resizing Virtual Machines

See the following Microsoft Azure blog for recommendations on changing VM instance sizes:

https://azure.microsoft.com/en-us/blog/resize-virtual-machines/

Expect some downtime when resizing a VM. In addition, data stored on the temporary disk is lost during resizing.

Monitoring Virtual Machine Status

Azure has built-in monitoring for individual VMs, which shows the VM status. You can select which performance metrics you want to see, and the Azure portal displays them alongside the VM status.

Monitoring MarkLogic Server

MarkLogic provides a monitoring dashboard and REST Management APIs that you can use to integrate with other monitoring tools. For details, see Monitoring MarkLogic Server in the Monitoring MarkLogic Guide.
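As a starting point for such an integration, the sketch below polls host status through the REST Management API. It assumes the Management API is available on port 8002 over plain HTTP with digest authentication, and that the /manage/v2/hosts resource accepts view=status and format=json; check these against the Monitoring MarkLogic Guide for your version. The function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlencode


def host_status_url(host, port=8002):
    """Build the Management API URL that reports status for all hosts."""
    query = urlencode({"view": "status", "format": "json"})
    return f"http://{host}:{port}/manage/v2/hosts?{query}"


def fetch_host_status(host, user, password, port=8002):
    """GET host status as parsed JSON (assumes HTTP digest authentication)."""
    url = host_status_url(host, port)
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, url, user, password)
    opener = urllib.request.build_opener(
        urllib.request.HTTPDigestAuthHandler(password_mgr)
    )
    with opener.open(url, timeout=10) as response:
        return json.load(response)
```

A monitoring tool could call fetch_host_status on a schedule and forward selected fields from the returned document.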

Plugins for New Relic and AppDynamics are also available. The open source, community-supported New Relic plugins can be found in the tools section of our Developer Site.

Upgrading MarkLogic Server on Azure

The Azure Solution Template does not support updating nodes. MarkLogic Server on Azure must be upgraded manually.

To upgrade a MarkLogic cluster on Azure:

  1. Detach the data disks: https://docs.microsoft.com/en-us/azure/virtual-machines/windows/detach-disk.
  2. Create new VMs: https://docs.microsoft.com/en-us/azure/virtual-machines/windows/quick-create-portal.
  3. Attach and mount the data disks: https://docs.microsoft.com/en-us/azure/virtual-machines/linux/attach-disk-portal.
  4. Join the upgraded MarkLogic node to the cluster, as described in Adding Another Node to a Cluster in the Scalability, Availability, and Failover Guide.
