This chapter describes how to manage, scale, and monitor MarkLogic Servers in the Azure environment.
After the Azure VM is up and available, navigate to the Virtual Machines tab on your Azure portal and click on the VM you just created. You can find the IP address there for accessing the MarkLogic Admin Console at port 8001.
In the MarkLogic Azure image, the following ports are open by default: 8000, 8001, 8002, 7997.
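A quick way to confirm that these ports are reachable from your workstation is a plain TCP connection test. The following is a minimal sketch, not a MarkLogic tool: `check_ports` is a hypothetical helper name, and it only verifies that a TCP connection can be opened, not that MarkLogic is healthy on that port.

```python
import socket

# Ports the text above lists as open by default in the MarkLogic Azure image.
DEFAULT_PORTS = (8000, 8001, 8002, 7997)


def check_ports(host, ports=DEFAULT_PORTS, timeout=3.0):
    """Return a dict mapping each port to True if a TCP connection succeeds."""
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError on refusal, timeout, or DNS failure.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results
```

For example, calling `check_ports("<your-vm-ip>")` with the IP address shown in the Azure portal returns one entry per port. A `False` value may also mean that a network security group or firewall between you and the VM is blocking the port.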
To preserve your data if the VM is re-provisioned, you must attach a persistent data disk to your Azure VM. Microsoft recommends using managed disks over unmanaged disks for new VMs.
The steps for attaching data disks to your VM from the Azure Portal can be found here:
https://docs.microsoft.com/en-us/azure/virtual-machines/linux/attach-disk-portal
Microsoft Azure Storage is the back end that stores VM disks. It has two main SKUs with different performance characteristics: Standard Storage and Premium Storage.
Not all VMs support attaching Premium Storage. In addition, Microsoft Azure provides unmanaged and managed disks in each of the two performance tiers. Azure recommends using managed disks for new VMs and converting your previous unmanaged disks to managed disks, to take advantage of the many features available in Managed Disks.
The MarkLogic Tiered Storage feature enables you to take advantage of many Azure Storage types. For more information, see Tiered Storage in the Administrator's Guide.
Place your most frequently used data on Premium Storage data disks for faster access. Place your less frequently used data on Standard Storage data disks at a lower cost. If you are considering leveraging the MarkLogic Tiered Storage capabilities with different Azure Storage types, please read the following discussion on scalability and performance targets:
https://docs.microsoft.com/en-us/azure/storage/common/storage-scalability-targets
To use data disks after they have been attached, you must also mount the data disk. The steps for partitioning and mounting data disks can be found here:
https://docs.microsoft.com/en-us/azure/virtual-machines/linux/add-disk
To load contents into your MarkLogic server in Azure, refer to the Loading Content Into MarkLogic Server Guide.
Azure Blob Storage is similar to Amazon S3, and better in some ways: it does not have S3's read-after-write eventual consistency limitation, and it supports appending, so journal files work with it.
In MarkLogic, Azure pathnames are of the form azure://mycontainer/mydirectory/myfile. They are similar to MarkLogic S3 pathnames of the form S3://mybucket/mydirectory/myfile. What are called buckets on S3 are called containers in Azure.
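To make the pathname structure concrete, here is a minimal sketch that splits an Azure pathname into its container and the path within that container. It is illustrative only and is not how MarkLogic parses pathnames internally; `AzurePath` and `parse_azure_path` are hypothetical names.

```python
from typing import NamedTuple


class AzurePath(NamedTuple):
    container: str  # the Azure counterpart of an S3 bucket
    blob: str       # path inside the container; may be empty


def parse_azure_path(path: str) -> AzurePath:
    """Split azure://container/dir/file into (container, 'dir/file')."""
    prefix = "azure://"
    if not path.startswith(prefix):
        raise ValueError(f"not an azure:// pathname: {path!r}")
    # Everything up to the first slash is the container name.
    container, _, blob = path[len(prefix):].partition("/")
    if not container:
        raise ValueError(f"missing container name in {path!r}")
    return AzurePath(container, blob)
```

With the example from the text, `parse_azure_path("azure://mycontainer/mydirectory/myfile")` yields the container `mycontainer` and the blob path `mydirectory/myfile`.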
To use Azure Blob Storage, you need an Azure storage account and a container for your blobs. You can create and manage Azure storage accounts on the Azure portal: https://portal.azure.com/
For production environments, Premium or Standard storage disks in Azure are required. MarkLogic does not support regular or replica forests on Azure Blob Storage. MarkLogic does support Azure Blob Storage for backups and for read-only forests in the Tiered Storage feature.
After logging into the portal, click Storage accounts in the left navigation area. From there you can manage an existing storage account or add a new one. Storage account names are global, so Microsoft provides guidance on how to name them: https://docs.microsoft.com/en-us/azure/architecture/best-practices/naming-conventions#naming-rules-and-restrictions
Within your storage account you also need to have at least one container for your blobs. From this page you can manage an existing account or add a new one. On the storage account management page, click Blobs to get to the containers page. From there you can manage an existing container or add a new one.
Also on the Storage Account Management page is an Access Keys link. In a bit you will need to click this link, so keep the browser open to your Storage Account Management page.
MarkLogic needs Azure storage credentials to access Azure Blob Storage. To configure Azure storage credentials in MarkLogic, go to Security -> Credentials in the MarkLogic Admin GUI. On that page, right below where you would enter AWS credentials for S3, you can enter your Azure storage credentials.
You may also use your Azure VM identity to access Azure Blob Storage. Please see this web tutorial to learn how to create a storage account and a VM with identity on Azure.
Enter your Azure storage account and your Azure storage key. Your Azure storage account is the name of your storage account. Your Azure storage key is an access key automatically generated by Microsoft and provided to you. You can get your Azure storage key from the Storage Account Management page of the Azure portal.
On the Storage Account Management page on the Azure portal, click the Access keys link. You can use the Key value for either key1 or key2; it doesn't matter which you use.
The easiest way to do this is to have two open browser windows, one open to the MarkLogic Admin GUI, and one open to the Azure Portal. Then you can just copy the value from one browser window and paste it into the other.
Azure storage access keys are like passwords, so they should be treated as passwords. MarkLogic stores them in encrypted form in its security database.
Once you have configured your Azure storage credentials in MarkLogic, you can access your containers as file systems. The filesystem built-in functions work with Azure pathnames, which are of the form azure://mycontainer/mydirectory/myfile. External binary references can have Azure pathnames and MarkLogic will transparently access them.
You can easily do backups to an Azure pathname, and you can also do journal archiving to Azure. You can specify an Azure directory as a forest data directory. Forests on Azure Blob Storage can have journals. Forests on Azure Blob Storage can be transactionally and robustly read/write, instead of just read-only as they are on S3.
There are security API functions to script the configuration of Azure storage credentials: sec:credentials-get-azure and sec:credentials-set-azure. The REST management API endpoint /manage/v2/credentials/properties will support credentials configuration.
If there are no Azure credentials in the Security database, MarkLogic will check environment variables. See Environment Variables for more information.
To configure MarkLogic Server to access Azure Blob Storage through a proxy server, specify the URL of the proxy server by setting the MARKLOGIC_AZURE_STORAGE_PROXY variable in the MarkLogic Azure Solution Template, or use the Admin Interface to configure the proxy server, as follows:
Enter the proxy URL in the azure storage proxy field toward the bottom of the configuration page. The proxy URL should start with https:// (for example, https://proxy.marklogic.com:8080). If you don't specify the port number, MarkLogic assumes the proxy server is listening on port 8080.
If the MARKLOGIC_AZURE_STORAGE_PROXY variable is set and azure storage proxy is not set in the Admin Interface group configuration, the value of MARKLOGIC_AZURE_STORAGE_PROXY is used. If MARKLOGIC_AZURE_STORAGE_PROXY is set and azure storage proxy is also set in the Admin Interface group configuration, the Admin Interface azure storage proxy setting is used.
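The precedence and default-port rules above can be summarized in a short sketch. This is a model of the documented behavior, not MarkLogic code: `effective_proxy` is a hypothetical helper, and rejecting URLs that do not use https:// is an assumption made here for illustration (the text only says the URL should start with https://).

```python
from urllib.parse import urlsplit

# Port assumed when the proxy URL does not specify one, per the text above.
DEFAULT_PROXY_PORT = 8080


def effective_proxy(admin_setting, env_value):
    """Return (host, port) of the proxy that applies, or None if none is set.

    admin_setting: value of the azure storage proxy field in the group configuration.
    env_value: value of the MARKLOGIC_AZURE_STORAGE_PROXY variable.
    """
    # The Admin Interface setting takes precedence over the variable.
    chosen = admin_setting or env_value
    if not chosen:
        return None
    parts = urlsplit(chosen)
    if parts.scheme != "https":
        # Assumption for this sketch: treat a non-https URL as an error.
        raise ValueError(f"proxy URL should start with https://: {chosen!r}")
    port = parts.port if parts.port is not None else DEFAULT_PROXY_PORT
    return parts.hostname, port
```

For instance, when only the variable is set to a URL without a port, the sketch resolves to that host on port 8080; when both are set, the Admin Interface value is returned.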
There are security API functions to script the configuration of Azure storage credentials, specifically the functions sec:credentials-get-azure and sec:credentials-set-azure, which are a part of security.xqy. The REST management API endpoint /manage/v2/credentials/properties will support credentials configuration.
If there are no Azure credentials in the security database, MarkLogic will check environment variables. You can specify your Azure storage account name with the environment variable MARKLOGIC_AZURE_STORAGE_ACCOUNT. You can specify your Azure storage access key with the environment variable MARKLOGIC_AZURE_STORAGE_KEY. As usual, on Linux you can set these MarkLogic environment variables with export commands in /etc/marklogic.conf. Since Azure storage access keys are like passwords, it is better to let MarkLogic keep them encrypted in its security database, but this alternate mechanism is available if needed.
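As an illustration of the fallback described above, the sketch below models the lookup order (security database first, then environment variables) and renders the export lines you might place in /etc/marklogic.conf. The helper names `resolve_credentials` and `render_conf_exports` are hypothetical, and the function only models the documented order; it does not read the MarkLogic security database.

```python
import shlex

ACCOUNT_VAR = "MARKLOGIC_AZURE_STORAGE_ACCOUNT"
KEY_VAR = "MARKLOGIC_AZURE_STORAGE_KEY"


def resolve_credentials(db_credentials, environ):
    """Model the lookup order: stored credentials win, else the environment.

    db_credentials: (account, key) tuple standing in for the security
    database entry, or None when nothing is configured there.
    environ: a mapping such as os.environ.
    """
    if db_credentials:
        return db_credentials
    account = environ.get(ACCOUNT_VAR)
    key = environ.get(KEY_VAR)
    if account and key:
        return (account, key)
    return None


def render_conf_exports(account, key):
    """Return export lines suitable for appending to /etc/marklogic.conf."""
    # shlex.quote protects values that contain shell-special characters.
    return (
        f"export {ACCOUNT_VAR}={shlex.quote(account)}\n"
        f"export {KEY_VAR}={shlex.quote(key)}\n"
    )
```

Because the key ends up in a plain-text file with this approach, restrict read access to /etc/marklogic.conf if you use it.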
To add more VMs as additional hosts for MarkLogic clusters on the Azure platform, set the number of VM instances as needed from your Azure portal. For more details, see Adding a Host to a Cluster in the Administrator's Guide.
Since the IP address for each node changes as you start and stop an instance on Azure, make sure a public DNS name is assigned to every node before stopping an instance so that this node can join the cluster again.
MarkLogic recommends assigning a public DNS name to your VMs. For instructions on how to do this, see the following:
https://docs.microsoft.com/en-us/azure/virtual-machines/create-fqdn
For details on removing nodes, see Leaving the Cluster in the Administrator's Guide.
See the following Microsoft Azure blog for recommendations on changing VM instance sizes:
https://azure.microsoft.com/en-us/blog/resize-virtual-machines/
You should expect some downtime when resizing a VM. In addition, data stored on the temporary OS disk will be lost during resizing.
Azure has built-in single VM monitoring, and you will be able to see the VM status. You can select which performance metrics you want to see. The VM status display is similar to the following:
MarkLogic provides a monitoring dashboard and REST Management APIs that you can use to integrate with other monitoring tools. For details, see Monitoring MarkLogic Server in the Monitoring MarkLogic Guide.
Plugins for New Relic and AppDynamics are also available. The open source, community-supported New Relic plugins can be found in the tools section of our Developer Site.
The Azure Solution Template does not support updating nodes. MarkLogic Server on Azure must be upgraded manually.
To upgrade a MarkLogic cluster on Azure: