Skip to main content

Using MarkLogic Content Pump (mlcp)

Configuring Your MarkLogic Cluster

The mlcp tool uses an XDBC App Server to communicate with each host in a MarkLogic Server cluster that has at least one forest attached to a database used in your mlcp job. Optionally, you can configure the mlcp tool to connect to a load balancer that sits in front of the MarkLogic Server cluster. When configured to use a load balancer, the mlcp tool communicates with the load balancer to reach the forests. The load balancer can communicate with hosts that are evaluator nodes, data nodes, or both. For details, see Controlling How mlcp Connects to MarkLogic Server.

When you use mlcp with MarkLogic 8 or later on the default port (8000), no special cluster configuration is necessary. Port 8000 includes a pre-configured XDBC App Server. The default database associated with port 8000 is the Documents database. To use mlcp with a different database and port 8000, use the -database, -input_database, or -output_database options. For example:

mlcp.sh import -host myhost -port 8000 -database mydatabase ...

When using MarkLogic 8 or later with a port other than 8000, the port should connect to either an XDBC App Server or an App Server with a rewriter that is set up to handle XDBC traffic.

Hosts within a group share the same App Server configuration, but hosts in different groups do not. Therefore, if all your forest hosts are in a single group, you only need to configure one App Server to handle XDBC traffic. If your forests are on hosts in multiple groups, then you must configure an App Server for XDBC that listens on the same port in each group.

For example, the cluster shown below is properly configured to use Database A as an mlcp input or output source. Database A has 3 forests, located on 3 hosts in 2 different groups. Therefore, both Group 1 and Group 2 must make Database A accessible via XDBC on port 9001.

Diagram of Cluster with XDBC App Server Configured for Each Group

If the forests of Database A are only located on Host1 and Host2, which are in the same group, then you would only need to configure one XDBC App Server on port 9001.

If you use MarkLogic 8 or later and port 8000 instead of port 9001, then you do not need to explicitly create any XDBC App Servers to support the above database configuration because both groups automatically have an XDBC App Server on port 8000. You might need to explicitly specify the database name (Database A) in your mlcp command, though, if it is not the default database associated with port 8000.