Using MarkLogic Content Pump (mlcp)

Basic Steps for Redacting Documents

Use the -redaction option of mlcp to apply redaction rules to an export or copy operation. This option accepts a comma-separated list of redaction rule collection URIs. For example:

-redaction "pii-rules,sec-rules"
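When the rule collection URIs come from a script rather than being typed by hand, the comma-separated value can be assembled with a small shell helper. The following is a minimal sketch; the function name join_rule_collections is hypothetical and not part of mlcp, and it assumes that no collection URI contains a comma.

```shell
# Hypothetical helper (not part of mlcp): join rule collection URIs
# into the comma-separated form expected by the -redaction option.
# Assumes that no collection URI itself contains a comma.
join_rule_collections() {
    # Inside the subshell, "$*" joins the arguments using the first
    # character of IFS, so the caller's IFS is left untouched.
    (IFS=,; printf '%s\n' "$*")
}
```

For example, passing -redaction "$(join_rule_collections pii-rules sec-rules)" is equivalent to the literal option shown above.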

Before you can use redaction, you must install one or more redaction rule sets in the Schemas database. For details on defining and installing redaction rules, see Redacting Document Content in the Application Developer’s Guide.

Preparing to redact documents with mlcp requires the following steps. For a complete example, see Example: Using mlcp for Redaction.

  1. Install one or more redaction rules in the Schemas database. Each rule must be part of at least one collection. For details, see Defining Redaction Rules and Installing Redaction Rules in the Application Developer’s Guide.

  2. If you create a rule that uses a user-defined redaction function, install the implementation of your redaction function in the modules database associated with the App Server you will connect to using mlcp. For details, see User-Defined Redaction Functions in the Application Developer’s Guide.

  3. Add the -redaction option to your mlcp command line. For example, the following command applies the rules in the collections “pii-rules” and “sec-rules” to all exported documents.

    # Windows users, see Modifying the Example Commands for Windows
    $ mlcp.sh export -host localhost -port 8000 -username user \
        -password password -mode local \
        -output_file_path /space/mlcp/export/files \
        -directory_filter /people/ \
        -redaction "pii-rules,sec-rules"

The -redaction option works similarly for copy operations. For details, see Redacting Content During a Copy.
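As an illustration of the copy case, the sketch below wraps an mlcp copy command that applies the same rule collections. The function name redacted_copy, its argument order, and the MLCP variable are hypothetical conveniences. The credentials, port numbers, and the /people/ filter mirror the export example and are placeholders to replace with your own values.

```shell
# Sketch: copy documents between two MarkLogic hosts with redaction applied.
# The function name, argument order, and MLCP variable are hypothetical;
# credentials, ports, and the /people/ filter are placeholders.
redacted_copy() {
    src_host=$1
    dst_host=$2
    rule_collections=$3    # comma-separated, e.g. "pii-rules,sec-rules"

    # Set MLCP=echo to print the command line instead of running it.
    "${MLCP:-mlcp.sh}" copy -mode local \
        -input_host "$src_host" -input_port 8000 \
        -input_username user -input_password password \
        -output_host "$dst_host" -output_port 8000 \
        -output_username user -output_password password \
        -directory_filter /people/ \
        -redaction "$rule_collections"
}
```

A call such as redacted_copy src-host dst-host "pii-rules,sec-rules" would then redact documents under /people/ as they are copied. Running it first with MLCP=echo shows the resulting command line without contacting either host.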

The user who extracts redacted documents must have read permission on both the source documents and the redaction rules. This user does not need permission to modify the rule collection or the rule definitions. For details, see Security Considerations in the Application Developer’s Guide.

The following behaviors apply when exceptional conditions occur. Be aware of them, because in some cases the job continues even though content is not redacted as expected:

  • If a rule collection is empty, mlcp issues a warning and continues with the job.

  • If any rule contains an error, mlcp reports the error and aborts the export or copy operation.

  • If a rule is valid, but an error occurs when applying the rule, the rule is skipped for the current document and a warning is logged. The job continues.
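Because two of these conditions produce only a warning, it can help to capture the job output and check it before relying on the exported files. The wrapper below is a rough sketch and not an mlcp feature. It assumes that an aborted job returns a non-zero exit status and that warning lines in the output contain the text WARN. Verify both assumptions against your mlcp version and logging configuration. The function name run_and_check is hypothetical.

```shell
# Hypothetical wrapper (not part of mlcp): run a job, keep its output in a
# log file, and flag warnings that may indicate unredacted content.
# Assumptions: an aborted job exits non-zero, and warning lines contain
# the text "WARN". Check both against your mlcp version.
run_and_check() {
    log_file=$1
    shift

    # Run the remaining arguments as the command, capturing all output.
    if ! "$@" >"$log_file" 2>&1; then
        echo "job failed; see $log_file" >&2
        return 1
    fi

    # The job completed, but warnings may mean an empty rule collection
    # or a rule that was skipped for some documents.
    if grep -q 'WARN' "$log_file"; then
        echo "job finished with warnings; review before using the output:" >&2
        grep 'WARN' "$log_file" >&2
        return 2
    fi
    return 0
}
```

To use it, put a log file path followed by the full mlcp command line after the function name, for example run_and_check export.log mlcp.sh export followed by the same options as in the export command above. A return value of 2 means the job completed but the log should be reviewed.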