Using MarkLogic Content Pump (mlcp)

Redacting Content During a Copy

Redaction is the process of eliminating or obscuring portions of a document when retrieving the document from MarkLogic. For example, you can eliminate or mask sensitive personal information such as credit card numbers, phone numbers, or email addresses from documents. You can redact only document content, not document properties.

Redaction is performed as documents are read from the source database. For example, if you copy documents between databases in two different MarkLogic installations, the unredacted content never leaves the source installation.

Redaction support in MarkLogic is covered in detail in Redacting Content During Export or Copy Operations and Redacting Document Content in the Application Developer’s Guide.

Use the -redaction option to apply redaction rules during a copy. For example, the following command applies the redaction rules in the rule collections "hipaa-rules" and "biz-rules" to the source documents in the collection "my_docs" before copying them to the destination database.

# Windows users, see Modifying the Example Commands for Windows
$ mlcp.sh copy -mode local -input_host srchost -input_port 8000 \
    -input_username user1 -input_password password1 \
    -output_host desthost -output_port 8000 -output_username user2 \
    -output_password password2 -collection_filter my_docs \
    -redaction "hipaa-rules,biz-rules"
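
When the list of rule collections is kept in a script, the comma-separated value passed to -redaction can be assembled rather than typed by hand. The following is a minimal sketch, not part of mlcp: the helper name join_rules is hypothetical, and the hosts, users, and collection names are the same placeholders used in the command above. It prints the resulting command for review instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of mlcp): join its arguments with commas.
# "$*" joins the positional parameters using the first character of IFS;
# the parentheses run the body in a subshell so the IFS change stays local.
join_rules() (
    IFS=,
    printf '%s\n' "$*"
)

# Rule collection names taken from the example above.
redaction_arg=$(join_rules hipaa-rules biz-rules)

# Dry run: echo prints the command instead of executing it.
# Remove the leading "echo" to run the copy.
echo mlcp.sh copy -mode local -input_host srchost -input_port 8000 \
    -input_username user1 -input_password password1 \
    -output_host desthost -output_port 8000 -output_username user2 \
    -output_password password2 -collection_filter my_docs \
    -redaction "$redaction_arg"
```

Printing the command first makes it easy to confirm that the rule collection names are joined without stray spaces before any documents are copied.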

For more details, see Redacting Content During Export or Copy Operations.