Data Hub Gradle Tasks

The Gradle tasks available in Data Hub Gradle Plugin (ml-data-hub).

Using Gradle in Data Hub

To use Data Hub Gradle Plugin in the Data Hub flows, see Data Hub Gradle Plugin.

To pass parameters to Gradle tasks, use the -P option.

./gradlew taskname ... -PparameterName=parameterValue ... -i
gradlew.bat taskname ... -PparameterName=parameterValue ... -i
Important: If the value of a Gradle parameter contains a blank space, you must enclose the value in double quotation marks. If the value does not contain a blank space, you must not enclose the value in quotation marks.
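As a rough illustration of the quoting rule above, the following shell sketch formats a -P argument and adds double quotation marks only when the value contains a blank space. The helper name format_param and the sample values are hypothetical; they are not part of Data Hub or Gradle.

```shell
# Hypothetical helper (not part of Data Hub): print one -P argument,
# quoting the value only if it contains a blank space.
format_param() {
  name=$1
  value=$2
  case "$value" in
    *" "*) printf '%s\n' "-P${name}=\"${value}\"" ;;
    *)     printf '%s\n' "-P${name}=${value}" ;;
  esac
}

format_param flowName "My Flow"   # value contains a space, so it is quoted
format_param batchSize 100        # no space, so no quotation marks are added
```

The helper only prints the argument text; it does not run Gradle.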

You can use Gradle's -i option to enable info-level logging.

This page provides the list of Gradle tasks available in Data Hub Gradle Plugin (ml-data-hub).

  • Tasks with names starting with ml are customized for Data Hub from the ml-gradle implementation.
  • Tasks with names starting with hub are created specifically for Data Hub.
Tip: You can view the complete list of available Gradle tasks and their descriptions by running gradle tasks.

The tasks are grouped as follows: Setup, Conversion, Development, Deployment, Execution, Legacy (DHF 4.x), and Alternative.

Setup Tasks

These tasks initialize or upgrade your MarkLogic Data Hub instance.

hubInit

Initializes the current directory as a Data Hub project.

./gradlew hubInit -i
gradlew.bat hubInit -i

Requires the security role data-hub-developer or any role that inherits it.

For on-premises and DHS.

hubUpdate

Updates your Data Hub instance to a newer version.

./gradlew hubUpdate -i
gradlew.bat hubUpdate -i

Requires the security role data-hub-developer or any role that inherits it.

For on-premises and DHS.

Before you run the hubUpdate task, edit the build.gradle file. Under plugins, change the value of 'com.marklogic.ml-data-hub' version to the new Data Hub version.

For example, if you are updating to the latest Data Hub version:

  plugins {
    id 'com.marklogic.ml-data-hub' version 'VERSION_NUMBER'
  }

For complete instructions on upgrading to a newer Data Hub version, see Upgrading Data Hub.

Running the hubUpdate task with the -i option (info mode) displays specifically what the task does, including configuration settings that changed.
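If you prefer to script the build.gradle edit described above, the following shell sketch shows one way to do it. The helper name set_data_hub_version is hypothetical, and the sketch assumes that the plugin line uses single quotes exactly as in the example. It uses sed -i.bak, which leaves a backup copy named build.gradle.bak next to the edited file.

```shell
# Hypothetical helper: rewrite the ml-data-hub plugin version in a build.gradle file.
# Assumes the plugin line is written with single quotes, as in the example above.
set_data_hub_version() {
  build_file=$1
  new_version=$2
  # Replace whatever sits between the quotes after "version" with the new version.
  sed -i.bak \
    "s/\(id 'com\.marklogic\.ml-data-hub' version '\)[^']*'/\1${new_version}'/" \
    "$build_file"
}

# Usage (VERSION_NUMBER is a placeholder for the version you are updating to):
#   set_data_hub_version build.gradle VERSION_NUMBER
#   ./gradlew hubUpdate -i
```

Review the edited file before you run hubUpdate, because the sketch does not check that the plugin line was found.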

hubExportProject

Exports the Data Hub project artifacts into the file build/datahub-project.zip.

./gradlew hubExportProject -i
gradlew.bat hubExportProject -i

Requires the security role data-hub-developer or any role that inherits it.

For on-premises and DHS.

hubVersion

Displays the versions of Data Hub and MarkLogic Server associated with the host (mlHost), as well as the version of Data Hub used by Gradle locally.

./gradlew hubVersion -i
gradlew.bat hubVersion -i

Requires the security role data-hub-operator or any role that inherits it.

For on-premises and DHS.

hubDescribeRole

Retrieves information about the specified role.

./gradlew hubDescribeRole -Prole=name-of-role -i
gradlew.bat hubDescribeRole -Prole=name-of-role -i
role
(Required) The role to get information about.
Returns a prettified JSON document with the following information:
  • the role name
  • the MarkLogic Server version
  • the Data Hub version
  • the inherited roles and the privileges associated with those roles
  • the default document permissions and collections associated with the role

Requires the security role data-hub-developer or any role that inherits it.

For on-premises and DHS.

hubDescribeUser

Retrieves information about the specified user.

./gradlew hubDescribeUser -Puser=username -i
gradlew.bat hubDescribeUser -Puser=username -i
user
(Required) The user account to get information about.
Returns a prettified JSON document with the following information:
  • the user name
  • the MarkLogic Server version
  • the Data Hub version
  • the roles assigned to the user and the privileges associated with those roles
  • the default document permissions and collections associated with the user

Requires the security role data-hub-developer or any role that inherits it.

For on-premises and DHS.

hubPrintInheritableRoles

Retrieves the list of roles that can be inherited by a custom role.

./gradlew hubPrintInheritableRoles -i
gradlew.bat hubPrintInheritableRoles -i
Note: Creating custom roles and privileges in Data Hub requires the data-hub-security-admin role or any role that inherits it.

Learn more: Custom Roles and Privileges

Requires the security role data-hub-security-admin or any role that inherits it.

For on-premises and DHS.

Conversion Tasks

These tasks convert or clean up your artifacts for use in Hub Central in DHS.

hubConvertForHubCentral

Converts your artifacts from the QuickStart format to the Hub Central format.

./gradlew hubConvertForHubCentral -Pconfirm=true -i
gradlew.bat hubConvertForHubCentral -Pconfirm=true -i
confirm
(Required) Confirmation to convert your artifacts.

Requires the security role data-hub-developer or any role that inherits it.

For on-premises and DHS.

QuickStart: Accepts project artifacts in the QuickStart format only.

Learn more: Convert from QuickStart to Hub Central

hubDeleteLegacyMappings

Deletes the legacy mapping configuration files.

./gradlew hubDeleteLegacyMappings -PenvironmentName=myEnvName -Pconfirm=true -i
gradlew.bat hubDeleteLegacyMappings -PenvironmentName=myEnvName -Pconfirm=true -i
environmentName
(Required) The name of your environment.
confirm
(Required) Confirmation to delete the legacy mapping configuration files.

Requires the security role data-hub-developer or any role that inherits it.

For on-premises and DHS.

Hub Central: Accepts project artifacts in the Hub Central format only. Learn more: Convert from QuickStart to Hub Central

In QuickStart, mapping configurations are stored in files separate from the step definitions and the flow configurations. During the conversion to Hub Central, the mapping configurations are merged into mapping steps, but the original mapping configuration files remain.

Run this task against all your environments in which you intend to use Hub Central.

Tip: You can determine the list of environments you have by searching for gradle-*.properties files in your project directory.
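The tip above can be sketched as a small shell loop that derives environment names from the gradle-*.properties files and prints the hubDeleteLegacyMappings command for each one. The helper name list_environments is hypothetical, and the loop only prints the commands (a dry run) so that you can review them before running anything. It assumes that environment names contain no spaces.

```shell
# Hypothetical helper: print the environment names implied by
# gradle-*.properties files in a project directory (default: current directory).
list_environments() {
  dir=${1:-.}
  for f in "$dir"/gradle-*.properties; do
    [ -e "$f" ] || continue          # no matching files: skip the literal pattern
    name=${f##*/gradle-}             # strip the path and the "gradle-" prefix
    printf '%s\n' "${name%.properties}"
  done
}

# Dry run: print the command for each environment instead of executing it.
for env in $(list_environments .); do
  echo "./gradlew hubDeleteLegacyMappings -PenvironmentName=$env -Pconfirm=true -i"
done
```

The file gradle.properties itself does not match the pattern, so it is not treated as an environment.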


Development Tasks

These tasks perform basic functionality for flows and steps, equivalent to those available in Hub Central.

hubCreateEntity

Creates a boilerplate entity.

./gradlew hubCreateEntity -PentityName=YourEntityName -i
gradlew.bat hubCreateEntity -PentityName=YourEntityName -i
entityName
(Required) The name of the entity to create.

For on-premises and DHS.

hubCreateFlow

Creates a boilerplate flow configuration file.

./gradlew hubCreateFlow -PflowName=YourFlowName -PwithInlineSteps=true -i
gradlew.bat hubCreateFlow -PflowName=YourFlowName -PwithInlineSteps=true -i
flowName
(Required) The name of the flow to create.
withInlineSteps
  • To create a flow in the Hub Central format, set to false. The flow configuration includes only references to the steps.

    Example of a step reference:

       "stepId" : "yourstepname-yoursteptype"
    

The default is false (Hub Central format).

For on-premises and DHS.

The resulting flow configuration is stored locally with the local project files. If you run this task while connected to your MarkLogic Server instance, the flow configuration is also automatically deployed to both the STAGING and the FINAL databases.

hubCreateStepDefinition

Creates a custom step definition that can be added to a flow as a step.

./gradlew hubCreateStepDefinition -PstepDefName=yourstepname -PstepDefType=[ingestion|custom] -Pformat=[sjs|xqy] -i
gradlew.bat hubCreateStepDefinition -PstepDefName=yourstepname -PstepDefType=[ingestion|custom] -Pformat=[sjs|xqy] -i
stepDefName
(Required) The name of the custom step definition to create.
stepDefType
The type of the custom step definition to create:
  • ingestion (To create a Custom-Ingestion step.)
  • custom (To create a Custom-Mapping, Custom-Mastering, or Custom-Other step.)

The default is custom.

format
The format of the module to associate with the new step definition: xqy for XQuery or sjs for JavaScript. The default is sjs.

For on-premises and DHS.

A module is created under your-project-root/src/main/ml-modules and is associated with the step definition to perform the processes required for the step; for example, you can create a module to wrap each document in your own custom envelope.

  • If -Pformat=sjs or if the option is not specified, only one file is created:
    • main.sjs, which is the JavaScript module that you must customize.
  • If -Pformat=xqy, two files are created:
    • lib.xqy, which is the XQuery module that you must customize.
    • main.sjs, which acts as a wrapper around lib.xqy.
Tip: If your needs can be met by making minor changes to a step of a default type (ingestion, mapping, or mastering), simply modify the appropriate example step in the flow created by hubCreateFlow. The example steps use the predefined default-ingestion, default-mapping, and default-mastering step definitions, so you won't need to create a new one.

hubCreateStep

Creates a step based on a default step definition or on a new step definition and its module.

./gradlew hubCreateStep -PstepName=yourstepname -PstepType=[ingestion|mapping|matching|merging|custom] -PstepDefName=yourstepdefinitionname -PentityType=myEntityTypeName -i
gradlew.bat hubCreateStep -PstepName=yourstepname -PstepType=[ingestion|mapping|matching|merging|custom] -PstepDefName=yourstepdefinitionname -PentityType=myEntityTypeName -i
stepName
(Required) The name of the step to create based on a step definition.
stepType
(Required) The type of step to create: ingestion, mapping, matching, merging, or custom. For Custom-Ingestion, use ingestion and specify stepDefName.
stepDefName
The name of the step definition to create. Allowed only if stepType is ingestion or custom. The specified step definition and its associated module are created and used.
entityType
(Required if stepType is mapping) The name of the entity type to associate with the step.

Requires the security role data-hub-developer or any role that inherits it.

For on-premises and DHS.

Hub Central: Accepts project artifacts in the Hub Central format only. Learn more: Convert from QuickStart to Hub Central

If you run this task while connected to Data Hub in DHS, the resulting artifacts are automatically deployed. If not connected, a connection exception is thrown.

Note: If stepDefName is specified, the new step definition is automatically deployed with the step, but the new module is not. To deploy the new module, run hubDeploy or hubDeployAsDeveloper.

hubAddStepToFlow

For Hub Central. Adds a step to the specified flow. The new step is assigned the next number in the sequence of steps within the flow.

./gradlew hubAddStepToFlow -PflowName=yourflowname -PstepName=yourstepname -PstepType=[ingestion|mapping|matching|merging|custom] -i
gradlew.bat hubAddStepToFlow -PflowName=yourflowname -PstepName=yourstepname -PstepType=[ingestion|mapping|matching|merging|custom] -i
flowName
(Required) The name of the flow to add the step to.
stepName
(Required) The name of the step to add to the flow.
stepType
(Required) The type of step to add to the flow: ingestion, mapping, matching, merging, or custom.

Requires the security role data-hub-developer or any role that inherits it.

For on-premises and DHS.

Hub Central: Accepts project artifacts in the Hub Central format only. Learn more: Convert from QuickStart to Hub Central

Only one line is added to the flow configuration file:

   "stepId" : "yourstepname-yoursteptype"

If you run this task while connected to Data Hub in DHS, the resulting artifacts are automatically deployed. If not connected, a connection exception is thrown.

hubClearUserArtifacts

Deletes all user artifacts in the STAGING and FINAL databases. (DHS-relevant)

./gradlew hubClearUserArtifacts -Pconfirm=true -i
gradlew.bat hubClearUserArtifacts -Pconfirm=true -i
confirm
(Required) Confirmation to delete all user artifacts in both the STAGING and FINAL databases.

All default Data Hub artifacts and all user data remain.

Requires the security role data-hub-developer or any role that inherits it.

For on-premises and DHS.

hubClearUserData

Deletes all user data in the STAGING, FINAL, and JOBS databases. (DHS-relevant)

./gradlew hubClearUserData -Pconfirm=true -i
gradlew.bat hubClearUserData -Pconfirm=true -i
confirm
(Required) Confirmation to delete all user data in the STAGING, FINAL, and JOBS databases.

All default Data Hub artifacts and all user artifacts remain.

Requires the security role data-hub-admin or any role that inherits it.

For on-premises and DHS.

hubClearUserModules

Deletes all custom modules in the MODULES database. (DHS-relevant)

./gradlew hubClearUserModules -Pconfirm=true -i
gradlew.bat hubClearUserModules -Pconfirm=true -i
confirm
(Required) Confirmation to delete all custom modules in the MODULES database.
Important: Artifacts, such as custom steps and other steps with interceptors and custom hooks, might still refer to the deleted custom modules. If you intend to keep those artifacts, you must modify them to refer to new custom modules and then redeploy them. To delete all user artifacts, use hubClearUserArtifacts.

Requires the security role data-hub-developer or any role that inherits it.

For on-premises and DHS.

mlClearDatabase

Deletes the user data and user artifacts in the specified database. (DHS-relevant)

./gradlew mlClearDatabase -Pdatabase=data-hub-database -Pconfirm=true -i
gradlew.bat mlClearDatabase -Pdatabase=data-hub-database -Pconfirm=true -i
database
(Required) The name of the database to clear. Examples: data-hub-STAGING, data-hub-FINAL, data-hub-JOBS, data-hub-MODULES.
confirm
(Required) Confirmation to delete all user data and all user artifacts in the specified database.

If clearing the STAGING database or the FINAL database, all default Data Hub artifacts remain.

Requires a security role with the privilege to clear the specified database.

For on-premises and DHS.

hubPullChanges

Downloads your Hub Central files and applies them to your local project directory.

./gradlew hubPullChanges -i
gradlew.bat hubPullChanges -i

Requires the security role Data Hub Developer (data-hub-developer), Hub Central Developer (hub-central-developer), or any role that inherits any of these.

For on-premises and DHS.

Hub Central: Accepts project artifacts in the Hub Central format only. Learn more: Convert from QuickStart to Hub Central

Only the project artifacts that Hub Central can handle are overwritten in your local project directory; the rest remain as is.

To download the artifacts and immediately apply them to your local project directory, use hubPullChanges.

To inspect the artifacts before applying them to your local project directory:
  1. Download your Hub Central files using Hub Central.
  2. Inspect the artifacts.
  3. Use hubApplyProjectZip to apply the artifacts to your local project directory.

hubApplyProjectZip

Applies the artifacts from the specified zip file to your local project directory.

./gradlew hubApplyProjectZip -Pfile=datahub-project.zip -i
gradlew.bat hubApplyProjectZip -Pfile=datahub-project.zip -i
file
(Required) The zip file containing project artifacts, downloaded from your DHS instance.

For on-premises and DHS.

Hub Central: Accepts project artifacts in the Hub Central format only. Learn more: Convert from QuickStart to Hub Central

To download the artifacts and immediately apply them to your local project directory, use hubPullChanges.

To inspect the artifacts before applying them to your local project directory:
  1. Download your Hub Central files using Hub Central.
  2. Inspect the artifacts.
  3. Use hubApplyProjectZip to apply the artifacts to your local project directory.

mlWatch

Extends ml-gradle's WatchTask by ensuring that modules in Data Hub-specific folders (plugins and entity-config) are monitored.

./gradlew mlWatch -i
gradlew.bat mlWatch -i

Requires the security role data-hub-developer or any role that inherits it.

For on-premises and DHS.

Important: To avoid deploying untested code to production or to a shared development environment, use mlWatch only in a local development environment.

mlWatch continuously monitors your local module directories for any changes and automatically deploys modified modules to your local MODULES database, so you can immediately test them.

You can stop the mlWatch process as you would end any other process in your operating system.

Deployment Tasks

These tasks deploy your project artifacts to your production environment or undeploy them from it.

hubDeploy, hubDeployAsDeveloper, hubDeployAsSecurityAdmin, hubDeployToReplica

Installs modules and other resources to the MarkLogic Server. (Data Hub 5.2 or later)

Depending on the roles assigned to your user account, you can deploy different assets using the appropriate hubDeploy task.

Role(s): data-hub-developer
Use this Gradle task:
./gradlew hubDeployAsDeveloper -PenvironmentName=dhs -i
gradlew.bat hubDeployAsDeveloper -PenvironmentName=dhs -i
To deploy:
  • User modules and artifacts (entities, flows, mappings, and step definitions)
  • Alert configurations, rules, and actions
  • STAGING, FINAL, and JOBS database indexes
  • Scheduled tasks
  • Schemas
  • Temporal axes and collections
  • Triggers
  • Protected paths and query rolesets

Role(s): data-hub-security-admin
Use this Gradle task:
./gradlew hubDeployAsSecurityAdmin -PenvironmentName=dhs -i
gradlew.bat hubDeployAsSecurityAdmin -PenvironmentName=dhs -i
To deploy:
  • Definitions of custom roles and privileges with the following restrictions:
    • A custom role cannot inherit from any other role.
    • A custom role can only inherit privileges granted to the user creating the role.
    • A custom execute privilege must be assigned an action starting with http://datahub.marklogic.com/custom/.

Role(s): Both data-hub-developer and data-hub-security-admin
Use this Gradle task:
./gradlew hubDeploy -PenvironmentName=dhs -i
gradlew.bat hubDeploy -PenvironmentName=dhs -i
To deploy:
  • All of the above

Role(s): Both data-hub-developer and data-hub-security-admin
Use this Gradle task:
./gradlew hubDeployToReplica -PenvironmentName=dhs -i
gradlew.bat hubDeployToReplica -PenvironmentName=dhs -i
To deploy:
  • Configuration changes to the disaster recovery cluster
    Note: This task does not write to the databases.

Learn more: Users and Roles

For on-premises and DHS.
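The role-to-task mapping described in this section can be summarized as a small shell sketch. The helper name choose_deploy_task and its yes/no arguments are hypothetical; the sketch only prints a command and does not query MarkLogic Server for your actual roles. It does not cover hubDeployToReplica, which also requires both roles.

```shell
# Hypothetical helper: pick the deploy task that matches the roles of your account.
# Arguments: "yes" or "no" for data-hub-developer, then for data-hub-security-admin.
choose_deploy_task() {
  has_developer=$1
  has_security_admin=$2
  if [ "$has_developer" = yes ] && [ "$has_security_admin" = yes ]; then
    echo hubDeploy
  elif [ "$has_developer" = yes ]; then
    echo hubDeployAsDeveloper
  elif [ "$has_security_admin" = yes ]; then
    echo hubDeployAsSecurityAdmin
  else
    echo "no suitable deploy role" >&2
    return 1
  fi
}

# Dry run: print the command rather than executing it.
task=$(choose_deploy_task yes no)
echo "./gradlew $task -PenvironmentName=dhs -i"
```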

mlDeploy

(On-premises only) Uses hubPreinstallCheck to deploy your Data Hub project to a Data Hub instance.

./gradlew mlDeploy -i
gradlew.bat mlDeploy -i

Requires the security role data-hub-admin or any role that inherits it.

For on-premises only.

To deploy to DHS, use hubDeploy or its variations.

mlDeployToReplica

(On-premises only) Deploys configuration changes to the disaster recovery cluster.

./gradlew mlDeployToReplica -i
gradlew.bat mlDeployToReplica -i
Note: This task does not write to the databases.

Requires the security role data-hub-admin or any role that inherits it.

For on-premises only.

To deploy to DHS, use hubDeployToReplica.

mlUndeploy

(On-premises only) Removes Data Hub and all components of your project from MarkLogic Server, including databases, application servers, forests, and users.

./gradlew mlUndeploy -Pconfirm=true -i
gradlew.bat mlUndeploy -Pconfirm=true -i

Requires the security role data-hub-admin or any role that inherits it.

For on-premises only.

If your Data Hub instance is deployed on DHS, contact Support to undeploy your project components.

Execution Tasks

These tasks run flows, perform actions on specific records outside a flow, and clean up.

hubRunFlow

Runs a flow.

./gradlew hubRunFlow -PflowName=YourFlowName -PentityName=YourEntityName -PbatchSize=100 -PthreadCount=4 -PshowOptions=[true|false] -PfailHard=[true|false] -Psteps="1,2" -PjobId="abc123" [ -Poptions="{ customkey: customvalue, ... }" | -PoptionsFile=/path/to.json ] -i
gradlew.bat hubRunFlow -PflowName=YourFlowName -PentityName=YourEntityName -PbatchSize=100 -PthreadCount=4 -PshowOptions=[true|false] -PfailHard=[true|false] -Psteps="1,2" -PjobId="abc123" [ -Poptions="{ customkey: customvalue, ... }" | -PoptionsFile=/path/to.json ] -i
flowName
(Required) The name of the flow to run.
entityName
(Required if the flow includes a mapping step) The name of the entity used with the mapping step.
batchSize
The maximum number of items to process in a batch. The default is 100.
threadCount
The number of threads to run. The default is 4.
showOptions
If true, options that were passed to the command are printed out. The default is false.
failHard
If true, the flow's execution is ended immediately if a step fails. The default is false.
steps
The comma-separated numbers of the steps to run. If not provided, the entire flow is run.
jobId
A unique job ID to associate with the flow run. This option can be used if the flow run is part of a larger process (e.g., a process orchestrated by NiFi with its own job/process ID). Must not be the same as an existing Data Hub job ID. If not provided, a unique Data Hub job ID will be assigned.
options
A JSON structure containing key-value pairs to be passed as custom parameters to your step modules.
optionsFile
The path to a JSON file containing key-value pairs to be passed as custom parameters to your step modules.

Requires the security role data-hub-operator or any role that inherits it.

For on-premises and DHS.

The custom key-value parameters passed to your step module are available through the $options (xqy) or options (sjs) variables inside your step module.
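As an illustration of the optionsFile parameter, the following shell sketch writes a small JSON file of custom parameters and prints the matching hubRunFlow command. The file location and the keys sourceName and reviewer are hypothetical placeholders, not options defined by Data Hub. With a file like this, the key-value pairs would be passed to your step modules as described above.

```shell
# Write hypothetical custom parameters to a JSON options file.
# The keys and values below are placeholders, not options defined by Data Hub.
options_file="${TMPDIR:-/tmp}/flow-options.json"
cat > "$options_file" <<'EOF'
{
  "sourceName": "example-source",
  "reviewer": "jdoe"
}
EOF

# Dry run: print the command instead of executing it.
echo "./gradlew hubRunFlow -PflowName=YourFlowName -PoptionsFile=$options_file -i"
```

Using a file avoids quoting a JSON string on the command line, which is the main reason to choose optionsFile over options in scripts.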

Note: To run a flow using Gradle, you must be in the local directory that contains the Data Hub project files.

hubMergeEntities

Merges the specified records according to the settings of the specified mastering or matching step.

./gradlew hubMergeEntities -PmergeURIs=URI1,URI2,URIn -PflowName=YourFlowName -Pstep=1 -Ppreview=[true|false] -Poptions={YourStepOptionOverrides} -i
gradlew.bat hubMergeEntities -PmergeURIs=URI1,URI2,URIn -PflowName=YourFlowName -Pstep=1 -Ppreview=[true|false] -Poptions={YourStepOptionOverrides} -i
mergeURIs
(Required) The comma-separated list of the URIs of the records to merge.
flowName
(Required) The name of a flow that includes a mastering or matching step.
step
The step number of the mastering or matching step in the specified flow. This task uses the settings in the mastering or matching step. The default is 1, which assumes that the first step in the flow is a mastering or matching step.
preview
If true, no changes are made to the database and a simulated merged record is returned; otherwise, the merged record is saved to the database. The default is false.
options
A JSON-formatted string that contains the mastering settings to override the settings in the specified mastering step. The default is {}.

Requires the security role data-hub-operator or any role that inherits it.

For on-premises only.
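Because preview=true makes no changes to the database, one cautious workflow is to preview a merge first and then repeat the command with preview=false. The following shell sketch builds the comma-separated mergeURIs value and prints both commands as a dry run. The helper name join_uris and the example URIs are hypothetical.

```shell
# Hypothetical helper: join URIs into the comma-separated form expected by -PmergeURIs.
# The function body runs in a subshell so the IFS change does not leak out.
join_uris() (
  IFS=,
  printf '%s\n' "$*"
)

uris=$(join_uris /customers/1.json /customers/2.json)   # example URIs (assumed)

# Dry run: print a preview command first, then the command that saves the merge.
echo "./gradlew hubMergeEntities -PmergeURIs=$uris -PflowName=YourFlowName -Pstep=1 -Ppreview=true -i"
echo "./gradlew hubMergeEntities -PmergeURIs=$uris -PflowName=YourFlowName -Pstep=1 -Ppreview=false -i"
```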

hubUnmergeEntities

Reverses the set of merges that created the specified merged record.

./gradlew hubUnmergeEntities -PmergeURI=URIofMergedRecord -PretainAuditTrail=[true|false] -PblockFutureMerges=[true|false] -i
gradlew.bat hubUnmergeEntities -PmergeURI=URIofMergedRecord -PretainAuditTrail=[true|false] -PblockFutureMerges=[true|false] -i
mergeURI
(Required) The URI of the record to unmerge.
removeURIs
The comma-separated list of the URIs of the documents to unmerge.
retainAuditTrail
If true, the merged record will be moved to an archive collection; otherwise, it will be deleted. The default is true.
blockFutureMerges
If true, the component records will be blocked from being merged together again. The default is true.

Requires the security role data-hub-operator or any role that inherits it.

For on-premises only.

Note: This task archives or deletes the specified merged record and unarchives the component records that were combined to create it. If one of the component records is itself a merged record, the component record will remain so.

Legacy (DHF 4.x) Tasks

hubCreateHarmonizeFlow

Creates a legacy (DHF 4.x) harmonization flow. The resulting DHF 4.x flow must be executed using hubRunLegacyFlow.

./gradlew hubCreateHarmonizeFlow -PentityName=YourEntityName -PflowName=YourFlowName -PdataFormat=[xml|json] -PpluginFormat=[xqy|sjs] -PmappingName=yourmappingname -i
gradlew.bat hubCreateHarmonizeFlow -PentityName=YourEntityName -PflowName=YourFlowName -PdataFormat=[xml|json] -PpluginFormat=[xqy|sjs] -PmappingName=yourmappingname -i
entityName
(Required) The name of the entity that owns the flow.
flowName
(Required) The name of the harmonize flow to create.
dataFormat
xml or json. The default is json.
pluginFormat
The plugin programming language: xqy or sjs.
mappingName
The name of a model-to-model mapping to use during code generation.

hubCreateInputFlow

Creates a legacy (DHF 4.x) input flow. The resulting DHF 4.x flow must be executed using hubRunLegacyFlow.

./gradlew hubCreateInputFlow -PentityName=YourEntityName -PflowName=YourFlowName -PdataFormat=[xml|json] -PpluginFormat=[xqy|sjs] -i
gradlew.bat hubCreateInputFlow -PentityName=YourEntityName -PflowName=YourFlowName -PdataFormat=[xml|json] -PpluginFormat=[xqy|sjs] -i
entityName
(Required) The name of the entity that owns the flow.
flowName
(Required) The name of the input flow to create.
dataFormat
xml or json. The default is json.
pluginFormat
The plugin programming language: xqy or sjs.

hubDeleteJobs

Deletes job records. This task does not affect the contents of the staging or final databases.

./gradlew hubDeleteJobs -PjobIds=ID1,ID2,IDn -i
gradlew.bat hubDeleteJobs -PjobIds=ID1,ID2,IDn -i
jobIds
(Required) A comma-separated list of job IDs to delete.

Requires the security role data-hub-operator or any role that inherits it.

For on-premises only.

hubExportJobs

Exports job records. This task does not affect the contents of the staging or final databases.

./gradlew hubExportJobs -PjobIds=ID1,ID2,IDn -Pfilename=export.zip -i
gradlew.bat hubExportJobs -PjobIds=ID1,ID2,IDn -Pfilename=export.zip -i
jobIds
A comma-separated list of job IDs to export.
filename
The name of the zip file to generate, including the file extension. The default is jobexport.zip.

Requires the security role data-hub-operator or any role that inherits it.

For on-premises only.

hubRunLegacyFlow

Runs a (legacy) DHF 4.x harmonization flow.

./gradlew hubRunLegacyFlow -PentityName=YourEntityName -PflowName=YourFlowName -PbatchSize=100 -PthreadCount=4 -PsourceDB=data-hub-STAGING -PdestDB=data-hub-FINAL -PshowOptions=[true|false] -Pdhf.YourKey=YourValue -i
gradlew.bat hubRunLegacyFlow -PentityName=YourEntityName -PflowName=YourFlowName -PbatchSize=100 -PthreadCount=4 -PsourceDB=data-hub-STAGING -PdestDB=data-hub-FINAL -PshowOptions=[true|false] -Pdhf.YourKey=YourValue -i
entityName
(Required) The name of the entity containing the harmonize flow.
flowName
(Required) The name of the harmonize flow to run.
batchSize
The maximum number of items to process in a batch. The default is 100.
threadCount
The number of threads to run. The default is 4.
sourceDB
The name of the database to run against. The default is the name of your staging database.
destDB
The name of the database to put harmonized results into. The default is the name of your final database.
showOptions
Whether to print out options that were passed in to the command. The default is false.
dhf.YourKey
The value to associate with your key. These key-value pairs are passed as custom parameters to your flow. You can pass additional key-value pairs as separate options:
hubRunLegacyFlow ... -Pdhf.YourKeyA=YourValueA -Pdhf.YourKeyB=YourValueB ...
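When a script needs to pass several custom key-value pairs, the separate -Pdhf options can be generated from a list. The following shell sketch shows one way to do that; the helper name dhf_params is hypothetical, and it assumes that keys and values contain no blank spaces.

```shell
# Hypothetical helper: turn key=value pairs into -Pdhf.key=value arguments.
# Assumes keys and values contain no blank spaces.
dhf_params() {
  out=""
  for pair in "$@"; do
    out="$out -Pdhf.$pair"
  done
  printf '%s\n' "${out# }"     # drop the leading space
}

# Dry run with placeholder keys and values: print the command instead of executing it.
echo "./gradlew hubRunLegacyFlow -PentityName=YourEntityName -PflowName=YourFlowName $(dhf_params YourKeyA=YourValueA YourKeyB=YourValueB) -i"
```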

Requires the security role data-hub-operator or any role that inherits it.

For on-premises and DHS.

QuickStart: Accepts project artifacts in the QuickStart format only.

The custom key-value parameters passed to your step module are available through the $options (xqy) or options (sjs) variables inside your step module.

Alternative Tasks

If you are using the following tasks, switch to hubDeploy instead.

  • hubDeployUserArtifacts
  • hubGeneratePii
  • hubSaveIndexes
  • mlLoadModules
  • mlUpdateIndexes