Create and Add a Step to a Flow Using Gradle

Before you begin

You need:

About this task

A flow must have at least one step. You can create steps in two ways:

  • By customizing the example steps. If you created your flow using Gradle, the resulting flow definition file includes an example step for each predefined type of step (ingestion, mapping, matching, merging, mastering). Simply customize these example steps.
  • By creating and customizing a step definition using Gradle. Use the Gradle task hubCreateStepDefinition to initialize a step definition, copy the step definition to the flow definition, and customize.

Steps that are configured directly in the flow definition file, as well as steps that use the step definition created by running the Gradle task hubCreateStepDefinition, are equivalent to the custom steps generated by QuickStart. The stepDefType parameter for hubCreateStepDefinition corresponds to the Custom Step Type field in QuickStart, with one exception: -PstepDefType=custom in the Gradle syntax corresponds to setting Custom Step Type to Other in QuickStart.

This task describes how to create and customize a step definition using Gradle.

Procedure

  1. Using Gradle, create the step definition.
    1. Open a command-line window, and go to your project root directory.
    2. At your project root, run the Gradle task hubCreateStepDefinition, setting -PstepDefType to the type of step definition you want (ingestion, mapping, matching, merging, mastering, or custom). For each type, the first command is for macOS or Linux and the second is for Windows.
      • ingestion
        ./gradlew hubCreateStepDefinition -PstepDefName=your-ingestion-step-name -PstepDefType=ingestion -i
        gradlew.bat hubCreateStepDefinition -PstepDefName=your-ingestion-step-name -PstepDefType=ingestion -i
      • mapping
        ./gradlew hubCreateStepDefinition -PstepDefName=your-mapping-step-name -PstepDefType=mapping -i
        gradlew.bat hubCreateStepDefinition -PstepDefName=your-mapping-step-name -PstepDefType=mapping -i
      • matching
        ./gradlew hubCreateStepDefinition -PstepDefName=your-matching-step-name -PstepDefType=matching -i
        gradlew.bat hubCreateStepDefinition -PstepDefName=your-matching-step-name -PstepDefType=matching -i
      • merging
        ./gradlew hubCreateStepDefinition -PstepDefName=your-merging-step-name -PstepDefType=merging -i
        gradlew.bat hubCreateStepDefinition -PstepDefName=your-merging-step-name -PstepDefType=merging -i
      • mastering
        ./gradlew hubCreateStepDefinition -PstepDefName=your-mastering-step-name -PstepDefType=mastering -i
        gradlew.bat hubCreateStepDefinition -PstepDefName=your-mastering-step-name -PstepDefType=mastering -i
      • custom
        ./gradlew hubCreateStepDefinition -PstepDefName=your-custom-step-name -PstepDefType=custom -i
        gradlew.bat hubCreateStepDefinition -PstepDefName=your-custom-step-name -PstepDefType=custom -i

      To associate an XQuery module (instead of the default JavaScript module) with the step definition, add the option -Pformat=xqy. This creates an XQuery sample module, which you can customize, and a JavaScript wrapper.

      Note: If you omit -PstepDefType, the step definition type defaults to custom.

      The following files are created:

      • The step definition file
        your-project-root/step-definitions/ingestion/your-ingestion-step-name/your-ingestion-step-name.step.json
        your-project-root/step-definitions/mapping/your-mapping-step-name/your-mapping-step-name.step.json
        your-project-root/step-definitions/matching/your-matching-step-name/your-matching-step-name.step.json
        your-project-root/step-definitions/merging/your-merging-step-name/your-merging-step-name.step.json
        your-project-root/step-definitions/mastering/your-mastering-step-name/your-mastering-step-name.step.json
        your-project-root/step-definitions/custom/your-custom-step-name/your-custom-step-name.step.json
      • The custom module file
        your-project-root/src/main/ml-modules/root/custom-modules/ingestion/your-ingestion-step-def-name/main.sjs
        your-project-root/src/main/ml-modules/root/custom-modules/mapping/your-mapping-step-def-name/main.sjs
        your-project-root/src/main/ml-modules/root/custom-modules/matching/your-matching-step-def-name/main.sjs
        your-project-root/src/main/ml-modules/root/custom-modules/merging/your-merging-step-def-name/main.sjs
        your-project-root/src/main/ml-modules/root/custom-modules/mastering/your-mastering-step-def-name/main.sjs
        your-project-root/src/main/ml-modules/root/custom-modules/custom/your-custom-step-def-name/main.sjs

      The modulePath setting in the step definition points to the custom module file.
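
The command and the resulting file locations follow a regular pattern, which the following Python sketch spells out. It is illustrative only: the helper names build_command and expected_files are hypothetical, the sketch only assembles strings and does not run Gradle, and it assumes that the name placeholders in both file paths refer to the value passed as -PstepDefName. The module path covers the default JavaScript case only.

```python
from pathlib import PurePosixPath

# Step definition types named in this task.
STEP_DEF_TYPES = ("ingestion", "mapping", "matching", "merging", "mastering", "custom")


def build_command(step_def_name, step_def_type="custom", xquery=False, windows=False):
    """Return the hubCreateStepDefinition command as an argument list.

    Hypothetical helper: it only assembles the arguments shown in this task.
    """
    if step_def_type not in STEP_DEF_TYPES:
        raise ValueError(f"unknown step definition type: {step_def_type}")
    args = [
        "gradlew.bat" if windows else "./gradlew",
        "hubCreateStepDefinition",
        f"-PstepDefName={step_def_name}",
        f"-PstepDefType={step_def_type}",
    ]
    if xquery:
        # Requests an XQuery sample module plus a JavaScript wrapper.
        args.append("-Pformat=xqy")
    args.append("-i")
    return args


def expected_files(project_root, step_def_name, step_def_type="custom"):
    """Return (step definition file, custom module file) for the JavaScript case.

    Assumption: both paths use the -PstepDefName value as the directory name.
    """
    root = PurePosixPath(project_root)
    step_definition = (
        root / "step-definitions" / step_def_type / step_def_name
        / f"{step_def_name}.step.json"
    )
    module = (
        root / "src/main/ml-modules/root/custom-modules"
        / step_def_type / step_def_name / "main.sjs"
    )
    return step_definition, module


if __name__ == "__main__":
    print(" ".join(build_command("your-ingestion-step-name", "ingestion")))
    for path in expected_files("your-project-root", "your-ingestion-step-name", "ingestion"):
        print(path)
```

Comparing the printed paths with the files on disk is a quick way to confirm that the task created what you expected before you move on to editing the flow.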

  2. Manually add the step to the flow.
    1. In a text editor, open the step definition file.
      The custom ingestion step definition file contains the following:
        {
          "language" : "zxx",
          "name" : "your-ingestion-step-def-name",
          "description" : null,
          "type" : "INGESTION",
          "version" : 1,
          "options" : {
            "collections" : [ "your-ingestion-step-def-name" ],
            "outputFormat" : "json",
            "targetDatabase" : "data-hub-STAGING"
          },
          "customHook" : { },
          "modulePath" : "/custom-modules/ingestion/your-ingestion-step-def-name/main.sjs",
          "retryLimit" : 0,
          "batchSize" : 100,
          "threadCount" : 4,
          "fileLocations" : {
            "inputFilePath" : "",
            "outputURIReplacement" : "",
            "inputFileType" : ""
          }
        }
      
    2. Likewise, open the flow definition file.

      You can find your flow definition file in your-project-root/flows.

      The default flow definition file without any steps contains the following:

        {
          "name": "your-flow-name",
          "description": "",
          "batchSize": 100,
          "threadCount": 4,
          "options": {
            "sourceQuery": null
          },
          "steps": {}
        }
      
    3. In the steps node, add the step as a key-value pair.
      • For the key, enter a string containing a number that represents the order of the step in the sequence.
        Note: The steps can be listed in any order, as long as the keys are unique within the steps node of the flow. Duplicate keys can produce unexpected results. The key number must be greater than 0.
      • For the value, copy and paste the entire content of the step definition file.
    4. Edit the step in the flow definition file.
      • Rename the name setting to stepDefinitionName.
      • Rename the type setting to stepDefinitionType.
      • Create a new name setting and assign it a name for the step.
      • (Optional) Delete the following settings:
        • language
        • version
        • modulePath

        The values for these settings are retrieved from the step definition.
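
The manual edits in substeps 3 and 4 amount to a small transformation of the step definition, sketched below in Python for readers who prefer to script it. This is a minimal sketch, not part of the product: the function names step_from_definition and add_step are hypothetical, and the sample dictionaries are trimmed versions of the files shown in this task.

```python
import copy
import json

# Settings this task marks as optional to delete from the flow step.
OPTIONAL_SETTINGS = ("language", "version", "modulePath")


def step_from_definition(step_definition, step_name, drop_optional=True):
    """Convert the content of a step definition into a flow step (hypothetical helper)."""
    source = copy.deepcopy(step_definition)  # leave the caller's data untouched
    step = {"name": step_name}  # the new name setting for the step
    for key, value in source.items():
        if key == "name":
            step["stepDefinitionName"] = value  # rename name
        elif key == "type":
            step["stepDefinitionType"] = value  # rename type
        elif drop_optional and key in OPTIONAL_SETTINGS:
            continue  # retrieved from the step definition at run time
        else:
            step[key] = value
    return step


def add_step(flow, step, position):
    """Return a copy of the flow with the step added under the key str(position)."""
    if position < 1:
        raise ValueError("the step key must be a number greater than 0")
    key = str(position)
    updated = copy.deepcopy(flow)
    steps = updated.setdefault("steps", {})
    if key in steps:
        raise ValueError(f"step key {key!r} is already used in this flow")
    steps[key] = step
    return updated


if __name__ == "__main__":
    definition = {
        "language": "zxx",
        "name": "your-ingestion-step-def-name",
        "type": "INGESTION",
        "version": 1,
        "modulePath": "/custom-modules/ingestion/your-ingestion-step-def-name/main.sjs",
        "batchSize": 100,
    }
    flow = {"name": "your-flow-name", "steps": {}}
    step = step_from_definition(definition, "your-step-name")
    print(json.dumps(add_step(flow, step, 1), indent=2))
```

Both helpers return new objects instead of modifying their inputs, so the original step definition remains available for reuse in other flows.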

    After adding a default ingestion step to the default flow as the first step ("1"), the flow definition file looks as follows:

      {
        "name": "your-flow-name",
        "description": "",
        "batchSize": 100,
        "threadCount": 4,
        "options": {
          "sourceQuery": null
        },
        "steps": {
          "1": {
            "name" : "your-step-name",
            "stepDefinitionName" : "your-ingestion-step-def-name",
            "description" : null,
            "stepDefinitionType" : "INGESTION",
            "options" : {
              "collections" : [ "your-ingestion-step-def-name" ],
              "outputFormat" : "json",
              "targetDatabase" : "data-hub-STAGING"
            },
            "customHook" : { },
            "retryLimit" : 0,
            "batchSize" : 100,
            "threadCount" : 4,
            "fileLocations" : {
              "inputFilePath" : "",
              "outputURIReplacement" : "",
              "inputFileType" : ""
            }
          }
        }
      }
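
Because duplicate step keys can produce unexpected results, it can help to check a hand-edited flow definition before using it. The following Python sketch is one way to do that. The function names load_flow_checked and steps_in_order are hypothetical, and sorting the steps by the numeric value of their keys is an assumption based on the statement that the key represents the order of the step in the sequence.

```python
import json


def load_flow_checked(text):
    """Parse a flow definition, rejecting duplicate keys and invalid step keys."""

    def reject_duplicates(pairs):
        # Called for every JSON object, so duplicates anywhere are reported.
        seen = {}
        for key, value in pairs:
            if key in seen:
                raise ValueError(f"duplicate key: {key!r}")
            seen[key] = value
        return seen

    flow = json.loads(text, object_pairs_hook=reject_duplicates)
    for key in flow.get("steps", {}):
        # Step keys must be strings containing a number greater than 0.
        if not (key.isascii() and key.isdigit() and int(key) > 0):
            raise ValueError(f"invalid step key: {key!r}")
    return flow


def steps_in_order(flow):
    """Return the steps sorted by the numeric value of their keys (assumed order)."""
    steps = flow.get("steps", {})
    return [steps[key] for key in sorted(steps, key=int)]


if __name__ == "__main__":
    sample = '{"name": "your-flow-name", "steps": {"2": {"name": "b"}, "1": {"name": "a"}}}'
    for step in steps_in_order(load_flow_checked(sample)):
        print(step["name"])
```

The object_pairs_hook argument is needed because json.loads otherwise keeps only the last value for a repeated key without reporting it.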
    

What to do next