Creating a Custom Hook Module

Custom Hook Modules

A custom hook allows you to perform tasks that are outside the scope of Data Hub. Custom hook modules can run immediately before (pre-step hook) or immediately after (post-step hook) the step's core processes.

A step is generally processed as follows:

  1. The selected data is read from the source database.
  2. The pre-step hook module runs, if any.
  3. The main step module runs. This can be a default Data Hub step functionality or a custom step module.
  4. The post-step hook module runs, if any.
  5. The processed data is written to the target database.
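
The processing order above can be modeled with a small simulation. This is only an illustrative sketch: `runStep`, `stepDef`, `preHook`, `postHook`, and `main` are hypothetical names, and Data Hub's actual step runner is internal and works differently. The point is simply that hooks run between the read and the write, on either side of the main step module.

```javascript
// Simplified, hypothetical model of the step lifecycle. Not Data Hub code.
function runStep(stepDef, sourceDocs) {
  const order = [];

  // 1. The selected data is read from the source database.
  let content = sourceDocs.map((doc) => ({ ...doc }));
  order.push("read");

  // 2. The pre-step hook runs, if one is configured.
  if (stepDef.preHook) {
    order.push("pre-hook");
    stepDef.preHook(content);
  }

  // 3. The main step module runs.
  content = content.map(stepDef.main);
  order.push("main");

  // 4. The post-step hook runs, if one is configured.
  if (stepDef.postHook) {
    order.push("post-hook");
    stepDef.postHook(content);
  }

  // 5. The processed data is written to the target database.
  order.push("write");
  return { written: content, order };
}

// Usage: a step with only a pre-step hook that records an audit message.
const audit = [];
const result = runStep(
  {
    preHook: (docs) => audit.push(`about to process ${docs.length} document(s)`),
    main: (doc) => ({ ...doc, value: doc.value.toUpperCase() }),
  },
  [{ uri: "/a.json", value: "x" }]
);

console.log(result.order.join(" -> ")); // read -> pre-hook -> main -> write
```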

A custom hook is added as a node to the step's JSON structure:

  • QS format: In the flow configuration file.
  • HC format: In the step configuration file.
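
As a rough illustration, the hook node might look like the object below, written here as a JavaScript object and serialized to JSON. The property names (`customHook`, `module`, `parameters`, `user`, `runBefore`) are assumptions based on commonly documented Data Hub 5 step configurations and should be checked against the reference for your version. The module path and the `archivePrefix` parameter are hypothetical.

```javascript
// Assumed shape of a custom hook node; verify property names for your Data Hub version.
const customHook = {
  module: "/custom-modules/hooks/archive-hook.sjs", // hypothetical module path
  parameters: { archivePrefix: "/archive" },        // hypothetical parameter passed to the hook
  user: "",                                         // assumed: user account that runs the hook
  runBefore: true,                                  // assumed: true = pre-step hook, false = post-step hook
};

// The node sits inside the step's JSON structure.
const stepFragment = { customHook };
console.log(JSON.stringify(stepFragment, null, 2));
```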

If a process in the custom hook module conflicts with a process in the step's core module or functionality, the core module or functionality overrides the hook module's conflicting process.

Each custom hook module is executed in its own environment, separate from Data Hub processes and other modules.

Custom hooks can be added to any type of step (ingestion, mapping, matching, merging, mastering, or custom).

Store all your custom modules under your-project-root/src/main/ml-modules/root/custom-modules. The Gradle task mlDeploy deploys the contents of that directory to the MODULES database.

Required Inputs and Outputs

A custom hook module does not require specific inputs or outputs. However, it can access some information by declaring and using the following variables in the code.

  // A custom hook module receives values for the following parameters via Data Hub. You can declare only the ones you need and ignore the rest.
  var uris;         // An array of one or more URIs being processed.
  var content;      // An array of objects that represent each document being processed.
  var options;      // The Options object passed to the step by Data Hub.
  var flowName;     // The name of the flow being processed.
  var stepNumber;   // The index of the step within the flow being processed. The stepNumber of the first step is 1.
  var step;         // The step object.
  var database;     // The target database.

Tip: If the custom task reads from and writes to the database, create a custom step module instead, embed it in a separate custom step, and add that step to your flow.
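
A minimal sketch of how a hook might use a few of these variables follows. The helper `describeBatch` is hypothetical. In a real hook module you would declare `uris`, `flowName`, and `stepNumber` as shown above and pass them in. The values used here are sample data so the sketch runs on its own.

```javascript
// Hypothetical helper that summarizes the batch a hook was invoked for.
// `ctx` carries the values Data Hub would supply through the declared variables.
function describeBatch(ctx) {
  const count = Array.isArray(ctx.uris) ? ctx.uris.length : 0;
  return `flow=${ctx.flowName} step=${ctx.stepNumber} documents=${count}`;
}

// Sample values standing in for the variables supplied by Data Hub.
const summary = describeBatch({
  uris: ["/customers/1.json", "/customers/2.json"],
  flowName: "CustomerFlow",
  stepNumber: 1,
});

console.log(summary); // flow=CustomerFlow step=1 documents=2
```

Inside MarkLogic, the same message could be written to the server log with `xdmp.log(summary)` instead of `console.log`.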

Example

The following custom hook module example archives an existing record before a new record with the same URI is ingested, so that the new record starts with a fresh history.
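
A minimal sketch of that archiving logic is shown below, assuming the hook runs before the step and that archived copies go under a URI prefix. The names `archiveUriFor`, `archiveExisting`, `store`, the `/archive` prefix, and the date stamp are all hypothetical. Database access is abstracted behind `store` so the sketch runs standalone. In a MarkLogic module, the comments indicate which built-ins would likely take its place.

```javascript
// Hypothetical helper: build the URI under which the old record is archived.
function archiveUriFor(uri, prefix, stamp) {
  return `${prefix}/${stamp}${uri}`;
}

// Copy every existing document to its archive URI; skip URIs with no existing document.
// `store` is a stand-in for the database.
function archiveExisting(uris, store, prefix, stamp) {
  const archived = [];
  for (const uri of uris) {
    if (!store.exists(uri)) continue;        // in MarkLogic: fn.docAvailable(uri)
    const target = archiveUriFor(uri, prefix, stamp);
    store.write(target, store.read(uri));    // in MarkLogic: xdmp.documentInsert(target, cts.doc(uri))
    archived.push(target);
  }
  return archived;
}

// In-memory stand-in for the database, holding one existing record.
const docs = new Map([["/customers/1.json", { name: "Old record" }]]);
const store = {
  exists: (uri) => docs.has(uri),
  read: (uri) => docs.get(uri),
  write: (uri, doc) => docs.set(uri, doc),
};

// Sample URIs standing in for the `uris` variable supplied by Data Hub.
const archived = archiveExisting(
  ["/customers/1.json", "/customers/2.json"],
  store,
  "/archive",
  "2024-01-01"
);

console.log(archived); // [ '/archive/2024-01-01/customers/1.json' ]
```

A module that writes to the database in MarkLogic would also need to call `declareUpdate()` before performing updates.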

Next Steps

After creating your custom hook module, add a custom hook to your step in the flow and specify the path to your new custom hook module.