Migrating 4.3 Flows to 5.x Steps

In Data Hub 5.x,
  • A step is equivalent to a Data Hub Framework 4.3 flow. See About Steps.
  • A flow is a sequence of steps. See About Flows.

Migrating your 4.3 flows into Data Hub 5.x steps is optional. You can continue to run your DHF 4.3 flows inside Data Hub 5.x using legacy Gradle tasks.

However, to take advantage of Data Hub 5.x features such as mastering and improved mapping, you can migrate your DHF 4.3 flows into Data Hub 5.x steps using either of the following approaches, or a combination of both, and then add the new steps to one or more flows.

Use Configuration-Based Steps

You can create a Data Hub 5.x ingestion step and a mapping step to replace your DHF 4.3 input flow and harmonize flow, respectively.

4.x Flows         5.x Steps
input flow        ingestion step
harmonize flow    mapping step

You can customize these steps by adding custom hook modules that run before or after the default step processes.

  • You can take greater advantage of Data Hub 5.x features.
  • New steps are more compatible with current and future Data Hub 5.x features.
  • You have less code to maintain, although you might initially have to trim your existing plugins to avoid duplicating the default step processes. If your plugins do more than the default processes, you can move that extra code into a custom hook module.
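
The idea of a hook that runs before or after the default step processes can be illustrated with a small, purely conceptual sketch. Nothing here is Data Hub code: defaultMappingProcess, trimNameHook, runStepWithHook, and the runBefore flag are hypothetical names chosen for illustration, and the way Data Hub actually invokes a hook module may differ. See Creating a Custom Hook Module for the real interface.

```javascript
// Hypothetical stand-in for a default step process (not Data Hub code).
// It maps the raw "name" property to a "fullName" property.
function defaultMappingProcess(doc) {
  return { fullName: doc.name };
}

// Hypothetical hook holding extra logic that used to live in a 4.3 plugin:
// it trims whitespace from "name" if that property is present.
function trimNameHook(doc) {
  return "name" in doc ? { ...doc, name: doc.name.trim() } : doc;
}

// Runs the hook either before or after the default process.
// The runBefore flag is an assumed name for this sketch.
function runStepWithHook(doc, hook, runBefore) {
  return runBefore
    ? defaultMappingProcess(hook(doc))
    : hook(defaultMappingProcess(doc));
}

const raw = { name: "  Ada Lovelace " };
console.log(runStepWithHook(raw, trimNameHook, true));
console.log(runStepWithHook(raw, trimNameHook, false));
```

With these stubs, running the hook before the default process yields a trimmed fullName, while running it afterward leaves the value untouched because the mapped document no longer has a name property. When you move plugin logic into a hook, the choice of before or after depends on which shape of the document that logic expects.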

See Getting Started to learn how to create ingestion and mapping steps, as well as entities and flows.

See Creating a Custom Hook Module.

Use Custom Steps

You can create Data Hub 5.x custom steps and add your DHF 4.3 plugin code (as is or rewritten) to each custom step's module.

4.x Flows         5.x Steps
input flow        custom-ingestion step
harmonize flow    custom-mapping step

  • Ideal if your plugins perform extensive custom processes that the Data Hub 5.x configuration-based steps do not handle.
  • Initially faster to implement because you don't need to change your headers, triples, or content code. However, you must maintain your modules yourself to keep them compatible with future Data Hub releases.
  • If you copy your plugin code as is into the custom step module, you might be importing unnecessary libraries for functionality that the Data Hub 5.x configuration-based steps already handle. These libraries could adversely impact performance.
  • Requires more technical expertise to maintain.

See Getting Started to learn how to create custom steps, as well as entities and flows.

See Editing a Custom Step Module.

Examples:

  • To add your DHF 4.3 plugin code as is, see an example project in which a Data Hub 5.x custom step calls the headers, triples, and content JavaScript libraries copied from a DHF 4.3 project.
  • To rewrite your DHF 4.3 plugin code into a native Data Hub 5.x module, see an example that uses the custom step code template as a guide.
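
The "as is" approach can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the example project itself: the three create* functions are inline stand-ins with simplified signatures for the content, headers, and triples libraries you would copy from your DHF 4.3 project, the main(content, options) entry point and the uri and value properties are assumed from the custom step module pattern, and the envelope is assembled by hand so the sketch runs outside MarkLogic. See Editing a Custom Step Module for the actual module template.

```javascript
// Stand-ins for copied DHF 4.3 plugin functions (hypothetical bodies and
// simplified signatures). In a real project you would require your copied
// content, headers, and triples libraries instead of defining these inline.
function createContent(id, source, options) {
  // Shape the raw source document into an entity instance.
  return { id: source.customerId, name: String(source.name).trim() };
}

function createHeaders(id, content, options) {
  return { sourceUri: id, migratedFrom: "dhf-4.3" };
}

function createTriples(id, content, headers, options) {
  return [];
}

// Entry point in the style of a custom step module (assumed shape):
// receives a content object and the step options, returns updated content.
function main(content, options) {
  const id = content.uri;
  const instance = createContent(id, content.value, options);
  const headers = createHeaders(id, instance, options);
  const triples = createTriples(id, instance, headers, options);
  // Hand-built envelope for illustration only.
  content.value = { envelope: { headers, triples, instance } };
  return content;
}

module.exports = { main };

// Local usage with a made-up document.
const result = main(
  { uri: "/customers/1.json", value: { customerId: 1, name: "  Ada  " } },
  {}
);
console.log(JSON.stringify(result.value, null, 2));
```

Keeping the 4.3 functions behind a single main function means the plugin code itself stays unchanged, which is what makes this approach quick to adopt. The trade-off noted above still applies: the copied libraries remain your responsibility to keep compatible with future Data Hub releases.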