Migrating 4.3 Flows to 5.x Steps

In Data Hub 5.x:
  • A step is equivalent to a Data Hub Framework 4.3 flow. See About Steps.
  • A flow is a sequence of steps. See About Flows.

Migrating your 4.3 flows into Data Hub 5.x steps is optional. You can continue to run your DHF 4.3 flows inside Data Hub 5.x using legacy Gradle tasks.

However, to take advantage of Data Hub 5.x features such as mastering and improved mapping, you must migrate your DHF 4.3 flows into Data Hub 5.x steps using one of the following approaches, or a combination of both, and then add those steps to one or more flows.

Additionally, if you intend to use Hub Central, you must also convert your models, steps, and flows to the Hub Central format. Learn more: convert your project artifacts from QuickStart to Hub Central

Use Configuration-Based Steps

You can create Data Hub 5.x ingestion steps and mapping steps to replace your DHF 4.3 input flows and harmonize flows, respectively.

4.x Flows         5.x Steps
input flow        ingestion step
harmonize flow    mapping step

You can customize these steps by adding interceptor or custom hook modules that run before or after the default step processes.

Note: Custom hooks are deprecated. Use interceptors instead.

Considerations for configuration-based steps:
  • You can take greater advantage of Data Hub 5.x features.
  • New steps are more compatible with current and future Data Hub 5.x features.
  • You have less code to maintain, although you might initially need to trim down your existing plugins to avoid duplicating the default step processes. If your plugins do more than the default processes, you can move that additional code into an interceptor or a custom hook. Learn more: About Interceptors and Custom Hooks
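
As a rough illustration of moving extra plugin logic out of a trimmed-down plugin, the sketch below shows a small piece of post-processing written as an interceptor-style function. The function name normalizeCountry, the country property, the module path, and the envelope shape are all hypothetical. The path, when, and vars keys are an assumption about how a Data Hub 5.x step definition declares an interceptor, so confirm them against About Interceptors and Custom Hooks for your version. An actual interceptor module is also assumed to receive contentArray and options as variables supplied by Data Hub rather than as function arguments; a plain function is used here so that the sketch runs on its own.

```javascript
// Assumed shape of an interceptor entry in a step definition.
// The module path is hypothetical.
const interceptorConfig = {
  path: "/custom-modules/step-interceptors/normalizeCountry.sjs",
  when: "beforeContentPersisted",
  vars: {}
};

// Hypothetical extra processing that a 4.3 plugin performed beyond the
// default mapping behavior: normalize the country code on each instance.
function normalizeCountry(contentArray, options) {
  for (const content of contentArray) {
    const instance = content.value.envelope.instance;
    if (typeof instance.country === "string") {
      instance.country = instance.country.trim().toUpperCase();
    }
  }
  return contentArray;
}

// Local stand-in for the batch of content objects a step would produce.
const batch = [
  {
    uri: "/customers/1.json",
    value: { envelope: { instance: { country: " us " } } }
  }
];

normalizeCountry(batch, {});
console.log(batch[0].value.envelope.instance.country);
```

Keeping this logic in an interceptor leaves the default mapping process untouched, which is the point of trimming the plugin in the first place.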

Learn how to create ingestion and mapping steps, as well as entities and flows: Getting Started

Use Custom Steps

You can create Data Hub 5.x custom steps and add your DHF 4.3 plugin code (as is or rewritten) to each custom step's module.

4.x Flows         5.x Steps
input flow        custom-ingestion step
harmonize flow    custom-mapping step

Considerations for custom steps:
  • Ideal if your plugins perform extensive custom processes that the Data Hub 5.x configuration-based steps do not handle.
  • Initially faster to implement because you don't need to change your headers, triples, or content code. However, you must maintain your modules to keep them compatible with future Data Hub releases.
  • If you copy your plugin code as is into the custom step module, you might be importing unnecessary libraries for functionality that the Data Hub 5.x configuration-based steps already handle. These libraries could adversely impact performance.
  • Requires more technical expertise to maintain.
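
The "as is" approach can be sketched roughly as follows: a custom step's main function calls the content, headers, and triples functions carried over from a DHF 4.3 harmonize flow and wraps their results in an envelope. This is a minimal sketch under stated assumptions, not the shipped custom step template. The three create* functions are hypothetical stand-ins for copied plugin code, the main(content, options) signature and envelope layout are assumed, and the content value is treated as a plain object so that the example runs outside MarkLogic.

```javascript
// Hypothetical stand-ins for the content, headers, and triples plugin
// functions copied from a DHF 4.3 harmonize flow. In a real project they
// would be loaded from the copied library files instead.
function createContent(id, source, options) {
  return { customerId: source.id, fullName: `${source.first} ${source.last}` };
}

function createHeaders(id, content, options) {
  return { sourceUri: id, migratedFrom: "dhf-4.3" };
}

function createTriples(id, content, headers, options) {
  return [];
}

// Assumed custom step entry point: takes one content object plus the step
// options, and returns the same object with its value replaced by an envelope.
function main(content, options) {
  const id = content.uri;
  const source = content.value; // assumed to be a plain object here

  const instance = createContent(id, source, options);
  const headers = createHeaders(id, instance, options);
  const triples = createTriples(id, instance, headers, options);

  content.value = { envelope: { headers, triples, instance } };
  return content;
}

// In an actual step module, main would be exported instead of called here.
const result = main(
  { uri: "/customers/1.json", value: { id: 1, first: "Ada", last: "Lovelace" } },
  {}
);
console.log(JSON.stringify(result.value.envelope.instance));
```

Because the legacy functions are called unchanged, any libraries they import come along with them, which is where the performance and maintenance costs listed above originate.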

See Getting Started to learn how to create custom steps, as well as entities and flows.

To edit the module of a custom step, see Editing a Custom Step Module.

Examples:

  • To add your DHF 4.3 plugin code as is, see an example project in which a Data Hub 5.x custom step calls the headers, triples, and content JavaScript libraries copied from a DHF 4.3 project.
  • To rewrite your DHF 4.3 plugin code into a native Data Hub 5.x module, see an example that uses the custom step code template as a guide.