Flows
Overview of flows in Data Hub.
About Flows
A flow defines a sequence of one or more steps, which are modules that process and enhance your data. A flow declares which steps will be executed in what order and with which options.
For example, a flow can consist of an ingestion step that pulls in your raw data, followed by a mapping step that specifies how the fields of your raw data are assigned to the properties of your entity model.
Each flow must contain at least one step. The number of steps in a flow is unlimited; however, flows with fewer steps can be easier to maintain and debug.
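The idea of a flow as an ordered sequence of steps, each with its own options, can be illustrated with a minimal Python sketch. This is a conceptual model only, not the Data Hub API: the `Step` and `Flow` classes, the `ingest` and `map_to_entity` functions, and the option names `collection` and `properties` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class Step:
    """One processing module in a flow (hypothetical model, not the Data Hub API)."""
    name: str
    run: Callable[[List[Record], Dict[str, Any]], List[Record]]
    options: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Flow:
    """An ordered sequence of steps; the list order is the execution order."""
    name: str
    steps: List[Step]

    def __post_init__(self) -> None:
        # Mirrors the rule that each flow must contain at least one step.
        if not self.steps:
            raise ValueError("a flow must contain at least one step")

    def execute(self, records: List[Record]) -> List[Record]:
        # The output of each step becomes the input of the next one.
        for step in self.steps:
            records = step.run(records, step.options)
        return records


def ingest(records: List[Record], options: Dict[str, Any]) -> List[Record]:
    # Hypothetical ingestion: wrap each raw row and tag it with a collection.
    return [{"collection": options["collection"], "raw": r} for r in records]


def map_to_entity(records: List[Record], options: Dict[str, Any]) -> List[Record]:
    # Hypothetical mapping: options["properties"] maps entity property -> raw field.
    return [
        {prop: rec["raw"][src] for prop, src in options["properties"].items()}
        for rec in records
    ]


flow = Flow(
    name="customer-flow",
    steps=[
        Step("ingest", ingest, {"collection": "raw-customers"}),
        Step("map", map_to_entity, {"properties": {"fullName": "name", "email": "mail"}}),
    ],
)

result = flow.execute([{"name": "Ada Lovelace", "mail": "ada@example.com"}])
print(result)
```

In this sketch, reordering the `steps` list changes the execution order, and constructing a `Flow` with an empty list raises a `ValueError`.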
Data Hub flows are not designed to replace your external orchestration tool (e.g., Apache NiFi); however, chaining multiple steps in your flows might reduce the complexity of the scenarios that your orchestration tool must handle.
- In Data Hub Framework 4.x, a flow (input or harmonization) consists of plugins that process the data.
- In Data Hub 5.x, a flow is a sequence of steps (ingestion, mapping, matching, merging, mastering, or custom) that process the data.
| v4.x and earlier | v5.0 | v5.1 | v5.4 - HC Format |
|---|---|---|---|
| Input flow | Ingestion step | Ingestion step | Loading step |
| Harmonization flow | Mapping step | Mapping step | Mapping step |
| | Mastering step | | |
| | Custom step | | |