Envelope Pattern
Overview
The MarkLogic Data Hub uses the envelope pattern to encapsulate data. In the envelope pattern, the original content and the associated metadata are stored in the same envelope (an entity) but remain separate. This preserves the original content, while allowing additional metadata to be added.
{
  "envelope": {
    "headers": {},
    "triples": [],
    "instance": {
      "your original data": "goes here"
    }
  }
}
<envelope>
  <headers/>
  <triples/>
  <instance>
    your original data goes here
  </instance>
</envelope>
In MarkLogic Data Hub, you can add metadata to the envelope to harmonize data that arrives with different field names or formats, while the original data is preserved unchanged.
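As a rough illustration of the structure shown above, the following Python sketch wraps a source record in an envelope. It is not Data Hub code: the function name wrap_in_envelope and the sample record are hypothetical, and only the envelope, headers, triples, and instance keys come from the samples above.

```python
import copy
import json


def wrap_in_envelope(original, headers=None, triples=None):
    """Return a new envelope dict that stores `original` under "instance".

    Hypothetical helper for illustration only. The original record is
    deep-copied so later changes to the envelope cannot alter the source.
    """
    return {
        "envelope": {
            "headers": dict(headers or {}),   # metadata about the record
            "triples": list(triples or []),   # semantic triples, if any
            "instance": copy.deepcopy(original),  # the untouched source data
        }
    }


# Hypothetical source record; the field names are made up for this sketch.
source_record = {"name": "Ada", "gndr": "F"}
doc = wrap_in_envelope(source_record)
print(json.dumps(doc, indent=2))
```

Keeping the source data under instance and everything else under headers or triples is what lets metadata grow over time without touching the original content.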
Example
Suppose you have two data sources that both contain gender information but in fields with different names and different formats.
- Source 1 has a field named gender.
- Source 2 has the same information in a field named gndr.
With the envelope, you can normalize the fields into a single field (e.g., normalizedGender), which is stored as metadata in the same envelope. Then applications can query your data hub using the same field name and get the result in the same format (e.g., female), regardless of the data source.
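The example above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the Data Hub API: the function harmonize_gender, the code table GENDER_CODES, and the sample value "F" are hypothetical, and placing normalizedGender under headers is an assumption, since the text says only that it is stored as metadata in the envelope.

```python
import copy

# Hypothetical lookup table mapping source-specific values to one format.
GENDER_CODES = {"f": "female", "female": "female", "m": "male", "male": "male"}

# Field names used by the two sources in the example.
SOURCE_FIELDS = ("gender", "gndr")


def harmonize_gender(envelope_doc):
    """Return a copy of the envelope with headers.normalizedGender set.

    The instance (original data) is never modified. If no known field or
    value is found, the copy is returned without the normalized field.
    """
    result = copy.deepcopy(envelope_doc)
    env = result["envelope"]
    instance = env["instance"]
    for field in SOURCE_FIELDS:
        if field in instance:
            raw = str(instance[field]).strip().lower()
            normalized = GENDER_CODES.get(raw)
            if normalized is not None:
                # Assumption: normalized metadata lives under "headers".
                env["headers"]["normalizedGender"] = normalized
            break
    return result


# Source 1 uses "gender"; Source 2 uses "gndr" with a hypothetical short code.
doc1 = {"envelope": {"headers": {}, "triples": [], "instance": {"gender": "female"}}}
doc2 = {"envelope": {"headers": {}, "triples": [], "instance": {"gndr": "F"}}}

harmonized1 = harmonize_gender(doc1)
harmonized2 = harmonize_gender(doc2)
print(harmonized1["envelope"]["headers"])
print(harmonized2["envelope"]["headers"])
```

Both envelopes end up with the same normalizedGender value in their headers, while each instance still holds the field name and value exactly as the source supplied them.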