Create a Project Using QuickStart

Before you begin

You need:

About this task

QuickStart is the easiest way to use MarkLogic Data Hub.

In this task, you will download and run the QuickStart .war file to do the following:

  • Set up the local directories and files required for your project.
  • Deploy the required Data Hub components to your MarkLogic Server.

Important: QuickStart is not supported for production use.

Procedure

  1. Create a directory for your Data Hub project. This directory will be referred to as "your project root" or simply "root".
  2. Open a command-line window, and go to your project root directory.
  3. Download the marklogic-datahub-5.0.0.war file and place it in your project root directory.
  4. Run the QuickStart .war.
    • To use the default port number for the internal web server (port 8080):
      java -jar marklogic-datahub-5.0.0.war
    • To use a custom port number (for example, port 9000):
      java -jar marklogic-datahub-5.0.0.war --server.port=9000
    Note: If you are using Windows and a firewall alert appears, click Allow access.

    [Figure: QuickStart command-line output]

  5. Go through the wizard to initialize your project and install Data Hub to your MarkLogic Server.
    1. Open a web browser, and navigate to http://localhost:8080. If you started QuickStart on a custom port, use that port instead.
    2. Browse to your project root directory, then click NEXT.
    3. Click INITIALIZE to initialize your project directory.
    4. After initialization, your project directory contains additional files and directories. Click NEXT.
    5. Choose the local environment, then click NEXT.
    6. Enter your MarkLogic Server credentials, then click LOGIN.
    7. Click INSTALL to install Data Hub into MarkLogic Server.
    8. Wait for the installation to complete.
    9. When installation is complete, click FINISHED.
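
The command-line part of the procedure (steps 1 through 4, plus the URL in step 5) can be sketched as a small shell script. This is only an illustrative dry run, not an official Data Hub tool: the function names and the PROJECT_ROOT variable are hypothetical, and the script prints the launch command and browser URL rather than starting QuickStart.

```shell
#!/bin/sh
# Sketch only: function names and the PROJECT_ROOT variable are hypothetical,
# chosen for this example; they are not part of Data Hub.

WAR_FILE="marklogic-datahub-5.0.0.war"   # file name from step 3

# Steps 1-3: succeed only if the .war file is in the given project root.
war_present() {
  [ -f "$1/$WAR_FILE" ]
}

# Step 4: print the launch command for a port (default 8080).
quickstart_command() {
  port="${1:-8080}"
  if [ "$port" = "8080" ]; then
    echo "java -jar $WAR_FILE"
  else
    echo "java -jar $WAR_FILE --server.port=$port"
  fi
}

# Step 5: print the URL to open in the browser for a port (default 8080).
quickstart_url() {
  echo "http://localhost:${1:-8080}"
}

# Dry run: report on the project root, then show what would be run for port 9000.
PROJECT_ROOT="${PROJECT_ROOT:-.}"
if war_present "$PROJECT_ROOT"; then
  echo "Found $WAR_FILE in $PROJECT_ROOT"
else
  echo "Missing $WAR_FILE in $PROJECT_ROOT; download it first (step 3)"
fi
quickstart_command 9000
quickstart_url 9000
```

To start QuickStart for real, run the printed command from your project root directory, then open the printed URL in a browser.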


Results

When installation is complete, the Dashboard page displays the three initial databases and the number of records in each.

  • Staging holds ingested data.
  • Final holds processed data.
  • Jobs holds data about the jobs that are run and tracing data about each processed record.

The STAGING and FINAL databases are prepopulated with default steps and flows.