Concepts Guide — Chapter 2

How is MarkLogic Server Used?

This chapter includes some customer stories. These stories are based on actual customer use cases, though the customer names are fictitious and some of the details of their internal operations have been changed.

MarkLogic Server currently operates in a variety of industries. Though the data stored in and extracted from MarkLogic is different in each type of industry, many customers have similar data-management challenges.

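The topics in this chapter are:

  • Publishing/Media Industry
  • Government / Public Sector
  • Financial Services Industry
  • Healthcare Industry
  • Other Industries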

Publishing/Media Industry

BIG Publishing receives data feeds from publishers, wholesalers, and distributors and sells its information through data feeds, web services, and websites, as well as through other custom solutions. Demand for the vast amount of information housed in the company's database was high, and the company's search solution, which ran on a conventional relational database, was not meeting that demand effectively. The company recognized that a new search solution was necessary to help customers retrieve relevant content from its enormous database.

The database had to handle 600,000 to 1 million updates a day while it was being searched and while new content was being loaded. With the old system, it typically took six to eight days from the time a document arrived until it was available to the company's customers.
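To put the update volume in perspective, the daily figures can be converted to an average rate. This is only a rough sketch: it assumes updates are spread evenly over a 24-hour day, which the story does not state, and real peaks would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

def average_rate(updates_per_day):
    """Average updates per second, assuming an even spread over 24 hours."""
    return updates_per_day / SECONDS_PER_DAY

low = average_rate(600_000)     # lower bound quoted in the story
high = average_rate(1_000_000)  # upper bound quoted in the story

print(f"{low:.1f} to {high:.1f} updates per second")
# prints "6.9 to 11.6 updates per second"
```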

MarkLogic combines full-text search with the W3C-standard XQuery language. The MarkLogic platform can concurrently load, query, manipulate and render content. When content is loaded into MarkLogic, it is automatically converted into XML and indexed, so it is immediately available for search. Employing MarkLogic enabled the company to improve its search capabilities through a combination of XML element query, XML proximity search, and full-text search. MarkLogic's XQuery interface searches the content and the structure of the XML data, making that XML content more easily accessible. It took only about 4 to 5 months for the company to develop the solution and implement it.
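The three kinds of search named above can be illustrated with a small, self-contained sketch. This is not MarkLogic's API or its XQuery interface; it is plain Python over two invented sample documents, and every name in it (`DOCS`, `full_text`, `element_query`, `near`, `search`) is hypothetical. It scans each document on every call, whereas the story describes content being indexed at load time.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample records; element names and text are invented.
DOCS = {
    "book-1.xml": "<book><title>Modern Database Systems</title>"
                  "<summary>A survey of search techniques for large XML collections.</summary></book>",
    "book-2.xml": "<book><title>Cooking for Two</title>"
                  "<summary>Recipes that never mention a database, only search for flavor.</summary></book>",
}

def words(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def full_text(root, term):
    """Full-text search: the term appears anywhere in the document."""
    return term in words(" ".join(root.itertext()))

def element_query(root, tag, term):
    """Element query: the term appears inside a specific element."""
    return any(term in words(" ".join(el.itertext())) for el in root.iter(tag))

def near(root, a, b, distance):
    """Proximity search: a and b occur within `distance` words of each other."""
    toks = words(" ".join(root.itertext()))
    pos_a = [i for i, t in enumerate(toks) if t == a]
    pos_b = [i for i, t in enumerate(toks) if t == b]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)

def search(docs, predicate):
    """Return the sorted names of documents whose parsed XML satisfies the predicate."""
    return sorted(name for name, xml in docs.items() if predicate(ET.fromstring(xml)))

print(search(DOCS, lambda r: full_text(r, "database")))
print(search(DOCS, lambda r: element_query(r, "title", "database")))
print(search(DOCS, lambda r: near(r, "search", "xml", 5)))
```

The full-text query matches both documents, while restricting the same word to the `title` element or requiring two words to appear close together narrows the result to the first document. That narrowing is the benefit the structure-aware queries provide.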

The company discovered that the way MarkLogic stores data makes it easier to change document structure and add new content when desired. With the old relational database and search tools, it was very difficult to add different types of content. Doing so used to require rebuilding the whole database, which took 3 to 4 weeks. With MarkLogic, the company can now restructure documents and drop in new document types very quickly.

Another key benefit is the cost savings the company has realized as a result of the initiative. The company previously needed a full-time employee on staff to manage its old infrastructure. Now, one employee spends one-quarter of his time managing the MarkLogic infrastructure. The company saves on internal infrastructure costs, and its customers get the content more quickly.

Government / Public Sector

The Quakezone County government wants to make it easier for county employees, developers, and residents to access real-time information about zoning changes, county land ordinances, and property history. The county has volumes of data in disparate systems and in different formats, and it needs to provide more efficient access to the data while maintaining the integrity of the record data. It needs a solution that fits within the county IT infrastructure, that can be implemented quickly, and that keeps hardware and licensing costs both low and predictable.

The solution is to migrate all of the existing PDF, Word, and CAD files from the county's legacy systems into MarkLogic, which provides a secure repository for all of the record data, easy-to-use search, and the ability to display search results geospatially on a map.

By having their data centralized in MarkLogic, county clerks can access all of the data they need from one central repository. MarkLogic enables the county to transform and enrich the data, as well as to view and correlate it in multiple ways by multiple applications. Tasks that once took days or weeks to accomplish can now be completed in seconds or minutes. Additionally, Quakezone County can make this information even more accessible to its constituents by deploying a public-facing web portal with powerful search capabilities on top of the same central MarkLogic repository.

Financial Services Industry

TimeTrader Services Inc. provides financial research to customers on a subscription basis. Because every second counts in the fast-paced world of stock trading, the firm needs to deliver new research to its subscribers as quickly as possible to help them make better decisions about their trades.

Unfortunately, these efforts were hampered by the firm's legacy infrastructure. Because of shortcomings in the legacy tool, the firm could not easily respond to new requirements or fully leverage the documents that were being created. Additionally, it could not meet its goals for delivering alerts in a timely fashion.

TimeTrader Services replaced its legacy system with MarkLogic Server. Now the firm can take full advantage of the research information. The solution drastically reduces alert latency and delivers information to the customer's portal and email. In addition, the ability to create triple indexes and perform semantic searches has vastly improved the user experience.

Thanks to the new system, TimeTrader Services delivers timely research to 80,000 users worldwide, improving customer satisfaction and competitive advantage. Because customers are alerted to the availability of critical new research more quickly, financial traders gain a definite edge in the office and on the trading floor.

Healthcare Industry

HealthSmart is a Health Information Exchange (HIE) that is looking into using new technologies as a differentiating factor for success. They seek a technology advantage to solve issues around managing and gaining maximum use of a large volume of complex, varied, and constantly changing data. The number of requests for patient data and the sheer volume of that data are growing exponentially, in communities large and small, whether serving an integrated delivery network (IDN), hospital, or a large physician practice.

These challenges include aggregating diverse information types, finding specific information in a large dataset, complying with changing formats and standards, adding new sources, and maintaining high performance and security, all while keeping costs under control.

HIE solutions that solve big data challenges must meet the strict requirements of hospitals, IDNs and communities to lead to an effective and successful exchange. To develop a successful HIE, communities need to embrace technologies that help with key requirements around several important characteristics:

  • Performance: The system should be able to provide real-time results. As a result, doctors can get critical test results without delay.
  • Scalability: As data volumes grow, the system should be able to scale quickly on commodity hardware with no loss in performance. Hospitals can then easily accommodate data growth in systems critical to patient care.
  • Services: An effective exchange should have the option of rich services such as search, reporting, and analytics. For example, doctors can be notified if a new flu trend has developed in the past week in a certain geographic location.
  • Systems: both of which will impact the quality, risks and costs of care.
  • Interoperability: It should be easy to integrate different systems from other members of the community into the exchange through a common application programming interface (API). Community members can leverage the exchange sooner to share data and improve patient care.
  • Security: Only authenticated and authorized users will be allowed to view private data. Community members want to ensure patient privacy and also comply with regulations such as HIPAA.
  • Time to delivery: Implementation should be measured in weeks, not months or years, and overhead should remain low. Healthcare can save millions in maintenance and hardware with low-overhead, next-generation technology.
  • Total cost of ownership: The system must make economic sense. It should help to cut healthcare costs, not increase them.

By leveraging MarkLogic, HealthSmart gained a significant performance boost that reduced queries and transformations to sub-second response times, which was critical for accomplishing their mission. In addition, MarkLogic's flexible data model enabled integration of new sources of data in a matter of days instead of weeks or months.

MarkLogic can efficiently manage billions of documents in the hundreds-of-terabytes range. It offers high-speed indexes and optimizations on modern hardware to deliver sub-second responses to users. HealthSmart leverages the performance and scalability benefits of MarkLogic not only to deliver information to users quickly, but also to grow the deployment as the load grows.

Consequently, HealthSmart provides the wide range of functionality required in today's information-heavy healthcare environments. Features such as full-text search, granular access to information, dynamic transformation capabilities, and a web services framework all lower the development overhead typically required with other technologies. This makes HealthSmart a feature-rich HIE solution that does not make the tradeoffs that other solutions make.

HealthSmart is particularly advantageous with regard to interoperability. Since MarkLogic is based on XML, and XML is widely used in healthcare systems, the technology fit is ideal. The ability to load information as-is dramatically lowers the barrier to adding new systems in an HIE community. This, in part, enabled HealthSmart to build the patient registry component in a mere two months, far faster than any other vendor. HealthSmart's dynamic transformation capabilities facilitate compliance with transport standards and regulatory legacy interfaces. In addition, MarkLogic's services-oriented architecture (SOA) and its support for building REST endpoints enable an easy and standardized way to access information.

As an enterprise-class database, MarkLogic supports the security controls needed to keep sensitive information private. MarkLogic is used in top secret installations in the Federal Government, and provides access controls to ensure classified data is only accessible to authorized personnel.

Finally, HealthSmart and MarkLogic help to significantly lower time to delivery and the total cost of ownership. The lower overhead in adding new community members leads directly to quick adoption and cost savings. MarkLogic's optimization on modern, commodity hardware enables exchanges to benefit from lower-cost hardware systems. High performance enables systems with fewer hardware servers, and scalability allows growth by simply adding more commodity servers, rather than replacing existing servers with larger, high-cost servers.

Other Industries

Other industries benefiting from deployments of MarkLogic Server include:

  • Legal -- Laws, regional codes, public records, case files, and so on.
  • Government Intelligence -- Identify patterns and discover connections from massive amounts of heterogeneous data.
  • Airlines -- Flight manuals, service records, customer profiles, and so on.
  • Insurance -- Claims data, actuary data, regulatory data, and so on.
  • Education -- Student records, test assembly, online instructional material, and so on.

For more information on customers and their uses of MarkLogic, see http://www.marklogic.com/solutions/.
