Spring XD

Spring XD is a unified, distributed, and extensible system for data ingestion, real-time analytics, batch processing, and data export. The project's goal is to simplify the development of big data applications.

Benefits

Unified Platform

Spring XD is a unified platform for a fragmented Hadoop ecosystem. It’s built on top of battle-tested open source projects, and dramatically simplifies orchestration of Big Data workloads and data pipelines.

Open and Extensible

Spring XD is built from the ground up to adapt to your enterprise’s unique needs, not to dictate your technology choices. Extend it in any direction through open plug-in points for your existing technology investments, implemented as simple Java classes.

Developer Productivity

Developers new to Big Data can use a no-coding, configuration-driven tool to develop Spring XD applications. Java developers can also easily extend the platform or the DSL with familiar extensibility, testing, and automation tools inherited directly from Spring Batch and Spring Integration.

The unified platform for big data

Features

Data from anywhere, to anywhere

Data-driven apps require refined and consolidated data at scale. Spring XD’s stream and batch workflows let you build pipelines that consume data from various endpoints and consolidate it in Hadoop, in-memory data grids such as Redis or GemFire, and virtually any other data store.

Rock-solid distributed runtime

Spring XD provides PMML model scoring to compute predictions in real time. Apache Spark Streaming is available as an out-of-the-box processor module in Spring XD, and can be plugged in to perform online machine learning with the help of MLlib algorithms.

Deep Analytics

Developers new to Big Data can use a no-coding, configuration-driven tool to develop Spring XD applications. Java developers can also easily extend the platform or the DSL with familiar extensibility, testing, and automation tools inherited directly from Spring Batch and Spring Integration.

Developer Friendly

It’s easy to integrate data with Hadoop and other data stores such as Greenplum Database, HAWQ, or GemFire. No coding is required to use the DSL (Domain Specific Language), and you can interact with the server via REST from any programming language.
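
Because the server is driven over REST, a stream could in principle be created from any HTTP client. The sketch below only builds such a request in Python and does not send it. It assumes an admin server at http://localhost:9393 that accepts form-encoded name, definition, and deploy fields at /streams/definitions; those details, and the helper name build_create_stream_request, are assumptions for illustration, so check the REST API guide for your release before relying on them.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed admin server address; adjust for your installation.
ADMIN_URL = "http://localhost:9393"


def build_create_stream_request(name, definition, deploy=True, base_url=ADMIN_URL):
    """Build (without sending) a POST request that would create a stream.

    The endpoint path and field names are assumptions, not a verified API.
    """
    form = urlencode({
        "name": name,
        "definition": definition,
        "deploy": "true" if deploy else "false",
    }).encode("utf-8")
    return Request(
        url=base_url + "/streams/definitions",
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_create_stream_request("ticktock", "time | log")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would send it to a running server; the sketch stops short of that so it can be inspected without one.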

Monitoring and Management

Remote monitoring and management of the runtime components are supported via JMX endpoints. A built-in Admin UI allows visualization and remote management of containers in the distributed setup.

Portable and Extensible Runtime

Spring XD runs anywhere Java does - on-prem, Pivotal Cloud Foundry, YARN, EC2, Mesos, Docker, etc. A plug-in based architecture allows Java/Hadoop experts to extend the runtime components, allowing DSL (Domain Specific Language) users to leverage the extensions immediately.

Use-Cases

Closed-loop Analytics

Spring XD orchestrates the entire analytics loop - gathering data from any source, triggering actions, handling feedback loops from machine learning models, and computing real-time predictions.

Internet of Things

Enable real-time predictive analytics over large amounts of machine data to drive business and operational improvements. Spring XD’s data-integration adapters connect with various data-producing devices, and can be extended to support any unique device or protocol.

Batch Workflow Orchestration

Traditional enterprise “Big Data” was often done with batch processing. Get productive by using out-of-the-box jobs as templates, avoiding the need to write code. The infrastructure, environment specifics, and automation are handled by Spring XD, allowing the enterprise to focus solely on business logic.

Complex Event Processing

Spring XD provides integration with Project Reactor Streams, RxJava Observables, and Spark Streaming. Creating a data stream processor in XD allows you to use a functional programming model to filter, transform, and aggregate data in a concise and performant way. By working with events as you would with collections, Spring XD’s reactive-stream integration allows you to build complex event processors that respond to events in real time.
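
To make the "events as collections" idea concrete, here is a minimal sketch of a filter, transform, and aggregate pipeline written with plain Python generators. It does not use the Reactor, RxJava, or Spark Streaming APIs; the event shape and the names windows and average_temperature are hypothetical and chosen only for illustration.

```python
from itertools import islice


def windows(events, size):
    """Group an iterable into consecutive lists of at most `size` items."""
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def average_temperature(events, window_size):
    """Filter, transform, and aggregate a stream of event dicts."""
    # Filter: keep only temperature events.
    readings = (e for e in events if e.get("type") == "temperature")
    # Transform: extract the numeric value from each event.
    values = (float(e["value"]) for e in readings)
    # Aggregate: average each fixed-size window of values.
    for batch in windows(values, window_size):
        yield sum(batch) / len(batch)


sample_events = [
    {"type": "temperature", "value": "20.0"},
    {"type": "humidity", "value": "55"},
    {"type": "temperature", "value": "22.0"},
    {"type": "temperature", "value": "30.0"},
]
print(list(average_temperature(sample_events, window_size=2)))
```

Each stage is lazy, so events are processed one at a time as they arrive rather than being collected first, which is the property that lets the same style scale from in-memory lists to unbounded streams.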

Quick Start

Requirements

  • To get started, make sure Java JDK 7 or newer is installed on your system.

Manual Installation

  • Download spring-xd-1.1.2.RELEASE.zip
  • Unzip the distribution. This yields the installation directory spring-xd-1.1.2.RELEASE. All the commands below are executed from this directory, so change into it before proceeding:
$ cd spring-xd-1.1.2.RELEASE
  • Set the environment variable XD_HOME to the installation directory <root-install-dir>\spring-xd\xd

Create a stream

  • Start up the runtime. The single-node option is the easiest way to get started, because it runs everything you need in a single process. To start it, change to the xd/bin directory and run:
xd/bin>$ ./xd-singlenode
  • Start the XD shell:
$ ./xd-shell
  • Create your first stream by typing:
xd:> stream create --definition "time | log" --name ticktock --deploy
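
The definition "time | log" reads like a Unix pipeline: the first module (time) acts as the source and the last (log) as the sink, with any modules in between acting as processors. The sketch below illustrates that structure only. It is not Spring XD's parser: the function parse_stream_definition is hypothetical, and it naively splits on every pipe character, so it would mishandle a pipe inside a quoted module option.

```python
def parse_stream_definition(definition):
    """Split a pipe-delimited stream definition into source, processors, and sink.

    Illustrative only: this is a naive split, not Spring XD's DSL parser.
    """
    modules = [part.strip() for part in definition.split("|")]
    if len(modules) < 2 or any(not module for module in modules):
        raise ValueError("expected at least 'source | sink' with no empty modules")
    return {
        "source": modules[0],
        "processors": modules[1:-1],
        "sink": modules[-1],
    }


print(parse_stream_definition("time | log"))
print(parse_stream_definition("http | filter | transform | hdfs"))
```

The second call shows how a longer definition would break down, with the middle modules treated as processors between the source and the sink.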