Spring XD

Spring XD is a unified, distributed, and extensible system for data ingestion, real-time analytics, batch processing, and data export. The project's goal is to simplify the development of big data applications.

Introduction

Big data applications share many characteristics with Enterprise Integration and Batch applications. Spring has provided proven solutions for building integration and batch applications for more than 7 years now via the Spring Integration and Spring Batch projects. Spring XD builds upon this foundation and provides a scalable, fault-tolerant runtime environment that is easily configured and assembled via a simple DSL.

The Spring ecosystem of projects provides an excellent foundation for building big data applications. The Spring XD project aims to build upon this foundation and provide a one-stop-shop solution for these use cases. This is in contrast to many other offerings that are more siloed and fragmented. One of our first goals is to create an out-of-the-box server that provides a consistent configuration model and runtime spanning the four use-case categories listed above: data ingestion, real-time analytics, batch processing, and data export.

You don't need to write any code to get going: no build scripts, no IDE, no Maven coordinates. A high-level configuration DSL lets you get started quickly. However, if you choose to extend the platform (and we hope you will), Spring provides the foundation for extensibility.

For the curious, XD is an abbreviation for eXtreme Data.

Features

  • Unified platform - Stream Processing and Batch Jobs
    • Off-Hadoop Batch Jobs
    • Hadoop Batch workflow orchestration
    • NoSQL Analytics
    • Scoring of Machine Learning algorithms
  • Runtime that provides critical non-functional requirements
    • Scalable, distributed, and fault-tolerant
    • Portable: runs on an on-premises DIY cluster, YARN, or EC2
  • Easy to use
  • Easy to extend and integrate other technologies

Quick Start

Requirements

  • To get started, make sure your system has JDK 6 or newer installed. JDK 7 is recommended.
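
One way to check the requirement above is to look for a java executable on the PATH and print its version banner. This is only a minimal sketch: the variable name JAVA_STATUS is made up for illustration, and the snippet does not parse the version number, so you still need to read the output and confirm it is JDK 6 or newer.

```shell
# Check whether a Java runtime is reachable from the current shell.
# JAVA_STATUS is a hypothetical helper variable used only in this sketch.
if command -v java >/dev/null 2>&1; then
  JAVA_STATUS="found"
  # java prints its version banner to stderr, so redirect it to stdout.
  java -version 2>&1 | head -n 1
else
  JAVA_STATUS="missing"
  echo "java not found on PATH; install JDK 6 or newer" >&2
fi
```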

Manual Installation

  • Download spring-xd-1.0.1.RELEASE.zip
  • Unzip the distribution. This yields the installation directory spring-xd-1.0.1.RELEASE. All the commands below are executed from this directory, so change into it before proceeding:
$ cd spring-xd-1.0.1.RELEASE
  • Set the environment variable XD_HOME to the installation directory <root-install-dir>\spring-xd\xd
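
On a Unix-like shell, the last step might look like the sketch below. It assumes that XD_HOME should point at the xd subdirectory of the unzipped distribution and that the archive was unpacked in the current directory; XD_INSTALL_DIR is a hypothetical helper variable, not something Spring XD itself reads.

```shell
# Hypothetical location of the unzipped distribution; override it by
# setting XD_INSTALL_DIR beforehand if you unpacked the archive elsewhere.
XD_INSTALL_DIR="${XD_INSTALL_DIR:-$PWD/spring-xd-1.0.1.RELEASE}"

# Assumption: XD_HOME is the xd subdirectory of the installation.
export XD_HOME="$XD_INSTALL_DIR/xd"

# Warn early if the directory is not where we expect it to be.
if [ -d "$XD_HOME" ]; then
  echo "XD_HOME set to $XD_HOME"
else
  echo "warning: $XD_HOME does not exist; check where the zip was unpacked" >&2
fi
```

To make the setting persistent, the export line can be added to your shell profile with the path written out in full.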

OS X Homebrew Installation

If you are on a Mac and using Homebrew, all you need to do to install Spring XD is:

$ brew tap pivotal/tap
$ brew install springxd

Homebrew will install springxd to /usr/local/bin. Now you can jump straight into using Spring XD:

$ xd-singlenode

Create a stream

  • Start the runtime: The single-node option is the easiest way to get started, because it runs everything you need in a single process. To start it, cd to the xd/bin directory and run the following command: xd/bin>$ ./xd-singlenode
  • Start the XD shell: ./xd-shell
  • Create your first stream by typing: xd:> stream create --definition "time | log" --name ticktock --deploy