Spring for Apache Hadoop

Spring for Apache Hadoop 2.5.0.RELEASE

NOTICE: The Spring for Apache Hadoop project reached End-Of-Life status on April 5th, 2019. The current Spring for Apache Hadoop 2.5.0 release is built using Apache Hadoop version 2.7.3 and should be compatible with the latest releases of the most popular Hadoop distributions.

Introduction

Spring for Apache Hadoop simplifies developing Apache Hadoop applications by providing a unified configuration model and easy-to-use APIs for HDFS, MapReduce, Pig, and Hive. It also integrates with other Spring ecosystem projects, such as Spring Integration and Spring Batch, enabling you to develop solutions for big data ingest/export and Hadoop workflow orchestration.

Check out the O’Reilly Media book Spring Data: Modern Data Access for Enterprise Java, which contains several chapters on using Spring for Apache Hadoop. Sample code for the book is available in the GitHub project spring-data-book.

Features

Support for creating Hadoop applications that are configured using Dependency Injection and run as standard Java applications rather than through Hadoop command-line utilities.
Integration with Spring Boot to simplify creating Spring apps that connect to HDFS to read and write data.
Create and configure applications that use Java MapReduce, Streaming, Hive, Pig, or HBase.
Extensions to Spring Batch to support creating Hadoop-based workflows for any type of Hadoop Job or HDFS operation.
Script HDFS operations using any JVM-based scripting language.
Easily create custom Spring Boot-based applications that can be deployed to execute on YARN.
DAO support (Template & Callbacks) for HBase.
Support for Hadoop Security.
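
As a rough illustration of the dependency-injection style of configuration named in the first feature, the sketch below declares a Hadoop configuration bean through the project's hdp XML namespace. The file system URI (hdfs://localhost:8020) is a placeholder assumption, not a value taken from this page; replace it with the address of your own cluster.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:hdp="http://www.springframework.org/schema/hadoop"
       xsi:schemaLocation="
         http://www.springframework.org/schema/beans
         http://www.springframework.org/schema/beans/spring-beans.xsd
         http://www.springframework.org/schema/hadoop
         http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">

    <!-- Hadoop settings are written as key=value pairs.
         The URI below is an assumed placeholder for a local NameNode. -->
    <hdp:configuration>
        fs.defaultFS=hdfs://localhost:8020
    </hdp:configuration>

</beans>
```

Other beans in the same application context can then receive this configuration by injection, so the settings live in one place instead of being passed to Hadoop command-line utilities.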

Versions and Distribution Support

Spring for Apache Hadoop supports a number of Apache releases as well as commercial distributions from Pivotal, Hortonworks and Cloudera.

The supported distributions vary by release version; see the wiki page for details.

Also, see the wiki page for Maven build details.

The continuous integration builds for most supported versions can be seen on the build page.

Spring Boot Config

<dependencies>
    <dependency>
        <groupId>org.springframework.data</groupId>
        <artifactId>spring-data-hadoop</artifactId>
        <version>2.5.0.RELEASE</version>
    </dependency>
</dependencies>

Quickstart Your Project

Bootstrap your application with Spring Initializr.
