Simple ETL with Spring Batch

Spring Batch is, as the name says, a batch processing framework built on the Spring Framework. I usually use it to develop simple ETL (Extraction, Transformation and Loading) programs.

In this post, I’ll show you how to write a simple ETL program. (This sample was tested on Spring Batch 3.0.10.)

Prerequisites

  • Database (MySQL or Oracle)
  • Spring Batch context database
  • Spring Batch libraries (spring-batch-core, Spring Framework, spring-jdbc)
  • Database JDBC driver
  • DBCP (optional)

Spring Batch context database

The Spring Batch context database must be created before a Spring Batch job can run. The table creation DDL is shipped inside spring-batch-core-version.jar: the org.springframework.batch.core package contains DDL scripts for several databases (for example, schema-mysql.sql).

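It contains the following tables (plus the sequences used to generate their IDs).

  • BATCH_JOB_INSTANCE
  • BATCH_JOB_EXECUTION
  • BATCH_JOB_EXECUTION_PARAMS
  • BATCH_JOB_EXECUTION_CONTEXT
  • BATCH_STEP_EXECUTION
  • BATCH_STEP_EXECUTION_CONTEXT
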

Test scenario

In this post, I use two tables: TB_SOURCE and TB_TARGET. The sample program reads all rows from TB_SOURCE and writes them into TB_TARGET.

Writing base Spring context

To run a Spring Batch program, the following beans need to be declared.

  • Spring Batch Context DataSource
  • DataSource for source and target
  • TransactionManager for Spring Batch Context
  • TransactionManager for source and target
  • Job Repository bean
  • Job Launcher (which is the starting point of a job)

The following is a snippet of context.xml.

Writing Job flow

Now we are ready to write the ETL program. A job is also declared in the Spring context.xml. The basic structure of a Job is as follows.


  • A Job can have several Steps (for simplicity, I use one Step for this sample)
  • Each Step has a Reader, a Processor and a Writer
  • The Reader reads data from a database, a file or some other data store
  • The Reader invokes a RowMapper to convert raw data into a Source VO
  • The Processor reads a Source VO and converts it into a Target VO
  • The Writer writes the Target VO into a database, a file or some other data store
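
The flow in the list above can be pictured with a small plain-Java sketch. It is only a conceptual stand-in and not how Spring Batch is implemented: StepSketch, its fields and its methods are hypothetical names, the reader is modelled as an Iterator that already yields Source VOs, and the writer's data store is modelled as an in-memory list.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Conceptual stand-in for a single Step. This is NOT Spring Batch's implementation;
// StepSketch and its members are hypothetical names used only for illustration.
class StepSketch<S, T> {
    private final Iterator<S> reader;        // Reader + RowMapper: yields Source VOs
    private final Function<S, T> processor;  // Processor: Source VO -> Target VO
    private final List<T> written = new ArrayList<>(); // stands in for the Writer's data store

    StepSketch(Iterator<S> reader, Function<S, T> processor) {
        this.reader = reader;
        this.processor = processor;
    }

    // Pull every item from the reader, convert it, and hand it to the writer.
    void run() {
        while (reader.hasNext()) {
            S source = reader.next();
            T target = processor.apply(source);
            written.add(target);
        }
    }

    // Everything the "writer" has received so far.
    List<T> written() {
        return written;
    }
}
```

The real framework additionally groups items into chunks and commits each chunk in a transaction, which this sketch leaves out.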

The following is a sample ETL job definition.

As the above sample shows, if the Reader’s source or the Writer’s target is a database, SQL can be used directly. (This is one of the strengths of Spring Batch.)

Writing RowMapper

A RowMapper is a component that converts raw source data into a Source VO. It must implement org.springframework.jdbc.core.RowMapper. The sample RowMapper is as follows.
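
A minimal sketch of such a RowMapper, under a few assumptions: TB_SOURCE is taken to have two hypothetical columns, SRC_ID and SRC_NAME, and SourceVO is a stand-in bean with matching properties. To keep the sketch compilable without Spring on the classpath, the implements clause is shown only as a comment; the method has the same signature as RowMapper.mapRow.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

// In the real project this class would declare
//   implements org.springframework.jdbc.core.RowMapper<SourceVO>
// The clause is left out here so the sketch compiles without Spring.
class SourceRowMapper {
    // Same signature as RowMapper.mapRow: called once for each row of the result set.
    public SourceVO mapRow(ResultSet rs, int rowNum) throws SQLException {
        SourceVO source = new SourceVO();
        source.setSrcId(rs.getLong("SRC_ID"));       // SRC_ID is an assumed column name
        source.setSrcName(rs.getString("SRC_NAME")); // SRC_NAME is an assumed column name
        return source;
    }
}

// Hypothetical Source VO, included so the sketch is self-contained.
class SourceVO {
    private long srcId;
    private String srcName;

    public long getSrcId() { return srcId; }
    public void setSrcId(long srcId) { this.srcId = srcId; }
    public String getSrcName() { return srcName; }
    public void setSrcName(String srcName) { this.srcName = srcName; }
}
```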

Writing Processor

The Processor is the core of a Spring Batch job. It can transform source data, validate it or execute any additional logic. In this sample, it simply maps each source column to a target column with a different name. The sample Processor is as follows.
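
A minimal sketch of such a Processor, assuming hypothetical SourceVO and TargetVO beans whose properties (srcId/srcName and tgtId/tgtName) are made up for illustration. As with any standalone sketch, the implements clause is shown only as a comment so the code compiles without Spring Batch; the method has the same signature as ItemProcessor.process.

```java
// In the real project this class would declare
//   implements org.springframework.batch.item.ItemProcessor<SourceVO, TargetVO>
// The clause is left out here so the sketch compiles without Spring Batch.
class SourceToTargetProcessor {
    // Same signature as ItemProcessor.process: one Source VO in, one Target VO out.
    public TargetVO process(SourceVO source) throws Exception {
        TargetVO target = new TargetVO();
        target.setTgtId(source.getSrcId());     // copy the value under the target's property name
        target.setTgtName(source.getSrcName());
        return target;
    }
}

// Hypothetical VOs, included so the sketch is self-contained.
class SourceVO {
    private long srcId;
    private String srcName;

    public long getSrcId() { return srcId; }
    public void setSrcId(long srcId) { this.srcId = srcId; }
    public String getSrcName() { return srcName; }
    public void setSrcName(String srcName) { this.srcName = srcName; }
}

class TargetVO {
    private long tgtId;
    private String tgtName;

    public long getTgtId() { return tgtId; }
    public void setTgtId(long tgtId) { this.tgtId = tgtId; }
    public String getTgtName() { return tgtName; }
    public void setTgtName(String tgtName) { this.tgtName = tgtName; }
}
```

In Spring Batch, returning null from process tells the framework to skip that item, which is one way to add validation logic here.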

Writing Source VO
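
A Source VO is a plain Java bean with one property for each column read from TB_SOURCE. The sketch below assumes two hypothetical columns, SRC_ID and SRC_NAME; the class and property names are illustrative.

```java
// Hypothetical Source VO: one instance for each TB_SOURCE row.
// The properties mirror the assumed columns SRC_ID and SRC_NAME.
class SourceVO {
    private long srcId;
    private String srcName;

    public long getSrcId() { return srcId; }
    public void setSrcId(long srcId) { this.srcId = srcId; }

    public String getSrcName() { return srcName; }
    public void setSrcName(String srcName) { this.srcName = srcName; }
}
```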

Writing Target VO

The Target VO must have getter methods in this sample.
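
A minimal sketch of a Target VO, assuming TB_TARGET has two hypothetical columns, TGT_ID and TGT_NAME. Getters matter when the writer fills its SQL parameters from bean properties (for example through Spring Batch's BeanPropertyItemSqlParameterSourceProvider); that this sample's writer is configured that way is an assumption.

```java
// Hypothetical Target VO: one instance for each row written to TB_TARGET.
// The properties mirror the assumed columns TGT_ID and TGT_NAME.
class TargetVO {
    private long tgtId;
    private String tgtName;

    // Getters let a bean-property-based writer read the values to insert.
    public long getTgtId() { return tgtId; }
    public String getTgtName() { return tgtName; }

    // Setters are used by the Processor when it builds the Target VO.
    public void setTgtId(long tgtId) { this.tgtId = tgtId; }
    public void setTgtName(String tgtName) { this.tgtName = tgtName; }
}
```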


There are no more components to write, except a program that invokes this sample Job.

You can download the full sources from GitHub.





