Spring Batch – Imperfect Yet Worthwhile

January 23rd, 2012 Jeff Zapotoczny, Consultant

A lot of folks have used the Spring framework to build applications – and what’s not to love? It’s allowed us to solve enterprise problems with a minimum of tedium. And the framework has evolved to address criticisms: for example, lengthy XML configuration in application contexts has steadily given way to terser syntax and support for code annotations.

The framework as packaged includes facilities to help with many common aspects of contemporary application development, from a Model-View-Controller (MVC) framework to integration with persistence managers and elegant transaction management. But it doesn’t necessarily cover every need. One example of a gap is batch processing.

The idea behind batch processing is to allow a program to run without the need for human intervention – this may involve solving a problem that takes a lengthy amount of time, processing a large input set, or just automating some business process that doesn’t necessarily need to be triggered by a human. The batch paradigm was far more dominant in years past than it is in contemporary business computing, but that doesn’t mean valid batch use cases don’t continue to crop up. The operating system you’re using right now likely uses a batch approach to do all sorts of things for you without bugging you about it. Enjoying those automatic backups?

So what if you’ve got a problem that calls for a batch style solution but you’d still like to use Spring? On a recent project, we realized that much of the business logic we’d just finished writing within a web service framed with Spring could be reused for an upcoming batch component (which involved processing a potentially very large XML document that would be uploaded by a business partner). We briefly considered rolling a separate application that reused some of the same code, but before we went down that road someone mentioned hearing of “Spring Batch” – something that was not a part of Spring proper, but an auxiliary framework meant to complement it.

As you can read on its home page,

Spring Batch builds upon the productivity, POJO-based development approach, and general ease of use capabilities people have come to know from the Spring Framework, while making it easy for developers to access and leverage more advanced enterprise services when necessary.

Sounds great! So I dove in and started learning more about it. As I read the documentation, however, I have to admit I got a little overwhelmed by the complexity and amount of configuration needed for even a simple example. Unlike your average Spring application context, the context of a Spring Batch application grows pretty quickly and involves configuring a lot of stuff that, at the outset, it just doesn’t seem like you should need to configure. A “job repository” to track the status and history of job executions, which itself requires a data source – just to get started? Wow, that’s a bit heavy-handed. Worried that I might be misunderstanding what I’d found, I did some searching and found that I wasn’t alone in thinking the API left something to be desired.
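
To make the “job repository” idea concrete, here is a minimal sketch in plain Java of the kind of bookkeeping such a component does. It is not the Spring Batch API: `JobHistorySketch`, `Execution`, `start`, and `executionsOf` are hypothetical names invented for illustration, and an in-memory map stands in for the database. The one assumption borrowed from Spring Batch is that a job instance is identified by the job name plus its parameters.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical plain-Java model of a job repository; NOT the Spring Batch API.
class JobHistorySketch {

    enum Status { STARTED, COMPLETED, FAILED }

    // One attempt to run a job instance.
    static final class Execution {
        final String instanceKey;
        Status status = Status.STARTED;

        Execution(String instanceKey) {
            this.instanceKey = instanceKey;
        }
    }

    // Instance key -> every execution recorded for it, oldest first.
    // A real repository would persist this in a database.
    private final Map<String, List<Execution>> history = new HashMap<>();

    // Assumption: a job instance is the job name plus its parameters.
    // TreeMap sorts the parameters so the key does not depend on insertion order.
    static String instanceKey(String jobName, Map<String, String> params) {
        return jobName + new TreeMap<>(params);
    }

    // Records a new execution, refusing to rerun an instance that already completed.
    Execution start(String jobName, Map<String, String> params) {
        String key = instanceKey(jobName, params);
        List<Execution> runs = history.computeIfAbsent(key, k -> new ArrayList<>());
        if (!runs.isEmpty() && runs.get(runs.size() - 1).status == Status.COMPLETED) {
            throw new IllegalStateException("Instance already completed: " + key);
        }
        Execution execution = new Execution(key);
        runs.add(execution);
        return execution;
    }

    // Returns the recorded history for one job instance (empty if it never ran).
    List<Execution> executionsOf(String jobName, Map<String, String> params) {
        List<Execution> runs = history.get(instanceKey(jobName, params));
        return runs == null ? Collections.<Execution>emptyList()
                            : Collections.unmodifiableList(runs);
    }

    public static void main(String[] args) {
        JobHistorySketch repository = new JobHistorySketch();
        Map<String, String> params = Collections.singletonMap("file", "partner.xml");

        Execution first = repository.start("importJob", params);
        first.status = Status.FAILED;      // pretend the first run failed

        Execution retry = repository.start("importJob", params); // restart is allowed
        retry.status = Status.COMPLETED;

        for (Execution e : repository.executionsOf("importJob", params)) {
            System.out.println(e.instanceKey + " -> " + e.status);
        }
    }
}
```

Running `main` prints the two recorded attempts for the same instance, one `FAILED` and one `COMPLETED`. A third call to `start` with the same name and parameters would throw, which is the sort of guard that only works if the history of earlier runs has been kept somewhere.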

But I’m not going to get as down on Spring Batch as the author of the article I just linked to. The reason being – once I determined what we had to configure in order to get a functional batch job running, I discussed the components involved with my team members and our client and we realized we would appreciate having all of those things. Logged history of batch runs in a database would be good, especially if there were ever a problem or failure. Then our client asked – will we have a UI to control it if we need to?

Enter Spring Batch Admin. It’s unfortunate that the Admin capability is distributed separately from Spring Batch itself – because unless you hear about it or search with the right query, you might wind up rolling your own administrative UI unnecessarily. Of course you can write your own if you’d like – especially if you need to fit in with some pre-existing administrative app you may already have in place – but chances are you have better things to do and batch processing is just one portion of your project.

I’ve got a very important time-saving tip if you think you’d like to use the Admin UI, though: decide whether you need it before you start developing your batch capability. I made the mistake of thinking it would be easier to get our batch capability working headlessly first, then bolt the Admin app onto the project, in an attempt to reduce complexity and break things down. Instead I wound up having to redo some of the POJOs and configuration involved in kicking off jobs to make what I’d done compatible with the Admin framework. In fact, starting with the Admin framework, getting it up and running with its ability to run abstract jobs, and then customizing the jobs to meet your needs would be the path of least resistance.

Once I had everything together, though, what a sweet capability this added up to! Using Spring Batch plus Spring Batch Admin gave my team a relatively low-overhead solution that allowed us to easily hand batch testing duties over to our client – all they had to know how to do was access a web page and click a button. Plus, we’ve given them a framework and a concrete example that is now easily repeatable – given a few POJOs and some copy/paste/modify of the application context XML, they could easily add new batch job types in the future.

Entry Filed under: Agile and Development

9 Comments

  • 1. Paco  |  January 25th, 2012 at 7:49 pm

    And finally, do you recommend other alternatives for lightweight batch processing?

  • 2. Sam  |  January 26th, 2012 at 12:13 am

    I’ve been using Spring Batch for some time now & I find it pretty awesome. The framework’s got all the right components – especially when it comes to building scalable batch jobs.

    It’s (the only?) industry-standard batch framework available that I know of!

    Quote
    “I wasn’t alone in thinking the API left something to be desired.”

    That article is about three years old, and Spring Batch has come a long way since. There have been loads of improvements between version 1.xx and version 2.xx.

    P.S. – You can also have an in-memory repository, so the point about requiring a database is moot. But like you said, it’s definitely good to have it persisted in a database!

  • 3. Jeff Zapotoczny  |  January 26th, 2012 at 10:41 am

    Like Sam’s already said, Spring Batch seems to be the only game in town when it comes to free batch processing frameworks.

    You’re right, Sam, in that Batch seems to have improved pretty drastically. My point was just that there’s a higher startup cost to get a custom batch job running than is typical for most other Spring endeavors I’ve tried. In the end, though, it’s all good.

  • 4. sactel  |  February 2nd, 2012 at 6:09 am

    but talking about real practical scenarios like

    *handling server crash (where the framework fails to store the state in job repos), how can this be handled; any concrete information that can be shared ?

    *for job running say every 30 min (scheduled by external scheduler and batch started using commandjobrunner), what logic can be used to restart the failed job (automatically) . keeping in view job instance= job + job parameter; any sample for this

    thanks in advance

  • 5. Jeff Zapotoczny  |  February 2nd, 2012 at 2:17 pm

    I don’t have convenient examples for either case. But in both cases I’d be thinking about approaching it by having a scheduled task check the job repository to see if the desired job was run successfully and, if not, restart it.

  • 6. sactel  |  February 3rd, 2012 at 3:04 am

    this is precisely my point – when it comes to practical scenarios like these there is not enough help or information out there (documentation / samples).

    that’s why i feel in enterprise where batch is critical – batch platforms like datagrid, tivoli wlm etc would be a better strategy? just an opinion from enterprise architecture perspective, kindly note, though, i myself am using spring batch framework

  • 7. Tariq Ahsan  |  June 22nd, 2012 at 9:16 am

    This is related to Spring Batch, but it is a completely separate topic.
    Our development group is proposing to use Spring Batch for migrating existing legacy batch processes. But the client is requesting a list of reputable companies or organizations who are already using Spring Batch.
    Is it possible to get this sort of information?

    Thanks

  • 8. Eesan  |  March 8th, 2014 at 3:50 am

    I see that when people say “Spring Batch” is the only available batching framework, it clearly means they are just promoting it. what about Quartz ? anybody heard of it ??

  • 9. Bryant  |  June 9th, 2014 at 12:20 pm

    Um Eesan, Quartz is just a scheduling framework; all it can do is schedule a Spring Batch job. Quartz is NOT a batch application framework.

    Currently Spring Batch is the only batching framework for Java that I found worthwhile to learn, because the Java EE7 batch framework API is just an API spec and doesn’t have any implementation for JDBC cursor readers or paging JDBC readers or XML streaming writers, etc. Sure, the Oracle evangelists would love to tell you that you can write it yourself. Sure, you can write anything yourself; it just wastes time, and the project would get done much sooner if you don’t roll your own crappy implementation that now needs to be supported.


© 2010-2014 Summa All Rights Reserved