Issue
Our Spring web application uses Spring Batch with Quartz to carry out complex jobs. Most of these jobs run in the scope of a transaction because, if one part of the complex system fails, we want any previous database work to be rolled back. We would then investigate the problem, deploy a fix, and restart the servers.
It's getting to be an issue because some of these jobs do a HUGE amount of processing and can take a long time to run. As execution time starts to surpass the one-hour mark, we find ourselves unable to deploy fixes to production for other problems because we don't want to interrupt a vital job.
I have been reading up on the Reactor implementation as a solution to our problems. We can do a small bit of processing, publish an event, and have other systems do the appropriate action as needed. Sweet!
The only question I have is, what is the best way to handle failure? If I publish an event and a Consumer fails to conduct some critical functionality, will it restart at a later time?
What if an event is published, and before all the appropriate consumers that listen for it can handle it appropriately, the server shuts down for a deployment?
Solution
I only started using Reactor recently, so I may have some misconceptions about it, but I'll try to answer.
Reactor is a library that helps you write non-blocking code with back-pressure support, which can help you scale your application without consuming a lot of resources.
Reactor's fluent style could easily replace Spring Batch. However, Reactor by itself doesn't provide any way to handle transactions, and neither does Spring for this kind of pipeline. With the current JDBC implementation the database calls will always be blocking, since there is no support for non-blocking processing at the driver level. There are discussions about how to handle transactions anyway, but as far as I know there is no final decision on the matter.
You can always use transactions, but remember that you won't get non-blocking processing: you need to run the update/delete/insert/commit in the same thread, or manually propagate the transactional context to the new thread and block the main thread.
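To illustrate the thread issue: as far as I know, Spring keeps the active transaction's resources in thread-local storage, so work handed to another thread does not see them unless you pass them along yourself. The sketch below mimics that with a plain `ThreadLocal`. The class `ThreadBoundContextDemo`, the field `CURRENT_TX`, and the value `"tx-1"` are made up for illustration and are not Spring or Reactor API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

public class ThreadBoundContextDemo {

    // Hypothetical stand-in for a thread-bound transaction context.
    private static final ThreadLocal<String> CURRENT_TX = new ThreadLocal<>();

    /** Returns what a worker thread sees when the context is NOT propagated. */
    public static String seenWithoutPropagation() {
        CURRENT_TX.set("tx-1"); // bound to the calling thread only
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Supplier<String> readContext = CURRENT_TX::get;
            // join() blocks the calling thread until the worker has finished.
            return CompletableFuture.supplyAsync(readContext, worker).join();
        } finally {
            worker.shutdown();
            CURRENT_TX.remove();
        }
    }

    /** Returns what a worker thread sees when the context is copied over by hand. */
    public static String seenWithManualPropagation() {
        CURRENT_TX.set("tx-1");
        String captured = CURRENT_TX.get(); // capture on the calling thread
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Supplier<String> readContext = () -> {
                CURRENT_TX.set(captured); // re-bind on the worker thread
                try {
                    return CURRENT_TX.get();
                } finally {
                    CURRENT_TX.remove(); // do not leak into pooled threads
                }
            };
            return CompletableFuture.supplyAsync(readContext, worker).join();
        } finally {
            worker.shutdown();
            CURRENT_TX.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println("without propagation: " + seenWithoutPropagation()); // null
        System.out.println("with propagation: " + seenWithManualPropagation()); // tx-1
    }
}
```

In both variants the caller still waits on `join()`, which is the blocking the answer warns about.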
So I believe Reactor won't help you solve your performance issues, and a different kind of approach is called for.
My recommendation is:
- Use parallel processing in Spring Batch
- Find the optimal chunk size
- Review your indexes (not just creating them, but deleting them too)
- Review your queries
- Avoid unneeded transformations
- And even more important: profile it! The bottleneck can be something you have no idea about
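The first two points can be sketched in plain Java. This is not Spring Batch configuration, only the underlying idea: split the input into chunks and process the chunks on a thread pool, so that each chunk is a short unit of work (in Spring Batch, a chunk-oriented step commits once per chunk). The names `ChunkedParallelSketch`, `partition`, `processChunk`, and `processAll` are hypothetical, and the per-chunk "work" is just a sum standing in for real processing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ChunkedParallelSketch {

    /** Splits items into consecutive chunks of at most chunkSize elements. */
    public static <T> List<List<T>> partition(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    /** Hypothetical per-chunk work; a real job would do one short transaction here. */
    static long processChunk(List<Integer> chunk) {
        long sum = 0;
        for (int value : chunk) {
            sum += value;
        }
        return sum;
    }

    /** Processes every chunk on a fixed-size pool and combines the results. */
    public static long processAll(List<Integer> items, int chunkSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<Long>> results = new ArrayList<>();
            for (List<Integer> chunk : partition(items, chunkSize)) {
                results.add(CompletableFuture.supplyAsync(() -> processChunk(chunk), pool));
            }
            long total = 0;
            for (CompletableFuture<Long> result : results) {
                total += result.join(); // wait for each chunk to finish
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            items.add(i);
        }
        // Prints "3 chunks, total 55"
        System.out.println(partition(items, 4).size() + " chunks, total " + processAll(items, 4, 2));
    }
}
```

Timing `processAll` with different `chunkSize` and `threads` values against realistic data is one way to look for the optimal chunk size, and a profiler will tell you whether chunking is the bottleneck at all.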
Answered By - Felipe Rotilho
Answer Checked By - Marilyn (JavaFixing Volunteer)