Lately, I was asked to help out on a project which suffered from a serious performance problem. The Spring-based application was supposed to process business transactions of various types, coming in over channels like JMS and Web services. Each transaction had to pass through a validation pre-processor to ensure that everything was correct for the later steps. Each transaction was then shuffled up and down the processing chain and ended up in a database persistence layer, handled by Hibernate. So far, rocket science looks different, doesn’t it?
What was happening under the hood?
The environment of this application was pretty straightforward:
- Hibernate entities were created and accessed by DAO classes, extending HibernateDaoSupport
- Database connectivity was configured via the Spring container and put to work in a HibernateSessionFactory
- The SessionFactory got injected into all DAO classes
- Transaction management was configured on the SessionFactory, using HibernateTransactionManager
- Transaction handling was enacted using Spring AOP
From the list of technologies involved and components wired together, everything seemed normal and pretty much standard.
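To make the setup more concrete, here is a minimal sketch of what such a DAO might have looked like. The entity and method names are hypothetical, but the pattern of extending HibernateDaoSupport and working through the injected SessionFactory is the one described above:

```java
import java.util.List;
import org.springframework.orm.hibernate3.support.HibernateDaoSupport;

// Hypothetical DAO following the pattern described above: it extends
// HibernateDaoSupport and relies on a SessionFactory injected by Spring.
public class BusinessTransactionDao extends HibernateDaoSupport {

    // Stores or updates an entity; runs inside the transaction that
    // Spring AOP has opened around the calling service method.
    public void save(BusinessTransaction tx) {
        getHibernateTemplate().saveOrUpdate(tx);
    }

    // Loads an entity by id through the very same session/transaction.
    public BusinessTransaction load(Long id) {
        return (BusinessTransaction) getHibernateTemplate().get(BusinessTransaction.class, id);
    }

    // A simple lookup as the validation pre-processor might use it (hypothetical query).
    @SuppressWarnings("unchecked")
    public List<BusinessTransaction> findByReference(String reference) {
        return getHibernateTemplate().find(
            "from BusinessTransaction t where t.reference = ?", reference);
    }
}
```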
The application was already processing business transactions from "online" sources via JMS and Web services quite nicely, and it also had a batch layer for processing larger datasets coming in with end-of-day jobs. The pre-processing mentioned above was enacted only for the "online" sources, as the batch processing was considered "safe".
The application was – up to now – performing as expected and handling things very smoothly.
What changed?
The application was developed incrementally, extending its functionality in each stage. The change now implemented was to add the validation pre-processor to the batch processing as well, both for reasons of sanity and to address some "rogue" deliveries that had been causing trouble lately. So, the developer got to work and simply inserted the validation step before the real batch processing started. That way, only valid data entered the system.
With this change, batch performance dropped from about a minute per 500,000 transactions to almost 2 hours for the same amount of data. That did not look good.
What the heck is happening?
As you have probably already guessed, the developer focused on the validator as the most likely cause of the issue (Remember: what got changed last is most likely the first thing to cause trouble!). Nevertheless, he didn’t have any clue about what exactly was going wrong. Since the deadline was drawing nearer, I jumped in to shed some light into the dark.
To get a feeling for whether it really was the validator, I added some dumb timing statements to the code and let it run again (Remember: a smell does not tell you the exact location, just that the problem is somewhere near). The timing statements were put around some logical points, nothing highly elaborate. It was more of a compass approach: get an initial heading and take another look from there.
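For illustration, a "dumb" timing statement of this kind is nothing more than System.currentTimeMillis() wrapped around the suspected block; the method and logger below are made-up stand-ins for the real code:

```java
// Minimal "compass" timing: wrap the suspected block and log how long it took.
long start = System.currentTimeMillis();

processBatchItem(item);   // hypothetical block under suspicion

long elapsed = System.currentTimeMillis() - start;
logger.info("processBatchItem took " + elapsed + " ms");
```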
The timings smelled badly enough that I was likely close to the spot: a cluster of load operations, probably. I then put the timings inside that cluster, with an interesting result: in the beginning, the performance was quite nice, but over time it grew worse and worse.
It turned out that all operations were part of the same Hibernate transaction. That means each processed business transaction was stored in the same Hibernate session, and all load operations were performed within that very same transaction as well. And since everything ran in one transaction, it all shared one and the same Hibernate session. That caused Hibernate some serious trouble: the number of dirty objects attached to the session grew on and on, being flushed only at the end of the operation.
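To make the problem visible, here is a hypothetical sketch of what the batch entry point effectively did. The class and method names are invented, and @Transactional stands in for the project's Spring AOP transaction configuration; the crucial point is the single transaction, and thus single session, wrapped around the whole loop:

```java
import java.util.List;
import org.springframework.transaction.annotation.Transactional;

// Hypothetical sketch of the batch entry point (all names are made up).
// The whole loop runs in ONE transaction, so every entity loaded or saved
// stays attached to the same Hibernate session for the entire batch run.
public class EndOfDayBatchProcessor {

    private final TransactionValidator validator;   // hypothetical validation pre-processor
    private final BusinessTransactionDao dao;        // DAO as sketched above

    public EndOfDayBatchProcessor(TransactionValidator validator, BusinessTransactionDao dao) {
        this.validator = validator;
        this.dao = dao;
    }

    @Transactional
    public void processBatch(List<BusinessTransaction> batch) {
        for (BusinessTransaction tx : batch) {
            validator.validate(tx);   // newly added step: triggers loads in the same session
            dao.save(tx);             // entity stays attached to the ever-growing session
        }
        // nothing is committed until the very end of this method
    }
}
```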
Reclaimed performance
To address this issue in a first step, I suggested the following:
- Creation of a second HibernateSessionFactory for read-only access, without any TransactionManager
- Changing the "read" methods in the DAO classes to use the read-only SessionFactory, which was not part of any transaction. This was a sound decision, as the business case supported it (beware: this only works if you do not modify data inside the transaction and then expect to read those uncommitted changes outside of it!)
- Adding a second-level cache to Hibernate (EHCache in this case) for that read-only SessionFactory
With these changes, any added entity was attached only to the transacted SessionFactory, which could be committed safely at the end of the transaction. All entities needed for the validation process were safely served by the read-only SessionFactory, which was queried only if the desired entity was not already in the cache. With these rather simple measures, the performance was reclaimed quite nicely.
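As a rough sketch of how the DAO side of such a split might look (the readOnlyTemplate property and all names are assumptions, not the project's actual code), the read methods are simply routed through a HibernateTemplate built on the second, read-only SessionFactory:

```java
import org.springframework.orm.hibernate3.HibernateTemplate;
import org.springframework.orm.hibernate3.support.HibernateDaoSupport;

// Hypothetical DAO after the change: writes still go through the transacted
// SessionFactory inherited from HibernateDaoSupport, while reads go through a
// second, read-only SessionFactory (wrapped in its own HibernateTemplate)
// that is backed by the EHCache second-level cache.
public class BusinessTransactionDao extends HibernateDaoSupport {

    private HibernateTemplate readOnlyTemplate;   // built on the read-only SessionFactory

    public void setReadOnlyTemplate(HibernateTemplate readOnlyTemplate) {
        this.readOnlyTemplate = readOnlyTemplate;
    }

    // Writes stay in the transacted SessionFactory, committed at the end.
    public void save(BusinessTransaction tx) {
        getHibernateTemplate().saveOrUpdate(tx);
    }

    // Reads used by the validator bypass the ever-growing transactional session
    // and are answered from the second-level cache whenever possible.
    public BusinessTransaction loadForValidation(Long id) {
        return (BusinessTransaction) readOnlyTemplate.get(BusinessTransaction.class, id);
    }
}
```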
The architecture itself will nonetheless be overhauled to better address possible scalability issues, which should further increase performance, but for the quick win, the project was happy.
Next Steps
What should you do now? Start with these things:
- Check your Hibernate performance!
Use either a more advanced tool like JProfiler, or a simple System.currentTimeMillis() check, which does not cost you a lot.
- On each change affecting the persistence layer, watch your surroundings!
Clearly verify how big your "dirty" session might become and plan for even that estimate to be exceeded.
- Check your usage of the HibernateSessionFactory!
Are you heavily mixing read-only and write/update operations on the same SessionFactory? If so, consider separating them, or dig into Hibernate's "dirty check" algorithm.
- Check your cache!
Are you already using a second-level cache? If not, try it out! It will take some stress off your database and make your application faster. If you do, have you adjusted its settings? Optimizing eviction policies and memory usage can help you significantly (see the small sketch after this list).
- Get a Hibernate training!
Do not think that Hibernate is "simply" an O/R layer. It is complex, and not using it correctly may cost you both time and reputation. Remember: if you don't sharpen your saw, you might better not use it at all.
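As a small illustration for the cache hint above, marking a rarely changing reference entity as cacheable might look like the following sketch. The entity is hypothetical and Hibernate annotations are assumed, although mapping files work just as well:

```java
import javax.persistence.Entity;
import javax.persistence.Id;
import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

// Hypothetical reference entity used by the validator: a good candidate for
// the second-level cache because it is read often and changed rarely.
@Entity
@Cache(usage = CacheConcurrencyStrategy.READ_ONLY)
public class ValidationRule {

    @Id
    private Long id;

    private String expression;

    // getters and setters omitted for brevity
}
```

On top of that, the second-level cache itself has to be switched on in the Hibernate configuration (for example via hibernate.cache.use_second_level_cache and an EHCache cache provider) before such a setting has any effect.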