Saturday, September 29, 2007

Performance upgrade month

Last month was performance upgrade month for our product. Wise men say: "For beginners: do not optimize. For experts: do not optimize, yet." So we did not optimize for a very long time, but the time has come to do it.

Along the way various random thoughts popped up, and some posts were written to the NHibernate users forum: the first post and the second.

The problem:
We have a program where you can choose whatever columns you want to view in a datagrid. The columns can be anything you can reach in the object graph, provided that you travel only via to-one relationships (i.e. the "one" side of a "one-to-many" relationship).

Say we are viewing a collection of Teaching Group objects. Then in the grid we can choose any properties from the object graph, for example:


group object properties
Group.TeachingProgramme properties
Group.GroupLeader properties
Group.GroupLeader.Company properties
etc.
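
To make the link between chosen columns and joins concrete, here is a minimal sketch in Python (not our actual code). It assumes each grid column is written as a dotted path starting from the group, and the property names such as Title and LastName are made up for illustration. Every path segment before the last one is an association that has to be joined.

```python
def required_joins(column_paths):
    """Return the association paths that must be joined to show the given columns."""
    joins = set()
    for path in column_paths:
        parts = path.split(".")
        # Everything before the last segment is an association to traverse;
        # the last segment is the plain property shown in the grid.
        for depth in range(1, len(parts)):
            joins.add(".".join(parts[:depth]))
    return sorted(joins)


# Hypothetical column selection; the property names are assumptions.
selected = [
    "Name",
    "TeachingProgramme.Title",
    "GroupLeader.LastName",
    "GroupLeader.Company.Name",
]

print(required_joins(selected))
```

Four columns already need three joins here, and every new path through the graph adds another one.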



The problem is that if we have a very big object graph, it gets extremely slow to load the whole graph.


We did tests with 15k objects (which is not much), and we had around 15 seconds of delay to load all those objects and display them in the grid (with around 30 properties selected, relationships included, which results in a select with 9 joins).

In another test we selected absolutely all properties to show in the grid (I did not count them, but the list was long; I believe more than 50 columns). The SQL query NHibernate generated was a few pages long and had 22 joins. Now count how many objects NHibernate has to create every time you refresh the grid. Say we have 17k objects, and each object has 22 subobjects, one per left join. Then NHibernate has to create about 17k × 22 ≈ 374k, so close to 400k objects. It is not the memory consumption that matters, it is the hydration process. Just imagine how many lookups NHibernate has to do to set up references for all those 400k objects. On top of that, we still use Ayende's NHibernate.Generics package, even though it was deprecated with NHibernate 1.2. That library, with its automatic association handling between objects, adds some overhead too, though I cannot say how much.
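
The back-of-the-envelope count above, written out as a small Python sketch. It assumes the worst case where every joined row yields its own subobject; in practice some subobjects (the same Company for many groups, say) would be shared, so the real number is probably lower. The function name is made up for illustration.

```python
def hydrated_objects(root_rows, joins):
    """Worst-case object count: one subobject per join for every root row."""
    subobjects = root_rows * joins
    return subobjects, root_rows + subobjects


subobjects, total = hydrated_objects(17_000, 22)
print(subobjects)  # 374000 subobjects
print(total)       # 391000 objects including the root objects
```

Either way the number lands close to the 400k mentioned above.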

Then we thought about what the problem was.
The first thought was that the joins cause most of the problem, since we measured that with zero joins our grid fetches and displays 15k objects in 6 seconds, with 4 joins in 9 seconds, and with 9 joins in 14 to 16 seconds.

It was reasonable to blame associations, but the problem was not actually in the joins.

For the next test we took the select statement with 22 joins generated by NHibernate and executed it directly against our database (which happens to be Firebird). The result was promising: 17k objects with all properties were fetched in less than 4 seconds. (P.S. Time was measured with eye-movement precision, that is, we just watched the clock and counted the seconds.)

The results suggest that it is not the database that performs slowly with many joins, but NHibernate. Most probably the hydration process gets slower as the depth of the fetched object graph increases. That is, whenever we add a column which adds a new path in the object graph, the process becomes a lot slower, since it multiplies the number of objects NHibernate has to hydrate for us.
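
If you want something better than watching the clock, timing the raw query takes only a few lines. This is a sketch in Python, with an in-memory SQLite database standing in for our Firebird one; the teaching_group table and the time_query helper are made up for illustration, and the numbers it prints say nothing about our real setup.

```python
import sqlite3
import time


def time_query(conn, sql):
    """Run sql, fetch every row, and return (row_count, elapsed_seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    elapsed = time.perf_counter() - start
    return len(rows), elapsed


# Stand-in data: 17k rows in a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teaching_group (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO teaching_group (id, name) VALUES (?, ?)",
    [(i, f"group {i}") for i in range(17_000)],
)

count, seconds = time_query(conn, "SELECT id, name FROM teaching_group")
print(f"{count} rows in {seconds:.3f} s")
```

Fetching all rows inside the timed part matters, because otherwise only the time to start the query is measured.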

So the answer is clear: we have to bypass the hydration process.
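
One way to read "bypass hydration" is to ask the database only for the column values the grid shows and keep them as flat rows, instead of building an object for every joined row. The sketch below shows that idea in Python with SQLite as a stand-in. It is not NHibernate code, and the tables, columns and the load_grid_rows function are assumptions made up for the example.

```python
import sqlite3

# Hypothetical schema mirroring Group -> GroupLeader -> Company.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE group_leader (id INTEGER PRIMARY KEY, last_name TEXT,
                               company_id INTEGER REFERENCES company(id));
    CREATE TABLE teaching_group (id INTEGER PRIMARY KEY, name TEXT,
                                 leader_id INTEGER REFERENCES group_leader(id));
    INSERT INTO company VALUES (1, 'Acme');
    INSERT INTO group_leader VALUES (1, 'Smith', 1);
    INSERT INTO teaching_group VALUES (1, 'Group A', 1), (2, 'Group B', NULL);
""")


def load_grid_rows(conn):
    """Fetch only the displayed columns as plain tuples, one per grid row."""
    sql = """
        SELECT g.name, l.last_name, c.name
        FROM teaching_group g
        LEFT JOIN group_leader l ON l.id = g.leader_id
        LEFT JOIN company c ON c.id = l.company_id
        ORDER BY g.id
    """
    return conn.execute(sql).fetchall()


for row in load_grid_rows(conn):
    print(row)
```

The joins are still there, but each grid row is one tuple of values, so there are no subobjects to create and no references to wire up.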
