Saturday, September 29, 2007

Why use NHibernate?

On the NHibernate users forum I got a question:

"To summarize, what I want is ability to construct queries using ICriteria API, and to bypass hydration - instead of real objects i want to get a raw sql result table"

Yeah for that you do want projection. But I'm wondering why you'd bother with NHibernate then? If you really, truly want to avoid all objects and want a table result-set, why not just use ADO.NET (or other database) directly?


Why do we use NHibernate

1. Surprising as it may be, OOP is a middle ground between functional programming and aspect programming. I want to use objects in my database-aware application too. Unfortunately, object-oriented databases are still not as evolved as relational databases. And it so happens that we are not working on the Zope platform and cannot take advantage of ZODB (Zope Object DB). I want objects, not some random bits of data.

2. My first acquaintance with an ORM solution was Neo .Net. The project is no longer in development, but it served very well for learning what ORM is. Neo is a decent tradeoff between power and flexibility on one side and ease of use on the other, with a very small learning footprint.

If you happen to be new to the ORM world, don't try NHibernate first (even with Castle.ActiveRecord). Take some simple ORM tool and use it. Get to know what the pros and cons of ORM solutions are. Then take NHibernate, since, IMHO, that's the only really flexible and powerful solution.

Here you can find NEO with generics support

3. Our way of programming is strictly object oriented. Working with objects allows us to build impressive functionality in a very short amount of time.

4. NHibernate is database agnostic. To explain: NHibernate is not tied to any particular database. You create your model and then use it on practically any database that exists. NHibernate is not an 80% solution. It is a 100% solution, and you can be quite confident that NHibernate will support the database you need.

In our project, everything runs on FireBird, but tests run on SQLite. Unfortunately there's one problem with SQLite - it does not support database roles. Too bad.

5. NHibernate is powerful. You can execute direct SQL against the database, or use the powerful ICriteria API. You can fetch either a result set containing object properties or the objects themselves.

6. NHibernate is free software, so we have the source code. The free software license does not deny the right to bundle the NHibernate library with your proprietary code.

Performance upgrade month 2

We use NHibernate projections for a performance gain, and a very big one.

Projections work this way: instead of loading real objects, we tell NHibernate to load only specific properties. So instead of loading an object with the necessary related objects pre-fetched, we build an ICriteria query with a projection for each property we need to display in the grid.

The gains are these:
1. We bypass the hydration process, so we get an array of columns which we can databind to a specific datagrid. As a side effect, the databinding process no longer needs reflection code to extract data from objects :)
2. We still have the ICriteria interface to specify whatever filters we might need.
3. If you look at NHibernate projections, you will see that they include some nice things such as AVG, MIN, MAX, group by, COUNT etc. Do you see the possibilities? Tree views, additional helpful info for various reports, etc.

After switching to projections, so that instead of loading 15k object graphs we select only the properties we display in the grid, the load time went down from 15 seconds to 3-5 seconds. That is, the performance gain was 3 to 5 times. That's a lot.
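To make the difference concrete outside of NHibernate, here is a minimal sketch in Python with the standard sqlite3 module. It is only an analogy, not NHibernate code: the `grp` and `leader` tables, the `Group` and `Leader` classes and all function names are made up for illustration. The first loader builds a small object graph per row (roughly what hydration does), the second selects only the displayed columns and returns plain tuples (roughly what a projection does), and the third shows an aggregate such as COUNT with group by.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Leader:
    id: int
    name: str
    phone: str


@dataclass
class Group:
    id: int
    title: str
    leader: Leader


def make_db(rows=1000):
    """Build an in-memory database with hypothetical tables."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE leader (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
    db.execute("CREATE TABLE grp (id INTEGER PRIMARY KEY, title TEXT, leader_id INTEGER)")
    for i in range(rows):
        db.execute("INSERT INTO leader VALUES (?, ?, ?)",
                   (i, f"Leader {i}", f"555-{i:04d}"))
        db.execute("INSERT INTO grp VALUES (?, ?, ?)", (i, f"Group {i}", i))
    return db


def load_objects(db):
    """Hydration-like path: every row becomes a small object graph."""
    sql = ("SELECT g.id, g.title, l.id, l.name, l.phone "
           "FROM grp g LEFT JOIN leader l ON l.id = g.leader_id ORDER BY g.id")
    return [Group(gid, title, Leader(lid, name, phone))
            for gid, title, lid, name, phone in db.execute(sql)]


def load_projection(db):
    """Projection-like path: only the displayed columns, as plain tuples."""
    sql = ("SELECT g.title, l.name "
           "FROM grp g LEFT JOIN leader l ON l.id = g.leader_id ORDER BY g.id")
    return db.execute(sql).fetchall()


def count_groups_per_leader(db):
    """Aggregate projection: COUNT with GROUP BY, no objects involved."""
    sql = ("SELECT leader_id, COUNT(*) FROM grp "
           "GROUP BY leader_id ORDER BY leader_id")
    return db.execute(sql).fetchall()
```

The tuples from `load_projection` can go straight into a grid, while `load_objects` pays for constructing and wiring up every object first. How big the gap is in a real ORM depends on the mapping, so treat this as a shape of the idea rather than a benchmark.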



Disclaimer
We still have many problems with performance, but at the moment the UI responsiveness is satisfactory.

Here are the other problems we have and what we could improve:

1. We use SourceGrid as our datagrid solution. It is much, much better than System.Windows.Forms.DataGridView, since it provides a rather solid MVC separation and extensibility mechanism. However, it has its own problems. Some of them might be solved easily, and some might be rooted deep in SourceGrid itself.

Yes, you guessed it. One of the problems is performance :). SourceGrid has a problem managing many cells (i.e. > 50k). Maybe that could be solved using VirtualGrid (a class in SourceGrid). I have not delved into it much, so I cannot say much for now.

P.s. for 15k objects the default SourceGrid solution might add a 3 to 5 second delay solely for constructing cells.

2. Every time we edit an object, we create and destroy a form. Forms could be cached and reused.

3. Object loading might be put on a separate thread, and instead of loading the whole 15k rows at once, the thread could load a small chunk first, display it in the grid, then load the other rows. This would increase UI responsiveness.
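A rough sketch of that chunked loading idea, in Python with only the standard library. This is an assumption about how it could be wired up, not what our application does: `fake_fetch` stands in for a paged database query, and the list named `displayed` stands in for the grid.

```python
import queue
import threading


def load_in_chunks(fetch_chunk, total, chunk_size, out):
    """Worker: fetch rows chunk by chunk and hand each chunk to the UI side."""
    for offset in range(0, total, chunk_size):
        out.put(fetch_chunk(offset, min(chunk_size, total - offset)))
    out.put(None)  # sentinel: no more data


def fake_fetch(offset, limit):
    """Hypothetical stand-in for a paged database query."""
    return list(range(offset, offset + limit))


def run_demo(total=15_000, chunk_size=500):
    chunks = queue.Queue()
    worker = threading.Thread(
        target=load_in_chunks, args=(fake_fetch, total, chunk_size, chunks))
    worker.start()
    displayed = []
    while True:
        # A real UI would poll the queue from its event loop instead of blocking.
        chunk = chunks.get()
        if chunk is None:
            break
        displayed.extend(chunk)  # stand-in for "append rows to the grid"
    worker.join()
    return displayed
```

The point of the queue is that the first chunk is available to the UI as soon as it is fetched, so the user sees rows long before the last chunk arrives.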

4. Sometimes forms fail to open within 3 seconds. This happens when, for example, the Group object has a collection of Students (count > 50), a collection of Companies (count > 50), a collection of other stuff, etc. We put all those collections into separate tabs in the form. We could make each tab load its data incrementally, and make the form responsive to user actions as soon as it has the minimum amount of data. I.e. something like the ajax thing, but for desktop. We might call it Desktop 2.0 :D

Performance upgrade month

Last month was performance upgrade month for our product. Wise men say: "for beginners - do not optimize. For experts - do not optimize, yet". So we did not optimize for a very long time, but the time has come to do it.

Along the way various random thoughts popped up, and some posts were written to the NHibernate users forum. The first post and the second.

The problem:
We have a program where you can choose whatever columns you want to view in a datagrid. Those columns can be any properties you can reach in the object graph, provided that you only go via to-one relationships (i.e. towards the "one" side of a "one-to-many" relationship).

Say we are viewing a collection of Teaching Group objects. Then in the grid we can choose any properties from the object graph, for example:


group object properties
Group.TeachingProgramme properties
Group.GroupLeader properties
Group.GroupLeader.Company properties
etc.



The problem is that if we have a very big object graph, then loading the whole object graph gets extremely slow.


We did tests with 15k objects (which is not much), and we had around a 15 second delay to load all those objects and then display them in the grid (having selected around 30 properties, with relationships, which results in a select with 9 joins).

In another test we selected absolutely all properties to show in the grid (I did not count them, but the list was long, I believe more than 50 columns in the grid). The SQL query NHibernate generated was a few pages long, and it had 22 joins. Now let's count how many objects NHibernate has to create every time you want to refresh the grid. Say we have 17k objects, and each object has 22 subobjects. One left join is one subobject. So NHibernate would have to create 17k * 22 = 374k, that is, close to 400k objects. It's not the memory consumption that matters, it is the hydration process that matters. Just imagine how many lookups NHibernate has to do in order to set up references for all those objects. To add more, we still use Ayende's NHibernate.Generics package, even though it was deprecated with NHibernate 1.2. That library, with its automatic association handling between objects, adds some overhead too, though I cannot say how much.
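The estimate is plain arithmetic, and it is easy to check. A tiny Python sketch, where the function name is made up and the assumption (every left join yields one distinct subobject per row) is the same simplification used above; shared subobjects would lower the real count:

```python
def objects_to_hydrate(root_rows, joined_per_row):
    """Joined subobjects only, assuming one distinct subobject per join per row."""
    return root_rows * joined_per_row


subobjects = objects_to_hydrate(17_000, 22)
total = subobjects + 17_000  # add the root objects themselves
print(subobjects, total)
```

That gives 374,000 subobjects, or 391,000 objects once the 17k root objects are counted, which is where the "around 400k" figure comes from.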

Then we tried to figure out what the problem was.
The first thought was that the joins cause most of the trouble, since we measured that with zero joins our grid fetches 15k objects and displays them in 6 seconds, with 4 joins in 9 seconds, and with 9 joins in 14 to 16 seconds.

It was reasonable to blame associations, but the problem was not actually in the joins.

For the next test we took the select statement with 22 joins generated by NHibernate and executed it directly against our database (which happened to be FireBird). The result was promising - 17k rows with all properties were fetched in less than 4 seconds. (P.s. time measurement was done with eye-movement precision, that is, we just watched the clock and counted the seconds).

The results suggested that it is not the database that performs slowly with many joins, but NHibernate. Most probably the hydration process gets slower as the depth of our fetched object graph increases. That is, whenever we add a column which adds a new path in the object graph, the process becomes a lot slower, since it considerably increases the number of objects NHibernate has to hydrate for us.

So the answer is clear - we have to bypass hydration process.

Friday, September 28, 2007

Web application unit testing

How to test applications on .Net

The thing that popped into my mind is "User Interface testing is not stable". Period.

Any time I tried to test the user interface, everything ended up in the recycle bin the next week, if not the next day. If you want to take full advantage of unit testing your code, you must write it clean and separated. That's true both for web and desktop applications.

If you ask what clean and separated code is, I would describe it like this: your code must be written in chunks as small as possible. Your functions should do only one single thing, your classes should have only one responsibility, your namespaces should be grouped by behaviour.

It's like in an orchestra - if you have a violin, a guitar and a maestro, then you have three responsibilities - playing the violin, playing the guitar and acting as the maestro.

If you happen to write a Web application, do not blindly follow examples that put your data loading, decision and presentation logic together into one single code-behind (or whatever) class. It's always the same three letters - Model / View / Controller - MVC.

It's as important to separate M from C as it is to separate P (presentation) from M and C. Your presentation can NEVER be together with your controllers. Clean separation is both a prerequisite for unit testing, and a result of it.
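Here is a minimal sketch of what that separation buys you, written in Python for brevity. All class and function names are hypothetical, and a real web framework would shape these pieces differently. The point is only that the controller talks to the view through one small method, so a test can swap the real page or form for a recording stand-in.

```python
class GroupModel:
    """Model: holds data, knows nothing about presentation."""

    def __init__(self, names):
        self._names = list(names)

    def names(self):
        return list(self._names)


class GroupController:
    """Controller: decision logic only; reaches the view through render()."""

    def __init__(self, model, view):
        self.model = model
        self.view = view

    def show_filtered(self, prefix):
        matches = [n for n in self.model.names() if n.startswith(prefix)]
        self.view.render(matches)


class RecordingView:
    """Test double standing in for a real page or form."""

    def __init__(self):
        self.rendered = None

    def render(self, rows):
        self.rendered = rows


def test_controller_filters_by_prefix():
    view = RecordingView()
    model = GroupModel(["Math A", "Math B", "Physics"])
    GroupController(model, view).show_filtered("Math")
    assert view.rendered == ["Math A", "Math B"]
```

No browser and no form is involved, so a test like this does not end up in the recycle bin when the layout changes.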

Web unit testing can check your model and controller classes. hammett, maintainer of Castle Project, also tests ViewComponents.

As for web testing tools, two of the more popular ones are WatiN and Selenium. You can also check my presentation about Selenium: pdf format odp format

Friday, September 21, 2007

How to get perfect code

macournoyer is a good guy.

Read his blog about 5 ways to achieve perfect code


I would correct him - not five ways, but five things you have to do constantly