NHibernate – Performance Lesson (Re)Learned

While trying to resolve the following exception from NHibernate, I (re)learned an important lesson in performance tuning.

Exception type: ADOExceptionException message: could not initialize a collection: 

We receive the above exception when our site is under high load (topic for another post). The collection that it complains about most is the OpenHouses collection in Listing. (A Listing is associated to one or more OpenHouses.)

[HasMany(typeof (OpenHouse),    Where = "EndDateTime > getdate()",    OrderBy = "StartDateTime",    BatchSize = 8,    Lazy = true)]public IList OpenHouses{    get { return openHouses;  }    set { openHouses = value; }}

We use the OpenHouses collection anytime we want to know if a Listing HasOpenHouses (as evident by the method defined below).

public bool HasOpenHouses{    get { return OpenHouses.Count > 0; }}

This is a pretty dumb way of checking to see if a Listing has at least one open house. Or is it?Why would this potentially be a bad way of checking the count of a collection? When the Count property is accessed from the OpenHouses collection the entire OpenHouses collection is loaded from the database. This is bad, right? We just need the count why would we load the whole collection.Given this “wisdom”, I set off to find a better way. I searched the internet and found this in the Hibernate documentation.How can I find the size of a collection without initializing it?

Integer size = (Integer) s.createFilter(    collection,    "select count(*)" ).uniqueResult();

Ok. How do I do this in Castle’s ActiveRecord?

public bool HasOpenHouses{    get    {        int count = Convert.ToInt32(Execute(            delegate(ISession session, object obj)            {                Listing listing = (Listing) obj;                return session.CreateFilter(                    listing.OpenHouses,                    "select count(*)").UniqueResult();            }));        return count > 0;    }}

After implementing the “better” solution, I decided that I should measure the performance. Not that I really needed to, everyone can see that this is a much better way to count OpenHouses. Wrong. This is a much worse way of counting OpenHouses.Why?

  1. OpenHouse is a very lightweight object (very few fields and no associations to other objects).
  2. There are very few OpenHouses associated with any given Listing.
  3. NHibernate cannot batch this version of HasOpenHouses. It must execute a query each time the HasOpenHouses question is asked. In our system, that turns out to be 8 times per page of results (once for each Listing displayed).
  4. The count query turns out to be no faster than original batch query that NHibernate was originally executing.

Lesson (re)learned. Always measure. No matter how obvious the “expected” results may seem.

Advertisements

7 Responses

  1. Wow, in my ignorance I have been doing this the “right” way. Actually, I guess your point is, ignorance is not bliss (unless you happen to be lucky). Thanks for the great post. All of us Nhib users need to constantly remind ourselves that, while NHib takes away a lot of the pain of dealing directly with the database, we must always understand how it does our bidding for us. Okay, right now I’m gonna put a shortcut on my desktop to SQL Server Profiler!!!!

  2. You are right… the SQL Server Profiler should be your friend 😉 Measure, Measure, Measure

  3. […] About Me « NHibernate – Performance Lesson (Re)Learned […]

  4. Actually, for best results in your page, you should do one of the followings:
    1. Use a Formula (Formula = “SELECT Count * FROM … “)
    2. Create a projection for your view

    In both ways, you’ll get your answer in a single query, fetching exactly the needed data for your view

  5. Hmmm…
    8 calls per page.

    I wonder if a “group by” (select x, count(*) … group by x) or “exists” (select x … where exists ( -open houses- )) would help.

    I’ve generally found Object-Relational Mapping tools, like Hibernate and TopLink, to be weak, in terms of leveraging the strengths of the relational database. Their ease of use is to be admired, but it often comes at a substantial cost, in performance. While caching and auto-generation of better queries by these tools can substantially improve performance, relative to typical hand-coded SQL, getting the most out of your database can be a problem.

    Fortunately, the Hibernate is doing a good job of providing more access to native SQL functionality through HSQL. And this can provide solutions to many of the most common and severe performance problems.

    (P.S. Code listings on this blog seem to get truncated on the 64th character, regardless of window size.)

  6. Another solution I just re-learned was that NHibernate treats tables and views in exactly the same manner. So, rather than map directly to the table, map to a view that joins in a count. Map the count to a property. Let the DB do the heaving lifting in batch.

  7. We did that as well on several occasions.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: