RavenDB Common Mistake 1 – Nested List Indexes

Suppose you were making an app that tracks mobile device usage in families. You might end up with an object model that looks something like this:

public class Family
{
    public string Id { get; set; }
    public IList<Person> FamilyMembers { get; set; }
}

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public IList<Device> Devices { get; set; }
}

public class Device
{
    public string DeviceName { get; set; }
    public bool IsPhone { get; set; }
    public bool IsTablet { get; set; }
    public decimal UsageMinutes { get; set; }
}

Suppose you wanted to query all people with a phone. You would probably write an index something like this to start:

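public class Person_Phone : AbstractIndexCreationTask<Family>
{
    public Person_Phone()
    {
        // Emits one index entry per (person, phone) pair in each Family document.
        Map = families => from family in families
                          from person in family.FamilyMembers
                          from device in person.Devices
                          where device.IsPhone == true
                          select new
                          {
                              person.FirstName,
                              person.LastName
                          };
    }
}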

This is what the team at RavenDB dubs a fan-out index: an index that produces multiple entries for a single document.
Fan-out indexes cause issues inside RavenDB that can result in the server running itself out of memory. This problem has been mitigated in RavenDB 3 by preventing the server from generating more than a maximum number of entries for an index. However, depending on your situation, hitting that limit might just leave you with *gasp* totally wrong data!
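
To make the fan-out concrete, here is a minimal sketch that mirrors the shape of the Map above using plain in-memory LINQ. It is not RavenDB code; `FanOutDemo`, `CountEntries`, and `SampleFamily` are names made up for illustration, and the sample data is invented.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Device
{
    public string DeviceName { get; set; }
    public bool IsPhone { get; set; }
}

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public IList<Device> Devices { get; set; } = new List<Device>();
}

public class Family
{
    public string Id { get; set; }
    public IList<Person> FamilyMembers { get; set; } = new List<Person>();
}

public static class FanOutDemo
{
    // Counts the entries that a map shaped like the one above would emit
    // for a single Family document (hypothetical helper, in-memory only).
    public static int CountEntries(Family family)
    {
        var entries = from person in family.FamilyMembers
                      from device in person.Devices
                      where device.IsPhone
                      select new { person.FirstName, person.LastName };
        return entries.Count();
    }

    // Invented sample: Ann has two phones and a tablet, Bob has one phone,
    // Cat has only a tablet.
    public static Family SampleFamily()
    {
        var ann = new Person { FirstName = "Ann", LastName = "Lee" };
        ann.Devices.Add(new Device { DeviceName = "Work phone", IsPhone = true });
        ann.Devices.Add(new Device { DeviceName = "Personal phone", IsPhone = true });
        ann.Devices.Add(new Device { DeviceName = "Tablet", IsPhone = false });

        var bob = new Person { FirstName = "Bob", LastName = "Lee" };
        bob.Devices.Add(new Device { DeviceName = "Phone", IsPhone = true });

        var cat = new Person { FirstName = "Cat", LastName = "Lee" };
        cat.Devices.Add(new Device { DeviceName = "Tablet", IsPhone = false });

        var family = new Family { Id = "families/1" };
        family.FamilyMembers.Add(ann);
        family.FamilyMembers.Add(bob);
        family.FamilyMembers.Add(cat);
        return family;
    }
}
```

For this sample, `FanOutDemo.CountEntries(FanOutDemo.SampleFamily())` returns 3: one document, three entries, and Ann shows up twice because she owns two phones.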

In this case, there isn’t a great way to eliminate the fan-out, since there will always be multiple people in one document. That said, you CAN drastically improve the performance of this operation. We had an index like this in our code at work, and we got the server to stop running out of resources and play nice simply by denormalizing some data within the document.
We changed the Person class to the following:

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public IList<Device> Devices { get; set; }
    public bool HasPhone { get; set; }
    public bool HasTablet { get; set; }
}

By maintaining the denormalized HasPhone and HasTablet flags whenever a person's devices change, we were able to change our index definition to:

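public class Person_Phone : AbstractIndexCreationTask<Family>
{
    public Person_Phone()
    {
        // Emits at most one index entry per person; no loop over Devices.
        Map = families => from family in families
                          from person in family.FamilyMembers
                          where person.HasPhone == true
                          select new
                          {
                              person.FirstName,
                              person.LastName
                          };
    }
}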
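
The price of this approach is that the flags have to be kept in sync with the Devices list. One way to do that, as a minimal sketch, is to recompute them just before a document is stored. `PersonFlags.Refresh` is a hypothetical helper, not part of RavenDB, and where you call it depends on how your app saves families.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Device
{
    public string DeviceName { get; set; }
    public bool IsPhone { get; set; }
    public bool IsTablet { get; set; }
    public decimal UsageMinutes { get; set; }
}

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public IList<Device> Devices { get; set; }
    public bool HasPhone { get; set; }
    public bool HasTablet { get; set; }
}

public static class PersonFlags
{
    // Hypothetical helper: recomputes the denormalized flags from Devices.
    // Call it whenever a person's devices change, before the document is saved.
    public static void Refresh(Person person)
    {
        // Treat a missing list as "no devices" so both flags end up false.
        var devices = person.Devices ?? new List<Device>();
        person.HasPhone = devices.Any(d => d.IsPhone);
        person.HasTablet = devices.Any(d => d.IsTablet);
    }
}
```

Recomputing from the list each time, rather than flipping the flags by hand when a device is added or removed, means a stale flag gets corrected the next time the person is saved.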

Removing that single level of nesting was a game-changer for RavenDB: the index now emits at most one entry per person instead of one entry per phone, and it allowed us to get our server back on the move!
