One of the most common performance issues that can affect a rails application (or any other web application) is the n+1 queries problem. This is usually an easy issue to solve, but there may be situations where the solution is not so trivial. In this article I want to talk about a situation where that may happen and the solutions that I usually use to solve it.
To make sure we are on the same page, let me show an example where the n+1 query problem appears and how we can fix it in rails.
Consider the following models:
Now consider the following code:
At first glance this code may seem harmless but if we check the queries that it does to the database we see that it loads all the Books and then makes an additional query for each author.
This can be a huge performance hit!
Eager Loading
Active Record provides a really nice solution to this issue that allows you to specify which association you want to load.
This is called Eager Loading. So looking again at the previous example:
You just need to call the includes method on the Model with the association name you want to load and it will automatically make the query to load the necessary data.
You can see what is happening by looking at the sql code that is executed. It loads all the books from the database, then it extracts all the author_ids from the books and make a new query to fetch all the authors with those ids.
With this, when you try to access the author association of a given book, the data will already be available so there is not need to make a new query to load the information.
Eager loading with conditions
As you can see the you can fix the n+1 query problem quite easily in rails. But over the last years working on web applications in rails, there have been a lot of situations where I need to eager load only a subset of the data from an association. I will show a couple of ways to achieve this, and explain the advantages and disadvantages of each one.
Let’s start with an example of a feature:
Considering the models at the beginning, print all the authors names and for each author print the name of the published books.
Filtering in Memory
The most obvious way to do this is to just fetch all the Author from the database and filter the books in memory afterwards. It would be something like this:
This may work just fine most of the times, but if you have to load a lot of books in the database this requires you to load all of them into memory, when you only need a subset of the books, which can have a negative performance impact.
Note: You may be tempted to use the published scope to filter the books like this:
published_books = author.books.publishedbut that would be even worse since it would trigger a new query in each iteration bringing back the n+1 query issue.
Filtering in the database
An alternative solution to filter the association data in memory is to do it in the database so that we only retrieve the data we need.
ActiveRecord allows us to do just that by specifying where conditions on the table of the association we want to filter.
To implement the same feature we can do this:
As you can see it’s quite easy to do. Each of those authors will have only the published books loaded in the books association.
You can also use a scope or a string to specify the predicate. If you do that you have to call the method references to tell ActiveRecord what tables you will be referencing.
There is one caveat though. Take a look at the sql code that is executed:
As you can see it’s quite different from the previous examples. Since we specified a predicate referencing a different table, ActiveRecord has to make a join so that it can apply the condition in the same query.
Because of this, the query will get different set of authors than we originally wanted. Instead of getting all the authors, it is only getting the authors that have books published.
The first time I saw this result I didn’t understand what was going on. I always thought that the left outer join would guarantee that the data from the table on the left side of the join would always be present whether it had a match in the right side of the join or not. Turns out that when we specify a where clause referencing the table on the right side of a left join it will remove all the rows where the predicate does not match.
In this case, if we have an author that does not have at least one published book, all the rows referencing that author will be removed from the result. So if you want to filter a association of a model this way keep this in mind.
Scoped Associations
There is one last alternative to filter the data from an association. As we have seen, using the includes method ActiveRecord can perform eager loading for us for a given association without any problem.
Instead of trying to filter the association in the main query, we can define an association that only gets the filtered data. The associations api allows us to specify additional conditions to get the associated data. So looking again at our Author model, we can define a published_books association. It looks like this:
You just need to pass a lambda as the second argument that defines the conditions you want to perform. The code inside the lambda is performed against the Book model so you can use any scoped defined in the model as well:
has_many :published_books, -> { published }, class_name: 'Book'Then all you have to do is just eager load this new association. So getting back to the feature at hand it would like this:
This approach does not have the same issue has the last one. It will load all the authors whether they have published books or not.
Usually this is my preferred way to filter associations, but there is at least one situation where this will not work.
If you have some condition that is only known at runtime, you have no way to specify an association for that condition. A typical example of this is any condition that involves the current_user.
Let’s consider that we now need to show all the published books that belong to the current_user.
If we were to use the scoped association approach, we would need to find a way to define a current_user_published_books association in the Author model, but the main issue is that we don’t have a good way to access the current_user variable from the Model layer.
In this scenario we can actually mix both approaches and achieve the result we want:
Final Thoughts
As you can see, there are situations where solving an n+1 query problem may not be as simple as just calling the includes method. The examples I have shown are quite simple, but I think they illustrate some of the different ways you can filter eager loaded data and the different characteristics of each approach.
If you have any feedback or doubts feel free to leave a comment bellow.
I work at Runtime Revolution with a great team of developers and designers where we help our clients build the best products possible. I love all things web and I am always looking to learn new stuff.
Runtime RevolutionWe are Rails, mobile and product development experts. We can build your product or work with you on your project.www.runtime-revolution.com