select_related and prefetch_related in Django
Django ORM stands for Object Relationship Mapper and it is used to work with the database, and we can store the data in a Pythonic way. This tutorial will give you a detailed explanation about one of the important concepts of Django ORM Query optimization – select_related and pre_fetch related.
What is select_related in Django?
When we fetch any object from the database, it returns the new queryset consists of that object not a particular object. It will create the new queryset of all related objects on access time. This is not good in all cases.
Before moving the select_related operation, we should add below logging to the setting.py file to see all SQL queries which are run behind in the Django CRM.
Merge the following snippet with the LOGGING field in your settings.py:
Model.py
We create two models, a Movie and Director. The movie model has three fields, including movie_title, release_year, and director foreign key.
Now we will run the following commands to make these tables in the database.
We will open the Django shell and create objects in order to store the date in tables.
Create the director objects in the Django shell.
We get the queryset of directors using following commands.
We can see the corresponding SQL query which is running internally.
Now we will create objects of the movie model.
Above output displays the SQL queries of movie object. Same as we created the three movies objects.
The Movie table has the foreign key relationship with the Director table. So we can use the following Query to fetch data
As seen in the above code, we need to run a separate query to fetch the director name using the Movie object.
These separate queries for the related objects decrease the performance of an application. Suppose we have 1000 movies, and we have to create a list of movies with the author’s name.
Each time we access the foreign key, we need to make another query to retrieve the value. So, we will end up running 1001queries to fetch the list of the books.
To overcome this problem, Django provides the select_related(), which will reduce the 1001 queries to 1.
Select Related
The select_related performs the inner join operations in the table. In the below example, movie_id is inner joined with the director.id.
Example –
Let’s create a query that fetches all movies with the name of the director in 1 query.
Open the Django shell and type the following query.
As we can see in the output, only one join query has been called to fetch all movies with the associate director. It is a big improvement for an application.
Difference Between select_related() and all()
- Without select_related
- With select_related
Now we will run the query with the all() method.
We get the same data from the both queries but there is a difference in the query lookups.
Let’s have a look on another scenario where we want to get the director name of first movie.
- Without select_related
We can observe that there are two SQL queries fired to fetch the director name. The first query fetches the movie name, and the second query fetches the associated director name. It may cause a lot of redundancy in the application. Let’s see how we can do the same using a single query.
- With select_related
The select_related is only limited to foreign key relationship. If there is many to many relationships then we cannot use the select_related. In that case, we can use the prefech_related.
Prefetch Related
The prefetch_related can be used with the many to many relationships to improve performance by reducing the number of queries. Let’s understand the following example.
When we try to get the movie with the publisher, it runs the two SQL queries in the background
We can solve the problem using the prefetch_related(). Let’s understand the following example.
Conclusion
So far, we have seen how select_related and prefetch_related efficiently reduced the queries overhead in Django. The select_related is used with the foreign key relationship, and it performs the INNER join operations with the associate table.
On the other hand, prefech_related is used with the many to many relationships. Select_related obtains all data simultaneously through multi-table join Association query and improves performance by reducing database queries.
But in many to many relationships, it is not a good choice to solve them using the JOINS because it will be a very long and tedious process. It will be ended up very time-consuming and SQL statements. To solve such a problem, we will use prefetch_related.