Should I Create an Index on Foreign Keys in PostgreSQL?

Welcome to a weekly blog where I get to answer (like, really answer) some of the questions I’ve seen in the webinars I’ve presented lately. If you missed the latest one, PostgreSQL Performance Tuning Secrets, it might be helpful to give some of it a listen before or after you read this post. Each week, I’ll dive deep into one question. Let me know what you think in the comments.

We constantly hear that indexes improve read performance and it’s usually true, but we also know that it will always have an impact on writes. What we don’t hear about too often is that in some cases, it may not give any performance improvement at all. This happens more than we want and might happen more than we even notice, and foreign keys (FKs) are a great example. I’m not saying that all FK’s indexes are bad, but most of the ones I’ve seen are just unnecessary, only adding load to the system.

For example, the below relationship where we have a 1:N relationship between the table “Supplier” and table “Product”:

foreign keys index

If we pay close attention to the FK’s in this example it won’t have a high number of lookups on the child table using the FK column, “SupplierID” in this example, if we compare with the number of lookups using “ProductID” and probably “ProductName”. The major usage will be to keep the relationship consistent and search in the other direction, finding the supplier for a certain product. In this circumstance, adding an index to the FK child without ensuring the access pattern requires it will add the extra cost of updating the index every time we update the “Product” table.

Another point we need to pay attention to is the index cardinality. If the index cardinality is too low Postgres won’t use it and the index will be just ignored. One can ask why that happens and if that wouldn’t still be cheaper for the database to go, for example, through half of the indexes instead of doing a full table scan? The answer is no, especially for databases that use heap tables like Postgres. The table access in Postgres (heap table) is mostly sequential, which is faster than random access in spinning HDD disks and still a bit faster on SSDs, while b+-tree index access is random by nature.

When Postgres uses an index it needs to open the index file, find the records it needs and then open the table file, do a lookup using the page addresses it got from the indexes changing the access pattern from sequential to random and depending on the data distribution it will probably access the majority of the table pages, ending up in a full table scan but now using random access, which is much more expensive. If we have columns with low cardinality and we really need to index them we need to use an alternative to b-tree indexes, for example, a GIN index, but this is a topic for another discussion.

With all of that, we may think that FK indexes are always evil and never use them on a child table, right? Well, that’s not true either and there are many circumstances they are useful and needed, for example, the below picture has another two tables, “Customer” and “Order”:

FK child table

It can be handy to have an index on the child table “Order->CustomerId” as it’s common to show all orders from a certain user and the column “CustomerId” on the table “Order” will be used quite frequently as the lookup key.

Another good example is to provide a faster method to validate referential integrity. If one needs to change the parent table (update or delete any parent key) the children need to be checked to make sure that the relationship isn’t broken. In this case, having an index on the child’s side would help to improve performance. When it worths the index however is “load dependant”. If the parent key has many deletes it might be a case to consider, however, if it’s a mostly static table or mostly has inserts or updates to the other columns other than the parent key column then it’s not a good candidate to have an index on the children tables.

There are many other examples that can be given to explain why an index on the child table might be useful and worth the extra write cost/penalty.

Conclusion

The takeaway here is that we should not indiscriminately create indexes on all FKs because many of them will just not be used or so rarely used that they aren’t worth the cost. It’s better to initially design the database with the FKs but not the indexes and add them while the database grows and we understand the workload. It’s possible that at some point we find that we need an index on “Product->SupplierId” due to our workload and the index on “Order->CustomerId” isn’t necessary anymore. Loads change and data distribution as well, index shall follow them and not be treated as immutable entities.

As more companies look at migrating away from Oracle or implementing new databases alongside their applications, PostgreSQL is often the best option for those who want to run on open source databases.

Read Our New White Paper:

Why Customers Choose Percona for PostgreSQL

5 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

Serhii Bulieienko

2 years ago

In the Oracle DB the recommendation to create the index on FK columns is to avoid the full table lock when update/delete rows in primary table. It is not for SELECT performance.

Charly Batista

Author

Reply to Serhii Bulieienko

2 years ago

Hi Serhii,

Thanks for the comment. This is the referential integrity benefit Ive mentioned and you are totally correct about deletes, it will be of great help. Regarding updates, it helps when updating the parent key itself. If an update happens to any other column other than the key in the parent table referenced by the FK the constraint doesnt need to be checked and the index would be of no help.

That said, they key points I would keep in mind is:
– Do I have casade deletes on my parent tables?
– If so, do they happen often?
– Do I often update the PK/UK on the parent table?
– Do I often join the tables starting from the parent table?
– Do I often use the FK column in the child table as search condition?

If any of the above is true then an index might be of great help, if not, it probably adds more load than help but I wouldn’t recommend to indiscriminitaly remove or add any index without checking the load.

Daniel Lidström

2 years ago

If I indiscriminately add the indexes then I know I won’t have worse query performance. I might have slightly worse insert performance but in my applications reads have outnumbered writes by several orders usually.

Charly Batista

Author

Reply to Daniel Lidström

2 years ago

Hi Daniel,

Thanks for the comment here.

Regarding the phrase If I indiscriminately add the indexes then I know I won’t have worse query performance,
I would say that’s not necessarily true, and while it’s common to assume that my SELECT performance will not drop because I’ve add a new index it can happen. There are many factors we need to take in consideration like load, cardinality and data distribution for example. I will refer to an old but good post from Laurenz Albe here[1] as example. If we follow his example we can see how the SELECT performance dropped by adding a new index, so the perception that new indexes do not have any negative effect to SELECT’s isn’t necessarily true and I would be cautious to indiscriminately add new indexes.

That said, one thing I like to keep in mind is that we can create and remove new indexes at any time we need/want. Indexes should not be treated as immutable entities. The same way we can check our monitoring system or get a complain from an user/developer and then create a new index to improve query, we can refer to the Postgres statistic tables and check the indexes not used and remove them.

[1] https://www.cybertec-postgresql.com/en/index-decreases-select-performance/

Daniel Lidström

Reply to Charly Batista

2 years ago

Hi Charly, interesting article from the guru. I think his example doesn’t apply to FK indexes. FK indexes shouldn’t be used in sorting operations. They are used when looking up related information. I am certainly not a guru though, but I still think the advice to always index foreign keys is good advice and as you say we can always adjust later if really necessary. A helpful article would be one that explains easy ways to detect those situations.

MySQL 5.7
End of Life

Compare Percona to Leading Database Solutions

Software
Downloads

Product
Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Should I Create an Index on Foreign Keys in PostgreSQL?

Conclusion

Related

Related Blog Articles

RECOMMENDED ARTICLES

Why SELECT COUNT(*) FROM TABLE Is Sometimes Very Slow in MySQL or MariaDB

Partially Rolling Back a Transaction in MySQL or PostgreSQL

JSON_TABLE() Will Be in PostgreSQL 17

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7 End of Life

Compare Percona to Leading Database Solutions

Software Downloads

Product Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Should I Create an Index on Foreign Keys in PostgreSQL?

Conclusion

Related

Share This Post!

Want to get weekly updates listing the latest blog posts?

Related Blog Articles

RECOMMENDED ARTICLES

Why SELECT COUNT(*) FROM TABLE Is Sometimes Very Slow in MySQL or MariaDB

Partially Rolling Back a Transaction in MySQL or PostgreSQL

JSON_TABLE() Will Be in PostgreSQL 17

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7
End of Life

Software
Downloads

Product
Documentation