Новости и Блог Назад

Подписаться
Фильтры

pg_utils: Low_used_indexes.sql, how it works and why it should be lower than 0.01?

If you work with PostgreSQL the chances that you are aware or used our pg_utils are quite high. One of them is low_used_indexes.sql that lately we received a few questions about, so I thought it might be a good idea to write a short blog post about it. 

As you might gather from its name, low_used_indexes.sql goal is to identify indexes that are rarely used. “Rarely” is the key here, since those indexes that  are never used can be discarded and those that are used frequently should definitely be kept. With low_used_indexes.sql we are searching for indexes that need closer investigation since while taking up the precious disk space their usage also can’t be justified from the business logic perspective. 

How is this “rare usage” calculated? Let’s take a look at this example:

SELECT

     pg_stat_user_indexes.schemaname||'.'||pg_stat_user_indexes.relname,

     indexrelname,

     pg_stat_user_indexes.idx_scan,

     (coalesce(n_tup_ins,0)+coalesce(n_tup_upd,0)-coalesce(n_tup_hot_upd,0)+coalesce(n_tup_del,0)) as write_activity,

     pg_stat_user_tables.seq_scan,

     pg_stat_user_tables.n_live_tup,

    pg_size_pretty(pg_relation_size(pg_index.indexrelid::regclass)) as size

from pg_stat_user_indexes

join pg_stat_user_tables

     on pg_stat_user_indexes.relid=pg_stat_user_tables.relid

join pg_index

     ON pg_index.indexrelid=pg_stat_user_indexes.indexrelid

where

     pg_index.indisunique is false

     and pg_stat_user_indexes.idx_scan::float/(coalesce(n_tup_ins,0)+coalesce(n_tup_upd,0)-coalesce(n_tup_hot_upd,0)+coalesce(n_tup_del,0)+1)::float<0.01

     and (coalesce(n_tup_ins,0)+coalesce(n_tup_upd,0)-coalesce(n_tup_hot_upd,0)+coalesce(n_tup_del,0))>10000

order by 4 desc,1,2

At first glance, it’s not clear why pg_stat_user_indexes.idx_scan is divided by write_activity and why it should be lower than 0.01? Afterall, we could choose indexes withidx_scan = 0?

First, unfortunately, among indexes that are seldomly used idx_scan=0 will miss those that are used once in twenty-four hours or once a week or once a month.  

Second, if there are not many writes in the table such rarely used indexes do not pose any threat, that is aside the fact that they still use your disk space. However, if you have rarely used indexes AND your table has many writes you will sooner or later be facing quite severe consequences in terms of disk space. So, in the latter case it is better to delete it and perform a regular sequential scan, especially if pg_stat_user_tables.seq_scan isn’t equal to 0. 

<0.01 is a somewhat rule-of-thumb number that is mainly based on our own experience working with a variety of clients.

Normally indexes that fall into this category are worthwhile reviewing and based on what business logic stands behind them, in what queries they are used and based on that consider if there won’t be any faults in the process if you leave them as they are. 

Usually, the smaller the table the less risky is to remove one of those rarely used indexes, but one should consider any deletion of that level carefully.

Hope this is useful and if you would like us to post more information like this about the rest of our utils let us know in the comments below. 

Вам также может понравиться: