Deep dive into postgres stats: pg_stat_bgwriter reports

Everything you always wanted to know about Postgres stats

Today I would like to take a small detour from the main series and dive into pg_stat_bgwriter. If you have been following my previous posts, you will remember that the pg_stat_bgwriter view holds summary statistics about the bgwriter and the checkpointer. Here I would like to show an interesting report query based on pg_stat_bgwriter. The original query was found in the postgres mailing lists, shared by my colleague Victor Yegorov, and slightly modified by me. The report provides comprehensive information about bgwriter and checkpointer activity and helps to configure them better.
A small recommendation: run this query with expanded output in psql. The report produces a single row and looks like this:

-[ RECORD 1 ]-------------------------+--------------------------------------
Uptime                                | 112 days 12:58:21.394156
Since stats reset                     | 132 days 14:37:34.149519
Forced checkpoint ratio (%)           | 7.6
Minutes between checkpoints           | 56.61
Average write time per checkpoint (s) | 775.36
Average sync time per checkpoint (s)  | 0.21
Total MB written                      | 4194915.7
MB per checkpoint                     | 105.26
Checkpoint MBps                       | 0.03
Bgwriter MBps                         | 0.24
Backend MBps                          | 0.10
Total MBps                            | 0.37
New buffer allocation ratio           | 34.925
Clean by checkpoints (%)              | 8.5
Clean by bgwriter (%)                 | 64.3
Clean by backends (%)                 | 27.2
Bgwriter halt-only length (buffers)   | 0.00
Bgwriter halt ratio (%)               | 0.25
------------------------------------- | --------------------------------------
checkpoints_timed                     | 3117
checkpoints_req                       | 256
checkpoint_write_time                 | 2615284047
checkpoint_sync_time                  | 716686
buffers_checkpoint                    | 45447168
buffers_clean                         | 345439953
maxwritten_clean                      | 861
buffers_backend                       | 146062083
buffers_backend_fsync                 | 0
buffers_alloc                         | 18752891779
stats_reset                           | 2016-10-12 23:41:35.737168+03
total_checkpoints                     | 3373
total_buffers                         | 536949204
startup                               | 2016-11-02 01:20:48.492531+03
checkpoint_timeout                    | 1h
checkpoint_segments                   | 256
checkpoint_completion_target          | 0.9

The report consists of two parts separated by a horizontal dashed line: the first part is the report itself, and the second holds the raw values from pg_stat_bgwriter and the auxiliary items used in the report.

The first part is more interesting and here is why.

The first values we see are the stats interval and the uptime. The general idea is that the longer the interval since the last stats reset, the less accurate the report becomes, because the numbers are averaged over the whole period. I therefore recommend resetting the stats periodically, weekly or monthly. The uptime only shows how long postgres has been running. Note, however, that in the example the uptime is shorter than the stats collection interval. That is not bad in itself, but for your own report try to use stats collected only after postgres startup, since at shutdown postgres runs a forced checkpoint that is unrelated to the workload but can affect the stats.
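As a rough illustration of this check, the sketch below compares the two timestamps from the raw part of the sample report. The variable names are my own and are not part of the report query; the timestamps are copied from the stats_reset and startup fields.

```python
from datetime import datetime, timedelta, timezone

# Timestamps copied from the raw part of the sample report (UTC+3).
tz = timezone(timedelta(hours=3))
stats_reset = datetime(2016, 10, 12, 23, 41, 35, 737168, tzinfo=tz)
startup = datetime(2016, 11, 2, 1, 20, 48, 492531, tzinfo=tz)

# If postgres started after the last stats reset, the stats span a restart
# and therefore include the forced checkpoint made at shutdown.
stats_span_restart = startup > stats_reset
gap = startup - stats_reset

print(stats_span_restart)  # True
print(gap)                 # 20 days, 1:39:12.755363
```

The gap equals the difference between "Since stats reset" and "Uptime" in the report, so the sample stats cover about 20 days of the previous postgres run.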

Next comes the information about checkpoints. “Forced checkpoint ratio” is the share of checkpoints that were triggered by xlog. The current value is 7.6 and that is good enough. High values, for example more than 40%, indicate that xlog checkpoints occur too frequently. As you might remember, xlog checkpoints are less desirable than timed checkpoints, so the general idea is to reduce the number of xlog checkpoints by increasing the amount of WAL required to trigger a checkpoint. “Minutes between checkpoints” is the average interval between checkpoints. When everything is OK, this value should be close to checkpoint_timeout. Values significantly lower than checkpoint_timeout also indicate frequent xlog checkpoints. The general recommendation in both cases is to raise max_wal_size (or checkpoint_segments for 9.4 and older).
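The two checkpoint figures can be reproduced from the raw counters in the second half of the sample report. This is a minimal sketch of the arithmetic, mirroring the expressions in the report query; the variable names are mine.

```python
# Raw counters copied from the sample report.
checkpoints_timed = 3117
checkpoints_req = 256

# "Since stats reset" is 132 days 14:37:34; the query rounds it to minutes.
seconds_since_reset = 132 * 86400 + 14 * 3600 + 37 * 60 + 34
minutes_since_reset = round(seconds_since_reset / 60)

total_checkpoints = checkpoints_timed + checkpoints_req

# Share of requested (forced) checkpoints among all checkpoints, in percent.
forced_ratio = round(100.0 * checkpoints_req / total_checkpoints, 1)

# Average interval between checkpoints, in minutes.
minutes_between = round(minutes_since_reset / total_checkpoints, 2)

print(forced_ratio)     # 7.6
print(minutes_between)  # 56.61
```

With checkpoint_timeout set to 1h, an average interval of 56.61 minutes is close to the timeout, which agrees with the low forced checkpoint ratio.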

Next are the average write and sync times in seconds. The write time is close to 13 minutes, and that is a good value: it shows that the write stage of checkpoints completed quickly enough relative to the long interval between checkpoints. Values close to the interval between checkpoints mentioned earlier are not good; they indicate that the storage spent too much time writing buffers. The average sync time should be near zero; values far from zero indicate poor storage performance.
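The averages can be checked against the raw counters, which pg_stat_bgwriter reports in milliseconds. The sketch below follows the expressions in the report query; the last figure, the share of time spent in the write stage, is my own addition for illustration and is not a field of the report.

```python
# Raw counters copied from the sample report (times are in milliseconds).
checkpoint_write_time = 2615284047
checkpoint_sync_time = 716686
total_checkpoints = 3117 + 256
minutes_since_reset = 190958  # 132 days 14:37:34, rounded to minutes

# Average seconds spent per checkpoint in the write and sync stages.
avg_write_s = round(checkpoint_write_time / (total_checkpoints * 1000), 2)
avg_sync_s = round(checkpoint_sync_time / (total_checkpoints * 1000), 2)

# Illustrative only: percentage of the whole stats interval that the
# checkpointer spent in the write stage.
write_share = round(100.0 * (checkpoint_write_time / 1000) / (minutes_since_reset * 60), 1)

print(avg_write_s)  # 775.36
print(avg_sync_s)   # 0.21
print(write_share)  # 22.8
```

An average write stage of about 775 seconds is roughly 13 minutes out of an average interval of about 57 minutes.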

The next items are informational: they describe the average throughput of the checkpointer, bgwriter and backends and give us additional information about the workload. “Total MB written” is the amount of data written by all three kinds of processes. “MB per checkpoint” is the average amount written per checkpoint. The following values are measured in MBps (megabytes per second) and show per-process throughput. In the example above the values are low, less than 1 MBps, which means either that the server does not produce much dirty data or that the report was built over a long stats interval and the throughput is averaged out over it.
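The throughput figures can be recomputed from the buffer counters. This sketch assumes the default block size of 8192 bytes (the query reads it from block_size) and assumes that the bgwriter, backend and total MBps fields use the same formula as "Checkpoint MBps"; the helper name mbps is hypothetical.

```python
# Raw counters copied from the sample report.
buffers_checkpoint = 45447168
buffers_clean = 345439953
buffers_backend = 146062083
total_checkpoints = 3117 + 256
minutes_since_reset = 190958  # 132 days 14:37:34, rounded to minutes

# Assumption: default block_size of 8192 bytes, i.e. 128 pages per MB.
pages_per_mb = 1024 * 1024 / 8192
total_buffers = buffers_checkpoint + buffers_clean + buffers_backend

def mbps(buffers):
    """Average megabytes per second over the whole stats interval."""
    return round(buffers / (pages_per_mb * minutes_since_reset * 60), 2)

total_mb = round(total_buffers / pages_per_mb, 1)
mb_per_checkpoint = round(buffers_checkpoint / (pages_per_mb * total_checkpoints), 2)

print(total_mb)                  # 4194915.7
print(mb_per_checkpoint)         # 105.26
print(mbps(buffers_checkpoint))  # 0.03
print(mbps(buffers_clean))       # 0.24
print(mbps(buffers_backend))     # 0.1
print(mbps(total_buffers))       # 0.37
```

The results match the sample report, which suggests the assumptions hold for this example.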

The “New buffer allocation ratio” field is the ratio of newly allocated buffers to all written buffers. When backends handle data, they first check whether the data is already in the shared buffers area. If the required data is not there, backends allocate new buffers, load the data from main storage into shared buffers and then process it (see the BufferAlloc() and StrategyGetBuffer() functions for details). Thus a high number here tells us that backends allocated a lot of buffers because the required data was not in shared buffers.

There are two possible reasons for this. The first is that backends read rarely used “cold” data, such as old archived partitions. The second is that previously used data was evicted from shared buffers because there are too few of them. That is not all, however: this number shows how many times more data was read into shared buffers than was written out of them. This item is best read together with the cache hit ratio: a high allocation ratio combined with a low cache hit ratio can indicate insufficient shared buffers, though it is hard to know for sure.
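As a small worked example, the ratio can be recomputed from the raw counters of the sample report. The formula is inferred from the field description (allocated buffers divided by all written buffers) rather than copied from the query, so treat it as an assumption.

```python
# Raw counters copied from the sample report.
buffers_alloc = 18752891779
buffers_checkpoint = 45447168
buffers_clean = 345439953
buffers_backend = 146062083

# All buffers written out by the checkpointer, bgwriter and backends.
total_buffers = buffers_checkpoint + buffers_clean + buffers_backend

# Assumed formula: buffers allocated per buffer written.
new_buffer_allocation_ratio = round(buffers_alloc / total_buffers, 3)

print(new_buffer_allocation_ratio)  # 34.925
```

In other words, in the sample about 35 buffers were allocated for every buffer written out.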

The next set of values shows what percentage of buffers was cleaned by each kind of process. A high “Clean by checkpoints” value is a sign of a write-intensive workload. A high “Clean by bgwriter” value points to a read workload. A high “Clean by backends” value is a sign that backends did a lot of the bgwriter's work, and that is not good: values above 50% point to ineffective bgwriter settings, and in this case I would suggest making it more aggressive.
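The three percentages are each counter's share of all written buffers. The sketch below recomputes them from the sample report and applies the 50% rule of thumb mentioned above; the helper name percent and the flag name are hypothetical.

```python
# Raw counters copied from the sample report.
buffers_checkpoint = 45447168
buffers_clean = 345439953
buffers_backend = 146062083

total_buffers = buffers_checkpoint + buffers_clean + buffers_backend

def percent(buffers):
    """Share of all written buffers, in percent, rounded as in the report."""
    return round(100.0 * buffers / total_buffers, 1)

clean_by_checkpoints = percent(buffers_checkpoint)
clean_by_bgwriter = percent(buffers_clean)
clean_by_backends = percent(buffers_backend)

# Rule of thumb from the text: above 50% the bgwriter settings look ineffective.
bgwriter_looks_ineffective = clean_by_backends > 50

print(clean_by_checkpoints)        # 8.5
print(clean_by_bgwriter)           # 64.3
print(clean_by_backends)           # 27.2
print(bgwriter_looks_ineffective)  # False
```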

The next values are “Bgwriter halt-only length” and “Bgwriter halt ratio”. They describe how often the bgwriter stopped a cleaning round because it exceeded bgwriter_lru_maxpages. The values in our example are perfect; high values, conversely, indicate that the bgwriter went to sleep too frequently and did not do its work fast enough. In this case I would recommend configuring the bgwriter more aggressively: decrease bgwriter_delay and increase bgwriter_lru_maxpages.

The second part of the report contains the raw values from pg_stat_bgwriter and the configuration parameters related to the bgwriter and checkpointer. They are used in the report's query, so you do not need to look them up with separate queries.

Here are a few reports, with my comments, from different pgbench workloads:

here is eight hours of read-only workload with 4GB shared buffers (default bgwriter)
here is eight hours of read-only workload with 8GB shared buffers (default bgwriter)
here is eight hours of read/write workload with 4GB shared buffers (default bgwriter)
here is eight hours of read/write workload with 8GB shared buffers (default bgwriter)
here is eight hours of read/write workload with 8GB shared buffers (aggressive bgwriter)

Tests were performed on a server with 16 CPUs, 32GB RAM, RAID1 on 2xSSD (datadir) and RAID1 on 2xSAS (wal), running PostgreSQL 9.6.2 with a 96GB test database.

That is all for this time and I hope you enjoyed this post and found it helpful.

Comments: 6

Unknown:

March 30, 2017 at 12:21 PM

Alexey, thanks for this great explanation!
I noticed that in the read/write workload reports with default bgwriter settings, changing only shared_buffers from 4GB to 8GB reduces the fsync time from 8ms to 1.85ms. Why does one affect the other?

Alexey Lesovsky:

March 31, 2017 at 7:08 AM

Yep, there were no changes except shared_buffers, and the sync time seemed suspicious to me too. In the end I think the issue lies in the customer's SSDs, and I don't have any other ideas. The way to confirm or refute that is to check the disks with fio and look at the latency distribution.

amine:

October 15, 2020 at 1:19 PM

Thanks for sharing. In my db environment, I see that backends still write dirty pages even though I increased the bgwriter values a lot. And note that the maxwritten_clean value is 0. Where could the problem be? Why are backends still writing dirty pages?

checkpoints_timed | 13
checkpoints_req | 1
checkpoint_write_time | 19993770
checkpoint_sync_time | 642
buffers_checkpoint | 522273
buffers_clean | 107335
maxwritten_clean | 0
buffers_backend | 394018
buffers_backend_fsync | 0
buffers_alloc | 3148261
stats_reset | 2020-10-15 09:35:03.154167+03
total_checkpoints | 14
total_buffers | 1023626
startup | 2020-02-12 22:22:34.563076+03
checkpoint_timeout | 30min
max_wal_size | 15GB
checkpoint_completion_target | 0.9
bgwriter_delay | 10ms
bgwriter_lru_maxpages | 100000
bgwriter_lru_multiplier | 10

- Valeria K:
  
  October 16, 2020 at 9:13 AM
  
  Hi, Amine!
  The timestamp in stats_reset indicates that the stats accumulated by pg_stat_bgwriter cover a period of time that is too short. For a better analysis, bgwriter/checkpointer stats should be collected over a longer period, at least one to three weeks.
  In either case, it’s impossible to completely avoid writes by backends, but it’s important to minimize them and ensure that these writes are smaller than those made by bgwriter/checkpointer.
  
amine:

October 17, 2020 at 5:53 AM

Hi Valeria
Yes, I reset the bgwriter statistics shortly before, because I saw that the buffers_backend value was far higher than buffers_clean. I tuned the bgwriter and checkpointer again and then reset the bgwriter statistics. But as I said, the situation has not changed after the reset. Thank you, I will watch for a few more weeks.

- Alexey Lesovsky:
  
  October 26, 2020 at 11:31 AM
  
  Hi, Amine!
  After changing settings, it’s better to observe the behavior using monitoring. With charts it’s easier (and faster) to see changes in postgres behavior.
  



The part of the report query that covers the uptime and checkpoint fields:

SELECT
  now() - pg_postmaster_start_time() AS "Uptime",
  now() - stats_reset AS "Since stats reset",
  round(100.0 * checkpoints_req / checkpoints, 1) AS "Forced checkpoint ratio (%)",
  round(min_since_reset / checkpoints, 2) AS "Minutes between checkpoints",
  -- checkpoint_write_time and checkpoint_sync_time are in milliseconds
  round(checkpoint_write_time::numeric / (checkpoints * 1000), 2) AS "Average write time per checkpoint (s)",
  round(checkpoint_sync_time::numeric / (checkpoints * 1000), 2) AS "Average sync time per checkpoint (s)",
  round(total_buffers / pages_per_mb, 1) AS "Total MB written",
  round(buffers_checkpoint / (pages_per_mb * checkpoints), 2) AS "MB per checkpoint",
  round(buffers_checkpoint / (pages_per_mb * min_since_reset * 60), 2) AS "Checkpoint MBps"
FROM (
  SELECT
    checkpoints_req,
    checkpoints_timed + checkpoints_req AS checkpoints,
    checkpoint_write_time,
    checkpoint_sync_time,
    buffers_checkpoint,
    buffers_checkpoint + buffers_clean + buffers_backend AS total_buffers,
    stats_reset,
    round(extract('epoch' from now() - stats_reset) / 60)::numeric AS min_since_reset,
    (1024.0 * 1024 / (current_setting('block_size')::numeric)) AS pages_per_mb
  FROM pg_stat_bgwriter
) bg;
