[bug]: PostgreSQL backend causes massive CPU load due to decayedlogdb_kv table access patterns #9867

@warioishere

Description

Context

I'm running a large routing node with LND using the new PostgreSQL backend.

After migrating my node from bbolt to Postgres with the latest version of lndinit, I'm seeing constantly high CPU usage (~400% on a 6-core machine) and extremely frequent read operations caused by queries on the decayedlogdb_kv table. The table is also very large, around 11 GB. I don't know why it is exactly 11 GB; it may have been added during the lndinit migration, but I no longer have the logs from that.

schemaname |         table_name          | indexes_size 
------------+-----------------------------+--------------
 public     | decayedlogdb_kv             | 11 GB
 public     | channeldb_kv                | 2255 MB
 public     | invoices                    | 57 MB
 public     | invoice_payment_hashes      | 20 MB
 public     | invoice_features            | 19 MB
 public     | invoice_htlcs               | 15 MB
 public     | towerclientdb_kv            | 8216 kB
 public     | walletdb_kv                 | 3072 kB
 public     | invoice_events              | 208 kB
 public     | macaroondb_kv               | 64 kB
 public     | invoice_htlc_custom_records | 40 kB
 public     | towerserverdb_kv            | 32 kB
 public     | amp_sub_invoices            | 24 kB
 public     | amp_sub_invoice_htlcs       | 24 kB
 public     | invoice_event_types         | 16 kB
 public     | migration_tracker           | 16 kB
 public     | invoice_sequences           | 16 kB
 public     | schema_migrations           | 16 kB
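
For reference, the sizes above come from the system catalogs; roughly something like this (generic catalog query, ordered by index size):

-- indexes_size per table, as shown above (approximate reconstruction)
SELECT schemaname,
       relname AS table_name,
       pg_size_pretty(pg_indexes_size(relid)) AS indexes_size
FROM pg_statio_user_tables
ORDER BY pg_indexes_size(relid) DESC;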

The high CPU load appears to be caused by a massive number of repeated queries filtering on parent_id, which do not seem to make proper use of the existing partial index on (parent_id, key).
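
To double-check that claim, the actual index definitions can be listed with a generic catalog query (nothing LND-specific):

-- list all indexes on decayedlogdb_kv and the columns they cover
SELECT indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'public'
  AND tablename  = 'decayedlogdb_kv';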

System Info

  • LND: latest version (v0.19.0-beta)
  • PostgreSQL: 17
  • native SQL invoices enabled and migrated
  • RAM: 16 GB
  • CPU: 6 cores; average usage 30–60% on the Proxmox scale (versus an average of 8%, up to 15% under high load, with bbolt), or around 50% on the htop scale
  • Shared buffers: 8 GB

Index Size

lndb=# SELECT 
     pg_size_pretty(pg_total_relation_size('public.decayedlogdb_kv')) AS total_size,
     pg_size_pretty(pg_indexes_size('public.decayedlogdb_kv')) AS index_size,
     pg_size_pretty(pg_relation_size('public.decayedlogdb_kv')) AS data_size;
 total_size | index_size | data_size 
------------+------------+-----------
 21 GB      | 11 GB      | 9744 MB
(1 row)

Should the index really be bigger than the actual data?
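
If it helps, I can also provide a per-index breakdown to see which index holds most of the 11 GB and receives most of the scans (generic catalog query):

-- size and scan counters per index on decayedlogdb_kv
SELECT indexrelname AS index_name,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
       idx_scan,
       idx_tup_read
FROM pg_stat_user_indexes
WHERE relname = 'decayedlogdb_kv';

If the index turns out to be bloated from the constant insert/delete churn, REINDEX INDEX CONCURRENTLY might shrink it, but I haven't tried that yet.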

Example problematic query (repeated thousands of times per second):

lndb=# SELECT key, value
FROM decayedlogdb_kv
WHERE parent_id = 149208614
ORDER BY key
LIMIT 1;
                    key                     |   value    
--------------------------------------------+------------
 \x00000bfe0e6696137652a9a475dbb8c0538a045b | \x000db65b
(1 row)

Observations

Table decayedlogdb_kv has ~150 million rows, index size: 11 GB
Constant insert + read pattern

EXPLAIN (ANALYZE, BUFFERS)
SELECT key, value
FROM decayedlogdb_kv                            
WHERE parent_id = 149208614
ORDER BY key
LIMIT 1;
                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.70..1.70 rows=1 width=18) (actual time=45.457..45.457 rows=1 loops=1)
   Buffers: shared hit=3766
   ->  Sort  (cost=1.70..1.70 rows=1 width=18) (actual time=45.456..45.456 rows=1 loops=1)
         Sort Key: key
         Sort Method: top-N heapsort  Memory: 25kB
         Buffers: shared hit=3766
         ->  Index Scan using decayedlogdb_kv_p on decayedlogdb_kv  (cost=0.57..1.69 rows=1 width=18) (actual time=0.054..29.719 rows=232662 loops=1)
               Index Cond: (parent_id = 149208614)
               Buffers: shared hit=3763
 Planning:
   Buffers: shared hit=58
 Planning Time: 0.325 ms
 Execution Time: 45.469 ms
(13 rows)

Could the high CPU usage in the PostgreSQL backend be caused by inefficient query planning on the decayedlogdb_kv table?

Specifically, is it possible that PostgreSQL often ignores the optimal index and falls back to a less efficient plan, reading and sorting hundreds of thousands of rows just to fetch a single result? This seems to happen constantly on large routing nodes and may explain the extreme load.

Has this been analyzed or optimized already?
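
One thing I could try on my side, purely as a diagnostic (not verified to change anything): the planner estimates rows=1 for this parent_id while the actual count is ~232k, so the column statistics may be off. Raising the statistics target and re-analyzing would at least rule that out:

-- diagnostic only: improve planner statistics for parent_id, then re-check the plan
ALTER TABLE decayedlogdb_kv ALTER COLUMN parent_id SET STATISTICS 1000;
ANALYZE decayedlogdb_kv;

EXPLAIN (ANALYZE, BUFFERS)
SELECT key, value
FROM decayedlogdb_kv
WHERE parent_id = 149208614
ORDER BY key
LIMIT 1;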

Some more stats:

relname         | idx_scan     | idx_tup_fetch     | total_reads
----------------+--------------+-------------------+-------------
decayedlogdb_kv | 456,924,012  | 462,423,861,984   | 6.6 billion

→ This accounts for 95%+ of total PostgreSQL I/O
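
These numbers come from the cumulative statistics views; roughly like this (total_reads here is heap_blks_hit + heap_blks_read, which is my interpretation):

-- scan and fetch counters for decayedlogdb_kv since the last stats reset
SELECT t.relname,
       t.idx_scan,
       t.idx_tup_fetch,
       s.heap_blks_hit + s.heap_blks_read AS total_reads
FROM pg_stat_user_tables t
JOIN pg_statio_user_tables s USING (relid)
WHERE t.relname = 'decayedlogdb_kv';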

Hit Ratio of the tables:

lndb=# SELECT
  schemaname,
  relname AS table_name,
  heap_blks_hit,
  heap_blks_read,
  CASE
    WHEN (heap_blks_hit + heap_blks_read) = 0 THEN NULL
    ELSE ROUND(100.0 * heap_blks_hit / (heap_blks_hit + heap_blks_read), 2)
  END AS hit_ratio
FROM
  pg_statio_user_tables
ORDER BY
  hit_ratio DESC NULLS LAST;
 schemaname |         table_name          | heap_blks_hit | heap_blks_read | hit_ratio 
------------+-----------------------------+---------------+----------------+-----------
 public     | towerclientdb_kv            |     115920334 |           2465 |    100.00
 public     | schema_migrations           |            12 |              0 |    100.00
 public     | macaroondb_kv               |        599226 |             27 |    100.00
 public     | invoice_htlc_custom_records |        322927 |             48 |     99.99
 public     | walletdb_kv                 |       3117766 |           1257 |     99.96
 public     | invoice_events              |         34014 |             38 |     99.89
 public     | invoice_event_types         |          7836 |              9 |     99.89
 public     | channeldb_kv                |    7406342580 |        8896397 |     99.88
 public     | decayedlogdb_kv             |    6603038283 |        9030643 |     99.86
 public     | invoice_sequences           |          1618 |              8 |     99.51
 public     | invoice_features            |       1102001 |           6663 |     99.40
 public     | invoice_htlcs               |        664875 |           5322 |     99.21
 public     | invoice_payment_hashes      |        746256 |           6388 |     99.15
 public     | invoices                    |       1447343 |          32033 |     97.83
 public     | migration_tracker           |             7 |              7 |     50.00
 public     | amp_sub_invoices            |             0 |              0 |          
 public     | towerserverdb_kv            |             0 |              0 |          
 public     | amp_sub_invoice_htlcs       |             0 |              0 |          
(18 rows)
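
If needed, I can pull the same hit ratio at the index level for decayedlogdb_kv (same idea, via pg_statio_user_indexes):

-- index-level cache hit ratio for decayedlogdb_kv
SELECT indexrelname,
       idx_blks_hit,
       idx_blks_read,
       ROUND(100.0 * idx_blks_hit / NULLIF(idx_blks_hit + idx_blks_read, 0), 2) AS hit_ratio
FROM pg_statio_user_indexes
WHERE relname = 'decayedlogdb_kv';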

Suggestions / Questions

VACUUM FULL didn't help at all on decayedlogdb_kv.
My channel.db was 6.8 GB in size before migration, my sphinxreplay.db around 3.8 GB.

channeldb_kv is also heavily used (as expected) but not causing the same performance issues. It's the repeated access pattern on decayedlogdb_kv that is clearly the bottleneck on larger nodes.

Let me know if I can help further. I can provide full EXPLAIN ANALYZE, table schemas, and stat snapshots if needed.
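
I can also capture a pg_stat_statements snapshot to show which statements dominate the load, assuming the extension is enabled (it requires shared_preload_libraries, so it is not on by default; the column names below are the PostgreSQL 13+ ones):

-- top 10 statements by total execution time
SELECT calls,
       ROUND(total_exec_time::numeric, 0) AS total_ms,
       ROUND(mean_exec_time::numeric, 3)  AS mean_ms,
       LEFT(query, 80)                    AS query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;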

Thanks for all the great work – the PostgreSQL backend is promising, but currently not usable in production on large nodes due to this bottleneck. I used to see a CPU load of about 25% on this 6-core machine; now it's around 450%.

I run about 50 channels and rebalance heavily.

CPU load before and after the migration from bbolt:

[image: CPU load graph]

I followed the PostgreSQL optimizations suggested in this gist:

https://gist.github.com/djkazic/526fa3e032aea9578997f88b45b91fb9

Possibly relevant issue:

#7107 (comment)

Labels: P0 (very high priority issue/PR, blocker on all others), advanced (issues suitable for very experienced developers), bug (unintended code behaviour), database (related to the database/storage of LND), postgres
