The pitfalls of partitioning Postgres yourself
Posted by abelanger 4 days ago
Comments
Comment by abelanger 1 day ago
(plus an interesting discussion in the comments of that post on how the query planner chose a certain row estimate in the specific case that Laurenz shared!)
The other thing I'll add is that we still haven't figured out:
1. An optimal ANALYZE schedule here on parent partitions; we're opting to over-analyze than under-analyze at the moment, because it seems like our query distribution might change quite often.
2. Whether double-partitioned tables (we have some tables partitioned by time series first, and an enum value second) need analyze on the intermediate tables, or whether the top-level parent and bottom-level child tables are enough. So far just the top-level and leaf tables seem good enough.
Comment by mnahkies 1 day ago
But TIL, I didn't realize you could do multiple levels of partitioning in modern postgres, found this old blog post that touches on it https://joaodlf.com/postgresql-10-partitions-of-partitions.h...
Something that stresses me is the number of partitions - we have some weekly partitions that have a long retention period, and whilst it hasn't become a problem yet, it feels like a ticking time bomb as the years go on.
Would a multi level partitioning scheme of say year -> week be a feasible way to side step the issues of growing partition counts?
Comment by cldellow 1 day ago
It prompted Laurenz to submit the documentation patch that is cited in the article. In the discussion of the patch itself, people seem to conclude that it's a good improvement to the docs, but that the behaviour itself is a bit of a footgun. [2]
[1]: https://stackoverflow.com/questions/73951604/autovacuum-and-...
[2]: https://www.postgresql.org/message-id/Y8cQJIMFAe7QT73/%40mom...
Comment by wreath 17 hours ago
Comment by groundzeros2015 1 day ago
Comment by Tostino 1 day ago