PostgreSQL at Scale: Saving Space (Basically) for Free
Braintree Payments operates dozens of PostgreSQL clusters holding over 100 terabytes of data. At this scale, even a change of a few percentage points in the disk space growth rate can meaningfully impact the writable lifespan of a database cluster. Unfortunately, many ideas for saving disk space require application changes and therefore need to be slotted into product timelines.
But today I want to focus on a technique that saved us approximately 10% of our disk space with very little effort beyond existing processes. In short, carefully choosing column order when creating a table can eliminate padding that would otherwise be needed.
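To make the padding effect concrete, here is a minimal sketch in Python that estimates how many bytes the fixed-width columns of a single row occupy under two different column orders. The table and its column names are hypothetical, and the model is deliberately simplified: it only covers a handful of fixed-width types (using PostgreSQL's standard sizes and alignments for them) and ignores the tuple header, the null bitmap, variable-length columns, and the padding at the end of each row. It is not how pg_column_byte_packer is implemented.

```python
# Fixed-width PostgreSQL types: (size in bytes, required alignment in bytes).
TYPE_LAYOUT = {
    "boolean": (1, 1),
    "smallint": (2, 2),
    "integer": (4, 4),
    "bigint": (8, 8),
    "timestamptz": (8, 8),
}


def row_data_size(columns):
    """Return the bytes used by the column data, including alignment padding."""
    offset = 0
    for _name, type_name in columns:
        size, align = TYPE_LAYOUT[type_name]
        # Pad so that this column starts on a multiple of its alignment.
        offset += (-offset) % align
        offset += size
    return offset


# Hypothetical table, with columns in the order someone might naturally write them.
naive_order = [
    ("is_active", "boolean"),
    ("id", "bigint"),
    ("retries", "smallint"),
    ("created_at", "timestamptz"),
    ("status_code", "integer"),
]

# Same columns, widest alignment first (a simplification of the real heuristics).
packed_order = sorted(
    naive_order, key=lambda col: TYPE_LAYOUT[col[1]][1], reverse=True
)

naive_size = row_data_size(naive_order)
packed_size = row_data_size(packed_order)

print(f"naive order:  {naive_size} bytes per row")   # 36
print(f"packed order: {packed_size} bytes per row")  # 23
```

In the naive order, 7 bytes of padding follow the `boolean` and 6 follow the `smallint` so that the next 8-byte column starts on an 8-byte boundary. Sorting by alignment removes both gaps in this toy example.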
This technique isn’t revolutionary: it’s been well-documented by 2ndQuadrant in On Rocks and Sand, EDB in Data Alignment in PostgreSQL, GitLab in Ordering Table Columns in PostgreSQL, the classic “Column Tetris” answer on Stack Overflow, and I’m sure many more. What I hope we’re bringing to the table is tooling that encodes these ideas so that you don’t have to re-invent the wheel (or apply the technique manually).
Below I’ll describe the rules and heuristics we apply to determine an ideal column ordering. But a list of rules sounds a lot like the definition of an algorithm, and that implies a problem we can tackle at the systems level rather than the people level. Instead of sending a mass email to every engineer writing database DDL changes and expecting them to remember these rules, we authored a Ruby gem called [pg_column_byte_packer](https://github.com/braintree/pg_column_byte_packer)
to automate the solution in our development cycle. We'll talk more about that soon, but first let's take a m...