InnoDB with reduced page sizes wastes up to 6% of disk space

In “InnoDB bugs found during research on InnoDB data storage” I mentioned MySQL Bug #67963, which was then titled “InnoDB wastes 62 out of every 16384 pages”. I said:

InnoDB needs to occasionally allocate some internal bookkeeping pages; two for every 256 MiB of data. In order to do so, it allocates an extent (64 pages), allocates the two pages it needed, and then adds the remainder of the extent (62 free pages) to a list of extents to be used for single page allocations called FREE_FRAG. Almost nothing allocates pages from that list, so these pages go to waste.

This is fairly subtle, wasting only 0.37% of disk space in any large InnoDB table, but nonetheless interesting and quite fixable.

Wasting 0.37% of disk space was unfortunate, but not a huge problem…

MySQL 5.6 brings adjustable page sizes

Since MySQL 5.6, InnoDB supports an adjustable page size through the new configuration parameter innodb_page_size [1], allowing you to use 4 KiB or 8 KiB pages instead of the default 16 KiB pages. I won’t go into the reasons why you would want to reduce the page size here. Instead, coming back to MySQL Bug #67963: neither the number 62 nor the number 16384 is fixed; both in fact vary with the page size.

The number 62 actually comes from the size of the extent, in pages. For 16 KiB pages, with 1 MiB extents, this works out to 1048576 / 16384 = 64 pages per extent. Since two pages are stolen for bookkeeping, that leaves the 62 pages above.

The number 16384 comes from InnoDB’s need to repeat these bookkeeping pages at regular intervals. The interval, counted in pages, is numerically equal to the page size in bytes [2], so with 16 KiB pages the bookkeeping pages repeat every 16,384 pages.

What if we use 8 KiB pages instead, by setting innodb_page_size=8k in the configuration? The number of pages per extent changes to 1048576 / 8192 = 128. The bookkeeping pages now repeat every 8192 pages. So we waste 126 / 8192 = ~1.5% of disk space to this bug.

And if we use 4 KiB pages, by setting innodb_page_size=4k in the configuration? The number of pages per extent changes to 1048576 / 4096 = 256. The bookkeeping pages now repeat every 4096 pages. So we waste 254 / 4096 = ~6.2% of disk space to this bug.
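The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation in this post, not anything taken from the InnoDB source; the names EXTENT_BYTES, BOOKKEEPING_PAGES and wasted_fraction are made up for the example, and it assumes an uncompressed table with the nominal 1 MiB extent.

```python
# Hypothetical helper that mirrors the arithmetic in the text (not InnoDB code).
EXTENT_BYTES = 1024 * 1024   # nominal 1 MiB extent
BOOKKEEPING_PAGES = 2        # pages actually used from each bookkeeping extent


def wasted_fraction(page_size: int) -> float:
    """Fraction of pages stranded on FREE_FRAG for an uncompressed table."""
    pages_per_extent = EXTENT_BYTES // page_size
    wasted_pages = pages_per_extent - BOOKKEEPING_PAGES
    # The bookkeeping pages repeat every `page_size` pages.
    repeat_interval = page_size
    return wasted_pages / repeat_interval


for page_size in (16384, 8192, 4096):
    kib = page_size // 1024
    print(f"{kib:>2} KiB pages: {wasted_fraction(page_size):.2%} wasted")
```

Note that 62 / 16384 is 0.378%, so the 16 KiB case prints as 0.38% when rounded to two decimals.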

An aside: When is an extent not an extent?

An interesting aside to all of this is that, although the manual claims an extent is always 1 MiB, in InnoDB that is not actually the case. An extent is really (1048576 / innodb_page_size) * table_page_size bytes. As far as I can tell this was more or less a mistake in the InnoDB compression code; it should have used the table’s actual page size (which comes from KEY_BLOCK_SIZE, aka zip_size, for compressed tables) rather than the system default page size (UNIV_PAGE_SIZE), which was at the time fixed at compile time.

So, for a system with innodb_page_size=16k (the default), and a table created with ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8, the “extent” is actually only 512 KiB.
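As a rough sketch of that formula, assuming it holds exactly as stated here, the effective extent size can be computed as follows. The function name effective_extent_bytes and its parameters are invented for this example.

```python
# Hypothetical helper for the extent-size formula given in the text.
NOMINAL_EXTENT_BYTES = 1024 * 1024  # the 1 MiB the manual describes


def effective_extent_bytes(innodb_page_size, table_page_size=None):
    """(1048576 / innodb_page_size) * table_page_size, in bytes."""
    if table_page_size is None:
        # Uncompressed table: the table uses the system page size.
        table_page_size = innodb_page_size
    pages_per_extent = NOMINAL_EXTENT_BYTES // innodb_page_size
    return pages_per_extent * table_page_size


# innodb_page_size=16k with ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8
print(effective_extent_bytes(16384, 8 * 1024) // 1024, "KiB")
```

For an uncompressed table the two page sizes cancel out and the result is always the nominal 1 MiB, which is presumably why the discrepancy only shows up with compression.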

The bug gets even worse if you mix InnoDB compression in…

If you mix the new configurable page size feature with InnoDB compression, due to the above weirdness with how extent size really works, you can get some pretty interesting results.

For a system with innodb_page_size=4k and a table created with ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=1, the system actually wastes 254 / 1024 = ~24.8% (!!!) of the disk space to this bug (in other words, every 4th extent will be an unusable fragment extent).
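The same calculation for compressed tables might look like the sketch below. It assumes, as the 254 / 1024 figure above implies, that the bookkeeping pages repeat every table_page_size pages while the pages-per-extent count still comes from innodb_page_size. The function name and parameters are hypothetical.

```python
# Hypothetical helper; the repeat interval is inferred from the figures in the text.
def wasted_fraction_compressed(innodb_page_size, key_block_size_kib):
    """Fraction of a compressed table's pages stranded on FREE_FRAG."""
    table_page_size = key_block_size_kib * 1024
    # Pages per extent is derived from the system page size...
    pages_per_extent = (1024 * 1024) // innodb_page_size
    wasted_pages = pages_per_extent - 2  # two pages are used for bookkeeping
    # ...but the bookkeeping pages repeat based on the table's page size.
    repeat_interval = table_page_size
    return wasted_pages / repeat_interval


# innodb_page_size=4k with ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=1
print(f"{wasted_fraction_compressed(4096, 1):.1%}")
```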

A new title for Bug #67963, and a conclusion

I updated Bug #67963 to add the above and changed the title to “InnoDB wastes almost one extent out of every innodb_page_size pages” to reflect reality slightly more accurately.

If you were thinking about using 4k pages in your systems, you may want to subscribe to the bug, and maybe hold off, unless you can afford to waste more than 6% of your disk space (in addition to all other waste).

[1] Prior to MySQL 5.6, you could always have changed it by changing UNIV_PAGE_SIZE in the source code and recompiling.

[2] As the page size is reduced, there is less space available for the bitmaps that need to be stored in the XDES page, and reducing the number of pages described by each XDES page proportionally with the page size is a good enough way to handle that.
