• Latest Articles
  • Atom Feed
  • About
  • When Naming Matters Tommie's blog

    The PostgreSQL database engine has something called pg_xact. It records the status of transactions. It’s a crucial piece of the transaction system, because it says whether a specific transaction was committed or aborted. It was added in 2001, where it was a directory called pg_clog. That directory lives under the PGDATA directory, whereas previously, pg_log was a system table.

    From the commit message, it seems this was done to simplify the code:

    Replace implementation of pg_log as a relation accessed through the buffer manager with ‘pg_clog’, a specialized access method modeled on pg_xlog. This simplifies startup (don’t need to play games to open pg_log; among other things, OverrideTransactionSystem goes away), should improve performance a little, and opens the door to recycling commit log space by removing no-longer-needed segments of the commit log.

    However, 16 years later, the directory was renamed to pg_xact, because

    Names containing the letters “log” sometimes confuse users into believing that only non-critical data is present. It is hoped this renaming will discourage ill-considered removals of transaction status data.

    This interesting change can be traced back to a discussion in 2015:

    While anyone who is familiar with postgres would never do something as stupid as to delete pg_xlog, according to Google, there appears to be a fair amount of end-users out there having made the irrevocable mistake of deleting the precious directory, a decision made on the assumption that since “it has log in the name so it must be unimportant” (http://stackoverflow.com/questions/12897429/what-does-pg-resetxlog-do-and-how-does-it-work).

    As long as this was a system table, it was obscure, and touching any table starting with pg_ probably means trouble. When that name (well, pg_clog) became part of another domain, the file system, users had other expectations.

    Sometimes names matter.