How to Run a Vacuum in Postgres

Have you ever wondered why a VACUUM on a 6 TB database has been running for more than three days? Older PostgreSQL releases offer no progress meter while the command runs (since version 9.6, the pg_stat_progress_vacuum view reports the current phase and block counts), but you can still watch a running vacuum with tools such as strace and vmstat. Shell commands such as ls and ps also help you deduce its progress and activity. This article explains what VACUUM does and how to run it in Postgres.
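For servers running PostgreSQL 9.6 or later, the pg_stat_progress_vacuum view gives a rough alternative to reading strace output. The sketch below is only an illustration: the query string is not executed here, the counter values are made up, and the helper name scan_progress_percent is hypothetical. It turns two of the view's columns into a percentage of heap blocks scanned.

```python
# Query you could run yourself on PostgreSQL 9.6+ (it is not executed here).
PROGRESS_SQL = """
SELECT pid, relid::regclass AS table_name, phase,
       heap_blks_total, heap_blks_scanned, heap_blks_vacuumed
FROM pg_stat_progress_vacuum;
"""


def scan_progress_percent(heap_blks_scanned, heap_blks_total):
    """Return the share of heap blocks scanned so far, as a percentage."""
    if heap_blks_total <= 0:
        return 0.0  # empty table, or counters not initialised yet
    return round(100.0 * heap_blks_scanned / heap_blks_total, 1)


# Made-up counter values, shaped like one row of the view.
print(scan_progress_percent(250_000, 1_000_000))
```

Keep in mind that this number only covers the heap scan. Index vacuuming happens in between, so the percentage is a hint, not an estimate of the remaining time.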

The Basics of Running a Vacuum in Postgres

VACUUM in Postgres consists of three phases:

  1. Ingest Phase
  2. Pruning Phase
  3. Cleanup Phase

PostgreSQL’s VACUUM needs to process each table regularly for several reasons:

  • To recover or reuse the disk space occupied by updated or deleted rows.
  • To update the data statistics used by the PostgreSQL query planner.
  • To update the visibility map, which speeds up index-only scans.
  • To protect against loss of very old data due to transaction ID wraparound.

There are two variants of VACUUM: standard VACUUM and VACUUM FULL. Standard VACUUM can run in parallel with production database operations, so commands like SELECT, INSERT, UPDATE, and DELETE continue to function normally. VACUUM FULL, in contrast, requires an exclusive lock on the table it is working on, so it cannot run alongside other use of that table. Administrators should therefore prefer the standard form and try to avoid VACUUM FULL. Because vacuuming generates a lot of I/O traffic, PostgreSQL also provides configuration parameters that reduce the performance impact of background vacuuming.
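To make the two variants concrete, here is a minimal sketch that builds the corresponding statements using the parenthesised option syntax of VACUUM. The helper names build_vacuum_sql and quote_ident and the table name orders are made up for illustration. The code only produces SQL text and does not connect to a database.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_vacuum_sql(table, full=False, analyze=False, verbose=False):
    """Build a VACUUM statement with the requested options."""
    options = []
    if full:
        options.append("FULL")      # rewrites the table, needs an exclusive lock
    if verbose:
        options.append("VERBOSE")   # prints a detailed activity report
    if analyze:
        options.append("ANALYZE")   # also refreshes planner statistics
    option_list = " (" + ", ".join(options) + ")" if options else ""
    return "VACUUM" + option_list + " " + quote_ident(table) + ";"


# Standard vacuum with statistics update, then the full variant.
print(build_vacuum_sql("orders", analyze=True, verbose=True))
print(build_vacuum_sql("orders", full=True))
```

The first statement is the kind you can run during normal operation. The second one blocks other use of the table while it runs, so it is better kept for a maintenance window.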

Recover the Disk Space

In PostgreSQL, an UPDATE or DELETE does not immediately remove the old version of a row. The old version must be kept as long as it might still be visible to other transactions. Eventually, though, an outdated or deleted row version is no longer of interest to any transaction. The space it occupies must then be reclaimed for reuse by new rows, so that disk usage does not grow without bound. VACUUM does this work.

Standard Vacuum

Standard VACUUM removes dead row versions from tables and indexes and marks the space as available for future reuse. However, it does not return the space to the operating system, except in the special case where one or more pages at the end of a table become entirely free and an exclusive table lock can be obtained easily.

Full Vacuum

In contrast, VACUUM FULL eliminates dead space by writing a completely new version of the table. This minimizes the size of the table, but it takes a long time and requires extra disk space for the new copy until the operation completes. The key point of routine vacuuming is to keep disk usage at a steady state, not to keep tables at their minimum size.

VACUUM FULL is useful when you need to return disk space to the operating system and shrink a table to its minimum size. For maintaining heavily updated tables, however, moderately frequent standard VACUUM runs are a better approach than infrequent VACUUM FULL runs. Some administrators prefer to schedule vacuuming themselves, for example by doing all the work at night when the load is low.

The problem with vacuuming on a fixed schedule is that a table with an unexpected spike in update activity may become bloated to the point where VACUUM FULL is really needed to reclaim the space. The autovacuum daemon alleviates this problem because it schedules vacuuming dynamically in response to update activity. It is unwise to disable the daemon completely unless you have an extremely predictable workload. One possible compromise is to set the daemon’s parameters so that it reacts only to unusually heavy update activity, leaving the bulk of the work to scheduled VACUUMs.
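To see what "reacting to update activity" means in numbers, here is a small sketch of the rule the daemon applies: a table is vacuumed once its dead tuples exceed autovacuum_vacuum_threshold plus autovacuum_vacuum_scale_factor times the number of rows. The defaults used below (50 and 0.2) are the stock PostgreSQL values. The function names, the table size, and the laxer scale factor of 0.5 are assumptions chosen only for illustration.

```python
def vacuum_trigger_point(reltuples, threshold=50, scale_factor=0.2):
    """Dead-tuple count above which autovacuum vacuums a table.

    Defaults mirror autovacuum_vacuum_threshold (50) and
    autovacuum_vacuum_scale_factor (0.2).
    """
    return threshold + scale_factor * reltuples


def needs_vacuum(n_dead_tup, reltuples, threshold=50, scale_factor=0.2):
    """True when the dead-tuple count has passed the trigger point."""
    return n_dead_tup > vacuum_trigger_point(reltuples, threshold, scale_factor)


rows = 1_000_000  # hypothetical table size

# Default settings versus a laxer setting that reacts only to heavy activity.
print(vacuum_trigger_point(rows))
print(vacuum_trigger_point(rows, scale_factor=0.5))
print(needs_vacuum(300_000, rows))
print(needs_vacuum(300_000, rows, scale_factor=0.5))
```

With the laxer setting, the same 300,000 dead rows no longer wake the daemon, which is the compromise described above: the nightly job does the routine work and autovacuum steps in only when things get out of hand.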

What is Autovacuum Daemon?

The autovacuum daemon actually consists of multiple processes. A persistent daemon process, called the autovacuum launcher, is in charge of starting autovacuum worker processes for all databases. At most autovacuum_max_workers worker processes run at the same time. If several large tables all become eligible for vacuuming within a short time, all autovacuum workers may be occupied with those tables for a long period. As a result, other tables and databases are not vacuumed until a worker becomes available. There is no limit on how many workers may be in a single database.
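The waiting effect can be illustrated with a toy queue model. This is not how the launcher is actually implemented. It simply assumes that tables are picked up in order by whichever worker is free first, with three workers (the default value of autovacuum_max_workers) and made-up durations in minutes.

```python
import heapq


def start_times(durations, max_workers=3):
    """Return the time at which each queued table starts being vacuumed.

    Toy model: tables are taken in order, each by the first free worker.
    """
    free_at = [0.0] * max_workers  # time at which each worker becomes free
    heapq.heapify(free_at)
    starts = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        starts.append(start)
        heapq.heappush(free_at, start + duration)
    return starts


# Three large tables (600 minutes each) followed by two small ones (1 minute each).
print(start_times([600, 600, 600, 1, 1]))
```

In this model the two small tables wait the full 600 minutes even though each needs only a minute of work, which is the situation the paragraph above warns about.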

Updating Planner Statistics

In PostgreSQL, the query planner relies on statistical information about the contents of tables to generate good query plans. These statistics are gathered by the ANALYZE command, which can be invoked by itself or as an optional step in VACUUM. If the statistics are not reasonably accurate, the planner may choose poor plans and database performance may degrade. The autovacuum daemon, if enabled, automatically issues ANALYZE commands whenever the content of a table has changed sufficiently. However, administrators may prefer manually scheduled ANALYZE runs, particularly if it is known that update activity on a table does not affect the statistics of the interesting columns. The daemon schedules ANALYZE strictly as a function of the number of rows inserted or updated, so it has no knowledge of whether those changes lead to meaningful statistical changes. A simple rule of thumb is to think about how much the minimum and maximum values of the columns in the table change.

Note: The autovacuum daemon does not issue ANALYZE commands for foreign tables.
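The count-based scheduling of ANALYZE can be sketched in the same spirit. The defaults below (50 and 0.1) are the stock values of autovacuum_analyze_threshold and autovacuum_analyze_scale_factor. The function name and the row counts are made up for illustration.

```python
def needs_analyze(n_mod_since_analyze, reltuples, threshold=50, scale_factor=0.1):
    """True when enough rows have changed for autovacuum to run ANALYZE.

    Only the number of modified rows matters here. Which columns
    changed, and by how much, plays no part in the decision.
    """
    return n_mod_since_analyze > threshold + scale_factor * reltuples


# Hypothetical 10,000-row table: 2,000 changed rows trigger ANALYZE, 500 do not.
print(needs_analyze(2_000, 10_000))
print(needs_analyze(500, 10_000))
```

Because the check looks only at counts, a table whose interesting columns never change can still be analyzed often, and this is why some administrators schedule ANALYZE by hand for such tables.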

Updating The Visibility Map

VACUUM maintains a visibility map for each table to keep track of which pages contain only tuples that are known to be visible to all active transactions. This has two purposes.

First, vacuum itself can skip such pages on the next run, since there is nothing to clean up. Second, it allows PostgreSQL to answer some queries using only the index, without referring to the underlying table. Because PostgreSQL indexes do not contain tuple visibility information, a normal index scan fetches the heap tuple for each matching index entry to check whether the current transaction should see it. An index-only scan, on the other hand, checks the visibility map first. If all tuples on the page are known to be visible, the heap fetch can be skipped. This is most useful on large data sets, where the visibility map can prevent disk accesses. The visibility map is vastly smaller than the heap, so it can easily be cached even when the heap is very large.
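To get a feeling for how much the visibility map can save, here is a rough sketch based on two columns of the pg_class catalog, relallvisible and relpages. It assumes the matching rows are spread evenly over the table's pages, which real data often violates. The function names and the numbers are made up for illustration.

```python
def all_visible_fraction(relallvisible, relpages):
    """Fraction of a table's pages marked all-visible in the visibility map."""
    if relpages <= 0:
        return 0.0  # empty table or statistics not collected yet
    return relallvisible / relpages


def estimated_heap_fetches(matching_rows, relallvisible, relpages):
    """Rough count of rows an index-only scan must still check in the heap."""
    not_visible = 1.0 - all_visible_fraction(relallvisible, relpages)
    return round(matching_rows * not_visible)


# Hypothetical table: 900 of 1,000 pages all-visible, 10,000 matching rows.
print(estimated_heap_fetches(10_000, 900, 1_000))
```

A freshly vacuumed table has most of its pages marked all-visible, so the estimate drops towards zero. On a table that has not been vacuumed for a while, an index-only scan ends up doing almost as many heap fetches as a normal index scan.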

Conclusion

In this article, we covered the VACUUM command in PostgreSQL and some crucial aspects of how it works. I hope you now understand its purpose and how to use it. If you found this article helpful or have suggestions or objections, kindly let me know in the comment section.
