Description
Related dev. issue(s): tarantool/tarantool#5806
Product: Tarantool
Since: 2.8.1
Audience/target: dev, admin
Root document: https://www.tarantool.io/en/doc/latest/reference/configuration/#binary-logging-and-snapshots
SME: @cyrillos
Details
What should we write about?
Describe the wal_cleanup_delay configuration option:
- what it does
- what problem it solves
- how to choose its value depending on the use case
The wal_cleanup_delay option defines a delay, in seconds, before
write-ahead log files (*.xlog) start being pruned after a node restart.
This option is ignored if a node is running as an anonymous replica
(replication_anon = true). Similarly, if replication is not used and
there are no plans to use it, this option does not need to be considered.
The original problem this option solves is the case where a node operates
so fast that its replicas cannot keep up with its state. If the node is
restarted at that moment (for example, because of a power outage), *.xlog
files might be pruned during the restart. As a result, replicas will not
find these files on the main node and will have to reread all the data,
which is a very expensive procedure.
Since replicas are tracked via the _cluster system space, we use its
content to count subscribed replicas. When all of them are up and running,
the cleanup procedure is enabled automatically even if wal_cleanup_delay
has not expired.
The wal_cleanup_delay option should be set to:
- 0 to disable the cleanup delay;
- a value greater than 0 to wait for the specified number of seconds.
By default it is set to 14400 seconds (i.e. 4 hours).
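As a minimal sketch of the values above, assuming the option is passed in the instance's initial box.cfg{} call. The one-hour value is purely illustrative and is not a recommendation from the source:

```lua
-- Illustrative value only: keep *.xlog files for up to one hour
-- after a restart so that replicas have time to resubscribe.
box.cfg{
    wal_cleanup_delay = 3600, -- seconds; the default is 14400 (4 hours)
}

-- Alternatively, disable the delay entirely, for example on a node
-- that does not take part in replication:
-- box.cfg{ wal_cleanup_delay = 0 }
```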
If a registered replica is lost forever and the timeout is set to
infinity, the preferred way to enable the cleanup procedure is not to set
a small timeout value but rather to delete this replica from the _cluster
space manually.
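The manual deletion could look roughly like the sketch below. The replica id 3 is hypothetical; the real id has to be taken from the _cluster contents, and the commands are assumed to run on a writable instance:

```lua
-- List the registered replicas; the first field of each tuple is the replica id.
box.space._cluster:select{}

-- Remove the registration of the lost replica.
-- The id 3 is a placeholder, not a value from the source.
box.space._cluster:delete(3)
```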
Note that the option does not prevent the WAL engine from removing
old *.xlog files if there is no space left on the storage device:
in that case the WAL engine can remove them forcibly.
The current state of the *.xlog garbage collector can be found in the
box.info.gc() output. For example:
tarantool> box.info.gc()
---
...
is_paused: false
The is_paused field shows whether the cleanup fiber is paused.
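For monitoring, the flag can also be read programmatically. This is a small sketch that relies only on the is_paused field shown above; the message texts are illustrative:

```lua
-- Read the garbage collector state once and inspect the pause flag.
local gc_state = box.info.gc()
if gc_state.is_paused then
    print('xlog cleanup is paused')
else
    print('xlog cleanup is running')
end
```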
Requested by @cyrillos in tarantool/tarantool@2fd51ae.
Definition of done
- add an option description
- specify Since version with a link to release notes
- add links to related docs
- make sure all option properties (Type, Default, etc.) are specified
- add an option anchor to the top of the section