Offset tracking data can overtake stream user data #32

Open
kjnilsson opened this issue May 21, 2021 · 1 comment
@kjnilsson
Contributor

Currently, in read-heavy scenarios, embedded offset tracking can generate a lot of data written to the stream that isn't user data. Taken to extremes, and given small size-based retention configurations, it is possible for tracking data to "push out" user data from the stream entirely. Not dissimilar to the heat-death theory of the end of the universe, if you think about it.

At first glance this may make embedding tracking data directly in the stream seem like a bad idea, but doing so has some real benefits:

  • It is simple and portable: tracking data is replicated along with user data, which ensures the two never drift apart. Leader elections take tracking data into account when selecting a new leader, etc.
  • Availability of tracking data is the same as the availability of the stream.
  • Eviction of tracking ids is simple: any tracking id whose tracked offset is lower than the first available offset in the stream is considered stale and will no longer be included in tracking snapshots.
  • External storage of tracking data would require either using Raft or implementing entry-based compacting streams. Raft may be too slow for the use case we need to handle better (read-heavy streams that commit very often).
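The eviction rule from the list above can be sketched as a simple filter over the tracking table. This is an illustrative sketch only; the function and field names are hypothetical, not the project's actual API:

```python
# Sketch of stale tracking-id eviction: any id whose tracked offset is
# below the stream's first available offset points at data that retention
# has already discarded, so it is dropped from the next tracking snapshot.
# All names here are hypothetical.

def live_tracking_ids(tracking: dict[str, int], first_offset: int) -> dict[str, int]:
    """Return only the tracking ids that still point inside the stream."""
    return {tid: off for tid, off in tracking.items() if off >= first_offset}

# Example: retention has discarded everything below offset 100, so the
# tracking id "consumer-b" (offset 40) is stale and excluded.
snapshot = live_tracking_ids({"consumer-a": 250, "consumer-b": 40}, first_offset=100)
```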

There are several improvements we can make to the current offset tracking approach which may or may not suffice:

  1. Write tracking data as a trailer on any batch that contains user messages. Currently we always write a separate tracking delta chunk (when we've received tracking) after the user chunk. Writing it as a trailer instead would reduce the number of chunks written to the segment and the index.
  2. Only write tracking data as delta chunks periodically. This reduces the amount of data written by allowing tracking requests for the same tracking id to be "pre-compacted" before being persisted. The downside is that the interval determines the tracking data loss window when a stream crashes or is shut down. The interval can be determined dynamically from the proportion of tracking vs user data written, so that a longer interval is used the more tracking data is being written.
  3. Perform chunk-based compaction of closed segments. A background task can re-write old segments without tracking chunks. Chunk-based compaction is simpler and faster than entry-based compaction, and since we always write a tracking snapshot (the entire tracking state) as the first chunk in a new segment, we don't need any tracking data from any segment other than the latest one.
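The dynamic interval idea in point 2 can be sketched roughly as follows. The function, the base/max bounds, and the linear scaling are all assumptions for illustration, not a proposed implementation:

```python
# Hypothetical sketch of a dynamic tracking-flush interval: the larger the
# share of tracking data relative to user data, the longer we wait before
# persisting the next tracking delta chunk, so more tracking requests for
# the same id get "pre-compacted" in memory first. Constants are made up.

def tracking_flush_interval(user_bytes: int, tracking_bytes: int,
                            base_ms: int = 100, max_ms: int = 5000) -> int:
    """Return a flush interval in milliseconds.

    A small tracking proportion keeps the interval near base_ms (small
    data-loss window); a large proportion stretches it towards max_ms.
    """
    total = user_bytes + tracking_bytes
    if total == 0:
        return base_ms
    proportion = tracking_bytes / total  # 0.0 .. 1.0
    return min(max_ms, int(base_ms + proportion * (max_ms - base_ms)))
```

The trade-off is explicit in the two bounds: `base_ms` caps the tracking data loss window when tracking traffic is light, while `max_ms` caps how long deltas can accumulate on a tracking-heavy stream.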
@kjnilsson
Contributor Author

Another problem with offset tracking (specifically with offset tracking delta chunks) is that we cannot currently calculate how many user entries there are in a segment, even if we know the start and end offset. Thus we can't tell how many user entries are in the stream, which isn't great. To help with this we can introduce a segment manifest file that includes a summary of the segment's contents, written when a segment reaches its max size limit and is closed. The manifest can also be used to determine the proportion of user vs tracking chunks, so that any chunk compactor process can easily determine whether the work is worth doing.
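A minimal sketch of what such a manifest could hold, and how a compactor might use it. The field names, JSON encoding, and threshold are assumptions for illustration; the actual on-disk format is not defined here:

```python
# Hypothetical segment manifest: a small summary persisted when a segment
# is closed. It answers "how many user entries does this segment hold?"
# and lets a background compactor cheaply decide whether re-writing the
# segment without tracking chunks is worth the I/O.

import json

def write_manifest(path: str, user_entries: int, user_bytes: int,
                   tracking_chunks: int, tracking_bytes: int) -> None:
    """Persist a summary of a closed segment's contents as JSON."""
    manifest = {
        "user_entries": user_entries,        # allows per-stream entry counts
        "user_bytes": user_bytes,
        "tracking_chunks": tracking_chunks,
        "tracking_bytes": tracking_bytes,
    }
    with open(path, "w") as f:
        json.dump(manifest, f)

def worth_compacting(manifest: dict, threshold: float = 0.25) -> bool:
    """Skip segments where tracking data is only a small share of the bytes."""
    total = manifest["user_bytes"] + manifest["tracking_bytes"]
    return total > 0 and manifest["tracking_bytes"] / total >= threshold
```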
