proposal: sync: support for sharded values #18802

Open
@aclements

Per-CPU sharded values are a useful and common way to reduce contention on shared write-mostly values. However, this technique is currently difficult or impossible to use in Go (though there have been attempts, such as @jonhoo's https://github.com/jonhoo/drwmutex and @bcmills' https://go-review.googlesource.com/#/c/35676/).

We propose providing an API for creating and working with sharded values. Sharding would be encapsulated in a type, say sync.Sharded, that would have Get() interface{}, Put(interface{}), and Do(func(interface{})) methods. Get and Put would always have to be paired to make Do possible. (This is actually the same API that was proposed in #8281 (comment) and rejected, but perhaps we have a better understanding of the issues now.) This idea came out of off-and-on discussions between at least @rsc, @hyangah, @RLH, @bcmills, @Sajmani, and myself.
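To make the proposed method set concrete, here is a minimal sketch of how a caller might use it for a counter. Everything in it is an assumption for illustration: `Sharded` is written as an interface only to name the three methods, `oneShard` is a single-shard stand-in that makes the example runnable (it does no real sharding and would lose updates under concurrent use), and `incr` and `total` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"sync"
)

// Sharded is a hypothetical interface naming the three proposed methods.
// No such type exists in package sync today.
type Sharded interface {
	Get() interface{}
	Put(interface{})
	Do(func(interface{}))
}

// oneShard is a stand-in with exactly one shard. It exists only so the
// usage pattern below can run; it is not a real sharded implementation.
type oneShard struct {
	mu  sync.Mutex
	val interface{}
}

// Get removes and returns the shard's value, or nil if the shard is empty.
func (s *oneShard) Get() interface{} {
	s.mu.Lock()
	defer s.mu.Unlock()
	v := s.val
	s.val = nil
	return v
}

// Put stores v in the shard, overwriting whatever was there.
func (s *oneShard) Put(v interface{}) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.val = v
}

// Do calls f on the shard's value if the shard is not empty.
func (s *oneShard) Do(f func(interface{})) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.val != nil {
		f(s.val)
	}
}

// incr shows the paired Get/Put pattern: check a value out, update it,
// and check it back in.
func incr(s Sharded) {
	c, _ := s.Get().(*int64) // nil if the shard was empty
	if c == nil {
		c = new(int64)
	}
	*c++
	s.Put(c)
}

// total combines whatever values Do visits.
func total(s Sharded) int64 {
	var sum int64
	s.Do(func(v interface{}) { sum += *v.(*int64) })
	return sum
}

func main() {
	s := &oneShard{}
	for i := 0; i < 3; i++ {
		incr(s)
	}
	fmt.Println(total(s)) // 3
}
```

The point of the pairing is visible in `incr`: a value is only visible to `Do` while it is checked in, so every `Get` needs a matching `Put`.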

This is a counter-proposal to various proposals to expose the current thread/P ID as a way to implement sharded values (#8281, #18590). These have been turned down as exposing low-level implementation details, tying Go to an API that may be inappropriate or difficult to support in the future, being difficult to use correctly (since the ID may change at any time), being difficult to specify, and being broadly susceptible to abuse.

There are several dimensions to the design of such an API.

Get and Put can be blocking or non-blocking:

  • With non-blocking Get and Put, sync.Sharded behaves like a collection. Get returns immediately with the current shard's value or nil if the shard is empty. Put stores a value in the current shard's slot if that slot is empty; the current shard may differ from the one where Get was called, but would often be the same. If the shard's slot is not empty, Put could either put to some overflow list (in which case the state is potentially unbounded), or run some user-provided combiner (which would bound the state).

  • With blocking Get and Put, sync.Sharded behaves more like a lock. Get returns and locks the current shard's value, blocking further Gets from that shard. Put sets the shard's value and unlocks it. In this case, Put has to know which shard the value came from, so Get can either return a put function (though that would require allocating a closure) or some opaque value that must be passed to Put that internally identifies the shard.

  • It would also be possible to combine these behaviors by using an overflow list with a bounded size. Specifying 0 would yield lock-like behavior, while specifying a larger value would give some slack where Get and Put remain non-blocking without allowing the state to become completely unbounded.
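As an illustration of the lock-like variant, the sketch below has Get return an opaque token that Put uses to find the shard again. This is a rough sketch under assumptions, not the proposed implementation: `LockingSharded`, `Token`, `add`, and `run` are hypothetical names, and because user code cannot ask which P it is running on, a round-robin counter stands in for "the current shard".

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type lockedShard struct {
	mu  sync.Mutex
	val interface{}
}

// Token is a hypothetical opaque handle that remembers which shard a
// value was checked out from.
type Token struct{ shard *lockedShard }

// LockingSharded is a hypothetical blocking variant of the proposed type.
type LockingSharded struct {
	next   uint32 // round-robin stand-in for "current P" (assumption)
	shards []lockedShard
}

func NewLockingSharded(n int) *LockingSharded {
	return &LockingSharded{shards: make([]lockedShard, n)}
}

// Get locks a shard and returns its value plus a token for Put.
// It blocks while another goroutine has the same shard checked out.
func (s *LockingSharded) Get() (interface{}, Token) {
	i := atomic.AddUint32(&s.next, 1) % uint32(len(s.shards))
	sh := &s.shards[i]
	sh.mu.Lock()
	return sh.val, Token{sh}
}

// Put stores v in the shard named by the token and unlocks that shard.
func (s *LockingSharded) Put(t Token, v interface{}) {
	t.shard.val = v
	t.shard.mu.Unlock()
}

// add increments one shard by d using the paired Get/Put pattern.
func add(s *LockingSharded, d int) {
	v, tok := s.Get()
	n, _ := v.(int) // zero if the shard was empty
	s.Put(tok, n+d)
}

// run increments concurrently and returns the sum over all shards.
func run(goroutines, perGoroutine int) int {
	s := NewLockingSharded(4)
	var wg sync.WaitGroup
	for g := 0; g < goroutines; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perGoroutine; i++ {
				add(s, 1)
			}
		}()
	}
	wg.Wait()
	sum := 0
	for i := range s.shards { // safe: all writers have finished
		n, _ := s.shards[i].val.(int)
		sum += n
	}
	return sum
}

func main() {
	fmt.Println(run(8, 100)) // 800
}
```

Returning a small struct instead of a put closure avoids the closure allocation mentioned above, at the cost of a slightly wider API.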

Do could be consistent or inconsistent:

  • If it's consistent, then it passes the callback a snapshot at a single instant. I can think of two ways to do this: block until all outstanding values are Put and also block further Gets until the Do can complete; or use the "current" value of each shard even if it's checked out. The latter requires that shard values be immutable, but it makes Do non-blocking.

  • If it's inconsistent, then it can wait on each shard independently. This is faster and doesn't affect Get and Put, but the caller can only get a rough idea of the combined value. This is fine for uses like approximate statistics counters.

It may be that we can't make this decision at the API level and have to provide both forms of Do.
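The difference between the two forms of Do can be sketched with a plain mutex-per-shard counter. This is only an illustration under assumptions: `Counter`, `DoInconsistent`, `DoConsistent`, and `sum` are hypothetical names, callers pass an explicit shard hint in place of the current P, and the consistent form shown is the "block everything" option rather than the immutable-value option.

```go
package main

import (
	"fmt"
	"sync"
)

// shard holds one slice of the counter.
type shard struct {
	mu sync.Mutex
	n  int64
}

// Counter is a hypothetical sharded counter. Callers pass a shard hint
// because user code cannot ask which P it is running on.
type Counter struct{ shards []shard }

func NewCounter(n int) *Counter { return &Counter{shards: make([]shard, n)} }

// Add updates the shard selected by hint.
func (c *Counter) Add(hint uint, d int64) {
	sh := &c.shards[hint%uint(len(c.shards))]
	sh.mu.Lock()
	sh.n += d
	sh.mu.Unlock()
}

// DoInconsistent visits shards one at a time and holds at most one lock.
// Adds to already-visited shards can be missed, so the result is a rough
// view rather than a snapshot.
func (c *Counter) DoInconsistent(f func(int64)) {
	for i := range c.shards {
		sh := &c.shards[i]
		sh.mu.Lock()
		v := sh.n
		sh.mu.Unlock()
		f(v)
	}
}

// DoConsistent locks every shard (in index order) before reading any,
// so f sees the state at a single instant. Adds block until it returns,
// and f must not call Add or it will deadlock.
func (c *Counter) DoConsistent(f func(int64)) {
	for i := range c.shards {
		c.shards[i].mu.Lock()
	}
	for i := range c.shards {
		f(c.shards[i].n)
	}
	for i := range c.shards {
		c.shards[i].mu.Unlock()
	}
}

// sum folds the values passed to the callback by either form of Do.
func sum(do func(func(int64))) int64 {
	var t int64
	do(func(v int64) { t += v })
	return t
}

func main() {
	c := NewCounter(4)
	for i := uint(0); i < 10; i++ {
		c.Add(i, 1)
	}
	fmt.Println(sum(c.DoInconsistent), sum(c.DoConsistent)) // 10 10
}
```

When nothing else is running, both forms agree; they only diverge when Adds race with the traversal.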

I think this is a good base API, but I can think of a few reasonable extensions:

  • Provide Peek and CompareAndSwap. If a user of the API can be written in terms of these, then Do would always be able to get an immediate consistent snapshot.

  • Provide a Value operation that uses the user-provided combiner (if we go down that API route) to get the combined value of the sync.Sharded.
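The combiner route and the Value extension might look roughly like the following. Again this is a sketch under assumptions, not the proposed design: `Combining`, `slot`, `pick`, and `count` are hypothetical names, and a round-robin counter stands in for the current shard, which also means Put usually lands on a different shard than the preceding Get.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type slot struct {
	mu  sync.Mutex
	val interface{}
}

// Combining is a hypothetical non-blocking variant whose Put merges
// values with a user-provided combiner when the slot is occupied.
type Combining struct {
	combine func(a, b interface{}) interface{}
	next    uint32 // round-robin stand-in for "current P" (assumption)
	slots   []slot
}

func NewCombining(n int, combine func(a, b interface{}) interface{}) *Combining {
	return &Combining{combine: combine, slots: make([]slot, n)}
}

func (s *Combining) pick() *slot {
	i := atomic.AddUint32(&s.next, 1) % uint32(len(s.slots))
	return &s.slots[i]
}

// Get takes the chosen slot's value, leaving the slot empty; it returns
// nil if the slot was already empty.
func (s *Combining) Get() interface{} {
	sl := s.pick()
	sl.mu.Lock()
	defer sl.mu.Unlock()
	v := sl.val
	sl.val = nil
	return v
}

// Put stores v if the chosen slot is empty, otherwise combines v with
// the value already there, so state stays bounded at one value per slot.
func (s *Combining) Put(v interface{}) {
	sl := s.pick()
	sl.mu.Lock()
	defer sl.mu.Unlock()
	if sl.val == nil {
		sl.val = v
	} else {
		sl.val = s.combine(sl.val, v)
	}
}

// Value folds all checked-in values with the combiner. It reads slots
// one at a time and does not see values that are currently checked out.
func (s *Combining) Value() interface{} {
	var acc interface{}
	for i := range s.slots {
		sl := &s.slots[i]
		sl.mu.Lock()
		v := sl.val
		sl.mu.Unlock()
		switch {
		case v == nil: // empty slot, nothing to fold in
		case acc == nil:
			acc = v
		default:
			acc = s.combine(acc, v)
		}
	}
	return acc
}

// count increments concurrently and returns the combined value.
func count(goroutines, perGoroutine int) int {
	s := NewCombining(4, func(a, b interface{}) interface{} {
		return a.(int) + b.(int)
	})
	var wg sync.WaitGroup
	for g := 0; g < goroutines; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perGoroutine; i++ {
				n, _ := s.Get().(int) // zero if the slot was empty
				s.Put(n + 1)
			}
		}()
	}
	wg.Wait()
	total, _ := s.Value().(int)
	return total
}

func main() {
	fmt.Println(count(8, 100)) // 800
}
```

Because Put never drops a value (it either stores or combines), no increment is lost even when Get and Put hit different slots.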
