make facet indexes rectangular if they overlap #1068
Closed
Yet another approach to the overlapping facet issue (alternatives: #1069, #1070).
Instead of reading the value of X for index i as X[i], we read it as X[i % X.length]. This allows us to create "long" channels as rectangular representations of n * m values (n being the length of the data, and m the number of facets), whereas default channels are a "line" of n values. There is no need for a reindexing plan, since the addressing system doesn't need extra information to read the proper value when the index is "out of range".
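To make the addressing concrete, here is a minimal sketch. The data, the facet definitions, and the layout in which facet j is shifted by j * n are assumptions made for illustration, not code taken from this branch.

```javascript
// Hypothetical data: n = 4 values, m = 2 overlapping facets.
const X = [10, 20, 30, 40];
const n = X.length;
const facets = [[0, 1, 2], [1, 2, 3]]; // indexes 1 and 2 appear in both facets

// Assumed layout: facet j addresses its rows in the range [j * n, (j + 1) * n),
// so no index is shared between facets.
const rectangular = facets.map((facet, j) =>
  Uint32Array.from(facet, (i) => i + j * n)
);

// Reading with the modulo recovers the original value for any shifted index,
// without expanding X itself.
const values = rectangular.map((facet) =>
  Array.from(facet, (i) => X[i % X.length])
);
```

With this layout the second facet holds the indexes 5, 6, 7, which still read the values at positions 1, 2, 3 of X.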
Besides this change, only the transforms that need to care about overlapping facets need to be modified: stack, map, and dodge.
Performance. This works only if we are allowed to create a Uint32Array of length n * m. In some very specific cases (such as facets that overlap in pairs), it can break earlier than a subtler reindexing plan would. Since it doesn't need to expand any existing channels, it should be faster and consume less memory in general.
In the case of cumulative facets where each data point creates its own facet (these do not exist for now but will be created by time facets), this means we get a cap at something like 32k keyframes if we use one of the expanding transforms, such as stack or dodge.
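One way to arrive at a figure of about 32k is sketched below. The limit of 2 ** 30 elements is an assumption chosen for illustration; the actual maximum typed array length depends on the JavaScript engine and its version.

```javascript
// Assumption: typed arrays are capped at 2 ** 30 elements (engine dependent).
const maxLength = 2 ** 30;

// With one facet per data point, m = n, so the index needs n * n entries.
// The largest n that fits is therefore the square root of the cap.
const maxKeyframes = Math.floor(Math.sqrt(maxLength));
```

Under that assumption maxKeyframes is 32768; a larger engine limit would raise the cap accordingly.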
Backward compatibility. Note that this will change the output if you used a column shorter than the data it covers. However, relying on this would have been a hack which, as far as I know, was never used in the wild nor in our examples and tests. Furthermore, Plot's introduction of columnar channels explicitly mentions that a columnar channel is “an array of values of the same length as the data”.
Code aesthetics. It's a bit ugly that every lookup of a value in a channel has to be done with this "modulo length" operator. We could change this to a function get(X, i) for readability, but it's probably not that much more readable (only a bit less ugly), and maybe a bit slower. But I'm happy to change it if this looks nicer. It might also be something that we'd want to expose, for users who want to write their own reducers. I'm not sure how to document this.
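For discussion, the helper could look roughly like the sketch below. Only the name get and its arguments come from the description above; the body, the sum reducer, and the sample data are hypothetical.

```javascript
// Hypothetical helper wrapping the "modulo length" lookup.
function get(X, i) {
  return X[i % X.length];
}

// Hypothetical user-written reducer: sums the values addressed by an index I.
function sum(I, X) {
  let total = 0;
  for (const i of I) total += get(X, i);
  return total;
}

const X = [1, 2, 3, 4];
// Indexes shifted by X.length, as they might be for a second facet.
const total = sum([5, 6, 7], X);
```

A reducer written this way does not need to know whether the indexes it receives were shifted; plain indexes such as [1, 2, 3] give the same result.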
Supersedes the first item of #1041, as well as #1057 and the other branches we were working on…