x and y reducers for group and hexbin #1916

Merged (2 commits) on Nov 7, 2023
2 changes: 2 additions & 0 deletions docs/transforms/group.md
@@ -366,6 +366,8 @@ The following named reducers are supported:
* *deviation* - the standard deviation
* *variance* - the variance per [Welford’s algorithm](https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm)
* *identity* - the array of values
+* *x* - the group’s *x* value (when grouping on *x*)
+* *y* - the group’s *y* value (when grouping on *y*)
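A minimal sketch of what the new *x* reducer yields, in plain JavaScript. The names `groupOnX`, `count`, and `x` are hypothetical stand-ins and not Plot's implementation; they only mimic the documented behavior that a reducer can read the group's *x* value.

```js
// Hypothetical stand-in for grouping on x; not Plot's actual implementation.
function groupOnX(data, key, outputs) {
  const groups = new Map();
  for (const d of data) {
    const x = d[key];
    if (!groups.has(x)) groups.set(x, []);
    groups.get(x).push(d);
  }
  return Array.from(groups, ([x, values]) => {
    const extent = {x}; // the group's x value, made available to every reducer
    const row = {};
    for (const [name, reduce] of Object.entries(outputs)) {
      row[name] = reduce(values, extent);
    }
    return row;
  });
}

// Stand-ins for the named reducers "count" and "x".
const count = (values) => values.length;
const x = (values, extent) => extent.x;

const penguins = [{species: "Adelie"}, {species: "Gentoo"}, {species: "Adelie"}];

// One row per group: y is the count, title is the group's x value.
const rows = groupOnX(penguins, "species", {y: count, title: x});
console.log(rows);
```

With Plot itself, the equivalent would presumably be `Plot.groupX({y: "count", title: "x"}, {x: "species"})`.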

In addition, a reducer may be specified as:

17 changes: 13 additions & 4 deletions docs/transforms/hexbin.md
@@ -174,9 +174,9 @@ Plot.plot({

The *options* must specify the **x** and **y** channels. The **binWidth** option (default 20) defines the distance between centers of neighboring hexagons in pixels. If any of **z**, **fill**, or **stroke** is a channel, the first of these channels will be used to subdivide bins.

-The *outputs* options are similar to the [bin transform](./bin.md); each output channel receives as input, for each hexagon, the subset of the data which has been matched to its center. The outputs object specifies the aggregation method for each output channel.
+The *outputs* options are similar to the [bin transform](./bin.md); for each hexagon, an output channel value is derived by reducing the corresponding binned input channel values. The *outputs* object specifies the reducer for each output channel.

-The following aggregation methods are supported:
+The following named reducers are supported:

* *first* - the first value, in input order
* *last* - the last value, in input order
@@ -195,13 +195,22 @@ The following aggregation methods are supported:
* *variance* - the variance per [Welford’s algorithm](https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm)
* *mode* - the value with the most occurrences
* *identity* - the array of values
-* a function to be passed the array of values for each bin and the extent of the bin
+* *x* - the hexagon’s *x* center
+* *y* - the hexagon’s *y* center
+
+In addition, a reducer may be specified as:
+
+* a function to be passed the array of values for each bin and the center of the bin
+* an object with a *reduceIndex* method
+
+In the last case, the **reduceIndex** method is repeatedly passed three arguments: the index for each bin (an array of integers), the input channel’s array of values, and the center of the bin (an object {data, x, y}); it must then return the corresponding aggregate value for the bin.
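A rough sketch of a reducer object in the shape described above. The reducer `meanAtCenter` and the hand-built `bins` array are hypothetical; the driver loop only imitates how each bin's index and center might be handed to **reduceIndex**.

```js
// A custom reducer object: averages the bin's values and labels the result
// with the hexagon's center, read from the third argument.
const meanAtCenter = {
  reduceIndex(index, values, {x, y}) {
    let sum = 0;
    for (const i of index) sum += values[i];
    return `mean ${sum / index.length} at (${x}, ${y})`;
  }
};

// Hypothetical bins; in Plot these would come from the hexbin transform.
const values = [10, 20, 30, 40];
const data = values.map((value) => ({value}));
const bins = [
  {index: [0, 2], extent: {data, x: 15, y: 8}},
  {index: [1, 3], extent: {data, x: 35, y: 8}}
];

// Each bin passes its index, the input channel values, and its center.
const titles = bins.map(({index, extent}) => meanAtCenter.reduceIndex(index, values, extent));
console.log(titles);
```

Because the center arrives as a plain object, a reducer can destructure only the fields it needs.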

Most reducers require binding the output channel to an input channel; for example, if you want the **y** output channel to be a *sum* (not merely a count), there should be a corresponding **y** input channel specifying which values to sum. If there is not, *sum* will be equivalent to *count*.

## hexbin(*outputs*, *options*) {#hexbin}

```js
Plot.dot(olympians, Plot.hexbin({fill: "count"}, {x: "weight", y: "height"}))
```

-Bins (hexagonally) on **x** and **y**. Also groups on the first channel of **z**, **fill**, or **stroke**, if any.
+Bins hexagonally on **x** and **y**. Also groups on the first channel of **z**, **fill**, or **stroke**, if any.
36 changes: 35 additions & 1 deletion src/transforms/group.d.ts
@@ -38,8 +38,42 @@ export interface GroupOutputOptions<T = Reducer> {
z?: ChannelValue;
}

+/**
+* How to reduce grouped values; one of:
+*
+* - a generic reducer name, such as *count* or *first*
+* - *x* - the group’s **x** value (when grouping on **x**)
+* - *y* - the group’s **y** value (when grouping on **y**)
+* - a function that takes an array of values and returns the reduced value
+* - an object that implements the *reduceIndex* method
+*
+* When a reducer function or implementation is used with the group transform,
+* it is passed the group extent {x, y} as an additional argument.
+*/
+export type GroupReducer = Reducer | GroupReducerFunction | GroupReducerImplementation | "x" | "y";
+
+/**
+* A shorthand functional group reducer implementation: given an array of input
+* channel *values*, and the current group’s *extent*, returns the corresponding
+* reduced output value.
+*/
+export type GroupReducerFunction<S = any, T = S> = (values: S[], extent: {x: any; y: any}) => T;
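One way to picture the "shorthand" relationship between a function reducer and a *reduceIndex* implementation is an adapter like the one sketched here. This excerpt does not show how Plot performs the adaptation, so `reduceFunction` and `spread` are assumptions made for illustration.

```js
// Hypothetical adapter: wraps a (values, extent) function as a reduceIndex object.
function reduceFunction(f) {
  return {
    reduceIndex(index, values, extent) {
      // Materialize the group's values from the index, then delegate.
      return f(index.map((i) => values[i]), extent);
    }
  };
}

// A function reducer in the documented shape: (values, extent) => result.
const spread = (values, {x}) => `${x}: ${Math.min(...values)} to ${Math.max(...values)}`;

const reducer = reduceFunction(spread);
const label = reducer.reduceIndex([0, 2], [5, 100, 9], {x: "a", y: undefined});
console.log(label); // "a: 5 to 9"
```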
+
+/** A group reducer implementation. */
+export interface GroupReducerImplementation<S = any, T = S> {
+/**
+* Given an *index* representing the contents of the current group, the input
+* channel’s array of *values*, and the current group’s *extent*, returns the
+* corresponding reduced output value. If no input channel is supplied (e.g.,
+* as with the *count* reducer) then *values* may be undefined.
+*/
+reduceIndex(index: number[], values: S[], extent: {x: any; y: any}): T;
+// TODO scope
+// TODO label
Comment on lines +71 to +72
Contributor

This made me think of how we would type the reduceIndex method for scope = "data" or "facet". Maybe something like the following, but it seems a bit messy:

  reduceIndex(
    index: number[],
    values: undefined | S[],
    valueOrExtent: any | {data: any[]},
    maybeExtent: undefined | {data: any[]}
  ): T;

+}
+
/** Output channels (and options) for the group transform. */
-export type GroupOutputs = ChannelReducers | GroupOutputOptions;
+export type GroupOutputs = ChannelReducers<GroupReducer> | GroupOutputOptions<GroupReducer>;

/**
* Groups on the first channel of **z**, **fill**, or **stroke**, if any, and
46 changes: 42 additions & 4 deletions src/transforms/group.js
@@ -76,10 +76,10 @@ function groupn(
inputs = {} // input channels and options
) {
// Compute the outputs.
-outputs = maybeOutputs(outputs, inputs);
-reduceData = maybeReduce(reduceData, identity);
-sort = sort == null ? undefined : maybeOutput("sort", sort, inputs);
-filter = filter == null ? undefined : maybeEvaluator("filter", filter, inputs);
+outputs = maybeGroupOutputs(outputs, inputs);
+reduceData = maybeGroupReduce(reduceData, identity);
+sort = sort == null ? undefined : maybeGroupOutput("sort", sort, inputs);
+filter = filter == null ? undefined : maybeGroupEvaluator("filter", filter, inputs);

// Produce x and y output channels as appropriate.
const [GX, setGX] = maybeColumn(x);
@@ -287,6 +287,32 @@ function invalidReduce(reduce) {
throw new Error(`invalid reduce: ${reduce}`);
}

+export function maybeGroupOutputs(outputs, inputs) {
+return maybeOutputs(outputs, inputs, maybeGroupOutput);
+}
+
+function maybeGroupOutput(name, reduce, inputs) {
+return maybeOutput(name, reduce, inputs, maybeGroupEvaluator);
+}
+
+function maybeGroupEvaluator(name, reduce, inputs) {
+return maybeEvaluator(name, reduce, inputs, maybeGroupReduce);
+}
+
+function maybeGroupReduce(reduce, value) {
+return maybeReduce(reduce, value, maybeGroupReduceFallback);
+}
+
+function maybeGroupReduceFallback(reduce) {
+switch (`${reduce}`.toLowerCase()) {
+case "x":
+return reduceX;
+case "y":
+return reduceY;
+}
+throw new Error(`invalid group reduce: ${reduce}`);
+}
export function maybeSubgroup(outputs, inputs) {
for (const name in inputs) {
const value = inputs[name];
@@ -399,6 +425,18 @@ function reduceProportion(value, scope) {
: {scope, reduceIndex: (I, V, basis = 1) => sum(I, (i) => V[i]) / basis};
}

+const reduceX = {
+reduceIndex(I, X, {x}) {
+return x;
+}
+};
+
+const reduceY = {
+reduceIndex(I, X, {y}) {
+return y;
+}
+};
+
export function find(test) {
if (typeof test !== "function") throw new Error(`invalid test function: ${test}`);
return {
3 changes: 2 additions & 1 deletion src/transforms/hexbin.d.ts
@@ -1,5 +1,6 @@
import type {ChannelReducers, ChannelValue} from "../channel.js";
import type {Initialized} from "./basic.js";
+import type {GroupReducer} from "./group.js";

/** Options for the hexbin transform. */
export interface HexbinOptions {
@@ -43,4 +44,4 @@ export interface HexbinOptions {
*
* To draw empty hexagons, see the hexgrid mark.
*/
-export function hexbin<T>(outputs?: ChannelReducers, options?: T & HexbinOptions): Initialized<T>;
+export function hexbin<T>(outputs?: ChannelReducers<GroupReducer>, options?: T & HexbinOptions): Initialized<T>;
30 changes: 14 additions & 16 deletions src/transforms/hexbin.js
@@ -2,7 +2,7 @@ import {map, number, valueof} from "../options.js";
import {applyPosition} from "../projection.js";
import {sqrt3} from "../symbol.js";
import {initializer} from "./basic.js";
-import {hasOutput, maybeGroup, maybeOutputs, maybeSubgroup} from "./group.js";
+import {hasOutput, maybeGroup, maybeGroupOutputs, maybeSubgroup} from "./group.js";

// We don’t want the hexagons to align with the edges of the plot frame, as that
// would cause extreme x-values (the upper bound of the default x-scale domain)
@@ -16,9 +16,8 @@ export function hexbin(outputs = {fill: "count"}, {binWidth, ...options} = {}) {
const {z} = options;

// TODO filter e.g. to show empty hexbins?
-// TODO disallow x, x1, x2, y, y1, y2 reducers?
binWidth = binWidth === undefined ? 20 : number(binWidth);
-outputs = maybeOutputs(outputs, options);
+outputs = maybeGroupOutputs(outputs, options);

// A fill output means a fill channel; declaring the channel here instead of
// waiting for the initializer allows the mark constructor to determine that
@@ -65,15 +64,15 @@ export function hexbin(outputs = {fill: "count"}, {binWidth, ...options} = {}) {
const binFacet = [];
for (const o of outputs) o.scope("facet", facet);
for (const [f, I] of maybeGroup(facet, G)) {
-for (const bin of hbin(I, X, Y, binWidth)) {
+for (const {index: b, extent} of hbin(data, I, X, Y, binWidth)) {
binFacet.push(++i);
-BX.push(bin.x);
-BY.push(bin.y);
-if (Z) GZ.push(G === Z ? f : Z[bin[0]]);
-if (F) GF.push(G === F ? f : F[bin[0]]);
-if (S) GS.push(G === S ? f : S[bin[0]]);
-if (Q) GQ.push(G === Q ? f : Q[bin[0]]);
-for (const o of outputs) o.reduce(bin);
+BX.push(extent.x);
+BY.push(extent.y);
+if (Z) GZ.push(G === Z ? f : Z[b[0]]);
+if (F) GF.push(G === F ? f : F[b[0]]);
+if (S) GS.push(G === S ? f : S[b[0]]);
+if (Q) GQ.push(G === Q ? f : Q[b[0]]);
+for (const o of outputs) o.reduce(b, extent);
}
}
binFacets.push(binFacet);
@@ -106,7 +105,7 @@ export function hexbin(outputs = {fill: "count"}, {binWidth, ...options} = {}) {
});
}

-function hbin(I, X, Y, dx) {
+function hbin(data, I, X, Y, dx) {
const dy = dx * (1.5 / sqrt3);
const bins = new Map();
for (const i of I) {
@@ -127,11 +126,10 @@ function hbin(I, X, Y, dx) {
const key = `${pi},${pj}`;
let bin = bins.get(key);
if (bin === undefined) {
-bins.set(key, (bin = []));
-bin.x = (pi + (pj & 1) / 2) * dx + ox;
-bin.y = pj * dy + oy;
+bin = {index: [], extent: {data, x: (pi + (pj & 1) / 2) * dx + ox, y: pj * dy + oy}};
+bins.set(key, bin);
}
-bin.push(i);
+bin.index.push(i);
}
return bins.values();
}
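The change in bin shape, from an index array with `x` and `y` properties attached to a `{index, extent}` record, can be illustrated with a deliberately simplified binner. The function `sbin` is hypothetical and uses a square grid rather than hexagons, since the hexagonal rounding step is elided from this excerpt; only the shape of the returned records follows the diff.

```js
// Simplified sketch: bins points on a square grid (not hexagons), returning
// {index, extent} records in the shape produced by the revised hbin.
function sbin(data, I, X, Y, size) {
  const bins = new Map();
  for (const i of I) {
    const px = X[i];
    const py = Y[i];
    if (isNaN(px) || isNaN(py)) continue; // skip undefined positions
    const pi = Math.round(px / size);
    const pj = Math.round(py / size);
    const key = `${pi},${pj}`;
    let bin = bins.get(key);
    if (bin === undefined) {
      bin = {index: [], extent: {data, x: pi * size, y: pj * size}};
      bins.set(key, bin);
    }
    bin.index.push(i);
  }
  return bins.values();
}

const data = ["a", "b", "c"];
const X = [1, 2, 41];
const Y = [3, 4, 19];

// Consumers destructure the record instead of reading properties off an array.
const result = [];
for (const {index, extent} of sbin(data, [0, 1, 2], X, Y, 20)) {
  result.push({count: index.length, x: extent.x, y: extent.y});
}
console.log(result);
```

Keeping the center in a separate `extent` object lets it be passed to reducers as an argument, which is what makes the *x* and *y* reducers possible for hexbin.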
273 changes: 273 additions & 0 deletions test/output/hexbinFillX.svg