
ENH: Create dataframes using every combination of given values, like R's expand.grid() #7426

Closed
@onesandzeroes


I find R's expand.grid() function quite useful for quick creation of example datasets. For example:

expand.grid(height = seq(60, 70, 5), weight = seq(100, 180, 40), sex = c("Male","Female"))
   height weight    sex
1      60    100   Male
2      65    100   Male
3      70    100   Male
4      60    140   Male
5      65    140   Male
6      70    140   Male
7      60    180   Male
8      65    180   Male
9      70    180   Male
10     60    100 Female
11     65    100 Female
12     70    100 Female
13     60    140 Female
14     65    140 Female
15     70    140 Female
16     60    180 Female
17     65    180 Female
18     70    180 Female

A simple implementation of this for pandas is easy to put together:

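import itertools
import pandas as pd

def expand_grid(dct):
    # One row per combination of the supplied values (Cartesian product).
    rows = itertools.product(*dct.values())
    # Column names come from the dict keys, in insertion order.
    return pd.DataFrame.from_records(rows, columns=list(dct))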

df = expand_grid(
    {'height': range(60, 71, 5),
     'weight': range(100, 181, 40),
     'sex': ['Male', 'Female']}
)
print(df)
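
One difference worth noting: itertools.product varies its last argument fastest, whereas the R output above varies the first column (height) fastest, so the row order of the two results will not match. If matching R's ordering matters, something like the following might work. This is only a sketch; the name expand_grid_r_order is made up for illustration and is not an existing pandas function.

```python
import itertools
import pandas as pd

def expand_grid_r_order(dct):
    """Hypothetical variant of expand_grid whose first column varies fastest."""
    names = list(dct)
    # product() varies its LAST argument fastest, so pass the value lists in
    # reverse order...
    reversed_rows = itertools.product(*(dct[name] for name in reversed(names)))
    # ...then flip each tuple back into the original column order.
    records = [row[::-1] for row in reversed_rows]
    return pd.DataFrame.from_records(records, columns=names)

df_r = expand_grid_r_order(
    {'height': range(60, 71, 5),
     'weight': range(100, 181, 40),
     'sex': ['Male', 'Female']}
)
print(df_r.head())
```

The resulting index starts at 0 rather than 1, so row 10 of the R output corresponds to df_r.iloc[9].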

Do people think this would be a useful addition?

If so, what kind of features should it have beyond the basics? A dtypes argument, specifying which column should be the index, etc.?
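
To make the question concrete, here is a rough sketch of what those two options could look like. The function name and the dtypes and index parameters are assumptions for the sake of discussion, not a proposed final API; the sketch just forwards them to the existing DataFrame.astype and DataFrame.set_index methods.

```python
import itertools
import pandas as pd

def expand_grid_with_options(dct, dtypes=None, index=None):
    """Hypothetical expand_grid with optional dtype conversion and index column.

    dtypes: optional mapping of column name -> dtype, passed to DataFrame.astype.
    index:  optional column name (or list of names), passed to DataFrame.set_index.
    """
    rows = list(itertools.product(*dct.values()))
    df = pd.DataFrame.from_records(rows, columns=list(dct))
    if dtypes is not None:
        df = df.astype(dtypes)
    if index is not None:
        df = df.set_index(index)
    return df

df_opts = expand_grid_with_options(
    {'height': range(60, 71, 5), 'sex': ['Male', 'Female']},
    dtypes={'height': 'float64', 'sex': 'category'},
    index='sex',
)
print(df_opts)
```

Both options could equally be applied by the caller after the fact, so whether they belong in the function itself is part of the question.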

I'm also not sure expand_grid is the most intuitive name, but given that it duplicates R functionality, maybe it's best to leave it as is.
