Infer names in MultiIndex.from_product if inputs have a name attribute #27292

Closed
user3483203 opened this issue Jul 8, 2019 · 3 comments · Fixed by #28417

Comments

@user3483203

user3483203 commented Jul 8, 2019

It would be convenient to have from_product infer level names from inputs if at all possible.

Current behavior

>>> a = pd.Series([1, 2, 3], name='a')
>>> b = pd.Series(['a', 'b'], name='b')

>>> pd.MultiIndex.from_product([a, b])
MultiIndex(levels=[[1, 2, 3], ['a', 'b']],
           codes=[[0, 0, 1, 1, 2, 2], [0, 1, 0, 1, 0, 1]])

Current workaround

>>> arrs = [a, b]
>>> pd.MultiIndex.from_product(arrs, names=[el.name for el in arrs])
MultiIndex(levels=[[1, 2, 3], ['a', 'b']],
           codes=[[0, 0, 1, 1, 2, 2], [0, 1, 0, 1, 0, 1]],
           names=['a', 'b'])

Obviously this wouldn't make sense for lists or arrays being passed in, but in the case of a Series, it would be nice to have the name persisted.


From glancing at the source, this looks fairly simple to implement. A naive approach might be something like:

        from pandas.core.arrays.categorical import _factorize_from_iterables
        from pandas.core.reshape.util import cartesian_product

        if not is_list_like(iterables):
            raise TypeError("Input must be a list / sequence of iterables.")
        elif is_iterator(iterables):
            iterables = list(iterables)

        if names is None:
            # Proposed addition: fall back to each input's ``name`` attribute, if it has one
            names = [getattr(el, "name", None) for el in iterables]

        codes, levels = _factorize_from_iterables(iterables)
        codes = cartesian_product(codes)
        return MultiIndex(levels, codes, sortorder=sortorder, names=names)
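
Until something like this lands in pandas itself, the same idea can be tried from user code. The sketch below is only an illustration: `from_product_with_names` is a hypothetical helper name, not part of pandas, and it simply wraps the public `MultiIndex.from_product` with the name-inference step from the snippet above.

```python
import pandas as pd


def from_product_with_names(iterables, names=None):
    """Hypothetical wrapper: infer level names from inputs that carry a ``name``."""
    # Materialize first so a one-shot iterator is not consumed twice.
    iterables = list(iterables)
    if names is None:
        # Series and Index objects have ``name``; plain lists fall back to None.
        names = [getattr(el, "name", None) for el in iterables]
    return pd.MultiIndex.from_product(iterables, names=names)


a = pd.Series([1, 2, 3], name="a")
b = pd.Series(["a", "b"], name="b")

mi = from_product_with_names([a, b])
mixed = from_product_with_names([a, ["x", "y"]])

print(list(mi.names))     # ['a', 'b']
print(list(mixed.names))  # ['a', None]
```

Passing `names` explicitly still overrides the inferred values, so existing callers would keep their current behavior.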
@WillAyd
Member

WillAyd commented Jul 8, 2019

cc @toobaz. I think this makes sense, though, if you would like to try a PR.

@user3483203
Author

user3483203 commented Jul 8, 2019

@WillAyd I think this change would also make from_product more consistent with from_frame, which already infers level names from its input. I will put together a PR.
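
For comparison, a small sketch of the from_frame behavior referred to here: the level names come from the DataFrame's column labels. The frame contents are made up for illustration.

```python
import pandas as pd

# Column labels "a" and "b" become the level names of the resulting MultiIndex.
df = pd.DataFrame({"a": [1, 1, 2], "b": ["x", "y", "x"]})
frame_index = pd.MultiIndex.from_frame(df)

print(list(frame_index.names))  # ['a', 'b']
```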

@toobaz
Member

toobaz commented Jul 8, 2019

Seems like a good idea!
