Group by using time grouper and 1 other column does not work #14929

namank · 2016-12-20T17:04:40Z

Code Sample, a copy-pastable example if possible

import pandas

list_of_dicts = [{
    'time': pandas.to_datetime('2016-12-03 18:00:00'),
    'building': 'tall',
    'type': 'steel'
}, {
    'time': pandas.to_datetime('2016-12-03 18:00:00'),
    'building': 'tall',
    'type': 'brick'
}]

df = pandas.DataFrame(list_of_dicts)

  building                time   type
0     tall 2016-12-03 18:00:00  steel
1     tall 2016-12-03 18:00:00  brick
    
    
list(df.groupby(['building', pandas.Grouper(key='time', freq='1D')]))

[(<pandas.tseries.resample.TimeGrouper at 0x2b1a5978aed0>,
    building                time   type
1     tall 2016-12-03 18:00:00  brick),
 ('building',
    building                time   type
0     tall 2016-12-03 18:00:00  steel)]

Problem description

This was apparently fixed in #3794, but I am still seeing it: the group keys come back as the TimeGrouper object and the string 'building' instead of the actual values. Oddly, the problem goes away if I add one more grouping column:

list(df.groupby(['building', pandas.Grouper(key='time', freq='1D'), 'type']))

[(('tall', Timestamp('2016-12-03 00:00:00', offset='D'), 'brick'),
    building                time   type
1     tall 2016-12-03 18:00:00  brick),
 (('tall', Timestamp('2016-12-03 00:00:00', offset='D'), 'steel'),
    building                time   type
0     tall 2016-12-03 18:00:00  steel)]

Expected Output

If I pass the column itself (df['building']) instead of its name, it works correctly.

list(df.groupby([df['building'], pandas.Grouper(key='time', freq='1D')]))

[(('tall', Timestamp('2016-12-03 00:00:00', offset='D')),
    building                time   type
0     tall 2016-12-03 18:00:00  steel
1     tall 2016-12-03 18:00:00  brick)]

This should also be the output when passing just 'building', or even pandas.Grouper(key='building').
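For reference, a minimal self-contained sketch of the expected behavior. It assumes a recent pandas release (not the 0.15.2 reported below), where mixing a column name with `pandas.Grouper` yields combined keys; the variable names `groups` and `keys` are only illustrative.

```python
import pandas as pd

# Same two rows as in the report: identical building and timestamp.
df = pd.DataFrame({
    'time': pd.to_datetime(['2016-12-03 18:00:00', '2016-12-03 18:00:00']),
    'building': ['tall', 'tall'],
    'type': ['steel', 'brick'],
})

# Group by a plain column name plus a daily time grouper.
groups = list(df.groupby(['building', pd.Grouper(key='time', freq='1D')]))

# Each element is (key, sub-frame); collect the keys for inspection.
keys = [key for key, _ in groups]
print(keys)
```

If the grouping behaves as expected, both rows land in a single group keyed by the building and the start of the day.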

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.8.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.18-164.el5
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.15.2
nose: None
Cython: None
numpy: 1.9.0.dev-Unknown
scipy: 0.15.1
statsmodels: None
IPython: 0.13.2
sphinx: 1.3.1
patsy: None
dateutil: 2.4.2
pytz: 2013d
bottleneck: 0.8.0
tables: None
numexpr: None
matplotlib: 1.4.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: None
pymysql: None
psycopg2: None


jreback · 2016-12-20T17:24:13Z

You are using a pretty old version. IIRC there were some additional fixes. Try upgrading.
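A quick way to confirm whether an environment is still on the reported version before retrying. This is only a sketch: the thread does not say which release contains the fixes, so the comparison is against the reported 0.15.2, and `major_minor` and `REPORTED` are hypothetical names introduced here.

```python
import pandas as pd

# Version reported in this issue (pandas 0.15.2), as a (major, minor) tuple.
REPORTED = (0, 15)


def major_minor(version):
    """Return the first two numeric components of a version string."""
    major, minor = version.split('.')[:2]
    return int(major), int(minor)


installed = major_minor(pd.__version__)
# True when the installed pandas is newer than the reported release line.
is_newer = installed > REPORTED
print(pd.__version__, "newer than reported:", is_newer)
```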

jreback closed this as completed Dec 20, 2016

jreback added Groupby Usage Question labels Dec 20, 2016

jreback added this to the No action milestone Dec 20, 2016
