Skip to content

Sum of pd.DataFrame.groupby.sum containing NaN should return NaN ? #15674

Closed
@flipdazed

Description

@flipdazed

Code Sample, a copy-pastable example if possible

import pandas as pd
import numpy as np
d = {'l':  ['left', 'right', 'left', 'right', 'left', 'right'],
     'r': ['right', 'left', 'right', 'left', 'right', 'left'],
     'v': [-1, 1, -1, 1, -1, np.nan]}
df = pd.DataFrame(d)

Problem description

When a grouped dataframe contains a value of np.NaN the expected output is not aligned with numpy.sum or pandas.Series.sum

NaN as is given by the skipna=False flag for pd.Series.sum and also pd.DataFrame.sum

In [235]: df.v.sum(skipna=False)
Out[235]: nan

However, this behavior is not reflected in the pandas.DataFrame.groupby object

In [237]: df.groupby('l')['v'].sum()['right']
Out[237]: 2.0

and cannot be forced by applying the np.sum method directly

In [238]: df.groupby('l')['v'].apply(np.sum)['right']
Out[238]: 2.0

see this StackOverflow post for a workaround

Expected Output

In [238]: df.groupby('l')['v'].apply(np.sum)['right']
Out[238]: nan

and

In [237]: df.groupby('l')['v'].sum(skipna=False)['right']
Out[237]: nan

Output of pd.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 2.7.13.final.0 python-bits: 64 OS: Windows OS-release: 7 machine: AMD64 processor: Intel64 Family 6 Model 26 Stepping 5, GenuineIntel byteorder: little LC_ALL: None LANG: None LOCALE: None.None

pandas: 0.19.1
nose: 1.3.7
pip: 9.0.1
setuptools: 32.3.1
Cython: 0.25.2
numpy: 1.12.0
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.1.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: 2.4.0
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.4
lxml: 3.7.0
bs4: 4.5.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.4
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.43.0
pandas_datareader: None

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions