Series.nonzero(): Returns locations, not indices #19312

Closed
jscheithe opened this issue Jan 19, 2018 · 8 comments · Fixed by #19324

Comments

@jscheithe

Problem description

Documentation says

Series.nonzero(): Return the indices of the elements that are non-zero

while it actually seems to return the integer locations.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: AMD64 Family 21 Model 16 Stepping 1, AuthenticAMD
byteorder: little
LC_ALL: None
LANG: en
LOCALE: None.None

pandas: 0.21.0
pytest: None
pip: 9.0.1
setuptools: 36.5.0.post20170921
Cython: None
numpy: 1.13.3
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@TomAugspurger
Contributor

It'd be good to clarify that the return value is the integer locations.

@TomAugspurger TomAugspurger added this to the Next Major Release milestone Jan 19, 2018
@drorata
Contributor

drorata commented Jan 19, 2018

What is the difference between indices and integer locations? It seems to me that the documentation is very clear.

@TomAugspurger
Contributor

TomAugspurger commented Jan 19, 2018 via email

@drorata
Contributor

drorata commented Jan 19, 2018

Makes sense. I will stress this in the documentation and add an example. See https://github.com/drorata/pandas/tree/fix-19312

@drorata
Contributor

drorata commented Jan 19, 2018

Wait... even if the indices are not integers, the output of nonzero can still be used for slicing:

>>> import pandas as pd
>>> s = pd.Series([0, 3, 0, 4])
>>> s.index=['a', 'b', 'c', 'd']
>>> s.nonzero()
(array([1, 3]),)
>>> s.iloc[s.nonzero()[0]]
b    3
d    4
dtype: int64

Isn't it surprising?

@jscheithe
Author

jscheithe commented Jan 20, 2018

Series.iloc is for selection by integer position (and supports slicing):
s.iloc[[1, 3]] returns the 2nd and 4th values in s (regardless of the index).

The label-based indexer is Series.loc:
s.loc[['b', 'd']] returns the same elements.
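To make the distinction concrete, here is a minimal sketch that reuses the Series from the earlier comment and selects the same two elements both ways. The variable names `by_position` and `by_label` are illustrative only and do not come from the thread.

```python
import pandas as pd

# Same Series as in the earlier comment: string labels, integer values.
s = pd.Series([0, 3, 0, 4], index=["a", "b", "c", "d"])

# Positional selection: the 2nd and 4th elements, whatever the labels are.
by_position = s.iloc[[1, 3]]

# Label-based selection: the elements labelled 'b' and 'd'.
by_label = s.loc[["b", "d"]]

# Both selections pick the same elements.
print(by_position.equals(by_label))
```

With a default RangeIndex the labels and the positions coincide, which is probably why the two terms are easy to confuse.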

@rahultelgote1989

rahultelgote1989 commented Jan 20, 2018

Hi,
I am new to pandas and this repository. My question: pandas.Series.nonzero() returns the positional indices of the non-zero elements whether or not the user provides an index. Is that the correct behavior? Shouldn't it return the index labels when the user provides an index?

@TomAugspurger
Contributor

.nonzero is primarily for compatibility with NumPy, so we want to return the integer positions.
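For readers who do want the index labels, the following sketch shows one way to get from NumPy-style integer positions to labels. It calls `np.nonzero` on the underlying array rather than `Series.nonzero()`, because, as far as I know, that method was deprecated in pandas 0.24 and removed in 1.0. `Series.to_numpy()` also requires pandas 0.24 or later. The names `positions` and `labels` are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([0, 3, 0, 4], index=["a", "b", "c", "d"])

# np.nonzero returns a tuple with one array per dimension;
# element 0 holds the integer positions of the non-zero values.
positions = np.nonzero(s.to_numpy())[0]

# Indexing the Index object with positions yields the matching labels.
labels = s.index[positions]

# Positions pair with .iloc, labels pair with .loc.
print(s.iloc[positions].equals(s.loc[labels]))
```

If only the labels are needed, a boolean mask such as `s[s != 0].index` avoids the detour through integer positions.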

@jreback jreback modified the milestones: Next Major Release, 0.23.0 Jan 21, 2018
@LEO-E-100 LEO-E-100 mentioned this issue Jan 22, 2018