Extension integer dtypes (such as pd.Int32Dtype) do not work with numexpr #331

Closed
techvslife opened this issue Feb 19, 2019 · 8 comments

Comments

@techvslife

In the latest pandas version (0.24.1), combined with the latest version of numexpr (2.6.9), extension dtypes do not work (e.g. the “query” method on dataframes fails when the engine parameter is set to the default):

https://stackoverflow.com/questions/54759936/extension-dtypes-in-pandas-appear-to-have-a-bug-with-query

Code to reproduce:
import pandas as pd

df_test = pd.DataFrame(data=[4, 5, 6], columns=["col_test"])
df_test = df_test.astype(dtype={"col_test": pd.Int32Dtype()})
df_test.query("col_test != 6")

Last lines of the long error message are:

    File "...\site_packages\numexpr\necompiler.py", line 822, in evaluate
      zip(names, arguments)]
    File "...\site_packages\numexpr\necompiler.py", line 821, in <listcomp>
      signature = [(name, getType(arg)) for (name, arg) in
    File "...\site_packages\numexpr\necompiler.py", line 703, in getType
      raise ValueError("unknown type %s" % a.dtype.name)
    ValueError: unknown type object

Thanks!
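
For readers hitting the same error, here is a minimal sketch of two possible workarounds. It assumes that the failure is specific to the numexpr engine, as the traceback suggests. Neither option is confirmed in this thread, and the variable names `result_python` and `result_mask` are only illustrative.

```python
import pandas as pd

df_test = pd.DataFrame(data=[4, 5, 6], columns=["col_test"])
df_test = df_test.astype(dtype={"col_test": pd.Int32Dtype()})

# Option 1 (assumption): ask query() for the pure-Python engine so the
# expression is not handed to numexpr.
result_python = df_test.query("col_test != 6", engine="python")

# Option 2 (assumption): avoid query() and filter with a plain boolean mask.
result_mask = df_test[df_test["col_test"] != 6]

print(result_python)
print(result_mask)
```

Both options should keep the rows whose value is 4 or 5. The trade-off is losing whatever speed-up numexpr would give on large frames.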

@robbmcleod
Member

Yeah, pandas has this problem with not discussing what they are doing in any sort of open forum. Like I'm subscribed to the [Pandas-dev] mailing list but I never hear what they are actually doing, so this is a complete surprise that they have changed the numpy dtype protocol. We are upstream of pandas, and numpy is upstream of us, so if pandas wants to change how numpy dtypes work, they have to discuss that with [Numpy-discussion].

@techvslife
Author

techvslife commented Feb 20, 2019 via email

@techvslife
Author

techvslife commented Feb 20, 2019 via email

@robbmcleod
Member

Sounds like pandas should be raising an exception here.

@robbmcleod
Member

Do we have an issue raised with pandas for this issue? In my opinion pandas should be erroring before this falls down into numexpr since it's non-standard with regards to NumPy. I can certainly raise this issue with them if they aren't doing that.
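
To make the suggestion above concrete, here is a rough sketch of the kind of up-front check that could raise before an expression reaches numexpr. It is not pandas' actual behavior. `check_numexpr_compatible` is a hypothetical helper invented for illustration, and the only library call it relies on is `pandas.api.types.is_extension_array_dtype`.

```python
import pandas as pd
from pandas.api.types import is_extension_array_dtype


def check_numexpr_compatible(df):
    """Hypothetical guard: raise if any column uses a pandas extension dtype."""
    bad = [name for name, dtype in df.dtypes.items()
           if is_extension_array_dtype(dtype)]
    if bad:
        raise TypeError(
            "columns %r use extension dtypes that numexpr may not understand" % bad
        )


plain = pd.DataFrame({"col_test": [4, 5, 6]})
check_numexpr_compatible(plain)  # plain NumPy integer column: no error

nullable = plain.astype({"col_test": pd.Int32Dtype()})
try:
    check_numexpr_compatible(nullable)
    raised = False
except TypeError as exc:
    raised = True
    print(exc)
```

A caller could equally use the same check to fall back to `engine="python"` instead of raising. Which behavior is preferable is exactly the question left to the pandas side in this thread.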

@techvslife
Author

techvslife commented Apr 14, 2019

I have a post with the pandas guys at:
pandas-dev/pandas#25369
and with the numpy guys at:
numpy/numpy#13330

I believe the pandas team consider the issue an enhancement request, rather than a bug, so while they're open to someone fixing it as you suggest, it's not a priority. Hopefully as more people use the new nullable integer types, the issue will get assigned to someone for a fix.

@robbmcleod
Member

Closing as pandas has closed the issue on their side.

@techvslife
Author

I see the issue marked as “open” in pandas:
pandas-dev/pandas#25369

Or am I looking in the wrong place?
