ENH: use correct dtype in groupby cython ops when it is known (without try/except) #38291
Changes from 2 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -45,6 +45,7 @@ | |
is_datetime64_any_dtype, | ||
is_datetime64tz_dtype, | ||
is_extension_array_dtype, | ||
is_float_dtype, | ||
is_integer_dtype, | ||
is_numeric_dtype, | ||
is_period_dtype, | ||
|
@@ -507,7 +508,19 @@ def _ea_wrap_cython_operation( | |
res_values = self._cython_operation( | ||
kind, values, how, axis, min_count, **kwargs | ||
) | ||
result = maybe_cast_result(result=res_values, obj=orig_values, how=how) | ||
dtype = maybe_cast_result_dtype(orig_values.dtype, how) | ||
if is_extension_array_dtype(dtype): | ||
cls = dtype.construct_array_type() | ||
return cls._from_sequence(res_values) | ||
Review comment: I think you need to pass the dtype here as well (to ensure that lower precision is preserved). |
||
return res_values | ||
|
||
elif is_float_dtype(values.dtype): | ||
Review comment: I would really like to move this entire wrapping into a method on EA / generic casting. We do this in multiple places (e.g. also in `_reduce` operations), and this is likely leading to missing functionality in various places. |
||
# FloatingArray | ||
values = values.to_numpy(na_value=np.nan) | ||
res_values = self._cython_operation( | ||
kind, values, how, axis, min_count, **kwargs | ||
) | ||
result = type(orig_values)._from_sequence(res_values) | ||
return result | ||
|
||
raise NotImplementedError(values.dtype) | ||
|
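The review comment on the `_from_sequence` call in the diff above can be illustrated with a minimal sketch. It assumes a pandas version that ships the nullable `Float32` dtype (1.2 or later); `res_values` and the choice of `Float32Dtype` are stand-ins for illustration and are not taken from the PR.

```python
import numpy as np
import pandas as pd

# Stand-in for the float64 values a cython groupby kernel would hand back.
res_values = np.array([1.5, 2.5, 4.0], dtype=np.float64)

# Hypothetical target dtype; in the diff it comes from maybe_cast_result_dtype.
dtype = pd.Float32Dtype()
cls = dtype.construct_array_type()

# Without an explicit dtype the array type infers it from the input values.
without_dtype = cls._from_sequence(res_values)

# Passing the dtype asks for the lower-precision result explicitly.
with_dtype = cls._from_sequence(res_values, dtype=dtype)

print(without_dtype.dtype)
print(with_dtype.dtype)
```

Under these assumptions the first array should come back as `Float64` and the second as `Float32`, which is the precision loss the comment is pointing at.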
A bit more off-topic here, but: for int dtypes with lower precision, we actually want int64 (e.g. the sum of int8 gives int64).
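One likely motivation for wanting int64 here is overflow in the narrow dtype. The sketch below uses plain NumPy rather than the groupby code path, so it only illustrates the arithmetic and is not the PR's implementation; the sample values are made up.

```python
import numpy as np

# Two int8 values whose true sum (200) does not fit in int8 (max 127).
values = np.array([100, 100], dtype=np.int8)

# Forcing the accumulator to stay int8 wraps around silently.
narrow_sum = values.sum(dtype=np.int8)

# Accumulating in int64 gives the arithmetically correct result.
wide_sum = values.sum(dtype=np.int64)

print(int(narrow_sum))  # -56
print(int(wide_sum))    # 200
```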
To the extent that we can separate fixing these from the goal of avoiding the try/except, I'd like to do that.