BUG: Fix .loc/.iloc/.at/.iat cast unexpectedly with object dtype #49306

Closed
wants to merge 10 commits into from
Changes from 5 commits
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v2.0.0.rst
@@ -327,8 +327,8 @@ Indexing
- Bug in :meth:`DataFrame.reindex` casting dtype to ``object`` when :class:`DataFrame` has single extension array column when re-indexing ``columns`` and ``index`` (:issue:`48190`)
- Bug in :func:`~DataFrame.describe` when formatting percentiles in the resulting index showed more decimals than needed (:issue:`46362`)
- Bug in :meth:`DataFrame.compare` does not recognize differences when comparing ``NA`` with value in nullable dtypes (:issue:`48939`)
- Bug in :meth:`Series.loc` casting :class:`Series` to ``np.ndarray`` when assigning :class:`Series` at predefined index of ``object`` dtype :class:`Series` (:issue:`48933`)
-

Missing
^^^^^^^
- Bug in :meth:`Index.equals` raising ``TypeError`` when :class:`Index` consists of tuples that contain ``NA`` (:issue:`48446`)
4 changes: 2 additions & 2 deletions pandas/core/internals/blocks.py
@@ -973,8 +973,8 @@ def setitem(self, indexer, value) -> Block:

         # length checking
         check_setitem_lengths(indexer, value, values)

-        value = extract_array(value, extract_numpy=True)
+        if self.dtype != _dtype_obj:
Member

Do we even need the extract_array call at all? I think the point of it is to explicitly convert a Series / Index to an object-dtype ndarray. If that's the case you are trying to avoid, I don't think we even need the condition here.

Contributor Author

I think we need the extract_array call; otherwise, some tests in the indexing folder will not pass.

Member

Interesting - do you know what the failures are? Are they Index-related?

Member

Definitely add a comment here pointing back to the PR/issue.

It might be that np_can_hold_element assumes extract_array has been called; don't quote me on that.

Member

@xr-chen can you confirm what @jbrockmendel is saying? I'm not sure yet if this is worth doing as is or if we should try to solve the issue comprehensively

Contributor Author

@WillAyd I think he is right; he definitely knows extract_array and np_can_hold_element better than I do. Do you have any plans to fix it comprehensively?

+            value = extract_array(value, extract_numpy=True)
         try:
             casted = np_can_hold_element(values.dtype, value)
         except LossySetitemError:
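
The effect of the guard discussed in this hunk can be sketched outside of `Block.setitem`. The helper `maybe_unwrap` below is hypothetical and is not part of pandas; it only mirrors the `self.dtype != _dtype_obj` condition from the diff. `extract_array` is a private pandas helper, so its import path is an assumption that may not hold across releases.

```python
import numpy as np
import pandas as pd

# Private pandas helper; the import path may change between releases.
from pandas.core.construction import extract_array


def maybe_unwrap(value, target_dtype):
    """Hypothetical helper mirroring the guard in the diff above."""
    if target_dtype != np.dtype(object):
        # Non-object target: unwrap a Series/Index into its backing array.
        return extract_array(value, extract_numpy=True)
    # Object target: leave the value untouched so a Series stays a Series.
    return value


ser = pd.Series([1, 2, 3])
unwrapped = maybe_unwrap(ser, np.dtype("int64"))
kept = maybe_unwrap(ser, np.dtype(object))

print(type(unwrapped).__name__)  # ndarray
print(kept is ser)               # True
```

Whether skipping the unwrap is safe for every object-dtype caller is exactly the open question in the review thread; the sketch only shows what the condition changes for a single value.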
16 changes: 16 additions & 0 deletions pandas/tests/indexing/test_indexing.py
@@ -1112,3 +1112,19 @@ def test_scalar_setitem_series_with_nested_value_length1(value, indexer_sli):
        assert (ser.loc[0] == value).all()
    else:
        assert ser.loc[0] == value


def test_object_dtype_series_set_series_element():
    # GH48933
    s1 = Series(dtype="O", index=["a", "b"])

    s1["a"] = Series()
    s1.loc["b"] = Series()

    tm.assert_series_equal(s1.loc["a"], Series())
    tm.assert_series_equal(s1.loc["b"], Series())

    s2 = Series(dtype="O", index=["a", "b"])

    s2.iloc[1] = Series()
    tm.assert_series_equal(s2.iloc[1], Series())
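
As a rough illustration of why this test would catch the cast described in GH48933, the sketch below uses the public `pd.testing.assert_series_equal` (assumed here to behave like the `tm.assert_series_equal` used in the test) and compares an expected empty Series against both a Series and a plain ndarray. The ndarray stands in for what the unwanted cast would leave in the slot; it is constructed by hand, not produced by an actual assignment.

```python
import numpy as np
import pandas as pd

expected = pd.Series(dtype=object)

# A stored Series compares equal to the expected empty Series.
pd.testing.assert_series_equal(pd.Series(dtype=object), expected)

# An ndarray (a stand-in for the result of the unwanted cast) is rejected
# by the type check before any values are compared.
cast_result = np.array([], dtype=object)
try:
    pd.testing.assert_series_equal(cast_result, expected)
    caught = False
except AssertionError:
    caught = True

print(caught)  # True
```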