Skip to content

feat: Improve DataFrame.__getitem__ consistency #2393

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 97 commits into from
Apr 23, 2025

Conversation

MarcoGorelli
Copy link
Member

@MarcoGorelli MarcoGorelli commented Apr 17, 2025

closes #2405

What type of PR is this? (check all applicable)

  • πŸ’Ύ Refactor
  • ✨ Feature
  • πŸ› Bug Fix
  • πŸ”§ Optimization
  • πŸ“ Documentation
  • βœ… Test
  • 🐳 Other

Related issues

  • Related issue #<issue number>
  • Closes #<issue number>

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

@dangotbanned
Copy link
Member

dangotbanned commented Apr 17, 2025

Yes please 😍

I'm pretty sure these were on the chopping block from

Comment on lines 412 to 419
is_native_series = self.__narwhals_namespace__()._series._is_native
if isinstance(rows, int):
compliant = compliant._gather([rows])
elif isinstance(rows, (slice, range)):
compliant = compliant._gather_slice(rows)

elif is_sequence_like_ints(rows) or is_native_series(rows):
compliant = compliant._gather(rows)
Copy link
Member

@dangotbanned dangotbanned Apr 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel like I didn't do a good enough job documenting this pattern

@overload
def from_native(self, data: NativeFrameT, /) -> EagerDataFrameT: ...
@overload
def from_native(self, data: NativeSeriesT, /) -> EagerSeriesT: ...
def from_native(
self, data: NativeFrameT | NativeSeriesT | Any, /
) -> EagerDataFrameT | EagerSeriesT:
if self._dataframe._is_native(data):
return self._dataframe.from_native(data, context=self)
elif self._series._is_native(data):
return self._series.from_native(data, context=self)
msg = f"Unsupported type: {type(data).__name__!r}"
raise TypeError(msg)

@MarcoGorelli MarcoGorelli changed the title WIP feat: Improve DataFrame.__getitem__ consistency feat: Improve DataFrame.__getitem__ consistency Apr 18, 2025
@MarcoGorelli MarcoGorelli added the enhancement New feature or request label Apr 18, 2025
@dangotbanned dangotbanned removed their assignment Apr 21, 2025
Copy link

@lorentzenchr lorentzenchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MarcoGorelli I think this PR will be super useful for scikit-learn.
Do you also envision to allow for boolean masks, especially for row indexing? (Not a must, but - again - useful.)

@MarcoGorelli
Copy link
Member Author

Thanks for your feedback!

Do you also envision to allow for boolean masks, especially for row indexing?

Maybe, in the meantime you can use filter for that

@MarcoGorelli
Copy link
Member Author

MarcoGorelli commented Apr 23, 2025

thanks all for comments πŸ™ really appreciate it!

especially:

  • @lorentzenchr for highlighting this area of improvement - we'd added __getitem__ a while ago but then hadn't touched it in some time, really glad to have revisited it
  • @sjdenny for the hypothesis tests, which helped avoid regressing on some edge cases
  • @dangotbanned for your patient reviews and help with annotations, which make everything SO much better

EDIT: I stripped down the commit message to avoid it containing 97 commits (to avoid git log becoming unreadable), but in doing so accidentally removed the co-author information - sorry about that, I'll be more careful in the future

@MarcoGorelli MarcoGorelli merged commit 40fabfe into narwhals-dev:main Apr 23, 2025
30 checks passed
dangotbanned added a commit that referenced this pull request Apr 26, 2025
All of these are equivalent, follow-up to (#2393)
dangotbanned added a commit that referenced this pull request Apr 29, 2025
@dangotbanned dangotbanned mentioned this pull request Apr 29, 2025
10 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

enh: allow slicing DataFrame with native objects enh: support slicing DataFrame (or Series) with Series
4 participants