GH-39010: [Python] Introduce maps_as_pydicts parameter for to_pylist, to_pydict, as_py #45471

Merged
merged 10 commits into apache:main on Feb 20, 2025

Conversation

jonded94 (Contributor) commented Feb 9, 2025

Rationale for this change

Currently, MapScalar/Array types are not deserialized into proper Python dicts, which is unfortunate since it breaks "roundtrips" from Python -> Arrow -> Python:

import pyarrow as pa

schema = pa.schema([pa.field('x', pa.map_(pa.string(), pa.int64()))])
data = [{'x': {'a': 1}}]
pa.RecordBatch.from_pylist(data, schema=schema).to_pylist()
# [{'x': [('a', 1)]}]

This is especially bad when storing TiBs of deeply nested data (think of lists in structs in maps...) that were created from Python and serialized into Arrow/Parquet, since they can't be read in again with native pyarrow methods without doing extremely ugly and computationally costly workarounds.

What changes are included in this PR?

A new parameter maps_as_pydicts is introduced to to_pylist, to_pydict, and as_py, which allows proper roundtrips:

import pyarrow as pa

schema = pa.schema([pa.field('x', pa.map_(pa.string(), pa.int64()))])
data = [{'x': {'a': 1}}]
pa.RecordBatch.from_pylist(data, schema=schema).to_pylist(maps_as_pydicts="strict")
# [{'x': {'a': 1}}]

Are these changes tested?

Yes. Tests for to_pylist and to_pydict are included for pyarrow.Table, and the low-level MapScalar is tested as well, in particular a nesting with ListScalar and StructScalar.

Also, duplicate keys should now raise an error, which is also tested for.
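
For illustration, a minimal sketch of the duplicate-key behaviour (the map value with a duplicated key is constructed just for this example):

import pyarrow as pa

# A single map value containing the key 'a' twice
arr = pa.array([[('a', 1), ('a', 2)]], type=pa.map_(pa.string(), pa.int64()))

arr[0].as_py(maps_as_pydicts="strict")  # raises because of the duplicate key
arr[0].as_py(maps_as_pydicts="lossy")   # warns and deduplicates into a dict
arr[0].as_py()                          # default: keeps the list-of-tuples representation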

Are there any user-facing changes?

Yes. The as_py() method on Scalar instances can be called with a new keyword argument maps_as_pydicts.

As a consequence, if you implement your own Scalar subclass (for example for an extension type), you should change its signature to accept that new argument. For example, this definition:

class JSONArrowScalar(pa.ExtensionScalar):
    def as_py(self):
        return deserialize_json(self.value.as_py() if self.value else None)

could be changed to:

class JSONArrowScalar(pa.ExtensionScalar):
    def as_py(self, **kwargs):
        return deserialize_json(self.value.as_py(**kwargs) if self.value else None)

Fix ExampleUuidScalarType

Add tests for `maps_as_pydicts`

Add test for duplicate map keys

Formatting fixes

Add docstring for 'maps_as_pydicts'

Formatting fixes

Call from_arrays from Table

Fix last hopefully issues

Correct MapScalar method "as_py" when there are multiple keys present

github-actions bot commented Feb 9, 2025

⚠️ GitHub issue #39010 has been automatically assigned in GitHub to PR creator.

pitrou (Member) commented Feb 10, 2025

While this is not a bad idea in itself, it seems like the roundtripping concern could be solved more efficiently by making from_pylist accept a list of tuples for map fields.

pitrou (Member) commented Feb 10, 2025

Also:

they can't be read in again with native pyarrow methods without doing extremely ugly and computationally costly workarounds

Please note that from_pylist and to_pylist are quite costly in themselves. Usually you want to avoid these kinds of roundtrips to/from Python objects if you are concerned with performance.

jonded94 (Contributor, Author)

While this is not a bad idea in itself, it seems like the roundtripping concern could be solved more efficiently by making from_pylist accept a list of tuples for map fields.

Let me clarify what this is about. Map fields are already creatable with from_pylist by using lists of tuples, as shown in the tests I added; the code in my initial message demonstrates this as well. Fundamentally, this is about adding opt-in behaviour to to_pylist so it produces what one would expect from a Python perspective:

data = [{'x': {'a': 1}}]
pa.RecordBatch.from_pylist(data, schema=schema).to_pylist()
^---------------------------------------------^
  this works fine, data will be properly encoded in the Arrow way of encoding Maps
                                                ^---------^
                                                this will give lists of tuples instead of dicts 

You can use data = [{'x': [('a', 1)]}] here too; it yields the same RecordBatch, and that would technically qualify as a proper "roundtrip". But that is not what this issue is about: it is about deserializing Arrow Map types into the ~"expected" Python equivalent, at least as an opt-in, as pandas has supported for a couple of years now (shown in the linked GitHub issue).
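
As a small sketch of the equivalence mentioned above (same schema and data as in the initial example):

import pyarrow as pa

schema = pa.schema([pa.field('x', pa.map_(pa.string(), pa.int64()))])

# Both input shapes are accepted by from_pylist and yield the same RecordBatch
rb_from_dict = pa.RecordBatch.from_pylist([{'x': {'a': 1}}], schema=schema)
rb_from_tuples = pa.RecordBatch.from_pylist([{'x': [('a', 1)]}], schema=schema)
assert rb_from_dict.equals(rb_from_tuples)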

Please note that from_pylist and to_pylist are quite costly in themselves.

Yes, but this is part of a very large distributed machine learning setup, where relatively intricate filters are applied to deeply nested list/struct/map columns. The compute for the actual machine learning outweighs the compute needed to deserialize Python objects by many orders of magnitude.

For pure data queries, we would not use bare Python objects of course.

pitrou (Member) commented Feb 10, 2025

it's about deserializing Map Arrow types as the ~"expected" Python equivalent, at least as an opt-in method such as pandas already supports for some couple of years now (shown in the linked Github issue).

I see, thanks. Then, do we want to reuse the same parameter signature as in the Pandas-related PR? I.e., allow None, "lossy", or "strict", rather than a boolean.

jonded94 (Contributor, Author)

allow None, "lossy", or "strict", rather than a boolean.

Sure, I actually also stumbled across that when I revisited the original GitHub issue. Before I do that, I'd like to ask whether you're generally fine with adding this new parameter to every as_py method? This has to be done because the to_pylist method calls as_py on its member values (which can be of all possible types), and therefore all array/scalar types have to support this parameter. I did not see any other way to easily implement this. I'm willing to make quick progress here, so if you come up with another idea, let me know.

pitrou (Member) commented Feb 10, 2025

Sure, I actually also stumbled across that when I revisited that original Github issue. Before I do that, I'd like to ask whether you're generally fine with adding this new parameter to every as_py method?

That sounds ok to me. Ideally, to_pylist wouldn't call as_py in a loop (which is going to be quite slow), but that would be a major refactor.

pitrou (Member) commented Feb 10, 2025

By the way, we probably want to make the new parameter keyword-only?
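
A keyword-only parameter here would mean a signature along these lines (a minimal sketch, not the actual diff), so that callers must pass maps_as_pydicts by name:

def as_py(self, *, maps_as_pydicts=None):
    ...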

jonded94 (Contributor, Author)

I addressed the remarks :) There is some weird error in the "Docs" job; I don't know what that is about.

pitrou (Member) commented Feb 11, 2025

Hmm, it looks like some of the CI failures will need #45500 to be merged first

jonded94 (Contributor, Author)

I rebased the branch; now the CI tests seem fine again, I think?

Could we get an approval/review of this? :)

pitrou (Member) left a comment

Thanks @jonded94! This looks good in principle; here are some assorted comments.

Comment on lines 1667 to 1668
This can change the ordering of (key, value) pairs, and will
deduplicate multiple keys, resulting in a possible loss of data.
Member

I think the ordering comment is obsolete, as Python dicts are ordered nowadays. Unless the underlying implementation does something weird, ordering should therefore be preserved.

Contributor Author

Removed the ordering part, added some explanation of which value survives on duplicate keys.

Arrow Map, as in [(key1, value1), (key2, value2), ...].

If 'lossy' or 'strict', convert Arrow Map arrays to native Python dicts.
This can change the ordering of (key, value) pairs, and will
Member

Same comment re: ordering

Contributor Author

Same as above

with pytest.raises(ValueError):
    assert s.as_py(maps_as_pydicts="strict")

assert s.as_py(maps_as_pydicts="lossy") == {'a': 2}
Member

Can we check that a warning is actually emitted? See pytest.warns

Contributor Author

Implemented a check for this warning
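
A sketch of what such a check could look like (s is the scalar from the quoted test; the exact warning class used by the implementation is an assumption here):

import pytest

# Verify that 'lossy' deduplication actually emits a warning
with pytest.warns(UserWarning):
    assert s.as_py(maps_as_pydicts="lossy") == {'a': 2}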

raise ValueError(
    "Invalid value for 'maps_as_pydicts': "
    + "valid values are 'lossy', 'strict' or `None` (default). "
    + f"Received '{maps_as_pydicts}'."
Member

Nit: it may be more idiomatic to use the repr here

Suggested change
+ f"Received '{maps_as_pydicts}'."
+ f"Received {maps_as_pydicts!r}."

Contributor Author

Implemented the suggested change

for key, value in self:
    if key in result_dict:
        if maps_as_pydicts == "strict":
            raise ValueError(
Member

I would make this a KeyError. Also, the message should perhaps contain the duplicate key?

Contributor Author

Made it a KeyError
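
An illustrative sketch of how the strict branch from the hunk quoted above could look after this change (the exact message wording is an assumption, not the merged code):

if maps_as_pydicts == "strict":
    # Raise a KeyError that names the offending duplicate key
    raise KeyError(f"Duplicate key {key!r} found in map")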

pitrou (Member) commented Feb 20, 2025

@github-actions crossbow submit -g python


Revision: 93045c4

Submitted crossbow builds: ursacomputing/crossbow @ actions-9728f80818

Task Status
example-python-minimal-build-fedora-conda GitHub Actions
example-python-minimal-build-ubuntu-venv GitHub Actions
test-conda-python-3.10 GitHub Actions
test-conda-python-3.10-hdfs-2.9.2 GitHub Actions
test-conda-python-3.10-hdfs-3.2.1 GitHub Actions
test-conda-python-3.10-pandas-latest-numpy-latest GitHub Actions
test-conda-python-3.11 GitHub Actions
test-conda-python-3.11-dask-latest GitHub Actions
test-conda-python-3.11-dask-upstream_devel GitHub Actions
test-conda-python-3.11-hypothesis GitHub Actions
test-conda-python-3.11-pandas-latest-numpy-1.26 GitHub Actions
test-conda-python-3.11-pandas-latest-numpy-latest GitHub Actions
test-conda-python-3.11-pandas-nightly-numpy-nightly GitHub Actions
test-conda-python-3.11-pandas-upstream_devel-numpy-nightly GitHub Actions
test-conda-python-3.11-spark-master GitHub Actions
test-conda-python-3.12 GitHub Actions
test-conda-python-3.12-cpython-debug GitHub Actions
test-conda-python-3.13 GitHub Actions
test-conda-python-3.9 GitHub Actions
test-conda-python-3.9-pandas-1.1.3-numpy-1.19.5 GitHub Actions
test-conda-python-emscripten GitHub Actions
test-cuda-python-ubuntu-22.04-cuda-11.7.1 GitHub Actions
test-debian-12-python-3-amd64 GitHub Actions
test-debian-12-python-3-i386 GitHub Actions
test-fedora-39-python-3 GitHub Actions
test-ubuntu-22.04-python-3 GitHub Actions
test-ubuntu-22.04-python-313-freethreading GitHub Actions
test-ubuntu-24.04-python-3 GitHub Actions

github-actions bot added the awaiting committer review label and removed the awaiting review label Feb 20, 2025
pitrou (Member) left a comment

+1, will merge if CI is green.

pitrou (Member) commented Feb 20, 2025

CI failures are unrelated.

pitrou merged commit f6bfa7b into apache:main Feb 20, 2025
16 checks passed
pitrou removed the awaiting committer review label Feb 20, 2025
Linchin commented Feb 24, 2025

Just FYI, this might cause a backward incompatibility issue because user-defined extension types are not expecting maps_as_pydicts as an argument for as_py(). We are seeing this in our prerelease tests:

_________________________ test_json_arrow_record_batch _________________________

    def test_json_arrow_record_batch():
        data = [
            json.dumps(value, sort_keys=True, separators=(",", ":"))
            for value in JSON_DATA.values()
        ]
        arr = pa.array(data, type=db_dtypes.JSONArrowType())
        batch = pa.RecordBatch.from_arrays([arr], ["json_col"])
        sink = pa.BufferOutputStream()
    
        with pa.RecordBatchStreamWriter(sink, batch.schema) as writer:
            writer.write_batch(batch)
    
        buf = sink.getvalue()
    
        with pa.ipc.open_stream(buf) as reader:
            result = reader.read_all()
    
        json_col = result.column("json_col")
        assert isinstance(json_col.type, db_dtypes.JSONArrowType)
    
>       s = json_col.to_pylist()

tests/unit/test_json.py:225: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pyarrow/table.pxi:1380: in pyarrow.lib.ChunkedArray.to_pylist
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   TypeError: JSONArrowScalar.as_py() got an unexpected keyword argument 'maps_as_pydicts'

pyarrow/array.pxi:1677: TypeError
- generated xml file: /home/runner/work/python-db-dtypes-pandas/python-db-dtypes-pandas/unit_prerelease_3.12_sponge_log.xml -
=========================== short test summary info ============================
FAILED tests/unit/test_json.py::test_json_arrow_to_pylist - TypeError: JSONArrowScalar.as_py() got an unexpected keyword argument 'maps_as_pydicts'
FAILED tests/unit/test_json.py::test_json_arrow_record_batch - TypeError: JSONArrowScalar.as_py() got an unexpected keyword argument 'maps_as_pydicts'
2 failed, 298 passed in 1.81s

(Link: https://github.com/googleapis/python-db-dtypes-pandas/actions/runs/13017353125/job/37736481135?pr=310)

kou pushed a commit to kou/arrow that referenced this pull request Feb 25, 2025
…o_pylist`, `to_pydict`, `as_py` (apache#45471)

@omatthew98

Just FYI, this might cause a backward incompatibility issue because user-defined extension types are not expecting maps_as_pydicts as an argument for as_py(). We are seeing this in our prerelease tests: […]

We (Ray Data team) are also running into backward compatibility issues like this in our tests against pyarrow nightly with the same error mentioned here:

=================================== FAILURES ===================================
____________ test_convert_to_pyarrow_array_object_ext_type_fallback ____________

    def test_convert_to_pyarrow_array_object_ext_type_fallback():
        column_values = create_ragged_ndarray(
            [
                "hi",
                1,
                None,
                [[[[]]]],
                {"a": [[{"b": 2, "c": UserObj(i=123)}]]},
                UserObj(i=456),
            ]
        )
        column_name = "py_object_column"

        # First, assert that straightforward conversion into Arrow native types fails
        with pytest.raises(ArrowConversionError) as exc_info:
            _convert_to_pyarrow_native_array(column_values, column_name)

        assert (
            str(exc_info.value)
            == "Error converting data to Arrow: ['hi' 1 None list([[[[]]]]) {'a': [[{'b': 2, 'c': UserObj(i=123)}]]}\n UserObj(i=456)]"  # noqa: E501
        )

        # Subsequently, assert that fallback to `ArrowObjectExtensionType` succeeds
        pa_array = convert_to_pyarrow_array(column_values, column_name)

>       assert pa_array.to_pylist() == column_values.tolist()

python/ray/air/tests/test_arrow.py:121:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   TypeError: as_py() got an unexpected keyword argument 'maps_as_pydicts'

pitrou (Member) commented Feb 25, 2025

@Linchin @omatthew98 I think the way around this would be to take a **kwargs in your as_py method and then forward it to any nested as_py call (if any).

For example turn this:

class JSONArrowScalar(pa.ExtensionScalar):
    def as_py(self):
        return JSONArray._deserialize_json(self.value.as_py() if self.value else None)

into this:

class JSONArrowScalar(pa.ExtensionScalar):
    def as_py(self, **kwargs):
        return JSONArray._deserialize_json(self.value.as_py(**kwargs) if self.value else None)

pitrou (Member) commented Feb 25, 2025

I've updated the PR description; we should remember to call out this potential incompatibility in the release notes for the next version.

andishgar pushed a commit to andishgar/arrow that referenced this pull request Feb 25, 2025
…o_pylist`, `to_pydict`, `as_py` (apache#45471)

raulchen pushed a commit to ray-project/ray that referenced this pull request Mar 3, 2025
#51041)

## Why are these changes needed?
Our tests with pyarrow nightly caught a backwards incompatibility bug
with a [recent pyarrow
change](apache/arrow#45471). To fix this we
simply need to pass along kwargs in our `as_py` method as suggested by
the pyarrow team
[here](apache/arrow#45471 (comment)).


---------

Signed-off-by: Matthew Owen <[email protected]>
xsuler pushed a commit to antgroup/ant-ray that referenced this pull request Mar 4, 2025
ray-project#51041)

abrarsheikh pushed a commit to ray-project/ray that referenced this pull request Mar 8, 2025
#51041)

park12sj pushed a commit to park12sj/ray that referenced this pull request Mar 18, 2025
ray-project#51041)

jaychia pushed a commit to jaychia/ray that referenced this pull request Mar 19, 2025
ray-project#51041)
