Commit 824a273

hedeershowk, martin-sicho, mroeschke, and pre-commit-ci[bot] authored
BUG: add pyarrow autogenerated prefix (#55115)
* add pyarrow autogenerated prefix
* whats new bug fix
* test with no head and pyarrow
* only test pyarrow
* BUG: This fixes #55009 (`raw=True` caused `apply` method of `DataFrame` to ignore passed arguments) (#55089)
* fixes #55009
* update documentation
* write documentation
* add test
* change formatting
* cite DataDrame directly in docs

Co-authored-by: Matthew Roeschke <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: Matthew Roeschke <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* PR review feedback
* Update doc/source/whatsnew/v2.2.0.rst

Co-authored-by: Matthew Roeschke <[email protected]>

* alphabetical whatsnew

---------

Co-authored-by: Martin Šícho <[email protected]>
Co-authored-by: Matthew Roeschke <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 61d2056 commit 824a273

3 files changed: +25 -0 lines changed

doc/source/whatsnew/v2.2.0.rst

Lines changed: 1 addition & 0 deletions
@@ -314,6 +314,7 @@ MultiIndex
 I/O
 ^^^
 - Bug in :func:`read_csv` where ``on_bad_lines="warn"`` would write to ``stderr`` instead of raise a Python warning. This now yields a :class:`.errors.ParserWarning` (:issue:`54296`)
+- Bug in :func:`read_csv` with ``engine="pyarrow"`` where ``usecols`` wasn't working with a csv with no headers (:issue:`54459`)
 - Bug in :func:`read_excel`, with ``engine="xlrd"`` (``xls`` files) erroring when file contains NaNs/Infs (:issue:`54564`)
 - Bug in :func:`to_excel`, with ``OdsWriter`` (``ods`` files) writing boolean/string value (:issue:`54994`)

pandas/io/parsers/arrow_parser_wrapper.py

Lines changed: 6 additions & 0 deletions
@@ -130,6 +130,12 @@ def handle_warning(invalid_row):
             )
         }
         self.convert_options["strings_can_be_null"] = "" in self.kwds["null_values"]
+        # autogenerated column names are prefixed with 'f' in pyarrow.csv
+        if self.header is None and "include_columns" in self.convert_options:
+            self.convert_options["include_columns"] = [
+                f"f{n}" for n in self.convert_options["include_columns"]
+            ]
+
         self.read_options = {
             "autogenerate_column_names": self.header is None,
             "skip_rows": self.header

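The added branch in `arrow_parser_wrapper.py` can be read in isolation as a small name-mapping step. The following is a minimal standalone sketch of that idea, not the pandas implementation itself. The helper names `to_autogenerated_names` and `adjust_include_columns` are hypothetical and exist only for illustration. The one behavior taken from the diff is that positional `include_columns` entries get an `f` prefix when there is no header row.

```python
def to_autogenerated_names(include_columns):
    """Map positional column indices to 'f'-prefixed names (hypothetical helper)."""
    return [f"f{n}" for n in include_columns]


def adjust_include_columns(convert_options, header):
    """Return a copy of convert_options, prefixing include_columns when header is None.

    Mirrors the condition in the diff: the rewrite happens only when there is
    no header row and the caller actually asked for a column subset.
    """
    adjusted = dict(convert_options)  # avoid mutating the caller's dict
    if header is None and "include_columns" in adjusted:
        adjusted["include_columns"] = to_autogenerated_names(
            adjusted["include_columns"]
        )
    return adjusted


# With no header, positions 0 and 1 become the names 'f0' and 'f1'.
options = adjust_include_columns({"include_columns": [0, 1]}, header=None)
print(options["include_columns"])  # ['f0', 'f1']
```

When a header row is present, or no column subset was requested, the options pass through unchanged, which matches the guard in the diff.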
pandas/tests/io/parser/test_header.py

Lines changed: 18 additions & 0 deletions
@@ -684,3 +684,21 @@ def test_header_delim_whitespace(all_parsers):
     result = parser.read_csv(StringIO(data), delim_whitespace=True)
     expected = DataFrame({"a,b": ["1,2", "3,4"]})
     tm.assert_frame_equal(result, expected)
+
+
+def test_usecols_no_header_pyarrow(pyarrow_parser_only):
+    parser = pyarrow_parser_only
+    data = """
+a,i,x
+b,j,y
+"""
+    result = parser.read_csv(
+        StringIO(data),
+        header=None,
+        usecols=[0, 1],
+        dtype="string[pyarrow]",
+        dtype_backend="pyarrow",
+        engine="pyarrow",
+    )
+    expected = DataFrame([["a", "i"], ["b", "j"]], dtype="string[pyarrow]")
+    tm.assert_frame_equal(result, expected)
