BUG: ValueError when accessing dataFrame with array attribute #59196

Closed
@Zybulon

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
import numpy as np

attrs = {"A": "B", "G": np.array([1.2, 2.4])}

# This one works
arr = np.random.rand(60, 1)
df_named = pd.DataFrame(arr)
df_named.attrs = attrs
print(df_named[0])

# This one works
arr = np.random.rand(61, 1)
df_named = pd.DataFrame(arr)
df_named.attrs = {"A": "B", "G": "A"}
print(df_named[0])

# This one fails
arr = np.random.rand(61, 1)
df_named = pd.DataFrame(arr)
df_named.attrs = attrs
print(df_named)  # Printing the DataFrame works
print(df_named[0])  # Printing the first column raises ValueError

Issue Description

Hello,

I have a DataFrame of shape (61, 1) with two attributes in attrs, one of which is a NumPy array. I cannot print the first column (a Series) of this DataFrame.
Doing so raises the following error:

Traceback (most recent call last):

  File ~\miniforge-pypy3\envs\h5pandas_dev\Lib\site-packages\spyder_kernels\py3compat.py:356 in compat_exec
    exec(code, globals, locals)

  File d:\documents\perso\travail\mbda\pandas_extension\h5pandas\tests\debug.py:23
    print(df_named[0])  # This does not works

  File ~\miniforge-pypy3\envs\h5pandas_dev\Lib\site-packages\pandas\core\series.py:1784 in __repr__
    return self.to_string(**repr_params)

  File ~\miniforge-pypy3\envs\h5pandas_dev\Lib\site-packages\pandas\core\series.py:1871 in to_string
    formatter = fmt.SeriesFormatter(

  File ~\miniforge-pypy3\envs\h5pandas_dev\Lib\site-packages\pandas\io\formats\format.py:225 in __init__
    self._chk_truncate()

  File ~\miniforge-pypy3\envs\h5pandas_dev\Lib\site-packages\pandas\io\formats\format.py:247 in _chk_truncate
    series = concat((series.iloc[:row_num], series.iloc[-row_num:]))

  File ~\miniforge-pypy3\envs\h5pandas_dev\Lib\site-packages\pandas\core\reshape\concat.py:395 in concat
    return op.get_result()

  File ~\miniforge-pypy3\envs\h5pandas_dev\Lib\site-packages\pandas\core\reshape\concat.py:650 in get_result
    return result.__finalize__(self, method="concat")

  File ~\miniforge-pypy3\envs\h5pandas_dev\Lib\site-packages\pandas\core\generic.py:6273 in __finalize__
    have_same_attrs = all(obj.attrs == attrs for obj in other.objs[1:])

  File ~\miniforge-pypy3\envs\h5pandas_dev\Lib\site-packages\pandas\core\generic.py:6273 in <genexpr>
    have_same_attrs = all(obj.attrs == attrs for obj in other.objs[1:])

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
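
The last frame of the traceback compares two attrs dicts with ==. The following is a minimal sketch, using only NumPy and the standard library, of why such a comparison can raise this ValueError. The deepcopy call is an assumption: it stands in for whatever gives the two Series being concatenated distinct (but equal) array objects in their attrs.

```python
import copy

import numpy as np

attrs = {"A": "B", "G": np.array([1.2, 2.4])}

# Both dicts hold the very same array object, so the dict comparison
# short-circuits on identity and never evaluates array == array.
same_object_result = attrs == dict(attrs)

# Here the arrays are equal but distinct objects. Comparing them returns an
# element-wise boolean array, and converting that array to a single bool fails.
attrs_copy = copy.deepcopy(attrs)
try:
    attrs == attrs_copy
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(same_object_result)
print(error_message)
```

The same mechanism would explain why replacing the array with a string, as in the second example, avoids the error: comparing two strings yields a plain bool.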

However, printing the whole DataFrame works and does not raise the ValueError.
If attrs does not contain the array, there is no ValueError.
If the DataFrame has only 60 rows, there is no ValueError.
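
The 60-row boundary coincides with the default value of the display.max_rows option (60), and the traceback shows that truncating the Series repr goes through concat. As a possible workaround, and only a sketch under the assumption that the attribute can be stored as a plain list instead of an array, the attrs comparison then yields an ordinary bool:

```python
import numpy as np
import pandas as pd

arr = np.random.rand(61, 1)
df = pd.DataFrame(arr)

# Workaround (assumption): store the array as a list, so that comparing
# attrs dicts with == gives a plain bool instead of a boolean array.
df.attrs = {"A": "B", "G": np.array([1.2, 2.4]).tolist()}

# 60 unless the option was changed; longer Series are truncated when printed.
max_rows = pd.get_option("display.max_rows")

# Building the repr of the 61-row column exercises the truncation path.
text = repr(df[0])
print(max_rows)
print(text)
```

The list can be converted back with np.asarray(df.attrs["G"]) where an array is needed. This only sidesteps the comparison; it does not change how pandas handles array-valued attrs.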

Expected Behavior

Printing the Series should not raise a ValueError, regardless of the number of rows or the type of the values stored in attrs.
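
To illustrate what an array-tolerant comparison could look like, here is a sketch of a hypothetical helper, attrs_equal. It is not part of pandas and is not presented as the actual fix; it only shows one way to compare two attrs dicts without converting an array to a bool.

```python
import numpy as np


def attrs_equal(left: dict, right: dict) -> bool:
    """Hypothetical helper: compare two attrs dicts, tolerating array values."""
    if left.keys() != right.keys():
        return False
    for key, left_value in left.items():
        right_value = right[key]
        if isinstance(left_value, np.ndarray) or isinstance(right_value, np.ndarray):
            # np.array_equal returns a single bool (False on shape mismatch).
            if not np.array_equal(left_value, right_value):
                return False
        elif left_value != right_value:
            return False
    return True


a = {"A": "B", "G": np.array([1.2, 2.4])}
b = {"A": "B", "G": np.array([1.2, 2.4])}  # equal values, distinct array object
c = {"A": "B", "G": np.array([1.2, 9.9])}

print(attrs_equal(a, b))
print(attrs_equal(a, c))
```

This sketch only looks at top-level values; arrays nested inside lists or dicts would need additional handling.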

Installed Versions

INSTALLED VERSIONS

commit : d9cdd2e
python : 3.12.4.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19045
machine : AMD64
processor : AMD64 Family 23 Model 1 Stepping 1, AuthenticAMD
byteorder : little
LC_ALL : None
LANG : en
LOCALE : fr_FR.cp1252

pandas : 2.2.2
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0
setuptools : 70.1.1
pip : 24.0
Cython : None
pytest : 8.2.2
hypothesis : None
sphinx : 7.3.7
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.4
IPython : 8.26.0
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.8.4
numba : None
numexpr : 2.8.7
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 16.1.0
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : 3.9.2
tabulate : 0.9.0
xarray : None
xlrd : None
zstandard : 0.22.0
tzdata : 2024.1
qtpy : 2.4.1
pyqt5 : None

Labels

Bug, Needs Triage
