Code Sample
import numpy as np
import pandas as pd

mi = pd.MultiIndex.from_tuples([['A0', 'B0'], ['A0', 'B1'],
                                ['A1', 'B0'], ['A1', 'B1'], ['A3', np.nan]],
                               names=['ia', 'ib'])
df = pd.DataFrame(np.arange(10).reshape(5, 2), mi, columns=['bar', 'foo'])
df2 = df.copy()
# swap the order of the labels in the second level so that the level is unsorted
df2.index.set_levels(['B1', 'B0'], level=1, inplace=True)
df2

        bar  foo
ia ib
A0 B1     0    1
   B0     2    3
A1 B1     4    5
   B0     6    7
A3 NaN    8    9

# now sort_index on df2 fills in B0 for the (A3, NaN) row, but sort_index on df does not
df2.sort_index()

       bar  foo
ia ib
A0 B0    2    3
   B1    0    1
A1 B0    6    7
   B1    4    5
A3 B0    8    9
Problem description
I suspect that an unsorted level in the MultiIndex of a DataFrame causes sort_index to fill in NaN index entries automatically.
Note: I ended up with a DataFrame whose MultiIndex has an unsorted level after a broadcast operation.
Here is the code:
def mklbl(prefix, n):
    return ["%s%s" % (prefix, i) for i in range(n)]

miindex = pd.MultiIndex.from_product([mklbl('A', 3),
                                      mklbl('B', 2),
                                      mklbl('C', 2),
                                      mklbl('D', 2)],
                                     names=['ia', 'ib', 'ic', 'id'])
micolumns = ['foo', 'bah']
dfmi = pd.DataFrame(np.arange(len(miindex) * len(micolumns))
                    .reshape((len(miindex), len(micolumns))),
                    index=miindex,
                    columns=micolumns).sort_index().sort_index(axis=1)
dfmi = dfmi.drop('A2')
# now the first index level of dfmi lists more values than it actually holds,
# and it may lead to the index level changing from ['A0', 'A1', 'A2'] to ['A1', 'A0']
bs = dfmi.loc[('A0', 'B0')].copy().rename({'D1': 'D2'})
# this will lead to NaN in some rows due to broadcasting.
(dfmi + bs).sort_index()

               bah   foo
ic id ia ib
C0 D0 A0 B0    2.0   0.0
         B1   10.0   8.0
      A1 B0   18.0  16.0
         B1   26.0  24.0
   D1 A0 B0    NaN   NaN
         B1    NaN   NaN
      A1 B0    NaN   NaN
         B1    NaN   NaN
   D2 A0 NaN   NaN   NaN
C1 D0 A0 B0   10.0   8.0
         B1   18.0  16.0
      A1 B0   26.0  24.0
         B1   34.0  32.0
   D1 A0 B0    NaN   NaN
         B1    NaN   NaN
      A1 B0    NaN   NaN
         B1    NaN   NaN
   D2 A0 NaN   NaN   NaN

# again the (D2, NaN, NaN) row gets changed
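
The comment in the code above about the first level listing more values than it actually holds can be checked directly. The following is only a minimal sketch on a smaller, made-up frame (the names `frame`, `dropped` and the single `foo` column are assumptions for illustration, not part of the report). It compares the labels stored in the level with the labels that rows actually use, and shows that `MultiIndex.remove_unused_levels()` returns an index without the leftover label.

```python
import numpy as np
import pandas as pd

# Small frame with a two-level index; the level names mirror the report.
idx = pd.MultiIndex.from_product([['A0', 'A1', 'A2'], ['B0', 'B1']],
                                 names=['ia', 'ib'])
frame = pd.DataFrame({'foo': np.arange(6)}, index=idx)

# Drop every row whose first-level label is 'A2'.
dropped = frame.drop('A2')

# The rows for 'A2' are gone, but the level still lists the label.
levels_after_drop = list(dropped.index.levels[0])
used_labels = list(dropped.index.get_level_values('ia').unique())

# remove_unused_levels() returns a new MultiIndex with only the labels in use.
pruned_levels = list(dropped.index.remove_unused_levels().levels[0])

print(levels_after_drop)  # ['A0', 'A1', 'A2']
print(used_labels)        # ['A0', 'A1']
print(pruned_levels)      # ['A0', 'A1']
```

Whether pruning the levels also avoids the NaN fill described here is not something this sketch establishes; it only makes the state of the levels visible.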
Expected Output
sort_index should not change the contents of the DataFrame; it should only reorder the rows.
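
One possible way to state this expectation as a check is sketched below. The helpers `index_keys` and `sort_preserves_index` are hypothetical names introduced here, not pandas functions; they compare the index entries before and after `sort_index()`, mapping missing labels to the placeholder string `'<NA>'` so that NaN entries compare equal to each other. The sample frame has no missing labels, so it only shows the check passing in the ordinary case.

```python
import numpy as np
import pandas as pd

def index_keys(frame):
    """Return the index entries as a sorted list of string tuples.

    Missing labels are replaced by '<NA>' so that they compare equal.
    """
    keys = []
    for entry in frame.index:
        entry = entry if isinstance(entry, tuple) else (entry,)
        keys.append(tuple('<NA>' if pd.isna(v) else str(v) for v in entry))
    return sorted(keys)

def sort_preserves_index(frame):
    """True if sort_index() keeps exactly the same index entries."""
    return index_keys(frame) == index_keys(frame.sort_index())

# A frame with no missing labels: sorting only reorders the rows.
idx = pd.MultiIndex.from_tuples(
    [('A1', 'B1'), ('A0', 'B0'), ('A1', 'B0'), ('A0', 'B1')],
    names=['ia', 'ib'])
frame = pd.DataFrame({'bar': np.arange(4)}, index=idx)

print(sort_preserves_index(frame))  # True
```

Applied to df2 from the code sample, the same check would be expected to return False on the affected version, because the (A3, NaN) entry becomes (A3, B0) after sorting.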
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.7.1.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.24.1
pytest: 4.1.0
pip: 10.0.1
setuptools: 39.0.1
Cython: None
numpy: 1.16.2
scipy: None
pyarrow: None
xarray: None
IPython: 7.3.0
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2018.9
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.0.3
openpyxl: None
xlrd: 1.2.0
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: 4.7.1
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None