I wanted to produce a grouped histogram in which the heights of the bars add up to 1. The following code raises ValueError: weights should have the same shape as x
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(123)
n = 100
df = pd.DataFrame(np.random.randn(n), columns=['a'])
by = np.random.randint(1,5,n)
df.hist(by=by) # works
plt.show()
weights = np.repeat(1/len(df), len(df))
df.hist(weights = weights) # works
plt.show()
df.hist(by = by, weights = weights) # does not work
plt.show()
Could you clarify a bit more what you are trying to achieve?
The by argument splits the original data into groups. df.hist() then calls the matplotlib histogram function once per group, each time with the original, full-length weights. In your case the size of each by-group is random and differs from group to group, whereas the weights array always has length 100.
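The length mismatch described above can be reproduced without any plotting. The following is only an illustrative sketch: it assumes NumPy's np.histogram as a stand-in for the per-group matplotlib call, and the helper name try_group_hist is hypothetical, not part of pandas or matplotlib.

```python
import numpy as np

# Same setup as in the report, using a local random state.
rng = np.random.RandomState(123)
n = 100
values = rng.randn(n)
by = rng.randint(1, 5, n)
weights = np.repeat(1 / n, n)  # one weight per row of the *full* data set

# Number of rows that end up in each by-group.
group_sizes = {int(g): int((by == g).sum()) for g in np.unique(by)}


def try_group_hist(group_values, w):
    """Hypothetical helper: histogram one group with the given weights."""
    try:
        np.histogram(group_values, weights=w)
        return "ok"
    except ValueError:
        # Raised when the weights and the data differ in shape.
        return "ValueError"


# Each group is shorter than the weights array, so every call fails.
results = {g: try_group_hist(values[by == g], weights) for g in group_sizes}
print(group_sizes, len(weights), results)
```

Every group holds fewer than 100 values while weights still has 100 entries, which is the same situation the grouped df.hist() call runs into.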
Combining by and weights seems to work if all groups have the same size and that size matches the length of weights, as in this example:
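The commenter's original snippet is not preserved in this thread. As a rough sketch of the kind of case being described, the code below assumes four groups of exactly 25 rows each and a weights array of length 25. The group layout and the bar_sums check are assumptions made for illustration, and the behaviour may vary between pandas and matplotlib versions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd

np.random.seed(123)
n_groups, group_size = 4, 25
n = n_groups * group_size
df = pd.DataFrame(np.random.randn(n), columns=["a"])

# Four groups of exactly 25 rows each (assumed layout, not from the thread).
by = np.repeat(np.arange(1, n_groups + 1), group_size)

# One weight per row *within a group*, so its length matches every group.
weights = np.repeat(1 / group_size, group_size)

axes = df.hist(by=by, weights=weights)

# Sum of the bar heights in each panel that actually contains bars.
bar_sums = [
    sum(p.get_height() for p in ax.patches)
    for ax in np.ravel(axes)
    if ax.patches
]
print(bar_sums)
```

Because the same 25 weights are reused for every group, this only works when all groups happen to have that exact size.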
Thanks @mgdadv for the reply. Yes, you understand exactly what I wanted to achieve, and yes, I did guess that the weights and the data size within the groups probably did not match. If this is not a bug (as I thought), could it be a feature request then?
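Until something like this is supported directly, one possible workaround is to split the weights along with the data and call matplotlib once per group. The sketch below is not a pandas API: the function grouped_weighted_hist and the temporary column names x, w and g are hypothetical, and it assumes that by and weights each have one entry per row of df.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def grouped_weighted_hist(df, column, by, weights, bins=10):
    """Hypothetical helper: one histogram per group, with weights split per group."""
    # Keep values, weights and group labels side by side so that
    # grouping slices all three consistently.
    tmp = pd.DataFrame({
        "x": df[column].to_numpy(),
        "w": np.asarray(weights),
        "g": np.asarray(by),
    })
    grouped = tmp.groupby("g")
    fig, axes = plt.subplots(1, grouped.ngroups, sharey=True, squeeze=False)
    for ax, (key, grp) in zip(axes[0], grouped):
        ax.hist(grp["x"].to_numpy(), bins=bins, weights=grp["w"].to_numpy())
        ax.set_title(str(key))
    return fig, axes[0]


np.random.seed(123)
n = 100
df = pd.DataFrame(np.random.randn(n), columns=["a"])
by = np.random.randint(1, 5, n)
weights = np.repeat(1 / len(df), len(df))

fig, axes = grouped_weighted_hist(df, "a", by, weights)

# With weights of 1/len(df), the bars across all panels together sum to 1.
total_height = sum(p.get_height() for ax in axes for p in ax.patches)
print(total_height)
```

With weights of 1/len(df) the bars of all panels together add up to 1. If each panel should sum to 1 on its own, the weights could instead be set to 1/len(grp) inside the loop.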
In [15]: pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.4.2.final.0
python-bits: 64
OS: Linux
OS-release: 3.18.6-1-ARCH
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.15.2
nose: 1.3.4
Cython: None
numpy: 1.9.1
scipy: 0.15.1
statsmodels: 0.6.1
IPython: 2.4.1
sphinx: None
patsy: 0.3.0
dateutil: 2.4.0
pytz: 2014.10
bottleneck: 1.0.0
tables: None
numexpr: 2.4
matplotlib: 1.4.3
openpyxl: 1.8.6
xlrd: 0.9.3
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
rpy2: 2.5.6
sqlalchemy: None
pymysql: None
psycopg2: None