Fixing cases when ValueError from alphalens.tears.create_factor_tear_sheet #87

Closed
wants to merge 2 commits into from

Conversation

dat-boris

When we run create_factor_tear_sheet on either:

  1. a single factor value per group (only one value over a date, or sector)
  2. a field with discrete values (not sure if this is a correct use case for alphalens)

we can run into the exception shown below.

Handling qcut when the bin edges are not unique has been discussed, but no solution has been implemented "officially" yet.

This PR provides a simple workaround for the simplest case, where only one value of the factor has been seen.

Exception seen

/usr/local/lib/python2.7/dist-packages/alphalens/performance.pyc in quantile_calc(x, quantiles)
    242 
    243     def quantile_calc(x, quantiles):
--> 244         return pd.qcut(x, quantiles, labels=False) + 1
    245 
    246     grouper = ['date', 'sector'] if by_sector else ['date']

/usr/local/lib/python2.7/dist-packages/pandas/tools/tile.pyc in qcut(x, q, labels, retbins, precision)
    167     bins = algos.quantile(x, quantiles)
    168     return _bins_to_cuts(x, bins, labels=labels, retbins=retbins,precision=precision,
--> 169                          include_lowest=True)
    170 
    171 

/usr/local/lib/python2.7/dist-packages/pandas/tools/tile.pyc in _bins_to_cuts(x, bins, right, labels, retbins, precision, name, include_lowest)
    187 
    188     if len(algos.unique(bins)) < len(bins):
--> 189         raise ValueError('Bin edges must be unique: %s' % repr(bins))
    190 
    191     if include_lowest:

ValueError: Bin edges must be unique: array([-0.0030303, -0.0030303, -0.0030303, -0.0030303, -0.0030303,
       -0.0030303])
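
For illustration, the single-value case can be guarded against before qcut is called. This is only a rough sketch of the idea, not the code in this PR: `safe_quantile_calc` is a hypothetical name, and putting every observation into quantile 1 when there is only one distinct value is an assumption about the desired behaviour.

```python
import numpy as np
import pandas as pd


def safe_quantile_calc(x, quantiles):
    """Hypothetical variant of quantile_calc that tolerates a constant group.

    Assumption: when a group holds a single distinct factor value, every
    non-null observation is assigned to quantile 1 instead of raising.
    """
    if x.nunique() <= 1:
        # All quantile edges would coincide, so qcut cannot bin this group.
        # Keep NaN where the factor is missing, use 1.0 everywhere else.
        return x.where(x.isnull(), 1.0)
    # Normal path, same as the original quantile_calc.
    return pd.qcut(x, quantiles, labels=False) + 1


# A group where every asset has the same factor value, as in the traceback.
constant = pd.Series([-0.0030303] * 6)
print(safe_quantile_calc(constant, 5).tolist())

# A group with distinct values still goes through pd.qcut.
varied = pd.Series(np.arange(10.0))
print(safe_quantile_calc(varied, 5).tolist())
```

This only covers case 1 above. A factor with a few discrete values can still produce duplicate edges and raise.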

@jameschristopher
Contributor

jameschristopher commented Sep 19, 2016

We have run into this qcut issue as well. The first and most obvious solution is to reduce the number of quantiles, but it looks from your trace (and description) like you are assigning each group of size N >> 1 a single value and trying to look at the performance of each group. So are you classifying groups of stocks into varying degrees of under/over-performance?
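
To make the "reduce the number of quantiles" option concrete, here is a rough sketch that retries qcut with fewer bins until the edges are unique. `qcut_with_fallback` is a hypothetical helper, not something in alphalens, and the final fallback to a single bucket is an assumption.

```python
import pandas as pd


def qcut_with_fallback(x, quantiles):
    """Hypothetical helper: retry pd.qcut with fewer quantiles on failure."""
    for q in range(quantiles, 1, -1):
        try:
            return pd.qcut(x, q, labels=False) + 1
        except ValueError:
            # Assumed to be the "Bin edges must be unique" error; note that
            # this also hides any other ValueError raised by qcut.
            continue
    # Even two quantiles failed: put all non-null values into one bucket.
    return x.where(x.isnull(), 1.0)


# Requesting 5 quantiles fails here because 1.0 repeats; 2 quantiles work.
factor = pd.Series([1.0, 1.0, 1.0, 2.0, 3.0, 4.0])
print(qcut_with_fallback(factor, 5).tolist())
```

The catch is that different dates (or sectors) can end up with different numbers of quantiles, so the quantile labels are no longer comparable across groups.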

@twiecki
Contributor

twiecki commented Oct 6, 2016

Closing due to inactivity.

3 participants