Skip to content

Implement roll_monthday, simplify SemiMonthOffset #18762

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 19 commits into from
Dec 30, 2017
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
Show all changes
19 commits
Select commit Hold shift + click to select a range
6100c3f
implement roll_monthday, roll_qtrday, notes on possible bugs
jbrockmendel Dec 13, 2017
aac0832
simplify SemiMonth.apply
jbrockmendel Dec 13, 2017
c5bc5b2
whitespace/comment fixup
jbrockmendel Dec 13, 2017
15b7916
Merge branch 'master' of https://github.com/pandas-dev/pandas into qt…
jbrockmendel Dec 13, 2017
cb07e88
flake8 indentation cleanup
jbrockmendel Dec 13, 2017
eb5e72f
deprivatize and type roll_monthday
jbrockmendel Dec 13, 2017
ec3b24d
docstrings, de-privatize, types
jbrockmendel Dec 13, 2017
c6f025b
avoid while loop[ in BusinessDay.apply
jbrockmendel Dec 13, 2017
f5694de
docstring edits
jbrockmendel Dec 14, 2017
2566bf5
add typing
jbrockmendel Dec 14, 2017
0838271
Merge branch 'master' of https://github.com/pandas-dev/pandas into qt…
jbrockmendel Dec 14, 2017
a649238
dummy commit to force CI
jbrockmendel Dec 14, 2017
425d3b4
Merge branch 'master' of https://github.com/pandas-dev/pandas into qt…
jbrockmendel Dec 15, 2017
71f138d
Separate int and datetime roll funcs
jbrockmendel Dec 16, 2017
091acc2
Merge branch 'master' of https://github.com/pandas-dev/pandas into qt…
jbrockmendel Dec 18, 2017
365cdb8
Merge branch 'master' of https://github.com/pandas-dev/pandas into qt…
jbrockmendel Dec 24, 2017
9993a91
add tests for liboffsets funcs
jbrockmendel Dec 29, 2017
5d773a5
Merge branch 'master' of https://github.com/pandas-dev/pandas into qt…
jbrockmendel Dec 29, 2017
a5d9dee
Merge branch 'master' of https://github.com/pandas-dev/pandas into qt…
jbrockmendel Dec 29, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
47 changes: 40 additions & 7 deletions pandas/_libs/tslibs/offsets.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -540,11 +540,9 @@ def shift_quarters(int64_t[:] dtindex, int quarters,
n = quarters

months_since = (dts.month - q1start_month) % modby
compare_month = dts.month - months_since
compare_month = compare_month or 12
# compare_day is only relevant for comparison in the case
# where months_since == 0.
compare_day = get_firstbday(dts.year, compare_month)
compare_day = get_firstbday(dts.year, dts.month)

if n <= 0 and (months_since != 0 or
(months_since == 0 and dts.day > compare_day)):
Expand Down Expand Up @@ -573,11 +571,9 @@ def shift_quarters(int64_t[:] dtindex, int quarters,
n = quarters

months_since = (dts.month - q1start_month) % modby
compare_month = dts.month - months_since
compare_month = compare_month or 12
# compare_day is only relevant for comparison in the case
# where months_since == 0.
compare_day = get_lastbday(dts.year, compare_month)
compare_day = get_lastbday(dts.year, dts.month)

if n <= 0 and (months_since != 0 or
(months_since == 0 and dts.day > compare_day)):
Expand Down Expand Up @@ -794,6 +790,7 @@ cpdef datetime shift_month(datetime stamp, int months, object day_opt=None):
return stamp.replace(year=year, month=month, day=day)


# TODO: Can we declare this so it will take datetime _or_ pandas_datetimestruct
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no please don't functions should not take different types like this. have 2 functions. we specifically generate templates when we have to cover lots of dtypes (not suggesting this at all here though).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, will remove this comment and address similar things you've mentioned.

cpdef int get_day_of_month(datetime other, day_opt) except? -1:
"""
Find the day in `other`'s month that satisfies a DateOffset's onOffset
Expand Down Expand Up @@ -844,6 +841,42 @@ cpdef int get_day_of_month(datetime other, day_opt) except? -1:
raise ValueError(day_opt)


def _roll_monthday(n, other, compare):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can be de-privatized

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you cpdef this? and type things, add a doc-string

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you have 2 possibilities for types (they must be both either integers or dateimes), then have 2 functions and name it that way, much more readable

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. One consideration that might push in the other direction if we want to get rid of duplicate code: roll_yearday and roll_monthday are basically special cases of roll_qtrday with modby of 12 and 1, respectively. Merging them would mean doing some unnecessary mod calls, no idea what the perf hit would be.

# Either `other` and `compare` are _both_ datetimes or they are both
# integers for days in the same month.

if n > 0 and other < compare:
n -= 1
elif n <= 0 and other > compare:
# as if rolled forward already
n += 1
return n


cpdef inline int roll_qtrday(other, n, month, day_opt='start',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

int modby=3) except? -1:
# TODO: type `other` as datetime-or-pandas_datetimestruct?
# TODO: Merge this with roll_yearday by setting modby=12 there?
# code de-duplication versus perf hit?
# TODO: with small adjustments this could be used in shift_quarters
months_since = other.month % modby - month % modby

if n > 0:
if months_since < 0 or (months_since == 0 and
other.day < get_day_of_month(other,
day_opt)):
# pretend to roll back if on same month but
# before compare_day
n -= 1
else:
if (months_since > 0 or (months_since == 0 and
other.day > get_day_of_month(other,
day_opt))):
# make sure to roll forward, so negate
n += 1
return n


cpdef int roll_yearday(other, n, month, day_opt='start') except? -1:
"""
Possibly increment or decrement the number of periods to shift
Expand Down Expand Up @@ -905,7 +938,7 @@ cpdef int roll_yearday(other, n, month, day_opt='start') except? -1:
other.day < get_day_of_month(other,
day_opt)):
n -= 1
elif n <= 0:
else:
if other.month > month or (other.month == month and
other.day > get_day_of_month(other,
day_opt)):
Expand Down
130 changes: 42 additions & 88 deletions pandas/tseries/offsets.py
Original file line number Diff line number Diff line change
Expand Up @@ -944,15 +944,8 @@ def onOffset(self, dt):

@apply_wraps
def apply(self, other):
n = self.n
compare_day = self._get_offset_day(other)

if n > 0 and other.day < compare_day:
n -= 1
elif n <= 0 and other.day > compare_day:
# as if rolled forward already
n += 1

n = liboffsets._roll_monthday(self.n, other.day, compare_day)
return shift_month(other, n, self._day_opt)

@apply_index_wraps
Expand Down Expand Up @@ -1038,22 +1031,12 @@ class CustomBusinessMonthEnd(_CustomBusinessMonth):

@apply_wraps
def apply(self, other):
n = self.n

# First move to month offset
cur_mend = self.m_offset.rollforward(other)

# Find this custom month offset
cur_cmend = self.cbday.rollback(cur_mend)

# handle zero case. arbitrarily rollforward
if n == 0 and other != cur_cmend:
n += 1

if other < cur_cmend and n >= 1:
n -= 1
elif other > cur_cmend and n <= -1:
n += 1
compare_day = self.cbday.rollback(cur_mend)
n = liboffsets._roll_monthday(self.n, other, compare_day)

new = cur_mend + n * self.m_offset
result = self.cbday.rollback(new)
Expand All @@ -1066,23 +1049,12 @@ class CustomBusinessMonthBegin(_CustomBusinessMonth):

@apply_wraps
def apply(self, other):
n = self.n
dt_in = other

# First move to month offset
cur_mbegin = self.m_offset.rollback(dt_in)
cur_mbegin = self.m_offset.rollback(other)

# Find this custom month offset
cur_cmbegin = self.cbday.rollforward(cur_mbegin)

# handle zero case. arbitrarily rollforward
if n == 0 and dt_in != cur_cmbegin:
n += 1

if dt_in > cur_cmbegin and n <= -1:
n += 1
elif dt_in < cur_cmbegin and n >= 1:
n -= 1
compare_day = self.cbday.rollforward(cur_mbegin)
n = liboffsets._roll_monthday(self.n, other, compare_day)

new = cur_mbegin + n * self.m_offset
result = self.cbday.rollforward(new)
Expand Down Expand Up @@ -1122,21 +1094,21 @@ def rule_code(self):

@apply_wraps
def apply(self, other):
n = self.n
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After #18877 and #18875, the randomized testing I'm running locally is turning up bugs (pytz.NonExistentTimeError notwithstanding) exclusively in SemiMonthOffset and FY5253Quarter.

if not self.onOffset(other):
_, days_in_month = tslib.monthrange(other.year, other.month)
if 1 < other.day < self.day_of_month:
other = other.replace(day=self.day_of_month)
if n > 0:
# rollforward so subtract 1
n -= 1
elif self.day_of_month < other.day < days_in_month:
other = other.replace(day=self.day_of_month)
if n < 0:
# rollforward in the negative direction so add 1
n += 1
elif n == 0:
n = 1
# shift `other` to self.day_of_month, incrementing `n` if necessary
n = liboffsets._roll_monthday(self.n, other.day, self.day_of_month)

days_in_month = tslib.monthrange(other.year, other.month)[1]

# For SemiMonthBegin on other.day == 1 and
# SemiMonthEnd on other.day == days_in_month,
# shifting `other` to `self.day_of_month` _always_ requires
# incrementing/decrementing `n`, regardless of whether it is
# initially positive.
if type(self) is SemiMonthBegin and (self.n <= 0 and other.day == 1):
n -= 1
elif type(self) is SemiMonthEnd and (self.n > 0 and
other.day == days_in_month):
n += 1

return self._apply(n, other)

Expand Down Expand Up @@ -1206,12 +1178,6 @@ def onOffset(self, dt):
return dt.day in (self.day_of_month, days_in_month)

def _apply(self, n, other):
# if other.day is not day_of_month move to day_of_month and update n
if n > 0 and other.day < self.day_of_month:
n -= 1
elif other.day > self.day_of_month:
n += 1

months = n // 2
day = 31 if n % 2 else self.day_of_month
return shift_month(other, months, day)
Expand Down Expand Up @@ -1257,12 +1223,6 @@ def onOffset(self, dt):
return dt.day in (1, self.day_of_month)

def _apply(self, n, other):
# if other.day is not day_of_month move to day_of_month and update n
if other.day < self.day_of_month:
n -= 1
elif n <= 0 and other.day > self.day_of_month:
n += 1

months = n // 2 + n % 2
day = 1 if n % 2 else self.day_of_month
return shift_month(other, months, day)
Expand Down Expand Up @@ -1343,6 +1303,7 @@ def onOffset(self, dt):
if self.normalize and not _is_normalized(dt):
return False
return dt.weekday() == self.weekday
# TODO: Shouldn't this return True when self.weekday is None?

@property
def rule_code(self):
Expand Down Expand Up @@ -1404,15 +1365,12 @@ def apply(self, other):
base = other
offsetOfMonth = self.getOffsetOfMonth(other)

months = self.n
if months > 0 and offsetOfMonth > other:
months -= 1
elif months <= 0 and offsetOfMonth < other:
months += 1
months = liboffsets._roll_monthday(self.n, other, offsetOfMonth)

other = self.getOffsetOfMonth(shift_month(other, months, 'start'))
other = datetime(other.year, other.month, other.day, base.hour,
base.minute, base.second, base.microsecond)
# TODO: Why is this handled differently from LastWeekOfMonth?
return other

def getOffsetOfMonth(self, dt):
Expand Down Expand Up @@ -1485,18 +1443,16 @@ def __init__(self, n=1, normalize=False, weekday=None):
def apply(self, other):
offsetOfMonth = self.getOffsetOfMonth(other)

months = self.n
if months > 0 and offsetOfMonth > other:
months -= 1
elif months <= 0 and offsetOfMonth < other:
months += 1
months = liboffsets._roll_monthday(self.n, other, offsetOfMonth)

return self.getOffsetOfMonth(shift_month(other, months, 'start'))

def getOffsetOfMonth(self, dt):
m = MonthEnd()
d = datetime(dt.year, dt.month, 1, dt.hour, dt.minute,
dt.second, dt.microsecond, tzinfo=dt.tzinfo)
# TODO: Will potentially dropping nanoseconds here be a problem?
# Particularly in onOffset?
eom = m.rollforward(d)
# TODO: Is this DST-safe?
w = Week(weekday=self.weekday)
Expand Down Expand Up @@ -1532,7 +1488,8 @@ class QuarterOffset(DateOffset):
_from_name_startingMonth = None
_adjust_dst = True
# TODO: Consider combining QuarterOffset and YearOffset __init__ at some
# point
# point. Also apply_index, onOffset, rule_code if
# startingMonth vs month attr names are resolved

def __init__(self, n=1, normalize=False, startingMonth=None):
self.n = self._validate_n(n)
Expand Down Expand Up @@ -1563,26 +1520,22 @@ def rule_code(self):

@apply_wraps
def apply(self, other):
n = self.n
compare_day = self._get_offset_day(other)

months_since = (other.month - self.startingMonth) % 3

if n <= 0 and (months_since != 0 or
(months_since == 0 and other.day > compare_day)):
# make sure to roll forward, so negate
n += 1
elif n > 0 and (months_since == 0 and other.day < compare_day):
# pretend to roll back if on same month but before compare_day
n -= 1

return shift_month(other, 3 * n - months_since, self._day_opt)
# months_since: find the calendar quarter containing other.month,
# e.g. if other.month == 8, the calendar quarter is [Jul, Aug, Sep].
# Then find the month in that quarter containing an onOffset date for
# self. `months_since` is the number of months to shift other.month
# to get to this on-offset month.
months_since = other.month % 3 - self.startingMonth % 3
qtrs = liboffsets.roll_qtrday(other, self.n, self.startingMonth,
day_opt=self._day_opt, modby=3)
months = qtrs * 3 - months_since
return shift_month(other, months, self._day_opt)

def onOffset(self, dt):
if self.normalize and not _is_normalized(dt):
return False
modMonth = (dt.month - self.startingMonth) % 3
return modMonth == 0 and dt.day == self._get_offset_day(dt)
mod_month = (dt.month - self.startingMonth) % 3
return mod_month == 0 and dt.day == self._get_offset_day(dt)

@apply_index_wraps
def apply_index(self, dtindex):
Expand Down Expand Up @@ -2126,6 +2079,7 @@ def apply(self, other):
n -= 1
elif n < 0 and other > current_easter:
n += 1
# TODO: Why does this handle the 0 case the opposite of others?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any idea if this is intentional?


# NOTE: easter returns a datetime.date so we have to convert to type of
# other
Expand Down