Skip to content

bpo-30302 Make timedelta.__repr__ more informative. #1493

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 26 commits into from
Jul 25, 2017

Conversation

musically-ut
Copy link
Contributor

@musically-ut musically-ut commented May 7, 2017

Currently, the default implementation of datetime.datetime.repr (the default output string produced at the console/IPython) gives a rather cryptic output:

from datetime import datetime as D
D.fromtimestamp(1390953543.1) - D.fromtimestamp(1121871596)
# datetime.timedelta(3114, 28747, 100000)

For the uninitiated, it is not obvious that the numeric values here are days, seconds and microsecond respectively.

Would there be any pushback against changing this to:

# datetime.timedelta(days=3114, seconds=28747, microseconds=100000)

?

https://bugs.python.org/issue30302

@the-knights-who-say-ni
Copy link

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately our records indicate you have not signed the CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

Thanks again to your contribution and we look forward to looking at it!

@mention-bot
Copy link

@musically-ut, thanks for your PR! By analyzing the history of the files in this pull request, we identified @tim-one, @abalkin and @serhiy-storchaka to be potential reviewers.

@vadmium
Copy link
Member

vadmium commented May 7, 2017

As well as tests, the documentation needs reviewing. It actually specifies the repr return value.

@musically-ut
Copy link
Contributor Author

@vadmium Good point. I'm assuming that the documentation is not limited to only the doc strings (actually, I couldn't find an instance there on a cursory look).

Is there an easy way to check that I have caught all cases in the extended documentation?

@musically-ut
Copy link
Contributor Author

musically-ut commented May 8, 2017

@vadmium I see what you meant by:

It actually specifies the repr return value.

here: https://github.com/python/cpython/blob/3.6/Doc/library/datetime.rst

So we'll have to consider the risk of someone's code breaking because it depends on parsing the repr representation of timedelta?


Update: Related mail thread addresses the question somewhat: https://marc.info/?l=python-dev&m=145065097122111&w=2

Lib/datetime.py Outdated
return "%s.%s(%d)" % (self.__class__.__module__,
if arg_str != "":
arg_str += ", "
arg_str += "microseconds=%d" % self._microseconds
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concatenate strings using += is inefficient. Moreover, you have to handle comma manually. Please rewrite this code using a list (args = [] ... args.append(...)), and then join it using: ', '.join(args).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Will do that.

"%s(days=1, microseconds=100)" % name)
self.assertEqual(repr(self.theclass(seconds=1, microseconds=100)),
"%s(seconds=1, microseconds=100)" % name)
# self.assertEqual(repr(self.theclass(seconds=-60)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these tests commented? Either enable them or remove them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Martin wanted the handling of negative attributes to change while I am of the opinion that they shouldn't. So we may end up discussing these tests later.

However, I'll remove them from the PR presently.

GET_TD_SECONDS(self));

return PyUnicode_FromFormat("%s(%d)",
char days[64] = "", seconds[64] = "", microseconds[64] = "";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, the >= ""< part is really strange (remove it?).

Moreover, why 64 bytes? Microseconds cannot be larger than 999 999, so 6 digits + 1 null byte, no?

Copy link
Contributor Author

@musically-ut musically-ut Jun 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got the 64 bytes number from here, which could also be 6 digits + 1. However, in light of your next comment, we may not end up needing this at all.

}
}

return PyUnicode_FromFormat("%s(%s%s%s)",
Py_TYPE(self)->tp_name,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, no module nor qualified name here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sprintf(seconds, "seconds=%d", GET_TD_SECONDS(self));
} else {
sprintf(seconds, ", seconds=%d", GET_TD_SECONDS(self));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, the code may be simpler if you use a PyList and ", ".join(), as proposed for the Python code. You can use PyUnicode_FromString("seconds=%d", GET_TD_SECONDS(self)) to build an argument. It would avoid to have to compute a buffer size (and taking the risk of a buffer overflow).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented a version using PyList and PyUnicode. I don't think I'm leaking any memory there, but since this is the first time I'm touching PyObjects, I would really appreciate a second opinion on it.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result of C API function calls always should be checked for error.

There are memory leaks.

Py_TYPE(self)->tp_name,
GET_TD_DAYS(self));
args_string);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

args_string is leaked.

PyList_Append(args, PyUnicode_FromFormat("microseconds=%d", GET_TD_MICROSECONDS(self)));
}

PyObject *args_string = PyUnicode_Join(PyUnicode_FromString(", "), args);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result of PyUnicode_FromString(", ") is leaked.

PyObject *args = PyList_New(0);

if (GET_TD_DAYS(self) != 0) {
PyList_Append(args, PyUnicode_FromFormat("days=%d", GET_TD_DAYS(self)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyUnicode_FromFormat() and PyList_Append() can fail.

The result of PyUnicode_FromFormat() is leaked.

GET_TD_SECONDS(self));

return PyUnicode_FromFormat("%s(%d)",
PyObject *args = PyList_New(0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyList_New(0) can fail.

Lib/datetime.py Outdated
return "%s.%s(%d)" % (self.__class__.__module__,
args.append("microseconds=%d" % self._microseconds)

arg_str = ', '.join(args)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just inline arg_str. This is simple expression.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks!

@@ -20,7 +20,7 @@
# XXX(gb) First run all the _Pure tests, then all the _Fast tests. You might
# not believe this, but in spite of all the sys.modules trickery running a _Pure
# test last will leave a mix of pure and native datetime stuff lying around.
test_classes = []
all_test_classes = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this is changed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the tests were only running for the C implementation, not for the Python
implementation. This was because test_classes was being overwritten inside the for loop. With this change, the tests run twice: once for the pure Python module and once for the C extension.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix the loop rather than renaming this variable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry, I didn't recognize that it was a comment on this review, Github didn't allow me to reply to this from where I actually saw it.

So, yes, the loop is correct, the only thing it was doing incorrectly was overwriting the test_classes variable when it moved from '_Pure' classes to '_Fast' classes. It does need to name them differently (I'm not sure why the names of the classes don't actually change in the verbose output, that seems to take me down the test_suite rabbit hole to figure out whether __name__ is the correct attribute to change or not). However, the behavior of the tests is correct now, i.e. it tests both implementations in sequence.

I'm investigating whether there is a simple fix to make the correct version of classes to show up in the tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the tests were only running for the C implementation, not for the Python
implementation.

This looks like a bug. Please open a new issue for it. This should be fixed in all versions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'll check which versions are affected (i.e. the branches on the cpython repository, right?) and open another bug. What should I do with this fix, though?

If I let it remain here, it may potentially lead to a merge conflict?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After merging the PR that fixes the tests this PR should be merged with master. Merge conflicts that are not resolved automatically should be resolved manually.

Py_XDECREF(days);
return NULL;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can decref days here. This will simplify the cases for seconds and microseconds.


if (PyList_Append(args, days) < 0) {
Py_DECREF(args);
Py_XDECREF(days);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just Py_DECREF. days is not NULL.

Or you can merge two ifs:

if (days == NULL || PyList_Append(args, days) < 0) {
    Py_DECREF(args);
    Py_XDECREF(days);
    return NULL;
}

}

PyObject *args_string = PyUnicode_Join(sep, args);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can decref sep and args just here.

@musically-ut
Copy link
Contributor Author

@serhiy-storchaka Thanks for the suggestions! The code looks much cleaner and more concise now. Have a look!

| ``repr(t)`` | Returns a string in the form |
| | ``datetime.timedelta(D[, S[, U]])``, where D |
| | is negative for negative ``t``. (5) |
+--------------------------------+-----------------------------------------------+
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer don't remove documentation, but update it. If you don't want to describe the format, just say something like "Returns the representation of the time delta as a string."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable. Done.

self.assertEqual(repr(self.theclass(days=1, microseconds=100)),
"%s(days=1, microseconds=100)" % name)
self.assertEqual(repr(self.theclass(seconds=1, microseconds=100)),
"%s(seconds=1, microseconds=100)" % name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that it's worth it to add so many test cases. Just keep 3 examples, 1 negative value, the timedelta(0), and you're done no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the tests were added to differentiate the current implementation from the old one: i.e. no leading days=0 or seconds=0 and to highlight the current handling of negative values (i.e. the normalization of the attributes, as described in the docs). I'll remove the ones which don't test anything unique not covered elsewhere.

@@ -20,7 +20,7 @@
# XXX(gb) First run all the _Pure tests, then all the _Fast tests. You might
# not believe this, but in spite of all the sys.modules trickery running a _Pure
# test last will leave a mix of pure and native datetime stuff lying around.
test_classes = []
all_test_classes = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix the loop rather than renaming this variable.

if (microseconds == NULL || PyList_Append(args, microseconds) < 0) {
Py_DECREF(args);
Py_XDECREF(microseconds);
return NULL;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move "Py_DECREF(args); return NULL;" to a new "error:" label at the end, and use "goto error;" here, to factorize the code (duplicated 3 times).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Lib/datetime.py Outdated
if self._microseconds:
args.append("microseconds=%d" % self._microseconds)

if len(args) == 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace this with:

if not args:
args.append('0')

to prevent code duplication.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"if not args:" is better

Py_DECREF(microseconds);
}

if (PyList_Size(args) == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please replace PyList_Size() with PyList_GET_SIZE().

PyList_Size() can fail, PyList_GET_SIZE() cannot.

}

PyObject *repr = PyUnicode_FromFormat("%s(%S)",
Py_TYPE(self)->tp_name,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't recall if you reply to my question: does this include the module name? Does it use the qualified name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it uses the fully qualified name. Otherwise, the tests would have caught it. I did reply to it, but Github probably collapsed the comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, cool :-)

| ``str(t)`` | Returns a string in the form |
| | ``[D day[s], ][H]H:MM:SS[.UUUUUU]``, where D |
| | is negative for negative ``t``. (5) |
| ``str(t)`` | Returns a string in the form | | | ``[D day[s], ][H]H:MM:SS[.UUUUUU]``, where D | | | is negative for negative ``t``. (5) |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line seems weird. Did you try to compile the doc? make -C Doc html

Why did you touch this doc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, crap. This was an illusion created by Vim's soft-wrapping. git add -p too broke the lines weirdly so I wasn't able to catch it. Am running make html now to verify the output. Sorry about missing this.

Lib/datetime.py Outdated
if self._microseconds:
args.append("microseconds=%d" % self._microseconds)

if len(args) == 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"if not args:" is better

Lib/datetime.py Outdated
self._seconds,
self._microseconds)
args = []

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO the empty lines doesn't help readability, you can remove them in all the code filling args.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -20,7 +20,7 @@
# XXX(gb) First run all the _Pure tests, then all the _Fast tests. You might
# not believe this, but in spite of all the sys.modules trickery running a _Pure
# test last will leave a mix of pure and native datetime stuff lying around.
test_classes = []
all_test_classes = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping?

Py_DECREF(microseconds);
}

if (PyList_Size(args) == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please replace PyList_Size() with PyList_GET_SIZE().

PyList_Size() can fail, PyList_GET_SIZE() cannot.


if (PyList_Size(args) == 0) {
PyObject *zero = PyUnicode_FromString("0");

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this useless empty line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

}

PyObject *repr = PyUnicode_FromFormat("%s(%S)",
Py_TYPE(self)->tp_name,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, cool :-)

PyObject *args_string = PyUnicode_Join(sep, args);
Py_DECREF(sep);
Py_DECREF(args);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might remove this empty line for readability, collapse the logical block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I would like to see a review of @serhiy-storchaka and/or @vadmium since we had some disagreement on the repr() output.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C code looks correct to me, but a little cumbersome. See my comment on the tracker.

Please add your name in Misc/ACKS if it still is not added and an entry in Misc/NEWS.d/.

@@ -20,7 +20,7 @@
# XXX(gb) First run all the _Pure tests, then all the _Fast tests. You might
# not believe this, but in spite of all the sys.modules trickery running a _Pure
# test last will leave a mix of pure and native datetime stuff lying around.
test_classes = []
all_test_classes = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the tests were only running for the C implementation, not for the Python
implementation.

This looks like a bug. Please open a new issue for it. This should be fixed in all versions.

| ``repr(t)`` | Returns a string in the form |
| | ``datetime.timedelta(D[, S[, U]])``, where D |
| | is negative for negative ``t``. (5) |
| ``repr(t)`` | Returns the representation of the |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear how this representation differs from the above one. May be add "in the calling form" or something like?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returns the representation of the :class:timedelta object such that t == eval(repr(t)).

?

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still LGTM with the new C code ;-)

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yet few nitpicks.

@@ -20,7 +20,7 @@
# XXX(gb) First run all the _Pure tests, then all the _Fast tests. You might
# not believe this, but in spite of all the sys.modules trickery running a _Pure
# test last will leave a mix of pure and native datetime stuff lying around.
test_classes = []
all_test_classes = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After merging the PR that fixes the tests this PR should be merged with master. Merge conflicts that are not resolved automatically should be resolved manually.

return NULL;
}

char *sep = "";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const char*


if (GET_TD_SECONDS(self) != 0) {
Py_SETREF(args, PyUnicode_FromFormat("%U%sseconds=%d",
args,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all formatted arguments may be written in one line. This is more compact.

Misc/ACKS Outdated
@@ -1762,3 +1762,4 @@ Jelle Zijlstra
Gennadiy Zlobin
Doug Zongker
Peter Åstrand
Utkarsh Upadhyay
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Names are ordered in alphabetical order. Your name should be under the U letter.

| | is negative for negative ``t``. (5) |
| ``repr(t)`` | Returns a string representation of the |
| | :class:`timedelta` object such that |
| | ``t == eval(repr(t))``. |
+--------------------------------+-----------------------------------------------+
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

t == eval(repr(t)) isn't necessarily true (e.g. when datetime is aliased during import). So this part of the documentation would need to change again.

@musically-ut musically-ut force-pushed the feat-timedelta-repr branch from 46f0d0c to 3d327f4 Compare July 25, 2017 18:00
@musically-ut
Copy link
Contributor Author

I've rebased the PR on the current master.

@vstinner vstinner merged commit cc5a65c into python:master Jul 25, 2017
musically-ut added a commit to musically-ut/cpython that referenced this pull request Sep 21, 2017
This is a fix to the documentation to reflect the changes merged in as
part of python#1493 (bpo-30302).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants