BUG GH#43414 Infer and coerce datetimes am pm #43416
ghost
commented
Sep 5, 2021
- closes BUG: pd.to_datetime cannot infer and coerce dates with AM/PM and infer_datetime_format=True and errors="coerce". #43414
- tests added / passed
- Ensure all linting tests pass, see here for how to run them
- whatsnew entry
Something interesting I found is that with infer_datetime_format=True the format is inferred from the first record only. This matters because some of the tests generate data with pd.date_range, where by default the first record falls at midnight. That can lead to different behavior than data whose first record is not at midnight, which is probably more common in practice.
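To illustrate why the first record matters, here is a minimal standard-library sketch. It is not the pandas implementation: `toy_guess_format` and `parse_all_with_first_guess` are hypothetical helpers that only mimic the behavior discussed in this thread (the AM/PM token ends up as literal text in the guessed format), and the sample timestamps are made up. It also assumes the default C locale, where `%p` renders as AM/PM.

```python
from datetime import datetime, timedelta

# Format used only to render the sample strings (12-hour clock with AM/PM).
AMPM_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def toy_guess_format(first_record):
    """Hypothetical stand-in for format inference.

    It looks at a single record and keeps the AM/PM token as literal text,
    which mirrors the kind of guess described in this thread.
    """
    meridiem = first_record.split()[-1]  # "AM" or "PM"
    return "%m/%d/%Y %H:%M:%S " + meridiem


def parse_all_with_first_guess(records):
    """Guess the format from records[0] only, then apply it to every record.

    Records that do not match become None, a rough analogue of errors="coerce".
    """
    fmt = toy_guess_format(records[0])
    parsed = []
    for record in records:
        try:
            parsed.append(datetime.strptime(record, fmt))
        except ValueError:
            parsed.append(None)
    return fmt, parsed


start = datetime(2021, 9, 5)  # midnight, like a default date_range start
midnight_first = [
    (start + timedelta(hours=h)).strftime(AMPM_FORMAT) for h in (0, 13)
]
afternoon_first = list(reversed(midnight_first))

for records in (midnight_first, afternoon_first):
    fmt, parsed = parse_all_with_first_guess(records)
    print(records, "->", fmt, parsed)
```

In this toy version the same two timestamps give different results depending on which one comes first, because the literal AM or PM in the guessed format only ever matches records that share the first record's meridiem.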
Thanks @mikephung122 for the PR. Tests are failing: https://github.com/pandas-dev/pandas/pull/43416/checks?check_run_id=3517046932
This is clearly stated in the docs.
There is an underlying issue with The fact that specifying |
@simonjayhawkins When I found this problem I also thought it was related to #25143. However, the more I looked into it, the less clear that connection became, and I now believe fixing this as a separate issue makes more sense.

I think this issue and its fix are straightforward on their own. When converting, we always try to guess the datetime format first, regardless of the errors argument and error handling. Ideally, the guess_datetime_format function should be able to guess date and time formats with AM/PM. Right now, however, it returns something like '%m/%d/%Y %H:%M:%S AM' for AM/PM input. Parsing with that best guess never works, which is what surfaces the subsequent issue of inconsistent error handling and the fallback.

We might be able to ignore this issue completely, since the fallback method works. However, if that's true, I'm wondering why the format guessing was included in the first place. The other option would be to try the fallback before accepting the coerced NaT, but even then I would say this is still an issue worth fixing. Overall, I think cleaning that up and ensuring that no existing functionality is broken would be significantly more difficult. What do you think?
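For reference, a small standard-library sketch of why a guess like the one quoted above cannot parse AM/PM data. The guessed format string is taken from this comment; `TWELVE_HOUR_FORMAT` is only an assumption about what a correct guess could look like, `parse_or_none` is a hypothetical helper that stands in for errors="coerce", and the sample strings are made up.

```python
from datetime import datetime

# Format quoted in the comment above: 24-hour %H plus a literal "AM".
GUESSED_FORMAT = "%m/%d/%Y %H:%M:%S AM"
# Assumption: a guess that handles AM/PM would use %I (12-hour) and %p.
TWELVE_HOUR_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def parse_or_none(value, fmt):
    """Return the parsed datetime, or None if the value does not match fmt."""
    try:
        return datetime.strptime(value, fmt)
    except ValueError:
        return None


afternoon = "09/05/2021 01:30:00 PM"
midnight = "09/05/2021 12:00:00 AM"

# The literal "AM" never matches a PM value, so the row is coerced away.
print(parse_or_none(afternoon, GUESSED_FORMAT))
# An AM value does match, but %H reads "12" as noon instead of midnight.
print(parse_or_none(midnight, GUESSED_FORMAT))
# With %I and %p both values parse to the intended hour.
print(parse_or_none(afternoon, TWELVE_HOUR_FORMAT))
print(parse_or_none(midnight, TWELVE_HOUR_FORMAT))
```

The second case is worth noting: a literal-AM format does not always fail, it can also succeed with the wrong hour, so a fix probably needs `%I` together with `%p` rather than only swapping the literal.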
@simonjayhawkins I checked the tests. It looks like it's the new unit tests that are failing in one posix environment, while another posix environment shows INTERNALERROR and 'Error: Process completed with exit code 3.'. However, the unit tests don't fail locally or in the other posix environments. I'm wondering if this is an issue with how these are built and run, but I don't see anything immediately obvious in the build or test logs. This is actually my first time touching any of the Cython code. Are there any additional steps that need to be done beyond changing the .pyx file?