coerce OutOfBoundsDatetime into 2199 #602
Comments
Actually, I just realized that the ideal solution would be to do the preprocessing of the date/timestamp columns in SAS already, i.e. take all date and timestamp columns and replace values that are too large with 2199-12-31... Then you would not need to use coerce at all... |
Hey @rainermensing let me take a look at this when I get a chance. It would be good to have an option to address this. |
Hey @rainermensing , I have to apologize for taking so long to get to this. I've been swamped w/ internal SAS work, but I'm looking at this now. I hope to have this and the parquet support from the other issue finished up this week. Did you still have some changes for that other parquet issue to see about integrating? Or is it good as it is? Thanks! |
Hi @tomweber-sas . Thank you for getting back to this. For the parquet issue, I can only test our full workflow during weekends, and we discovered one more bug last week that I hope I have now finally fixed (and that there won't be anything else). I will create a new merge request with the final logic some time this week or next, so please be patient... I hope this will not cause you any issues. |
Yes, I started playing with the code sample you provided and I started thinking of adding a couple options to sd2df for max/min timestamp values. If either or both are provided, do it (I'm going to look at doing it in SAS code or the Python code, just to see if either works better). That way users can provide the replacement values they want, and w/out specifying either, the behavior remains the same so as to not break anyone. I hope to finish looking at that tomorrow. And yes, as for the parquet, once you have those last changes, I can try them out and see about finishing that up too. No problem with any of this, that's what I'm here for :) Thanks, |
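The optional min/max idea described above could be sketched roughly as follows. This is only an illustration in plain Python, not the code that ended up in sd2df; the function name `clamp_datetimes` and the parameter names `ts_min` and `ts_max` are hypothetical. When neither bound is given, the values pass through unchanged, which mirrors the "don't break anyone" default.

```python
import datetime as dt
import pandas as pd

def clamp_datetimes(values, ts_min=None, ts_max=None):
    """Clamp Python datetimes to [ts_min, ts_max]; a bound of None is ignored.

    Hypothetical helper for illustration only, not part of SASPy.
    """
    out = []
    for v in values:
        if v is not None:  # keep missing values as they are
            if ts_max is not None and v > ts_max:
                v = ts_max
            elif ts_min is not None and v < ts_min:
                v = ts_min
        out.append(v)
    return out

raw = [dt.datetime(2020, 1, 1), dt.datetime(9999, 12, 31), None]

# Replace the too-large value before pandas ever sees it.
clamped = clamp_datetimes(raw, ts_max=dt.datetime(2199, 12, 31))
series = pd.to_datetime(pd.Series(clamped), errors="coerce")
```

Here the 9999-12-31 entry becomes 2199-12-31 and the missing value stays missing, so a real NaT can still be told apart from a replaced value.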
@rainermensing, I've finally had time to prototype this. I just pushed to a new branch named I added two parms to sd2df, If you have a chance to play with this and see what you think, that would be great!
for the first, datetimes:
and for dates:
|
#609 merged this in. I believe it allows for what you wanted and more. You can set the high and low timestamps and it works for SAS dates as well as datetimes (timestamps). I've merged this in to main now. I need to write doc and test some more, but then I'll build this into a new release. I'll close this after that's all done. |
Hi Tom,
sorry that I did not get back to you. I could not test it as I am on vacation (still) ....
Sounds great. Thanks for your work/ support!!
Best
Rainer
|
Thanks @rainermensing , and no problem, enjoy your vacation! This will all be there when you get back; should be in the latest production version then :) |
I've built this into the latest production version: V5.100.0 - SASPy's 100th release! I'll close this, but if you see anything once you have a chance to try it out, just reopen or open a new one. Thanks again for all of your help with these enhancements! They all help make SASPy better for all! |
The problem is that the sasdata2dataframe method currently involves a step where timestamps are coerced silently using
df[dvarlist[i]] = pd.to_datetime(df[dvarlist[i]], errors='coerce')
Pandas cannot handle timestamps later than a fixed limit ('2262-04-11 23:47:16.854775807'), whereas SAS does support later timestamps. This means all timestamps beyond this limit are silently coerced into NaT within pandas. However, such values are not uncommon, e.g. 9999-12-31 as a default 'valid_to' value.
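To make the effect visible before any coercion happens, one could check the raw values against the limit quoted above. The following is a minimal sketch in plain Python; the helper name `find_out_of_bounds` is hypothetical, and the limit is truncated to microseconds because Python's datetime has no nanosecond field.

```python
import datetime as dt

# Limit quoted in the issue, truncated to microsecond precision.
PANDAS_NS_MAX = dt.datetime(2262, 4, 11, 23, 47, 16, 854775)

def find_out_of_bounds(values, limit=PANDAS_NS_MAX):
    """Return the positions of datetimes that lie beyond the given limit.

    Hypothetical helper for illustration only; None entries are skipped.
    """
    return [i for i, v in enumerate(values) if v is not None and v > limit]

valid_to = [dt.datetime(2020, 1, 1), dt.datetime(9999, 12, 31), None]
too_large = find_out_of_bounds(valid_to)
```

With a list like this, the caller can warn or replace the affected positions rather than discovering unexpected NaT values later.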
My proposal is to change the timestamp logic in all relevant methods along these lines:
The default of coerce_into_2199_ts would be False, so that the coercion is not silent anymore. At the same time, the information of the maximum timestamp should be preserved, as well as that of possible existing NaT in the timestamp column.
Hence, e.g., for sasioiom.py