Skip to content

[DC-1028] [DC-1155] Add script to remove select sites' EHR data #628

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 28 commits into from
Nov 9, 2020

Conversation

mark-velez
Copy link
Member

No description provided.

@mark-velez mark-velez requested a review from lrwb-aou September 22, 2020 23:52
Copy link
Contributor

@lrwb-aou lrwb-aou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would appreciate your feedback on some of these comments. We will need to fix the unit test before merging. The other feedback could be addressed now or in a new PR, whichever makes the most sense.

Copy link
Contributor

@lrwb-aou lrwb-aou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is at least one change needed to calculating max_results.

@mark-velez mark-velez force-pushed the mv/purge_hpo branch 2 times, most recently from c7d1699 to 2ca9c9a Compare October 22, 2020 03:30
@mark-velez mark-velez requested a review from lrwb-aou October 22, 2020 15:38
Copy link
Contributor

@lrwb-aou lrwb-aou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the exclude script is good. I think we covered that in the previous review. I'd appreciate talking about the logging changes and your design considerations for them.

@lrwb-aou lrwb-aou changed the title [DC-1028] Add script to remove select sites' EHR data [DC-1028 / DC-1155] Add script to remove select sites' EHR data Oct 23, 2020
@mark-velez
Copy link
Member Author

Thanks @lrwb-aou. Let's discuss logging when I get back on Wed. There's some good feedback in here which I may be able to address by then.

@mark-velez mark-velez changed the title [DC-1028 / DC-1155] Add script to remove select sites' EHR data [DC-1028] [DC-1155] Add script to remove select sites' EHR data Oct 24, 2020
@mark-velez
Copy link
Member Author

@lrwb-aou I wanted to get as much done on this as I can since I will be out for a few days. I changed the PR title because the build check was failing. I pushed commits I thought were not too controversial and closed the associated conversations; please unresolve if needed. I plan to cover the remaining items when I get back.

@danyel-campbell
Copy link
Contributor

@mark-velez @lrwb-aou I merged the changes to fix the intermittent unit test failure for pipeline_logging_test.py so y'all may want to do a pull if you haven't already. 👍🏾

@mark-velez mark-velez requested a review from lrwb-aou November 4, 2020 17:31
@mark-velez
Copy link
Member Author

Hey @lrwb-aou please let me know if there's anything left to address before I rebase and remove obsolete code.

Copy link
Contributor

@lrwb-aou lrwb-aou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've got some thoughts. Let me know what you think.

@mark-velez mark-velez requested a review from lrwb-aou November 4, 2020 23:30
Copy link
Contributor

@lrwb-aou lrwb-aou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very few comments. The three comments are about conventions.

str(datetime.date): 'DATE',
str(datetime.datetime): 'TIMESTAMP'
}
dict_types = Union[Dict, OrderedDict]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't go through creating a NamedTuple because I like this setup you've got here. So I created this very small patch file. What do you think? I know we could let someone use strings, but if these helper functions were used by an integration test, a string object iterable would blow this stuff up. Because I find it unlikely that we'll have single character column names in an integration test. On that subject, do you intend for these functions to be used by integration and unit tests alike?

non_string_iterables.txt

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The patch would permit key_order to be List[tuple] or List[List] whereas it is most likely to be passed as a List[str] (example) as it should represent the keys within each row (type=dict) in the order that the result schema. A value belonging to any immutable (hashable?) type is permitted to be used as a key in a dictionary.

I thought maybe the patch had a typo and non_string_iterable_types should be the container type (?), but my IDE is complaining this is invalid.
image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a lot of debate about how to use Iterable without using a string. I don't think there's been a good answer for how to proceed forward. I'm dropping this.

Copy link
Contributor

@lrwb-aou lrwb-aou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good. Just need to remove the dead code.

@mark-velez mark-velez merged commit f9d2943 into develop Nov 9, 2020
@mark-velez mark-velez deleted the mv/purge_hpo branch November 9, 2020 14:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants