Skip to content

ENG-5751 remove old keen logic #10992

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

mkovalua
Copy link
Contributor

Purpose

We don’t use Keen anymore, but we still have a lot of Keen code and config in osf.io. There’s also a legacy column in the node table, keenio_read_key. All that can go. I’m 95% sure we don’t use it anymore, as all metrics logging should be happening through our own metrics endpoints. We probably need to keep the keen js packages, as we still use those to display metrics data pulled from our source.

Changes

Removing old keen code

QA Notes

Please make verification statements inspired by your code and what your code touches.

  • Verify
  • Verify

What are the areas of risk?

Any concerns/considerations/questions that development raised?

Documentation

Side Effects

Ticket

https://openscience.atlassian.net/browse/ENG-5751

Copy link
Member

@felliott felliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good so far. I've left a couple of notes for things to check on. Please also remove the unneeded keen dependencies for both the python (pyproject.toml) and javascript (package.json) dependency lists.

Thank you!
@felliott

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove this file entirely per Eric's comments in Slack

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove references to osf.management.commands.cumulative_plos_metrics

@mkovalua
Copy link
Contributor Author

Hi @felliott , I left some thoughts about possible removal below

osf_metrics.html

Maybe it is needed to remove everything and not render the tab ‘Metrics’ in admin, some thoughts about:

preprints-tab (html) -> renderPreprintMetrics preprint_summary (js) ->
(py)

'preprint_summary' : {
    'mapper': _map_preprint_summary,
    'class': PreprintSummaryReport,
},  

(not confident whether it is needed to be deleted (how the data are mapped into DailyReport/PreprintSummaryReport do somehow keen is used (not see it)) though it has 'report_date': _timestamp_to_date(row['keen.timestamp']) in _map_preprint_summary )

addons-tab (html) -> AddonMetrics storage_addon_usage (js) ->
(py) 'storage_addon_usage': reports.StorageAddonUsage,

(not confident whether it is needed to be deleted )

raw-numbers-tab (html) -> renderRawNodeMetrics node_summary (js) -> (py)

'node_summary': {
    'mapper': _map_node_summary,
    'class': NodeSummaryReport,
}, 

(not confident whether it is needed to be deleted though it has keen.timestamp in _map_node_summary)

institution-tab (html) -> InstitutionMetrics (js) institution_summary ->

 'institution_summary': {
    'mapper': _map_institution_summary,
    'class': InstitutionSummaryReport,
},

(not confident whether it is needed to be deleted though it has keen.timestamp in _map_institution_summary)

user-tab (html) -> no usage in js (looks like it may be removed because it does not render any data from Back end)

downloads-tab (html) -> DownloadMetrics (js) file_stats -> file_stats (WaterButler) (looks like it may be removed because of legacy file_stats)

@mkovalua mkovalua requested a review from felliott February 28, 2025 15:19
Copy link
Member

@felliott felliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @mkovalua!

Looking good so far! I've poked the slack channel about removing/versioning analytics_key. I'll update you when I hear back. In addition to the noted changes, can you also please remove keen-analytics from osf's package.json and yarn.lock files? Thank you!

Cheers,
@felliott

// renderKeenMetric("#number-of-unique-downloads", "metric", uniqueDownloadsQuery,
// defaultHeight, defaultColor, publicClient);

// renderDownloadMetrics();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This chart is pulled from internal metrics, so it can be kept in.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this file you should also be able to remove client, publicClient, renderKeenClient, and anything involving keen-analysis. keen-dataviz is still used to render charts from our internal metrics, so that should be left alone. Removing keen-analysis from here should allow you to remove it from admin's package.json / yarn.lock too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and I guess axing keen-analysis means NodeLogsByUser can go as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should also be good to delete the commented-out code. If we decide to reimplement these later, we can always dig into the git history.

};
// $('#downloads-tab')[0].onclick = function() {
// Metrics.DownloadMetrics();
// };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't comment this out. The first graph ("Daily Download Counts") is still good.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can delete this entire file.

@@ -444,7 +444,7 @@ class CeleryConfig:
'osf.management.commands.sync_datacite_doi_metadata',
'osf.management.commands.update_institution_project_counts',
'osf.management.commands.populate_branched_from',
'osf.management.commands.cumulative_plos_metrics',
# 'osf.management.commands.cumulative_plos_metrics',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can delete this.

@@ -549,7 +549,7 @@ class CeleryConfig:
'osf.management.commands.delete_legacy_quickfiles_nodes',
'osf.management.commands.fix_quickfiles_waterbutler_logs',
'osf.management.commands.sync_doi_metadata',
'osf.management.commands.cumulative_plos_metrics',
# 'osf.management.commands.cumulative_plos_metrics',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can delete this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a heads-up: we might need to regenerate this migration if new migrations have gone out since you started work on this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is only testing popular projects & regs and those come from keen, theny we should be able to delete the entire file.

@@ -274,7 +273,6 @@ class NodeSerializer(TaxonomizableSerializerMixin, JSONAPISerializer):
tags = ValuesListField(attr_name='name', child=ser.CharField(), required=False)
access_requests_enabled = ShowIfVersion(ser.BooleanField(read_only=False, required=False), min_version='2.0', max_version='2.8')
node_license = NodeLicenseSerializer(required=False, source='license')
analytics_key = ShowIfAdminScopeOrAnonymous(ser.CharField(read_only=True, source='keenio_read_key'))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per @brianjgeiger's comment, we can set this to '' rather than delete it. If we do that, we don't need to worry about versioning. Maybe add a short comment saying that this can be removed after another version release.

@mkovalua mkovalua requested a review from felliott March 5, 2025 13:30
@mkovalua mkovalua marked this pull request as ready for review March 6, 2025 20:38
@mkovalua mkovalua changed the base branch from feature/b-and-i-25-01 to feature/pbs-25-05 March 11, 2025 20:23
Copy link
Member

@felliott felliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @mkovalua!

Nice work! This is getting closer. I think the last bit is to remove the keen javascript tracking from the legacy pages (those under the website/ directory). To do that:

  • Your original change to base.mako was correct; we do want to remove the keen ids and keys from base.mako.
  • website/static/js/keen.js is mostly obsolete. We still use the _logPageview function to log to our internal analytics system. But all the logging to keen can be removed. You can also remove the keen-tracker js library. The keen.helpers.getUniqueId function we use is described here: https://github.com/keen/keen-tracking.js/blob/master/lib/helpers/getUniqueId.js. Feel free to inline that code here if there isn't a better, more standard js way of doing it. A comment in their git history says they got the code from: https://stackoverflow.com/questions/105034/how-do-i-create-a-guid-uuid/2117523#2117523
  • Please add some code to start migrating the old keenSessionId cookie over to a new name, perhaps osfMetricsSessionId. It looks like that cookie only lives for about 25 minutes, so we can prepare a hotfix to delete any mention of keenSessionId after release.
  • I guess since we're purging almost all mentions of keen, it would make sense to rename keen.js to metrics.js.
  • The metrics logging is initiated in website/static/js/pages/base-page.js. If we rename keen.js, that will need to be updated in this file.

Thanks for sticking with this! It's a long process, but we're getting close to the end.

Cheers,
@felliott

@mkovalua mkovalua requested a review from felliott March 14, 2025 13:07
Copy link
Member

@felliott felliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there! Last few nitpicks: there are some files that still mention keen, but don't use it anymore. Can you remove that or rename methods. In most cases you should be able to replace keen with metrics if you are renaming something. Examples include:

  • admin/metrics/views.py - remove references to KEEN_CREDENTIALS
  • admin/templates/metrics/osf_metrics.html - remove the keen configuration. You can leave keen-dashboard.css named as it is; we're still using the keen-dataset code, so that's okay for now.
  • admin/base/settings/defaults.py - remove the keen config vars
  • website/settings/defaults.py - remove the keen config vars
  • website/settings/local-ci.py - ditto
  • website/static/js/metrics.js - rename _createOrUpdateKeenSession()
  • website/static/js/pages/base-page.js - update mention of KeenIO in comment to "metrics"
  • website/routes.py - remove the keen config and resolve the keen TODO

Thank you!

@mkovalua mkovalua requested a review from felliott March 24, 2025 19:53
Copy link
Member

@felliott felliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I made a change to remove the old geoip2 code and merged in the existing feature/pbs-25-05 branch to resolve some poetry.lock conflicts. I think this is ready to merge!

@felliott felliott requested a review from brianjgeiger March 31, 2025 14:03
@felliott
Copy link
Member

@brianjgeiger requesting your review since the target is your branch

Copy link
Collaborator

@brianjgeiger brianjgeiger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question that I suspect I know the answer to. The question might be for @felliott.

@@ -1,95 +0,0 @@
from unittest import mock
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not seeing the removal of the script that this tests in this PR. Has it been broken for a while because it relies on Keen, or should this remain in?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brianjgeiger

the removed test rely on get_keen_activity and it is removed because all the logic inside relied on Keen client response

https://github.com/CenterForOpenScience/osf.io/pull/10992/files/18e6fbd0c5a3dc7180f6e34dc594b7b13c0183ca#diff-ae86e79b3a482fcdd2a3ca8cf0f46852e8d5f716886df902ab2184f240bf15b2L31

if you mean about def activity() the code inside is partially removed and remained because not all logic rely on Keen, the Keen related logic is removed or do you mean something another ? Thanks

FYI:
@felliott

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brianjgeiger, this one is a little weird. The test script does technically test non-keen stuff set by the scripts/populate_popular_projects_and_registrations.py script, but it's very limited. I think that script might be out-of-date. It references POPULAR_LINKS_REGISTRATIONS which is only mentioned in the settings files. It seems to have been superseded by scripts/populate_new_and_noteworthy_projects.py. If that's the case, then we can remove scripts/populate_popular..., remove all references to it, and scrub POPULAR_LINKS_REGISTRATIONS from the default and local-ci settings files. We can also probably delete website.utils.activity and the other methods in that file that it calls.

@mfraezz, as another old-timer do you remember a reason why we kept these scripts?

Copy link
Contributor Author

@mkovalua mkovalua Apr 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without

@mock.patch('website.project.utils.get_keen_activity')

(new mock_client is needed to be imitated but it relied on KEEN website.project.utils.get_keen_activity previously )

locally

python3 -m pytest tests/test_populate_popular_projects_and_registrations.py

image

the test does not passed for now

image

FYI: @felliott @brianjgeiger

Copy link
Member

@felliott felliott Apr 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good spot! Yeah, we need to remove both the @mock.patch and the mock_client from the signature since we're not patching it anymore.

Copy link
Member

@mfraezz mfraezz Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those scripts were what populated these widgets on the dashboard. I believe we disabled them at one point, due to spam-related reasons. Whether or not we want to retain those widgets and related functionality, especially after the angular migration, is likely a Product question.

Screenshot 2025-04-03 at 10 39 37

The _registrations version of the script fell out of use when that was ported to ember. I believe it was intended to be a follow-up improvement, as the widget for it was retained, but the registrations it contains are specified manually in the configs.

Screenshot 2025-04-03 at 10 45 25

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm checking with Product to see if we want to keep this functionality with the Angular project, which may inform what we do here.

def activity():
"""Generate analytics for most popular public projects and registrations.
Called by `scripts/update_populate_projects_and_registrations`
"""
Node = apps.get_model('osf.AbstractNode')
popular_public_projects = []
popular_public_registrations = []
max_projects_to_display = settings.MAX_POPULAR_PROJECTS
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing the answer to my question above is "It's been broken for a while".

@brianjgeiger brianjgeiger deleted the branch CenterForOpenScience:feature/pbs-25-05 April 7, 2025 21:20
mkovalua added a commit to mkovalua/osf.io that referenced this pull request Apr 8, 2025
brianjgeiger pushed a commit that referenced this pull request Apr 9, 2025
* ENG-5751 remove old keen logic
* add migration
* remove legacy keen from metrics .js (maybe another keen logic is also needed to be removed in the files)
* remove dashboard keen (it is not used in any views)
* ENG-5751 code updates
* ENG-5751 remove keen dependency from poetry
* ENG-5751 code updates
* code updates
* ENG-5751 resolve PR comments
* ENG-5751 add default '' analytics_key
* ENG-5751 remove base_analytics.html
* add keen-dashboards.css
* fix/ENG-5751 it seems 'keen' is used in base mako files and keen js rely on it and it is not needed to be deleted
* fix/ENG-5751 resolve PR comments (not confident about 'getUniqueId' implementation https://stackoverflow.com/questions/18230217/javascript-generate-a-random-number-within-a-range-using-crypto-getrandomvalues )
* remove keen-tracking
* update getUniqueId
* fix/ENG-5751 PR comments related updates
* fix/ENG-5751 PR comments related updates
* remove geoip from deps and location code from app
* PR comment related updates
#10992 (comment)
* fix migrations conflicts
* a few tidy-ups:
 * Relocate code changes in api.nodes.serializers to be more compact
   when squished.
 * Really delete `cumulative_plos_metrics.py`. Per Eric's comments on
   Slack, we no longer need this.
 * Remove `cumulative_plos_metrics.py`-related config from
   website.settings.defaults.py
 * Add attributive comments for borrowed code in metrics library.

---------

Co-authored-by: Fitz Elliott <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants