docs.rs search is ordered randomly for non-exact matches #489

Closed
CGQAQ opened this issue Nov 25, 2019 · 4 comments · Fixed by #721
Labels
A-crate-search Area: A problem or feature request for searching crates

Comments

CGQAQ commented Nov 25, 2019

Searching for Tokio and tokio gives completely different results:
tokio first result: tokio-0.2.0-alpha.6
Tokio first result: new-tokio-smtp-0.8.1, and tokio-0.2.0-alpha.6 DOES NOT appear at all
futures first result: futures-0.3.1
Futures first result: futures_future-0.1.1, and futures-0.3.1 is the fifth result
I don't think this is how a search engine should behave.

[Images: a, a-r, b, b-r, c, c-r]

P.S.: I found this when I was in bed, using my phone to search for the tokio docs, and my keyboard auto-capitalized the first character of tokio.

Contributor

Zexbe commented Jan 25, 2020

I'm not sure that is the problem. If you run the exact same search and hit refresh, the results change.

jyn514 changed the title from "docs.rs search is case sensitive" to "docs.rs search is random for non-exact matches" on Jan 31, 2020
jyn514 changed the title from "docs.rs search is random for non-exact matches" to "docs.rs search is ordered randomly for non-exact matches" on Jan 31, 2020
Member

jyn514 commented Feb 1, 2020

I came up with a SQL query that works but is kind of complicated:

WITH valid_releases AS (
    SELECT *
    FROM releases INNER JOIN crates ON releases.crate_id = crates.id
    WHERE yanked = false AND rustdoc_status = true AND name LIKE '%regex%'
), ordering AS (
  SELECT ROW_NUMBER() OVER (
    -- no PARTITION BY: rows are already grouped by name, so partitioning
    -- by name would number every row 1
    ORDER BY
      name = 'regex' DESC,
      MAX(version) > '1.0', -- TODO: semver in SQL ??
      SUM(downloads) DESC,
      name
  ) as row,
      name, MAX(version) as version,
      SUM(downloads) as downloads
    FROM valid_releases
    GROUP BY name
    ORDER BY
      name = 'regex' DESC,
      MAX(version) > '1.0', -- TODO: semver in SQL ??
      SUM(downloads) DESC,
      name
    LIMIT 10
) SELECT ordering.*, release_time -- , description, github_stars
  FROM releases INNER JOIN crates ON releases.crate_id = crates.id
                INNER JOIN ordering
                  ON releases.version = ordering.version AND crates.name = ordering.name
  ORDER BY ordering.row ASC;
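
For readers who find the ordering easier to follow outside SQL, here is a rough in-memory sketch of what the query above is aiming for: an exact name match first, then more downloads, then the name as a final tie-breaker so that refreshing cannot reshuffle the results. `CrateRow` and `rank` are hypothetical names made up for this illustration, not docs.rs code, and the version comparison is left out because the semver TODO in the query is still open.

```rust
use std::cmp::Reverse;

/// Hypothetical, simplified stand-in for one grouped row of the query
/// (one crate with its summed downloads); this is not the docs.rs schema.
#[derive(Debug, Clone, PartialEq)]
struct CrateRow {
    name: String,
    downloads: u64,
}

/// Keep crates whose name contains `query`, then order them:
/// exact match first, then most downloads, then name as a tie-breaker.
fn rank(mut rows: Vec<CrateRow>, query: &str) -> Vec<CrateRow> {
    rows.retain(|r| r.name.contains(query));
    rows.sort_by_key(|r| {
        (
            Reverse(r.name == query), // mirrors `name = 'regex' DESC`
            Reverse(r.downloads),     // mirrors `SUM(downloads) DESC`
            r.name.clone(),           // mirrors `name`; removes remaining ties
        )
    });
    rows
}

fn main() {
    let rows = vec![
        CrateRow { name: "regex-syntax".into(), downloads: 900 },
        CrateRow { name: "regex".into(), downloads: 500 },
        CrateRow { name: "fancy-regex".into(), downloads: 900 },
        CrateRow { name: "serde".into(), downloads: 9000 },
    ];
    for row in rank(rows, "regex") {
        println!("{} ({})", row.name, row.downloads);
    }
}
```

Because crate names are unique, the last sort key means no two rows ever compare equal, which is the property that keeps the order stable across repeated searches.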

Member

jyn514 commented Feb 1, 2020

> I'm not sure that is the problem. If you do the exact same search, and hit refresh the results change.

BTW this is also partially caused by using LIKE instead of ILIKE, so the original issue report was also correct.
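
To illustrate just the LIKE versus ILIKE part, here is a small sketch of a case-sensitive and a case-insensitive substring match. The function names are hypothetical, and it models only the pattern-matching difference, not the rest of the docs.rs search, so it does not reproduce every result listed in the report.

```rust
/// Case-sensitive substring match, roughly `name LIKE '%query%'`
/// (ignoring the `%` and `_` wildcards inside the query itself).
fn matches_like(name: &str, query: &str) -> bool {
    name.contains(query)
}

/// Case-insensitive substring match, roughly `name ILIKE '%query%'`.
/// Lowercasing both sides is a simplification that is adequate for
/// ASCII crate names.
fn matches_ilike(name: &str, query: &str) -> bool {
    name.to_lowercase().contains(&query.to_lowercase())
}

fn main() {
    let names = ["tokio", "tokio-h2", "new-tokio-smtp"];
    for query in ["tokio", "Tokio"] {
        let like = names.iter().filter(|n| matches_like(n, query)).count();
        let ilike = names.iter().filter(|n| matches_ilike(n, query)).count();
        println!("{}: LIKE matches {}, ILIKE matches {}", query, like, ilike);
    }
}
```

With the case-sensitive version, a query that a phone keyboard capitalized to `Tokio` matches none of the lowercase names, while the case-insensitive version treats both spellings the same.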

jyn514 added the A-crate-search label (Area: A problem or feature request for searching crates) on Feb 1, 2020
Member

jyn514 commented May 8, 2020

The new behavior for tokio is

  1. tokio
  2. tokyo
  3. tokio-h2
    ... various other tokio- crates

The new behavior for Tokio is

  1. tokio
  2. tokio-h2
    ... various other tokio- crates


3 participants