
Semantic_text match_all with Highlighter #128702


Draft: wants to merge 3 commits into base: main


Samiul-TheSoccerFan
Contributor

This PR fixes missing highlights for semantic_text fields when a match_all query is used. We guide users to use semantic highlighting to view their chunks, but highlighting currently returns no results with match_all. Highlighting is usually unnecessary for non-inference fields with match_all, but for semantic_text fields it is essential for returning the generated chunks. With this change, those highlight results are returned as expected.

Test cases:

Only text fields

PUT my-index/
{
  "mappings": {
    "properties": {
      "semantic1": {
        "type": "text"
      },
      "semantic2": {
        "type": "text"
      }
    }
  }
}

POST my-index/_doc/1
{
  "semantic1": "Puggles are pugs and beagles",
  "semantic2": "Chiweenies are chihuahuas and dachshunds"
}


GET my-index/_search
{
  "query": {
    "match_all": {}
  }
}

Only inference fields

PUT my-semantic-index/
{
  "mappings": {
    "properties": {
      "semantic1": {
        "type": "semantic_text"
      },
      "semantic2": {
        "type": "semantic_text"
      }
    }
  }
}

POST my-semantic-index/_doc/1
{
  "semantic1": "Puggles are pugs and beagles",
  "semantic2": "Chiweenies are chihuahuas and dachshunds"
}

GET my-semantic-index/_search
{
  "query": {
    "match_all": {}
  },
  "highlight": {
    "fields": {
      "semantic1": {
        "type": "semantic",
        "number_of_fragments": 1
      },
      "semantic2": {
        "type": "semantic",
        "number_of_fragments": 1
      }
    }
  }
}
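
For scripted testing, the request above can also be assembled programmatically. The following is a minimal Python sketch that only builds the JSON body; the helper name `build_semantic_highlight_request` is hypothetical and not part of this PR or of any Elasticsearch client. It does not send anything to a cluster.

```python
import json


def build_semantic_highlight_request(fields, number_of_fragments=1):
    """Build a match_all search body that asks for semantic highlighting.

    Hypothetical helper for illustration; it mirrors the console request
    in the test case and performs no network calls.
    """
    return {
        "query": {"match_all": {}},
        "highlight": {
            "fields": {
                name: {
                    "type": "semantic",
                    "number_of_fragments": number_of_fragments,
                }
                for name in fields
            }
        },
    }


body = build_semantic_highlight_request(["semantic1", "semantic2"])
print(json.dumps(body, indent=2))
```

The printed body can be pasted after `GET my-semantic-index/_search` in the console, or passed to whichever client is in use.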

Both inference and non-inference fields

PUT test-index
{
  "mappings": {
    "properties": {
      "test_field": {
        "type": "semantic_text"
      },
      "non_infer_field": {
        "type": "text"
      }
    }
  }
}

PUT test-index/_doc/doc1
{
  "test_field": "these are not the droids you're looking for. He's free to go around",
  "non_infer_field": "this is a non-inference field"
}

GET test-index/_search
{
  "query": {
    "match_all": {}
  },
  "highlight": {
    "fields": {
      "test_field": {
        "type": "semantic",
        "number_of_fragments": 1
      }
    }
  }
}
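
One way to check a response from a request like the one above is to look at the `highlight` object of each hit. The sketch below assumes the standard search response layout (`hits.hits[]._id` and `hits.hits[].highlight`); the helper names and the sample response are hypothetical placeholders written by hand, not output captured from Elasticsearch.

```python
def collect_highlights(response):
    """Map each hit's _id to its highlight fragments, keyed by field name."""
    return {
        hit["_id"]: hit.get("highlight", {})
        for hit in response.get("hits", {}).get("hits", [])
    }


def fields_missing_highlights(response, expected_fields):
    """Return {_id: [fields without fragments]} for hits lacking highlights."""
    missing = {}
    for doc_id, highlights in collect_highlights(response).items():
        absent = [f for f in expected_fields if not highlights.get(f)]
        if absent:
            missing[doc_id] = absent
    return missing


# Hand-written placeholder response, NOT real Elasticsearch output.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "doc1", "highlight": {"test_field": ["<placeholder chunk>"]}},
            {"_id": "doc2"},  # hypothetical hit that came back without highlights
        ]
    }
}

print(fields_missing_highlights(sample_response, ["test_field"]))
```

An empty result from `fields_missing_highlights` would mean every hit carried at least one fragment for each expected field.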

Some additional cases

GET my-*/_search
{
  "query": {
    "match_all": {}
  },
  "highlight": {
    "fields": {
      "semantic1": {
        "number_of_fragments": 1
      },
      "semantic2": {
        "number_of_fragments": 1
      }
    }
  }
}

GET my-*/_search
{
  "query": {
    "match_all": {}
  },
  "highlight": {
    "fields": {
      "semantic1": {
        "type": "semantic",
        "number_of_fragments": 1
      },
      "semantic2": {
        "type": "semantic",
        "number_of_fragments": 1
      }
    }
  }
}

// Throws an error because the fields are semantic_text fields
GET my-*/_search
{
  "query": {
    "match_all": {}
  },
  "highlight": {
    "fields": {
      "semantic1": {
        "type": "score",
        "number_of_fragments": 1
      },
      "semantic2": {
        "type": "score",
        "number_of_fragments": 1
      }
    }
  }
}
