Adding support for hex-encoded byte vectors on knn-search #105393
Conversation
Pinging @elastic/es-search (Team:Search)
Hi @pmpailis, I've created a changelog YAML for you.
    return vector;
}

private static float[] parseQueryVectorArray(XContentParser parser) throws IOException {
This code is duplicated, maybe we could extract it to a common class?
++ Did some refactoring in DenseVectorFieldMapper to avoid code duplication but could very well do the same here. Will update :)
dotProduct += value * value;
index++;
XContentParser.Token token = context.parser().currentToken();
if (token == XContentParser.Token.START_ARRAY) {
Nit: Maybe we could use a switch expression here
Addressed in 3833517
...tTest/resources/rest-api-spec/test/search.vectors/175_knn_query_hex_encoded_byte_vectors.yml
LGTM, thanks Panos!
Good progress! High-level concerns:
- I wonder if we should ever allow hex-encoded strings to query float-encoded vectors. I get this may help with backwards compatibility.
- We shouldn't allow single-value numbers to be parsed as an array.
- We should parse the hex string directly into byte[] and write that between nodes, only transforming to float[] for backwards compatibility.
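A minimal, dependency-free sketch of that last suggestion (hypothetical helper, not the PR's actual code): decode the hex string straight into a byte[], and only widen to float[] when an older node requires it.

```java
import java.util.Arrays;

public class HexVector {
    // Decode a hex string such as "400ae2" directly into the signed bytes
    // [64, 10, -30], with no intermediate float[] allocation.
    public static byte[] decode(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex string: " + hex);
            }
            bytes[i] = (byte) ((hi << 4) | lo);
        }
        return bytes;
    }

    // Widening conversion, needed only when talking to nodes that expect float[].
    public static float[] toFloats(byte[] bytes) {
        float[] floats = new float[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            floats[i] = bytes[i];
        }
        return floats;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(decode("400ae2")));            // [64, 10, -30]
        System.out.println(Arrays.toString(toFloats(decode("400ae2")))); // [64.0, 10.0, -30.0]
    }
}
```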
---
"Knn search with hex string for float field" :
  # [64, 10, -30] - is encoded as '400ae2'
  # this will be properly decoded but only because:
  # (i) the provided input is compatible as the values are within [Byte.MIN_VALUE, Byte.MAX_VALUE] range
  # (ii) we do not differentiate between byte and float fields when initially parsing a query
  - do:
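The '400ae2'-style comments in these fixtures can be sanity-checked with a tiny encoder (hypothetical helper, for illustration only):

```java
public class HexEncode {
    // Encode signed bytes into the lowercase hex form used in these fixtures,
    // e.g. [64, 10, -30] -> "400ae2" and [-128, 127, 10] -> "807f0a".
    public static String encode(byte[] vector) {
        StringBuilder sb = new StringBuilder(vector.length * 2);
        for (byte b : vector) {
            sb.append(Character.forDigit((b >> 4) & 0xf, 16)); // high nibble
            sb.append(Character.forDigit(b & 0xf, 16));        // low nibble
        }
        return sb.toString();
    }
}
```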
I don't think we should support hexadecimal strings for float at all.
I am conflicted on this. I realize now we allow byte[] (which is just always parsed as float[]). I wonder if we should allow the hex encoded strings. I am flip flopping here :/. Gonna have to think some more.
Agree, we should not allow this
index: knn_hex_vector_index
id: "4"
body:
  my_vector_float: "807f0a"
same concerns as above, I don't think we should allow this.
# [-128, 127, 10] - is encoded as '807f0a'
- do:
    catch: /Failed to parse object./
    index:
      index: knn_hex_vector_index
      id: "4"
      body:
        my_vector_float: "807f0a"
While I agree we shouldn't allow hex-encoded elements indexed into a float field, we should not have test code in setup. Please move it to its own test (same goes for the query one).
return switch (token) {
    case START_ARRAY -> parseVectorArray(context, fieldMapper, (val, idx) -> byteBuffer.put(val));
    case VALUE_STRING -> parseHexEncodedVector(context, fieldMapper, (val, idx) -> byteBuffer.put(val));
    case VALUE_NUMBER -> parseNumberVector(context, fieldMapper, (val, idx) -> byteBuffer.put(val));
We shouldn't allow this. If this is something we want to support (just a number that gets put into a vector of dim==1), then it should be a separate PR. Personally, I am against it as it encourages bad behavior. If folks have a single value, they should index it as keyword or int or float.
I'm definitely with you on this, but added this because it is what we already have and did not want to change the existing API. Currently, when parsing the request we make use of declareFloatArray, which ends up being defined as FLOAT_ARRAY(START_ARRAY, VALUE_NUMBER, VALUE_STRING); hence, while far from ideal, we already support single-valued numbers. I'm +1 if you agree to remove this (don't think that it'd actually affect anyone tbf)
hence, while far from ideal, we already support single valued numbers
POST vectors/_doc
{
"vector": 1
}
"caused_by": {
"type": "parsing_exception",
"reason": "Failed to parse object: expecting token of type [VALUE_NUMBER] but found [END_OBJECT]",
"line": 4,
"col": 1
}
That's when indexing. But we do allow it on the query side. So, let's disallow it on indexing, but continue to allow it query side.
Guess I missed that :/ Somehow I was under the impression that this was consistently allowed on both indexing & searching. Will proceed to disallow it on indexing and keep the existing behavior on query-time.
server/src/main/java/org/elasticsearch/search/vectors/KnnSearchBuilder.java
return switch (token) {
    case START_ARRAY -> parseQueryVectorArray(parser);
    case VALUE_STRING -> parseHexEncodedVector(parser);
    case VALUE_NUMBER -> parseNumberVector(parser);
We should throw. We shouldn't allow this.
static float[] parseQueryVector(XContentParser parser) throws IOException {
    XContentParser.Token token = parser.currentToken();
    return switch (token) {
        case START_ARRAY -> parseQueryVectorArray(parser);
I think it's good to assume float[] when passed an array. But we should be able to handle byte[] directly via hex.
}

private static float[] parseHexEncodedVector(XContentParser parser) throws IOException {
    // TODO optimize this as the array returned will be recomputed later again as a byte array
We should do this now :).
    void accept(byte value, int index);
}

public static class VectorData {
Not sure if this is the correct place or not - but thought to add it here initially, as almost all related classes already had a dependency on DenseVectorFieldMapper - happy to move it to its own class / discuss alternatives.
- I think it should be its own top-level class in org.elasticsearch.search.vectors.
- It should be a record; you can override the canonical constructor for uniqueness checks.
- It should also implement Writeable and possibly ToXContent and handle all the serialization stuff.
- It should handle its own XContent parsing; this way users of this only need to use this class, and it can correctly parse numerical array or string values.
++ Tbh the main reason that I defined it as a class instead of a record was to hide constructors and enable new object creation only via static methods (to ensure uniqueness).
was to hide constructors and enable new object creation only via static methods (to ensure uniqueness).
You can do that in the canonical ctor and still have static methods that are preferred.
I would do something like:
record VectorData(float[] floats, byte[] bytes) {
    public VectorData {
        // exactly one of floats/bytes must be non-null
        if ((floats == null) == (bytes == null)) {
            throw new IllegalArgumentException("You must supply exactly either floats or bytes");
        }
    }
}
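Note the canonical-constructor check has to fire when the two representations are both present or both absent (a plain xor on the null checks would reject the valid case). A self-contained sketch of the record plus the preferred static creation paths mentioned in the thread (hypothetical names, not the merged code):

```java
public record VectorData(float[] floats, byte[] bytes) {
    public VectorData {
        // exactly one representation must be supplied
        if ((floats == null) == (bytes == null)) {
            throw new IllegalArgumentException("You must supply exactly either floats or bytes");
        }
    }

    // preferred creation paths, keeping callers away from the two-arg ctor
    public static VectorData fromFloats(float[] floats) {
        return new VectorData(floats, null);
    }

    public static VectorData fromBytes(byte[] bytes) {
        return new VectorData(null, bytes);
    }

    public boolean isFloatVector() {
        return floats != null;
    }

    public boolean isByteVector() {
        return bytes != null;
    }
}
```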
@elasticmachine update branch
merge conflict between base and head
Looks like we are headed in the right direction! I think this code becomes easier to maintain and read if VectorData is a standalone public record class that handles all its own serialization & parsing.
public byte[] asByteVector() {
    if (isByteVector()) {
        return byteVector;
    } else if (isFloatVector()) {
        ElementType.BYTE.checkVectorBounds(floatVector);
        byte[] vec = new byte[floatVector.length];
        for (int i = 0; i < floatVector.length; i++) {
            vec[i] = (byte) floatVector[i];
        }
        return vec;
    }
    return new byte[0];
}
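A one-pass version of the float-to-byte conversion discussed here, with the bounds check and the cast in the same loop, could look roughly like this (hypothetical helper; the real code delegates to ElementType.BYTE.checkVectorBounds):

```java
public class ByteConversion {
    // Convert a float vector to bytes in a single pass, rejecting values that
    // are fractional, NaN, or outside [Byte.MIN_VALUE, Byte.MAX_VALUE].
    public static byte[] asByteVector(float[] floatVector) {
        byte[] vec = new byte[floatVector.length];
        for (int i = 0; i < floatVector.length; i++) {
            float v = floatVector[i];
            // NaN % 1.0f is NaN, so NaN also fails this whole-number check
            if (v % 1.0f != 0.0f) {
                throw new IllegalArgumentException(
                    "element [" + v + "] at index [" + i + "] is not a whole number");
            }
            if (v < Byte.MIN_VALUE || v > Byte.MAX_VALUE) {
                throw new IllegalArgumentException(
                    "element [" + v + "] at index [" + i + "] is out of byte range");
            }
            vec[i] = (byte) v;
        }
        return vec;
    }
}
```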
It is nice to have this for the mapper. However, this should have checks to ensure that if this is called, it's actually a byte vector (meaning, whole numbers between Byte.MIN_VALUE and Byte.MAX_VALUE).
There is a call to ElementType.BYTE.checkVectorBounds(floatVector); which would throw if any of the elements is outside of the BYTE range or decimal. However, to avoid code duplication, we do have to iterate over the array twice here, which is not great either :/ Will refactor and add all necessary checks in-place.
Pretty sure we iterate twice already. Though, it doesn't make sense to "bounds check" once the vector has switched to byte[] already, as you know for sure that byte[] values are whole and between min/max ;) (as they are byte values).
It seems like VectorData should have an as... method and an asElementType(ElementType) method to handle these weird scenarios.
The asByteVector and asFloatVector methods are currently called only when moving to explicit implementations (e.g. DenseVectorFieldType#createKnnByteQuery and DenseVectorFieldType#createKnnFloatQuery), and in the toXContent methods. So, in most cases, once the underlying vector has been converted to byte, the VectorData record itself is no longer used, so we won't need to re-transform the data.
Would it make sense to have a "rewrite"-like method to return a new VectorData with byte[] instead of float[]?
It seems like VectorData should have an as... method and an asElementType(ElementType) method to handle these weird scenarios.
Not sure the distinction of the two usages is clear to me. Could you please provide an example of a scenario that the asElementType method would handle, to help me understand the intent?
Not sure the distinction of the two usages is clear to me. Could you please provide an example of a scenario that the asElementType method would handle to help me understand the intent
Maybe I misunderstand then. I was thinking a query:
- Starting from an old node (parsed as float[])
- Serialized to a new node (read into VectorData)
- toQuery is called, but really it's an element_type: byte field, so we have to check dimensions and transform into byte[] to create the correct Lucene query.
Yeah, the process remains pretty much the same - exactly as you described it. There was also a parseFloat method that tried to eagerly read into byte[] during initial parsing (from either XContent or older nodes), but it has now been removed to better distinguish between explicit byte & float vectors and fail for the hex-float combination.
So now, hex aside, we pass around VectorData records holding float vectors (this hasn't changed) and call asFloatVector / asByteVector in DenseVectorFieldMapper in the createKnnQuery and createExactKnnQuery methods, depending on the element type. At that point, as you've mentioned, we do the dimension check / bound check etc. and pass the byte[] instance from then on.
I might have misunderstood/missed something, but AFAICT the dimensionality check and conversion happens only at that point, hence why the need for an additional asElementType method isn't clear to me.
Also, please note that I've just pushed a new set of changes moving VectorData outside of DenseVectorFieldMapper and taking care of serialization as suggested.
if (out.getTransportVersion().onOrAfter(TransportVersions.KNN_EXPLICIT_BYTE_QUERY_VECTOR_PARSING)) {
    boolean isFloat = query.isFloatVector();
    out.writeBoolean(isFloat);
    if (isFloat) {
        out.writeFloatArray(query.asFloatVector());
    } else {
        out.writeByteArray(query.asByteVector());
    }
} else {
    out.writeFloatArray(query.asFloatVector());
}
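The boolean-tagged wire format above can be sketched generically; Elasticsearch's StreamOutput is not available here, so this hypothetical helper uses plain DataOutputStream / DataInputStream purely to illustrate the tag-then-payload encoding that the thread suggests encapsulating in VectorData:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class TaggedVectorIO {
    // Write a boolean tag, then whichever representation is present.
    public static byte[] write(float[] floats, byte[] bytes) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            boolean isFloat = floats != null;
            out.writeBoolean(isFloat);
            if (isFloat) {
                out.writeInt(floats.length);
                for (float f : floats) out.writeFloat(f);
            } else {
                out.writeInt(bytes.length);
                out.write(bytes);
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e); // in-memory streams never fail
        }
    }

    // Read the tag back and return float[] or byte[] accordingly.
    public static Object read(byte[] wire) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
            if (in.readBoolean()) {
                float[] floats = new float[in.readInt()];
                for (int i = 0; i < floats.length; i++) floats[i] = in.readFloat();
                return floats;
            }
            byte[] bytes = new byte[in.readInt()];
            in.readFully(bytes);
            return bytes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

An older transport version would skip the tag and always send the float form, which is why the byte-to-float widening path has to stay around for BWC.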
This and even the transport version checks should be encapsulated in VectorData.
out.writeString(field);
}

@Override
protected void doXContent(XContentBuilder builder, Params params) throws IOException {
    builder.startObject(NAME);
    builder.field("query", query);
    builder.field("query", query.asFloatVector());
Transforming byte -> float here isn't necessary. The VectorData should determine which "kind" it is and not bother transforming between them. Encapsulating the toXContent should fix this.
    return asFloatVector(true);
}

public float[] asFloatVector(boolean failIfByte) {
Don't really like this - as we only need to force the conversion for serialization. Maybe it'd be better to have a separate (albeit very similar) method instead, or include this logic somehow in the writeTo? The main challenge there is that in certain cases (e.g. KnnVectorQueryBuilder) there is additional logic in-between handling the query_vector param.
Thinking more on it, I don't think we should do this. Right now we allow "byte" arrays to query float-indexed values.
Consider the following:
- A new coordinator accepts the hex-encoded byte array
- It's serialized to an older node (thus transformed to float[])
- Then we successfully query an element_type: float field.
I think it's OK for us to be lenient here.
What do you think @mayya-sharipova? It seems that preventing byte[] array queries against an element_type: float field is causing more trouble than it's worth :(
Yeah, I don't think we should fail here. float v = b is valid in Java as it's widening the values. It seems like an unnecessary restriction the more I think about it.
Sorry for flip-flopping on this so much. What do you think @pmpailis, should we restrict it? @mayya-sharipova what say you? I am happy to go with the majority as I obviously cannot make up my mind :)
Currently we do support byte values for float fields (we won't know about the element type until much later), so I guess it makes sense to be "consistent" for hex as well and let it pass for float vectors. I do understand the reasoning for restricting this, but we don't do that for a standard byte array now either, so this could potentially cause some confusion.
+ the scenario you mentioned, unless we decide to throw for hex & old nodes (which I don't really think would be nice), would make us be lenient anyway :)
Updated (temporarily) to not fail for byte -> float conversion. Happy to change back if we decide to do so :) @mayya-sharipova wdyt?
public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
    if (floatVector != null) {
        builder.array(params.param(XCONTENT_PARAM_NAME, DEFAULT_XCONTENT_NAME), floatVector);
    } else {
        builder.array(params.param(XCONTENT_PARAM_NAME, DEFAULT_XCONTENT_NAME), byteVector);
    }
    return builder;
}
It would be much simpler if this would only write the non-null array without a field name. Then the containing objects would do builder.field(fieldName, vectorData); which will get written as fieldName: [1, 2, 3...]. I think the use of XContentParams here, while interesting, is complex.
Yeap, I agree with your point. Tbh that's how I had it initially, but some serialization tests were failing with the following, so I resorted to the more complex approach using XContentParams:
...
Caused by: com.fasterxml.jackson.core.JsonGenerationException: Can not start an array, expecting field name
at com.fasterxml.jackson.core.JsonGenerator._reportError(JsonGenerator.java:2849)
at com.fasterxml.jackson.dataformat.yaml.YAMLGenerator._verifyValueWrite(YAMLGenerator.java:916)
at com.fasterxml.jackson.dataformat.yaml.YAMLGenerator.writeStartArray(YAMLGenerator.java:586)
at org.elasticsearch.xcontent.provider.json.JsonXContentGenerator.writeStartArray(JsonXContentGenerator.java:169)
I ended up re-writing the tests using mocks instead of concrete instances for the validation of VectorData#toXContent, but failed to change this one back.
++ for the change, will update it.
Done in 2436b03
if (vec == null) return null;
return new VectorData(vec);
Suggested change:
return vec == null ? null : new VectorData(vec);
But not a big deal.
@@ -400,33 +445,20 @@ public void parseKnnVectorAndIndex(DocumentParserContext context, DenseVectorFie
@Override
double parseKnnVectorToByteBuffer(DocumentParserContext context, DenseVectorFieldMapper fieldMapper, ByteBuffer byteBuffer)
I like how much cleaner this is becoming :D
@elasticmachine update branch
whether we want to support converting to float when a user has provided a hex vector (have to also consider desired bwc for this)
I think this is OK. float = byte is an acceptable conversion in almost every programming language AND it's something that HAS to happen for BWC to not be completely busted.
There is no technical reason for the restriction that I can think of.
@mayya-sharipova what do you think?
…ex_encoded_byte_vectors
@@ -384,11 +396,44 @@ public void parseKnnVectorAndIndex(DocumentParserContext context, DenseVectorFie
        + "];"
    );
}
vector[index++] = (byte) value;
consumer.accept((byte) value, index++);
Why do we need to be so tricky with a Consumer here and have a ByteBuffer and byte[] path? As far as I understand, all our ByteBuffer are array-backed anyway? Can't we do without the indirection and non-static callsite here and just always insert into an array?
Thanks for taking a look and for the suggestion @original-brownbear! Updated to remove the Consumer overhead and to parse directly into a byte array.
server/src/main/java/org/elasticsearch/index/mapper/vectors/DenseVectorFieldMapper.java
…ex_encoded_byte_vectors
server/src/main/java/org/elasticsearch/index/mapper/vectors/DenseVectorFieldMapper.java
server/src/main/java/org/elasticsearch/index/mapper/vectors/DenseVectorFieldMapper.java
@pmpailis Thanks for persisting and addressing all the comments! Great work, Panos, I very much like how the code looks now.
This is good stuff.
@elasticmachine update branch
run elasticsearch-ci/part-1
@elasticmachine update branch
@elasticmachine update branch
run elasticsearch-ci/part-1
…ex_encoded_byte_vectors
The following tests are currently failing, most likely as a side-effect of another test (
Once this PR is merged, we can proceed with merging this one as well.
Thanks everyone for the thorough reviews and the discussions ❤️
This PR updates the parsing of the query_vector param in both knn-search & knn-query to support hex-encoded byte vectors. This means that the following 2 requests are now equivalent (same goes for the knn query) and would yield the same results. The same parsing also takes place during indexing, so similarly, we now support both of the following (equivalent) formats.