A support for 'Protobuf Deserializer' with schema registry #1174

Open
dorocoder opened this issue Aug 1, 2021 · 8 comments
@dorocoder

Description

Looking at the source code for ProtobufDeserializer, it does not seem possible to deserialize a Protobuf message without passing its corresponding static (generated) message type as an argument.

For Avro, everything needed is already provided, and an Avro message can be deserialized using the schema registry alone.

I wonder whether support for Protobuf deserialization via the schema registry alone is currently being implemented, or is not planned yet.

class ProtobufDeserializer(object):
    """
    ProtobufDeserializer decodes bytes written in the Schema Registry
    Protobuf format to an object.
    Args:
        message_type (GeneratedProtocolMessageType): Protobuf Message type.
   ...
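
For context on the "Schema Registry Protobuf format" that the docstring mentions: as documented by Confluent, each framed message starts with a magic byte (0), a 4-byte big-endian schema ID, and a zigzag-varint list of message indexes, followed by the plain Protobuf payload. The following is a minimal stdlib-only sketch of splitting that header off; the function names are made up for illustration and are not part of confluent-kafka-python.

```python
import struct
from typing import List, Tuple

MAGIC_BYTE = 0  # first byte of every Schema Registry framed message


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Read one unsigned base-128 varint; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def zigzag_decode(n: int) -> int:
    """Map a zigzag-encoded unsigned integer back to a signed one."""
    return (n >> 1) ^ -(n & 1)


def split_confluent_frame(data: bytes) -> Tuple[int, List[int], bytes]:
    """Return (schema_id, message_indexes, protobuf_payload)."""
    if len(data) < 6 or data[0] != MAGIC_BYTE:
        raise ValueError("not a Schema Registry framed message")
    (schema_id,) = struct.unpack(">I", data[1:5])
    raw_count, pos = read_varint(data, 5)
    count = zigzag_decode(raw_count)
    if count < 0:
        raise ValueError("negative message-index count")
    if count == 0:
        # A single 0 byte is shorthand for the index list [0],
        # i.e. the first message type in the schema.
        return schema_id, [0], data[pos:]
    indexes = []
    for _ in range(count):
        raw, pos = read_varint(data, pos)
        indexes.append(zigzag_decode(raw))
    return schema_id, indexes, data[pos:]
```

The schema ID recovered this way is what a registry-only deserializer would use to look up the schema; turning that schema into a usable message class is the part this issue is asking for.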
@mhowlett
Contributor

mhowlett commented Aug 2, 2021

Not planned, haven't investigated. It would require a library that allowed the data to be traversed over dynamically - I'm not sure if this exists in Python or not.
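
To illustrate what traversing the data dynamically involves, here is a minimal stdlib-only sketch that walks the top-level fields of a serialized Protobuf message without any schema, following the public Protobuf wire encoding. The names `iter_fields` and `_read_varint` are hypothetical helpers, not an existing library API.

```python
from typing import Iterator, Tuple, Union

WireValue = Union[int, bytes]


def _read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Read one unsigned base-128 varint; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def iter_fields(buf: bytes) -> Iterator[Tuple[int, int, WireValue]]:
    """Yield (field_number, wire_type, raw_value) for each top-level field."""
    pos = 0
    while pos < len(buf):
        tag, pos = _read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:  # varint
            value, pos = _read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit, returned as raw bytes
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited: string, bytes, or nested message
            length, pos = _read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit, returned as raw bytes
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if pos > len(buf):
            raise ValueError("truncated field")
        yield field_number, wire_type, value
```

For example, `list(iter_fields(bytes.fromhex("089601120774657374696e67")))` yields `[(1, 0, 150), (2, 2, b"testing")]`. Only field numbers and raw values come back: without the schema there are no field names, and a length-delimited value could equally be a string, bytes, or a nested message. That gap is why the descriptor from the registry is still needed.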

@dorocoder
Author

> Not planned, haven't investigated. It would require a library that allowed the data to be traversed over dynamically - I'm not sure if this exists in Python or not.

Thanks for the quick and clear answer.

@nilansaha

I was going to post this question as well. @mhowlett, are there any equivalent libraries in, let's say, Java that allow the data to be traversed dynamically?

@dorocoder
Author

> I was going to post this question as well. @mhowlett, are there any equivalent libraries in, let's say, Java that allow the data to be traversed dynamically?

I hope you can find an answer at #link

@rayokota
Member

Fixed by #1852

@nivgold

nivgold commented Feb 12, 2025

> Fixed by #1852

On master, the ProtobufDeserializer constructor still requires a generated protobuf object. This issue isn't fixed.

@sauljabin

> > Fixed by #1852
>
> On master, the ProtobufDeserializer constructor still requires a generated protobuf object. This issue isn't fixed.

@nivgold, that makes sense

@rayokota reopened this Feb 12, 2025

@markfickett

I'm trying to consume messages from a topic with a protobuf schema from the Python client (after installing confluent-kafka[protobuf,schemaregistry]). But I'm running into the same issue, plus some difficulty working around it.

Is there a recommended way to build protos from the schema registry, including extensions such as confluent/meta.proto, so that they can be used with ProtobufDeserializer?

It looks like this is possible in Databricks/Spark, where their from_protobuf method loads everything dynamically (we have other teams using Spark processing and that works).

(1) As @nivgold says, the ProtobufDeserializer still requires a Python proto message class, as in the example protobuf_consumer.py. Actually it looks like the example avro_consumer.py also loads the schema from a static file, so this may apply beyond protobuf schemas.

(2) I tried downloading the generated proto file from my Confluent schema registry. It requires confluent.field_meta to define options like:

    int32 my_number_field = 13 [(confluent.field_meta) = {
      params: [
        {
          key: "connect.type",
          value: "int16"
        }
      ]
    }];

This won't compile as is. I tried downloading meta.proto too and adding an import to my proto source file from the schema registry. However...

(3) If I build the meta.proto message as well as my own message's descriptor, I get the error TypeError: Couldn't build proto file into descriptor pool: duplicate symbol 'confluent.file_meta'. If I include meta.proto in the source but don't build its proto message, or build the Meta proto but then try to trim out the conflicting pieces, I variously get Python import errors or missing-descriptor errors from the generated proto code.
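
For the download in step (2), a minimal sketch of fetching a schema by ID over the Schema Registry REST API (`GET /schemas/ids/{id}`) using only the standard library. The helper names are hypothetical, authentication is left out, and this covers only the lookup: it does not solve the compilation or duplicate-symbol problems described above.

```python
import json
import urllib.request
from typing import Any, Dict


def schema_by_id_url(base_url: str, schema_id: int) -> str:
    """Build the REST URL for looking up a schema by its global ID."""
    return f"{base_url.rstrip('/')}/schemas/ids/{schema_id}"


def parse_schema_response(body: bytes) -> Dict[str, Any]:
    """Extract the fields relevant to Protobuf from the JSON response."""
    doc = json.loads(body)
    return {
        # The registry omits schemaType for Avro schemas.
        "schema_type": doc.get("schemaType", "AVRO"),
        # For Protobuf this is the .proto source text.
        "schema": doc["schema"],
        # Other registered schemas this one imports, if any.
        "references": doc.get("references", []),
    }


def fetch_schema(base_url: str, schema_id: int) -> Dict[str, Any]:
    """Fetch and parse a schema; assumes an unauthenticated registry."""
    with urllib.request.urlopen(schema_by_id_url(base_url, schema_id)) as resp:
        return parse_schema_response(resp.read())
```

Combined with the schema ID embedded in each message, this gives the .proto text for a topic's messages. The Python client would still need that text compiled into a descriptor before ProtobufDeserializer could use it.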

7 participants