InvalidSchemaFormatValue exception when unmarshalling string with format 'byte' #261

Open
eiband opened this issue Jun 29, 2020 · 3 comments

eiband commented Jun 29, 2020

When validating a property of type

type: string
format: byte

it is assumed that the base64-decoded value can be represented as a UTF-8 string. As a result, InvalidSchemaFormatValue is raised when unmarshalling binary data that is not valid UTF-8, e.g. the value base64.b64encode(b'\xff').

The exception text is:
Failed to format value /w== to format byte: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte.
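
The underlying failure can be reproduced with the standard library alone. This is a minimal sketch that leaves openapi-core out of the picture and only mirrors the decode-then-convert-to-text step described above; the variable names are illustrative.

```python
import base64

# Encode a single byte that is not valid UTF-8 on its own.
encoded = base64.b64encode(b'\xff')   # the value from the report
raw = base64.b64decode(encoded)       # round-trips back to the raw byte

error_message = None
try:
    # Treating the decoded bytes as UTF-8 text is the step that fails.
    raw.decode('utf-8')
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(encoded, raw, error_message)
```

The base64 step itself succeeds; only the conversion of the decoded bytes to text raises.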

sweh commented Feb 11, 2021

IMO this was introduced with the fix for #117. I don't know the intention of that fix or why byte should be a text type, but the way it's implemented fails for decoded bytes that are not valid UTF-8. I see two possible solutions, both of which work for me:

Go back to the way byte was handled before #117 and simply try to base64-decode the value:

from base64 import b64decode

def format_byte(value, encoding='utf8'):
    return b64decode(value)

Or, ignore errors when converting the base64 decoded value to string:

def format_byte(value, encoding='utf8'):
    return text_type(b64decode(value), encoding, "ignore")
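
The two options differ in what they hand back to the caller, which may matter when choosing between them. The sketch below is standalone: the function names `format_byte_raw` and `format_byte_ignore` are hypothetical, and `str` stands in for `text_type` on the assumption of Python 3.

```python
from base64 import b64decode, b64encode

def format_byte_raw(value):
    # Option 1: return the decoded bytes unchanged.
    return b64decode(value)

def format_byte_ignore(value, encoding='utf8'):
    # Option 2: convert to text, silently dropping undecodable bytes.
    return str(b64decode(value), encoding, 'ignore')

# Payload with one byte (0xff) that is not valid UTF-8.
sample = b64encode(b'a\xffb')

print(format_byte_raw(sample))     # bytes, nothing lost
print(format_byte_ignore(sample))  # text, with the 0xff byte dropped
```

With "ignore", the call no longer raises, but the result no longer round-trips to the original binary payload.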

I can provide a PR if someone gives me a signal which solution is preferred.

sweh commented Feb 11, 2021

In the meantime, one can work around this with a monkeypatch 🙈

from base64 import b64decode

import openapi_core.unmarshalling.schemas.util

def new_format_byte(value, encoding='utf8'):
    # Return the decoded bytes as-is instead of converting them to text.
    return b64decode(value)

openapi_core.unmarshalling.schemas.util.format_byte = new_format_byte

lopuhin commented Mar 17, 2022

A workaround without monkey-patching: if you are using RequestValidator, pass custom_formatters:

import base64
from openapi_core.validation.request.validators import RequestValidator
from openapi_core.unmarshalling.schemas.formatters import Formatter

# spec, url and request are assumed to be defined elsewhere.
validator = RequestValidator(
    spec, url,
    custom_formatters={
        # The first callable validates the value, the second unmarshals it.
        'byte': Formatter.from_callables(lambda x: True, base64.b64decode),
    })
validator.validate(request).raise_for_errors()

It's also possible to use a more intelligent check instead of lambda x: True if you care about which exception you get from validation, but I think this is still enough to make sure that the value is valid.
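
One possible shape for such a stricter check is sketched below, using only the standard library. The helper name `is_valid_base64` is hypothetical; the idea is that it could be passed in place of lambda x: True as the validation callable, although that wiring is not exercised here.

```python
import base64
import binascii

def is_valid_base64(value):
    """Return True if value is a str containing strictly valid base64."""
    if not isinstance(value, str):
        return False
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True

print(is_valid_base64('/w=='))         # well-formed
print(is_valid_base64('not base64!'))  # characters outside the alphabet
print(is_valid_base64('/w='))          # incorrect padding
```

Without validate=True, base64.b64decode discards non-alphabet characters before checking padding, so some malformed inputs would pass unnoticed.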
