generic/_utils.py: function create_string_object not working with the bytearray type #2434

Closed
@sbourlon

Hello pypdf team,

While trying to get the fields of my PDF with PdfReader.get_fields(), my code hit an exception raised by create_string_object (in pypdf/generic/_utils.py, line 113) because it received a bytearray instead of a str or bytes.

Judging by the traceback, the error occurs after decrypt_object(self, obj: PdfObject) -> PdfObject detects that the object to decrypt is a ByteStringObject or a TextStringObject and then calls create_string_object on the decrypted data.

The documentation about the bytearray type states:

bytearray objects are a mutable counterpart to bytes objects.

As bytearray objects are mutable, they support the mutable sequence operations in addition to the common bytes and bytearray operations described in Bytes and Bytearray Operations.

source: https://docs.python.org/3/library/stdtypes.html#bytearray

So it seems like the function create_string_object could accept bytearray objects and could treat them as bytes instead of raising an exception.

After applying this fix, I was able to read the fields of my PDF.

diff --git a/pypdf/generic/_utils.py b/pypdf/generic/_utils.py
index e6da5cf..edc9153 100644
--- a/pypdf/generic/_utils.py
+++ b/pypdf/generic/_utils.py
@@ -129,7 +129,7 @@ def create_string_object(
     """
     if isinstance(string, str):
         return TextStringObject(string)
-    elif isinstance(string, bytes):
+    elif isinstance(string, bytes | bytearray):
         if isinstance(forced_encoding, (list, dict)):
             out = ""
             for x in string:
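
For anyone who wants to check the behavior behind this patch without pypdf installed, here is a minimal sketch. The `classify` function is a hypothetical stand-in for the type dispatch at the top of create_string_object, not the library's code. It uses the tuple form of isinstance, which behaves the same as `bytes | bytearray` but also works on Python versions older than 3.10.

```python
def classify(value):
    """Hypothetical stand-in for the type checks in create_string_object."""
    if isinstance(value, str):
        return "text"
    # The tuple form works on every Python 3 version; writing
    # `bytes | bytearray` inside isinstance() needs Python 3.10 or newer.
    if isinstance(value, (bytes, bytearray)):
        return "bytes-like"
    raise TypeError(f"expected str, bytes or bytearray, got {type(value)!r}")


raw = bytearray(b"\xfe\xff\x00A")

# bytearray is not a subclass of bytes, so a bytes-only check rejects it.
is_bytes = isinstance(raw, bytes)

# Iterating either type yields ints, and bytes(raw) makes an immutable copy,
# so code that loops over the value can treat both types the same way.
same_items = list(raw) == list(bytes(raw))
```

If mutability of the input is a concern, converting with `bytes(string)` at the top of the branch would be another option, at the cost of one copy.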

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
Linux-6.5.0-15-generic-x86_64-with-glibc2.38

$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==3.17.4, crypt_provider=('cryptography', '41.0.7'), PIL=none

Code + PDF

This is a minimal, complete example that shows the issue:

from pypdf import PdfReader

pdf = PdfReader(file)  # file: path to an encrypted PDF with form fields
fields = pdf.get_fields()

I can't provide my PDF file because it contains personal information.

Traceback

This is the complete traceback I see:

Traceback (most recent call last):
  File "/home/stefan/src/source/pdf.py", line 235, in get_pdf_fields
    fields = self.reader.get_fields()
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stefan/.venv/lib/python3.11/site-packages/pypdf/_reader.py", line 577, in get_fields
    field = f.get_object()
            ^^^^^^^^^^^^^^
  File "/home/stefan/.venv/lib/python3.11/site-packages/pypdf/generic/_base.py", line 312, in get_object
    obj = self.pdf.get_object(self)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stefan/.venv/lib/python3.11/site-packages/pypdf/_reader.py", line 1417, in get_object
    retval = self._encryption.decrypt_object(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stefan/.venv/lib/python3.11/site-packages/pypdf/_encryption.py", line 850, in decrypt_object
    return cf.decrypt_object(obj)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stefan/.venv/lib/python3.11/site-packages/pypdf/_encryption.py", line 104, in decrypt_object
    obj[key] = self.decrypt_object(value)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stefan/.venv/lib/python3.11/site-packages/pypdf/_encryption.py", line 97, in decrypt_object
    obj = create_string_object(data)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stefan/.venv/lib/python3.11/site-packages/pypdf/generic/_utils.py", line 163, in create_string_object
    raise TypeError(
TypeError: ('create_string_object should have str or unicode arg: %s', <class 'bytearray'>)
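
A side note on reading the last line of the traceback: the message is shown as a tuple with a literal `%s`, which suggests the format string and the type were passed to TypeError as two separate arguments rather than being formatted first. The sketch below reproduces that display using only the message text from the traceback; it does not use pypdf.

```python
message = "create_string_object should have str or unicode arg: %s"

# Two arguments: no formatting happens, and str() shows the args tuple.
unformatted = TypeError(message, bytearray)

# Formatting first gives a single readable message.
formatted = TypeError(message % bytearray)

unformatted_text = str(unformatted)
formatted_text = str(formatted)
```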


Labels: generic (the generic submodule is affected), is-bug (from a user's perspective, this is a bug: a violation of the expected behavior with a compliant PDF)
