
Migrate to Pydantic v2 #27933

Closed
@lmmx

Description


Feature request

Pydantic v2 was released five months ago in June 2023.

Transformers has pinned to v1 (#24596), which should only be used as a temporary solution.

Leaving the pin in place means missing the many new features of Pydantic v2, and it leaves little hope of the library keeping pace now that a roadmap to v3 is already emerging.

In #24597 it was mentioned that part of the barrier was (at the time) in external dependencies that couple Transformers to v1:

Regarding using Pydantic V2, I am afraid that the involved places are not directly in transformers codebase.

For example, in

#24596 (comment)

it shows

2023-06-30T20:07:31.9883431Z  > [19/19] RUN python3 -c "from deepspeed.launcher.runner import main":
2023-06-30T20:07:31.9883916Z 1.621     from deepspeed.runtime.zero.config import DeepSpeedZeroConfig
2023-06-30T20:07:31.9884613Z 1.621   File "/usr/local/lib/python3.8/dist-packages/deepspeed/runtime/zero/config.py", line 76, in <module>
2023-06-30T20:07:31.9885116Z 1.621     class DeepSpeedZeroConfig(DeepSpeedConfigModel):
2023-06-30T20:07:31.9885814Z 1.621   File "/usr/local/lib/python3.8/dist-packages/pydantic/_internal/_model_construction.py", line 171, in __new__
2023-06-30T20:07:31.9886256Z 1.621     set_model_fields(cls, bases, config_wrapper, types_namespace)
2023-06-30T20:07:31.9886812Z 1.621   File "/usr/local/lib/python3.8/dist-packages/pydantic/_internal/_model_construction.py", line 361, in set_model_fields
2023-06-30T20:07:31.9887329Z 1.621     fields, class_vars = collect_model_fields(cls, bases, config_wrapper, types_namespace, typevars_map=typevars_map)
2023-06-30T20:07:31.9888039Z 1.621   File "/usr/local/lib/python3.8/dist-packages/pydantic/_internal/_fields.py", line 112, in collect_model_fields
2023-06-30T20:07:31.9888950Z 1.621     raise NameError(f'Field "{ann_name}" has conflict with protected namespace "{protected_namespace}"')
2023-06-30T20:07:31.9889546Z 1.621 NameError: Field "model_persistence_threshold" has conflict with protected namespace "

which indicates /usr/local/lib/python3.8/dist-packages/deepspeed/runtime/zero/config.py using pydantic.

It's the 3rd party libraries using pydantic have to do something in order to be run with pydantic V2. Right now, transformers can only pin v1 and wait.

These barriers should at the very least be enumerated. I'm sure there are ways to deal with them without holding the entire repo's development back.
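As a starting point for that enumeration, here is a rough sketch of how fields that collide with a protected namespace could be listed without importing Pydantic at all. It assumes the protected prefix is `model_`, which I believe is the Pydantic v2 default (the log above is cut off before it names the namespace). `conflicting_fields` and `ExampleZeroConfig` are hypothetical names for illustration; the latter is a plain class standing in for a real config model, and only the field name `model_persistence_threshold` comes from the traceback.

```python
from typing import List, Tuple

# Assumption: "model_" is the prefix Pydantic v2 protects by default.
PROTECTED_PREFIXES: Tuple[str, ...] = ("model_",)


def conflicting_fields(cls: type, prefixes: Tuple[str, ...] = PROTECTED_PREFIXES) -> List[str]:
    """Return annotated field names on `cls` that start with a protected prefix."""
    annotations = getattr(cls, "__annotations__", {})
    # str.startswith accepts a tuple, so one call checks every prefix.
    return [name for name in annotations if name.startswith(prefixes)]


class ExampleZeroConfig:
    """Hypothetical stand-in; `stage` is a placeholder field."""

    stage: int = 0
    model_persistence_threshold: int = 0


print(conflicting_fields(ExampleZeroConfig))
```

Running a check like this over the config classes of each third-party dependency would give a concrete list of fields that need renaming or a namespace override upstream.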

Libraries such as SQLModel have included support for both v1 and v2.

The pin adopted in Transformers has already begun to cause clashes with other libraries that are on v2, such as Gradio (v2.4.2, as raised in #27273).

Eventually, if pydantic>=2 is used by many libraries, we might consider to update the requirement (as long as not so many things breaking 😄 )

I fully appreciate the need to maintain backward compatibility, and it is possible to support both versions, as examples like SQLModel have demonstrated.

Motivation

The syntax of Pydantic v1 is incompatible with v2. Backpinning should only be used as a temporary measure; it is not a sustainable long-term approach. Specifically, the pin would be relaxed to pydantic<3.0.0, as in SQLModel.
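For illustration, supporting both major versions under a relaxed `pydantic<3.0.0` pin could start from a small version check like the one below. This is only a sketch under my own assumptions: the helper names (`pydantic_major`, `select_code_path`, `installed_pydantic_version`) are hypothetical, and I am not claiming this is how SQLModel implements it. It relies only on the standard library's `importlib.metadata`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def pydantic_major(version_string: str) -> int:
    """Extract the major component, e.g. "2.4.2" -> 2."""
    return int(version_string.split(".")[0])


def select_code_path(version_string: str) -> str:
    """Pick a compatibility branch for a version allowed by `pydantic<3.0.0`."""
    major = pydantic_major(version_string)
    if major == 1:
        return "v1"
    if major == 2:
        return "v2"
    raise RuntimeError(f"pydantic {version_string} is outside the supported range (<3.0.0)")


def installed_pydantic_version() -> Optional[str]:
    """Return the installed pydantic version string, or None if it is absent."""
    try:
        return version("pydantic")
    except PackageNotFoundError:
        return None


installed = installed_pydantic_version()
if installed is not None:
    print(f"pydantic {installed} -> {select_code_path(installed)} code path")
else:
    print("pydantic is not installed")
```

Each branch would then import the matching API. Which imports those are depends on the affected call sites, which still need to be enumerated.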

Your contribution

I am opening this feature request to begin discussion and hopefully contribute to its resolution.
