
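Unions

Unions are fundamentally different from all other types Pydantic validates: instead of requiring all fields/items/values to be valid, unions require only one member to be valid.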

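This leads to some nuance around how to validate unions:

  • Which member(s) of the union should you validate data against, and in which order?
  • Which errors should be raised when validation fails?

Validating unions feels like adding another orthogonal dimension to the validation process.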

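To solve these problems, Pydantic supports three fundamental approaches to validating unions:

  1. Left to right mode - the simplest approach: each member of the union is tried in order and the first match is returned

  2. Smart mode - similar to left to right mode, members are tried in order; however, validation proceeds past the first match to attempt to find a better one. This is the default mode for most union validation

  3. Discriminated unions - only one member of the union is tried, based on a discriminator

Tip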

In general, we recommend using discriminated unions. They are both more performant and more predictable than untagged unions, as they allow you to control which member of the union to validate against.

For complex cases, if you're using untagged unions, it's recommended to use union_mode='left_to_right' if you need guarantees about the order of validation attempts against the union members.

If you're looking for incredibly specialized behavior, you can use a custom validator.

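Union Modes

Left to Right Mode

Note

Because this mode often leads to unexpected validation results, it is not the default in Pydantic >=2; union_mode='smart' is the default instead.

With this approach, validation is attempted against each member of the union in the order they are defined, and the first successful validation is accepted as input.

If validation fails on all members, the validation error includes the errors from all members of the union.

union_mode='left_to_right' must be set as a Field parameter on each union field where you want to use it.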

from typing import Union

from pydantic import BaseModel, Field, ValidationError


class User(BaseModel):
    id: Union[str, int] = Field(union_mode='left_to_right')


print(User(id=123))
#> id=123
print(User(id='hello'))
#> id='hello'

try:
    User(id=[])
except ValidationError as e:
    print(e)
    """
    2 validation errors for User
    id.str
      Input should be a valid string [type=string_type, input_value=[], input_type=list]
    id.int
      Input should be a valid integer [type=int_type, input_value=[], input_type=list]
    """

The order of members is very important in this case, as demonstrated by tweaking the above example:

from typing import Union

from pydantic import BaseModel, Field


class User(BaseModel):
    id: Union[int, str] = Field(union_mode='left_to_right')


print(User(id=123))  # (1)
#> id=123
print(User(id='456'))  # (2)
#> id=456
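  1. As expected, the input is validated against the int member and the result is as expected.

  2. We're in lax mode, and the numeric string '456' is valid as input to the first member of the union, int. Since that member is tried first, we get the surprising result of id being an int instead of a str.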

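Smart Mode

Because of the potentially surprising results of union_mode='left_to_right', in Pydantic >=2 the default mode for Union validation is union_mode='smart'.

In this mode, Pydantic attempts to select the best match for the input from the union members. The exact algorithm may change between Pydantic minor releases to allow for improvements in both performance and accuracy.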
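To make the difference between the two modes concrete, here is a minimal sketch that validates the same numeric string under each mode. The model names SmartUser and LeftToRightUser are illustrative only and do not appear elsewhere on this page.

```python
from typing import Union

from pydantic import BaseModel, Field


class SmartUser(BaseModel):
    # No union_mode is given, so the default 'smart' mode applies.
    id: Union[int, str]


class LeftToRightUser(BaseModel):
    # Members are tried in declaration order; the first success wins.
    id: Union[int, str] = Field(union_mode='left_to_right')


smart_id = SmartUser(id='456').id
left_to_right_id = LeftToRightUser(id='456').id

print(repr(smart_id))
#> '456'
print(repr(left_to_right_id))
#> 456
```

Smart mode keeps the string because str is an exact type match, whereas left to right mode stops at int, which accepts the numeric string in lax mode.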

Note

We reserve the right to tweak the internal smart matching algorithm in future versions of Pydantic. If you rely on very specific matching behavior, it's recommended to use union_mode='left_to_right' or discriminated unions.

Smart Mode Algorithm

The smart mode algorithm uses two metrics to determine the best match for the input:

1. The number of valid fields set (relevant for models, dataclasses, and typed dicts)
2. The exactness of the match (relevant for all types)

#### Number of valid fields set

!!! note
    This metric was introduced in Pydantic v2.8.0. Prior to this version, only exactness was used to determine the best match.

This metric is currently only relevant for models, dataclasses, and typed dicts.

The greater the number of valid fields set, the better the match. The number of fields set on nested models is also taken into account.
These counts bubble up to the top-level union, where the union member with the highest count is considered the best match.

For data types where this metric is relevant, we prioritize this count over exactness. For all other types, we use solely exactness.

#### Exactness

For `exactness`, Pydantic scores a match of a union member into one of the following three groups (from highest score to lowest score):

- An exact type match, for example an `int` input to a `float | int` union validation is an exact type match for the `int` member
- Validation would have succeeded in [`strict` mode](../concepts/strict_mode.md)
- Validation would have succeeded in lax mode

The union match which produced the highest exactness score will be considered the best match.
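The exactness groups can be observed with a small sketch. It assumes a bare `Union[float, int]` validated through `TypeAdapter`; the variable names are illustrative.

```python
from typing import Union

from pydantic import TypeAdapter

adapter = TypeAdapter(Union[float, int])

# An int input is an exact type match for the int member,
# even though float is listed first.
exact_int = adapter.validate_python(1)

# A float input is an exact type match for the float member.
exact_float = adapter.validate_python(1.5)

# A numeric string matches both members only in lax mode,
# so the leftmost member (float) is used.
lax_value = adapter.validate_python('2')

print(type(exact_int).__name__, type(exact_float).__name__, type(lax_value).__name__)
#> int float float
```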

In smart mode, the following steps are taken to try to select the best match for the input:

=== "`BaseModel`, `dataclass`, and `TypedDict`"

    1. Union members are attempted left to right, with any successful matches scored into one of the three exactness categories described above,
    with the valid fields set count also tallied.
    2. After all members have been evaluated, the member with the highest "valid fields set" count is returned.
    3. If there's a tie for the highest "valid fields set" count, the exactness score is used as a tiebreaker, and the member with the highest exactness score is returned.
    4. If validation failed on all the members, return all the errors.

=== "All other data types"

    1. Union members are attempted left to right, with any successful matches scored into one of the three exactness categories described above.
        - If validation succeeds with an exact type match, that member is returned immediately and following members will not be attempted.
    2. If validation succeeded on at least one member as a "strict" match, the leftmost of those "strict" matches is returned.
    3. If validation succeeded on at least one member in "lax" mode, the leftmost match is returned.
    4. If validation failed on all the members, return all the errors.
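The "valid fields set" metric can be illustrated with a short sketch. It assumes Pydantic 2.8 or later (earlier versions used exactness only), and the models Small, Large, and Holder are hypothetical names chosen for this illustration.

```python
from typing import Union

from pydantic import BaseModel


class Small(BaseModel):
    a: int = 0


class Large(BaseModel):
    a: int = 0
    b: int = 0


class Holder(BaseModel):
    # Small is listed first, so it wins any tie.
    item: Union[Small, Large]


# Both members accept this input (Small ignores the extra key 'b'),
# but Large sets two fields while Small sets only one.
two_fields = Holder(item={'a': 1, 'b': 2}).item

# Here both members set exactly one field, so the tie goes to the
# leftmost member.
one_field = Holder(item={'a': 1}).item

print(type(two_fields).__name__, type(one_field).__name__)
#> Large Small
```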


from typing import Union
from uuid import UUID

from pydantic import BaseModel


class User(BaseModel):
    id: Union[int, str, UUID]
    name: str


user_01 = User(id=123, name='John Doe')
print(user_01)
#> id=123 name='John Doe'
print(user_01.id)
#> 123
user_02 = User(id='1234', name='John Doe')
print(user_02)
#> id='1234' name='John Doe'
print(user_02.id)
#> 1234
user_03_uuid = UUID('cf57432e-809e-4353-adbd-9d5c0d733868')
user_03 = User(id=user_03_uuid, name='John Doe')
print(user_03)
#> id=UUID('cf57432e-809e-4353-adbd-9d5c0d733868') name='John Doe'
print(user_03.id)
#> cf57432e-809e-4353-adbd-9d5c0d733868
print(user_03_uuid.int)
#> 275603287559914445491632874575877060712

Tip

The type Optional[x] is a shorthand for Union[x, None].

See more details in [Required fields](../concepts/models.md#required-fields).
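A brief sketch of both points: the two spellings are the same type, and an Optional field without a default is still required. The Item model is a made-up example.

```python
from typing import Optional, Union

from pydantic import BaseModel, ValidationError

# The two spellings compare equal as typing objects.
same_type = Optional[int] == Union[int, None]


class Item(BaseModel):
    # Accepts an int or None, but has no default, so it is still required.
    quantity: Optional[int]


explicit_none = Item(quantity=None)

try:
    Item()
    missing_error_type = None
except ValidationError as e:
    missing_error_type = e.errors()[0]['type']

print(same_type, explicit_none.quantity, missing_error_type)
#> True None missing
```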

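Discriminated Unions

Discriminated unions are sometimes referred to as "Tagged Unions".

We can use discriminated unions to validate Union types more efficiently, by choosing which member of the union to validate against.

This makes validation more efficient and also avoids a proliferation of errors when validation fails.

Adding a discriminator to a union also means that the generated JSON schema implements the associated OpenAPI specification.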
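As a rough illustration of the JSON schema point, the sketch below inspects the schema generated for a discriminated field. The Cat, Dog, and Owner models are simplified stand-ins, and only the keys checked here are relied upon.

```python
from typing import Literal, Union

from pydantic import BaseModel, Field


class Cat(BaseModel):
    pet_type: Literal['cat']


class Dog(BaseModel):
    pet_type: Literal['dog']


class Owner(BaseModel):
    pet: Union[Cat, Dog] = Field(discriminator='pet_type')


# Schema fragment for the discriminated field only.
pet_schema = Owner.model_json_schema()['properties']['pet']

print(pet_schema['discriminator']['propertyName'])
#> pet_type
print(sorted(pet_schema['discriminator']['mapping']))
#> ['cat', 'dog']
print('oneOf' in pet_schema)
#> True
```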

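Discriminated Unions with str discriminators

Frequently, in the case of a Union with multiple models, there is a field common to all members of the union that can be used to distinguish which union case the data should be validated against; this is referred to as the "discriminator" in OpenAPI.

To validate models based on that information, you can set the same field (let's call it my_discriminator) in each of the models with a discriminated value, which is one (or many) Literal value(s). For your Union, you can set the discriminator in its value: Field(discriminator='my_discriminator').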

from typing import Literal, Union

from pydantic import BaseModel, Field, ValidationError


class Cat(BaseModel):
    pet_type: Literal['cat']
    meows: int


class Dog(BaseModel):
    pet_type: Literal['dog']
    barks: float


class Lizard(BaseModel):
    pet_type: Literal['reptile', 'lizard']
    scales: bool


class Model(BaseModel):
    pet: Union[Cat, Dog, Lizard] = Field(..., discriminator='pet_type')
    n: int


print(Model(pet={'pet_type': 'dog', 'barks': 3.14}, n=1))
#> pet=Dog(pet_type='dog', barks=3.14) n=1
try:
    Model(pet={'pet_type': 'dog'}, n=1)
except ValidationError as e:
    print(e)
    """
    1 validation error for Model
    pet.dog.barks
      Field required [type=missing, input_value={'pet_type': 'dog'}, input_type=dict]
    """

Discriminated Unions with callable Discriminator


API Documentation

pydantic.types.Discriminator

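In the case of a Union with multiple models, sometimes there isn't a single uniform field across all models that you can use as a discriminator. This is the perfect use case for a callable Discriminator.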

from typing import Any, Literal, Union

from typing_extensions import Annotated

from pydantic import BaseModel, Discriminator, Tag


class Pie(BaseModel):
    time_to_cook: int
    num_ingredients: int


class ApplePie(Pie):
    fruit: Literal['apple'] = 'apple'


class PumpkinPie(Pie):
    filling: Literal['pumpkin'] = 'pumpkin'


def get_discriminator_value(v: Any) -> str:
    if isinstance(v, dict):
        return v.get('fruit', v.get('filling'))
    return getattr(v, 'fruit', getattr(v, 'filling', None))


class ThanksgivingDinner(BaseModel):
    dessert: Annotated[
        Union[
            Annotated[ApplePie, Tag('apple')],
            Annotated[PumpkinPie, Tag('pumpkin')],
        ],
        Discriminator(get_discriminator_value),
    ]


apple_variation = ThanksgivingDinner.model_validate(
    {'dessert': {'fruit': 'apple', 'time_to_cook': 60, 'num_ingredients': 8}}
)
print(repr(apple_variation))
"""
ThanksgivingDinner(dessert=ApplePie(time_to_cook=60, num_ingredients=8, fruit='apple'))
"""

pumpkin_variation = ThanksgivingDinner.model_validate(
    {
        'dessert': {
            'filling': 'pumpkin',
            'time_to_cook': 40,
            'num_ingredients': 6,
        }
    }
)
print(repr(pumpkin_variation))
"""
ThanksgivingDinner(dessert=PumpkinPie(time_to_cook=40, num_ingredients=6, filling='pumpkin'))
"""

Discriminator can also be used to validate Union types with combinations of models and primitive types.

For example:

from typing import Any, Union

from typing_extensions import Annotated

from pydantic import BaseModel, Discriminator, Tag, ValidationError


def model_x_discriminator(v: Any) -> str:
    if isinstance(v, int):
        return 'int'
    if isinstance(v, (dict, BaseModel)):
        return 'model'
    else:
        # return None if the discriminator value isn't found
        return None


class SpecialValue(BaseModel):
    value: int


class DiscriminatedModel(BaseModel):
    value: Annotated[
        Union[
            Annotated[int, Tag('int')],
            Annotated['SpecialValue', Tag('model')],
        ],
        Discriminator(model_x_discriminator),
    ]


model_data = {'value': {'value': 1}}
m = DiscriminatedModel.model_validate(model_data)
print(m)
#> value=SpecialValue(value=1)

int_data = {'value': 123}
m = DiscriminatedModel.model_validate(int_data)
print(m)
#> value=123

try:
    DiscriminatedModel.model_validate({'value': 'not an int or a model'})
except ValidationError as e:
    print(e)  # (1)!
    """
    1 validation error for DiscriminatedModel
    value
      Unable to extract tag using discriminator model_x_discriminator() [type=union_tag_not_found, input_value='not an int or a model', input_type=str]
    """
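  1. Notice that the callable discriminator function returns None if a discriminator value is not found. When None is returned, this union_tag_not_found error is raised.

Note

Using the typing.Annotated fields syntax can be handy to regroup the Union and discriminator information. See the next example for more details.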

There are a few ways to set a discriminator for a field, all varying slightly in syntax.

For str discriminators:

some_field: Union[...] = Field(discriminator='my_discriminator')
some_field: Annotated[Union[...], Field(discriminator='my_discriminator')]

For callable Discriminators:

some_field: Union[...] = Field(discriminator=Discriminator(...))
some_field: Annotated[Union[...], Discriminator(...)]
some_field: Annotated[Union[...], Field(discriminator=Discriminator(...))]

Warning

Discriminated unions cannot be used with only a single variant, such as Union[Cat].

Python changes Union[T] into T at interpretation time, so it is not possible for pydantic to distinguish fields of Union[T] from T.
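This collapsing can be checked directly with the standard library; the variable name below is illustrative.

```python
from typing import Union

# A single-member Union is replaced by the member itself,
# so no Union information is left for Pydantic to inspect.
collapsed = Union[int]

print(collapsed is int)
#> True
```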

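Nested Discriminated Unions

Only one discriminator can be set for a field, but sometimes you want to combine multiple discriminators. You can do this by creating nested Annotated types, e.g.: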

from typing import Literal, Union

from typing_extensions import Annotated

from pydantic import BaseModel, Field, ValidationError


class BlackCat(BaseModel):
    pet_type: Literal['cat']
    color: Literal['black']
    black_name: str


class WhiteCat(BaseModel):
    pet_type: Literal['cat']
    color: Literal['white']
    white_name: str


Cat = Annotated[Union[BlackCat, WhiteCat], Field(discriminator='color')]


class Dog(BaseModel):
    pet_type: Literal['dog']
    name: str


Pet = Annotated[Union[Cat, Dog], Field(discriminator='pet_type')]


class Model(BaseModel):
    pet: Pet
    n: int


m = Model(pet={'pet_type': 'cat', 'color': 'black', 'black_name': 'felix'}, n=1)
print(m)
#> pet=BlackCat(pet_type='cat', color='black', black_name='felix') n=1
try:
    Model(pet={'pet_type': 'cat', 'color': 'red'}, n='1')
except ValidationError as e:
    print(e)
    """
    1 validation error for Model
    pet.cat
      Input tag 'red' found using 'color' does not match any of the expected tags: 'black', 'white' [type=union_tag_invalid, input_value={'pet_type': 'cat', 'color': 'red'}, input_type=dict]
    """
try:
    Model(pet={'pet_type': 'cat', 'color': 'black'}, n='1')
except ValidationError as e:
    print(e)
    """
    1 validation error for Model
    pet.cat.black.black_name
      Field required [type=missing, input_value={'pet_type': 'cat', 'color': 'black'}, input_type=dict]
    """

Tip

If you want to validate data against a union, and solely a union, you can use pydantic's TypeAdapter construct instead of inheriting from the standard BaseModel.

In the context of the previous example, we have the following:

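from pydantic import TypeAdapter

# Pet is the nested discriminated union defined in the previous example.
type_adapter = TypeAdapter(Pet)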

pet = type_adapter.validate_python(
    {'pet_type': 'cat', 'color': 'black', 'black_name': 'felix'}
)
print(repr(pet))
#> BlackCat(pet_type='cat', color='black', black_name='felix')

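Union Validation Errors

When Union validation fails, error messages can be quite verbose, as they will produce validation errors for each case in the union. This is especially noticeable when dealing with recursive models, where reasons may be generated at each level of recursion. Discriminated unions help to simplify error messages in this case, as validation errors are only produced for the case with a matching discriminator value.

You can also customize the error type, message, and context for a Discriminator by passing these specifications as parameters to the Discriminator constructor, as seen in the example below.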

from typing import Union

from typing_extensions import Annotated

from pydantic import BaseModel, Discriminator, Tag, ValidationError


# Errors are quite verbose with a normal Union:
class Model(BaseModel):
    x: Union[str, 'Model']


try:
    Model.model_validate({'x': {'x': {'x': 1}}})
except ValidationError as e:
    print(e)
    """
    4 validation errors for Model
    x.str
      Input should be a valid string [type=string_type, input_value={'x': {'x': 1}}, input_type=dict]
    x.Model.x.str
      Input should be a valid string [type=string_type, input_value={'x': 1}, input_type=dict]
    x.Model.x.Model.x.str
      Input should be a valid string [type=string_type, input_value=1, input_type=int]
    x.Model.x.Model.x.Model
      Input should be a valid dictionary or instance of Model [type=model_type, input_value=1, input_type=int]
    """

try:
    Model.model_validate({'x': {'x': {'x': {}}}})
except ValidationError as e:
    print(e)
    """
    4 validation errors for Model
    x.str
      Input should be a valid string [type=string_type, input_value={'x': {'x': {}}}, input_type=dict]
    x.Model.x.str
      Input should be a valid string [type=string_type, input_value={'x': {}}, input_type=dict]
    x.Model.x.Model.x.str
      Input should be a valid string [type=string_type, input_value={}, input_type=dict]
    x.Model.x.Model.x.Model.x
      Field required [type=missing, input_value={}, input_type=dict]
    """


# Errors are much simpler with a discriminated union:
def model_x_discriminator(v):
    if isinstance(v, str):
        return 'str'
    if isinstance(v, (dict, BaseModel)):
        return 'model'


class DiscriminatedModel(BaseModel):
    x: Annotated[
        Union[
            Annotated[str, Tag('str')],
            Annotated['DiscriminatedModel', Tag('model')],
        ],
        Discriminator(
            model_x_discriminator,
            custom_error_type='invalid_union_member',  # (1)!
            custom_error_message='Invalid union member',  # (2)!
            custom_error_context={'discriminator': 'str_or_model'},  # (3)!
        ),
    ]


try:
    DiscriminatedModel.model_validate({'x': {'x': {'x': 1}}})
except ValidationError as e:
    print(e)
    """
    1 validation error for DiscriminatedModel
    x.model.x.model.x
      Invalid union member [type=invalid_union_member, input_value=1, input_type=int]
    """

try:
    DiscriminatedModel.model_validate({'x': {'x': {'x': {}}}})
except ValidationError as e:
    print(e)
    """
    1 validation error for DiscriminatedModel
    x.model.x.model.x.model.x
      Field required [type=missing, input_value={}, input_type=dict]
    """

# The data is still handled properly when valid:
data = {'x': {'x': {'x': 'a'}}}
m = DiscriminatedModel.model_validate(data)
print(m.model_dump())
#> {'x': {'x': {'x': 'a'}}}
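  1. custom_error_type is the type attribute of the ValidationError raised when validation fails.

  2. custom_error_message is the msg attribute of the ValidationError raised when validation fails.

  3. custom_error_context is the ctx attribute of the ValidationError raised when validation fails.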
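These attributes can also be read programmatically from ValidationError.errors(). The sketch below assumes a simple str/int union; the pick_tag function is a hypothetical discriminator written for this illustration.

```python
from typing import Any, Optional, Union

from typing_extensions import Annotated

from pydantic import Discriminator, Tag, TypeAdapter, ValidationError


def pick_tag(v: Any) -> Optional[str]:
    """Hypothetical discriminator: returns None when no tag applies."""
    if isinstance(v, str):
        return 'str'
    if isinstance(v, int):
        return 'int'
    return None


adapter = TypeAdapter(
    Annotated[
        Union[Annotated[str, Tag('str')], Annotated[int, Tag('int')]],
        Discriminator(
            pick_tag,
            custom_error_type='invalid_union_member',
            custom_error_message='Invalid union member',
        ),
    ]
)

try:
    # A list matches neither tag, so the custom error is raised.
    adapter.validate_python([])
except ValidationError as e:
    error = e.errors()[0]

print(error['type'])
#> invalid_union_member
print(error['msg'])
#> Invalid union member
```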

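You can also simplify error messages by labeling each case with a Tag. This is especially useful when you have complex types like those in this example: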

from typing import Dict, List, Union

from typing_extensions import Annotated

from pydantic import AfterValidator, Tag, TypeAdapter, ValidationError

DoubledList = Annotated[List[int], AfterValidator(lambda x: x * 2)]
StringsMap = Dict[str, str]


# Not using any `Tag`s for each union case, the errors are not so nice to look at
adapter = TypeAdapter(Union[DoubledList, StringsMap])

try:
    adapter.validate_python(['a'])
except ValidationError as exc_info:
    print(exc_info)
    """
    2 validation errors for union[function-after[<lambda>(), list[int]],dict[str,str]]
    function-after[<lambda>(), list[int]].0
      Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='a', input_type=str]
    dict[str,str]
      Input should be a valid dictionary [type=dict_type, input_value=['a'], input_type=list]
    """

tag_adapter = TypeAdapter(
    Union[
        Annotated[DoubledList, Tag('DoubledList')],
        Annotated[StringsMap, Tag('StringsMap')],
    ]
)

try:
    tag_adapter.validate_python(['a'])
except ValidationError as exc_info:
    print(exc_info)
    """
    2 validation errors for union[DoubledList,StringsMap]
    DoubledList.0
      Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='a', input_type=str]
    StringsMap
      Input should be a valid dictionary [type=dict_type, input_value=['a'], input_type=list]
    """
