[BUG] Pydantic V2.11 Seems to cause up to x2.5 memory usage to ninja models build #1444
Comments
Hi @M3te0r, hm... well, I do not remember any intrusion into models/schema in ninja 1.3->1.4. I guess you have no choice but to run a matrix test to measure.
I think it's pretty safe to switch ninja 1.3/1.4 back and forth (same for pydantic, as I did not detect any deprecations or errors).
Something is definitely weird with pydantic 2.11. I ran the matrix; the code is exactly the same between runs, used
Details (same order):
Ninja 1.3 / Pydantic 2.10.6: Peak memory usage: 497.3 MiB
Ninja 1.3 / Pydantic 2.11.3: Peak memory usage: 1.2 GiB
Ninja 1.4.1 / Pydantic 2.10.6: Peak memory usage: 497.3 MiB
Ninja 1.4.1 / Pydantic 2.11.3: Peak memory usage: 1.2 GiB
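The figures in this thread come from memray. As a rougher cross-check for a matrix like the one above, peak resident memory can also be read with the standard library alone. The sketch below is only an illustration under some assumptions: `peak_rss_kib` and `_PROBE` are hypothetical names, `myproject.api` is a placeholder module name, and the `resource` module is Unix-only. Each ninja/pydantic combination would be run from its own virtualenv.

```python
import subprocess
import sys

# Runs inside the child interpreter: execute the statement passed as argv[1],
# then print the peak resident set size of that process.
_PROBE = """
import resource, sys
exec(sys.argv[1])
peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
if sys.platform == "darwin":  # macOS reports bytes, Linux reports KiB
    peak //= 1024
print(peak)
"""


def peak_rss_kib(statement: str) -> int:
    """Return the peak RSS (KiB) of a fresh interpreter that runs `statement`."""
    result = subprocess.run(
        [sys.executable, "-c", _PROBE, statement],
        capture_output=True,
        text=True,
        check=True,
    )
    # The probe prints the peak last, after any output of the statement itself.
    return int(result.stdout.strip().splitlines()[-1])


if __name__ == "__main__":
    baseline = peak_rss_kib("pass")
    # "myproject.api" is a placeholder for the module that builds the NinjaAPI.
    # loaded = peak_rss_kib("import myproject.api")
    print(f"baseline interpreter: {baseline / 1024:.1f} MiB")
```

Because the measurement happens in a fresh subprocess, it includes native allocations (such as those made by pydantic-core), which `tracemalloc` would not see. It gives a single number per run rather than a flamegraph, so it complements memray rather than replacing it.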
OK, at least ninja is not 100% to blame :D Maybe you can extract some minimal code (a few schemas, operations) where the difference is visible. I will then be able to discuss something concrete with the pydantic folks.
@vitalik ATM I'm not able to make a minimal reproducible example (it may only be noticeable on quite large and complicated schemas like my work project: discriminated annotated unions, inheritance, etc.), but:
Original optimization: pydantic/pydantic-core#1616
As ninja.Schema uses a wrapped validator on
Just for testing purposes, I commented out _run_root_validator and was at 147MB.
Original code:
Commented out:
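For readers unfamiliar with the term, a model-level wrap validator in plain pydantic looks roughly like the sketch below. This is not ninja's actual `_run_root_validator`; `WrappedBase`, `Child`, `_wrap_everything` and `calls` are hypothetical names used only to show the shape of the pattern, where a base class declares the validator and every subclass inherits it.

```python
from typing import Any

from pydantic import BaseModel, model_validator

calls = []  # one entry per run of the wrap validator


class WrappedBase(BaseModel):
    # Hypothetical stand-in for a base schema that wraps model validation.
    @model_validator(mode="wrap")
    @classmethod
    def _wrap_everything(cls, values: Any, handler: Any) -> Any:
        calls.append(cls.__name__)
        # Delegate to the regular (inner) validation and return its result.
        return handler(values)


class Child(WrappedBase):
    # Subclasses inherit the wrap validator from the base class.
    x: int


child = Child(x=1)
```

Every schema deriving from such a base goes through the wrapper on validation, which is why commenting the validator out is a quick way to test whether it contributes to the memory difference.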
@vitalik I managed to extract and reproduce a part of our code that increases memory usage with Pydantic v2.11.3:

from decimal import Decimal
from typing import Annotated
from typing import Any
from typing import Literal
from typing import Union
from uuid import UUID
from django.db.models import TextChoices
from django.utils.translation import gettext as _
from ninja import Field
from ninja import NinjaAPI
from ninja import Router
from ninja import Schema
############## Basic ninja API declaration
api = NinjaAPI(title="Pydantic mem test API", version="1.0.0")
router = Router()
api.add_router("/test", router)
################### DJANGO ENUMS
class DocumentTypes(TextChoices):
    PLAN = "PLAN", _("PLAN")
    REPORT = "REPORT", _("REPORT")

class DocumentVisibility(TextChoices):
    HCP = "HCP", _("HCP")
    PATIENT = "PATIENT", _("PATIENT")
    BOTH = "BOTH", _("BOTH")

class DateScale(TextChoices):
    DAYS = "d", _("Day(s)")
    WEEKS = "w", _("Week(s)")
    MONTHS = "m", _("Month(s)")
    YEARS = "y", _("Year(s)")

class RecordsActivityStatus211(TextChoices):
    DRAFT = "DRAFT", _("Draft")
    PUBLISHED = "PUBLISHED", _("Published")

class DateFormat(TextChoices):
    YYYY_MM_DD = "YYYY-MM-DD", _("Year/Month/Day")
    YYYY_MM = "YYYY-MM", _("Year/Month")
    YYYY = "YYYY", _("Year")

class ContainerType(TextChoices):
    FORM = "Form", _("Form")
    CONTENT = "Content", _("Content")
    RECORDS = "Records", _("Records")
    DOCUMENT = "Document", _("Document")
    PATIENT_PROFILE = "PatientProfile", _("Patient profile")

class ElementType211(TextChoices):
    TEXT = "text", _("Text")
    TEXTAREA = "textarea", _("Textarea")
    INTEGER = "integer", _("Integer")
    NUMBER = "number", _("Number")
    BOOLEAN = "boolean", _("Boolean")
    CHOICES = "choices", _("Choices")
    DATE = "date", _("Date")
    DATETIME = "datetime", _("Datetime")
    TIME = "time", _("Time")
    RANGE = "range", _("Range")
    NRS_SLIDER = "nrs_slider", _("NRS Slider")
    EQ5DL = "eq5dl", _("EQ5DL")
    RICH = "rich", _("Rich text")
    LINK = "link", _("Link")
    PHONE = "phone", _("Phone")
    EMAIL = "email", _("Email")
    COUNTRY = "country", _("Country")
    FRENCH_SOCIAL_SECURITY_NUMBER = "french_social_security_number", _(
        "French Social Security Number",
    )
    FILE = "file", _("File")
    TABLE = "table", _("Table")
    REUSABLE = "reusable", _("Reusable")
    FIRST_NAME = "first_name", _("First name")
    LAST_NAME = "last_name", _("Last name")
    SITE = "site", _("Site")
    PATIENT_GROUP = "patient_group", _("Patient group")
    LANGUAGE = "language", _("Language")
################ SCHEMAS
class AllTranslationsOut(Schema):
    translations: dict[str, dict[str, Any | None]] | None = None

class ImageOut211(Schema):
    id: int
    image_url: str | None = None
    thumbnail_url: str | None = None
    filename: str | None = None
    title: str | None = None
    alt: str | None = None

class LightChoiceOut(Schema):
    id: int
    title: str | None = None
    index: int

class ChoiceOut(LightChoiceOut):
    href: str | None = None
    weighting: Decimal | None = None
    response_code: str | None = None

class LightContainerOut(Schema):
    id: int
    uuid: UUID
    title: str | None = None
    color: str
    href: str | None = None
    content_type: str
    pages_count: int
    index: int
# Build a list of concrete subclasses of BaseContainerOut to use as list spread
# when building discriminated annotated union NovaContainerElementOut
BASE_CONTAINER_OUT_SUBCLASSES = []

class BaseContainerOut(AllTranslationsOut, Schema):
    id: int
    uuid: UUID
    title: str | None = None
    label: str | None = None
    description: str | None = None
    index: int = Field(
        ...,
        description="This value indicate the sorting order of the element inside the page or the table, depending on the direct parent",
    )
    editable: bool
    movable: bool
    deletable: bool
    href: str | None = None

    # def __init_subclass__(cls, **kwargs):
    #     super().__init_subclass__(**kwargs)
    #     if not cls.__name__.startswith("Base"):
    #         BASE_CONTAINER_OUT_SUBCLASSES.append(cls)

    # Not the root cause (had to test in regard of __init_subclass__ pydantic comment)
    @classmethod
    def __pydantic_init_subclass__(cls, **kwargs: Any) -> None:
        # Chain to the matching pydantic hook rather than __init_subclass__
        super().__pydantic_init_subclass__(**kwargs)
        if not cls.__name__.startswith("Base"):
            BASE_CONTAINER_OUT_SUBCLASSES.append(cls)
# TESTED with each subclass uncommented one-by-one

# Peak memory usage : 70.29MB
class RichTextContainerElementOut(BaseContainerOut):
    element_type: Literal[ElementType211.RICH]
    content: str

# Peak memory usage : 70.358MB
class BaseContainerElementOut(BaseContainerOut):
    tips: str | None = None
    placeholder: str = ""
    required: bool
    image: ImageOut211 | None = None
    validators: list[dict[str, Any]] = Field(
        ...,
        alias="validators_as_schemas",
        title="List of field validators (required, min/max/value/length)",
    )
    primary: bool | None = False
    question_code: str | None = None

# Peak memory usage : 79.337MB
class PatientGroupElementOut(BaseContainerElementOut):
    used_by_hcp: bool
    element_type: Literal[ElementType211.PATIENT_GROUP]

# Peak memory usage : 87.853MB
class IntegerContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.INTEGER]
    min_value: int | None = None
    max_value: int | None = None

# Peak memory usage : 97.706MB
class NumericContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.NUMBER]
    min_value: float | None = None
    max_value: float | None = None
    max_digits: int | None = None

# Peak memory usage : 106.228MB
class CharContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.TEXT]
    max_length: int | None = None

# Peak memory usage : 114.393MB
class FirstNameContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.FIRST_NAME]
    max_length: int | None = None

# Peak memory usage : 122.557MB
class LastNameContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.LAST_NAME]
    max_length: int | None = None

# Peak memory usage : 131.720MB
class TextContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.TEXTAREA]
    max_length: int | None = None

# Peak memory usage : 140.269MB
class EmailContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.EMAIL]

# Peak memory usage : 148.106MB
class FrenchSocialSecurityNumberContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.FRENCH_SOCIAL_SECURITY_NUMBER]

# Peak memory usage : 157.180MB
class FileContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.FILE]
    allow_multiple_files: bool

# Peak memory usage : 165.004MB
class PhoneContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.PHONE]

# Peak memory usage : 172.829MB
class CountryContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.COUNTRY]

# Peak memory usage : 181.647MB
class LanguageContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.LANGUAGE]
# Peak memory usage : 191.200MB
class DateContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.DATE]
    date_format: DateFormat
    min_bound_nb: int | None = None
    min_bound_scale: DateScale | None = None
    max_bound_nb: int | None = None
    max_bound_scale: DateScale | None = None

# Peak memory usage : 201.617MB
class DateTimeContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.DATETIME]
    min_bound_nb: int | None = None
    min_bound_scale: DateScale | None = None
    max_bound_nb: int | None = None
    max_bound_scale: DateScale | None = None

# Peak memory usage : 210.828MB
class BooleanContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.BOOLEAN]

# Peak memory usage : 220.020MB
class RangeContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.RANGE]
    min_value: int
    max_value: int
    step: int | None
    min_label: str = ""
    max_label: str = ""
    discrete: bool
    show_drag_input_value: bool

# Peak memory usage : 229.810MB
class NRSSliderContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.NRS_SLIDER]
    min_value: int
    max_value: int
    step: int
    min_label: str = ""
    max_label: str = ""

# Peak memory usage : 239.102MB
class EQ5DLContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.EQ5DL]
    min_value: int
    max_value: int
    min_label: str | None = None
    max_label: str | None = None
    display_value: bool
    use_validated_slide: bool
    step: int

# Peak memory usage : 250.360MB
class ChoicesContainerElementOut(BaseContainerElementOut):
    element_type: Literal[ElementType211.CHOICES]
    multiple: bool
    sorted_by_alpha: bool
    options: list[ChoiceOut]

# Peak memory usage : 256.160MB
class LinkContainerElementOut(BaseContainerOut):
    element_type: Literal[ElementType211.LINK]
    container: LightContainerOut | None = None

# Star-unpacking inside a subscript requires Python 3.11+
NovaContainerElementOut = Annotated[
    Union[*BASE_CONTAINER_OUT_SUBCLASSES],
    Field(discriminator="element_type"),
]

class PageOut211(Schema):
    id: int
    uuid: UUID
    index: int
    href: str | None = None
    title: str | None = None

class NovaPageOut211(PageOut211):
    elements: list[NovaContainerElementOut]

class ContainerOut211(Schema):
    id: int
    uuid: UUID
    # pages: list[NovaPageOut211]
    title: str | None = None
    color: str
    description: str | None = None
    href: str | None = None
    content_type: str
    index: int
    scoring_activated: bool = False
    summary_title: str | None = None
    summary_content: str | None = None

class NovaContainerOut211(AllTranslationsOut, ContainerOut211):
    container_type: (
        Literal[ContainerType.FORM]
        | Literal[ContainerType.CONTENT]
        | Literal[ContainerType.PATIENT_PROFILE]
    )
    image: ImageOut211 | None = None
    used_in_active_workflow: bool | None = False
    pages: list[NovaPageOut211]

class DocumentContainerOut211(AllTranslationsOut, ContainerOut211):
    container_type: Literal[ContainerType.DOCUMENT]
    type: DocumentTypes | None
    visibility: DocumentVisibility | None
    pages: list[NovaPageOut211]

class RecordsContainerOut211(AllTranslationsOut, Schema):
    id: int
    uuid: UUID
    title: str | None = None
    color: str
    description: str | None = None
    href: str | None = None
    content_type: str
    container_type: Literal[ContainerType.RECORDS]
    index: int
    scoring_activated: bool = False
    summary_title: str | None = None
    summary_content: str | None = None
    image: ImageOut211 | None = None
    pages: list[NovaPageOut211]

ContainerOutUnion211 = Annotated[
    Union[NovaContainerOut211, DocumentContainerOut211, RecordsContainerOut211],
    Field(discriminator="container_type"),
]
############ FAKE API OPERATIONS DEFINITION
def fake_view_get(request):
    pass

def fake_view_post(request):
    pass

def fake_view_patch(request):
    pass

fake_operations = (
    ("GET", fake_view_get, 200),
    ("POST", fake_view_post, 201),
    ("PATCH", fake_view_patch, 200),
)

# Register 20 fake paths x 3 methods, all returning the same tagged union
for i in range(20):
    for method, view_func, response_code in fake_operations:
        router.add_api_operation(
            f"/fake_{i}/activities/",
            [method],
            view_func,
            response={
                response_code: ContainerOutUnion211,
            },
            summary=f"Test summary {i}",
        )

With Pydantic v2.10.6: Peak memory usage: 174.4 MiB
With Pydantic v2.11.3: Peak memory usage: 257.3 MiB
Maybe this has something to do with nested tagged unions and schema inheritance?
EDIT 1: Updated the snippet with comments about memory usage when uncommenting each BaseContainerOut subclass one by one.
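The per-subclass peaks in the snippet's comments can be turned into rough increments. The sketch below only does arithmetic on numbers quoted in this thread; the variable names are made up for illustration. The thread does not state which pydantic version the one-by-one figures were measured under, and 1.2 GiB is a rounded value, so the ratios are approximate.

```python
# Peak values copied from the "# Peak memory usage" comments, in order, from
# the BaseContainerElementOut baseline (70.358) to ChoicesContainerElementOut.
peaks_mb = [
    70.358, 79.337, 87.853, 97.706, 106.228, 114.393, 122.557,
    131.720, 140.269, 148.106, 157.180, 165.004, 172.829, 181.647,
    191.200, 201.617, 210.828, 220.020, 229.810, 239.102, 250.360,
]

# Increase in peak memory attributed to each additional element subclass.
deltas_mb = [after - before for before, after in zip(peaks_mb, peaks_mb[1:])]
average_delta_mb = sum(deltas_mb) / len(deltas_mb)

# Ratio between the two pydantic versions for the reduced snippet (MiB / MiB).
snippet_ratio = 257.3 / 174.4

# Ratio for the full project: 1.2 GiB (rounded) versus 497.3 MiB.
project_ratio = (1.2 * 1024) / 497.3

print(f"subclasses measured: {len(deltas_mb)}")
print(f"average increase per subclass: {average_delta_mb:.2f} MB")
print(f"smallest / largest increase: {min(deltas_mb):.2f} / {max(deltas_mb):.2f} MB")
print(f"snippet 2.11.3 vs 2.10.6: x{snippet_ratio:.2f}")
print(f"project 2.11.3 vs 2.10.6: x{project_ratio:.2f}")
```

On these numbers, each subclass of BaseContainerElementOut adds about 9 MB to the peak, the reduced snippet grows by a factor of about 1.5 between versions, and the full project by about 2.5.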
Hi @vitalik, did you get time to check this?
Describe the bug
A clear and concise description of what the bug is.
Hi, during a dependency upgrade, I upgraded both ninja and pydantic to their latest versions (1.4, 2.11).
I've seen up to x2.5 in terms of memory allocation: 1.2 GB in my case, whereas with pydantic 2.10 I was at 510 MB. At this point I don't know if it's ninja or pydantic itself, but I've seen that
pydantic._internal._model_construction.complete_model_class / pydantic.plugin._schema_validator.create_schema_validator are memory intensive.
A bit weird, because 2.11 claims to reduce memory allocation.
I've used memray to have some flamegraph representation:
Before (2.10):
After (2.11)
Versions (please complete the following information):
Note you can quickly get this by running this line in ./manage.py shell: