Initial Checks
- I confirm that I'm using Pydantic V2
Description
This is a follow-up to the discussion at python/cpython#92810 (comment).
The issue is that Pydantic uses abc (https://docs.python.org/3/library/abc.html), which caches the results of isinstance checks. Depending on the number of distinct Pydantic BaseModel classes, this can lead to severe memory and performance issues.
Reproducer:
https://github.com/peterchenadded/pydantic-abc-performance
It took 45 seconds to run isinstance against 7000 distinct class objects on Pydantic 2.11.0b2 with --profile.

It took 1 second to run isinstance against 7000 distinct class objects with my custom changes in test_performance_fix.py, also with --profile.
If I change it to 10_000 distinct class objects, the process gets killed on my machine.
For the 7000 example, memray showed memory use above 4 GB; with the fix, memory stayed below 100 MB.
Note that alternating the base class in the code below is important. With a single base class I couldn't reproduce the issue.
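The role of the alternating base classes can be illustrated without Pydantic. The following is a minimal sketch, assuming the model classes get their isinstance behavior from abc.ABCMeta as described above. CountingABCMeta, make_child, count_subclass_checks, Base1 and Base2 are hypothetical names introduced only for this illustration. It counts how often __subclasscheck__ runs. As far as I understand CPython's abc, a check that fails walks cls.__subclasses__() and records the miss in each sibling's negative cache, so with two alternating bases both the work and the cache entries should grow roughly quadratically with the number of classes, while a single base only produces direct hits.

```python
from abc import ABCMeta


class CountingABCMeta(ABCMeta):
    """Hypothetical ABCMeta subclass that counts __subclasscheck__ calls."""

    calls = 0

    def __subclasscheck__(cls, subclass):
        CountingABCMeta.calls += 1
        return super().__subclasscheck__(subclass)


def make_child(base):
    """Return a new, distinct subclass of `base` on every call."""

    class Child(base):
        pass

    return Child


def count_subclass_checks(n: int, alternate: bool) -> int:
    """Create n distinct classes, isinstance-check one instance of each
    against both bases, and return how many __subclasscheck__ calls ran."""

    # Fresh bases per call so earlier runs do not leave subclasses behind.
    class Base1(metaclass=CountingABCMeta):
        pass

    class Base2(metaclass=CountingABCMeta):
        pass

    objects = []
    for i in range(n):
        # With alternate=False every class derives from Base1.
        base = Base2 if (alternate and i % 2) else Base1
        objects.append(make_child(base)())

    CountingABCMeta.calls = 0
    for obj in objects:
        isinstance(obj, Base1)
        isinstance(obj, Base2)
    return CountingABCMeta.calls


if __name__ == "__main__":
    for n in (100, 200, 400):
        single = count_subclass_checks(n, alternate=False)
        alternating = count_subclass_checks(n, alternate=True)
        print(f"n={n}: single base={single}, alternating bases={alternating}")
```

Running the sketch prints the call counts for each n. The alternating case should pull away quickly from the single-base case, which would be consistent with the observation that only alternating base classes reproduce the slowdown.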

I leave it up to the Pydantic maintainers to decide whether to fix this by applying the patch, removing abc from their code base, or pushing the problem back to abc.py.
Example Code
# https://github.com/peterchenadded/pydantic-abc-performance
from pydantic import BaseModel
import logging
import time


class MyEntity1Pattern(BaseModel):
    pass


class MyEntity2Pattern(BaseModel):
    pass


def _get_my_entity1_pattern():
    # Each call defines a new, distinct subclass.
    class MyTest1(MyEntity1Pattern):
        pass

    return MyTest1


def _get_my_entity2_pattern():
    # Each call defines a new, distinct subclass.
    class MyTest2(MyEntity2Pattern):
        pass

    return MyTest2


class Measurement(BaseModel):
    my_entity_1_pattern_check: float = 0
    my_entity_2_pattern_check: float = 0
    success: int = 0
    failed: int = 0

    def count(self, input: bool) -> None:
        if input:
            self.success += 1
        else:
            self.failed += 1


def test_performance():
    my_objects = []
    logging.info("Started getting objects to check")
    for i in range(7000):
        # Alternate the base class; a single base does not reproduce the issue.
        if i % 2 == 0:
            my_objects.append(_get_my_entity1_pattern()())
        else:
            my_objects.append(_get_my_entity2_pattern()())
    logging.info("Completed len(my_objects) = %d", len(my_objects))

    logging.info("Started checking objects")
    measurement = Measurement()
    for m in my_objects:
        start = time.time()
        measurement.count(isinstance(m, MyEntity1Pattern))
        measurement.my_entity_1_pattern_check += time.time() - start

        start = time.time()
        measurement.count(isinstance(m, MyEntity2Pattern))
        measurement.my_entity_2_pattern_check += time.time() - start
    logging.info("Completed checking classes")
    logging.info("measures:\n%s", measurement.model_dump_json(indent=4))
Python, Pydantic & OS Version
Pydantic: 2.11.0b2 (also affects at least 2.7)
OS: Apple M2 Mac