Storing large access data externally #6199


Merged: 31 commits merged into main from ENG-684-save-large-access-data-externally, Jun 11, 2025
Changes shown from 16 commits. All 31 commits:
7b2ff24
Storing large access data externally
galvana Jun 4, 2025
a5c78cd
Adding encryption
galvana Jun 5, 2025
b31a00f
Clean up
galvana Jun 7, 2025
12c51d4
Fixing failing tests
galvana Jun 8, 2025
a71a72e
Fixing tests
galvana Jun 8, 2025
1f4f4ff
Fixing tests
galvana Jun 8, 2025
3103315
Fixing test
galvana Jun 8, 2025
11d6a92
Fixing tests
galvana Jun 8, 2025
956988c
Resetting LARGE_DATA_THRESHOLD_BYTES
galvana Jun 8, 2025
1361bd8
Merge branch 'main' into ENG-684-save-large-access-data-externally
galvana Jun 8, 2025
f3c52e1
Cleaning up code and adding fallback to privacy request model
galvana Jun 9, 2025
87ebbe9
Fixing static checks
galvana Jun 9, 2025
2a23e82
Removing pytest mark
galvana Jun 9, 2025
9cfd942
Fixing tests
galvana Jun 9, 2025
813d15a
Test cleanup
galvana Jun 9, 2025
39ed44b
Adding more tests
galvana Jun 10, 2025
cceee47
Merge branch 'main' into ENG-684-save-large-access-data-externally
galvana Jun 10, 2025
f08172b
Static fixes
galvana Jun 10, 2025
b510d17
Fixing test
galvana Jun 10, 2025
a8b52e2
Fixing S3 file limit
galvana Jun 10, 2025
972a01e
Updating large file threshold and optimizing memory usage
galvana Jun 11, 2025
af971aa
Merge branch 'main' into ENG-684-save-large-access-data-externally
galvana Jun 11, 2025
9ed4144
Changes based on PR feedback
galvana Jun 11, 2025
525ca86
Removing unused file
galvana Jun 11, 2025
48ff1c5
Fixing comment format
galvana Jun 11, 2025
73c9fea
Merge branch 'main' into ENG-684-save-large-access-data-externally
galvana Jun 11, 2025
e1e79ca
Fixing patch path
galvana Jun 11, 2025
59600c9
Merge branch 'main' into ENG-684-save-large-access-data-externally
galvana Jun 11, 2025
bfe61ea
Updating change log
galvana Jun 11, 2025
9d31e68
Fixing change log
galvana Jun 11, 2025
c750ecc
Merge branch 'main' into ENG-684-save-large-access-data-externally
galvana Jun 11, 2025
328 changes: 328 additions & 0 deletions src/fides/api/models/encrypted_large_data.py
Contributor:
I am not sure if this belongs in models top level? It kind of blends in with the regular models.

Contributor (Author):
I moved it under /models/field_types

@@ -0,0 +1,328 @@
import json
import sys
from datetime import datetime
from typing import Any, List, Optional, Type

from loguru import logger

from fides.api.api.deps import get_autoclose_db_session
from fides.api.schemas.external_storage import ExternalStorageMetadata
from fides.api.service.external_data_storage import (
    ExternalDataStorageError,
    ExternalDataStorageService,
)
from fides.api.util.collection_util import Row
from fides.api.util.custom_json_encoder import CustomJSONEncoder

# 896MB threshold for external storage.
# We only generate an estimated size for large datasets, so we want to be conservative
# and fall back to external storage even if we haven't hit the 1GB max limit.
LARGE_DATA_THRESHOLD_BYTES = 896 * 1024 * 1024  # 896MB


def calculate_data_size(data: List[Row]) -> int:
    """Calculate the approximate serialized size of access data in bytes using a memory-efficient approach.

    We need to determine the size of the data in a memory-efficient way. Calling `sys.getsizeof` is not
    accurate for `Dict`, and calculating the exact size (`json.dumps` over everything) could take up a
    lot of memory. I went with a sampling approach where we only need to call `json.dumps` on a sample
    of the data. The most likely reason for large data is a large number of rows, rather than a single
    row with a lot of data.

    We use this knowledge to:

    - Take a sample of records from the list of data
    - Calculate the exact size of the sample
    - Extrapolate the estimated size based on the total number of records
    """

    if not data:
        return 0

    try:
        data_count = len(data)

        # For very large datasets, estimate size from a sample to avoid memory issues
        if data_count > 1000:
            logger.debug(
                f"Calculating size for large dataset ({data_count} rows) using sampling"
            )

            # Use a larger sample size for better accuracy
            # (at least 100, up to 500, or 5% of the data)
            sample_size = min(500, max(100, data_count // 20))

            # Use stratified sampling for better representation
            if data_count > sample_size * 3:
                # Take samples spread evenly across the dataset
                step = data_count // sample_size
                sample_indices = list(range(0, data_count, step))[:sample_size]
                sample = [data[i] for i in sample_indices]
            else:
                # For smaller datasets, just take from the beginning
                sample = data[:sample_size]

            # Calculate the serialized size of the sample
            sample_json = json.dumps(
                sample, cls=CustomJSONEncoder, separators=(",", ":")
            )
            sample_bytes = len(sample_json.encode("utf-8"))

            # Calculate the per-record average
            avg_record_size = sample_bytes / sample_size

            # Estimate content size
            content_size = int(avg_record_size * data_count)

            # Add JSON structure overhead:
            # - Array brackets: 2 bytes
            # - Commas between records: (data_count - 1) bytes
            # - Some padding for variations: 1% of content size
            structure_overhead = 2 + (data_count - 1) + int(content_size * 0.01)

            estimated_size = content_size + structure_overhead

            logger.debug(f"Sample: {sample_size} records, {sample_bytes} bytes")
            logger.debug(f"Avg per record: {avg_record_size:.1f} bytes")
            logger.debug(
                f"Estimated size: {estimated_size:,} bytes ({estimated_size / (1024*1024*1024):.2f} GB)"
            )
            return estimated_size

        # For smaller datasets, calculate the exact size
        json_str = json.dumps(data, cls=CustomJSONEncoder, separators=(",", ":"))
        size = len(json_str.encode("utf-8"))
        return size

    except (TypeError, ValueError) as e:
        logger.warning(
            f"Failed to calculate JSON size, falling back to sys.getsizeof: {e}"
        )
        # Fall back to sys.getsizeof if JSON serialization fails
        return sys.getsizeof(data)


def is_large_data(data: List[Row], threshold_bytes: Optional[int] = None) -> bool:
    """Check if data exceeds the large data threshold.

    Args:
        data: The data to check
        threshold_bytes: Custom threshold in bytes. If None, uses LARGE_DATA_THRESHOLD_BYTES
    """
    if not data:
        return False

    threshold = (
        threshold_bytes if threshold_bytes is not None else LARGE_DATA_THRESHOLD_BYTES
    )
    size = calculate_data_size(data)
    is_large = size > threshold

    if is_large:
        logger.info(
            f"Data size ({size:,} bytes) exceeds threshold ({threshold:,} bytes) - using external storage"
        )

    return is_large

Contributor:
These top two functions might be useful in other areas as well. I wonder if putting them into a data_size.py type util would make them easier to find/use?
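To make the sampling estimator above concrete, here is a hypothetical standalone sketch: it mirrors the same sample-average-extrapolate logic, but uses plain `json.dumps` in place of the fides `CustomJSONEncoder`, and all names are illustrative rather than part of the PR.

```python
import json

def estimate_json_size(data: list) -> int:
    """Estimate serialized size: exact for small lists, sampled for large ones."""
    data_count = len(data)
    if data_count == 0:
        return 0
    if data_count <= 1000:
        # Small datasets: serialize everything for an exact answer
        return len(json.dumps(data, separators=(",", ":")).encode("utf-8"))

    # At least 100 records, up to 500, or 5% of the data
    sample_size = min(500, max(100, data_count // 20))
    step = data_count // sample_size
    # Stratified sample spread across the dataset
    sample = [data[i] for i in range(0, data_count, step)][:sample_size]

    sample_bytes = len(json.dumps(sample, separators=(",", ":")).encode("utf-8"))
    avg_record_size = sample_bytes / sample_size
    content_size = int(avg_record_size * data_count)
    # Brackets + commas + 1% padding, mirroring the overhead terms above
    overhead = 2 + (data_count - 1) + int(content_size * 0.01)
    return content_size + overhead

rows = [{"id": i, "email": f"user{i}@example.com"} for i in range(5000)]
estimated = estimate_json_size(rows)
exact = len(json.dumps(rows, separators=(",", ":")).encode("utf-8"))
print(estimated, exact)  # estimate lands within a few percent of exact
```

Because the sample's own brackets and commas are folded into the per-record average, the estimate skews slightly high, which suits the conservative 896MB threshold.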


class EncryptedLargeDataDescriptor:
    """
    A Python descriptor for database fields with encrypted external storage fallback.

    This implements Python's descriptor protocol (by defining __get__ and __set__ methods)
    to intercept attribute access and provide custom behavior. When you declare:

    ```python
    class RequestTask(Base):
        access_data = EncryptedLargeDataDescriptor("access_data")
    ```

    the descriptor automatically:
    1. Encrypts data using SQLAlchemy-Utils StringEncryptedType
    2. Uses external storage (S3, GCS, local) when data exceeds size thresholds
    3. Handles cleanup of external storage files when data changes
    4. Works transparently - fields behave like normal Python attributes

    Storage paths use the format: {model_name}/{instance_id}/{field_name}/{timestamp}

    This pattern eliminates duplicate code across multiple encrypted fields while providing
    a clean, reusable interface that works with any SQLAlchemy model with an 'id' attribute.
    """

    def __init__(
        self,
        field_name: str,
        empty_default: Optional[Any] = None,
        threshold_bytes: Optional[int] = None,
    ):
        """
        Initialize the descriptor.

        Args:
            field_name: The name of the database column (e.g., "access_data")
            empty_default: Default value when data is None/empty ([] for lists, {} for dicts)
            threshold_bytes: Optional custom threshold for external storage
        """
        self.field_name = field_name
        self.private_field = f"_{field_name}"
        self.empty_default = empty_default if empty_default is not None else []
        self.threshold_bytes = threshold_bytes or LARGE_DATA_THRESHOLD_BYTES
        self.model_class: Optional[str] = None
        self.name: Optional[str] = None

    def __set_name__(self, owner: Type, name: str) -> None:
        """Called when the descriptor is assigned to a class attribute."""
        self.name = name
        self.model_class = owner.__name__

    def _generate_storage_path(self, instance: Any) -> str:
        """
        Generate a storage path using generic naming.

        Format: {model_type}/{instance_id}/{field_name}/{timestamp}
        """
        instance_id = getattr(instance, "id", None)
        if not instance_id:
            raise ValueError(f"Instance {instance} must have an 'id' attribute")

        timestamp = datetime.utcnow().strftime("%Y%m%d-%H%M%S.%f")

        return f"{self.model_class}/{instance_id}/{self.field_name}/{timestamp}"

    def __get__(self, instance: Any, owner: Type) -> Any:
Contributor:

I don't see owner being used anywhere in this function - can probably be removed from the signature?

Contributor (Author):

This is one of the methods that is part of the descriptor protocol, so we need to keep it even if we don't use it: https://docs.python.org/3/reference/datamodel.html#object.__get__

"""
Get the value, handling external storage retrieval if needed.
"""
if instance is None:
return self

# Get the raw data from the private field
raw_data = getattr(instance, self.private_field)
if raw_data is None:
return None

# Check if it's external storage metadata
if isinstance(raw_data, dict) and "storage_type" in raw_data:
logger.info(

Check warning on line 208 in src/fides/api/models/encrypted_large_data.py

View check run for this annotation

Codecov / codecov/patch

src/fides/api/models/encrypted_large_data.py#L208

Added line #L208 was not covered by tests
f"Reading {self.model_class}.{self.field_name} from external storage "
f"({raw_data.get('storage_type')})"
)
try:
metadata = ExternalStorageMetadata.model_validate(raw_data)
data = self._retrieve_external_data(metadata)

Check warning on line 214 in src/fides/api/models/encrypted_large_data.py

View check run for this annotation

Codecov / codecov/patch

src/fides/api/models/encrypted_large_data.py#L212-L214

Added lines #L212 - L214 were not covered by tests

# Log retrieval details
record_count = len(data) if isinstance(data, list) else "N/A"
logger.info(

Check warning on line 218 in src/fides/api/models/encrypted_large_data.py

View check run for this annotation

Codecov / codecov/patch

src/fides/api/models/encrypted_large_data.py#L217-L218

Added lines #L217 - L218 were not covered by tests
f"Successfully retrieved {self.model_class}.{self.field_name} "
f"from external storage (records: {record_count})"
)
return data if data is not None else self.empty_default
except Exception as e:
logger.error(

Check warning on line 224 in src/fides/api/models/encrypted_large_data.py

View check run for this annotation

Codecov / codecov/patch

src/fides/api/models/encrypted_large_data.py#L222-L224

Added lines #L222 - L224 were not covered by tests
f"Failed to retrieve {self.model_class}.{self.field_name} "
f"from external storage: {str(e)}"
)
raise ExternalDataStorageError(

Check warning on line 228 in src/fides/api/models/encrypted_large_data.py

View check run for this annotation

Codecov / codecov/patch

src/fides/api/models/encrypted_large_data.py#L228

Added line #L228 was not covered by tests
f"Failed to retrieve {self.field_name}: {str(e)}"
) from e
else:
return raw_data

    def __set__(self, instance: Any, value: Any) -> None:
        """
        Set the value, automatically using external storage for large data.
        """
        if not value:
            # Clean up any existing external storage
            self._cleanup_external_data(instance)
            # Set to empty default
            setattr(instance, self.private_field, self.empty_default)
            return

        # Check if the data is the same as what's already stored
        try:
            current_data = self.__get__(instance, type(instance))
            if current_data == value:
                # Data is identical, no need to update
                return
        except Exception:
            # If we can't get the current data, proceed with the update
            pass

        # Calculate data size
        data_size = calculate_data_size(value)

        # Check if data exceeds the threshold
        if data_size > self.threshold_bytes:
            logger.info(
                f"{self.model_class}.{self.field_name}: Data size ({data_size:,} bytes) "
                f"exceeds threshold ({self.threshold_bytes:,} bytes), storing externally"
            )
            # Clean up any existing external storage first
            self._cleanup_external_data(instance)

            # Store in external storage
            metadata = self._store_external_data(instance, value)
            setattr(instance, self.private_field, metadata.model_dump())
        else:
            # Clean up any existing external storage
            self._cleanup_external_data(instance)
            # Store directly in the database
            setattr(instance, self.private_field, value)

    def _store_external_data(self, instance: Any, data: Any) -> ExternalStorageMetadata:
        """
        Store data in external storage using the generic path structure.
        """
        storage_path = self._generate_storage_path(instance)

        with get_autoclose_db_session() as session:
            metadata = ExternalDataStorageService.store_data(
                db=session,
                storage_path=storage_path,
                data=data,
            )

            logger.info(
                f"Stored {self.model_class}.{self.field_name} to external storage: {storage_path}"
            )

            return metadata

    def _retrieve_external_data(self, metadata: ExternalStorageMetadata) -> Any:
        """
        Retrieve data from external storage.
        """
        with get_autoclose_db_session() as session:
            return ExternalDataStorageService.retrieve_data(
                db=session,
                metadata=metadata,
            )

    def _cleanup_external_data(self, instance: Any) -> None:
        """Clean up external storage if it exists."""
        raw_data = getattr(instance, self.private_field, None)
        if isinstance(raw_data, dict) and "storage_type" in raw_data:
            try:
                metadata = ExternalStorageMetadata.model_validate(raw_data)
                with get_autoclose_db_session() as session:
                    ExternalDataStorageService.delete_data(
                        db=session,
                        metadata=metadata,
                    )

                logger.info(
                    f"Cleaned up external storage for {self.model_class}.{self.field_name}: "
                    f"{metadata.file_key}"
                )
            except Exception as e:
                logger.warning(
                    f"Failed to cleanup external {self.field_name}: {str(e)}"
                )

    def cleanup(self, instance: Any) -> None:
        """Public method to clean up external storage."""
        self._cleanup_external_data(instance)
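For readers less familiar with the descriptor protocol the class above relies on, here is a toy sketch of the same inline-vs-external interception. All names are hypothetical, and a plain dict stands in for encryption and external storage; it only shows how `__set_name__`, `__get__`, and `__set__` cooperate to make the field behave like a normal attribute.

```python
class ThresholdField:
    """Toy descriptor: small values stored inline, large ones in a side dict."""

    external = {}  # stand-in for external storage (hypothetical)

    def __init__(self, field_name, threshold=3):
        self.private_field = f"_{field_name}"
        self.threshold = threshold

    def __set_name__(self, owner, name):
        self.name = name
        self.model_class = owner.__name__

    def __get__(self, instance, owner):
        if instance is None:
            return self  # class-level access returns the descriptor itself
        raw = getattr(instance, self.private_field, None)
        if isinstance(raw, dict) and "key" in raw:
            return self.external[raw["key"]]  # "retrieve from external storage"
        return raw

    def __set__(self, instance, value):
        if isinstance(value, list) and len(value) > self.threshold:
            # Path mirrors {model_name}/{instance_id}/{field_name}
            key = f"{self.model_class}/{id(instance)}/{self.name}"
            self.external[key] = value
            setattr(instance, self.private_field, {"key": key})  # pointer only
        else:
            setattr(instance, self.private_field, value)


class RequestTask:
    access_data = ThresholdField("access_data")

task = RequestTask()
task.access_data = [1, 2]            # small: stored inline on the instance
task.access_data = [1, 2, 3, 4, 5]   # large: only a pointer stays inline
print(task.access_data)              # reads back transparently
```

The caller never sees the pointer dict; both reads and writes go through the descriptor, which is the transparency property the PR's docstring claims.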
17 changes: 15 additions & 2 deletions src/fides/api/models/privacy_request/privacy_request.py
@@ -47,6 +47,7 @@
 from fides.api.models.audit_log import AuditLog
 from fides.api.models.client import ClientDetail
 from fides.api.models.comment import Comment, CommentReference, CommentReferenceType
+from fides.api.models.encrypted_large_data import EncryptedLargeDataDescriptor
 from fides.api.models.fides_user import FidesUser
 from fides.api.models.manual_webhook import AccessManualWebhook
 from fides.api.models.policy import (
@@ -251,7 +252,8 @@ class PrivacyRequest(
     awaiting_email_send_at = Column(DateTime(timezone=True), nullable=True)

     # Encrypted filtered access results saved for later retrieval
-    filtered_final_upload = Column(  # An encrypted JSON String - Dict[Dict[str, List[Row]]] - rule keys mapped to the filtered access results
+    _filtered_final_upload = Column(  # An encrypted JSON String - Dict[Dict[str, List[Row]]] - rule keys mapped to the filtered access results
+        "filtered_final_upload",
         StringEncryptedType(
             type_in=JSONTypeOverride,
             key=CONFIG.security.app_encryption_key,
@@ -260,6 +262,11 @@
         ),
     )

+    # Use descriptor for automatic external storage handling
+    filtered_final_upload = EncryptedLargeDataDescriptor(
+        field_name="filtered_final_upload", empty_default={}
+    )
+
     # Encrypted filtered access results saved for later retrieval
     access_result_urls = Column(  # An encrypted JSON String - Dict[Dict[str, List[Row]]] - rule keys mapped to the filtered access results
         StringEncryptedType(
@@ -334,6 +341,7 @@ def delete(self, db: Session) -> None:
         deleting this object from the database
         """
         self.clear_cached_values()
+        self.cleanup_external_storage()
         Attachment.delete_attachments_for_reference_and_type(
             db, self.id, AttachmentReferenceType.privacy_request
         )
@@ -1257,6 +1265,11 @@ def get_consent_results(self) -> Dict[str, int]:
         # DSR 2.0 does not cache the results so nothing to do here
         return {}

+    def cleanup_external_storage(self) -> None:
+        """Clean up all external storage files for this privacy request"""
+        # Access the descriptor from the class to call cleanup
+        PrivacyRequest.filtered_final_upload.cleanup(self)
+
     def save_filtered_access_results(
         self, db: Session, results: Dict[str, Dict[str, List[Row]]]
     ) -> None:
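The `PrivacyRequest.filtered_final_upload.cleanup(self)` call in the hunk above works because `__get__` returns the descriptor object itself when accessed on the class (`instance is None`), rather than a field value. A hypothetical minimal sketch of that dispatch (names are illustrative, not the PR's code):

```python
class CleanupField:
    """Minimal descriptor illustrating class-level access for cleanup."""

    def __set_name__(self, owner, name):
        self.private_field = f"_{name}"

    def __get__(self, instance, owner):
        if instance is None:
            return self  # ClassName.field yields the descriptor itself
        return getattr(instance, self.private_field, None)

    def __set__(self, instance, value):
        setattr(instance, self.private_field, value)

    def cleanup(self, instance):
        # stand-in for deleting externally stored data
        setattr(instance, self.private_field, None)


class PrivacyRequestSketch:
    filtered_final_upload = CleanupField()

    def cleanup_external_storage(self):
        # Access the descriptor via the class, as in the PR's delete() path
        PrivacyRequestSketch.filtered_final_upload.cleanup(self)

pr = PrivacyRequestSketch()
pr.filtered_final_upload = {"rule": []}
pr.cleanup_external_storage()
print(pr.filtered_final_upload)  # → None
```

Going through the class sidesteps `__get__`'s instance path, which would otherwise hand back the stored data instead of the descriptor.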
@@ -1544,7 +1557,7 @@ def get_action_required_details(


 def _parse_cache_to_checkpoint_action_required(
-    cache: dict[str, Any]
+    cache: dict[str, Any],
 ) -> CheckpointActionRequired:
     collection = (
         CollectionAddress(