After this code, the memory usage increases:
models.prefetch_related_objects(chunk, *prefetch_fields)
Django does not seem to release the cache. How can I release the cache explicitly?
Hi, please show your models, templates, and views so we can understand your problem.
Until then, try something like this with cache.clear():
from django.core.cache import cache
# Your code
models.prefetch_related_objects(chunk, *prefetch_fields)
# Clear the cache explicitly
cache.clear()
I tried cache.clear(), but that didn't work.
The purpose of my code is to export a large amount of data from the database (PostgreSQL). The problem is that memory usage grows with the size of the data being exported, and even after the export completes, the memory is not released. Each exported row is large and contains large arrays.
I tried to release the cache for the model, but neither of these worked:
cache.clear()
models._prefetched_objects_cache = {}
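(From what I can tell, Django keeps prefetch results in a _prefetched_objects_cache dict on each model instance, not anywhere on the models module, so a per-instance reset would presumably have to look more like the sketch below. Here chunk is the list of instances from the code further down, and this is an untested sketch rather than a confirmed fix.)
# Untested sketch: reset the per-instance prefetch cache for each object in a chunk.
for obj in chunk:
    if hasattr(obj, "_prefetched_objects_cache"):
        obj._prefetched_objects_cache = {}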
Code snippets:
from django.db import models
...
dump(export.zip_handle, export.filename, chunk_count, _iter_data(
    progress=progress,
    counts_offset=counts_offset,
    counts_total=counts_total,
    qs=export.qs,
))
...
def _iter_data(
    *,
    progress: models.TaskProgress,
    counts_offset: int,
    counts_total: int,
    qs: models.QuerySet,
) -> Iterable[list[models.Model]]:
    """
    Iterates over the given queryset in efficient chunks.
    """
    paginator = Paginator(qs, settings.SAFE_PAGE_SIZE)
    # Fetch chunked.
    n = 0
    for page_number in paginator.page_range:
        page = paginator.page(page_number)
        yield page.object_list
        # Report progress.
        n += len(page.object_list)
        progress.set(int((counts_offset + n) / counts_total * 100))
def dump(self, zip_handle: AESZipFile, filename: str, chunk_count: int,
         chunks: Iterable[Sequence[models.T]]) -> None:
    def do_dump() -> Iterable[Iterable[dict[str, Any]]]:
        serializer = self.get_serializer()
        prefetch_fields = serializer.get_prefetch_fields(key_only=True)
        for chunk in chunks:
            models.prefetch_related_objects(chunk, *prefetch_fields)
            yield map(serializer.to_representation, chunk)
            # Tried to release the cache for the model, but failed
            models._prefetched_objects_cache = {}
    self.encode(zip_handle, filename, chunk_count, do_dump())
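A related observation (not a confirmed cause): after the yield, do_dump still holds the previous chunk in its local name until the next page has been fetched from chunks, so two pages plus their prefetch caches can briefly be alive at the same time. Dropping the reference at the end of the loop body would avoid that; roughly, as a sketch:
for chunk in chunks:
    models.prefetch_related_objects(chunk, *prefetch_fields)
    yield map(serializer.to_representation, chunk)
    # Drop the reference before the next page is fetched, so the previous
    # page and its prefetch caches can be garbage collected first.
    del chunk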
def encode(self, zip_handle: AESZipFile, filename: str, chunk_count: int,
           chunks: Iterable[Iterable[dict[str, Any]]]) -> None:
    path = pathlib.Path(filename)
    index_digits = len(str(chunk_count))
    csv_fields = self._csv_fields
    dumps = json.dumps
    index = 1
    for chunk in chunks:
        chunk_filename = filename if index == 1 else f"{path.stem}_{index:0{index_digits}}{path.suffix}"
        with zip_handle.open(chunk_filename, "w", force_zip64=True) as handle:
            with storage.text_wrapper(handle, newline="") as text_handle:
                writer = csv.writer(text_handle)
                writer.writerow(item[0] for item in csv_fields)
                writer.writerows(
                    (row[field_name] if is_raw else dumps(row[field_name]) for field_name, is_raw in csv_fields)
                    for row in chunk
                )
        index += 1
        del chunk
    del chunks
Where are your models?
Perhaps you could use a smaller chunk size to limit the number of objects held in memory at any one time.
Something like this:
SAFE_PAGE_SIZE = 100  # Play with this value

def _iter_data(
    *,
    progress: models.TaskProgress,
    counts_offset: int,
    counts_total: int,
    qs: models.QuerySet,
) -> Iterable[list[models.Model]]:
    paginator = Paginator(qs, SAFE_PAGE_SIZE)
    n = 0
    for page_number in paginator.page_range:
        page = paginator.page(page_number)
        yield page.object_list
        n += len(page.object_list)
        progress.set(int((counts_offset + n) / counts_total * 100))
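If the goal is to cap how many objects are alive at any one time, another option in the same spirit is to stream rows from the database cursor and batch them manually instead of paginating. This is an untested sketch (the _iter_data_streaming name is just for illustration, and it assumes your Django version and connection allow QuerySet.iterator() here):
from itertools import islice

def _iter_data_streaming(*, qs: models.QuerySet,
                         batch_size: int = SAFE_PAGE_SIZE) -> Iterable[list[models.Model]]:
    # iterator() streams rows from the cursor instead of caching a whole
    # page of results on the queryset.
    rows = qs.iterator(chunk_size=batch_size)
    while True:
        batch = list(islice(rows, batch_size))
        if not batch:
            break
        yield batch
Progress reporting would need to be added back, and the manual prefetch_related_objects call should still work on each yielded batch, since it accepts a plain list of instances.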