Context Navigation

← Previous Ticket
Next Ticket →

#36700 new Cleanup/optimization

ASGIHandler creates reference cycles that require a gc pass to free

Reported by:	Patryk Zawadzki	Owned by:
Component:	HTTP handling	Version:	5.2
Severity:	Normal	Keywords:	memory asgihandler gc
Cc:	Patryk Zawadzki, Carlton Gibson	Triage Stage:	Accepted
Has patch:	no	Needs documentation:	no
Needs tests:	no	Patch needs improvement:	no
Easy pickings:	no	UI/UX:	no

Description

Disclaimer: it's impossible for pure Python code to truly leak memory (in the sense that valgrind would detect), however it's quite easy to create structures that effectively occupy memory for a long time because they require the deepest (generation 2) garbage collection cycle to collect and that happens very rarely. In addition to that, the more such structures aggregate, the more expensive the garbage collection cycle becomes, because it effectively stops the entire interpreter to do its job and it can take seconds. On top of that, it's entirely possible for a container to run out of memory before the garbage collection happens and we (Saleor Commerce) see containers being terminated by the kernel OOM killer due to high memory pressure where most of that memory is locked by garbage.

One such case is found in the ASGIHandler. When handling a request, the ASGIHandler.handle spawns two async tasks. One for the actual app code (process_request) and one for the disconnection handler (ASGIHandler.listen_for_disconnect). The latter will raise RequestAborted every time it receives the http.disconnect ASGI message.

In our setup (uvicorn), the http.disconnect message is received for every request, even after successfully processing the view code and delivering the response, but that's not critical for this issue, it just makes it easy to reproduce this on our end.

Here's where the problem is:

When RequestAborted is raised, its stack trace includes the call to ASGIHandler.handle, which is where ASGIHandler.listen_for_disconnect was called.
In turn, the ASGIHandler.handle stack frame includes references to all local variables.
Among those variables is tasks which holds the references to both async tasks.
Now, one of those tasks is the task created from ASGIHandler.listen_for_disconnect.
The task future is already resolved and now holds a reference back to the RequestAborted exception from step 1. And thus the cycle completes, creating an unfreeable reference cycle.

All of those objects hold references to other objects and stack frames that also become unfreeable, ending up holding a sizeable list of objects hostage until the next time gc.collect(2) happens (which can be minutes, depending on how much code your app executes).

Making ASGIHandler.handle explicitly call tasks.clear() or just del tasks after the tasks are no longer needed breaks the cycle by removing the link between the exception stack frame locals and the future referencing the exception.

PS: I've classified this as a bug as high memory use can lead to OOM kills and crashes but feel free to reclassify as "cleanup/optimization" if that's more fitting.

Attachments (1)

asgi-ref-cycle.svg (98.5 KB ) - added by Patryk Zawadzki 3 weeks ago.

Download all attachments as: .zip

Change History (3)

by Patryk Zawadzki, 3 weeks ago

Attachment:	asgi-ref-cycle.svg added

comment:1 by Natalia Bidart, 3 weeks ago

Cc:	Carlton Gibson added
Triage Stage:	Unreviewed → Accepted
Type:	Bug → Cleanup/optimization

Thank you Patryk Zawadzki for the complete ticket report. Accepting on the basis that the solution provided can be accompanied by a proper test case ensuring proper garbage collection. Looking forward to your contribution!

comment:2 by Carlton Gibson, 3 weeks ago

Hi Patryk, this seems plausible, yes. Thanks for the report.

See also #36315 — which is to use TaskGroup here. There's two related PRs there: gh19366 and gh19370. If you could have a look at those, it may be that that's the better way forward. (They're on my list to look at this cycle, but your eyes would be welcome. 🤹)

Note: See TracTickets for help on using tickets.

Download in other formats:

Issues

Context Navigation

#36700 new Cleanup/optimization

ASGIHandler creates reference cycles that require a gc pass to free

Description

Attachments (1)

Change History (3)

by Patryk Zawadzki, 3 weeks ago

comment:1 by Natalia Bidart, 3 weeks ago

comment:2 by Carlton Gibson, 3 weeks ago

Download in other formats:

Django Links

Learn More

Get Involved

Get Help

Follow Us

Support Us