A Python server process of mine is currently crashing regularly (every few days): its CPU usage goes high, yet it is not actively processing anything. The process runs in an Alpine Linux Docker container.
To help find the problem, I'd like to see the exact line on which the process is busy. As the process is already running, I believe my only option is to use the generic gdb debugger.
I tried following this small guide, "adapted" for Alpine Linux: I noticed gdb was already installed, and the Python 3 debug symbols are in the python3-dbg package.
I then run the following (after cd-ing to the virtual environment from which Python is run): gdb python3 -p 135 (where 135 is the PID of the Python background process in the Docker container).
In the spawned gdb session I tried to list the Python line currently being executed with py-list. However, this returns "undefined command: py-list". The same happens for the stack trace command, py-bt: it is not found either.
Can I fix this?
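One thing that may be worth trying: py-bt and py-list are not built into gdb. They are defined by CPython's gdb extension script (Tools/gdb/libpython.py in the CPython source tree, which some distributions install as python-gdb.py). If gdb did not auto-load that script, sourcing it by hand may make the commands available. Below is a minimal sketch that builds such a gdb invocation. It assumes gdb was built with Python scripting support and that the script matches the interpreter version (3.6 here); the script path is a hypothetical placeholder, not a known location on Alpine.

```python
import shlex


def build_gdb_command(pid, libpython_path, gdb_commands=("py-bt", "py-list")):
    """Return an argv list that attaches gdb to `pid`, loads the CPython
    gdb extension script, runs the given commands once, and exits."""
    argv = ["gdb", "-p", str(pid), "-batch"]
    # `source` on a .py file makes gdb run it with its embedded Python,
    # which is what registers the py-* commands.
    argv += ["-ex", "source " + libpython_path]
    for command in gdb_commands:
        argv += ["-ex", command]
    return argv


# 135 is the PID from the question; the script path is hypothetical.
argv = build_gdb_command(135, "/path/to/libpython.py")
print(" ".join(shlex.quote(part) for part in argv))
```

The printed command can be pasted into a shell inside the container, or the list can be passed to subprocess.run. With -batch, gdb detaches again once the commands have run, so the server process is only paused briefly.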
When I use the more basic bt command I see:
#0 0x00007fe3dd7e0c3d in epoll_pwait () from /lib/ld-musl-x86_64.so.1
#1 0x00007fe3dc90c3a0 in signals () from /njs/BackgroundServer/venv/lib/python3.6/site-packages/gevent/libev/corecext.cpython-36m-x86_64-linux-gnu.so
#2 0x00007fe3dc90c490 in default_loop_struct () from /njs/BackgroundServer/venv/lib/python3.6/site-packages/gevent/libev/corecext.cpython-36m-x86_64-linux-gnu.so
#3 0x00007fe3dc6e57a1 in epoll_poll (loop=0xe95f, timeout=<optimized out>) at /tmp/pip-install-o4b63_go/gevent/deps/libev/ev_epoll.c:153
#4 0x00007fe3dc6eda5c in ev_run (loop=0x7fe3dc90c3a0 <default_loop_struct>, flags=flags@entry=0) at /tmp/pip-install-o4b63_go/gevent/deps/libev/ev.c:3683
#5 0x00007fe3dc6ede88 in __pyx_pf_6gevent_5libev_8corecext_4loop_14run (__pyx_v_self=0x7fe3d8e86840, __pyx_v_once=<optimized out>, __pyx_v_nowait=<optimized out>)
at src/gevent/libev/gevent.corecext.c:5575
#6 __pyx_pw_6gevent_5libev_8corecext_4loop_15run (__pyx_v_self=0x7fe3d8e86840, __pyx_args=<optimized out>, __pyx_kwds=<optimized out>) at src/gevent/libev/gevent.corecext.c:5526
#7 0x00007fe3dd3d8b46 in _PyCFunction_FastCallDict () from /usr/lib/libpython3.6m.so.1.0
#8 0x00007fe3dd3d8db9 in _PyCFunction_FastCallKeywords () from /usr/lib/libpython3.6m.so.1.0
#9 0x00007fe3dd42f7fd in ?? () from /usr/lib/libpython3.6m.so.1.0
#10 0x00007fe3dd435634 in _PyEval_EvalFrameDefault () from /usr/lib/libpython3.6m.so.1.0
#11 0x00007fe3dd42ee0c in ?? () from /usr/lib/libpython3.6m.so.1.0
#12 0x00007fe3dd43668c in _PyFunction_FastCallDict () from /usr/lib/libpython3.6m.so.1.0
#13 0x00007fe3dd3a25e0 in _PyObject_FastCallDict () from /usr/lib/libpython3.6m.so.1.0
#14 0x00007fe3dd3a2870 in _PyObject_Call_Prepend () from /usr/lib/libpython3.6m.so.1.0
#15 0x00007fe3dd3a24e7 in PyObject_Call () from /usr/lib/libpython3.6m.so.1.0
#16 0x00007fe3dc910289 in g_initialstub (mark=mark@entry=0x7ffc48bd3e00) at greenlet.c:810
#17 0x00007fe3dc90fdba in g_switch (target=0x7fe3d8eb85a0, args=0x7fe3ddc0d048, kwargs=<optimized out>) at greenlet.c:582
#18 0x00007fe3dd3cea91 in ?? () from /usr/lib/libpython3.6m.so.1.0
#19 0x00007fe3dd3c13d0 in ?? () from /usr/lib/libpython3.6m.so.1.0
#20 0x00007fe3dd42f5dd in ?? () from /usr/lib/libpython3.6m.so.1.0
#21 0x00007ffc48bd4118 in ?? ()
#22 0x00007ffc48bd4080 in ?? ()
#23 0x0000000000000002 in ?? ()
#24 0x0000000000000002 in ?? ()
#25 0x00007fe3dd3e402f in ?? () from /usr/lib/libpython3.6m.so.1.0
#26 0xd1145828f949d59e in ?? ()
#27 0x00007ffc48bd40d0 in ?? ()
#28 0x00007fe3dd4ae93a in ?? () from /usr/lib/libpython3.6m.so.1.0
#29 0x0000000000000003 in ?? ()
#30 0x0000000000000000 in ?? ()
I notice, however, that the memory addresses in the stack trace do not change at all when I run bt multiple times, so this seems to imply that the process is actually stuck waiting for a gevent call to return? I guess that without the Python symbols it's not possible to find which line causes the hang?
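For the next occurrence, a possible alternative that avoids gdb entirely is the standard-library faulthandler module, which can dump the Python-level traceback when the process receives a signal. This is only a sketch under assumptions: it has to be installed at server startup, so it does not help the process that is already hung; the choice of SIGUSR1 and the helper name are illustrative; and under gevent it will most likely show only the greenlet that is running at that moment, not every greenlet.

```python
import faulthandler
import os
import signal
import sys
import tempfile


def install_traceback_dump(signum=signal.SIGUSR1, log_file=None):
    """Write the Python tracebacks of all threads to `log_file`
    (default: stderr) whenever the process receives `signum`.

    The file must stay open for as long as the handler is registered.
    """
    target = log_file if log_file is not None else sys.stderr
    faulthandler.register(signum, file=target, all_threads=True)
    return signum


# Self-contained demonstration: register the handler, signal ourselves,
# and read back what was written.
with tempfile.TemporaryFile(mode="w+") as log:
    install_traceback_dump(signal.SIGUSR1, log)
    os.kill(os.getpid(), signal.SIGUSR1)
    log.seek(0)
    dump = log.read()
    # Unregister before the file is closed.
    faulthandler.unregister(signal.SIGUSR1)

print(dump)
```

In a real server, install_traceback_dump() would be called once at startup with a long-lived log file, and the dump would be triggered from a shell with kill -USR1 followed by the PID. The dump lists file names and line numbers, which is the information py-bt would otherwise provide.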
py-xxx in a running process is still unavailable. I guess this is how it works: gdb can do py-xxx when launching a new process, but not when attaching to a running process.