I am trying to track whether a C++20 coroutine has ever suspended, so that unhandled_exception knows whether it can simply rethrow the exception (with throw;) back to the caller of the initial coroutine function, or whether it has to call std::current_exception() and store the result in the promise to be checked later. My usual approach is to transform every awaiter with await_transform and manually check the return values of await_suspend, and whether it throws, to decide what constitutes a suspend. This is error-prone and inefficient though, so I went looking for a better way.
I came up with this: in the awaiter returned from initial_suspend, I resume the coroutine in await_suspend and simply let its exceptions propagate back out to the caller. Simple!
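struct InitialSuspend
{
    bool await_ready(){ return false; }
    void await_suspend(std::coroutine_handle<promise_type> h)
    {
        // Runs the coroutine body; anything rethrown by unhandled_exception escapes from here.
        h.resume();
        // Reached only if the body suspended (or finished) without throwing.
        h.promise().hasEverSuspended = true;
    }
    void await_resume(){}
};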
I don't allow my coroutines to be destroyed until the return object drops its reference to them, so this will never incur a use-after-free for me. I was really happy when I thought of this solution, since it eliminates all the inefficient bookkeeping and takes advantage of the inherent way coroutines work... or so I thought.
Unfortunately, while testing this in various compilers, I noticed a terrible discrepancy. Some compilers destroy the coroutine when that h.resume() call throws an exception, while other compilers don't. Consider that get_return_object can return an intermediary type that references the coroutine: it will be constructed with that reference, so surely it owns the coroutine now, right? But how should it know when it should destroy the coroutine?
I created a small playground program with logging to compare the behavior of GCC, Clang, and MSVC: https://compiler-explorer.com/z/Yaanxoj9z
The results: GCC never destroys the coroutine itself, Clang destroys the coroutine itself only when get_return_object isn't the same as the coroutine return type, and MSVC always tries to destroy the coroutine itself. That's three completely different behaviors.
I think I could potentially work around this by having the intermediary communicate with the promise destructor, so that when the promise is destroyed it can tell the intermediary not to destroy the coroutine again. But I am not very happy with this. I need to understand why this is happening and why it is different for each compiler, in case I am somehow invoking undefined behavior.
I tried reading through the standard myself, and in § 9.5.4 where it discusses coroutines, I see a rough outline for the compiler transformation of the coroutine:
{
    promise-type promise promise-constructor-arguments ;
    try {
        co_await promise.initial_suspend() ;
        function-body
    } catch ( ... ) {
        if (!initial-await-resume-called)
            throw ;
        promise.unhandled_exception() ;
    }
final-suspend :
    co_await promise.final_suspend() ;
}
Below that, initial-await-resume-called is described as "initially false and is set to true immediately before the evaluation of the await-resume expression of the initial await expression". It seems to guard against calling unhandled_exception unless the coroutine has been resumed from its initial suspend point. In my test playground program, unhandled_exception is invoked in all cases, so that seems to indicate the compilers do know we are past the initial suspend point when the exception is thrown.
Since my unhandled_exception unconditionally rethrows the exception (with throw;), final_suspend is never processed. Since the promise in the above transformation is in the same scope as the final suspend processing, it seems to me that the promise should probably always be destroyed automatically by the compiler in this case. However, entering the body of await_suspend and resuming the coroutine via .resume() both affect the interpretation of this in ways that are unclear to me, which is probably why the compilers behave differently. For example, in my test playground program, moving the throw 1; statement to after co_await std::suspend_always{}; results in completely normal and consistent behavior in all three compilers: the compilers never destroy the coroutine, leaving the responsibility to my user code.
I think the main distinction here is about whether the initial suspend awaiter exits its await_suspend via returning normally or via exception, but this doesn't seem to be elaborated upon in the standard, at least not that I can find. I am aware that under normal circumstances, exiting await_suspend via exception is supposed to resume the coroutine and rethrow the exception, and resuming a coroutine that is already at its final suspend point is undefined behavior, but I was under the impression that the initial suspend point gets special treatment for situations exactly like this.
In my test playground program, if I add a throw 1; before h.resume();, all three compilers agree that they are the ones who should destroy the coroutine, and the intermediary type ends up doing a double-destroy. This again adds to the idea that the promise should be destroyed by the compiler. I would guess that what is tricking GCC and Clang is the call to .resume() inside await_suspend before it exits via exception. This adds to the theory that I am invoking undefined behavior, in which case I will have to use a different workaround from the one I proposed earlier.
A second workaround I came up with is to instead symmetric transfer to a monitoring coroutine whose only job is to call .resume(). That way, we exit the initial awaiter's await_suspend by returning normally, and the exception can still come back to the caller of the original task function. But this approach means the monitoring coroutine has to rethrow from its unhandled_exception too, which means each monitoring coroutine is single-use and has to be freshly allocated for every normal coroutine. That's terribly annoying to have to deal with when all I really want to know is whether the coroutine fully suspended at any point. It seems roughly equal in complexity to the original approach of using await_transform to track it manually, although the monitoring coroutine would at least cut out the inefficient bookkeeping from every awaiter. I'd really like to avoid both of those approaches though, because the initial suspend tracker seems so elegant in comparison.
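std::coroutine_handle<Monitor::promise_type> await_suspend(std::coroutine_handle<promise_type> h)
{
    // The monitor is itself a coroutine (a capture-less lambda) that resumes the real one.
    Monitor monitor([](std::coroutine_handle<promise_type> h)
    -> Monitor
    {
        h.resume();
        h.promise().everSuspended = true;
        co_return;
    }(h));
    // Release ownership from the Monitor object and symmetric-transfer into it.
    return std::exchange(monitor.h, {});
}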
I tried out this second workaround in the test playground program: https://compiler-explorer.com/z/EG8hxxc7M
It ends up being somewhat hacky, and the results are both better and worse. GCC and Clang still behave very differently, but at least there are no leaks or double destroys. MSVC encounters an internal compiler error, so I have no idea what it would do.
What's going on here? Who is really supposed to be responsible for destroying the coroutine in the original case? Am I really invoking undefined behavior? Is there anything I can do to know if the coroutine has ever suspended other than the manual bookkeeping with await_transform?
get_return_object can return the coroutine handle or an intermediary, and the Task constructor can be the one to .resume() the coroutine: compiler-explorer.com/z/x69Tvxz3E. Note that this is only safe for single-threaded coroutines! Multi-threaded coroutines must still track state with await_transform.