I am debugging a segfault reported by TSAN in the CI of Boost.Beast.
I strongly believe it to be a false positive, but I don't know what to look for in order to suppress it.
It seems to me from the stack trace that the code is correctly instrumented. The code passes all other tests, including valgrind, ubsan, etc.
I'm hoping some kind expert may put me out of my misery.
Here is the output:
====== BEGIN OUTPUT ======
beast.http.read
ThreadSanitizer:DEADLYSIGNAL
==132842==ERROR: ThreadSanitizer: SEGV on unknown address 0x7ff5d9cff000 (pc 0x7ff5dceba0d0 bp 0x000000000000 sp 0x7ff5d9c3d910 T132844)
==132842==The signal is caused by a READ memory access.
#0 __sanitizer::StackDepotBase<__sanitizer::StackDepotNode, 1, 20>::Put(__sanitizer::StackTrace, bool*) <null> (libtsan.so.2+0xba0d0)
#1 __tsan::CurrentStackId(__tsan::ThreadState*, unsigned long) <null> (libtsan.so.2+0x8c48f)
#2 __sanitizer::DD::MutexInit(__sanitizer::DDCallback*, __sanitizer::DDMutex*) <null> (libtsan.so.2+0xac534)
#3 __tsan::DDMutexInit(__tsan::ThreadState*, unsigned long, __tsan::SyncVar*) <null> (libtsan.so.2+0x9a3f8)
#4 __tsan::MetaMap::GetSync(__tsan::ThreadState*, unsigned long, unsigned long, bool, bool) <null> (libtsan.so.2+0xa85dc)
#5 __tsan_atomic32_fetch_add <null> (libtsan.so.2+0x783e9)
#6 __gnu_cxx::__exchange_and_add(int volatile*, int) /usr/include/c++/12/ext/atomicity.h:66 (read+0x4188f8)
#7 __gnu_cxx::__exchange_and_add_dispatch(int*, int) /usr/include/c++/12/ext/atomicity.h:101 (read+0x4188f8)
#8 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release_last_use() /usr/include/c++/12/bits/shared_ptr_base.h:187 (read+0x4188f8)
#9 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() /usr/include/c++/12/bits/shared_ptr_base.h:361 (read+0x40c592)
#10 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() /usr/include/c++/12/bits/shared_ptr_base.h:1071 (read+0x418fee)
#11 std::__shared_ptr<boost::asio::detail::strand_executor_service::strand_impl, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() /usr/include/c++/12/bits/shared_ptr_base.h:1524 (read+0x420f83)
#12 std::shared_ptr<boost::asio::detail::strand_executor_service::strand_impl>::~shared_ptr() /usr/include/c++/12/bits/shared_ptr.h:175 (read+0x420faf)
#13 boost::asio::detail::strand_executor_service::invoker<boost::asio::io_context::basic_executor_type<std::allocator<void>, 0ul> const, void>::~invoker() <null> (read+0x471f5f)
#14 boost::asio::detail::executor_op<boost::asio::detail::strand_executor_service::invoker<boost::asio::io_context::basic_executor_type<std::allocator<void>, 0ul> const, void>, boost::asio::detail::recycling_allocator<void, boost::asio::detail::thread_info_base::default_tag>, boost::asio::detail::scheduler_operation>::do_complete(void*, boost::asio::detail::scheduler_operation*, boost::system::error_code const&, unsigned long) <null> (read+0x48b0ac)
#15 boost::asio::detail::scheduler_operation::complete(void*, boost::system::error_code const&, unsigned long) boost/asio/detail/scheduler_operation.hpp:40 (read+0x4fa38e)
#16 boost::asio::detail::scheduler::do_run_one(boost::asio::detail::conditionally_enabled_mutex::scoped_lock&, boost::asio::detail::scheduler_thread_info&, boost::system::error_code const&) boost/asio/detail/impl/scheduler.ipp:492 (read+0x4e8835)
#17 boost::asio::detail::scheduler::run(boost::system::error_code&) boost/asio/detail/impl/scheduler.ipp:210 (read+0x4e74fb)
#18 boost::asio::io_context::run() boost/asio/impl/io_context.ipp:63 (read+0x4dc122)
#19 boost::beast::test::enable_yield_to::enable_yield_to(unsigned long)::{lambda()#1}::operator()() const <null> (read+0x412363)
#20 void std::__invoke_impl<void, boost::beast::test::enable_yield_to::enable_yield_to(unsigned long)::{lambda()#1}>(std::__invoke_other, boost::beast::test::enable_yield_to::enable_yield_to(unsigned long)::{lambda()#1}&&) <null> (read+0x4a1f92)
#21 std::__invoke_result<boost::beast::test::enable_yield_to::enable_yield_to(unsigned long)::{lambda()#1}>::type std::__invoke<boost::beast::test::enable_yield_to::enable_yield_to(unsigned long)::{lambda()#1}>(boost::beast::test::enable_yield_to::enable_yield_to(unsigned long)::{lambda()#1}&&) <null> (read+0x49f9dc)
#22 void std::thread::_Invoker<std::tuple<boost::beast::test::enable_yield_to::enable_yield_to(unsigned long)::{lambda()#1}> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) <null> (read+0x49c90a)
#23 std::thread::_Invoker<std::tuple<boost::beast::test::enable_yield_to::enable_yield_to(unsigned long)::{lambda()#1}> >::operator()() <null> (read+0x49941e)
#24 std::thread::_State_impl<std::thread::_Invoker<std::tuple<boost::beast::test::enable_yield_to::enable_yield_to(unsigned long)::{lambda()#1}> > >::_M_run() <null> (read+0x494c2a)
#25 execute_native_thread_routine <null> (libstdc++.so.6+0xdbb72)
#26 __tsan_thread_start_func <null> (libtsan.so.2+0x393ef)
#27 start_thread <null> (libc.so.6+0x8ce2c)
#28 clone3 <null> (libc.so.6+0x1121af)
ThreadSanitizer can not provide additional info.
SUMMARY: ThreadSanitizer: SEGV (/lib64/libtsan.so.2+0xba0d0) in __sanitizer::StackDepotBase<__sanitizer::StackDepotNode, 1, 20>::Put(__sanitizer::StackTrace, bool*)
==132842==ABORTING
The code being tested is latest master branch.
Command line to reproduce:
$ ./b2 toolset=gcc thread-sanitizer=norecover link=static variant=debug libs/beast/test -q -d+2 -j1
My compiler info:
$ gcc --version
gcc (GCC) 12.1.1 20220507 (Red Hat 12.1.1-1)
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
My OS is Fedora 36. But we see this happen on Ubuntu as well.
Using the b2 line I could repro the SEGV on Linux using master (00293a6adb5 from the superproject).
I started with a - to me - more convenient setup based on CMake. I modified the CMake to use thread,undefined sanitizers instead of address,undefined for VARIANT=ubasan.
Interestingly, it doesn't segfault. It does however seem to have a legit TSAN violation in basic_stream.cpp, where the effective flags are:
Breaking it down for readability:
The reported diagnostic: https://paste.ubuntu.com/p/6SKjmZ9wFT/ (lines truncated for SO):
Given this observation, I thought to see whether excluding the offending TU (basic_stream.cpp) from the b2 build removes the SEGV. No such luck.
On the contrary, the SEGV manifests with the following TUs:
buffered_read_stream.cpp
read.cpp
write.cpp
close.cpp
handshake.cpp
ping.cpp
read2.cpp
write.cpp
http_examples.cpp
Dropping these TUs from their respective test/**/Jamfile allows all remaining tests to pass TSAN under b2.
Now, I did some soul searching and e.g. unique include diving using a script like:
Which uncovers a common subset of 854 includes. 40 of the beast headers are in that consistent set, but 208 are asio headers.
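For readers who want to repeat the include diving, here is a minimal Python sketch of one way to compute such a common subset. It is not the script used above. It assumes each failing TU has a make-style dependency file (the kind gcc writes with -M or -MMD), and the names parse_depfile and common_includes are made up for illustration.

```python
def parse_depfile(text: str) -> set[str]:
    """Return the prerequisite paths listed in make-style dependency text."""
    # Join backslash-continued lines so each rule sits on one line.
    joined = text.replace("\\\n", " ")
    deps: set[str] = set()
    for line in joined.splitlines():
        # Everything after "target:" is a whitespace-separated list of files.
        _, sep, rhs = line.partition(":")
        if sep:
            deps.update(rhs.split())
    return deps


def common_includes(depfiles: dict[str, str]) -> set[str]:
    """Intersect the include sets of all TUs (hypothetical helper).

    `depfiles` maps a TU name to the text of its dependency file.
    """
    sets = [parse_depfile(text) for text in depfiles.values()]
    return set.intersection(*sets) if sets else set()
```

Feeding it the dependency files of only the failing TUs and sorting the result gives a list that can be split by path prefix (beast vs. asio) to get counts like the ones quoted above.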
Questions from here:
Choosing to address these in the order 3., 1., 2. (optimizing for return-on-effort)
3. Is a recent Asio change involved? [YES]
Doing the b2 test with only Asio reverted to 1.79.0 (e929e5cf Merge asio from 'develop') passes all the tests cleanly.
Just to check that no compiler flags were harmed in the process, e.g. buffered_read_stream.cpp showed the same exact commands:
So, now we know that something inside the 208 Asio headers must be involved.
1. Why is SEGV not happening in the CMake build?
Surely, here we should be able to spot a difference in compilation flags? Ever-so-slightly redacted to remove spelling differences (left = CMake, right = b2):
I used the process of elimination and figured out that the culprit is -fno-inline -O0. Somehow it leads to recursive TSAN errors:
Observations without -fno-inline:
-O2: the symptoms go away
-O1: the symptoms go away if NDEBUG is defined
-O0: the symptom is there regardless
This suggests that an NDEBUG-sensitive piece of code could be involved. This might be a lead to guide minimization.
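The process of elimination over compiler flags can be automated. The following is a rough Python sketch under stated assumptions: the reproduces callback is a placeholder for "rebuild the test with these flags and check whether TSAN crashes", and fake_reproduces is a made-up stand-in so the example runs without a compiler. It is a greedy one-flag-at-a-time reduction, so the result is minimal only in the sense that no single remaining flag can be dropped.

```python
from typing import Callable, Sequence


def minimize_flags(flags: Sequence[str],
                   reproduces: Callable[[list[str]], bool]) -> list[str]:
    """Greedily drop flags that are not needed to reproduce the failure."""
    kept = list(flags)
    if not reproduces(kept):
        raise ValueError("the full flag set does not reproduce the failure")
    changed = True
    while changed:
        changed = False
        for flag in list(kept):
            # Try the build without this one flag.
            trial = [f for f in kept if f != flag]
            if reproduces(trial):
                kept = trial
                changed = True
    return kept


def fake_reproduces(flags: list[str]) -> bool:
    """Hypothetical stand-in for a real rebuild-and-run step."""
    needed = {"-fsanitize=thread", "-fno-inline", "-O0"}
    return needed.issubset(flags)
```

With the stand-in, minimize_flags(["-g", "-fno-inline", "-O0", "-Wall", "-fsanitize=thread"], fake_reproduces) keeps only the three flags the stand-in requires. In a real run each call to reproduces is a full rebuild, so it pays to start from the already-diffed flag set rather than the whole command line.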
Comparing preprocessed sources may highlight specific possible causes. My main suspicions are the revamped spawned_thread_base in asio/spawn.hpp, source-locations and/or cancellation slots.
As a courtesy, here's the preprocessed-buffered-stream-reader.zip containing 4 files (/home/sehe/{with,without}-NDEBUG.i.asio1.{79,80}.0).
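Raw .i files differ on almost every line because of linemarkers, so a plain diff is noisy. Below is a small Python sketch of one way to compare two preprocessed sources. It assumes GCC-style linemarkers (lines of the form # 123 "file" ...), and the helper names normalize and diff_preprocessed are illustrative, not part of any tool mentioned above.

```python
import difflib
import re

# GCC linemarkers look like: # 123 "path/to/file.hpp" 1 3
LINEMARKER = re.compile(r'^#\s+\d+\s+"')


def normalize(preprocessed: str) -> list[str]:
    """Drop linemarkers and blank lines so only real code is compared."""
    out = []
    for line in preprocessed.splitlines():
        stripped = line.strip()
        if not stripped or LINEMARKER.match(stripped):
            continue
        out.append(stripped)
    return out


def diff_preprocessed(old: str, new: str,
                      old_name: str = "old.i",
                      new_name: str = "new.i") -> list[str]:
    """Unified diff (no context lines) of two normalized preprocessed sources."""
    return list(difflib.unified_diff(normalize(old), normalize(new),
                                     fromfile=old_name, tofile=new_name,
                                     lineterm="", n=0))
```

Running it once on the with/without NDEBUG pair and once on the 1.79/1.80 pair, then looking at what shows up in both diffs, should narrow the search to code that is both new in Asio and assertion-sensitive.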