2019-12-16 06:57 CET

View Issue Details Jump to Notes ] Related Changesets ]
IDProjectCategoryView StatusLast Update
0005557Spring engineGeneralpublic2017-05-18 20:31
Reporterabma 
Assigned ToKloot 
PrioritynormalSeveritymajorReproducibilityhave not tried
StatusresolvedResolutionfixed 
Product Version103.0 +git 
Target Version104.0Fixed in Version103.0 +git 
Summary0005557: test_ThreadPool hangs in test "parallel"
Description52 minutes runtime and still running:

test/test_ThreadPool
Running 13 test cases...
Warning: for_mt
[ThreadPool::SetThreadCount][1] wanted=8 current=1 maximum=8
[ThreadPool::SetThreadCount][2] workers=7
Warning: for_mt with stepSize
Warning: parallel


 4372 abma 20 0 20,000t 23732 4756 R 101,2 0,1 52:02.21 test_ThreadPool


(gdb) bt
#0 0x000000000047ee65 in load (__m=std::memory_order_relaxed, this=0x60b00000a400) at /usr/include/c++/4.9/bits/atomic_base.h:500
#1 moodycamel::ConcurrentQueue<ITaskGroup*, moodycamel::ConcurrentQueueDefaultTraits>::ProducerBase::size_approx (this=0x60b00000a3e0)
    at rts/System/ConcurrentQueue.h:1646
#2 0x000000000047ddd9 in moodycamel::ConcurrentQueue<ITaskGroup*, moodycamel::ConcurrentQueueDefaultTraits>::try_dequeue<ITaskGroup*> (
    this=0x4fd940 <taskQueues>, item=@0x7ffd208279a0: 0x0) at rts/System/ConcurrentQueue.h:1041
0000003 0x0000000000475cb7 in ThreadPool::DoTask (tid=0, async=false) at rts/System/Threading/ThreadPool.cpp:152
0000004 0x0000000000476f13 in ThreadPool::WaitForFinished(std::shared_ptr<ITaskGroup>&&) (
    taskGroup=<unknown type in /mnt/tmp/home/dev/spring/develop/test/test_ThreadPool, CU 0x64ccd, DIE 0x8f8b3>) at rts/System/Threading/ThreadPool.cpp:280
0000005 0x000000000045065c in ThreadPool::WaitForFinished<Parallel2TaskGroup<testParallel::test_method()::<lambda()> > >(std::shared_ptr<Parallel2TaskGroup<testParallel::test_method()::<lambda()> > > &) (taskGroup=warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<Parallel2TaskGroup<testParallel::test_method()::{lambda()#1}>*, (__gnu_cxx::_Lock_policy)2>'
warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<Parallel2TaskGroup<testParallel::test_method()::{lambda()#1}>*, (__gnu_cxx::_Lock_policy)2>'
std::shared_ptr (count 3, weak 0) 0x60700001f5f0) at rts/System/Threading/ThreadPool.h:175
#6 0x000000000044e22f in parallel<testParallel::test_method()::<lambda()> >(<unknown type in /mnt/tmp/home/dev/spring/develop/test/test_ThreadPool, CU 0x64e, DIE 0x43e1c>) (f=<unknown type in /mnt/tmp/home/dev/spring/develop/test/test_ThreadPool, CU 0x64e, DIE 0x43e1c>) at rts/System/Threading/ThreadPool.h:573
#7 0x0000000000449cba in testParallel::test_method (this=0x7ffd20827da0) at test/engine/System/testThreadPool.cpp:86
#8 0x0000000000449620 in testParallel_invoker () at test/engine/System/testThreadPool.cpp:76
#9 0x0000000000474b31 in boost::unit_test::ut_detail::invoker<boost::unit_test::ut_detail::unused>::invoke<void (*)()> (this=0x7ffd20827e5f,
    f=@0x60200000ee58: 0x4495a1 <testParallel_invoker()>) at /usr/include/boost/test/utils/callback.hpp:56
0000010 0x0000000000474244 in boost::unit_test::ut_detail::callback0_impl_t<boost::unit_test::ut_detail::unused, void (*)()>::invoke (this=0x60200000ee50)
    at /usr/include/boost/test/utils/callback.hpp:89
#11 0x00007fe401c648f1 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0
0000012 0x00007fe401c3b6de in boost::execution_monitor::catch_signals(boost::unit_test::callback0<int> const&) ()
   from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0
0000013 0x00007fe401c3bf33 in boost::execution_monitor::execute(boost::unit_test::callback0<int> const&) ()
   from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0
0000014 0x00007fe401c64a05 in boost::unit_test::unit_test_monitor_t::execute_and_translate(boost::unit_test::test_case const&) ()
   from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0
#15 0x00007fe401c4b4af in boost::unit_test::framework_impl::visit(boost::unit_test::test_case const&) ()
   from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0
#16 0x00007fe401c7fa33 in boost::unit_test::traverse_test_tree(boost::unit_test::test_suite const&, boost::unit_test::test_tree_visitor&) ()
   from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0
#17 0x00007fe401c469fa in boost::unit_test::framework::run(unsigned long, bool) () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0
#18 0x00007fe401c62697 in boost::unit_test::unit_test_main(bool (*)(), int, char**) () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0
#19 0x0000000000447a4f in main (argc=1, argv=0x7ffd20828c88) at /usr/include/boost/test/unit_test.hpp:59


it was much faster on the same machine some time ago -> regression bug
Steps To Reproduceno clue, IMO special on the machine it is so slow is:

- debug build
- debian stable / gcc 4.9.2

run
test/test_ThreadPool
TagsNo tags attached.
Checked infolog.txt for Errors
Attached Files

-Relationships
+Relationships

-Notes

~0017624

Kloot (developer)

it's slower now because ThreadPool::WaitForFinished *blocks* the calling thread(s).

the UT may need to be updated, but that can wait.

~0017644

Kloot (developer)

Last edited: 2017-05-17 20:09

View 3 revisions

threadpool-test runs without any unusual slowdown(s) on my system (it never takes more than 10 seconds even as a debug build), so I'll need stronger evidence that this is a general problem before going further.

~0017645

abma (administrator)

its running on an AMD cpu here, i guess thats related.

what info is needed for further investigation?

~0017646

abma (administrator)

output of "perf top":

  20,73% [vdso] [.] __vdso_clock_gettime
   8,17% test_ThreadPool [.] spring_time::spring_time_native(long)
   6,80% test_ThreadPool [.] moodycamel::ConcurrentQueue<ITaskGroup*,
   5,96% test_ThreadPool [.] ThreadPool::DoTask(int, bool)
   5,72% test_ThreadPool [.] boost::exception_detail::refcount_ptr<bo
   5,16% test_ThreadPool [.] std::atomic<moodycamel::ConcurrentQueue<
   4,57% test_ThreadPool [.] bool moodycamel::ConcurrentQueue<ITaskGr
   4,02% test_ThreadPool [.] ITaskGroup::IsFinished() const
   3,78% test_ThreadPool [.] std::operator&(std::memory_order, std::_
   3,45% test_ThreadPool [.] ThreadPool::WaitForFinished(std::shared_
   2,84% [kernel] [k] acpi_idle_do_entry
   2,59% test_ThreadPool [.] int const& std::max<int>(int const&, int
   2,40% test_ThreadPool [.] moodycamel::ConcurrentQueue<ITaskGroup*,
   2,18% test_ThreadPool [.] boost::exception_detail::refcount_ptr<bo
   1,96% test_ThreadPool [.] spring_time::operator-(spring_time) cons
   1,68% test_ThreadPool [.] std::__array_traits<moodycamel::Concurre
   1,65% test_ThreadPool [.] std::chrono::duration<long, std::ratio<1
   1,62% libasan.so.1.0.0 [.] __interceptor_clock_gettime
   1,48% test_ThreadPool [.] bool moodycamel::details::circular_less_
   1,19% test_ThreadPool [.] spring_time::operator<(spring_time) cons
   0,94% test_ThreadPool [.] std::_Rb_tree_const_iterator<void (*)(in
   0,92% test_ThreadPool [.] std::array<moodycamel::ConcurrentQueue<I
   0,87% test_ThreadPool [.] spring_clock::GetTicks()
   0,73% test_ThreadPool [.] spring_time::getstarttime()
   0,71% test_ThreadPool [.] spring_time::getelapsedtime()
   0,70% test_ThreadPool [.] spring_time::gettime(bool)
   0,60% test_ThreadPool [.] std::array<bool, 16ul>::operator[](unsig
   0,54% test_ThreadPool [.] std::__array_traits<bool, 16ul>::_S_ref(
   0,54% test_ThreadPool [.] std::enable_if<std::chrono::__is_duratio
   0,44% libc-2.19.so [.] __clock_gettime
   0,40% libstdc++.so.6.0.20 [.] std::chrono::_V2::system_clock::now()
   0,11% libstdc++.so.6.0.20 [.] clock_gettime@plt
   0,10% test_ThreadPool [.] _ZNSt6chrono3_V212system_clock3nowEv@plt
   0,07% [kernel] [k] _raw_spin_lock
   0,05% libc-2.19.so [.] __GI___strcmp_ssse3
   0,05% perf_3.16 [.] __hists__add_entry
   0,05% libgdk-x11-2.0.so.0.2400.25 [.] gdk_cairo_set_source_pixbuf
   0,05% perf_3.16 [.] dso__find_symbol

~0017647

abma (administrator)

sorry, more specific: its not slow but it seems to never end / hang.

~0017648

Kloot (developer)

how does the test behave if you uncomment https://github.com/spring/spring/blob/develop/rts/System/Threading/ThreadPool.cpp#L31 ?

~0017649

abma (administrator)

hangs, too:

[ThreadPool::SetThreadCount][2] workers=7
Warning: for_mt with stepSize
Warning: parallel
^C
Program received signal SIGINT, Interrupt.
0x0000000000415a77 in ThreadPool::DoTask (tid=tid@entry=0, async=async@entry=false) at rts/System/Threading/ThreadPool.cpp:146
146 for (int idx = 0; idx <= tid; idx += std::max(tid, 1)) {
(gdb) bt
#0 0x0000000000415a77 in ThreadPool::DoTask (tid=tid@entry=0, async=async@entry=false) at rts/System/Threading/ThreadPool.cpp:146
#1 0x000000000041604a in ThreadPool::WaitForFinished(std::shared_ptr<ITaskGroup>&&) (
    taskGroup=taskGroup@entry=<unknown type in /mnt/tmp/home/dev/spring/develop/test/test_ThreadPool, CU 0x9a893, DIE 0xd822c>)
    at rts/System/Threading/ThreadPool.cpp:280
#2 0x000000000040f656 in WaitForFinished<Parallel2TaskGroup<testParallel::test_method()::<lambda()> > > (taskGroup=<synthetic pointer>)
    at rts/System/Threading/ThreadPool.h:175
0000003 parallel<testParallel::test_method()::<lambda()> > (
    f=<unknown type in /mnt/tmp/home/dev/spring/develop/test/test_ThreadPool, CU 0x0, DIE 0x83bd1>) at rts/System/Threading/ThreadPool.h:573
0000004 testParallel::test_method (this=this@entry=0x7fffffffdaef) at test/engine/System/testThreadPool.cpp:86
0000005 0x000000000040f9ee in testParallel_invoker () at test/engine/System/testThreadPool.cpp:76
#6 0x0000000000411937 in invoke<void (*)()> (this=<optimized out>, f=<optimized out>) at /usr/include/boost/test/utils/callback.hpp:56
#7 boost::unit_test::ut_detail::callback0_impl_t<boost::unit_test::ut_detail::unused, void (*)()>::invoke (this=<optimized out>)
    at /usr/include/boost/test/utils/callback.hpp:89
#8 0x00007ffff7b9a8f1 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0
#9 0x00007ffff7b716de in boost::execution_monitor::catch_signals(boost::unit_test::callback0<int> const&) ()
   from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0
0000010 0x00007ffff7b71f33 in boost::execution_monitor::execute(boost::unit_test::callback0<int> const&) ()
   from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0
#11 0x00007ffff7b9aa05 in boost::unit_test::unit_test_monitor_t::execute_and_translate(boost::unit_test::test_case const&) ()
   from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0
0000012 0x00007ffff7b814af in boost::unit_test::framework_impl::visit(boost::unit_test::test_case const&) ()
   from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0
0000013 0x00007ffff7bb5a33 in boost::unit_test::traverse_test_tree(boost::unit_test::test_suite const&, boost::unit_test::test_tree_visitor&) ()
   from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0
0000014 0x00007ffff7b7c9fa in boost::unit_test::framework::run(unsigned long, bool) ()
   from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0
#15 0x00007ffff7b98697 in boost::unit_test::unit_test_main(bool (*)(), int, char**) ()
   from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0
#16 0x00007ffff6536b45 in __libc_start_main (main=0x4091d0 <main(int, char**)>, argc=1, argv=0x7fffffffe8f8, init=<optimized out>,
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe8e8) at libc-start.c:287
#17 0x0000000000409fef in _start ()

~0017650

Kloot (developer)

Last edited: 2017-05-17 22:35

View 2 revisions

that can only mean at least one task enters the pool but never exits it, or if it does complete that the result does not become visible in WaitForFinished.

I have never seen either case happen, no idea how to debug this atm.

~0017651

abma (administrator)

bisected.
e342a2dde71daf9c475a140ed6c897b0b2d9dfb7 is the last good revision

~0017652

Kloot (developer)

btw, do other parts of the UT also hang or just testParallel (https://github.com/spring/spring/blob/develop/test/engine/System/testThreadPool.cpp#L76)?

~0017653

abma (administrator)

Last edited: 2017-05-17 22:52

View 2 revisions

"reaction times"(=testThreadPool8) & testThreadPool10 seems to never end, too, the other tests work.

~0017654

abma (administrator)

Last edited: 2017-05-17 23:13

View 4 revisions

interesting, when i move the hanging tests to the top it doesn't hang:

http://paste.springfiles.com/view/raw/be71997e

edit:

very likely because ThreadPool::SetThreadCount() isn't called :-|

~0017656

Kloot (developer)

Last edited: 2017-05-18 00:27

View 2 revisions

I'd say isolate the absolute minimum test-case that hangs, strip everything else.

edit: reproduced it by accident, will investigate later.

~0017657

Anonymous (viewer)

Fix 3955a005fe5b952c0242806f0c00ca3ffe563d28 committed to develop branch: fix 0005557, repo: spring changeset id: 8226

~0017658

abma (administrator)

nice, thanks!

seems to work! :)

~0017659

abma (administrator)

oh, weird: on some machines the test failes:

test/engine/System/testThreadPool.cpp(90): error in "testParallel": check runs[i] == (1 * (i > 0)) failed

~0017660

Kloot (developer)

the pool seems to be empty during some UT runs, so either a misconfiguration or the number of hardware cores isn't detected correctly.
+Notes

+Related Changesets

-Issue History
Date Modified Username Field Change
2017-05-14 18:22 abma New Issue
2017-05-14 18:23 abma Description Updated View Revisions
2017-05-14 18:23 abma Steps to Reproduce Updated View Revisions
2017-05-14 18:24 abma Severity minor => major
2017-05-14 18:24 abma Description Updated View Revisions
2017-05-14 18:31 Kloot Note Added: 0017624
2017-05-17 19:57 Kloot Note Added: 0017644
2017-05-17 20:07 Kloot Note Edited: 0017644 View Revisions
2017-05-17 20:09 Kloot Note Edited: 0017644 View Revisions
2017-05-17 21:23 abma Note Added: 0017645
2017-05-17 21:25 abma Note Added: 0017646
2017-05-17 22:09 abma Summary test_ThreadPool is SLOW => test_ThreadPool hangs
2017-05-17 22:10 abma Note Added: 0017647
2017-05-17 22:14 Kloot Note Added: 0017648
2017-05-17 22:22 abma Note Added: 0017649
2017-05-17 22:34 Kloot Note Added: 0017650
2017-05-17 22:35 Kloot Note Edited: 0017650 View Revisions
2017-05-17 22:37 abma Note Added: 0017651
2017-05-17 22:37 abma Summary test_ThreadPool hangs => test_ThreadPool hangs in test "parallel"
2017-05-17 22:39 Kloot Note Added: 0017652
2017-05-17 22:51 abma Note Added: 0017653
2017-05-17 22:52 abma Note Edited: 0017653 View Revisions
2017-05-17 23:04 abma Note Added: 0017654
2017-05-17 23:04 abma Note Edited: 0017654 View Revisions
2017-05-17 23:13 abma Note Edited: 0017654 View Revisions
2017-05-17 23:13 abma Note Edited: 0017654 View Revisions
2017-05-18 00:25 Kloot Note Added: 0017656
2017-05-18 00:27 Kloot Note Edited: 0017656 View Revisions
2017-05-18 12:26 Kloot Assigned To => Kloot
2017-05-18 12:26 Kloot Status new => assigned
2017-05-18 19:14 Changeset attached => spring develop 3955a005
2017-05-18 19:14 Anonymous Note Added: 0017657
2017-05-18 19:16 Kloot Status assigned => resolved
2017-05-18 19:16 Kloot Resolution open => fixed
2017-05-18 19:16 Kloot Fixed in Version => 103.0 +git
2017-05-18 19:22 abma Note Added: 0017658
2017-05-18 19:49 abma Note Added: 0017659
2017-05-18 20:31 Kloot Note Added: 0017660
+Issue History