View Issue Details [ Jump to Notes ] [ Related Changesets ] | [ Issue History ] [ Print ] | ||||||||
ID | Project | Category | View Status | Date Submitted | Last Update | ||||
---|---|---|---|---|---|---|---|---|---|
0005557 | Spring engine | General | public | 2017-05-14 18:22 | 2017-05-18 20:31 | ||||
Reporter | abma | ||||||||
Assigned To | Kloot | ||||||||
Priority | normal | Severity | major | Reproducibility | have not tried | ||||
Status | resolved | Resolution | fixed | ||||||
Product Version | 103.0 +git | ||||||||
Target Version | 104.0 | Fixed in Version | 103.0 +git | ||||||
Summary | 0005557: test_ThreadPool hangs in test "parallel" | ||||||||
Description | 52 minutes runtime and still running: test/test_ThreadPool Running 13 test cases... Warning: for_mt [ThreadPool::SetThreadCount][1] wanted=8 current=1 maximum=8 [ThreadPool::SetThreadCount][2] workers=7 Warning: for_mt with stepSize Warning: parallel 4372 abma 20 0 20,000t 23732 4756 R 101,2 0,1 52:02.21 test_ThreadPool (gdb) bt #0 0x000000000047ee65 in load (__m=std::memory_order_relaxed, this=0x60b00000a400) at /usr/include/c++/4.9/bits/atomic_base.h:500 #1 moodycamel::ConcurrentQueue<ITaskGroup*, moodycamel::ConcurrentQueueDefaultTraits>::ProducerBase::size_approx (this=0x60b00000a3e0) at rts/System/ConcurrentQueue.h:1646 #2 0x000000000047ddd9 in moodycamel::ConcurrentQueue<ITaskGroup*, moodycamel::ConcurrentQueueDefaultTraits>::try_dequeue<ITaskGroup*> ( this=0x4fd940 <taskQueues>, item=@0x7ffd208279a0: 0x0) at rts/System/ConcurrentQueue.h:1041 0000003 0x0000000000475cb7 in ThreadPool::DoTask (tid=0, async=false) at rts/System/Threading/ThreadPool.cpp:152 0000004 0x0000000000476f13 in ThreadPool::WaitForFinished(std::shared_ptr<ITaskGroup>&&) ( taskGroup=<unknown type in /mnt/tmp/home/dev/spring/develop/test/test_ThreadPool, CU 0x64ccd, DIE 0x8f8b3>) at rts/System/Threading/ThreadPool.cpp:280 0000005 0x000000000045065c in ThreadPool::WaitForFinished<Parallel2TaskGroup<testParallel::test_method()::<lambda()> > >(std::shared_ptr<Parallel2TaskGroup<testParallel::test_method()::<lambda()> > > &) (taskGroup=warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<Parallel2TaskGroup<testParallel::test_method()::{lambda()#1}>*, (__gnu_cxx::_Lock_policy)2>' warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<Parallel2TaskGroup<testParallel::test_method()::{lambda()#1}>*, (__gnu_cxx::_Lock_policy)2>' std::shared_ptr (count 3, weak 0) 0x60700001f5f0) at rts/System/Threading/ThreadPool.h:175 #6 0x000000000044e22f in parallel<testParallel::test_method()::<lambda()> >(<unknown type in /mnt/tmp/home/dev/spring/develop/test/test_ThreadPool, CU 0x64e, DIE 0x43e1c>) (f=<unknown type in /mnt/tmp/home/dev/spring/develop/test/test_ThreadPool, CU 0x64e, DIE 0x43e1c>) at rts/System/Threading/ThreadPool.h:573 #7 0x0000000000449cba in testParallel::test_method (this=0x7ffd20827da0) at test/engine/System/testThreadPool.cpp:86 #8 0x0000000000449620 in testParallel_invoker () at test/engine/System/testThreadPool.cpp:76 #9 0x0000000000474b31 in boost::unit_test::ut_detail::invoker<boost::unit_test::ut_detail::unused>::invoke<void (*)()> (this=0x7ffd20827e5f, f=@0x60200000ee58: 0x4495a1 <testParallel_invoker()>) at /usr/include/boost/test/utils/callback.hpp:56 0000010 0x0000000000474244 in boost::unit_test::ut_detail::callback0_impl_t<boost::unit_test::ut_detail::unused, void (*)()>::invoke (this=0x60200000ee50) at /usr/include/boost/test/utils/callback.hpp:89 #11 0x00007fe401c648f1 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0 0000012 0x00007fe401c3b6de in boost::execution_monitor::catch_signals(boost::unit_test::callback0<int> const&) () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0 0000013 0x00007fe401c3bf33 in boost::execution_monitor::execute(boost::unit_test::callback0<int> const&) () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0 0000014 0x00007fe401c64a05 in boost::unit_test::unit_test_monitor_t::execute_and_translate(boost::unit_test::test_case const&) () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0 #15 0x00007fe401c4b4af in boost::unit_test::framework_impl::visit(boost::unit_test::test_case const&) () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0 #16 0x00007fe401c7fa33 in boost::unit_test::traverse_test_tree(boost::unit_test::test_suite const&, boost::unit_test::test_tree_visitor&) () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0 #17 0x00007fe401c469fa in boost::unit_test::framework::run(unsigned long, bool) () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0 #18 0x00007fe401c62697 in boost::unit_test::unit_test_main(bool (*)(), int, char**) () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0 #19 0x0000000000447a4f in main (argc=1, argv=0x7ffd20828c88) at /usr/include/boost/test/unit_test.hpp:59 it was much faster on the same machine some time ago -> regression bug | ||||||||
Steps To Reproduce | no clue, IMO special on the machine it is so slow is: - debug build - debian stable / gcc 4.9.2 run test/test_ThreadPool | ||||||||
Tags | No tags attached. | ||||||||
Checked infolog.txt for Errors | |||||||||
Attached Files |
|
![]() |
|
Kloot (developer) 2017-05-14 18:31 |
it's slower now because ThreadPool::WaitForFinished *blocks* the calling thread(s). the UT may need to be updated, but that can wait. |
Kloot (developer) 2017-05-17 19:57 Last edited: 2017-05-17 20:09 |
threadpool-test runs without any unusual slowdown(s) on my system (it never takes more than 10 seconds even as a debug build), so I'll need stronger evidence that this is a general problem before going further. |
abma (administrator) 2017-05-17 21:23 |
its running on an AMD cpu here, i guess thats related. what info is needed for further investigation? |
abma (administrator) 2017-05-17 21:25 |
output of "perf top": 20,73% [vdso] [.] __vdso_clock_gettime 8,17% test_ThreadPool [.] spring_time::spring_time_native(long) 6,80% test_ThreadPool [.] moodycamel::ConcurrentQueue<ITaskGroup*, 5,96% test_ThreadPool [.] ThreadPool::DoTask(int, bool) 5,72% test_ThreadPool [.] boost::exception_detail::refcount_ptr<bo 5,16% test_ThreadPool [.] std::atomic<moodycamel::ConcurrentQueue< 4,57% test_ThreadPool [.] bool moodycamel::ConcurrentQueue<ITaskGr 4,02% test_ThreadPool [.] ITaskGroup::IsFinished() const 3,78% test_ThreadPool [.] std::operator&(std::memory_order, std::_ 3,45% test_ThreadPool [.] ThreadPool::WaitForFinished(std::shared_ 2,84% [kernel] [k] acpi_idle_do_entry 2,59% test_ThreadPool [.] int const& std::max<int>(int const&, int 2,40% test_ThreadPool [.] moodycamel::ConcurrentQueue<ITaskGroup*, 2,18% test_ThreadPool [.] boost::exception_detail::refcount_ptr<bo 1,96% test_ThreadPool [.] spring_time::operator-(spring_time) cons 1,68% test_ThreadPool [.] std::__array_traits<moodycamel::Concurre 1,65% test_ThreadPool [.] std::chrono::duration<long, std::ratio<1 1,62% libasan.so.1.0.0 [.] __interceptor_clock_gettime 1,48% test_ThreadPool [.] bool moodycamel::details::circular_less_ 1,19% test_ThreadPool [.] spring_time::operator<(spring_time) cons 0,94% test_ThreadPool [.] std::_Rb_tree_const_iterator<void (*)(in 0,92% test_ThreadPool [.] std::array<moodycamel::ConcurrentQueue<I 0,87% test_ThreadPool [.] spring_clock::GetTicks() 0,73% test_ThreadPool [.] spring_time::getstarttime() 0,71% test_ThreadPool [.] spring_time::getelapsedtime() 0,70% test_ThreadPool [.] spring_time::gettime(bool) 0,60% test_ThreadPool [.] std::array<bool, 16ul>::operator[](unsig 0,54% test_ThreadPool [.] std::__array_traits<bool, 16ul>::_S_ref( 0,54% test_ThreadPool [.] std::enable_if<std::chrono::__is_duratio 0,44% libc-2.19.so [.] __clock_gettime 0,40% libstdc++.so.6.0.20 [.] std::chrono::_V2::system_clock::now() 0,11% libstdc++.so.6.0.20 [.] clock_gettime@plt 0,10% test_ThreadPool [.] _ZNSt6chrono3_V212system_clock3nowEv@plt 0,07% [kernel] [k] _raw_spin_lock 0,05% libc-2.19.so [.] __GI___strcmp_ssse3 0,05% perf_3.16 [.] __hists__add_entry 0,05% libgdk-x11-2.0.so.0.2400.25 [.] gdk_cairo_set_source_pixbuf 0,05% perf_3.16 [.] dso__find_symbol |
abma (administrator) 2017-05-17 22:10 |
sorry, more specific: its not slow but it seems to never end / hang. |
Kloot (developer) 2017-05-17 22:14 |
how does the test behave if you uncomment https://github.com/spring/spring/blob/develop/rts/System/Threading/ThreadPool.cpp#L31 ? |
abma (administrator) 2017-05-17 22:22 |
hangs, too: [ThreadPool::SetThreadCount][2] workers=7 Warning: for_mt with stepSize Warning: parallel ^C Program received signal SIGINT, Interrupt. 0x0000000000415a77 in ThreadPool::DoTask (tid=tid@entry=0, async=async@entry=false) at rts/System/Threading/ThreadPool.cpp:146 146 for (int idx = 0; idx <= tid; idx += std::max(tid, 1)) { (gdb) bt #0 0x0000000000415a77 in ThreadPool::DoTask (tid=tid@entry=0, async=async@entry=false) at rts/System/Threading/ThreadPool.cpp:146 #1 0x000000000041604a in ThreadPool::WaitForFinished(std::shared_ptr<ITaskGroup>&&) ( taskGroup=taskGroup@entry=<unknown type in /mnt/tmp/home/dev/spring/develop/test/test_ThreadPool, CU 0x9a893, DIE 0xd822c>) at rts/System/Threading/ThreadPool.cpp:280 #2 0x000000000040f656 in WaitForFinished<Parallel2TaskGroup<testParallel::test_method()::<lambda()> > > (taskGroup=<synthetic pointer>) at rts/System/Threading/ThreadPool.h:175 0000003 parallel<testParallel::test_method()::<lambda()> > ( f=<unknown type in /mnt/tmp/home/dev/spring/develop/test/test_ThreadPool, CU 0x0, DIE 0x83bd1>) at rts/System/Threading/ThreadPool.h:573 0000004 testParallel::test_method (this=this@entry=0x7fffffffdaef) at test/engine/System/testThreadPool.cpp:86 0000005 0x000000000040f9ee in testParallel_invoker () at test/engine/System/testThreadPool.cpp:76 #6 0x0000000000411937 in invoke<void (*)()> (this=<optimized out>, f=<optimized out>) at /usr/include/boost/test/utils/callback.hpp:56 #7 boost::unit_test::ut_detail::callback0_impl_t<boost::unit_test::ut_detail::unused, void (*)()>::invoke (this=<optimized out>) at /usr/include/boost/test/utils/callback.hpp:89 #8 0x00007ffff7b9a8f1 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0 #9 0x00007ffff7b716de in boost::execution_monitor::catch_signals(boost::unit_test::callback0<int> const&) () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0 0000010 0x00007ffff7b71f33 in boost::execution_monitor::execute(boost::unit_test::callback0<int> const&) () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0 #11 0x00007ffff7b9aa05 in boost::unit_test::unit_test_monitor_t::execute_and_translate(boost::unit_test::test_case const&) () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0 0000012 0x00007ffff7b814af in boost::unit_test::framework_impl::visit(boost::unit_test::test_case const&) () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0 0000013 0x00007ffff7bb5a33 in boost::unit_test::traverse_test_tree(boost::unit_test::test_suite const&, boost::unit_test::test_tree_visitor&) () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0 0000014 0x00007ffff7b7c9fa in boost::unit_test::framework::run(unsigned long, bool) () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0 #15 0x00007ffff7b98697 in boost::unit_test::unit_test_main(bool (*)(), int, char**) () from /usr/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.55.0 #16 0x00007ffff6536b45 in __libc_start_main (main=0x4091d0 <main(int, char**)>, argc=1, argv=0x7fffffffe8f8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe8e8) at libc-start.c:287 #17 0x0000000000409fef in _start () |
Kloot (developer) 2017-05-17 22:34 Last edited: 2017-05-17 22:35 |
that can only mean at least one task enters the pool but never exits it, or if it does complete that the result does not become visible in WaitForFinished. I have never seen either case happen, no idea how to debug this atm. |
abma (administrator) 2017-05-17 22:37 |
bisected. e342a2dde71daf9c475a140ed6c897b0b2d9dfb7 is the last good revision |
Kloot (developer) 2017-05-17 22:39 |
btw, do other parts of the UT also hang or just testParallel (https://github.com/spring/spring/blob/develop/test/engine/System/testThreadPool.cpp#L76)? |
abma (administrator) 2017-05-17 22:51 Last edited: 2017-05-17 22:52 |
"reaction times"(=testThreadPool8) & testThreadPool10 seems to never end, too, the other tests work. |
abma (administrator) 2017-05-17 23:04 Last edited: 2017-05-17 23:13 |
interesting, when i move the hanging tests to the top it doesn't hang: http://paste.springfiles.com/view/raw/be71997e edit: very likely because ThreadPool::SetThreadCount() isn't called :-| |
Kloot (developer) 2017-05-18 00:25 Last edited: 2017-05-18 00:27 |
I'd say isolate the absolute minimum test-case that hangs, strip everything else. edit: reproduced it by accident, will investigate later. |
Anonymous (viewer) 2017-05-18 19:14 |
Fix 3955a005fe5b952c0242806f0c00ca3ffe563d28 committed to develop branch: fix 0005557, repo: spring changeset id: 8226 |
abma (administrator) 2017-05-18 19:22 |
nice, thanks! seems to work! :) |
abma (administrator) 2017-05-18 19:49 |
oh, weird: on some machines the test failes: test/engine/System/testThreadPool.cpp(90): error in "testParallel": check runs[i] == (1 * (i > 0)) failed |
Kloot (developer) 2017-05-18 20:31 |
the pool seems to be empty during some UT runs, so either a misconfiguration or the number of hardware cores isn't detected correctly. |
![]() |
|||
spring: develop 3955a005
Timestamp: 2017-05-18 19:18:32 Author: rtri [ Details ] [ Diff ] |
fix 0005557 | ||
mod - rts/System/Threading/ThreadPool.cpp | [ Diff ] [ File ] | ||
mod - rts/System/Threading/ThreadPool.h | [ Diff ] [ File ] | ||
mod - test/engine/System/testThreadPool.cpp | [ Diff ] [ File ] | ||
![]() |
|||
Date Modified | Username | Field | Change |
---|---|---|---|
2017-05-14 18:22 | abma | New Issue | |
2017-05-14 18:23 | abma | Description Updated | View Revisions |
2017-05-14 18:23 | abma | Steps to Reproduce Updated | View Revisions |
2017-05-14 18:24 | abma | Severity | minor => major |
2017-05-14 18:24 | abma | Description Updated | View Revisions |
2017-05-14 18:31 | Kloot | Note Added: 0017624 | |
2017-05-17 19:57 | Kloot | Note Added: 0017644 | |
2017-05-17 20:07 | Kloot | Note Edited: 0017644 | View Revisions |
2017-05-17 20:09 | Kloot | Note Edited: 0017644 | View Revisions |
2017-05-17 21:23 | abma | Note Added: 0017645 | |
2017-05-17 21:25 | abma | Note Added: 0017646 | |
2017-05-17 22:09 | abma | Summary | test_ThreadPool is SLOW => test_ThreadPool hangs |
2017-05-17 22:10 | abma | Note Added: 0017647 | |
2017-05-17 22:14 | Kloot | Note Added: 0017648 | |
2017-05-17 22:22 | abma | Note Added: 0017649 | |
2017-05-17 22:34 | Kloot | Note Added: 0017650 | |
2017-05-17 22:35 | Kloot | Note Edited: 0017650 | View Revisions |
2017-05-17 22:37 | abma | Note Added: 0017651 | |
2017-05-17 22:37 | abma | Summary | test_ThreadPool hangs => test_ThreadPool hangs in test "parallel" |
2017-05-17 22:39 | Kloot | Note Added: 0017652 | |
2017-05-17 22:51 | abma | Note Added: 0017653 | |
2017-05-17 22:52 | abma | Note Edited: 0017653 | View Revisions |
2017-05-17 23:04 | abma | Note Added: 0017654 | |
2017-05-17 23:04 | abma | Note Edited: 0017654 | View Revisions |
2017-05-17 23:13 | abma | Note Edited: 0017654 | View Revisions |
2017-05-17 23:13 | abma | Note Edited: 0017654 | View Revisions |
2017-05-18 00:25 | Kloot | Note Added: 0017656 | |
2017-05-18 00:27 | Kloot | Note Edited: 0017656 | View Revisions |
2017-05-18 12:26 | Kloot | Assigned To | => Kloot |
2017-05-18 12:26 | Kloot | Status | new => assigned |
2017-05-18 19:14 | Changeset attached | => spring develop 3955a005 | |
2017-05-18 19:14 | Anonymous | Note Added: 0017657 | |
2017-05-18 19:16 | Kloot | Status | assigned => resolved |
2017-05-18 19:16 | Kloot | Resolution | open => fixed |
2017-05-18 19:16 | Kloot | Fixed in Version | => 103.0 +git |
2017-05-18 19:22 | abma | Note Added: 0017658 | |
2017-05-18 19:49 | abma | Note Added: 0017659 | |
2017-05-18 20:31 | Kloot | Note Added: 0017660 |