The TBB class task was designed for high-performance implementations of the TBB templates. Its efficiency, particularly its emphasis on continuation-passing style, comes at some price in convenience. Rick Molloy of Microsoft has posted a description of a task_group interface that Microsoft is considering. It's more convenient than the TBB interface, particularly when your compiler supports C++ 200x lambda expressions (Section 5.1.1 of N2606).
I implemented a subset of task_group in TBB as a header tbb/task_group.h: 37 lines of C++ and 5 preprocessor lines. It's a small subset.
wait() returns void, not task_group_status, since the blog does not detail task_group_status. But nonetheless, I think some TBB users will find this minimal form useful. For example, it's enough of task_group to write the quicksort in Molloy's post.
The code for the header follows my signature. I'd be interested to hear how useful it is.
- Arch
#ifndef __TBB_task_group_H
#define __TBB_task_group_H

#include "tbb/task.h"

namespace tbb {

class task_group;

namespace internal {

// Suppress gratuitous warnings from icc 11.0 when lambda expressions are used in instances of function_task.
#pragma warning(disable: 588)

template<typename Function>
class function_task: public task {
    Function my_func;
    /*override*/ task* execute() {
        my_func();
        return NULL;
    }
public:
    function_task( Function& f ) : my_func(f) {}
};

} // namespace internal

class task_group: internal::no_copy {
private:
    empty_task* root;
public:
    task_group() {
        root = new(task::allocate_root()) empty_task;
        root->set_ref_count(1);
    }
    ~task_group() {
        if( root->ref_count() )
            root->wait_for_all();
        root->destroy(*root);
    }
    template<typename Function>
    void run( Function f ) {
        task& self = task::self();
        self.spawn(*new( self.allocate_additional_child_of( *root )) internal::function_task<Function>(f) );
    }
    void wait() {
        root->wait_for_all();
    }
};

} // namespace tbb

#endif /* __TBB_task_group_H */
July 2, 2008 11:59 PM PDT
Andrey Marochko (Intel)
Cool work, Arch! It shows both how easily some of the MS concepts (which took them years to arrive at) can be implemented, and how inefficient the implementation can be (beware of the closed sources) :). I think that to provide MS-like exception-handling behavior you need to explicitly create an isolated context for each task_group and associate it with the root task. I'm also not quite sure that the check in the task_group destructor for the root's refcount being nonzero helps in any way. Actually, it may become zero only if the user called task::self() inside the functor, took its parent, and created a child for it without using allocate_additional_child_of(). In that case our empty root will be executed by the scheduler and destroyed, and the check itself will probably cause an access violation. I think the best thing you could do here is to call wait_for_all unconditionally, and rely on assertions inside TBB to warn about misuse.
July 3, 2008 6:31 AM PDT
Arch Robison (Intel)
I agree that it appears that each task_group will likely require a task_group_context to implement Microsoft's exception-handling semantics. As Microsoft makes more details public (such as exception-handling semantics), I'll update my TBB version to match as closely as practical. The test on root->ref_count() is intended to protect against cases where the destructor of a task_group is called before wait() is called; e.g., out of forgetfulness or because an exception was thrown. The check is necessary because calling wait_for_all unconditionally does not work if it has already been called (by task_group::wait). The reason is that root->wait_for_all() waits until root->ref_count() becomes 1, and then sets root->ref_count() to 0. So calling wait_for_all twice, without resetting the ref_count, is an error. The debug version of TBB has assertions that diagnose this error.
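To make the reference-count argument above concrete, here is a small single-threaded model of it. This is only a sketch: toy_root, guarded_destroy and unconditional_destroy are hypothetical names invented for illustration, not TBB classes or functions, and the model assumes all children have finished before anyone waits. It follows the protocol as described in the comment (wait_for_all returns once the count is 1, then sets it to 0) and flags a second wait instead of asserting.

```cpp
#include <cassert>

// Hypothetical single-threaded model of the root task's reference count.
// toy_root is NOT a TBB class; it only mirrors the protocol described above.
struct toy_root {
    int ref_count = 1;          // models set_ref_count(1): no children, 1 reserved for the waiter
    bool waited_twice = false;  // set when wait_for_all() is called on an already-waited root

    void child_spawned()  { ++ref_count; }  // models allocate_additional_child_of(*root)
    void child_finished() { --ref_count; }  // models completion of a spawned child

    // Models wait_for_all(): returns once ref_count is 1, then resets it to 0.
    void wait_for_all() {
        if (ref_count == 0) {     // second call without resetting the count
            waited_twice = true;  // stand-in for the misuse that debug TBB diagnoses
            return;
        }
        // In this sequential model every child has already finished.
        assert(ref_count == 1);
        ref_count = 0;
    }
};

// Models the destructor's guarded wait: skip the wait if wait() already ran.
inline void guarded_destroy(toy_root& root) {
    if (root.ref_count)
        root.wait_for_all();
}

// Models the alternative of always waiting in the destructor.
inline void unconditional_destroy(toy_root& root) {
    root.wait_for_all();
}
```

In this model, calling wait_for_all() and then guarded_destroy() leaves waited_twice false, and so does calling guarded_destroy() alone. Calling wait_for_all() followed by unconditional_destroy() sets waited_twice, which is the double-wait case the check is meant to avoid.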
Arch Robison (Intel)
The following example, which uses lambdas, possibly calls run() from different threads on the same task group. It prints 1000000.
The example is just for show. In general, it is non-scalable to create many tasks from the same task_group, because creation and completion of a task involve bumping a reference counter, whose cache line becomes a point of contention. Use recursive task creation for scalability, like a nuclear chain reaction.
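The difference between the two shapes can be sketched without TBB. The code below is not the example referred to above; it is a rough illustration built on a hypothetical sequential stand-in, toy_task_group, that offers the same run()/wait() interface but simply runs each functor immediately and counts how many tasks each group receives. The names tally, count_flat and count_recursive are likewise made up for this sketch. The per-group run count stands in for traffic on a group's reference counter.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>

// Hypothetical sequential stand-in for the run()/wait() interface.
// NOT tbb::task_group: it runs each functor immediately on the calling thread.
struct toy_task_group {
    std::size_t runs = 0;  // number of tasks created from this group
    template<typename Function>
    void run(Function f) { ++runs; f(); }
    void wait() {}
};

struct tally {
    std::atomic<long> sum{0};  // total count; atomic, as a parallel version would need
    std::size_t max_runs = 0;  // most run() calls seen on any one group
                               // (instrumentation for this sequential model only)
};

// Flat: all n tasks are created from one group, so one counter sees all the traffic.
inline void count_flat(long n, tally& t) {
    toy_task_group g;
    for (long i = 0; i < n; ++i)
        g.run([&t] { ++t.sum; });
    g.wait();
    t.max_runs = std::max(t.max_runs, g.runs);
}

// Recursive: each call splits [lo, hi) in half and creates two tasks from its own group.
inline void count_recursive(long lo, long hi, tally& t) {
    if (hi - lo <= 1) {  // range of zero or one element: count it directly
        t.sum += hi - lo;
        return;
    }
    long mid = lo + (hi - lo) / 2;
    toy_task_group g;
    g.run([=, &t] { count_recursive(lo, mid, t); });
    g.run([=, &t] { count_recursive(mid, hi, t); });
    g.wait();
    t.max_runs = std::max(t.max_runs, g.runs);
}
```

Both count_flat(1000000, t) and count_recursive(0, 1000000, t) leave t.sum at 1000000, but the flat version creates all one million tasks from a single group, while the recursive version never creates more than two from any one group. With a real task_group the recursive groups would also let the work fan out across threads instead of being fed from one place.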