Tutorial : Thread Local Storage


Thread Local Storage

The Intel® Threading Building Blocks (Intel® TBB) library provides two template classes for thread-local storage. Both provide access to a local element per thread and create the elements on demand. They differ in their intended use models:

Class combinable provides thread-local storage for holding per-thread sub-computations that will later be reduced to a single result.

Class enumerable_thread_specific provides thread-local storage that acts like an STL container with one element per thread. The container permits iterating over the elements using the usual STL iteration idioms. 

The following snippet shows a simple use of enumerable_thread_specific to count, per thread, the number of calls to the body and the number of loop iterations executed in a parallel_for loop.

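#include <cstdio>
#include <utility>
#include "tbb/blocked_range.h"
#include "tbb/enumerable_thread_specific.h"
#include "tbb/parallel_for.h"

using namespace tbb;

typedef enumerable_thread_specific< std::pair<int,int> > CounterType;
CounterType MyCounters (std::make_pair(0,0));

struct Body {
     void operator()(const tbb::blocked_range<int> &r) const {
         // Each thread gets its own pair, created on first use.
         CounterType::reference my_counter = MyCounters.local();
         ++my_counter.first;               // one more call to operator()
         for (int i = r.begin(); i != r.end(); ++i)
             ++my_counter.second;          // one more iteration executed
     }
};

int main() {
     parallel_for( blocked_range<int>(0, 100000000), Body());
     // Iterate over the per-thread counters like an STL container.
     for (CounterType::const_iterator i = MyCounters.begin();
          i != MyCounters.end();  ++i) {
         printf("Thread stats:\n");
         printf("  calls to operator(): %d", i->first);
         printf("  total # of iterations executed: %d\n\n", i->second);
     }
}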


You can learn more about the thread-local storage features of the Intel® Threading Building Blocks library in the Intel® TBB Reference Manual.

