Tutorial : Scalable Memory Allocator

Scalable Memory Allocator

Intel® Threading Building Blocks (Intel® TBB) provides two memory allocator templates that are similar to the STL template class std::allocator. These two templates, scalable_allocator<T> and cache_aligned_allocator<T>, address critical issues in parallel programming: scalability and false sharing.

Scalability problems arise when memory allocators originally designed for serial programs are used by multiple threads. In some of these allocators, threads must compete for access to a single shared pool in a way that allows only one thread to allocate at a time. The memory allocator template scalable_allocator<T> avoids such scalability bottlenecks. This template can improve the performance of programs that rapidly allocate and free memory.
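The contention described above can be sketched with the standard library alone. The block below is an illustration only and is not how Intel TBB is implemented: BlockPool, shared_allocate, local_allocate, churn, and kBlockSize are hypothetical names. It contrasts a single pool guarded by one mutex, where every thread waits its turn, with one pool per thread, where no lock is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical block size for this sketch; every block has the same size.
constexpr std::size_t kBlockSize = 64;

// A tiny fixed-size pool that recycles freed blocks through a free list.
// The pool itself is NOT thread-safe.
class BlockPool {
public:
    BlockPool() : free_(nullptr) {}
    ~BlockPool() {
        while (free_) { Node* n = free_; free_ = n->next; std::free(n); }
    }
    void* allocate() {
        if (free_) { Node* n = free_; free_ = n->next; return n; }
        void* p = std::malloc(kBlockSize);
        assert(p != nullptr);
        return p;
    }
    void deallocate(void* p) {
        Node* n = static_cast<Node*>(p);
        n->next = free_;
        free_ = n;
    }
private:
    struct Node { Node* next; };
    static_assert(kBlockSize >= sizeof(Node), "block must hold a free-list link");
    Node* free_;
};

// Serial-style design: one pool, one lock, one allocating thread at a time.
BlockPool g_shared_pool;
std::mutex g_shared_mutex;
void* shared_allocate() {
    std::lock_guard<std::mutex> lock(g_shared_mutex);
    return g_shared_pool.allocate();
}
void shared_deallocate(void* p) {
    std::lock_guard<std::mutex> lock(g_shared_mutex);
    g_shared_pool.deallocate(p);
}

// Per-thread design: each thread owns a pool, so no lock is taken.
BlockPool& local_pool() { thread_local BlockPool pool; return pool; }
void* local_allocate() { return local_pool().allocate(); }
void local_deallocate(void* p) { local_pool().deallocate(p); }

// Runs `threads` workers that each allocate and free `iterations` blocks.
// Returns the total number of allocate/free pairs completed.
std::size_t churn(bool use_shared, unsigned threads, std::size_t iterations) {
    std::vector<std::size_t> done(threads, 0);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < threads; ++t) {
        workers.emplace_back([&done, t, use_shared, iterations] {
            std::size_t count = 0;
            for (std::size_t i = 0; i < iterations; ++i) {
                void* p = use_shared ? shared_allocate() : local_allocate();
                static_cast<char*>(p)[0] = 1;  // touch the block
                if (use_shared) shared_deallocate(p); else local_deallocate(p);
                ++count;
            }
            done[t] = count;
        });
    }
    for (std::thread& w : workers) w.join();
    std::size_t total = 0;
    for (std::size_t d : done) total += d;
    return total;
}
```

Timing churn(true, ...) against churn(false, ...) on a multi-core machine is one way to observe the effect. A production allocator also has to handle blocks freed by a thread other than the one that allocated them, which this sketch does not address.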

False sharing arises when two threads access different words that share the same cache line. The problem is that a cache line is the unit of information interchange between processor caches. If one processor modifies a cache line and another processor reads (or writes) the same cache line, the cache line must be moved from one processor to the other, even if the two processors are dealing with different words within the line. False sharing can hurt performance because moving a cache line can take hundreds of clock cycles. Use the class cache_aligned_allocator<T> to always allocate on a cache line boundary. Two objects allocated by cache_aligned_allocator are guaranteed not to have false sharing with each other.
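The layout issue can be made concrete with a small standard C++ sketch. It assumes a 64-byte cache line, which is common but hardware-dependent, and all names in it (kCacheLine, PackedCounters, PaddedCounters, hammer, on_different_lines) are hypothetical. Two threads each update their own counter; in the packed layout the counters will usually sit on the same line, while in the padded layout each counter has a line to itself.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>

// Assumed cache line size for this sketch.
constexpr std::size_t kCacheLine = 64;

// Adjacent counters: likely to share one cache line (false sharing).
struct PackedCounters {
    std::atomic<long> a{0};
    std::atomic<long> b{0};
};

// Each counter is aligned to its own cache line, so the struct is padded.
struct PaddedCounters {
    alignas(kCacheLine) std::atomic<long> a{0};
    alignas(kCacheLine) std::atomic<long> b{0};
};

static_assert(sizeof(PaddedCounters) >= 2 * kCacheLine,
              "padded counters must span at least two cache lines");

// Returns true when the two addresses fall on different cache lines.
inline bool on_different_lines(const void* p, const void* q) {
    return reinterpret_cast<std::uintptr_t>(p) / kCacheLine !=
           reinterpret_cast<std::uintptr_t>(q) / kCacheLine;
}

// One thread increments c.a and another increments c.b, `iterations` times
// each. The threads never touch the same word. Returns the sum of both.
template <typename Counters>
long hammer(Counters& c, long iterations) {
    assert(iterations >= 0);
    std::thread t1([&c, iterations] {
        for (long i = 0; i < iterations; ++i)
            c.a.fetch_add(1, std::memory_order_relaxed);
    });
    std::thread t2([&c, iterations] {
        for (long i = 0; i < iterations; ++i)
            c.b.fetch_add(1, std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
    return c.a.load() + c.b.load();
}
```

Both layouts produce the same result; only the running time differs. Timing hammer on a PackedCounters object against a PaddedCounters object with a large iteration count is one way to observe the cost, although the size of the gap depends on the machine.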

Using one of these allocator templates is straightforward, since they implement the allocator concept used by some parts of the C++ Standard Library, including the STL containers.  For example, the cache_aligned_allocator can easily be used with a std::vector as shown below.

std::vector<int, cache_aligned_allocator<int> > v;
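To show what implementing the allocator concept involves, here is a minimal stand-in written with the standard library only. It is a sketch, not the Intel TBB implementation: LineAlignedAllocator, kLine, and is_line_aligned are hypothetical names, and the 64-byte line size is an assumption. The allocator over-allocates with malloc, rounds the address up to the next line boundary, and stores the original pointer just before the payload so that deallocate can free it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <vector>

// Assumed cache line size for this sketch.
constexpr std::size_t kLine = 64;

template <typename T>
struct LineAlignedAllocator {
    typedef T value_type;

    LineAlignedAllocator() noexcept {}
    template <typename U>
    LineAlignedAllocator(const LineAlignedAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        // Extra room: worst-case alignment slack plus one saved pointer.
        const std::size_t extra = kLine + sizeof(void*);
        if (n > (static_cast<std::size_t>(-1) - extra) / sizeof(T))
            throw std::bad_alloc();
        void* raw = std::malloc(n * sizeof(T) + extra);
        if (!raw) throw std::bad_alloc();
        // Leave space for the saved pointer, then round up to a line start.
        std::uintptr_t start =
            reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
        std::uintptr_t aligned =
            (start + kLine - 1) & ~(static_cast<std::uintptr_t>(kLine) - 1);
        assert(aligned % kLine == 0);
        // Remember what malloc returned, right before the payload.
        reinterpret_cast<void**>(aligned)[-1] = raw;
        return reinterpret_cast<T*>(aligned);
    }

    void deallocate(T* p, std::size_t) noexcept {
        std::free(reinterpret_cast<void**>(p)[-1]);
    }
};

// The allocator is stateless, so all instances compare equal.
template <typename T, typename U>
bool operator==(const LineAlignedAllocator<T>&,
                const LineAlignedAllocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const LineAlignedAllocator<T>&,
                const LineAlignedAllocator<U>&) noexcept { return false; }

// Returns true when p is a multiple of the assumed line size.
inline bool is_line_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kLine == 0;
}
```

It plugs into a container in the same position as the allocator above, for example std::vector<int, LineAlignedAllocator<int> > v(100, 7); after which is_line_aligned(v.data()) should hold. The price of the alignment is extra memory per allocation, which matters most for many small allocations.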

On Windows* and Linux* operating systems, it is also possible to automatically replace all calls to standard functions for dynamic memory allocation (such as malloc) with the Intel TBB scalable equivalents. Doing so can sometimes improve application performance.


