Why Use Intel® TBB?

Why use it?

Intel® Threading Building Blocks (Intel® TBB) lets you easily write parallel C++ programs that take full advantage of multicore performance, that are portable and composable, and that have future-proof scalability.

What is it?

A widely used C++ template library for task parallelism


Primary features
  • Parallel algorithms and data structures
  • Scalable memory allocation and task scheduling


Reasons to use
  • Rich feature set for general purpose parallelism
  • C++; Windows*, Linux*, OS X* and other OSes


What's New

License Change from GPL to a more permissive Apache License

Like Intel® TBB 2.0, Intel® TBB 2017 both brings technical improvements and becomes more open, switching to the Apache* 2.0 license. The new license should enable the library to take root in more environments while continuing to simplify effective use of multicore hardware.

Reduce overhead on well-balanced workloads with an expanded set of partitioners.

Intel® TBB 2017 expands the set of partitioners with tbb::static_partitioner. It can be used in tbb::parallel_for and tbb::parallel_reduce to split the work uniformly among workers. The work is initially split into chunks of approximately equal size. The number of chunks is determined at runtime to minimize the overhead of work splitting while providing enough tasks for the available workers. Whether these chunks may be further split is unspecified. This reduces overhead when the work is already well-balanced. However, it limits the available parallelism and, therefore, might result in performance loss for unbalanced workloads.

Improved Dynamic Memory Allocation

Dynamic memory allocation replacement on Windows* OS now skips DLLs for which replacement cannot be done, instead of aborting. On 64-bit platforms, the worst-case limit on the amount of memory the Intel® TBB allocator can handle has been quadrupled. Intel® TBB no longer performs dynamic replacement of memory allocation functions for Microsoft Visual Studio 2008 and earlier versions.

Fully supported flow graph feature with enhancements to specify concurrency, external communications, and a composability layer to support heterogeneous computing.

The tbb::flow::async_node is re-implemented using the tbb::flow::multifunction_node template, which makes it possible to specify a concurrency level for the node. In addition, the special tbb::flow::async_msg message type, introduced in Intel® TBB 4.4 Update 3, supports communication between the flow graph and external asynchronous activities.

Streaming of workloads to external computing devices has been significantly reworked in Intel® TBB 2017 and is introduced as a preview feature. The Intel® TBB flow graph can now be used as a composability layer for heterogeneous computing.

Unlock additional performance for multi-threaded Python by enabling threading composability

Intel® TBB 2017 includes an experimental module that unlocks additional performance for multi-threaded Python programs by enabling threading composability between two or more thread-enabled libraries.

Threading composability can accelerate programs by avoiding inefficient thread allocation (called oversubscription), which occurs when there are more software threads than available hardware resources.

For more details on all these new features, read the following blog.