Concurrent Vector
From Intel Wiki
Contents
|
[edit] Reference
[edit] Summary
Template class for vector that can be concurrently grown and accessed.
[edit] Header
#include "tbb/concurrent_vector.h"
[edit] Syntax
template<typename T, class A = cache_aligned_allocator<T> > class concurrent_vector;
[edit] Description
concurrent_vector is a container having the following main properties:
- It provides random indexed access to its elements. The index of the first element is 0.
- It ensures safe concurrent growing its size (different threads can safely append new elements).
- Adding new elements does not invalidate existing iterators and does not change indices of existing items.
[edit] Compatibility
The class meets all Container Requirements and Reversible Container Requirements from C++ Standard (See ISO/IEC 14882:2003(E), clause 23.1). But it doesn't meet Sequence Requirements due to absence of insert() and erase() methods.
[edit] Declaration
namespace tbb {
template<typename T, class A>
class concurrent_vector {
public:
// Standard types
typedef size_t size_type;
typedef ''allocator-A-rebased-for-T'' allocator_type;
typedef T value_type;
typedef ptrdiff_t difference_type;
typedef T& reference;
typedef const T& const_reference;
typedef T *pointer;
typedef const T *const_pointer;
// Parallel iteration
typedef ''implementation-defined'' iterator;
typedef ''implementation-defined'' const_iterator;
typedef ''implementation-defined'' reverse_iterator;
typedef ''implementation-defined'' const_reverse_iterator;
// Parallel ranges
typedef ''implementation-defined'' range_type;
typedef ''implementation-defined'' const_range_type;
range_type range( size_t grainsize = 1);
const_range_type range( size_t grainsize = 1 ) const;
// Constructors
explicit concurrent_vector( const allocator_type& a = allocator_type() );
concurrent_vector( const concurrent_vector& source );
template<class M>
concurrent_vector( const concurrent_vector<T, M>& source );
explicit concurrent_vector( size_type n );
concurrent_vector( size_type items_count, const_reference t, const allocator_type& a = allocator_type() );
template<class InputIterator>
concurrent_vector( InputIterator first, InputIterator last, const allocator_type& a = allocator_type() );
//! Assignments
concurrent_vector& operator=( const concurrent_vector& source );
template<class M>
concurrent_vector& operator=( const concurrent_vector<T, M>& source );
void assign( size_type n, const_reference t );
template<class InputIterator>
void assign( InputIterator first, InputIterator last );
// Concurrent grow operations
size_type grow_by( size_type delta );
size_type grow_by( size_type delta, const_reference t );
void grow_to_at_least( size_type n );
size_type push_back( const_reference item );
// Items access
reference operator[]( size_type index );
const_reference operator[]( size_type index ) const;
reference at( size_type index );
const_reference at( size_type index ) const;
reference front();
const_reference front() const;
reference back();
const_reference back() const;
// Storage
bool empty() const;
size_type capacity() const;
size_type max_size() const;
size_type size() const;
allocator_type get_allocator() const;
// Non-concurrent operations on whole container
void reserve( size_type n );
void compact();
void swap( concurrent_vector& vector );
void clear();
~concurrent_vector();
// Iterators
iterator begin();
iterator end();
const_iterator begin() const;
const_iterator end() const;
reverse_iterator rbegin();
reverse_iterator rend();
const_reverse_iterator rbegin() const;
const_reverse_iterator rend() const;
};
// template functions
template<typename T, class A1, class A2>
bool operator==( const concurrent_vector<T, A1>& a, const concurrent_vector<T, A2>& b );
template<typename T, class A1, class A2>
bool operator!=( const concurrent_vector<T, A1>& a, const concurrent_vector<T, A2>& b );
template<typename T, class A1, class A2>
bool operator<( const concurrent_vector<T, A1>& a, const concurrent_vector<T, A2>& b );
template<typename T, class A1, class A2>
bool operator>( const concurrent_vector<T, A1>& a, const concurrent_vector<T, A2>& b );
template<typename T, class A1, class A2>
bool operator<=( const concurrent_vector<T, A1>& a, const concurrent_vector<T, A2>& b );
template<typename T, class A1, class A2>
bool operator>=(const concurrent_vector<T, A1>& a, const concurrent_vector<T, A2>& b );
template<typename T, class A>
void swap(concurrent_vector<T, A>& a, concurrent_vector<T, A>& b);
}
[edit] Exception Safety
Methods working with memory allocation and/or new elements construction can throw an exception if allocator fails to allocate memory or element’s default constructor throws one. Concurrent vector’s element of type T must conform to the following requirements:
- Throwing an exception is forbidden for destructor of T.
- Default constructor of T must not throw an exception OR its non-virtual destructor must safely work when its object memory is zero-initialized.
Otherwise, the program’s behavior is undefined.
If an exception happens inside growth or assignment operation, an instance of the vector becomes invalid unless it is stated otherwise in the method documentation. Invalid state means:
- There are no guaranties that all items were initialized by a constructor. The rest of items is zero-filled, [CHECK here] including item where exception happens.
- An invalid vector instance cannot be repaired; it is unable to grow anymore.
- Size and capacity reported by the vector are incorrect, and calculated as if the failed operation were successful.
- Attempt to access not allocated elements using operator[] or iterators results in access violation or segmentation fault exception, and in case of using at() method a C++ exception is thrown.
If a concurrent grow operation successfully completes, all the elements it has added to the vector will remain valid and accessible even if one of subsequent grow operations fails.
[edit] Fragmentation
Unlike an STL vector, a concurrent_vector does not move existing elements if it needs to allocate more memory. The container is divided into a series of contiguous arrays of elements. The first reservation, growth, or assignment operation determines the size of the first array. Using small number of elements as initial size incurs fragmentation that may increase element access time. Internal layout can be optimized by method compact() that merges several smaller arrays into one solid.
[edit] Construction, Copy, and Assignment
- Safety
- All these operations are not thread safe on the same instance and can throw an exception (See #Exception Safety).
[edit] Construction of zero or n items
concurrent_vector( const allocator_type& a = allocator_type() );
explicit concurrent_vector( size_type items_count );
concurrent_vector( size_type items_count, const_reference t, const allocator_type& a = allocator_type() );
void assign( size_type n, const_reference t );
- Effects
- Construct vector empty or with n copies of t inside using optionally specified allocator instance.
[edit] Copy construction
concurrent_vector( const concurrent_vector& source );
concurrent_vector& operator=( const concurrent_vector& source );
- Effects
- Assign contents of source vector. Source vector can have a different allocator type.
- Returns
- operator=() returns reference to left hand side.
- Safety
- Concurrent operations with source are safe iff copying of uninitialized items is permitted. Likewise, it is safe to copy broken source vector after exception occurs. Copying terminates on first unallocated or broken segment.
[edit] Construction with iterators
template<class I>
concurrent_vector( I first, I last, const allocator_type& a = allocator_type() );
template<class I>
void assign( I first, I last );
- Effects
- Construct vector from a source delimited by iterators [first, last) making only N calls to the copy constructor of T (where N is the distance between first and last) and no reallocations.
[edit] Whole Vector Operations
- Safety
- All these operations are not thread safe on the same instance.
[edit] Reservation
void reserve( size_type n );
- Effects
- Reserve space for at least n elements.
- Throws
- std::length_error if n > max_size(). Also, it can throw an exception if allocator fails to allocate a memory.
- Safety
- Exception raised during this call keeps the instance in the valid state.
[edit] Compaction
void compact();
- Effects
- Compact several small segments into solid one and free unused tail segments.
- Safety
- It can throw an exception (see #Exception Safety) but keeps the instance in the valid state.
[edit] Swap
void swap( concurrent_vector& vector );
- Effects
- Swap contents of two vector instances without copying each item.
[edit] Cleaning
void clear();
- Effects
- Erase all elements and free memory. Afterwards, capacity()==0.
- Throws
- None. Since throwing an exception is forbidden for destructor of T, the method doesn't throw any exceptions.
[edit] Destruction
~concurrent_vector();
- Effects
- Calls clear() to erase all elements and destroy the vector.
[edit] Concurrent Operations
- Safety
- The methods described in this section safely execute on the same instance of a concurrent_vector<T>.
[edit] Incremental Growth
size_type grow_by( size_type delta );
size_type grow_by( size_type delta, const_reference item );
- Effects
- Atomically append delta elements to the end of the vector. The new elements are initialized by default or by copying of item.
- Returns
- Old size of the vector. If it returns k, then the new elements are at the half-open index range [k, k+delta).
[edit] Limited Growth
void grow_to_at_least( size_type n );
- Effects
- Grow the vector until it has at least n elements. The new elements are initialized by default constructor of value_type.
[edit] Push of one item
size_type push_back( const_reference item );
- Effects
- Equivalently to grow_by(1, item), atomically append copy of value to the end of the vector.
- Returns
- Index of the copy.
[edit] Storage properties
- Safety
- Since all these methods are const, they are exception and thread safe. Also, the methods are safe to execute on the invalid instance but could return wrong value, see #Exception Safety.
[edit] Size
size_type size() const;
- Returns
- Number of elements in the vector. The result may include elements that are under construction by concurrent calls to growth methods.
[edit] Empty Check
bool empty() const;
- Returns
- true if size() == 0.
[edit] Upper Size
size_type max_size() const;
- Returns
- Highest size vector that might be representable.
[edit] Capacity
size_type capacity() const;
- Returns
- Maximum size to which vector can grow without having to allocate more memory.
[edit] Allocator
allocator_type get_allocator() const;
- Returns
- Allocator object.
[edit] Access
- Safety
- The methods described in this section safely execute on the same instance of a concurrent_vector<T> until they access items under concurrent grow/construction: by index or iterator related to size() or end().
[edit] operator[]
reference operator[]( size_type index );
const_reference operator[]( size_type index ) const;
- Returns
- Normal or const reference to an item with specified index.
[edit] Checked Access
reference at( size_type index );
const_reference at( size_type index ) const;
- Returns
- Normal or const reference to an item with specified index.
- Complexity
- Check for bad or broken index.
- Throws
- std::out_of_range if index >= size(). Or if *this instance is broken, throws std::out_of_range if there are no items after index, or throws std::range_error until inside broken segment.
[edit] Iterators
iterator begin();
iterator end();
const_iterator begin() const;
const_iterator end() const;
reverse_iterator rbegin();
reverse_iterator rend();
const_reverse_iterator rbegin() const;
const_reverse_iterator rend() const;
- Complexity
- Unlike a std::vector, the iterators are not raw pointers.
- Returns
- Normal or const or/and reverse iterator pointing to beginning or end of the vector.
[edit] Edge
reference front();
const_reference front() const;
reference back();
const_reference back() const;
- Returns
- Normal or const reference to an item from begin() or end() of the vector.
[edit] Parallel Iteration
Types const_range_type and range_type model the Range concept (TODO: a link) and provide methods to access the bounds of the range as shown in Table. The types differ only in that the bounds for a const_range_type are of type const_iterator, whereas the bounds for a range_type are of type iterator. Use the range types in conjunction with parallel_for (TODO: a link), parallel_reduce (TODO: a link), and parallel_scan (TODO: a link) to iterate over pairs in a concurrent_vector.
Table: Methods of concurrent_vector::range_type R Pseudo-Signature Semantics R::iterator R::begin() const First item in range R::iterator R::end() const One past last item in range
range_type range( size_t grainsize = 1);
const_range_type range( size_t grainsize = 1 ) const;
- Returns
- Range over entire concurrent_vector that permits read-write or read-only access.
[edit] Files
- tbb/1.0/include/tbb/concurrent_vector.h
- Header
- tbb/1.0/src/tbb/concurrent_vector.cpp
- Internals
- tbb/1.0/src/test/test_concurrent_vector.cpp
- Unit test
- tbb/1.0/examples/parallel_reduce/convex_hull
- Example: Convex Hull
- tbb/1.0/src/perf/time_vector.cpp
- Performance measurements
- test_vector_layout.cpp
- Vector layout modeler
[edit] Code Review Remarks and Suggestions
No, so far.
[edit] Implemented ones
- Arch Robison: BAD_ALLOC should be __TBB_BAD_ALLOC, since users are not supposed to use it directly.
- Arch Robison: inline keyword is unnecessary for a member function whose definition is in the class itself.
- Anton likes it for documentation reason only since "inline" keyword never works with modern compilers as expected.
- Anton Malakhov: Replace volatile keyword with __TBB_load_with_acquire and __TBB_store_with_release as described in Coding Conventions#volatile qualifier
[edit] History
Arch Robison wrote the original version of concurrent_vector. Anton Malakhov revised concurrent_vector to fix various deficiencies reported by the Threading Tools team who was using it.
[edit] Changes since 2.0 version
- Ideas and decisions are in a presentation
- custom allocator support through template argument
- defaulting to cache_aligned_allocator for now
- segment sizes start from 2 (then 2, 4, 8...)
- faster index calculation
- fixes memory blow up for swarm of vector's instances of small size
- first segment block solid allocation
- first growth call specifies a number of segments to be merged in the first allocation.
- introduces additional size_t field in data structure but doesn't require additional mask on segment pointer
- potentially faster then "any solid segment" technique. - in conjunction with faster index calculation gives about 5-10% performance grow for test set of standard sequential algorithms.
- introduced new methods:
- grow_by(size_type n, const_reference t) growth using copying constructor to init new items.
- STL-like constructors.
- operators ==, < and derivatives
- at() method, approved for using after an exception was thrown inside the vector
- get_allocator() method.
- assign() methods
- compact() method to defragment first segments increasing items access time.
- swap() method
- range() defaults on grainsize = 1 supporting auto grainsize algorithms.
- clear() behavior changed to freeing segments memory
- exception safety guaranties
- fixed ranges conversions (range --> const_range)
[edit] Cross-allocator methods
They are:
template<class M> concurrent_vector( const concurrent_vector<T, M>& x ); template<class M> concurrent_vector& operator=( const concurrent_vector<T, M>& x ); template<typename T, class A1, class A2> bool operator==( const concurrent_vector<T, A1>& a, const concurrent_vector<T, A2>& b ); // and others comparisons operators for vectors of different allocator types
I’ve added ones scince I experienced problems in test_concurrent_vetcor while I tried to work with vectors of different allocator types (scalable, NFS, standard, debuging adaptors) but of same value-type.
I guess, users can face the same problems using our palette of allocators without cross-allocator methods. Meanwhile, these methods are simple implemented because the base class is non-template thus they use just a few more type casts.
[edit] Known Issues
[edit] Broken on Itanium + ICC9.0
The new version breaks test_concurrent_vector.exe when all the following are true:
- is compiled on on Itanium
- The Intel 9.0 compiler is used to compiler concurrent_vector.cpp
- The optimization level is -01 or higher.
The test faults even when run with as single thread, which indicates the problem is not a memory fence. The problem does not occur if the test is compiled with gcc, or compiled with -O0. The problem does not occur with the Intel 10.0 compiler. See PVCS Tracker 548.
[edit] Exception passing SEGV problem on ICC10
I’ve found that test_concurrent_vector fails with:
- ICC: Intel(R) C++ Compiler for applications running on IA-32, Version 10.0 Build 20070426 Package ID: l_cc_p_10.0.023
- OR : icc (ICC) 10.1 20071116
- GCC: 3.4.6 OR 3.2.3
- LIBC: 2.3.4 OR 2.3.2
on old Intel64 machine OR on old ia32 (ZPro) It works fine already on Itanium with gcc 3.4.3 and same icc and libc. The bug works only if the throw statement is located in the libtbb.so.
I fail to separate the failure into non-TBB source. Even if I create separate library built by the same rules from a modified concurrent_vector class with only one internal_throw() function and w/o TBB-specifics like atomics – it works fine… but fails when linked to full libtbb.so
So, cutting it down doesn’t help me today :(
The only workaround I know is to move concurrent_vector_base::intenal_throw_exception() method from libtbb into the header.
The simplest code to reproduce the bug is here:
#include "tbb/concurrent_vector.h"
int main( ) {
tbb::concurrent_vector<int> u;
try { u.at( u.size() ); } catch (...) {}
return 0;
}
the failure happens in the following context:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 4160297888 (LWP 5992)]
0x464c457f in ?? ()
(gdb) bt
#0 0x464c457f in ?? ()
#1 0x008ec060 in _Unwind_RaiseException () from /lib/libgcc_s.so.1
#2 0x00bcafd2 in __cxa_throw () from /usr/lib/libstdc++.so.6
#3 0xf7fde503 in tbb::internal::concurrent_vector_base::internal_throw_exception (
this=0xffffd4f4, t=0) at ../../src/tbb/concurrent_vector.cpp:137
[edit] Unrecognized exception on Mac OS
A similar to above problem appears on Mac OS machines (with gcc 4.0.1) and looks like catch can’t recognize and intercept exception of known type. Likewise the first problem, it can’t be reproduced without TBB library.
I’ve checked in the test_concurrent_vector.cpp file with complete testing of exceptions safety. But it prints “ERROR: … known compiler issue” for the new problem.
The fix is to include the following symbols in export lists: _ZTSSt11range_error; _ZTSSt12length_error; _ZTSSt12out_of_range; _ZTSN3tbb14bad_last_allocE;

