Welcome Guest | Login | Register

Concurrent Vector

From Intel Wiki

Jump to: navigation, search

Contents

[edit] Reference

[edit] Summary

Template class for vector that can be concurrently grown and accessed.

[edit] Header

#include "tbb/concurrent_vector.h"

[edit] Syntax

template<typename T, class A = cache_aligned_allocator<T> > class concurrent_vector;

[edit] Description

concurrent_vector is a container having the following main properties:

  • It provides random indexed access to its elements. The index of the first element is 0.
  • It ensures safe concurrent growing its size (different threads can safely append new elements).
  • Adding new elements does not invalidate existing iterators and does not change indices of existing items.

[edit] Compatibility

The class meets all Container Requirements and Reversible Container Requirements from C++ Standard (See ISO/IEC 14882:2003(E), clause 23.1). But it doesn't meet Sequence Requirements due to absence of insert() and erase() methods.

[edit] Declaration

namespace tbb {
    template<typename T, class A>
    class concurrent_vector {
    public:
        // Standard types
        typedef size_t size_type;
        typedef ''allocator-A-rebased-for-T'' allocator_type;
        typedef T value_type;
        typedef ptrdiff_t difference_type;
        typedef T& reference;
        typedef const T& const_reference;
        typedef T *pointer;
        typedef const T *const_pointer;
 
        // Parallel iteration
        typedef ''implementation-defined'' iterator;
        typedef ''implementation-defined'' const_iterator;
        typedef ''implementation-defined'' reverse_iterator;
        typedef ''implementation-defined'' const_reverse_iterator;
 
        // Parallel ranges
        typedef ''implementation-defined'' range_type;
        typedef ''implementation-defined'' const_range_type;
 
        range_type range( size_t grainsize = 1);
        const_range_type range( size_t grainsize = 1 ) const;
 
        // Constructors
        explicit concurrent_vector( const allocator_type& a = allocator_type() );
        concurrent_vector( const concurrent_vector& source );
        template<class M>
        concurrent_vector( const concurrent_vector<T, M>& source );
        explicit concurrent_vector( size_type n );
        concurrent_vector( size_type items_count, const_reference t, const allocator_type& a = allocator_type() );
        template<class InputIterator>
        concurrent_vector( InputIterator first, InputIterator last, const allocator_type& a = allocator_type() );
 
        //! Assignments
        concurrent_vector& operator=( const concurrent_vector& source );
        template<class M>
        concurrent_vector& operator=( const concurrent_vector<T, M>& source );
        void assign( size_type n, const_reference t );
        template<class InputIterator>
        void assign( InputIterator first, InputIterator last );
 
        // Concurrent grow operations
        size_type grow_by( size_type delta );
        size_type grow_by( size_type delta, const_reference t );
        void grow_to_at_least( size_type n );
        size_type push_back( const_reference item );
 
        // Items access
        reference operator[]( size_type index );
        const_reference operator[]( size_type index ) const;
        reference at( size_type index );
        const_reference at( size_type index ) const;
        reference front();
        const_reference front() const;
        reference back();
        const_reference back() const;
 
        // Storage
        bool empty() const;
        size_type capacity() const;
        size_type max_size() const;
        size_type size() const;
        allocator_type get_allocator() const;
 
        // Non-concurrent operations on whole container
        void reserve( size_type n );
        void compact();
        void swap( concurrent_vector& vector );
        void clear();
        ~concurrent_vector();
 
        // Iterators
        iterator begin();
        iterator end();
        const_iterator begin() const;
        const_iterator end() const;
        reverse_iterator rbegin();
        reverse_iterator rend();
        const_reverse_iterator rbegin() const;
        const_reverse_iterator rend() const;
    };
 
    // template functions
    template<typename T, class A1, class A2>
    bool operator==( const concurrent_vector<T, A1>& a, const concurrent_vector<T, A2>& b );
 
    template<typename T, class A1, class A2>
    bool operator!=( const concurrent_vector<T, A1>& a, const concurrent_vector<T, A2>& b );
 
    template<typename T, class A1, class A2>
    bool operator<( const concurrent_vector<T, A1>& a, const concurrent_vector<T, A2>& b );
 
    template<typename T, class A1, class A2>
    bool operator>( const concurrent_vector<T, A1>& a, const concurrent_vector<T, A2>& b );
 
    template<typename T, class A1, class A2>
    bool operator<=( const concurrent_vector<T, A1>& a, const concurrent_vector<T, A2>& b );
 
    template<typename T, class A1, class A2>
    bool operator>=(const concurrent_vector<T, A1>& a, const concurrent_vector<T, A2>& b );
 
    template<typename T, class A>
    void swap(concurrent_vector<T, A>& a, concurrent_vector<T, A>& b);
}

[edit] Exception Safety

Methods working with memory allocation and/or new elements construction can throw an exception if allocator fails to allocate memory or element’s default constructor throws one. Concurrent vector’s element of type T must conform to the following requirements:

  • Throwing an exception is forbidden for destructor of T.
  • Default constructor of T must not throw an exception OR its non-virtual destructor must safely work when its object memory is zero-initialized.

Otherwise, the program’s behavior is undefined.

If an exception happens inside growth or assignment operation, an instance of the vector becomes invalid unless it is stated otherwise in the method documentation. Invalid state means:

  • There are no guaranties that all items were initialized by a constructor. The rest of items is zero-filled, [CHECK here] including item where exception happens.
  • An invalid vector instance cannot be repaired; it is unable to grow anymore.
  • Size and capacity reported by the vector are incorrect, and calculated as if the failed operation were successful.
  • Attempt to access not allocated elements using operator[] or iterators results in access violation or segmentation fault exception, and in case of using at() method a C++ exception is thrown.

If a concurrent grow operation successfully completes, all the elements it has added to the vector will remain valid and accessible even if one of subsequent grow operations fails.

[edit] Fragmentation

Unlike an STL vector, a concurrent_vector does not move existing elements if it needs to allocate more memory. The container is divided into a series of contiguous arrays of elements. The first reservation, growth, or assignment operation determines the size of the first array. Using small number of elements as initial size incurs fragmentation that may increase element access time. Internal layout can be optimized by method compact() that merges several smaller arrays into one solid.

[edit] Construction, Copy, and Assignment

Safety
All these operations are not thread safe on the same instance and can throw an exception (See #Exception Safety).

[edit] Construction of zero or n items

       concurrent_vector( const allocator_type& a = allocator_type() );
       explicit concurrent_vector( size_type items_count );
       concurrent_vector( size_type items_count, const_reference t, const allocator_type& a = allocator_type() );
       void assign( size_type n, const_reference t );
Effects
Construct vector empty or with n copies of t inside using optionally specified allocator instance.

[edit] Copy construction

       concurrent_vector( const concurrent_vector& source );
       concurrent_vector& operator=( const concurrent_vector& source );
Effects
Assign contents of source vector. Source vector can have a different allocator type.
Returns
operator=() returns reference to left hand side.
Safety
Concurrent operations with source are safe iff copying of uninitialized items is permitted. Likewise, it is safe to copy broken source vector after exception occurs. Copying terminates on first unallocated or broken segment.

[edit] Construction with iterators

       template<class I>
       concurrent_vector( I first, I last, const allocator_type& a = allocator_type() );
       template<class I>
       void assign( I first, I last );
Effects
Construct vector from a source delimited by iterators [first, last) making only N calls to the copy constructor of T (where N is the distance between first and last) and no reallocations.

[edit] Whole Vector Operations

Safety
All these operations are not thread safe on the same instance.

[edit] Reservation

       void reserve( size_type n );
Effects
Reserve space for at least n elements.
Throws
std::length_error if n > max_size(). Also, it can throw an exception if allocator fails to allocate a memory.
Safety
Exception raised during this call keeps the instance in the valid state.

[edit] Compaction

       void compact();
Effects
Compact several small segments into solid one and free unused tail segments.
Safety
It can throw an exception (see #Exception Safety) but keeps the instance in the valid state.

[edit] Swap

       void swap( concurrent_vector& vector );
Effects
Swap contents of two vector instances without copying each item.

[edit] Cleaning

       void clear();
Effects
Erase all elements and free memory. Afterwards, capacity()==0.
Throws
None. Since throwing an exception is forbidden for destructor of T, the method doesn't throw any exceptions.

[edit] Destruction

       ~concurrent_vector();
Effects
Calls clear() to erase all elements and destroy the vector.

[edit] Concurrent Operations

Safety
The methods described in this section safely execute on the same instance of a concurrent_vector<T>.

[edit] Incremental Growth

       size_type grow_by( size_type delta );
       size_type grow_by( size_type delta, const_reference item );
Effects
Atomically append delta elements to the end of the vector. The new elements are initialized by default or by copying of item.
Returns
Old size of the vector. If it returns k, then the new elements are at the half-open index range [k, k+delta).

[edit] Limited Growth

       void grow_to_at_least( size_type n );
Effects
Grow the vector until it has at least n elements. The new elements are initialized by default constructor of value_type.

[edit] Push of one item

       size_type push_back( const_reference item );
Effects
Equivalently to grow_by(1, item), atomically append copy of value to the end of the vector.
Returns
Index of the copy.

[edit] Storage properties

Safety
Since all these methods are const, they are exception and thread safe. Also, the methods are safe to execute on the invalid instance but could return wrong value, see #Exception Safety.

[edit] Size

       size_type size() const;
Returns
Number of elements in the vector. The result may include elements that are under construction by concurrent calls to growth methods.

[edit] Empty Check

       bool empty() const;
Returns
true if size() == 0.

[edit] Upper Size

       size_type max_size() const;
Returns
Highest size vector that might be representable.

[edit] Capacity

       size_type capacity() const;
Returns
Maximum size to which vector can grow without having to allocate more memory.

[edit] Allocator

       allocator_type get_allocator() const;
Returns
Allocator object.

[edit] Access

Safety
The methods described in this section safely execute on the same instance of a concurrent_vector<T> until they access items under concurrent grow/construction: by index or iterator related to size() or end().

[edit] operator[]

       reference operator[]( size_type index );
       const_reference operator[]( size_type index ) const;
Returns
Normal or const reference to an item with specified index.

[edit] Checked Access

       reference at( size_type index );
       const_reference at( size_type index ) const;
Returns
Normal or const reference to an item with specified index.
Complexity
Check for bad or broken index.
Throws
std::out_of_range if index >= size(). Or if *this instance is broken, throws std::out_of_range if there are no items after index, or throws std::range_error until inside broken segment.

[edit] Iterators

       iterator begin();
       iterator end();
       const_iterator begin() const;
       const_iterator end() const;
       reverse_iterator rbegin();
       reverse_iterator rend();
       const_reverse_iterator rbegin() const;
       const_reverse_iterator rend() const;
Complexity
Unlike a std::vector, the iterators are not raw pointers.
Returns
Normal or const or/and reverse iterator pointing to beginning or end of the vector.

[edit] Edge

       reference front();
       const_reference front() const;
       reference back();
       const_reference back() const;
Returns
Normal or const reference to an item from begin() or end() of the vector.

[edit] Parallel Iteration

Types const_range_type and range_type model the Range concept (TODO: a link) and provide methods to access the bounds of the range as shown in Table. The types differ only in that the bounds for a const_range_type are of type const_iterator, whereas the bounds for a range_type are of type iterator. Use the range types in conjunction with parallel_for (TODO: a link), parallel_reduce (TODO: a link), and parallel_scan (TODO: a link) to iterate over pairs in a concurrent_vector.

Table: Methods of concurrent_vector::range_type R
Pseudo-Signature Semantics
R::iterator R::begin() const First item in range
R::iterator R::end() const One past last item in range
       range_type range( size_t grainsize = 1);
       const_range_type range( size_t grainsize = 1 ) const;
Returns
Range over entire concurrent_vector that permits read-write or read-only access.

[edit] Files

tbb/1.0/include/tbb/concurrent_vector.h
Header
tbb/1.0/src/tbb/concurrent_vector.cpp
Internals
tbb/1.0/src/test/test_concurrent_vector.cpp
Unit test
tbb/1.0/examples/parallel_reduce/convex_hull
Example: Convex Hull
tbb/1.0/src/perf/time_vector.cpp
Performance measurements
test_vector_layout.cpp
Vector layout modeler

[edit] Code Review Remarks and Suggestions

No, so far.

[edit] Implemented ones

  • Arch Robison: BAD_ALLOC should be __TBB_BAD_ALLOC, since users are not supposed to use it directly.
  • Arch Robison: inline keyword is unnecessary for a member function whose definition is in the class itself.
    • Anton likes it for documentation reason only since "inline" keyword never works with modern compilers as expected.
  • Anton Malakhov: Replace volatile keyword with __TBB_load_with_acquire and __TBB_store_with_release as described in Coding Conventions#volatile qualifier

[edit] History

Arch Robison wrote the original version of concurrent_vector. Anton Malakhov revised concurrent_vector to fix various deficiencies reported by the Threading Tools team who was using it.

[edit] Changes since 2.0 version

  • Ideas and decisions are in a presentation
  • custom allocator support through template argument
    • defaulting to cache_aligned_allocator for now
  • segment sizes start from 2 (then 2, 4, 8...)
    • faster index calculation
    • fixes memory blow up for swarm of vector's instances of small size
  • first segment block solid allocation
    • first growth call specifies a number of segments to be merged in the first allocation.
    • introduces additional size_t field in data structure but doesn't require additional mask on segment pointer
      - potentially faster then "any solid segment" technique.
    • in conjunction with faster index calculation gives about 5-10% performance grow for test set of standard sequential algorithms.
  • introduced new methods:
    • grow_by(size_type n, const_reference t) growth using copying constructor to init new items.
    • STL-like constructors.
    • operators ==, < and derivatives
    • at() method, approved for using after an exception was thrown inside the vector
    • get_allocator() method.
    • assign() methods
    • compact() method to defragment first segments increasing items access time.
    • swap() method
  • range() defaults on grainsize = 1 supporting auto grainsize algorithms.
  • clear() behavior changed to freeing segments memory
  • exception safety guaranties
  • fixed ranges conversions (range --> const_range)

[edit] Cross-allocator methods

They are:

template<class M> concurrent_vector( const concurrent_vector<T, M>& x );
template<class M> concurrent_vector& operator=( const concurrent_vector<T, M>& x );
template<typename T, class A1, class A2>  bool operator==( const concurrent_vector<T, A1>& a, const concurrent_vector<T, A2>& b );
// and others comparisons operators for vectors of different allocator types

I’ve added ones scince I experienced problems in test_concurrent_vetcor while I tried to work with vectors of different allocator types (scalable, NFS, standard, debuging adaptors) but of same value-type.

I guess, users can face the same problems using our palette of allocators without cross-allocator methods. Meanwhile, these methods are simple implemented because the base class is non-template thus they use just a few more type casts.

[edit] Known Issues

[edit] Broken on Itanium + ICC9.0

The new version breaks test_concurrent_vector.exe when all the following are true:

  • is compiled on on Itanium
  • The Intel 9.0 compiler is used to compiler concurrent_vector.cpp
  • The optimization level is -01 or higher.

The test faults even when run with as single thread, which indicates the problem is not a memory fence. The problem does not occur if the test is compiled with gcc, or compiled with -O0. The problem does not occur with the Intel 10.0 compiler. See PVCS Tracker 548.

[edit] Exception passing SEGV problem on ICC10

I’ve found that test_concurrent_vector fails with:

  • ICC: Intel(R) C++ Compiler for applications running on IA-32, Version 10.0 Build 20070426 Package ID: l_cc_p_10.0.023
OR : icc (ICC) 10.1 20071116
  • GCC: 3.4.6 OR 3.2.3
  • LIBC: 2.3.4 OR 2.3.2

on old Intel64 machine OR on old ia32 (ZPro) It works fine already on Itanium with gcc 3.4.3 and same icc and libc. The bug works only if the throw statement is located in the libtbb.so.

I fail to separate the failure into non-TBB source. Even if I create separate library built by the same rules from a modified concurrent_vector class with only one internal_throw() function and w/o TBB-specifics like atomics – it works fine… but fails when linked to full libtbb.so

So, cutting it down doesn’t help me today :(

The only workaround I know is to move concurrent_vector_base::intenal_throw_exception() method from libtbb into the header.

The simplest code to reproduce the bug is here:

#include "tbb/concurrent_vector.h"

int main( ) {
    tbb::concurrent_vector<int> u;
    try { u.at( u.size() ); } catch (...) {}
    return 0;
}

the failure happens in the following context:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 4160297888 (LWP 5992)]
0x464c457f in ?? ()
(gdb) bt
#0  0x464c457f in ?? ()
#1  0x008ec060 in _Unwind_RaiseException () from /lib/libgcc_s.so.1
#2  0x00bcafd2 in __cxa_throw () from /usr/lib/libstdc++.so.6
#3  0xf7fde503 in tbb::internal::concurrent_vector_base::internal_throw_exception (
    this=0xffffd4f4, t=0) at ../../src/tbb/concurrent_vector.cpp:137

[edit] Unrecognized exception on Mac OS

A similar to above problem appears on Mac OS machines (with gcc 4.0.1) and looks like catch can’t recognize and intercept exception of known type. Likewise the first problem, it can’t be reproduced without TBB library.

I’ve checked in the test_concurrent_vector.cpp file with complete testing of exceptions safety. But it prints “ERROR: … known compiler issue” for the new problem.

The fix is to include the following symbols in export lists: _ZTSSt11range_error; _ZTSSt12length_error; _ZTSSt12out_of_range; _ZTSN3tbb14bad_last_allocE;



Page & Feed Options
Print

Bookmark This

 Digg this   del.icio.us