Parallelizing Complex Loops

You can parallelize many applications successfully using only the constructs described in the Parallelizing Simple Loops section. However, some situations call for other parallel patterns; this section describes oneTBB's support for several of them.

  • Cook Until Done: parallel_for_each
  • Working on the Assembly Line: parallel_pipeline
    • Using Circular Buffers
    • Throughput of pipeline
    • Non-Linear Pipelines
  • Summary of Loops and Pipelines


© Copyright 2023, Intel Corporation.
