Choosing the Right API
Blitz offers several parallel APIs, each optimized for different patterns. This guide helps you pick the right one.
Quick Decision Guide
Section titled “Quick Decision Guide”Start here: “I want to…”
| I want to… | Use | Example |
|---|---|---|
| Process every element in an array | parallelFor | Apply transform to each pixel |
| Sum / min / max an array | iter().sum() / .min() / .max() | Total revenue, highest score |
| Search for an element | iter().findAny() | Find first negative value |
| Check a condition on all elements | iter().any() / .all() | Any NaN? All positive? |
| Transform array to new array | parallelCollect | Convert i32[] to f64[] |
| Transform array in place | parallelMapInPlace or iterMut().mapInPlace() | Normalize pixel values |
| Run 2-8 independent tasks | join | Compute stats + histogram |
| Run 2-64 dynamic tasks | scope + spawn | Load multiple files |
| Run tasks that may fail | tryJoin / tryForEach / tryReduce | Parse + validate data |
| Sort an array | sortAsc / sortDesc / sort | Sort by timestamp |
| Merge many slices into one | parallelFlatten | Combine per-thread results |
| Write to scattered positions | parallelScatter | Build hash table |
| Run code on every thread | broadcast | Initialize thread-local state |
| Fire-and-forget background work | spawn | Write audit log |
| Aggregate with custom logic | parallelReduce | Weighted average, dot product |
API Comparison
Section titled “API Comparison”Data Parallelism: iter vs parallelFor vs parallelCollect
Section titled “Data Parallelism: iter vs parallelFor vs parallelCollect”These three APIs all process array elements in parallel, but they serve different purposes:
const data = [_]i64{ 1, 2, 3, 4, 5 };iter() — Built-in aggregations and searches
const sum = blitz.iter(i64, &data).sum(); // 15const min = blitz.iter(i64, &data).min(); // 1const found = blitz.iter(i64, &data).findAny(pred);Best for: Standard aggregations (sum, min, max, count), searches (findAny, findFirst), and predicates (any, all). Zero boilerplate.
parallelFor — Custom chunk processing
blitz.parallelFor(data.len, Context, ctx, struct { fn body(c: Context, start: usize, end: usize) void { for (c.data[start..end]) |*v| { v.* = expensiveTransform(v.*); } }}.body);Best for: Side effects, in-place mutation, or when you need the chunk boundaries (start, end) for efficient batch processing.
parallelCollect — Map to a new array
blitz.parallelCollect(i64, f64, &input, &output, void, {}, struct { fn transform(_: void, x: i64) f64 { return @as(f64, @floatFromInt(x)) * 0.1; }}.transform);Best for: Transforming an array of type T into an array of type U, producing a new output array.
Summary:
| Feature | iter() | parallelFor | parallelCollect |
|---|---|---|---|
| Aggregation (sum, min) | Built-in | Manual | No |
| Search (find, any, all) | Built-in | Manual | No |
| In-place mutation | iterMut | Yes | No |
| Map T -> U | No | Manual | Built-in |
| Chunk access (start, end) | No | Yes | No |
| Custom grain size | No | WithGrain | WithGrain |
| Boilerplate | Low | Medium | Medium |
Task Parallelism: join vs scope vs spawn
Section titled “Task Parallelism: join vs scope vs spawn”These APIs run independent tasks in parallel, differing in flexibility and guarantees:
join — Fixed tasks with results
const result = blitz.join(.{ .a = .{ computeA, arg_a }, .b = .{ computeB, arg_b },});// result.a, result.bBest for: 2-8 tasks with known types and return values. Supports heterogeneous return types. Works with recursive divide-and-conquer.
scope + spawn — Dynamic task set
blitz.scope(struct { fn run(s: *blitz.Scope) void { s.spawn(task1); s.spawn(task2); if (condition) s.spawn(task3); }}.run);Best for: Variable number of tasks decided at runtime. Tasks are void (no return values). Maximum 64 tasks.
spawn — Fire-and-forget
blitz.spawn(struct { fn run() void { writeLog(); }}.run);// Returns immediatelyBest for: Background work where the caller does not need the result.
Summary:
| Feature | join | scope | spawn |
|---|---|---|---|
| Max tasks | 8 | 64 | 1 |
| Return values | Yes (per task) | No | No |
| Wait for completion | Yes | Yes (at scope exit) | No |
| Dynamic task count | No (comptime) | Yes (runtime) | N/A |
| Recursive | Yes | No | No |
| Error variants | tryJoin | No | No |
Reduction: iter vs parallelReduce
Section titled “Reduction: iter vs parallelReduce”iter() aggregations are best for standard operations:
const sum = blitz.iter(i64, data).sum();const count = blitz.iter(i64, data).count();parallelReduce is best for custom reductions:
// Dot product (no built-in for this)const dot = blitz.parallelReduce( f64, a.len, 0.0, DotCtx, .{ .a = a, .b = b }, struct { fn map(ctx: DotCtx, i: usize) f64 { return ctx.a[i] * ctx.b[i]; } }.map, struct { fn add(x: f64, y: f64) f64 { return x + y; } }.add,);| Feature | iter() | parallelReduce |
|---|---|---|
| sum, min, max, count | Built-in | Manual |
| Custom reduction | reduce(identity, combineFn) | Full control |
| Map + reduce | Limited | Built-in (map + combine) |
| Error handling | No | tryReduce |
| Grain control | No | WithGrain |
Error Handling: try Variants
Section titled “Error Handling: try Variants”Every core API has an error-safe variant:
| Infallible | Fallible | Guarantee |
|---|---|---|
join | tryJoin | All tasks complete before error propagates |
parallelFor | tryForEach | All chunks complete before error propagates |
parallelReduce | tryReduce | All reductions complete before error propagates |
Use try* variants when your parallel body can return a Zig error. See Error Handling for details.
Decision Flowchart
Section titled “Decision Flowchart”Do you have an array/slice to process?├── Yes│ ├── Need sum/min/max/count?│ │ └── iter().sum() / .min() / .max() / .count()│ ├── Need to search/check?│ │ └── iter().findAny() / .any() / .all()│ ├── Need to map T[] -> U[]?│ │ └── parallelCollect()│ ├── Need to transform in place?│ │ └── parallelMapInPlace() or iterMut().mapInPlace()│ ├── Need to sort?│ │ └── sortAsc() / sortDesc() / sort() / sortByKey()│ ├── Need custom reduction?│ │ └── parallelReduce()│ └── Need chunk-based processing?│ └── parallelFor()│├── Do you have independent tasks?│ ├── 2-8 tasks with return values?│ │ └── join()│ ├── Dynamic number of tasks (up to 64)?│ │ └── scope() + spawn()│ ├── Tasks can fail?│ │ └── tryJoin()│ └── Fire-and-forget?│ └── spawn()│├── Need to combine/flatten results?│ ├── Merge slices into one?│ │ └── parallelFlatten()│ └── Scatter to indexed positions?│ └── parallelScatter()│└── Need per-thread operations? └── broadcast()Performance Characteristics
Section titled “Performance Characteristics”| API | Overhead | Scalability | Best Data Size |
|---|---|---|---|
iter().sum() | Very low | Excellent | > 10K |
iter().findAny() | Very low | Excellent (early exit) | Any |
parallelFor | Low | Excellent | > 10K |
parallelReduce | Low | Excellent | > 10K |
parallelCollect | Low | Excellent | > 10K |
join (2 tasks) | ~10 ns | 2x max | N/A |
join (8 tasks) | ~50 ns | 8x max | N/A |
scope (N tasks) | ~100 ns | Nx max | N/A |
sort | Medium | Good | > 10K |
parallelFlatten | Low | Good | > 100 slices |
broadcast | Low | Linear | N/A |
Examples by Domain
Section titled “Examples by Domain”Scientific Computing
Section titled “Scientific Computing”// Matrix operationsconst dot = blitz.parallelReduce(f64, n, 0.0, ctx_type, ctx, mapFn, addFn);blitz.parallelFor(n, ctx_type, ctx, matmulBody);
// Statistical analysisconst result = blitz.join(.{ .mean = .{ computeMean, data }, .stddev = .{ computeStdDev, data }, .median = .{ computeMedian, data },});Data Processing
Section titled “Data Processing”// ETL pipelineblitz.parallelCollect(RawRecord, CleanRecord, raw, clean, void, {}, cleanFn);blitz.sortByKey(CleanRecord, u64, clean, timestampKey);const total = blitz.iter(CleanRecord, clean).reduce(0, sumRevenue);Game Development
Section titled “Game Development”// Entity updatesblitz.parallelFor(entities.len, EntityCtx, ctx, updatePhysics);blitz.parallelFor(entities.len, EntityCtx, ctx, updateAI);
// Spatial queriesconst nearest = blitz.iter(Entity, entities).minByKey(f64, distToPlayer);Image Processing
Section titled “Image Processing”// Per-pixel transformblitz.parallelMapInPlace(Pixel, pixels, BrightnessCtx, ctx, adjustBrightness);
// Histogramconst histogram = blitz.parallelReduce( [256]u32, pixels.len, .{0} ** 256, []const Pixel, pixels, histMap, histCombine,);